* [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality
@ 2023-03-24  0:19 Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU() Paul E. McKenney
                   ` (18 more replies)
  0 siblings, 19 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch

Hello!

This RFC series shrinks the srcu_struct structure to the bare minimum
required to support SRCU readers, relegating the remaining fields to a new
srcu_usage structure.  Statically allocated srcu_struct structures created
by DEFINE_SRCU() and DEFINE_STATIC_SRCU() have statically allocated
srcu_usage structures, but those required for dynamically allocated
srcu_struct structures that are initialized using init_srcu_struct()
are dynamically allocated.
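
As a usage sketch (not part of this series; the container structure and
helper functions below are invented for illustration), the two flavors
look roughly like this, with cleanup_srcu_struct() freeing the
dynamically allocated srcu_usage structure in the dynamic case:

	/* Static flavor: the srcu_usage structure is also statically allocated. */
	DEFINE_STATIC_SRCU(my_static_srcu);

	/* Dynamic flavor: init_srcu_struct() allocates the srcu_usage structure. */
	struct my_obj {
		struct srcu_struct srcu;
		/* ... */
	};

	static struct my_obj *my_obj_create(void)
	{
		struct my_obj *p = kzalloc(sizeof(*p), GFP_KERNEL);

		if (!p || init_srcu_struct(&p->srcu)) {
			kfree(p);
			return NULL;
		}
		return p;
	}

	static void my_obj_destroy(struct my_obj *p)
	{
		cleanup_srcu_struct(&p->srcu);
		kfree(p);
	}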

The result is a reduction in the size of an srcu_struct structure from
a couple hundred bytes to just 24 bytes on x86_64 systems.  This can be
helpful when SRCU readers are used in a fastpath for which the srcu_struct
structure must be embedded in another structure, and especially where
that fastpath also needs to access fields both before and after the
srcu_struct structure.
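
To make that fastpath scenario concrete, here is a rough sketch
(structure, field, and function names are invented for illustration)
of the kind of layout this shrinkage enables:

	struct my_fastpath_obj {
		unsigned long hot_before;	/* Accessed on every fastpath call. */
		struct srcu_struct srcu;	/* Now roughly 24 bytes on x86_64. */
		unsigned long hot_after;	/* Also accessed on every call. */
	};

	static void my_fastpath(struct my_fastpath_obj *p)
	{
		int idx;

		idx = srcu_read_lock(&p->srcu);
		/*
		 * ->hot_before, ->srcu, and ->hot_after can now plausibly
		 * share a cache line or two instead of being separated by
		 * a couple hundred bytes of update-side SRCU state.
		 */
		do_hot_work(p->hot_before, p->hot_after);	/* Hypothetical helper. */
		srcu_read_unlock(&p->srcu, idx);
	}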

This series takes baby steps, in part because breaking SRCU means that
you get absolutely no console output.  Yes, I did learn this the hard way.
Why do you ask?  ;-)

Here are those baby steps:

1.	Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU().

2.	Use static init for statically allocated in-module srcu_struct.

3.	Begin offloading srcu_struct fields to srcu_update.  Note that
	this affects notifiers, which open-code static allocation of
	an srcu_struct structure.  (And no, I still do not see a way to
	abstract this, sorry!)

4.	Move ->level from srcu_struct to srcu_usage.

5.	Move ->srcu_size_state from srcu_struct to srcu_usage.

6.	Move ->srcu_cb_mutex from srcu_struct to srcu_usage.

7.	Move ->lock initialization after srcu_usage allocation.

8.	Move ->lock from srcu_struct to srcu_usage.

9.	Move ->srcu_gp_mutex from srcu_struct to srcu_usage.

10.	Move grace-period fields from srcu_struct to srcu_usage.

11.	Move heuristics fields from srcu_struct to srcu_usage.

12.	Move ->sda_is_static from srcu_struct to srcu_usage.

13.	Move srcu_barrier() fields from srcu_struct to srcu_usage.

14.	Move work-scheduling fields from srcu_struct to srcu_usage.

15.	Fix long lines in srcu_get_delay().

16.	Fix long lines in cleanup_srcu_struct().

17.	Fix long lines in srcu_gp_end().

18.	Fix long lines in srcu_funnel_gp_start().

19.	Remove extraneous parentheses from srcu_read_lock() etc.

						Thanx, Paul

------------------------------------------------------------------------

 b/include/linux/notifier.h |    5 
 b/include/linux/srcu.h     |    8 
 b/include/linux/srcutiny.h |    6 
 b/include/linux/srcutree.h |   28 +-
 b/kernel/rcu/rcu.h         |    6 
 b/kernel/rcu/srcutree.c    |   19 +
 include/linux/srcutree.h   |  123 ++++++-----
 kernel/rcu/srcutree.c      |  488 +++++++++++++++++++++++----------------------
 8 files changed, 368 insertions(+), 315 deletions(-)

* [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct Paul E. McKenney
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This is a whitespace-only commit with no change in functionality.
Its purpose is to prepare for later commits that: (1) Cause statically
allocated srcu_struct structures to rely on compile-time initialization
and (2) Move fields from the srcu_struct structure to a new srcu_usage
structure.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 include/linux/srcutree.h | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index a6910805f9c5..ac8af12f93b3 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -121,13 +121,13 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
-#define __SRCU_STRUCT_INIT(name, pcpu_name)				\
-{									\
-	.sda = &pcpu_name,						\
-	.lock = __SPIN_LOCK_UNLOCKED(name.lock),			\
-	.srcu_gp_seq_needed = -1UL,					\
-	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),		\
-	__SRCU_DEP_MAP_INIT(name)					\
+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
+{												\
+	.sda = &pcpu_name,									\
+	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
+	.srcu_gp_seq_needed = -1UL,								\
+	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
+	__SRCU_DEP_MAP_INIT(name)								\
 }
 
 /*
@@ -150,15 +150,15 @@ struct srcu_struct {
  * See include/linux/percpu-defs.h for the rules on per-CPU variables.
  */
 #ifdef MODULE
-# define __DEFINE_SRCU(name, is_static)					\
-	is_static struct srcu_struct name;				\
-	extern struct srcu_struct * const __srcu_struct_##name;		\
-	struct srcu_struct * const __srcu_struct_##name			\
+# define __DEFINE_SRCU(name, is_static)								\
+	is_static struct srcu_struct name;							\
+	extern struct srcu_struct * const __srcu_struct_##name;					\
+	struct srcu_struct * const __srcu_struct_##name						\
 		__section("___srcu_struct_ptrs") = &name
 #else
-# define __DEFINE_SRCU(name, is_static)					\
-	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);	\
-	is_static struct srcu_struct name =				\
+# define __DEFINE_SRCU(name, is_static)								\
+	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
+	is_static struct srcu_struct name =							\
 		__SRCU_STRUCT_INIT(name, name##_srcu_data)
 #endif
 #define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
-- 
2.40.0.rc2


* [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-30  4:11   ` Zhang, Qiang1
  2023-03-24  0:19 ` [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update Paul E. McKenney
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

Further shrinking the srcu_struct structure is eased by requiring
that in-module srcu_struct structures rely more heavily on static
initialization.  In particular, this preserves the property that
a module-load-time srcu_struct initialization can fail only due
to memory-allocation failure of the per-CPU srcu_data structures.
It might also slightly improve robustness by reducing the number of
memory allocations that must succeed at module-load time to just the
alloc_percpu() call.

This is in preparation for splitting an srcu_usage structure out
of the srcu_struct structure.
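
For context, a module relying on this static initialization might look
something like the following sketch (names invented for illustration).
After this commit, loading such a module allocates memory only for the
per-CPU ->sda array:

	DEFINE_STATIC_SRCU(my_module_srcu);

	static int my_module_read_path(void)
	{
		int idx = srcu_read_lock(&my_module_srcu);

		/* Read-side critical section goes here. */
		srcu_read_unlock(&my_module_srcu, idx);
		return 0;
	}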

[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 include/linux/srcutree.h | 19 ++++++++++++++-----
 kernel/rcu/srcutree.c    | 19 +++++++++++++------
 2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index ac8af12f93b3..428480152375 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -121,15 +121,24 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
-{												\
-	.sda = &pcpu_name,									\
+#define __SRCU_STRUCT_INIT_COMMON(name)								\
 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
-	__SRCU_DEP_MAP_INIT(name)								\
+	__SRCU_DEP_MAP_INIT(name)
+
+#define __SRCU_STRUCT_INIT_MODULE(name)								\
+{												\
+	__SRCU_STRUCT_INIT_COMMON(name)								\
 }
 
+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
+{												\
+	.sda = &pcpu_name,									\
+	__SRCU_STRUCT_INIT_COMMON(name)								\
+}
+
+
 /*
  * Define and initialize a srcu struct at build time.
  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
@@ -151,7 +160,7 @@ struct srcu_struct {
  */
 #ifdef MODULE
 # define __DEFINE_SRCU(name, is_static)								\
-	is_static struct srcu_struct name;							\
+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
 	extern struct srcu_struct * const __srcu_struct_##name;					\
 	struct srcu_struct * const __srcu_struct_##name						\
 		__section("___srcu_struct_ptrs") = &name
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index cd46fe063e50..7a6d9452a5d0 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1895,13 +1895,14 @@ void __init srcu_init(void)
 static int srcu_module_coming(struct module *mod)
 {
 	int i;
+	struct srcu_struct *ssp;
 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
-	int ret;
 
 	for (i = 0; i < mod->num_srcu_structs; i++) {
-		ret = init_srcu_struct(*(sspp++));
-		if (WARN_ON_ONCE(ret))
-			return ret;
+		ssp = *(sspp++);
+		ssp->sda = alloc_percpu(struct srcu_data);
+		if (WARN_ON_ONCE(!ssp->sda))
+			return -ENOMEM;
 	}
 	return 0;
 }
@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module *mod)
 static void srcu_module_going(struct module *mod)
 {
 	int i;
+	struct srcu_struct *ssp;
 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
 
-	for (i = 0; i < mod->num_srcu_structs; i++)
-		cleanup_srcu_struct(*(sspp++));
+	for (i = 0; i < mod->num_srcu_structs; i++) {
+		ssp = *(sspp++);
+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
+				cleanup_srcu_struct(ssp);
+		free_percpu(ssp->sda);
+	}
 }
 
 /* Handle one module, either coming or going. */
-- 
2.40.0.rc2


* [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU() Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24 19:10   ` Wysocki, Rafael J
  2023-03-24  0:19 ` [PATCH RFC rcu 04/19] srcu: Move ->level from srcu_struct to srcu_usage Paul E. McKenney
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney,
	Rafael J. Wysocki, Michał Mirosław, Dmitry Osipenko

The current srcu_struct structure is on the order of 200 bytes in size
(depending on architecture and .config), which is much better than the
old-style 26K bytes, but still all too inconvenient when one is trying
to achieve good cache locality on a fastpath involving SRCU readers.

However, only a few fields in srcu_struct are used by SRCU readers.
The remaining fields could be offloaded to a new srcu_usage
structure, thus shrinking the srcu_struct structure down to a few
tens of bytes.  This commit begins this noble quest, a quest that is
complicated by open-coded initialization of the srcu_struct within the
srcu_notifier_head structure.  This complication is addressed by updating
the srcu_notifier_head structure's open coding, given that there does
not appear to be a straightforward way of abstracting that initialization.

This commit moves only the ->node pointer to srcu_usage.  Later commits
will move additional fields.
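
For reference, the intended end state of the split can be sketched as
follows, with most fields elided and with later patches in this series
doing the actual moving:

	struct srcu_usage {
		struct srcu_node *node;		/* Moved here by this patch. */
		/* More update-side fields arrive in later patches. */
	};

	struct srcu_struct {
		/* Reader-facing fields stay here... */
		struct srcu_data __percpu *sda;	/* Per-CPU srcu_data array. */
		struct srcu_usage *srcu_sup;	/* ...plus this pointer to the rest. */
	};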

[ paulmck: Fold in qiang1.zhang@intel.com's memory-leak fix. ]

Link: https://lore.kernel.org/all/20230320055751.4120251-1-qiang1.zhang@intel.com/
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
---
 include/linux/notifier.h |  5 ++++-
 include/linux/srcutiny.h |  6 +++---
 include/linux/srcutree.h | 27 ++++++++++++++++++---------
 kernel/rcu/rcu.h         |  6 ++++--
 kernel/rcu/srcutree.c    | 28 +++++++++++++++++++---------
 5 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index aef88c2d1173..2aba75145144 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -73,6 +73,9 @@ struct raw_notifier_head {
 
 struct srcu_notifier_head {
 	struct mutex mutex;
+#ifdef CONFIG_TREE_SRCU
+	struct srcu_usage srcuu;
+#endif
 	struct srcu_struct srcu;
 	struct notifier_block __rcu *head;
 };
@@ -107,7 +110,7 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
 	{							\
 		.mutex = __MUTEX_INITIALIZER(name.mutex),	\
 		.head = NULL,					\
-		.srcu = __SRCU_STRUCT_INIT(name.srcu, pcpu),	\
+		.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
 	}
 
 #define ATOMIC_NOTIFIER_HEAD(name)				\
diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
index 5aa5e0faf6a1..ebd72491af99 100644
--- a/include/linux/srcutiny.h
+++ b/include/linux/srcutiny.h
@@ -31,7 +31,7 @@ struct srcu_struct {
 
 void srcu_drive_gp(struct work_struct *wp);
 
-#define __SRCU_STRUCT_INIT(name, __ignored)				\
+#define __SRCU_STRUCT_INIT(name, __ignored, ___ignored)			\
 {									\
 	.srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq),	\
 	.srcu_cb_tail = &name.srcu_cb_head,				\
@@ -44,9 +44,9 @@ void srcu_drive_gp(struct work_struct *wp);
  * Tree SRCU, which needs some per-CPU data.
  */
 #define DEFINE_SRCU(name) \
-	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
+	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
 #define DEFINE_STATIC_SRCU(name) \
-	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
+	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
 
 void synchronize_srcu(struct srcu_struct *ssp);
 
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 428480152375..2689e64024bb 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -57,11 +57,17 @@ struct srcu_node {
 	int grphi;				/* Biggest CPU for node. */
 };
 
+/*
+ * Per-SRCU-domain structure, update-side data linked from srcu_struct.
+ */
+struct srcu_usage {
+	struct srcu_node *node;			/* Combining tree. */
+};
+
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	struct srcu_node *node;			/* Combining tree. */
 	struct srcu_node *level[RCU_NUM_LVLS + 1];
 						/* First node at each level. */
 	int srcu_size_state;			/* Small-to-big transition state. */
@@ -90,6 +96,7 @@ struct srcu_struct {
 	unsigned long reschedule_count;
 	struct delayed_work work;
 	struct lockdep_map dep_map;
+	struct srcu_usage *srcu_sup;		/* Update-side data. */
 };
 
 // Values for size state variable (->srcu_size_state).  Once the state
@@ -121,24 +128,24 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
-#define __SRCU_STRUCT_INIT_COMMON(name)								\
+#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
+	.srcu_sup = &usage_name,								\
 	__SRCU_DEP_MAP_INIT(name)
 
-#define __SRCU_STRUCT_INIT_MODULE(name)								\
+#define __SRCU_STRUCT_INIT_MODULE(name, usage_name)						\
 {												\
-	__SRCU_STRUCT_INIT_COMMON(name)								\
+	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
 }
 
-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
+#define __SRCU_STRUCT_INIT(name, usage_name, pcpu_name)						\
 {												\
 	.sda = &pcpu_name,									\
-	__SRCU_STRUCT_INIT_COMMON(name)								\
+	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
 }
 
-
 /*
  * Define and initialize a srcu struct at build time.
  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
@@ -160,15 +167,17 @@ struct srcu_struct {
  */
 #ifdef MODULE
 # define __DEFINE_SRCU(name, is_static)								\
-	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
+	static struct srcu_usage name##_srcu_usage;						\
+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
 	extern struct srcu_struct * const __srcu_struct_##name;					\
 	struct srcu_struct * const __srcu_struct_##name						\
 		__section("___srcu_struct_ptrs") = &name
 #else
 # define __DEFINE_SRCU(name, is_static)								\
 	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
+	static struct srcu_usage name##_srcu_usage;						\
 	is_static struct srcu_struct name =							\
-		__SRCU_STRUCT_INIT(name, name##_srcu_data)
+		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
 #endif
 #define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
 #define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index a3adcf9a9919..4a1b9622598b 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -378,11 +378,13 @@ extern void rcu_init_geometry(void);
  * specified state structure (for SRCU) or the only rcu_state structure
  * (for RCU).
  */
-#define srcu_for_each_node_breadth_first(sp, rnp) \
+#define _rcu_for_each_node_breadth_first(sp, rnp) \
 	for ((rnp) = &(sp)->node[0]; \
 	     (rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++)
 #define rcu_for_each_node_breadth_first(rnp) \
-	srcu_for_each_node_breadth_first(&rcu_state, rnp)
+	_rcu_for_each_node_breadth_first(&rcu_state, rnp)
+#define srcu_for_each_node_breadth_first(ssp, rnp) \
+	_rcu_for_each_node_breadth_first(ssp->srcu_sup, rnp)
 
 /*
  * Scan the leaves of the rcu_node hierarchy for the rcu_state structure.
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 7a6d9452a5d0..ad1d5ca42a99 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -173,12 +173,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 
 	/* Initialize geometry if it has not already been initialized. */
 	rcu_init_geometry();
-	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), gfp_flags);
-	if (!ssp->node)
+	ssp->srcu_sup->node = kcalloc(rcu_num_nodes, sizeof(*ssp->srcu_sup->node), gfp_flags);
+	if (!ssp->srcu_sup->node)
 		return false;
 
 	/* Work out the overall tree geometry. */
-	ssp->level[0] = &ssp->node[0];
+	ssp->level[0] = &ssp->srcu_sup->node[0];
 	for (i = 1; i < rcu_num_lvls; i++)
 		ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
 	rcu_init_levelspread(levelspread, num_rcu_lvl);
@@ -195,7 +195,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 		snp->srcu_gp_seq_needed_exp = SRCU_SNP_INIT_SEQ;
 		snp->grplo = -1;
 		snp->grphi = -1;
-		if (snp == &ssp->node[0]) {
+		if (snp == &ssp->srcu_sup->node[0]) {
 			/* Root node, special case. */
 			snp->srcu_parent = NULL;
 			continue;
@@ -236,8 +236,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
  */
 static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 {
+	if (!is_static)
+		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
+	if (!ssp->srcu_sup)
+		return -ENOMEM;
 	ssp->srcu_size_state = SRCU_SIZE_SMALL;
-	ssp->node = NULL;
+	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
@@ -249,8 +253,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	ssp->sda_is_static = is_static;
 	if (!is_static)
 		ssp->sda = alloc_percpu(struct srcu_data);
-	if (!ssp->sda)
+	if (!ssp->sda) {
+		if (!is_static)
+			kfree(ssp->srcu_sup);
 		return -ENOMEM;
+	}
 	init_srcu_struct_data(ssp);
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
@@ -259,6 +266,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 			if (!ssp->sda_is_static) {
 				free_percpu(ssp->sda);
 				ssp->sda = NULL;
+				kfree(ssp->srcu_sup);
 				return -ENOMEM;
 			}
 		} else {
@@ -656,13 +664,15 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
 		return; /* Caller forgot to stop doing call_srcu()? */
 	}
+	kfree(ssp->srcu_sup->node);
+	ssp->srcu_sup->node = NULL;
+	ssp->srcu_size_state = SRCU_SIZE_SMALL;
 	if (!ssp->sda_is_static) {
 		free_percpu(ssp->sda);
 		ssp->sda = NULL;
+		kfree(ssp->srcu_sup);
+		ssp->srcu_sup = NULL;
 	}
-	kfree(ssp->node);
-	ssp->node = NULL;
-	ssp->srcu_size_state = SRCU_SIZE_SMALL;
 }
 EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
 
-- 
2.40.0.rc2


* [PATCH RFC rcu 04/19] srcu: Move ->level from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (2 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 05/19] srcu: Move ->srcu_size_state " Paul E. McKenney
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->level[] array from the srcu_struct structure to
the srcu_usage structure to reduce the size of the former in order to
improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  4 ++--
 kernel/rcu/srcutree.c    | 14 +++++++-------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 2689e64024bb..362c8f39c53d 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -62,14 +62,14 @@ struct srcu_node {
  */
 struct srcu_usage {
 	struct srcu_node *node;			/* Combining tree. */
+	struct srcu_node *level[RCU_NUM_LVLS + 1];
+						/* First node at each level. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	struct srcu_node *level[RCU_NUM_LVLS + 1];
-						/* First node at each level. */
 	int srcu_size_state;			/* Small-to-big transition state. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index ad1d5ca42a99..90d753e10e33 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -178,9 +178,9 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 		return false;
 
 	/* Work out the overall tree geometry. */
-	ssp->level[0] = &ssp->srcu_sup->node[0];
+	ssp->srcu_sup->level[0] = &ssp->srcu_sup->node[0];
 	for (i = 1; i < rcu_num_lvls; i++)
-		ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
+		ssp->srcu_sup->level[i] = ssp->srcu_sup->level[i - 1] + num_rcu_lvl[i - 1];
 	rcu_init_levelspread(levelspread, num_rcu_lvl);
 
 	/* Each pass through this loop initializes one srcu_node structure. */
@@ -202,10 +202,10 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 		}
 
 		/* Non-root node. */
-		if (snp == ssp->level[level + 1])
+		if (snp == ssp->srcu_sup->level[level + 1])
 			level++;
-		snp->srcu_parent = ssp->level[level - 1] +
-				   (snp - ssp->level[level]) /
+		snp->srcu_parent = ssp->srcu_sup->level[level - 1] +
+				   (snp - ssp->srcu_sup->level[level]) /
 				   levelspread[level - 1];
 	}
 
@@ -214,7 +214,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 	 * leaves of the srcu_node tree.
 	 */
 	level = rcu_num_lvls - 1;
-	snp_first = ssp->level[level];
+	snp_first = ssp->srcu_sup->level[level];
 	for_each_possible_cpu(cpu) {
 		sdp = per_cpu_ptr(ssp->sda, cpu);
 		sdp->mynode = &snp_first[cpu / levelspread[level]];
@@ -889,7 +889,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		srcu_for_each_node_breadth_first(ssp, snp) {
 			spin_lock_irq_rcu_node(snp);
 			cbs = false;
-			last_lvl = snp >= ssp->level[rcu_num_lvls - 1];
+			last_lvl = snp >= ssp->srcu_sup->level[rcu_num_lvls - 1];
 			if (last_lvl)
 				cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq;
 			snp->srcu_have_cbs[idx] = gpseq;
-- 
2.40.0.rc2


* [PATCH RFC rcu 05/19] srcu: Move ->srcu_size_state from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (3 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 04/19] srcu: Move ->level from srcu_struct to srcu_usage Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 06/19] srcu: Move ->srcu_cb_mutex " Paul E. McKenney
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_size_state field from the srcu_struct
structure to the srcu_usage structure to reduce the size of the former
in order to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  2 +-
 kernel/rcu/srcutree.c    | 37 +++++++++++++++++++------------------
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 362c8f39c53d..72fb01fb2eb5 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -64,13 +64,13 @@ struct srcu_usage {
 	struct srcu_node *node;			/* Combining tree. */
 	struct srcu_node *level[RCU_NUM_LVLS + 1];
 						/* First node at each level. */
+	int srcu_size_state;			/* Small-to-big transition state. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	int srcu_size_state;			/* Small-to-big transition state. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 90d753e10e33..2717217de136 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -225,7 +225,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 		}
 		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
 	}
-	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
+	smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
 	return true;
 }
 
@@ -240,7 +240,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
 	if (!ssp->srcu_sup)
 		return -ENOMEM;
-	ssp->srcu_size_state = SRCU_SIZE_SMALL;
+	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_gp_mutex);
@@ -261,7 +261,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	init_srcu_struct_data(ssp);
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
-	if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
+	if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
 			if (!ssp->sda_is_static) {
 				free_percpu(ssp->sda);
@@ -270,7 +270,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 				return -ENOMEM;
 			}
 		} else {
-			WRITE_ONCE(ssp->srcu_size_state, SRCU_SIZE_BIG);
+			WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
 		}
 	}
 	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
@@ -315,7 +315,7 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
 static void __srcu_transition_to_big(struct srcu_struct *ssp)
 {
 	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
-	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_ALLOC);
+	smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_ALLOC);
 }
 
 /*
@@ -326,10 +326,10 @@ static void srcu_transition_to_big(struct srcu_struct *ssp)
 	unsigned long flags;
 
 	/* Double-checked locking on ->srcu_size-state. */
-	if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL)
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL)
 		return;
 	spin_lock_irqsave_rcu_node(ssp, flags);
-	if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL) {
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL) {
 		spin_unlock_irqrestore_rcu_node(ssp, flags);
 		return;
 	}
@@ -345,7 +345,7 @@ static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp)
 {
 	unsigned long j;
 
-	if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_size_state)
+	if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_sup->srcu_size_state)
 		return;
 	j = jiffies;
 	if (ssp->srcu_size_jiffies != j) {
@@ -666,7 +666,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 	}
 	kfree(ssp->srcu_sup->node);
 	ssp->srcu_sup->node = NULL;
-	ssp->srcu_size_state = SRCU_SIZE_SMALL;
+	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	if (!ssp->sda_is_static) {
 		free_percpu(ssp->sda);
 		ssp->sda = NULL;
@@ -770,7 +770,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 	struct srcu_data *sdp;
 	int state;
 
-	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
 	else
 		sdp = this_cpu_ptr(ssp->sda);
@@ -880,7 +880,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	/* A new grace period can start at this point.  But only one. */
 
 	/* Initiate callback invocation as needed. */
-	ss_state = smp_load_acquire(&ssp->srcu_size_state);
+	ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
 	if (ss_state < SRCU_SIZE_WAIT_BARRIER) {
 		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, get_boot_cpu_id()),
 					cbdelay);
@@ -940,7 +940,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		if (ss_state == SRCU_SIZE_ALLOC)
 			init_srcu_struct_nodes(ssp, GFP_KERNEL);
 		else
-			smp_store_release(&ssp->srcu_size_state, ss_state + 1);
+			smp_store_release(&ssp->srcu_sup->srcu_size_state, ss_state + 1);
 	}
 }
 
@@ -1002,7 +1002,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	unsigned long snp_seq;
 
 	/* Ensure that snp node tree is fully initialized before traversing it */
-	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		snp_leaf = NULL;
 	else
 		snp_leaf = sdp->mynode;
@@ -1229,7 +1229,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	 * sequence number cannot wrap around in the meantime.
 	 */
 	idx = __srcu_read_lock_nmisafe(ssp);
-	ss_state = smp_load_acquire(&ssp->srcu_size_state);
+	ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
 	if (ss_state < SRCU_SIZE_WAIT_CALL)
 		sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
 	else
@@ -1568,7 +1568,7 @@ void srcu_barrier(struct srcu_struct *ssp)
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);
 
 	idx = __srcu_read_lock_nmisafe(ssp);
-	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda,	get_boot_cpu_id()));
 	else
 		for_each_possible_cpu(cpu)
@@ -1806,7 +1806,7 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 	int cpu;
 	int idx;
 	unsigned long s0 = 0, s1 = 0;
-	int ss_state = READ_ONCE(ssp->srcu_size_state);
+	int ss_state = READ_ONCE(ssp->srcu_sup->srcu_size_state);
 	int ss_state_idx = ss_state;
 
 	idx = ssp->srcu_idx & 0x1;
@@ -1893,8 +1893,9 @@ void __init srcu_init(void)
 		ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
 				      work.work.entry);
 		list_del_init(&ssp->work.work.entry);
-		if (SRCU_SIZING_IS(SRCU_SIZING_INIT) && ssp->srcu_size_state == SRCU_SIZE_SMALL)
-			ssp->srcu_size_state = SRCU_SIZE_ALLOC;
+		if (SRCU_SIZING_IS(SRCU_SIZING_INIT) &&
+		    ssp->srcu_sup->srcu_size_state == SRCU_SIZE_SMALL)
+			ssp->srcu_sup->srcu_size_state = SRCU_SIZE_ALLOC;
 		queue_work(rcu_gp_wq, &ssp->work.work);
 	}
 }
-- 
2.40.0.rc2


* [PATCH RFC rcu 06/19] srcu: Move ->srcu_cb_mutex from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (4 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 05/19] srcu: Move ->srcu_size_state " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 07/19] srcu: Move ->lock initialization after srcu_usage allocation Paul E. McKenney
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_cb_mutex field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 2 +-
 kernel/rcu/srcutree.c    | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 72fb01fb2eb5..69d49de87b67 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -65,13 +65,13 @@ struct srcu_usage {
 	struct srcu_node *level[RCU_NUM_LVLS + 1];
 						/* First node at each level. */
 	int srcu_size_state;			/* Small-to-big transition state. */
+	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 	unsigned int srcu_idx;			/* Current rdr array element. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 2717217de136..6b6d92fe41cf 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -242,7 +242,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		return -ENOMEM;
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
-	mutex_init(&ssp->srcu_cb_mutex);
+	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
 	ssp->srcu_gp_seq = 0;
@@ -861,7 +861,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	int ss_state;
 
 	/* Prevent more than one additional grace period. */
-	mutex_lock(&ssp->srcu_cb_mutex);
+	mutex_lock(&ssp->srcu_sup->srcu_cb_mutex);
 
 	/* End the current grace period. */
 	spin_lock_irq_rcu_node(ssp);
@@ -921,7 +921,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		}
 
 	/* Callback initiation done, allow grace periods after next. */
-	mutex_unlock(&ssp->srcu_cb_mutex);
+	mutex_unlock(&ssp->srcu_sup->srcu_cb_mutex);
 
 	/* Start a new grace period if needed. */
 	spin_lock_irq_rcu_node(ssp);
-- 
2.40.0.rc2


* [PATCH RFC rcu 07/19] srcu: Move ->lock initialization after srcu_usage allocation
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (5 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 06/19] srcu: Move ->srcu_cb_mutex " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 08/19] srcu: Move ->lock from srcu_struct to srcu_usage Paul E. McKenney
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

Currently, both __init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=y kernels
and init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=n kernels initialize
the srcu_struct structure's ->lock before the srcu_usage structure has
been allocated.  This of course prevents the ->lock from being moved
to the srcu_usage structure, so this commit moves the initialization
into init_srcu_struct_fields(), after the srcu_usage structure has
been allocated.
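
Sketched concretely, the hazard being avoided is that once ->lock lives
in the dynamically allocated srcu_usage structure (as of the next patch),
initializing it before init_srcu_struct_fields() runs would dereference
a not-yet-allocated pointer:

	int init_srcu_struct(struct srcu_struct *ssp)
	{
		/* BAD once ->lock moves: ssp->srcu_sup is not yet allocated. */
		spin_lock_init(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
		return init_srcu_struct_fields(ssp, false);	/* Allocates ->srcu_sup. */
	}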

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 6b6d92fe41cf..f6f37e34a1e0 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -240,6 +240,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
 	if (!ssp->srcu_sup)
 		return -ENOMEM;
+	if (!is_static)
+		spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
@@ -285,7 +287,6 @@ int __init_srcu_struct(struct srcu_struct *ssp, const char *name,
 	/* Don't re-initialize a lock while it is held. */
 	debug_check_no_locks_freed((void *)ssp, sizeof(*ssp));
 	lockdep_init_map(&ssp->dep_map, name, key, 0);
-	spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
 	return init_srcu_struct_fields(ssp, false);
 }
 EXPORT_SYMBOL_GPL(__init_srcu_struct);
@@ -302,7 +303,6 @@ EXPORT_SYMBOL_GPL(__init_srcu_struct);
  */
 int init_srcu_struct(struct srcu_struct *ssp)
 {
-	spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
 	return init_srcu_struct_fields(ssp, false);
 }
 EXPORT_SYMBOL_GPL(init_srcu_struct);
-- 
2.40.0.rc2


* [PATCH RFC rcu 08/19] srcu: Move ->lock from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (6 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 07/19] srcu: Move ->lock initialization after srcu_usage allocation Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 09/19] srcu: Move ->srcu_gp_mutex " Paul E. McKenney
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->lock field from the srcu_struct structure to
the srcu_usage structure to reduce the size of the former in order to
improve cache locality.
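
One reason this move is largely mechanical is that the *_rcu_node()
locking wrappers defined near the top of kernel/rcu/srcutree.c accept
any structure having a private ->lock field, roughly along these lines:

	#define spin_lock_irq_rcu_node(p)					\
	do {									\
		spin_lock_irq(&ACCESS_PRIVATE(p, lock));			\
		smp_mb__after_unlock_lock();					\
	} while (0)

As a result, the call sites below need only pass ssp->srcu_sup where
they previously passed ssp.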

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 11 +++++---
 kernel/rcu/srcutree.c    | 56 ++++++++++++++++++++--------------------
 2 files changed, 35 insertions(+), 32 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 69d49de87b67..6fe5743c1179 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -66,13 +66,13 @@ struct srcu_usage {
 						/* First node at each level. */
 	int srcu_size_state;			/* Small-to-big transition state. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
+	spinlock_t __private lock;		/* Protect counters and size state. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	spinlock_t __private lock;		/* Protect counters and size state. */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
@@ -129,7 +129,6 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN2	2
 
 #define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
-	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
 	.srcu_sup = &usage_name,								\
@@ -167,7 +166,9 @@ struct srcu_struct {
  */
 #ifdef MODULE
 # define __DEFINE_SRCU(name, is_static)								\
-	static struct srcu_usage name##_srcu_usage;						\
+	static struct srcu_usage name##_srcu_usage = {						\
+		.lock = __SPIN_LOCK_UNLOCKED(name.lock),					\
+	};											\
 	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
 	extern struct srcu_struct * const __srcu_struct_##name;					\
 	struct srcu_struct * const __srcu_struct_##name						\
@@ -175,7 +176,9 @@ struct srcu_struct {
 #else
 # define __DEFINE_SRCU(name, is_static)								\
 	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
-	static struct srcu_usage name##_srcu_usage;						\
+	static struct srcu_usage name##_srcu_usage = {						\
+		.lock = __SPIN_LOCK_UNLOCKED(name.lock),					\
+	};											\
 	is_static struct srcu_struct name =							\
 		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
 #endif
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index f6f37e34a1e0..579c43ff0f4b 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -103,7 +103,7 @@ do {										\
 
 #define spin_trylock_irqsave_rcu_node(p, flags)					\
 ({										\
-	bool ___locked = spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags);	\
+	bool ___locked = spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \
 										\
 	if (___locked)								\
 		smp_mb__after_unlock_lock();					\
@@ -241,7 +241,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	if (!ssp->srcu_sup)
 		return -ENOMEM;
 	if (!is_static)
-		spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
+		spin_lock_init(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
@@ -314,7 +314,7 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
  */
 static void __srcu_transition_to_big(struct srcu_struct *ssp)
 {
-	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
+	lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
 	smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_ALLOC);
 }
 
@@ -328,13 +328,13 @@ static void srcu_transition_to_big(struct srcu_struct *ssp)
 	/* Double-checked locking on ->srcu_size-state. */
 	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL)
 		return;
-	spin_lock_irqsave_rcu_node(ssp, flags);
+	spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags);
 	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL) {
-		spin_unlock_irqrestore_rcu_node(ssp, flags);
+		spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 		return;
 	}
 	__srcu_transition_to_big(ssp);
-	spin_unlock_irqrestore_rcu_node(ssp, flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
 /*
@@ -369,9 +369,9 @@ static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned lon
 
 	if (spin_trylock_irqsave_rcu_node(sdp, *flags))
 		return;
-	spin_lock_irqsave_rcu_node(ssp, *flags);
+	spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags);
 	spin_lock_irqsave_check_contention(ssp);
-	spin_unlock_irqrestore_rcu_node(ssp, *flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, *flags);
 	spin_lock_irqsave_rcu_node(sdp, *flags);
 }
 
@@ -383,9 +383,9 @@ static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned lon
  */
 static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags)
 {
-	if (spin_trylock_irqsave_rcu_node(ssp, *flags))
+	if (spin_trylock_irqsave_rcu_node(ssp->srcu_sup, *flags))
 		return;
-	spin_lock_irqsave_rcu_node(ssp, *flags);
+	spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags);
 	spin_lock_irqsave_check_contention(ssp);
 }
 
@@ -404,13 +404,13 @@ static void check_init_srcu_struct(struct srcu_struct *ssp)
 	/* The smp_load_acquire() pairs with the smp_store_release(). */
 	if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq_needed))) /*^^^*/
 		return; /* Already initialized. */
-	spin_lock_irqsave_rcu_node(ssp, flags);
+	spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags);
 	if (!rcu_seq_state(ssp->srcu_gp_seq_needed)) {
-		spin_unlock_irqrestore_rcu_node(ssp, flags);
+		spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 		return;
 	}
 	init_srcu_struct_fields(ssp, true);
-	spin_unlock_irqrestore_rcu_node(ssp, flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
 /*
@@ -774,7 +774,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 		sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
 	else
 		sdp = this_cpu_ptr(ssp->sda);
-	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
+	lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
 	WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
 	spin_lock_rcu_node(sdp);  /* Interrupts already disabled. */
 	rcu_segcblist_advance(&sdp->srcu_cblist,
@@ -864,7 +864,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	mutex_lock(&ssp->srcu_sup->srcu_cb_mutex);
 
 	/* End the current grace period. */
-	spin_lock_irq_rcu_node(ssp);
+	spin_lock_irq_rcu_node(ssp->srcu_sup);
 	idx = rcu_seq_state(ssp->srcu_gp_seq);
 	WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
 	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
@@ -875,7 +875,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
 	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, gpseq))
 		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, gpseq);
-	spin_unlock_irq_rcu_node(ssp);
+	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	mutex_unlock(&ssp->srcu_gp_mutex);
 	/* A new grace period can start at this point.  But only one. */
 
@@ -924,15 +924,15 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	mutex_unlock(&ssp->srcu_sup->srcu_cb_mutex);
 
 	/* Start a new grace period if needed. */
-	spin_lock_irq_rcu_node(ssp);
+	spin_lock_irq_rcu_node(ssp->srcu_sup);
 	gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
 	if (!rcu_seq_state(gpseq) &&
 	    ULONG_CMP_LT(gpseq, ssp->srcu_gp_seq_needed)) {
 		srcu_gp_start(ssp);
-		spin_unlock_irq_rcu_node(ssp);
+		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 		srcu_reschedule(ssp, 0);
 	} else {
-		spin_unlock_irq_rcu_node(ssp);
+		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	}
 
 	/* Transition to big if needed. */
@@ -975,7 +975,7 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 	spin_lock_irqsave_ssp_contention(ssp, &flags);
 	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
 		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
-	spin_unlock_irqrestore_rcu_node(ssp, flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
 /*
@@ -1064,7 +1064,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 		else if (list_empty(&ssp->work.work.entry))
 			list_add(&ssp->work.work.entry, &srcu_boot_list);
 	}
-	spin_unlock_irqrestore_rcu_node(ssp, flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
 /*
@@ -1621,17 +1621,17 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 	 */
 	idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq)); /* ^^^ */
 	if (idx == SRCU_STATE_IDLE) {
-		spin_lock_irq_rcu_node(ssp);
+		spin_lock_irq_rcu_node(ssp->srcu_sup);
 		if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
 			WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq));
-			spin_unlock_irq_rcu_node(ssp);
+			spin_unlock_irq_rcu_node(ssp->srcu_sup);
 			mutex_unlock(&ssp->srcu_gp_mutex);
 			return;
 		}
 		idx = rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq));
 		if (idx == SRCU_STATE_IDLE)
 			srcu_gp_start(ssp);
-		spin_unlock_irq_rcu_node(ssp);
+		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 		if (idx != SRCU_STATE_IDLE) {
 			mutex_unlock(&ssp->srcu_gp_mutex);
 			return; /* Someone else started the grace period. */
@@ -1645,10 +1645,10 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 			return; /* readers present, retry later. */
 		}
 		srcu_flip(ssp);
-		spin_lock_irq_rcu_node(ssp);
+		spin_lock_irq_rcu_node(ssp->srcu_sup);
 		rcu_seq_set_state(&ssp->srcu_gp_seq, SRCU_STATE_SCAN2);
 		ssp->srcu_n_exp_nodelay = 0;
-		spin_unlock_irq_rcu_node(ssp);
+		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	}
 
 	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN2) {
@@ -1732,7 +1732,7 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
 {
 	bool pushgp = true;
 
-	spin_lock_irq_rcu_node(ssp);
+	spin_lock_irq_rcu_node(ssp->srcu_sup);
 	if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
 		if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq))) {
 			/* All requests fulfilled, time to go idle. */
@@ -1742,7 +1742,7 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
 		/* Outstanding request and no GP.  Start one. */
 		srcu_gp_start(ssp);
 	}
-	spin_unlock_irq_rcu_node(ssp);
+	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 
 	if (pushgp)
 		queue_delayed_work(rcu_gp_wq, &ssp->work, delay);
-- 
2.40.0.rc2


* [PATCH RFC rcu 09/19] srcu: Move ->srcu_gp_mutex from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (7 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 08/19] srcu: Move ->lock from srcu_struct to srcu_usage Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 10/19] srcu: Move grace-period fields " Paul E. McKenney
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_gp_mutex field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  2 +-
 kernel/rcu/srcutree.c    | 14 +++++++-------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 6fe5743c1179..6402cd5a93ef 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -67,13 +67,13 @@ struct srcu_usage {
 	int srcu_size_state;			/* Small-to-big transition state. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
+	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
 	unsigned long srcu_gp_seq_needed;	/* Latest gp_seq needed. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 579c43ff0f4b..3bf3408c7716 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -245,7 +245,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
-	mutex_init(&ssp->srcu_gp_mutex);
+	mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
 	ssp->srcu_gp_seq = 0;
 	ssp->srcu_barrier_seq = 0;
@@ -876,7 +876,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, gpseq))
 		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, gpseq);
 	spin_unlock_irq_rcu_node(ssp->srcu_sup);
-	mutex_unlock(&ssp->srcu_gp_mutex);
+	mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 	/* A new grace period can start at this point.  But only one. */
 
 	/* Initiate callback invocation as needed. */
@@ -1607,7 +1607,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 {
 	int idx;
 
-	mutex_lock(&ssp->srcu_gp_mutex);
+	mutex_lock(&ssp->srcu_sup->srcu_gp_mutex);
 
 	/*
 	 * Because readers might be delayed for an extended period after
@@ -1625,7 +1625,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
 			WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq));
 			spin_unlock_irq_rcu_node(ssp->srcu_sup);
-			mutex_unlock(&ssp->srcu_gp_mutex);
+			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return;
 		}
 		idx = rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq));
@@ -1633,7 +1633,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 			srcu_gp_start(ssp);
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 		if (idx != SRCU_STATE_IDLE) {
-			mutex_unlock(&ssp->srcu_gp_mutex);
+			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return; /* Someone else started the grace period. */
 		}
 	}
@@ -1641,7 +1641,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN1) {
 		idx = 1 ^ (ssp->srcu_idx & 1);
 		if (!try_check_zero(ssp, idx, 1)) {
-			mutex_unlock(&ssp->srcu_gp_mutex);
+			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return; /* readers present, retry later. */
 		}
 		srcu_flip(ssp);
@@ -1659,7 +1659,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		 */
 		idx = 1 ^ (ssp->srcu_idx & 1);
 		if (!try_check_zero(ssp, idx, 2)) {
-			mutex_unlock(&ssp->srcu_gp_mutex);
+			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return; /* readers present, retry later. */
 		}
 		ssp->srcu_n_exp_nodelay = 0;
-- 
2.40.0.rc2


* [PATCH RFC rcu 10/19] srcu: Move grace-period fields from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (8 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 09/19] srcu: Move ->srcu_gp_mutex " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 11/19] srcu: Move heuristics " Paul E. McKenney
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed,
->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields
from the srcu_struct structure to the srcu_usage structure to reduce
the size of the former in order to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  25 ++++----
 kernel/rcu/srcutree.c    | 126 +++++++++++++++++++--------------------
 2 files changed, 76 insertions(+), 75 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 6402cd5a93ef..7458df6bbe00 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -68,6 +68,11 @@ struct srcu_usage {
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
+	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
+	unsigned long srcu_gp_seq_needed;	/* Latest gp_seq needed. */
+	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
+	unsigned long srcu_gp_start;		/* Last GP start timestamp (jiffies) */
+	unsigned long srcu_last_gp_end;		/* Last GP end timestamp (ns) */
 };
 
 /*
@@ -75,11 +80,6 @@ struct srcu_usage {
  */
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
-	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
-	unsigned long srcu_gp_seq_needed;	/* Latest gp_seq needed. */
-	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
-	unsigned long srcu_gp_start;		/* Last GP start timestamp (jiffies) */
-	unsigned long srcu_last_gp_end;		/* Last GP end timestamp (ns) */
 	unsigned long srcu_size_jiffies;	/* Current contention-measurement interval. */
 	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
 	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
@@ -128,8 +128,13 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
-#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
+#define __SRCU_USAGE_INIT(name)									\
+{												\
+	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
+}
+
+#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
 	.srcu_sup = &usage_name,								\
 	__SRCU_DEP_MAP_INIT(name)
@@ -166,9 +171,7 @@ struct srcu_struct {
  */
 #ifdef MODULE
 # define __DEFINE_SRCU(name, is_static)								\
-	static struct srcu_usage name##_srcu_usage = {						\
-		.lock = __SPIN_LOCK_UNLOCKED(name.lock),					\
-	};											\
+	static struct srcu_usage name##_srcu_usage = __SRCU_USAGE_INIT(name##_srcu_usage);	\
 	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
 	extern struct srcu_struct * const __srcu_struct_##name;					\
 	struct srcu_struct * const __srcu_struct_##name						\
@@ -176,9 +179,7 @@ struct srcu_struct {
 #else
 # define __DEFINE_SRCU(name, is_static)								\
 	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
-	static struct srcu_usage name##_srcu_usage = {						\
-		.lock = __SPIN_LOCK_UNLOCKED(name.lock),					\
-	};											\
+	static struct srcu_usage name##_srcu_usage = __SRCU_USAGE_INIT(name##_srcu_usage);	\
 	is_static struct srcu_struct name =							\
 		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
 #endif
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 3bf3408c7716..6c1ac78b1cfc 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -135,8 +135,8 @@ static void init_srcu_struct_data(struct srcu_struct *ssp)
 		spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
 		rcu_segcblist_init(&sdp->srcu_cblist);
 		sdp->srcu_cblist_invoking = false;
-		sdp->srcu_gp_seq_needed = ssp->srcu_gp_seq;
-		sdp->srcu_gp_seq_needed_exp = ssp->srcu_gp_seq;
+		sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq;
+		sdp->srcu_gp_seq_needed_exp = ssp->srcu_sup->srcu_gp_seq;
 		sdp->mynode = NULL;
 		sdp->cpu = cpu;
 		INIT_WORK(&sdp->work, srcu_invoke_callbacks);
@@ -247,7 +247,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
-	ssp->srcu_gp_seq = 0;
+	ssp->srcu_sup->srcu_gp_seq = 0;
 	ssp->srcu_barrier_seq = 0;
 	mutex_init(&ssp->srcu_barrier_mutex);
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
@@ -261,8 +261,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		return -ENOMEM;
 	}
 	init_srcu_struct_data(ssp);
-	ssp->srcu_gp_seq_needed_exp = 0;
-	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
+	ssp->srcu_sup->srcu_gp_seq_needed_exp = 0;
+	ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
 			if (!ssp->sda_is_static) {
@@ -275,7 +275,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 			WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
 		}
 	}
-	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
+	smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
 	return 0;
 }
 
@@ -402,10 +402,10 @@ static void check_init_srcu_struct(struct srcu_struct *ssp)
 	unsigned long flags;
 
 	/* The smp_load_acquire() pairs with the smp_store_release(). */
-	if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq_needed))) /*^^^*/
+	if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed))) /*^^^*/
 		return; /* Already initialized. */
 	spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags);
-	if (!rcu_seq_state(ssp->srcu_gp_seq_needed)) {
+	if (!rcu_seq_state(ssp->srcu_sup->srcu_gp_seq_needed)) {
 		spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 		return;
 	}
@@ -616,11 +616,11 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 	unsigned long j;
 	unsigned long jbase = SRCU_INTERVAL;
 
-	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
+	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
 		jbase = 0;
-	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq))) {
+	if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq))) {
 		j = jiffies - 1;
-		gpstart = READ_ONCE(ssp->srcu_gp_start);
+		gpstart = READ_ONCE(ssp->srcu_sup->srcu_gp_start);
 		if (time_after(j, gpstart))
 			jbase += j - gpstart;
 		if (!jbase) {
@@ -656,12 +656,12 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 		if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
 			return; /* Forgot srcu_barrier(), so just leak it! */
 	}
-	if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
-	    WARN_ON(rcu_seq_current(&ssp->srcu_gp_seq) != ssp->srcu_gp_seq_needed) ||
+	if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
+	    WARN_ON(rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq) != ssp->srcu_sup->srcu_gp_seq_needed) ||
 	    WARN_ON(srcu_readers_active(ssp))) {
 		pr_info("%s: Active srcu_struct %p read state: %d gp state: %lu/%lu\n",
-			__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)),
-			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
+			__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)),
+			rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq), ssp->srcu_sup->srcu_gp_seq_needed);
 		return; /* Caller forgot to stop doing call_srcu()? */
 	}
 	kfree(ssp->srcu_sup->node);
@@ -775,18 +775,18 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 	else
 		sdp = this_cpu_ptr(ssp->sda);
 	lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
-	WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
+	WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
 	spin_lock_rcu_node(sdp);  /* Interrupts already disabled. */
 	rcu_segcblist_advance(&sdp->srcu_cblist,
-			      rcu_seq_current(&ssp->srcu_gp_seq));
+			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
 	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
-				       rcu_seq_snap(&ssp->srcu_gp_seq));
+				       rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
 	spin_unlock_rcu_node(sdp);  /* Interrupts remain disabled. */
-	WRITE_ONCE(ssp->srcu_gp_start, jiffies);
+	WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
 	WRITE_ONCE(ssp->srcu_n_exp_nodelay, 0);
 	smp_mb(); /* Order prior store to ->srcu_gp_seq_needed vs. GP start. */
-	rcu_seq_start(&ssp->srcu_gp_seq);
-	state = rcu_seq_state(ssp->srcu_gp_seq);
+	rcu_seq_start(&ssp->srcu_sup->srcu_gp_seq);
+	state = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
 	WARN_ON_ONCE(state != SRCU_STATE_SCAN1);
 }
 
@@ -865,16 +865,16 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 
 	/* End the current grace period. */
 	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	idx = rcu_seq_state(ssp->srcu_gp_seq);
+	idx = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
 	WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
-	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
+	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
 		cbdelay = 0;
 
-	WRITE_ONCE(ssp->srcu_last_gp_end, ktime_get_mono_fast_ns());
-	rcu_seq_end(&ssp->srcu_gp_seq);
-	gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
-	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, gpseq))
-		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, gpseq);
+	WRITE_ONCE(ssp->srcu_sup->srcu_last_gp_end, ktime_get_mono_fast_ns());
+	rcu_seq_end(&ssp->srcu_sup->srcu_gp_seq);
+	gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
+	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq))
+		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq);
 	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 	/* A new grace period can start at this point.  But only one. */
@@ -925,9 +925,9 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 
 	/* Start a new grace period if needed. */
 	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
+	gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
 	if (!rcu_seq_state(gpseq) &&
-	    ULONG_CMP_LT(gpseq, ssp->srcu_gp_seq_needed)) {
+	    ULONG_CMP_LT(gpseq, ssp->srcu_sup->srcu_gp_seq_needed)) {
 		srcu_gp_start(ssp);
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 		srcu_reschedule(ssp, 0);
@@ -960,7 +960,7 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 	if (snp)
 		for (; snp != NULL; snp = snp->srcu_parent) {
 			sgsne = READ_ONCE(snp->srcu_gp_seq_needed_exp);
-			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) ||
+			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) ||
 			    (!srcu_invl_snp_seq(sgsne) && ULONG_CMP_GE(sgsne, s)))
 				return;
 			spin_lock_irqsave_rcu_node(snp, flags);
@@ -973,8 +973,8 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 			spin_unlock_irqrestore_rcu_node(snp, flags);
 		}
 	spin_lock_irqsave_ssp_contention(ssp, &flags);
-	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
-		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
+	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s))
+		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s);
 	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
@@ -1010,7 +1010,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	if (snp_leaf)
 		/* Each pass through the loop does one level of the srcu_node tree. */
 		for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
-			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) && snp != snp_leaf)
+			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) && snp != snp_leaf)
 				return; /* GP already done and CBs recorded. */
 			spin_lock_irqsave_rcu_node(snp, flags);
 			snp_seq = snp->srcu_have_cbs[idx];
@@ -1037,20 +1037,20 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 
 	/* Top of tree, must ensure the grace period will be started. */
 	spin_lock_irqsave_ssp_contention(ssp, &flags);
-	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed, s)) {
+	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed, s)) {
 		/*
 		 * Record need for grace period s.  Pair with load
 		 * acquire setting up for initialization.
 		 */
-		smp_store_release(&ssp->srcu_gp_seq_needed, s); /*^^^*/
+		smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, s); /*^^^*/
 	}
-	if (!do_norm && ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
-		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
+	if (!do_norm && ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s))
+		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s);
 
 	/* If grace period not already in progress, start it. */
-	if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) &&
-	    rcu_seq_state(ssp->srcu_gp_seq) == SRCU_STATE_IDLE) {
-		WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
+	if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) &&
+	    rcu_seq_state(ssp->srcu_sup->srcu_gp_seq) == SRCU_STATE_IDLE) {
+		WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
 		srcu_gp_start(ssp);
 
 		// And how can that list_add() in the "else" clause
@@ -1184,18 +1184,18 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp)
 
 	/* First, see if enough time has passed since the last GP. */
 	t = ktime_get_mono_fast_ns();
-	tlast = READ_ONCE(ssp->srcu_last_gp_end);
+	tlast = READ_ONCE(ssp->srcu_sup->srcu_last_gp_end);
 	if (exp_holdoff == 0 ||
 	    time_in_range_open(t, tlast, tlast + exp_holdoff))
 		return false; /* Too soon after last GP. */
 
 	/* Next, check for probable idleness. */
-	curseq = rcu_seq_current(&ssp->srcu_gp_seq);
+	curseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
 	smp_mb(); /* Order ->srcu_gp_seq with ->srcu_gp_seq_needed. */
-	if (ULONG_CMP_LT(curseq, READ_ONCE(ssp->srcu_gp_seq_needed)))
+	if (ULONG_CMP_LT(curseq, READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed)))
 		return false; /* Grace period in progress, so not idle. */
 	smp_mb(); /* Order ->srcu_gp_seq with prior access. */
-	if (curseq != rcu_seq_current(&ssp->srcu_gp_seq))
+	if (curseq != rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq))
 		return false; /* GP # changed, so not idle. */
 	return true; /* With reasonable probability, idle! */
 }
@@ -1238,8 +1238,8 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	if (rhp)
 		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
 	rcu_segcblist_advance(&sdp->srcu_cblist,
-			      rcu_seq_current(&ssp->srcu_gp_seq));
-	s = rcu_seq_snap(&ssp->srcu_gp_seq);
+			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
+	s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
 	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
 	if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
 		sdp->srcu_gp_seq_needed = s;
@@ -1452,7 +1452,7 @@ unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp)
 	// Any prior manipulation of SRCU-protected data must happen
 	// before the load from ->srcu_gp_seq.
 	smp_mb();
-	return rcu_seq_snap(&ssp->srcu_gp_seq);
+	return rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
 }
 EXPORT_SYMBOL_GPL(get_state_synchronize_srcu);
 
@@ -1499,7 +1499,7 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
  */
 bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)
 {
-	if (!rcu_seq_done(&ssp->srcu_gp_seq, cookie))
+	if (!rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, cookie))
 		return false;
 	// Ensure that the end of the SRCU grace period happens before
 	// any subsequent code that the caller might execute.
@@ -1619,16 +1619,16 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 	 * The load-acquire ensures that we see the accesses performed
 	 * by the prior grace period.
 	 */
-	idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq)); /* ^^^ */
+	idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq)); /* ^^^ */
 	if (idx == SRCU_STATE_IDLE) {
 		spin_lock_irq_rcu_node(ssp->srcu_sup);
-		if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
-			WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq));
+		if (ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed)) {
+			WARN_ON_ONCE(rcu_seq_state(ssp->srcu_sup->srcu_gp_seq));
 			spin_unlock_irq_rcu_node(ssp->srcu_sup);
 			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return;
 		}
-		idx = rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq));
+		idx = rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq));
 		if (idx == SRCU_STATE_IDLE)
 			srcu_gp_start(ssp);
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
@@ -1638,7 +1638,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		}
 	}
 
-	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN1) {
+	if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) == SRCU_STATE_SCAN1) {
 		idx = 1 ^ (ssp->srcu_idx & 1);
 		if (!try_check_zero(ssp, idx, 1)) {
 			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
@@ -1646,12 +1646,12 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		}
 		srcu_flip(ssp);
 		spin_lock_irq_rcu_node(ssp->srcu_sup);
-		rcu_seq_set_state(&ssp->srcu_gp_seq, SRCU_STATE_SCAN2);
+		rcu_seq_set_state(&ssp->srcu_sup->srcu_gp_seq, SRCU_STATE_SCAN2);
 		ssp->srcu_n_exp_nodelay = 0;
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	}
 
-	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN2) {
+	if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) == SRCU_STATE_SCAN2) {
 
 		/*
 		 * SRCU read-side critical sections are normally short,
@@ -1688,7 +1688,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	rcu_cblist_init(&ready_cbs);
 	spin_lock_irq_rcu_node(sdp);
 	rcu_segcblist_advance(&sdp->srcu_cblist,
-			      rcu_seq_current(&ssp->srcu_gp_seq));
+			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
 	if (sdp->srcu_cblist_invoking ||
 	    !rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) {
 		spin_unlock_irq_rcu_node(sdp);
@@ -1716,7 +1716,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	spin_lock_irq_rcu_node(sdp);
 	rcu_segcblist_add_len(&sdp->srcu_cblist, -len);
 	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
-				       rcu_seq_snap(&ssp->srcu_gp_seq));
+				       rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
 	sdp->srcu_cblist_invoking = false;
 	more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
 	spin_unlock_irq_rcu_node(sdp);
@@ -1733,12 +1733,12 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
 	bool pushgp = true;
 
 	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
-		if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq))) {
+	if (ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed)) {
+		if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_sup->srcu_gp_seq))) {
 			/* All requests fulfilled, time to go idle. */
 			pushgp = false;
 		}
-	} else if (!rcu_seq_state(ssp->srcu_gp_seq)) {
+	} else if (!rcu_seq_state(ssp->srcu_sup->srcu_gp_seq)) {
 		/* Outstanding request and no GP.  Start one. */
 		srcu_gp_start(ssp);
 	}
@@ -1784,7 +1784,7 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
 	if (test_type != SRCU_FLAVOR)
 		return;
 	*flags = 0;
-	*gp_seq = rcu_seq_current(&ssp->srcu_gp_seq);
+	*gp_seq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
 }
 EXPORT_SYMBOL_GPL(srcutorture_get_gp_data);
 
@@ -1813,7 +1813,7 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 	if (ss_state < 0 || ss_state >= ARRAY_SIZE(srcu_size_state_name))
 		ss_state_idx = ARRAY_SIZE(srcu_size_state_name) - 1;
 	pr_alert("%s%s Tree SRCU g%ld state %d (%s)",
-		 tt, tf, rcu_seq_current(&ssp->srcu_gp_seq), ss_state,
+		 tt, tf, rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq), ss_state,
 		 srcu_size_state_name[ss_state_idx]);
 	if (!ssp->sda) {
 		// Called after cleanup_srcu_struct(), perhaps.
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 11/19] srcu: Move heuristics fields from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (9 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 10/19] srcu: Move grace-period fields " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 12/19] srcu: Move ->sda_is_static " Paul E. McKenney
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_size_jiffies, ->srcu_n_lock_retries,
and ->srcu_n_exp_nodelay fields from the srcu_struct structure to the
srcu_usage structure to reduce the size of the former in order to improve
cache locality.
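
Roughly sketched (the real code is spin_lock_irqsave_check_contention()
in the diff below), the heuristic behind these fields counts lock-retry
events within a single jiffy and grows the srcu_node tree once a
threshold is exceeded, now reached via the ->srcu_sup pointer:

	/* Sketch of the per-jiffy contention count after this move. */
	unsigned long j = jiffies;

	if (ssp->srcu_sup->srcu_size_jiffies != j) {
		ssp->srcu_sup->srcu_size_jiffies = j;	/* New measurement interval. */
		ssp->srcu_sup->srcu_n_lock_retries = 0;
	}
	if (++ssp->srcu_sup->srcu_n_lock_retries > small_contention_lim)
		__srcu_transition_to_big(ssp);		/* Contended: switch to big. */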

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  6 +++---
 kernel/rcu/srcutree.c    | 18 +++++++++---------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 7458df6bbe00..82d07466da93 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -73,6 +73,9 @@ struct srcu_usage {
 	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
 	unsigned long srcu_gp_start;		/* Last GP start timestamp (jiffies) */
 	unsigned long srcu_last_gp_end;		/* Last GP end timestamp (ns) */
+	unsigned long srcu_size_jiffies;	/* Current contention-measurement interval. */
+	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
+	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
 };
 
 /*
@@ -80,9 +83,6 @@ struct srcu_usage {
  */
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
-	unsigned long srcu_size_jiffies;	/* Current contention-measurement interval. */
-	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
-	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
 	bool sda_is_static;			/* May ->sda be passed to free_percpu()? */
 	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 6c1ac78b1cfc..3f389b3a25aa 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -348,11 +348,11 @@ static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp)
 	if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_sup->srcu_size_state)
 		return;
 	j = jiffies;
-	if (ssp->srcu_size_jiffies != j) {
-		ssp->srcu_size_jiffies = j;
-		ssp->srcu_n_lock_retries = 0;
+	if (ssp->srcu_sup->srcu_size_jiffies != j) {
+		ssp->srcu_sup->srcu_size_jiffies = j;
+		ssp->srcu_sup->srcu_n_lock_retries = 0;
 	}
-	if (++ssp->srcu_n_lock_retries <= small_contention_lim)
+	if (++ssp->srcu_sup->srcu_n_lock_retries <= small_contention_lim)
 		return;
 	__srcu_transition_to_big(ssp);
 }
@@ -624,8 +624,8 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 		if (time_after(j, gpstart))
 			jbase += j - gpstart;
 		if (!jbase) {
-			WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
-			if (READ_ONCE(ssp->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
+			WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) + 1);
+			if (READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
 				jbase = 1;
 		}
 	}
@@ -783,7 +783,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 				       rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
 	spin_unlock_rcu_node(sdp);  /* Interrupts remain disabled. */
 	WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
-	WRITE_ONCE(ssp->srcu_n_exp_nodelay, 0);
+	WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, 0);
 	smp_mb(); /* Order prior store to ->srcu_gp_seq_needed vs. GP start. */
 	rcu_seq_start(&ssp->srcu_sup->srcu_gp_seq);
 	state = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
@@ -1647,7 +1647,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		srcu_flip(ssp);
 		spin_lock_irq_rcu_node(ssp->srcu_sup);
 		rcu_seq_set_state(&ssp->srcu_sup->srcu_gp_seq, SRCU_STATE_SCAN2);
-		ssp->srcu_n_exp_nodelay = 0;
+		ssp->srcu_sup->srcu_n_exp_nodelay = 0;
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	}
 
@@ -1662,7 +1662,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return; /* readers present, retry later. */
 		}
-		ssp->srcu_n_exp_nodelay = 0;
+		ssp->srcu_sup->srcu_n_exp_nodelay = 0;
 		srcu_gp_end(ssp);  /* Releases ->srcu_gp_mutex. */
 	}
 }
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 12/19] srcu: Move ->sda_is_static from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (10 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 11/19] srcu: Move heuristics " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 13/19] srcu: Move srcu_barrier() fields " Paul E. McKenney
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->sda_is_static field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 2 +-
 kernel/rcu/srcutree.c    | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 82d07466da93..ca48b97d9f3b 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -76,6 +76,7 @@ struct srcu_usage {
 	unsigned long srcu_size_jiffies;	/* Current contention-measurement interval. */
 	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
 	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
+	bool sda_is_static;			/* May ->sda be passed to free_percpu()? */
 };
 
 /*
@@ -84,7 +85,6 @@ struct srcu_usage {
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
-	bool sda_is_static;			/* May ->sda be passed to free_percpu()? */
 	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
 	struct mutex srcu_barrier_mutex;	/* Serialize barrier ops. */
 	struct completion srcu_barrier_completion;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 3f389b3a25aa..7c192f7fb559 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -252,7 +252,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	mutex_init(&ssp->srcu_barrier_mutex);
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
 	INIT_DELAYED_WORK(&ssp->work, process_srcu);
-	ssp->sda_is_static = is_static;
+	ssp->srcu_sup->sda_is_static = is_static;
 	if (!is_static)
 		ssp->sda = alloc_percpu(struct srcu_data);
 	if (!ssp->sda) {
@@ -265,7 +265,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
-			if (!ssp->sda_is_static) {
+			if (!ssp->srcu_sup->sda_is_static) {
 				free_percpu(ssp->sda);
 				ssp->sda = NULL;
 				kfree(ssp->srcu_sup);
@@ -667,7 +667,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 	kfree(ssp->srcu_sup->node);
 	ssp->srcu_sup->node = NULL;
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
-	if (!ssp->sda_is_static) {
+	if (!ssp->srcu_sup->sda_is_static) {
 		free_percpu(ssp->sda);
 		ssp->sda = NULL;
 		kfree(ssp->srcu_sup);
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 13/19] srcu: Move srcu_barrier() fields from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (11 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 12/19] srcu: Move ->sda_is_static " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 14/19] srcu: Move work-scheduling " Paul E. McKenney
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_barrier_seq, ->srcu_barrier_mutex,
->srcu_barrier_completion, and ->srcu_barrier_cpu_cnt fields from the
srcu_struct structure to the srcu_usage structure to reduce the size of
the former in order to improve cache locality.
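
Informally (a sketch with details elided; the full code is in the diff
below), srcu_barrier()'s callback counting now runs through the
srcu_usage structure:

	/* Sketch of the barrier counting scheme after this move. */
	atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 1);	/* Initial count. */
	/* ...entrain one srcu_barrier_cb() per CPU with queued callbacks... */
	if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
		complete(&ssp->srcu_sup->srcu_barrier_completion);
	wait_for_completion(&ssp->srcu_sup->srcu_barrier_completion);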

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 14 +++++++-------
 kernel/rcu/srcutree.c    | 38 +++++++++++++++++++-------------------
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index ca48b97d9f3b..70f5a3abe80a 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -77,6 +77,13 @@ struct srcu_usage {
 	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
 	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
 	bool sda_is_static;			/* May ->sda be passed to free_percpu()? */
+	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
+	struct mutex srcu_barrier_mutex;	/* Serialize barrier ops. */
+	struct completion srcu_barrier_completion;
+						/* Awaken barrier rq at end. */
+	atomic_t srcu_barrier_cpu_cnt;		/* # CPUs not yet posting a */
+						/*  callback for the barrier */
+						/*  operation. */
 };
 
 /*
@@ -85,13 +92,6 @@ struct srcu_usage {
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
-	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
-	struct mutex srcu_barrier_mutex;	/* Serialize barrier ops. */
-	struct completion srcu_barrier_completion;
-						/* Awaken barrier rq at end. */
-	atomic_t srcu_barrier_cpu_cnt;		/* # CPUs not yet posting a */
-						/*  callback for the barrier */
-						/*  operation. */
 	unsigned long reschedule_jiffies;
 	unsigned long reschedule_count;
 	struct delayed_work work;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 7c192f7fb559..1df4a1467765 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -248,9 +248,9 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
 	ssp->srcu_sup->srcu_gp_seq = 0;
-	ssp->srcu_barrier_seq = 0;
-	mutex_init(&ssp->srcu_barrier_mutex);
-	atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
+	ssp->srcu_sup->srcu_barrier_seq = 0;
+	mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
+	atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
 	INIT_DELAYED_WORK(&ssp->work, process_srcu);
 	ssp->srcu_sup->sda_is_static = is_static;
 	if (!is_static)
@@ -1518,8 +1518,8 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
 
 	sdp = container_of(rhp, struct srcu_data, srcu_barrier_head);
 	ssp = sdp->ssp;
-	if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
-		complete(&ssp->srcu_barrier_completion);
+	if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
+		complete(&ssp->srcu_sup->srcu_barrier_completion);
 }
 
 /*
@@ -1533,13 +1533,13 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
 static void srcu_barrier_one_cpu(struct srcu_struct *ssp, struct srcu_data *sdp)
 {
 	spin_lock_irq_rcu_node(sdp);
-	atomic_inc(&ssp->srcu_barrier_cpu_cnt);
+	atomic_inc(&ssp->srcu_sup->srcu_barrier_cpu_cnt);
 	sdp->srcu_barrier_head.func = srcu_barrier_cb;
 	debug_rcu_head_queue(&sdp->srcu_barrier_head);
 	if (!rcu_segcblist_entrain(&sdp->srcu_cblist,
 				   &sdp->srcu_barrier_head)) {
 		debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
-		atomic_dec(&ssp->srcu_barrier_cpu_cnt);
+		atomic_dec(&ssp->srcu_sup->srcu_barrier_cpu_cnt);
 	}
 	spin_unlock_irq_rcu_node(sdp);
 }
@@ -1552,20 +1552,20 @@ void srcu_barrier(struct srcu_struct *ssp)
 {
 	int cpu;
 	int idx;
-	unsigned long s = rcu_seq_snap(&ssp->srcu_barrier_seq);
+	unsigned long s = rcu_seq_snap(&ssp->srcu_sup->srcu_barrier_seq);
 
 	check_init_srcu_struct(ssp);
-	mutex_lock(&ssp->srcu_barrier_mutex);
-	if (rcu_seq_done(&ssp->srcu_barrier_seq, s)) {
+	mutex_lock(&ssp->srcu_sup->srcu_barrier_mutex);
+	if (rcu_seq_done(&ssp->srcu_sup->srcu_barrier_seq, s)) {
 		smp_mb(); /* Force ordering following return. */
-		mutex_unlock(&ssp->srcu_barrier_mutex);
+		mutex_unlock(&ssp->srcu_sup->srcu_barrier_mutex);
 		return; /* Someone else did our work for us. */
 	}
-	rcu_seq_start(&ssp->srcu_barrier_seq);
-	init_completion(&ssp->srcu_barrier_completion);
+	rcu_seq_start(&ssp->srcu_sup->srcu_barrier_seq);
+	init_completion(&ssp->srcu_sup->srcu_barrier_completion);
 
 	/* Initial count prevents reaching zero until all CBs are posted. */
-	atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);
+	atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 1);
 
 	idx = __srcu_read_lock_nmisafe(ssp);
 	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
@@ -1576,12 +1576,12 @@ void srcu_barrier(struct srcu_struct *ssp)
 	__srcu_read_unlock_nmisafe(ssp, idx);
 
 	/* Remove the initial count, at which point reaching zero can happen. */
-	if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
-		complete(&ssp->srcu_barrier_completion);
-	wait_for_completion(&ssp->srcu_barrier_completion);
+	if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
+		complete(&ssp->srcu_sup->srcu_barrier_completion);
+	wait_for_completion(&ssp->srcu_sup->srcu_barrier_completion);
 
-	rcu_seq_end(&ssp->srcu_barrier_seq);
-	mutex_unlock(&ssp->srcu_barrier_mutex);
+	rcu_seq_end(&ssp->srcu_sup->srcu_barrier_seq);
+	mutex_unlock(&ssp->srcu_sup->srcu_barrier_mutex);
 }
 EXPORT_SYMBOL_GPL(srcu_barrier);
 
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 14/19] srcu: Move work-scheduling fields from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (12 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 13/19] srcu: Move srcu_barrier() fields " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 15/19] srcu: Fix long lines in srcu_get_delay() Paul E. McKenney
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->reschedule_jiffies, ->reschedule_count, and
->work fields from the srcu_struct structure to the srcu_usage structure
to reduce the size of the former in order to improve cache locality.

However, this move means that the container_of() calls can no longer
recover a pointer to the srcu_struct, because the delayed_work structure
they operate on is no longer within the srcu_struct.  This issue is
addressed by adding a ->srcu_ssp field to the srcu_usage structure
that references the corresponding srcu_struct structure.  And given a
local "sup" pointer to the srcu_usage structure, some ssp->srcu_sup->
instances are replaced with sup->.
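
As a rough sketch of the resulting pattern (the function name below is
invented; see process_srcu() in the diff for the real code, which also
handles delay computation):

	static void process_srcu_sketch(struct work_struct *work)
	{
		struct srcu_usage *sup;
		struct srcu_struct *ssp;

		/* The delayed_work is now embedded in srcu_usage... */
		sup = container_of(work, struct srcu_usage, work.work);
		/* ...so reach the srcu_struct via the new ->srcu_ssp back-pointer. */
		ssp = sup->srcu_ssp;
		srcu_advance_state(ssp);
	}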

[ paulmck: Apply feedback from kernel test robot. ]

Link: https://lore.kernel.org/oe-kbuild-all/202303191400.iO5BOqka-lkp@intel.com/
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  9 +++++----
 kernel/rcu/srcutree.c    | 41 +++++++++++++++++++++-------------------
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 70f5a3abe80a..8f3f72480e78 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -84,6 +84,10 @@ struct srcu_usage {
 	atomic_t srcu_barrier_cpu_cnt;		/* # CPUs not yet posting a */
 						/*  callback for the barrier */
 						/*  operation. */
+	unsigned long reschedule_jiffies;
+	unsigned long reschedule_count;
+	struct delayed_work work;
+	struct srcu_struct *srcu_ssp;
 };
 
 /*
@@ -92,9 +96,6 @@ struct srcu_usage {
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
-	unsigned long reschedule_jiffies;
-	unsigned long reschedule_count;
-	struct delayed_work work;
 	struct lockdep_map dep_map;
 	struct srcu_usage *srcu_sup;		/* Update-side data. */
 };
@@ -132,10 +133,10 @@ struct srcu_struct {
 {												\
 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
+	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
 }
 
 #define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
-	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
 	.srcu_sup = &usage_name,								\
 	__SRCU_DEP_MAP_INIT(name)
 
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 1df4a1467765..22dc266d2090 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -251,7 +251,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	ssp->srcu_sup->srcu_barrier_seq = 0;
 	mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
 	atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
-	INIT_DELAYED_WORK(&ssp->work, process_srcu);
+	INIT_DELAYED_WORK(&ssp->srcu_sup->work, process_srcu);
 	ssp->srcu_sup->sda_is_static = is_static;
 	if (!is_static)
 		ssp->sda = alloc_percpu(struct srcu_data);
@@ -275,6 +275,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 			WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
 		}
 	}
+	ssp->srcu_sup->srcu_ssp = ssp;
 	smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
 	return 0;
 }
@@ -647,7 +648,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 		return; /* Just leak it! */
 	if (WARN_ON(srcu_readers_active(ssp)))
 		return; /* Just leak it! */
-	flush_delayed_work(&ssp->work);
+	flush_delayed_work(&ssp->srcu_sup->work);
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
 
@@ -1059,10 +1060,10 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 		// can only be executed during early boot when there is only
 		// the one boot CPU running with interrupts still disabled.
 		if (likely(srcu_init_done))
-			queue_delayed_work(rcu_gp_wq, &ssp->work,
+			queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work,
 					   !!srcu_get_delay(ssp));
-		else if (list_empty(&ssp->work.work.entry))
-			list_add(&ssp->work.work.entry, &srcu_boot_list);
+		else if (list_empty(&ssp->srcu_sup->work.work.entry))
+			list_add(&ssp->srcu_sup->work.work.entry, &srcu_boot_list);
 	}
 	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
@@ -1745,7 +1746,7 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
 	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 
 	if (pushgp)
-		queue_delayed_work(rcu_gp_wq, &ssp->work, delay);
+		queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work, delay);
 }
 
 /*
@@ -1756,22 +1757,24 @@ static void process_srcu(struct work_struct *work)
 	unsigned long curdelay;
 	unsigned long j;
 	struct srcu_struct *ssp;
+	struct srcu_usage *sup;
 
-	ssp = container_of(work, struct srcu_struct, work.work);
+	sup = container_of(work, struct srcu_usage, work.work);
+	ssp = sup->srcu_ssp;
 
 	srcu_advance_state(ssp);
 	curdelay = srcu_get_delay(ssp);
 	if (curdelay) {
-		WRITE_ONCE(ssp->reschedule_count, 0);
+		WRITE_ONCE(sup->reschedule_count, 0);
 	} else {
 		j = jiffies;
-		if (READ_ONCE(ssp->reschedule_jiffies) == j) {
-			WRITE_ONCE(ssp->reschedule_count, READ_ONCE(ssp->reschedule_count) + 1);
-			if (READ_ONCE(ssp->reschedule_count) > srcu_max_nodelay)
+		if (READ_ONCE(sup->reschedule_jiffies) == j) {
+			WRITE_ONCE(sup->reschedule_count, READ_ONCE(sup->reschedule_count) + 1);
+			if (READ_ONCE(sup->reschedule_count) > srcu_max_nodelay)
 				curdelay = 1;
 		} else {
-			WRITE_ONCE(ssp->reschedule_count, 1);
-			WRITE_ONCE(ssp->reschedule_jiffies, j);
+			WRITE_ONCE(sup->reschedule_count, 1);
+			WRITE_ONCE(sup->reschedule_jiffies, j);
 		}
 	}
 	srcu_reschedule(ssp, curdelay);
@@ -1870,7 +1873,7 @@ early_initcall(srcu_bootup_announce);
 
 void __init srcu_init(void)
 {
-	struct srcu_struct *ssp;
+	struct srcu_usage *sup;
 
 	/* Decide on srcu_struct-size strategy. */
 	if (SRCU_SIZING_IS(SRCU_SIZING_AUTO)) {
@@ -1890,13 +1893,13 @@ void __init srcu_init(void)
 	 */
 	srcu_init_done = true;
 	while (!list_empty(&srcu_boot_list)) {
-		ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
+		sup = list_first_entry(&srcu_boot_list, struct srcu_usage,
 				      work.work.entry);
-		list_del_init(&ssp->work.work.entry);
+		list_del_init(&sup->work.work.entry);
 		if (SRCU_SIZING_IS(SRCU_SIZING_INIT) &&
-		    ssp->srcu_sup->srcu_size_state == SRCU_SIZE_SMALL)
-			ssp->srcu_sup->srcu_size_state = SRCU_SIZE_ALLOC;
-		queue_work(rcu_gp_wq, &ssp->work.work);
+		    sup->srcu_size_state == SRCU_SIZE_SMALL)
+			sup->srcu_size_state = SRCU_SIZE_ALLOC;
+		queue_work(rcu_gp_wq, &sup->work.work);
 	}
 }
 
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 15/19] srcu: Fix long lines in srcu_get_delay()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (13 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 14/19] srcu: Move work-scheduling " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 16/19] srcu: Fix long lines in cleanup_srcu_struct() Paul E. McKenney
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.
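
The idiom is simply caching the pointer in a local variable, sketched
here against srcu_get_delay() (the actual change is in the diff below):

	unsigned long jbase = SRCU_INTERVAL;
	struct srcu_usage *sup = ssp->srcu_sup;	/* One dereference, shorter lines. */

	if (ULONG_CMP_LT(READ_ONCE(sup->srcu_gp_seq), READ_ONCE(sup->srcu_gp_seq_needed_exp)))
		jbase = 0;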

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 22dc266d2090..f6bb9fbe1b9c 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -616,17 +616,18 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 	unsigned long gpstart;
 	unsigned long j;
 	unsigned long jbase = SRCU_INTERVAL;
+	struct srcu_usage *sup = ssp->srcu_sup;
 
-	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
+	if (ULONG_CMP_LT(READ_ONCE(sup->srcu_gp_seq), READ_ONCE(sup->srcu_gp_seq_needed_exp)))
 		jbase = 0;
-	if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq))) {
+	if (rcu_seq_state(READ_ONCE(sup->srcu_gp_seq))) {
 		j = jiffies - 1;
-		gpstart = READ_ONCE(ssp->srcu_sup->srcu_gp_start);
+		gpstart = READ_ONCE(sup->srcu_gp_start);
 		if (time_after(j, gpstart))
 			jbase += j - gpstart;
 		if (!jbase) {
-			WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) + 1);
-			if (READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
+			WRITE_ONCE(sup->srcu_n_exp_nodelay, READ_ONCE(sup->srcu_n_exp_nodelay) + 1);
+			if (READ_ONCE(sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
 				jbase = 1;
 		}
 	}
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 16/19] srcu: Fix long lines in cleanup_srcu_struct()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (14 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 15/19] srcu: Fix long lines in srcu_get_delay() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 17/19] srcu: Fix long lines in srcu_gp_end() Paul E. McKenney
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index f6bb9fbe1b9c..fd88a98b7254 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -644,12 +644,13 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 void cleanup_srcu_struct(struct srcu_struct *ssp)
 {
 	int cpu;
+	struct srcu_usage *sup = ssp->srcu_sup;
 
 	if (WARN_ON(!srcu_get_delay(ssp)))
 		return; /* Just leak it! */
 	if (WARN_ON(srcu_readers_active(ssp)))
 		return; /* Just leak it! */
-	flush_delayed_work(&ssp->srcu_sup->work);
+	flush_delayed_work(&sup->work);
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
 
@@ -658,21 +659,21 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 		if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
 			return; /* Forgot srcu_barrier(), so just leak it! */
 	}
-	if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
-	    WARN_ON(rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq) != ssp->srcu_sup->srcu_gp_seq_needed) ||
+	if (WARN_ON(rcu_seq_state(READ_ONCE(sup->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
+	    WARN_ON(rcu_seq_current(&sup->srcu_gp_seq) != sup->srcu_gp_seq_needed) ||
 	    WARN_ON(srcu_readers_active(ssp))) {
 		pr_info("%s: Active srcu_struct %p read state: %d gp state: %lu/%lu\n",
-			__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)),
-			rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq), ssp->srcu_sup->srcu_gp_seq_needed);
+			__func__, ssp, rcu_seq_state(READ_ONCE(sup->srcu_gp_seq)),
+			rcu_seq_current(&sup->srcu_gp_seq), sup->srcu_gp_seq_needed);
 		return; /* Caller forgot to stop doing call_srcu()? */
 	}
-	kfree(ssp->srcu_sup->node);
-	ssp->srcu_sup->node = NULL;
-	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
-	if (!ssp->srcu_sup->sda_is_static) {
+	kfree(sup->node);
+	sup->node = NULL;
+	sup->srcu_size_state = SRCU_SIZE_SMALL;
+	if (!sup->sda_is_static) {
 		free_percpu(ssp->sda);
 		ssp->sda = NULL;
-		kfree(ssp->srcu_sup);
+		kfree(sup);
 		ssp->srcu_sup = NULL;
 	}
 }
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 17/19] srcu: Fix long lines in srcu_gp_end()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (15 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 16/19] srcu: Fix long lines in cleanup_srcu_struct() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 18/19] srcu: Fix long lines in srcu_funnel_gp_start() Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 19/19] srcu: Remove extraneous parentheses from srcu_read_lock() etc Paul E. McKenney
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 41 +++++++++++++++++++++--------------------
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index fd88a98b7254..fcb2bac7bb4b 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -862,28 +862,29 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	unsigned long sgsne;
 	struct srcu_node *snp;
 	int ss_state;
+	struct srcu_usage *sup = ssp->srcu_sup;
 
 	/* Prevent more than one additional grace period. */
-	mutex_lock(&ssp->srcu_sup->srcu_cb_mutex);
+	mutex_lock(&sup->srcu_cb_mutex);
 
 	/* End the current grace period. */
-	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	idx = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
+	spin_lock_irq_rcu_node(sup);
+	idx = rcu_seq_state(sup->srcu_gp_seq);
 	WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
-	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
+	if (ULONG_CMP_LT(READ_ONCE(sup->srcu_gp_seq), READ_ONCE(sup->srcu_gp_seq_needed_exp)))
 		cbdelay = 0;
 
-	WRITE_ONCE(ssp->srcu_sup->srcu_last_gp_end, ktime_get_mono_fast_ns());
-	rcu_seq_end(&ssp->srcu_sup->srcu_gp_seq);
-	gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
-	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq))
-		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq);
-	spin_unlock_irq_rcu_node(ssp->srcu_sup);
-	mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
+	WRITE_ONCE(sup->srcu_last_gp_end, ktime_get_mono_fast_ns());
+	rcu_seq_end(&sup->srcu_gp_seq);
+	gpseq = rcu_seq_current(&sup->srcu_gp_seq);
+	if (ULONG_CMP_LT(sup->srcu_gp_seq_needed_exp, gpseq))
+		WRITE_ONCE(sup->srcu_gp_seq_needed_exp, gpseq);
+	spin_unlock_irq_rcu_node(sup);
+	mutex_unlock(&sup->srcu_gp_mutex);
 	/* A new grace period can start at this point.  But only one. */
 
 	/* Initiate callback invocation as needed. */
-	ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
+	ss_state = smp_load_acquire(&sup->srcu_size_state);
 	if (ss_state < SRCU_SIZE_WAIT_BARRIER) {
 		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, get_boot_cpu_id()),
 					cbdelay);
@@ -892,7 +893,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		srcu_for_each_node_breadth_first(ssp, snp) {
 			spin_lock_irq_rcu_node(snp);
 			cbs = false;
-			last_lvl = snp >= ssp->srcu_sup->level[rcu_num_lvls - 1];
+			last_lvl = snp >= sup->level[rcu_num_lvls - 1];
 			if (last_lvl)
 				cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq;
 			snp->srcu_have_cbs[idx] = gpseq;
@@ -924,18 +925,18 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		}
 
 	/* Callback initiation done, allow grace periods after next. */
-	mutex_unlock(&ssp->srcu_sup->srcu_cb_mutex);
+	mutex_unlock(&sup->srcu_cb_mutex);
 
 	/* Start a new grace period if needed. */
-	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
+	spin_lock_irq_rcu_node(sup);
+	gpseq = rcu_seq_current(&sup->srcu_gp_seq);
 	if (!rcu_seq_state(gpseq) &&
-	    ULONG_CMP_LT(gpseq, ssp->srcu_sup->srcu_gp_seq_needed)) {
+	    ULONG_CMP_LT(gpseq, sup->srcu_gp_seq_needed)) {
 		srcu_gp_start(ssp);
-		spin_unlock_irq_rcu_node(ssp->srcu_sup);
+		spin_unlock_irq_rcu_node(sup);
 		srcu_reschedule(ssp, 0);
 	} else {
-		spin_unlock_irq_rcu_node(ssp->srcu_sup);
+		spin_unlock_irq_rcu_node(sup);
 	}
 
 	/* Transition to big if needed. */
@@ -943,7 +944,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		if (ss_state == SRCU_SIZE_ALLOC)
 			init_srcu_struct_nodes(ssp, GFP_KERNEL);
 		else
-			smp_store_release(&ssp->srcu_sup->srcu_size_state, ss_state + 1);
+			smp_store_release(&sup->srcu_size_state, ss_state + 1);
 	}
 }
 
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 18/19] srcu: Fix long lines in srcu_funnel_gp_start()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (16 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 17/19] srcu: Fix long lines in srcu_gp_end() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 19/19] srcu: Remove extraneous parentheses from srcu_read_lock() etc Paul E. McKenney
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index fcb2bac7bb4b..61dd47981b40 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1004,9 +1004,10 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	struct srcu_node *snp;
 	struct srcu_node *snp_leaf;
 	unsigned long snp_seq;
+	struct srcu_usage *sup = ssp->srcu_sup;
 
 	/* Ensure that snp node tree is fully initialized before traversing it */
-	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+	if (smp_load_acquire(&sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		snp_leaf = NULL;
 	else
 		snp_leaf = sdp->mynode;
@@ -1014,7 +1015,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	if (snp_leaf)
 		/* Each pass through the loop does one level of the srcu_node tree. */
 		for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
-			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) && snp != snp_leaf)
+			if (WARN_ON_ONCE(rcu_seq_done(&sup->srcu_gp_seq, s)) && snp != snp_leaf)
 				return; /* GP already done and CBs recorded. */
 			spin_lock_irqsave_rcu_node(snp, flags);
 			snp_seq = snp->srcu_have_cbs[idx];
@@ -1041,20 +1042,20 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 
 	/* Top of tree, must ensure the grace period will be started. */
 	spin_lock_irqsave_ssp_contention(ssp, &flags);
-	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed, s)) {
+	if (ULONG_CMP_LT(sup->srcu_gp_seq_needed, s)) {
 		/*
 		 * Record need for grace period s.  Pair with load
 		 * acquire setting up for initialization.
 		 */
-		smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, s); /*^^^*/
+		smp_store_release(&sup->srcu_gp_seq_needed, s); /*^^^*/
 	}
-	if (!do_norm && ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s))
-		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s);
+	if (!do_norm && ULONG_CMP_LT(sup->srcu_gp_seq_needed_exp, s))
+		WRITE_ONCE(sup->srcu_gp_seq_needed_exp, s);
 
 	/* If grace period not already in progress, start it. */
-	if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) &&
-	    rcu_seq_state(ssp->srcu_sup->srcu_gp_seq) == SRCU_STATE_IDLE) {
-		WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
+	if (!WARN_ON_ONCE(rcu_seq_done(&sup->srcu_gp_seq, s)) &&
+	    rcu_seq_state(sup->srcu_gp_seq) == SRCU_STATE_IDLE) {
+		WARN_ON_ONCE(ULONG_CMP_GE(sup->srcu_gp_seq, sup->srcu_gp_seq_needed));
 		srcu_gp_start(ssp);
 
 		// And how can that list_add() in the "else" clause
@@ -1063,12 +1064,12 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 		// can only be executed during early boot when there is only
 		// the one boot CPU running with interrupts still disabled.
 		if (likely(srcu_init_done))
-			queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work,
+			queue_delayed_work(rcu_gp_wq, &sup->work,
 					   !!srcu_get_delay(ssp));
-		else if (list_empty(&ssp->srcu_sup->work.work.entry))
-			list_add(&ssp->srcu_sup->work.work.entry, &srcu_boot_list);
+		else if (list_empty(&sup->work.work.entry))
+			list_add(&sup->work.work.entry, &srcu_boot_list);
 	}
-	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
+	spin_unlock_irqrestore_rcu_node(sup, flags);
 }
 
 /*
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 19/19] srcu: Remove extraneous parentheses from srcu_read_lock() etc.
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (17 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 18/19] srcu: Fix long lines in srcu_funnel_gp_start() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit removes extraneous parentheses from srcu_read_lock(),
srcu_read_lock_nmisafe(), srcu_read_unlock(), and
srcu_read_unlock_nmisafe().  Looks like someone was once a macro.
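
For context, a typical SRCU read-side critical section built on these
functions looks something like the sketch below.  The srcu_struct name,
the protected pointer, and the helper are illustrative only and are not
taken from this series:

	DEFINE_STATIC_SRCU(my_srcu);			/* illustrative name */
	struct foo __rcu *my_foo;			/* illustrative pointer */

	static int reader(void)
	{
		struct foo *p;
		int idx;

		idx = srcu_read_lock(&my_srcu);		/* index needed for unlock */
		p = srcu_dereference(my_foo, &my_srcu);	/* fetch SRCU-protected pointer */
		if (p)
			do_something(p);		/* illustrative helper */
		srcu_read_unlock(&my_srcu, idx);	/* pass back the same index */
		return 0;
	}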

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 include/linux/srcu.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 41c4b26fb1c1..eb92a50a4599 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -212,7 +212,7 @@ static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
 
 	srcu_check_nmi_safety(ssp, false);
 	retval = __srcu_read_lock(ssp);
-	srcu_lock_acquire(&(ssp)->dep_map);
+	srcu_lock_acquire(&ssp->dep_map);
 	return retval;
 }
 
@@ -229,7 +229,7 @@ static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp
 
 	srcu_check_nmi_safety(ssp, true);
 	retval = __srcu_read_lock_nmisafe(ssp);
-	rcu_lock_acquire(&(ssp)->dep_map);
+	rcu_lock_acquire(&ssp->dep_map);
 	return retval;
 }
 
@@ -284,7 +284,7 @@ static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
 {
 	WARN_ON_ONCE(idx & ~0x1);
 	srcu_check_nmi_safety(ssp, false);
-	srcu_lock_release(&(ssp)->dep_map);
+	srcu_lock_release(&ssp->dep_map);
 	__srcu_read_unlock(ssp, idx);
 }
 
@@ -300,7 +300,7 @@ static inline void srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
 {
 	WARN_ON_ONCE(idx & ~0x1);
 	srcu_check_nmi_safety(ssp, true);
-	rcu_lock_release(&(ssp)->dep_map);
+	rcu_lock_release(&ssp->dep_map);
 	__srcu_read_unlock_nmisafe(ssp, idx);
 }
 
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-24  0:19 ` [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update Paul E. McKenney
@ 2023-03-24 19:10   ` Wysocki, Rafael J
  2023-03-24 20:11     ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Wysocki, Rafael J @ 2023-03-24 19:10 UTC (permalink / raw)
  To: Paul E. McKenney, rcu
  Cc: linux-kernel, kernel-team, rostedt, hch,
	Michał Mirosław, Dmitry Osipenko

On 3/24/2023 1:19 AM, Paul E. McKenney wrote:
> The current srcu_struct structure is on the order of 200 bytes in size
> (depending on architecture and .config), which is much better than the
> old-style 26K bytes, but still all too inconvenient when one is trying
> to achieve good cache locality on a fastpath involving SRCU readers.
>
> However, only a few fields in srcu_struct are used by SRCU readers.
> The remaining fields could be offloaded to a new srcu_update
> structure, thus shrinking the srcu_struct structure down to a few
> tens of bytes.  This commit begins this noble quest, a quest that is
> complicated by open-coded initialization of the srcu_struct within the
> srcu_notifier_head structure.  This complication is addressed by updating
> the srcu_notifier_head structure's open coding, given that there does
> not appear to be a straightforward way of abstracting that initialization.
>
> This commit moves only the ->node pointer to srcu_update.  Later commits
> will move additional fields.
>
> [ paulmck: Fold in qiang1.zhang@intel.com's memory-leak fix. ]
>
> Link: https://lore.kernel.org/all/20230320055751.4120251-1-qiang1.zhang@intel.com/
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>

Fine with me.

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


> ---
>   include/linux/notifier.h |  5 ++++-
>   include/linux/srcutiny.h |  6 +++---
>   include/linux/srcutree.h | 27 ++++++++++++++++++---------
>   kernel/rcu/rcu.h         |  6 ++++--
>   kernel/rcu/srcutree.c    | 28 +++++++++++++++++++---------
>   5 files changed, 48 insertions(+), 24 deletions(-)
>
> diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> index aef88c2d1173..2aba75145144 100644
> --- a/include/linux/notifier.h
> +++ b/include/linux/notifier.h
> @@ -73,6 +73,9 @@ struct raw_notifier_head {
>   
>   struct srcu_notifier_head {
>   	struct mutex mutex;
> +#ifdef CONFIG_TREE_SRCU
> +	struct srcu_usage srcuu;
> +#endif
>   	struct srcu_struct srcu;
>   	struct notifier_block __rcu *head;
>   };
> @@ -107,7 +110,7 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
>   	{							\
>   		.mutex = __MUTEX_INITIALIZER(name.mutex),	\
>   		.head = NULL,					\
> -		.srcu = __SRCU_STRUCT_INIT(name.srcu, pcpu),	\
> +		.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
>   	}
>   
>   #define ATOMIC_NOTIFIER_HEAD(name)				\
> diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
> index 5aa5e0faf6a1..ebd72491af99 100644
> --- a/include/linux/srcutiny.h
> +++ b/include/linux/srcutiny.h
> @@ -31,7 +31,7 @@ struct srcu_struct {
>   
>   void srcu_drive_gp(struct work_struct *wp);
>   
> -#define __SRCU_STRUCT_INIT(name, __ignored)				\
> +#define __SRCU_STRUCT_INIT(name, __ignored, ___ignored)			\
>   {									\
>   	.srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq),	\
>   	.srcu_cb_tail = &name.srcu_cb_head,				\
> @@ -44,9 +44,9 @@ void srcu_drive_gp(struct work_struct *wp);
>    * Tree SRCU, which needs some per-CPU data.
>    */
>   #define DEFINE_SRCU(name) \
> -	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
> +	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
>   #define DEFINE_STATIC_SRCU(name) \
> -	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
> +	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
>   
>   void synchronize_srcu(struct srcu_struct *ssp);
>   
> diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> index 428480152375..2689e64024bb 100644
> --- a/include/linux/srcutree.h
> +++ b/include/linux/srcutree.h
> @@ -57,11 +57,17 @@ struct srcu_node {
>   	int grphi;				/* Biggest CPU for node. */
>   };
>   
> +/*
> + * Per-SRCU-domain structure, update-side data linked from srcu_struct.
> + */
> +struct srcu_usage {
> +	struct srcu_node *node;			/* Combining tree. */
> +};
> +
>   /*
>    * Per-SRCU-domain structure, similar in function to rcu_state.
>    */
>   struct srcu_struct {
> -	struct srcu_node *node;			/* Combining tree. */
>   	struct srcu_node *level[RCU_NUM_LVLS + 1];
>   						/* First node at each level. */
>   	int srcu_size_state;			/* Small-to-big transition state. */
> @@ -90,6 +96,7 @@ struct srcu_struct {
>   	unsigned long reschedule_count;
>   	struct delayed_work work;
>   	struct lockdep_map dep_map;
> +	struct srcu_usage *srcu_sup;		/* Update-side data. */
>   };
>   
>   // Values for size state variable (->srcu_size_state).  Once the state
> @@ -121,24 +128,24 @@ struct srcu_struct {
>   #define SRCU_STATE_SCAN1	1
>   #define SRCU_STATE_SCAN2	2
>   
> -#define __SRCU_STRUCT_INIT_COMMON(name)								\
> +#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
>   	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
>   	.srcu_gp_seq_needed = -1UL,								\
>   	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> +	.srcu_sup = &usage_name,								\
>   	__SRCU_DEP_MAP_INIT(name)
>   
> -#define __SRCU_STRUCT_INIT_MODULE(name)								\
> +#define __SRCU_STRUCT_INIT_MODULE(name, usage_name)						\
>   {												\
> -	__SRCU_STRUCT_INIT_COMMON(name)								\
> +	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
>   }
>   
> -#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> +#define __SRCU_STRUCT_INIT(name, usage_name, pcpu_name)						\
>   {												\
>   	.sda = &pcpu_name,									\
> -	__SRCU_STRUCT_INIT_COMMON(name)								\
> +	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
>   }
>   
> -
>   /*
>    * Define and initialize a srcu struct at build time.
>    * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> @@ -160,15 +167,17 @@ struct srcu_struct {
>    */
>   #ifdef MODULE
>   # define __DEFINE_SRCU(name, is_static)								\
> -	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> +	static struct srcu_usage name##_srcu_usage;						\
> +	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
>   	extern struct srcu_struct * const __srcu_struct_##name;					\
>   	struct srcu_struct * const __srcu_struct_##name						\
>   		__section("___srcu_struct_ptrs") = &name
>   #else
>   # define __DEFINE_SRCU(name, is_static)								\
>   	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
> +	static struct srcu_usage name##_srcu_usage;						\
>   	is_static struct srcu_struct name =							\
> -		__SRCU_STRUCT_INIT(name, name##_srcu_data)
> +		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
>   #endif
>   #define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
>   #define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index a3adcf9a9919..4a1b9622598b 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -378,11 +378,13 @@ extern void rcu_init_geometry(void);
>    * specified state structure (for SRCU) or the only rcu_state structure
>    * (for RCU).
>    */
> -#define srcu_for_each_node_breadth_first(sp, rnp) \
> +#define _rcu_for_each_node_breadth_first(sp, rnp) \
>   	for ((rnp) = &(sp)->node[0]; \
>   	     (rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++)
>   #define rcu_for_each_node_breadth_first(rnp) \
> -	srcu_for_each_node_breadth_first(&rcu_state, rnp)
> +	_rcu_for_each_node_breadth_first(&rcu_state, rnp)
> +#define srcu_for_each_node_breadth_first(ssp, rnp) \
> +	_rcu_for_each_node_breadth_first(ssp->srcu_sup, rnp)
>   
>   /*
>    * Scan the leaves of the rcu_node hierarchy for the rcu_state structure.
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 7a6d9452a5d0..ad1d5ca42a99 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -173,12 +173,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
>   
>   	/* Initialize geometry if it has not already been initialized. */
>   	rcu_init_geometry();
> -	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), gfp_flags);
> -	if (!ssp->node)
> +	ssp->srcu_sup->node = kcalloc(rcu_num_nodes, sizeof(*ssp->srcu_sup->node), gfp_flags);
> +	if (!ssp->srcu_sup->node)
>   		return false;
>   
>   	/* Work out the overall tree geometry. */
> -	ssp->level[0] = &ssp->node[0];
> +	ssp->level[0] = &ssp->srcu_sup->node[0];
>   	for (i = 1; i < rcu_num_lvls; i++)
>   		ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
>   	rcu_init_levelspread(levelspread, num_rcu_lvl);
> @@ -195,7 +195,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
>   		snp->srcu_gp_seq_needed_exp = SRCU_SNP_INIT_SEQ;
>   		snp->grplo = -1;
>   		snp->grphi = -1;
> -		if (snp == &ssp->node[0]) {
> +		if (snp == &ssp->srcu_sup->node[0]) {
>   			/* Root node, special case. */
>   			snp->srcu_parent = NULL;
>   			continue;
> @@ -236,8 +236,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
>    */
>   static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>   {
> +	if (!is_static)
> +		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
> +	if (!ssp->srcu_sup)
> +		return -ENOMEM;
>   	ssp->srcu_size_state = SRCU_SIZE_SMALL;
> -	ssp->node = NULL;
> +	ssp->srcu_sup->node = NULL;
>   	mutex_init(&ssp->srcu_cb_mutex);
>   	mutex_init(&ssp->srcu_gp_mutex);
>   	ssp->srcu_idx = 0;
> @@ -249,8 +253,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>   	ssp->sda_is_static = is_static;
>   	if (!is_static)
>   		ssp->sda = alloc_percpu(struct srcu_data);
> -	if (!ssp->sda)
> +	if (!ssp->sda) {
> +		if (!is_static)
> +			kfree(ssp->srcu_sup);
>   		return -ENOMEM;
> +	}
>   	init_srcu_struct_data(ssp);
>   	ssp->srcu_gp_seq_needed_exp = 0;
>   	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
> @@ -259,6 +266,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>   			if (!ssp->sda_is_static) {
>   				free_percpu(ssp->sda);
>   				ssp->sda = NULL;
> +				kfree(ssp->srcu_sup);
>   				return -ENOMEM;
>   			}
>   		} else {
> @@ -656,13 +664,15 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
>   			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
>   		return; /* Caller forgot to stop doing call_srcu()? */
>   	}
> +	kfree(ssp->srcu_sup->node);
> +	ssp->srcu_sup->node = NULL;
> +	ssp->srcu_size_state = SRCU_SIZE_SMALL;
>   	if (!ssp->sda_is_static) {
>   		free_percpu(ssp->sda);
>   		ssp->sda = NULL;
> +		kfree(ssp->srcu_sup);
> +		ssp->srcu_sup = NULL;
>   	}
> -	kfree(ssp->node);
> -	ssp->node = NULL;
> -	ssp->srcu_size_state = SRCU_SIZE_SMALL;
>   }
>   EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
>   

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-24 19:10   ` Wysocki, Rafael J
@ 2023-03-24 20:11     ` Paul E. McKenney
  2023-03-26 23:18       ` Christoph Hellwig
  0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24 20:11 UTC (permalink / raw)
  To: Wysocki, Rafael J
  Cc: rcu, linux-kernel, kernel-team, rostedt, hch,
	Michał Mirosław, Dmitry Osipenko

On Fri, Mar 24, 2023 at 08:10:31PM +0100, Wysocki, Rafael J wrote:
> On 3/24/2023 1:19 AM, Paul E. McKenney wrote:
> > The current srcu_struct structure is on the order of 200 bytes in size
> > (depending on architecture and .config), which is much better than the
> > old-style 26K bytes, but still all too inconvenient when one is trying
> > to achieve good cache locality on a fastpath involving SRCU readers.
> > 
> > However, only a few fields in srcu_struct are used by SRCU readers.
> > The remaining fields could be offloaded to a new srcu_update
> > structure, thus shrinking the srcu_struct structure down to a few
> > tens of bytes.  This commit begins this noble quest, a quest that is
> > complicated by open-coded initialization of the srcu_struct within the
> > srcu_notifier_head structure.  This complication is addressed by updating
> > the srcu_notifier_head structure's open coding, given that there does
> > not appear to be a straightforward way of abstracting that initialization.
> > 
> > This commit moves only the ->node pointer to srcu_update.  Later commits
> > will move additional fields.
> > 
> > [ paulmck: Fold in qiang1.zhang@intel.com's memory-leak fix. ]
> > 
> > Link: https://lore.kernel.org/all/20230320055751.4120251-1-qiang1.zhang@intel.com/
> > Suggested-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> > Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
> > Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> 
> Fine with me.
> 
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Thank you!  I will add this on my next rebase.

It is possible that this will be v6.5 rather than v6.4 material.

							Thanx, Paul

> > ---
> >   include/linux/notifier.h |  5 ++++-
> >   include/linux/srcutiny.h |  6 +++---
> >   include/linux/srcutree.h | 27 ++++++++++++++++++---------
> >   kernel/rcu/rcu.h         |  6 ++++--
> >   kernel/rcu/srcutree.c    | 28 +++++++++++++++++++---------
> >   5 files changed, 48 insertions(+), 24 deletions(-)
> > 
> > diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> > index aef88c2d1173..2aba75145144 100644
> > --- a/include/linux/notifier.h
> > +++ b/include/linux/notifier.h
> > @@ -73,6 +73,9 @@ struct raw_notifier_head {
> >   struct srcu_notifier_head {
> >   	struct mutex mutex;
> > +#ifdef CONFIG_TREE_SRCU
> > +	struct srcu_usage srcuu;
> > +#endif
> >   	struct srcu_struct srcu;
> >   	struct notifier_block __rcu *head;
> >   };
> > @@ -107,7 +110,7 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
> >   	{							\
> >   		.mutex = __MUTEX_INITIALIZER(name.mutex),	\
> >   		.head = NULL,					\
> > -		.srcu = __SRCU_STRUCT_INIT(name.srcu, pcpu),	\
> > +		.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
> >   	}
> >   #define ATOMIC_NOTIFIER_HEAD(name)				\
> > diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
> > index 5aa5e0faf6a1..ebd72491af99 100644
> > --- a/include/linux/srcutiny.h
> > +++ b/include/linux/srcutiny.h
> > @@ -31,7 +31,7 @@ struct srcu_struct {
> >   void srcu_drive_gp(struct work_struct *wp);
> > -#define __SRCU_STRUCT_INIT(name, __ignored)				\
> > +#define __SRCU_STRUCT_INIT(name, __ignored, ___ignored)			\
> >   {									\
> >   	.srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq),	\
> >   	.srcu_cb_tail = &name.srcu_cb_head,				\
> > @@ -44,9 +44,9 @@ void srcu_drive_gp(struct work_struct *wp);
> >    * Tree SRCU, which needs some per-CPU data.
> >    */
> >   #define DEFINE_SRCU(name) \
> > -	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
> > +	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
> >   #define DEFINE_STATIC_SRCU(name) \
> > -	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
> > +	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
> >   void synchronize_srcu(struct srcu_struct *ssp);
> > diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> > index 428480152375..2689e64024bb 100644
> > --- a/include/linux/srcutree.h
> > +++ b/include/linux/srcutree.h
> > @@ -57,11 +57,17 @@ struct srcu_node {
> >   	int grphi;				/* Biggest CPU for node. */
> >   };
> > +/*
> > + * Per-SRCU-domain structure, update-side data linked from srcu_struct.
> > + */
> > +struct srcu_usage {
> > +	struct srcu_node *node;			/* Combining tree. */
> > +};
> > +
> >   /*
> >    * Per-SRCU-domain structure, similar in function to rcu_state.
> >    */
> >   struct srcu_struct {
> > -	struct srcu_node *node;			/* Combining tree. */
> >   	struct srcu_node *level[RCU_NUM_LVLS + 1];
> >   						/* First node at each level. */
> >   	int srcu_size_state;			/* Small-to-big transition state. */
> > @@ -90,6 +96,7 @@ struct srcu_struct {
> >   	unsigned long reschedule_count;
> >   	struct delayed_work work;
> >   	struct lockdep_map dep_map;
> > +	struct srcu_usage *srcu_sup;		/* Update-side data. */
> >   };
> >   // Values for size state variable (->srcu_size_state).  Once the state
> > @@ -121,24 +128,24 @@ struct srcu_struct {
> >   #define SRCU_STATE_SCAN1	1
> >   #define SRCU_STATE_SCAN2	2
> > -#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > +#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
> >   	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> >   	.srcu_gp_seq_needed = -1UL,								\
> >   	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> > +	.srcu_sup = &usage_name,								\
> >   	__SRCU_DEP_MAP_INIT(name)
> > -#define __SRCU_STRUCT_INIT_MODULE(name)								\
> > +#define __SRCU_STRUCT_INIT_MODULE(name, usage_name)						\
> >   {												\
> > -	__SRCU_STRUCT_INIT_COMMON(name)								\
> > +	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
> >   }
> > -#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> > +#define __SRCU_STRUCT_INIT(name, usage_name, pcpu_name)						\
> >   {												\
> >   	.sda = &pcpu_name,									\
> > -	__SRCU_STRUCT_INIT_COMMON(name)								\
> > +	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
> >   }
> > -
> >   /*
> >    * Define and initialize a srcu struct at build time.
> >    * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> > @@ -160,15 +167,17 @@ struct srcu_struct {
> >    */
> >   #ifdef MODULE
> >   # define __DEFINE_SRCU(name, is_static)								\
> > -	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > +	static struct srcu_usage name##_srcu_usage;						\
> > +	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
> >   	extern struct srcu_struct * const __srcu_struct_##name;					\
> >   	struct srcu_struct * const __srcu_struct_##name						\
> >   		__section("___srcu_struct_ptrs") = &name
> >   #else
> >   # define __DEFINE_SRCU(name, is_static)								\
> >   	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
> > +	static struct srcu_usage name##_srcu_usage;						\
> >   	is_static struct srcu_struct name =							\
> > -		__SRCU_STRUCT_INIT(name, name##_srcu_data)
> > +		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
> >   #endif
> >   #define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
> >   #define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index a3adcf9a9919..4a1b9622598b 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -378,11 +378,13 @@ extern void rcu_init_geometry(void);
> >    * specified state structure (for SRCU) or the only rcu_state structure
> >    * (for RCU).
> >    */
> > -#define srcu_for_each_node_breadth_first(sp, rnp) \
> > +#define _rcu_for_each_node_breadth_first(sp, rnp) \
> >   	for ((rnp) = &(sp)->node[0]; \
> >   	     (rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++)
> >   #define rcu_for_each_node_breadth_first(rnp) \
> > -	srcu_for_each_node_breadth_first(&rcu_state, rnp)
> > +	_rcu_for_each_node_breadth_first(&rcu_state, rnp)
> > +#define srcu_for_each_node_breadth_first(ssp, rnp) \
> > +	_rcu_for_each_node_breadth_first(ssp->srcu_sup, rnp)
> >   /*
> >    * Scan the leaves of the rcu_node hierarchy for the rcu_state structure.
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 7a6d9452a5d0..ad1d5ca42a99 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -173,12 +173,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
> >   	/* Initialize geometry if it has not already been initialized. */
> >   	rcu_init_geometry();
> > -	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), gfp_flags);
> > -	if (!ssp->node)
> > +	ssp->srcu_sup->node = kcalloc(rcu_num_nodes, sizeof(*ssp->srcu_sup->node), gfp_flags);
> > +	if (!ssp->srcu_sup->node)
> >   		return false;
> >   	/* Work out the overall tree geometry. */
> > -	ssp->level[0] = &ssp->node[0];
> > +	ssp->level[0] = &ssp->srcu_sup->node[0];
> >   	for (i = 1; i < rcu_num_lvls; i++)
> >   		ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
> >   	rcu_init_levelspread(levelspread, num_rcu_lvl);
> > @@ -195,7 +195,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
> >   		snp->srcu_gp_seq_needed_exp = SRCU_SNP_INIT_SEQ;
> >   		snp->grplo = -1;
> >   		snp->grphi = -1;
> > -		if (snp == &ssp->node[0]) {
> > +		if (snp == &ssp->srcu_sup->node[0]) {
> >   			/* Root node, special case. */
> >   			snp->srcu_parent = NULL;
> >   			continue;
> > @@ -236,8 +236,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
> >    */
> >   static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> >   {
> > +	if (!is_static)
> > +		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
> > +	if (!ssp->srcu_sup)
> > +		return -ENOMEM;
> >   	ssp->srcu_size_state = SRCU_SIZE_SMALL;
> > -	ssp->node = NULL;
> > +	ssp->srcu_sup->node = NULL;
> >   	mutex_init(&ssp->srcu_cb_mutex);
> >   	mutex_init(&ssp->srcu_gp_mutex);
> >   	ssp->srcu_idx = 0;
> > @@ -249,8 +253,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> >   	ssp->sda_is_static = is_static;
> >   	if (!is_static)
> >   		ssp->sda = alloc_percpu(struct srcu_data);
> > -	if (!ssp->sda)
> > +	if (!ssp->sda) {
> > +		if (!is_static)
> > +			kfree(ssp->srcu_sup);
> >   		return -ENOMEM;
> > +	}
> >   	init_srcu_struct_data(ssp);
> >   	ssp->srcu_gp_seq_needed_exp = 0;
> >   	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
> > @@ -259,6 +266,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> >   			if (!ssp->sda_is_static) {
> >   				free_percpu(ssp->sda);
> >   				ssp->sda = NULL;
> > +				kfree(ssp->srcu_sup);
> >   				return -ENOMEM;
> >   			}
> >   		} else {
> > @@ -656,13 +664,15 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
> >   			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
> >   		return; /* Caller forgot to stop doing call_srcu()? */
> >   	}
> > +	kfree(ssp->srcu_sup->node);
> > +	ssp->srcu_sup->node = NULL;
> > +	ssp->srcu_size_state = SRCU_SIZE_SMALL;
> >   	if (!ssp->sda_is_static) {
> >   		free_percpu(ssp->sda);
> >   		ssp->sda = NULL;
> > +		kfree(ssp->srcu_sup);
> > +		ssp->srcu_sup = NULL;
> >   	}
> > -	kfree(ssp->node);
> > -	ssp->node = NULL;
> > -	ssp->srcu_size_state = SRCU_SIZE_SMALL;
> >   }
> >   EXPORT_SYMBOL_GPL(cleanup_srcu_struct);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-24 20:11     ` Paul E. McKenney
@ 2023-03-26 23:18       ` Christoph Hellwig
  2023-03-27  3:29         ` Paul E. McKenney
  2023-04-24 21:06         ` Paul E. McKenney
  0 siblings, 2 replies; 30+ messages in thread
From: Christoph Hellwig @ 2023-03-26 23:18 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Wysocki, Rafael J, rcu, linux-kernel, kernel-team, rostedt, hch,
	Michał Mirosław, Dmitry Osipenko

On Fri, Mar 24, 2023 at 01:11:47PM -0700, Paul E. McKenney wrote:
> > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Thank you!  I will add this on my next rebase.
> 
> It is possible that this will be v6.5 rather than v6.4 material.

I was hoping the RCU bits could land in 6.4, so that the block
layer work to take advantage of it can go into 6.5 without cross-tree
dependencies.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-26 23:18       ` Christoph Hellwig
@ 2023-03-27  3:29         ` Paul E. McKenney
  2023-04-24 21:06         ` Paul E. McKenney
  1 sibling, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-27  3:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Wysocki, Rafael J, rcu, linux-kernel, kernel-team, rostedt,
	Michał Mirosław, Dmitry Osipenko

On Mon, Mar 27, 2023 at 01:18:58AM +0200, Christoph Hellwig wrote:
> On Fri, Mar 24, 2023 at 01:11:47PM -0700, Paul E. McKenney wrote:
> > > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Thank you!  I will add this on my next rebase.
> > 
> > It is possible that this will be v6.5 rather than v6.4 material.
> 
> I was hoping the RCU bits could land in 6.4, so that the block
> layer work to take advantage of it can go into 6.5 without cross-tree
> dependencies.

Indeed, that patch series does hit a large chunk of SRCU, so my usual
offer of just giving you the patches is likely to run into trouble.

I will see what we can do about v6.4.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-24  0:19 ` [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct Paul E. McKenney
@ 2023-03-30  4:11   ` Zhang, Qiang1
  2023-03-30 14:55     ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Zhang, Qiang1 @ 2023-03-30  4:11 UTC (permalink / raw)
  To: Paul E. McKenney, rcu
  Cc: linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

>
>Further shrinking the srcu_struct structure is eased by requiring
>that in-module srcu_struct structures rely more heavily on static
>initialization.  In particular, this preserves the property that
>a module-load-time srcu_struct initialization can fail only due
>to memory-allocation failure of the per-CPU srcu_data structures.
>It might also slightly improve robustness by keeping the number of memory
>allocations that must succeed down to the single percpu_alloc() call.
>
>This is in preparation for splitting an srcu_usage structure out
>of the srcu_struct structure.
>
>[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
>
>Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>Cc: Christoph Hellwig <hch@lst.de>
>---
> include/linux/srcutree.h | 19 ++++++++++++++-----
> kernel/rcu/srcutree.c    | 19 +++++++++++++------
> 2 files changed, 27 insertions(+), 11 deletions(-)
>
>diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
>index ac8af12f93b3..428480152375 100644
>--- a/include/linux/srcutree.h
>+++ b/include/linux/srcutree.h
>@@ -121,15 +121,24 @@ struct srcu_struct {
> #define SRCU_STATE_SCAN1	1
> #define SRCU_STATE_SCAN2	2
> 
>-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
>-{												\
>-	.sda = &pcpu_name,									\
>+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> 	.srcu_gp_seq_needed = -1UL,								\
> 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
>-	__SRCU_DEP_MAP_INIT(name)								\
>+	__SRCU_DEP_MAP_INIT(name)
>+
>+#define __SRCU_STRUCT_INIT_MODULE(name)								\
>+{												\
>+	__SRCU_STRUCT_INIT_COMMON(name)								\
> }
> 
>+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
>+{												\
>+	.sda = &pcpu_name,									\
>+	__SRCU_STRUCT_INIT_COMMON(name)								\
>+}
>+
>+
> /*
>  * Define and initialize a srcu struct at build time.
>  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
>@@ -151,7 +160,7 @@ struct srcu_struct {
>  */
> #ifdef MODULE
> # define __DEFINE_SRCU(name, is_static)								\
>-	is_static struct srcu_struct name;							\
>+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> 	extern struct srcu_struct * const __srcu_struct_##name;					\
> 	struct srcu_struct * const __srcu_struct_##name						\
> 		__section("___srcu_struct_ptrs") = &name
>diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>index cd46fe063e50..7a6d9452a5d0 100644
>--- a/kernel/rcu/srcutree.c
>+++ b/kernel/rcu/srcutree.c
>@@ -1895,13 +1895,14 @@ void __init srcu_init(void)
> static int srcu_module_coming(struct module *mod)
> {
> 	int i;
>+	struct srcu_struct *ssp;
> 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
>-	int ret;
> 
> 	for (i = 0; i < mod->num_srcu_structs; i++) {
>-		ret = init_srcu_struct(*(sspp++));
>-		if (WARN_ON_ONCE(ret))
>-			return ret;
>+		ssp = *(sspp++);
>+		ssp->sda = alloc_percpu(struct srcu_data);
>+		if (WARN_ON_ONCE(!ssp->sda))
>+			return -ENOMEM;
> 	}
> 	return 0;
> }
>@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module *mod)
> static void srcu_module_going(struct module *mod)
> {
> 	int i;
>+	struct srcu_struct *ssp;
> 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> 
>-	for (i = 0; i < mod->num_srcu_structs; i++)
>-		cleanup_srcu_struct(*(sspp++));
>+	for (i = 0; i < mod->num_srcu_structs; i++) {
>+		ssp = *(sspp++);
>+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>+				cleanup_srcu_struct(ssp);
>+		free_percpu(ssp->sda);


Hi Paul

About 037b80b8865fb ("srcu: Check for readers at module-exit time ")

--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
                if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
                    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
                        cleanup_srcu_struct(ssp);
-               free_percpu(ssp->sda);
+               else if (!WARN_ON(srcu_readers_active(ssp)))
+                       free_percpu(ssp->sda);

Should the else statement be removed?  like this:

if (!WARN_ON(srcu_readers_active(ssp)))
	free_percpu(ssp->sda);

Thanks
Zqiang


>+	}
> }
> 
> /* Handle one module, either coming or going. */
>-- 
>2.40.0.rc2
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-30  4:11   ` Zhang, Qiang1
@ 2023-03-30 14:55     ` Paul E. McKenney
  2023-03-30 15:03       ` Zhang, Qiang1
  0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-30 14:55 UTC (permalink / raw)
  To: Zhang, Qiang1
  Cc: rcu, linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

On Thu, Mar 30, 2023 at 04:11:05AM +0000, Zhang, Qiang1 wrote:
> >
> >Further shrinking the srcu_struct structure is eased by requiring
> >that in-module srcu_struct structures rely more heavily on static
> >initialization.  In particular, this preserves the property that
> >a module-load-time srcu_struct initialization can fail only due
> >to memory-allocation failure of the per-CPU srcu_data structures.
> >It might also slightly improve robustness by keeping the number of memory
> >allocations that must succeed down to the single percpu_alloc() call.
> >
> >This is in preparation for splitting an srcu_usage structure out
> >of the srcu_struct structure.
> >
> >[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
> >
> >Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >Cc: Christoph Hellwig <hch@lst.de>
> >---
> > include/linux/srcutree.h | 19 ++++++++++++++-----
> > kernel/rcu/srcutree.c    | 19 +++++++++++++------
> > 2 files changed, 27 insertions(+), 11 deletions(-)
> >
> >diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> >index ac8af12f93b3..428480152375 100644
> >--- a/include/linux/srcutree.h
> >+++ b/include/linux/srcutree.h
> >@@ -121,15 +121,24 @@ struct srcu_struct {
> > #define SRCU_STATE_SCAN1	1
> > #define SRCU_STATE_SCAN2	2
> > 
> >-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >-{												\
> >-	.sda = &pcpu_name,									\
> >+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> > 	.srcu_gp_seq_needed = -1UL,								\
> > 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> >-	__SRCU_DEP_MAP_INIT(name)								\
> >+	__SRCU_DEP_MAP_INIT(name)
> >+
> >+#define __SRCU_STRUCT_INIT_MODULE(name)								\
> >+{												\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > }
> > 
> >+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >+{												\
> >+	.sda = &pcpu_name,									\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> >+}
> >+
> >+
> > /*
> >  * Define and initialize a srcu struct at build time.
> >  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> >@@ -151,7 +160,7 @@ struct srcu_struct {
> >  */
> > #ifdef MODULE
> > # define __DEFINE_SRCU(name, is_static)								\
> >-	is_static struct srcu_struct name;							\
> >+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > 	extern struct srcu_struct * const __srcu_struct_##name;					\
> > 	struct srcu_struct * const __srcu_struct_##name						\
> > 		__section("___srcu_struct_ptrs") = &name
> >diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> >index cd46fe063e50..7a6d9452a5d0 100644
> >--- a/kernel/rcu/srcutree.c
> >+++ b/kernel/rcu/srcutree.c
> >@@ -1895,13 +1895,14 @@ void __init srcu_init(void)
> > static int srcu_module_coming(struct module *mod)
> > {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> >-	int ret;
> > 
> > 	for (i = 0; i < mod->num_srcu_structs; i++) {
> >-		ret = init_srcu_struct(*(sspp++));
> >-		if (WARN_ON_ONCE(ret))
> >-			return ret;
> >+		ssp = *(sspp++);
> >+		ssp->sda = alloc_percpu(struct srcu_data);
> >+		if (WARN_ON_ONCE(!ssp->sda))
> >+			return -ENOMEM;
> > 	}
> > 	return 0;
> > }
> >@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module *mod)
> > static void srcu_module_going(struct module *mod)
> > {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > 
> >-	for (i = 0; i < mod->num_srcu_structs; i++)
> >-		cleanup_srcu_struct(*(sspp++));
> >+	for (i = 0; i < mod->num_srcu_structs; i++) {
> >+		ssp = *(sspp++);
> >+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> >+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> >+				cleanup_srcu_struct(ssp);
> >+		free_percpu(ssp->sda);
> 
> 
> Hi Paul
> 
> About 037b80b8865fb ("srcu: Check for readers at module-exit time ")
> 
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
>                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>                         cleanup_srcu_struct(ssp);
> -               free_percpu(ssp->sda);
> +               else if (!WARN_ON(srcu_readers_active(ssp)))
> +                       free_percpu(ssp->sda);
> 
> Should the else statement be removed?  like this:
> 
> if (!WARN_ON(srcu_readers_active(ssp)))
> 	free_percpu(ssp->sda);

Mightn't that cause us to double-free ssp->sda?  Once in free_percpu(),
and before that in cleanup_srcu_struct()?

							Thanx, Paul

> Thanks
> Zqiang
> 
> 
> >+	}
> > }
> > 
> > /* Handle one module, either coming or going. */
> >-- 
> >2.40.0.rc2
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-30 14:55     ` Paul E. McKenney
@ 2023-03-30 15:03       ` Zhang, Qiang1
  2023-03-30 15:20         ` Zhang, Qiang1
  0 siblings, 1 reply; 30+ messages in thread
From: Zhang, Qiang1 @ 2023-03-30 15:03 UTC (permalink / raw)
  To: paulmck; +Cc: rcu, linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

> >Further shrinking the srcu_struct structure is eased by requiring
> >that in-module srcu_struct structures rely more heavily on static
> >initialization.  In particular, this preserves the property that
> >a module-load-time srcu_struct initialization can fail only due
> >to memory-allocation failure of the per-CPU srcu_data structures.
> >It might also slightly improve robustness by keeping the number of memory
> >allocations that must succeed down percpu_alloc() call.
> >
> >This is in preparation for splitting an srcu_usage structure out
> >of the srcu_struct structure.
> >
> >[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
> >
> >Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >Cc: Christoph Hellwig <hch@lst.de>
> >---
> > include/linux/srcutree.h | 19 ++++++++++++++-----
> > kernel/rcu/srcutree.c    | 19 +++++++++++++------
> > 2 files changed, 27 insertions(+), 11 deletions(-)
> >
> >diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> >index ac8af12f93b3..428480152375 100644
> >--- a/include/linux/srcutree.h
> >+++ b/include/linux/srcutree.h
> >@@ -121,15 +121,24 @@ struct srcu_struct {
> > #define SRCU_STATE_SCAN1	1
> > #define SRCU_STATE_SCAN2	2
> > 
> >-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >-{												\
> >-	.sda = &pcpu_name,									\
> >+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> > 	.srcu_gp_seq_needed = -1UL,								\
> > 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> >-	__SRCU_DEP_MAP_INIT(name)								\
> >+	__SRCU_DEP_MAP_INIT(name)
> >+
> >+#define __SRCU_STRUCT_INIT_MODULE(name)								\
> >+{												\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > }
> > 
> >+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >+{												\
> >+	.sda = &pcpu_name,									\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> >+}
> >+
> >+
> > /*
> >  * Define and initialize a srcu struct at build time.
> >  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> >@@ -151,7 +160,7 @@ struct srcu_struct {
> >  */
> > #ifdef MODULE
> > # define __DEFINE_SRCU(name, is_static)								\
> >-	is_static struct srcu_struct name;							\
> >+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > 	extern struct srcu_struct * const __srcu_struct_##name;					\
> > 	struct srcu_struct * const __srcu_struct_##name						\
> > 		__section("___srcu_struct_ptrs") = &name
> >diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> >index cd46fe063e50..7a6d9452a5d0 100644
> >--- a/kernel/rcu/srcutree.c
> >+++ b/kernel/rcu/srcutree.c
> >@@ -1895,13 +1895,14 @@ void __init srcu_init(void)
> > static int srcu_module_coming(struct module *mod)
> > {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> >-	int ret;
> > 
> > 	for (i = 0; i < mod->num_srcu_structs; i++) {
> >-		ret = init_srcu_struct(*(sspp++));
> >-		if (WARN_ON_ONCE(ret))
> >-			return ret;
> >+		ssp = *(sspp++);
> >+		ssp->sda = alloc_percpu(struct srcu_data);
> >+		if (WARN_ON_ONCE(!ssp->sda))
> >+			return -ENOMEM;
> > 	}
> > 	return 0;
> > }
> >@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module *mod)
> > static void srcu_module_going(struct module *mod)
> > {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > 
> >-	for (i = 0; i < mod->num_srcu_structs; i++)
> >-		cleanup_srcu_struct(*(sspp++));
> >+	for (i = 0; i < mod->num_srcu_structs; i++) {
> >+		ssp = *(sspp++);
> >+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> >+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> >+				cleanup_srcu_struct(ssp);
> >+		free_percpu(ssp->sda);
> 
> 
> Hi Paul
> 
> About 037b80b8865fb ("srcu: Check for readers at module-exit time ")
> 
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
>                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>                         cleanup_srcu_struct(ssp);


When srcu_sup->sda_is_static is true, cleanup_srcu_struct() does not free ssp->sda.


> -               free_percpu(ssp->sda);
> +               else if (!WARN_ON(srcu_readers_active(ssp)))
> +                       free_percpu(ssp->sda);
> 
> Should the else statement be removed?  like this:
> 
> if (!WARN_ON(srcu_readers_active(ssp)))
> 	free_percpu(ssp->sda);
>
>Mightn't that cause us to double-free ssp->sda?  Once in free_percpu(),
>and before that in cleanup_srcu_struct()?

Thanks
Zqiang

>
>							Thanx, Paul
>
> Thanks
> Zqiang
> 
> 
> >+	}
> > }
> > 
> > /* Handle one module, either coming or going. */
> >-- 
> >2.40.0.rc2
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-30 15:03       ` Zhang, Qiang1
@ 2023-03-30 15:20         ` Zhang, Qiang1
  2023-03-30 17:05           ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Zhang, Qiang1 @ 2023-03-30 15:20 UTC (permalink / raw)
  To: Zhang, Qiang1, paulmck
  Cc: rcu, linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

> >Further shrinking the srcu_struct structure is eased by requiring 
> >that in-module srcu_struct structures rely more heavily on static 
> >initialization.  In particular, this preserves the property that a 
> >module-load-time srcu_struct initialization can fail only due to 
> >memory-allocation failure of the per-CPU srcu_data structures.
> >It might also slightly improve robustness by keeping the number of
> >memory allocations that must succeed down to the single percpu_alloc() call.
> >
> >This is in preparation for splitting an srcu_usage structure out of 
> >the srcu_struct structure.
> >
> >[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
> >
> >Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >Cc: Christoph Hellwig <hch@lst.de>
> >---
> > include/linux/srcutree.h | 19 ++++++++++++++-----
> > kernel/rcu/srcutree.c    | 19 +++++++++++++------
> > 2 files changed, 27 insertions(+), 11 deletions(-)
> >
> >diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h 
> >index ac8af12f93b3..428480152375 100644
> >--- a/include/linux/srcutree.h
> >+++ b/include/linux/srcutree.h
> >@@ -121,15 +121,24 @@ struct srcu_struct {
> > #define SRCU_STATE_SCAN1	1
> > #define SRCU_STATE_SCAN2	2
> > 
> >-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >-{												\
> >-	.sda = &pcpu_name,									\
> >+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> > 	.srcu_gp_seq_needed = -1UL,								\
> > 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> >-	__SRCU_DEP_MAP_INIT(name)								\
> >+	__SRCU_DEP_MAP_INIT(name)
> >+
> >+#define __SRCU_STRUCT_INIT_MODULE(name)								\
> >+{												\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > }
> > 
> >+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >+{												\
> >+	.sda = &pcpu_name,									\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> >+}
> >+
> >+
> > /*
> >  * Define and initialize a srcu struct at build time.
> >  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> >@@ -151,7 +160,7 @@ struct srcu_struct {
> >  */
> > #ifdef MODULE
> > # define __DEFINE_SRCU(name, is_static)								\
> >-	is_static struct srcu_struct name;							\
> >+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > 	extern struct srcu_struct * const __srcu_struct_##name;					\
> > 	struct srcu_struct * const __srcu_struct_##name						\
> > 		__section("___srcu_struct_ptrs") = &name diff --git 
> >a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 
> >cd46fe063e50..7a6d9452a5d0 100644
> >--- a/kernel/rcu/srcutree.c
> >+++ b/kernel/rcu/srcutree.c
> >@@ -1895,13 +1895,14 @@ void __init srcu_init(void)  static int 
> >srcu_module_coming(struct module *mod)  {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> >-	int ret;
> > 
> > 	for (i = 0; i < mod->num_srcu_structs; i++) {
> >-		ret = init_srcu_struct(*(sspp++));
> >-		if (WARN_ON_ONCE(ret))
> >-			return ret;
> >+		ssp = *(sspp++);
> >+		ssp->sda = alloc_percpu(struct srcu_data);
> >+		if (WARN_ON_ONCE(!ssp->sda))
> >+			return -ENOMEM;
> > 	}
> > 	return 0;
> > }
> >@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module 
> >*mod)  static void srcu_module_going(struct module *mod)  {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > 
> >-	for (i = 0; i < mod->num_srcu_structs; i++)
> >-		cleanup_srcu_struct(*(sspp++));
> >+	for (i = 0; i < mod->num_srcu_structs; i++) {
> >+		ssp = *(sspp++);
> >+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> >+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> >+				cleanup_srcu_struct(ssp);
> >+		free_percpu(ssp->sda);
> 
> 
> Hi Paul
> 
> About 037b80b8865fb ("srcu: Check for readers at module-exit time ")
> 
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
>                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>                         cleanup_srcu_struct(ssp);


When srcu_sup->sda_is_static is true, cleanup_srcu_struct() does not free ssp->sda.


> -               free_percpu(ssp->sda);
> +               else if (!WARN_ON(srcu_readers_active(ssp)))
> +                       free_percpu(ssp->sda);
> 
> Should the else statement be removed?  like this:
> 
> if (!WARN_ON(srcu_readers_active(ssp)))
> 	free_percpu(ssp->sda);
>
>Mightn't that cause us to double-free ssp->sda?  Once in free_percpu(), 
>and before that in cleanup_srcu_struct()?


How about this? Any thoughts?

--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1937,7 +1937,7 @@ static void srcu_module_going(struct module *mod)
                if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
                    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
                        cleanup_srcu_struct(ssp);
-               else if (!WARN_ON(srcu_readers_active(ssp)))
+               if (!WARN_ON(srcu_readers_active(ssp)))
                        free_percpu(ssp->sda);
        }
 }

Thanks
Zqiang

>
>							Thanx, Paul
>
> Thanks
> Zqiang
> 
> 
> >+	}
> > }
> > 
> > /* Handle one module, either coming or going. */
> >--
> >2.40.0.rc2
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-30 15:20         ` Zhang, Qiang1
@ 2023-03-30 17:05           ` Paul E. McKenney
  0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-30 17:05 UTC (permalink / raw)
  To: Zhang, Qiang1
  Cc: rcu, linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

On Thu, Mar 30, 2023 at 03:20:15PM +0000, Zhang, Qiang1 wrote:
> > >Further shrinking the srcu_struct structure is eased by requiring 
> > >that in-module srcu_struct structures rely more heavily on static 
> > >initialization.  In particular, this preserves the property that a 
> > >module-load-time srcu_struct initialization can fail only due to 
> > >memory-allocation failure of the per-CPU srcu_data structures.
> > >It might also slightly improve robustness by keeping the number of
> > >memory allocations that must succeed down to the single percpu_alloc() call.
> > >
> > >This is in preparation for splitting an srcu_usage structure out of 
> > >the srcu_struct structure.
> > >
> > >[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
> > >
> > >Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > >Cc: Christoph Hellwig <hch@lst.de>
> > >---
> > > include/linux/srcutree.h | 19 ++++++++++++++-----
> > > kernel/rcu/srcutree.c    | 19 +++++++++++++------
> > > 2 files changed, 27 insertions(+), 11 deletions(-)
> > >
> > >diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h 
> > >index ac8af12f93b3..428480152375 100644
> > >--- a/include/linux/srcutree.h
> > >+++ b/include/linux/srcutree.h
> > >@@ -121,15 +121,24 @@ struct srcu_struct {
> > > #define SRCU_STATE_SCAN1	1
> > > #define SRCU_STATE_SCAN2	2
> > > 
> > >-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> > >-{												\
> > >-	.sda = &pcpu_name,									\
> > >+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > > 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> > > 	.srcu_gp_seq_needed = -1UL,								\
> > > 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> > >-	__SRCU_DEP_MAP_INIT(name)								\
> > >+	__SRCU_DEP_MAP_INIT(name)
> > >+
> > >+#define __SRCU_STRUCT_INIT_MODULE(name)								\
> > >+{												\
> > >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > > }
> > > 
> > >+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> > >+{												\
> > >+	.sda = &pcpu_name,									\
> > >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > >+}
> > >+
> > >+
> > > /*
> > >  * Define and initialize a srcu struct at build time.
> > >  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> > >@@ -151,7 +160,7 @@ struct srcu_struct {
> > >  */
> > > #ifdef MODULE
> > > # define __DEFINE_SRCU(name, is_static)								\
> > >-	is_static struct srcu_struct name;							\
> > >+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > > 	extern struct srcu_struct * const __srcu_struct_##name;					\
> > > 	struct srcu_struct * const __srcu_struct_##name						\
> > > 		__section("___srcu_struct_ptrs") = &name diff --git 
> > >a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 
> > >cd46fe063e50..7a6d9452a5d0 100644
> > >--- a/kernel/rcu/srcutree.c
> > >+++ b/kernel/rcu/srcutree.c
> > >@@ -1895,13 +1895,14 @@ void __init srcu_init(void)  static int 
> > >srcu_module_coming(struct module *mod)  {
> > > 	int i;
> > >+	struct srcu_struct *ssp;
> > > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > >-	int ret;
> > > 
> > > 	for (i = 0; i < mod->num_srcu_structs; i++) {
> > >-		ret = init_srcu_struct(*(sspp++));
> > >-		if (WARN_ON_ONCE(ret))
> > >-			return ret;
> > >+		ssp = *(sspp++);
> > >+		ssp->sda = alloc_percpu(struct srcu_data);
> > >+		if (WARN_ON_ONCE(!ssp->sda))
> > >+			return -ENOMEM;
> > > 	}
> > > 	return 0;
> > > }
> > >@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module 
> > >*mod)  static void srcu_module_going(struct module *mod)  {
> > > 	int i;
> > >+	struct srcu_struct *ssp;
> > > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > > 
> > >-	for (i = 0; i < mod->num_srcu_structs; i++)
> > >-		cleanup_srcu_struct(*(sspp++));
> > >+	for (i = 0; i < mod->num_srcu_structs; i++) {
> > >+		ssp = *(sspp++);
> > >+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> > >+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> > >+				cleanup_srcu_struct(ssp);
> > >+		free_percpu(ssp->sda);
> > 
> > 
> > Hi Paul
> > 
> > About 037b80b8865fb ("srcu: Check for readers at module-exit time ")
> > 
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
> >                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> >                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> >                         cleanup_srcu_struct(ssp);
> 
> 
> > When srcu_sup->sda_is_static is true, cleanup_srcu_struct() does not free ssp->sda.

Very good, thank you!  I will fold your suggested fix into this commit:

037b80b8865f ("srcu: Check for readers at module-exit time")
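
For the record, with your change folded in (and assuming nothing else in
the function changes), srcu_module_going() would read roughly like this:

	static void srcu_module_going(struct module *mod)
	{
		int i;
		struct srcu_struct *ssp;
		struct srcu_struct **sspp = mod->srcu_struct_ptrs;

		for (i = 0; i < mod->num_srcu_structs; i++) {
			ssp = *(sspp++);
			if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
			    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
				cleanup_srcu_struct(ssp);
			/* No longer an "else if": free ->sda even after cleanup_srcu_struct(). */
			if (!WARN_ON(srcu_readers_active(ssp)))
				free_percpu(ssp->sda);
		}
	}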

							Thanx, Paul

> > -               free_percpu(ssp->sda);
> > +               else if (!WARN_ON(srcu_readers_active(ssp)))
> > +                       free_percpu(ssp->sda);
> > 
> > Should the else statement be removed?  like this:
> > 
> > if (!WARN_ON(srcu_readers_active(ssp)))
> > 	free_percpu(ssp->sda);
> >
> >Mightn't that cause us to double-free ssp->sda?  Once in free_percpu(), 
> >and before that in cleanup_srcu_struct()?
> 
> 
> How about this? Any thoughts?
> 
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1937,7 +1937,7 @@ static void srcu_module_going(struct module *mod)
>                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>                         cleanup_srcu_struct(ssp);
> -               else if (!WARN_ON(srcu_readers_active(ssp)))
> +               if (!WARN_ON(srcu_readers_active(ssp)))
>                         free_percpu(ssp->sda);
>         }
>  }
> 
> Thanks
> Zqiang
> 
> >
> >							Thanx, Paul
> >
> > Thanks
> > Zqiang
> > 
> > 
> > >+	}
> > > }
> > > 
> > > /* Handle one module, either coming or going. */
> > >--
> > >2.40.0.rc2
> > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-26 23:18       ` Christoph Hellwig
  2023-03-27  3:29         ` Paul E. McKenney
@ 2023-04-24 21:06         ` Paul E. McKenney
  1 sibling, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-04-24 21:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Wysocki, Rafael J, rcu, linux-kernel, kernel-team, rostedt,
	Michał Mirosław, Dmitry Osipenko

On Mon, Mar 27, 2023 at 01:18:58AM +0200, Christoph Hellwig wrote:
> On Fri, Mar 24, 2023 at 01:11:47PM -0700, Paul E. McKenney wrote:
> > > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Thank you!  I will add this on my next rebase.
> > 
> > It is possible that this will be v6.5 rather than v6.4 material.
> 
> I was hoping the RCU bits could land in 6.4, so that the block
> layer work to take advantage of it can go into 6.5 without cross-tree
> dependencies.

And this is now in mainline.  Please let me know how it goes!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread

Thread overview: 30+ messages
2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct Paul E. McKenney
2023-03-30  4:11   ` Zhang, Qiang1
2023-03-30 14:55     ` Paul E. McKenney
2023-03-30 15:03       ` Zhang, Qiang1
2023-03-30 15:20         ` Zhang, Qiang1
2023-03-30 17:05           ` Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update Paul E. McKenney
2023-03-24 19:10   ` Wysocki, Rafael J
2023-03-24 20:11     ` Paul E. McKenney
2023-03-26 23:18       ` Christoph Hellwig
2023-03-27  3:29         ` Paul E. McKenney
2023-04-24 21:06         ` Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 04/19] srcu: Move ->level from srcu_struct to srcu_usage Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 05/19] srcu: Move ->srcu_size_state " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 06/19] srcu: Move ->srcu_cb_mutex " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 07/19] srcu: Move ->lock initialization after srcu_usage allocation Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 08/19] srcu: Move ->lock from srcu_struct to srcu_usage Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 09/19] srcu: Move ->srcu_gp_mutex " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 10/19] srcu: Move grace-period fields " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 11/19] srcu: Move heuristics " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 12/19] srcu: Move ->sda_is_static " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 13/19] srcu: Move srcu_barrier() fields " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 14/19] srcu: Move work-scheduling " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 15/19] srcu: Fix long lines in srcu_get_delay() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 16/19] srcu: Fix long lines in cleanup_srcu_struct() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 17/19] srcu: Fix long lines in srcu_gp_end() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 18/19] srcu: Fix long lines in srcu_funnel_gp_start() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 19/19] srcu: Remove extraneous parentheses from srcu_read_lock() etc Paul E. McKenney
