* [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality
@ 2023-03-24  0:19 Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU() Paul E. McKenney
                   ` (18 more replies)
  0 siblings, 19 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch

Hello!

This RFC series shrinks the srcu_struct structure to the bare minimum
required to support SRCU readers, relegating the remaining fields to a new
srcu_usage structure.  Statically allocated srcu_struct structures created
by DEFINE_SRCU() and DEFINE_STATIC_SRCU() have statically allocated
srcu_usage structures, but those required for dynamically allocated
srcu_struct structures that are initialized using init_srcu_struct()
are dynamically allocated.

The result is a reduction in the size of an srcu_struct structure from
a couple hundred bytes to just 24 bytes on x86_64 systems.  This can be
helpful when SRCU readers are used in a fastpath for which the srcu_struct
structure must be embedded in another structure, and especially where
that fastpath also needs to access fields both before and after the
srcu_struct structure.
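
As a purely illustrative sketch (not code from this series), consider a
hypothetical fastpath structure that embeds an srcu_struct between two
hot fields:

	/* Hypothetical example only. */
	struct fastpath_obj {
		unsigned long hot_state;	/* Read on the fastpath. */
		struct srcu_struct srcu;	/* Roughly 24 bytes on x86_64. */
		void *hot_data;			/* Also read on the fastpath. */
	};

With the old couple-hundred-byte srcu_struct, ->hot_state and ->hot_data
are guaranteed to land on different cache lines, but with the 24-byte
version they can (alignment permitting) share a single cache line.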

This series takes baby steps, in part because breaking SRCU means that
you get absolutely no console output.  Yes, I did learn this the hard way.
Why do you ask?  ;-)

Here are those baby steps:

1.	Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU().

2.	Use static init for statically allocated in-module srcu_struct.

3.	Begin offloading srcu_struct fields to srcu_update.  Note that
	this affects notifiers, which open-code static allocation of
	an srcu_struct structure.  (And no, I still do not see a way to
	abstract this, sorry!)

4.	Move ->level from srcu_struct to srcu_usage.

5.	Move ->srcu_size_state from srcu_struct to srcu_usage.

6.	Move ->srcu_cb_mutex from srcu_struct to srcu_usage.

7.	Move ->lock initialization after srcu_usage allocation.

8.	Move ->lock from srcu_struct to srcu_usage.

9.	Move ->srcu_gp_mutex from srcu_struct to srcu_usage.

10.	Move grace-period fields from srcu_struct to srcu_usage.

11.	Move heuristics fields from srcu_struct to srcu_usage.

12.	Move ->sda_is_static from srcu_struct to srcu_usage.

13.	Move srcu_barrier() fields from srcu_struct to srcu_usage.

14.	Move work-scheduling fields from srcu_struct to srcu_usage.

15.	Fix long lines in srcu_get_delay().

16.	Fix long lines in cleanup_srcu_struct().

17.	Fix long lines in srcu_gp_end().

18.	Fix long lines in srcu_funnel_gp_start().

19.	Remove extraneous parentheses from srcu_read_lock() etc.

						Thanx, Paul

------------------------------------------------------------------------

 b/include/linux/notifier.h |    5 
 b/include/linux/srcu.h     |    8 
 b/include/linux/srcutiny.h |    6 
 b/include/linux/srcutree.h |   28 +-
 b/kernel/rcu/rcu.h         |    6 
 b/kernel/rcu/srcutree.c    |   19 +
 include/linux/srcutree.h   |  123 ++++++-----
 kernel/rcu/srcutree.c      |  488 +++++++++++++++++++++++----------------------
 8 files changed, 368 insertions(+), 315 deletions(-)


* [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct Paul E. McKenney
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This is a whitespace-only commit with no change in functionality.
Its purpose is to prepare for later commits that: (1) Cause statically
allocated srcu_struct structures to rely on compile-time initialization
and (2) Move fields from the srcu_struct structure to a new srcu_usage
structure.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 include/linux/srcutree.h | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index a6910805f9c5..ac8af12f93b3 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -121,13 +121,13 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
-#define __SRCU_STRUCT_INIT(name, pcpu_name)				\
-{									\
-	.sda = &pcpu_name,						\
-	.lock = __SPIN_LOCK_UNLOCKED(name.lock),			\
-	.srcu_gp_seq_needed = -1UL,					\
-	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),		\
-	__SRCU_DEP_MAP_INIT(name)					\
+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
+{												\
+	.sda = &pcpu_name,									\
+	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
+	.srcu_gp_seq_needed = -1UL,								\
+	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
+	__SRCU_DEP_MAP_INIT(name)								\
 }
 
 /*
@@ -150,15 +150,15 @@ struct srcu_struct {
  * See include/linux/percpu-defs.h for the rules on per-CPU variables.
  */
 #ifdef MODULE
-# define __DEFINE_SRCU(name, is_static)					\
-	is_static struct srcu_struct name;				\
-	extern struct srcu_struct * const __srcu_struct_##name;		\
-	struct srcu_struct * const __srcu_struct_##name			\
+# define __DEFINE_SRCU(name, is_static)								\
+	is_static struct srcu_struct name;							\
+	extern struct srcu_struct * const __srcu_struct_##name;					\
+	struct srcu_struct * const __srcu_struct_##name						\
 		__section("___srcu_struct_ptrs") = &name
 #else
-# define __DEFINE_SRCU(name, is_static)					\
-	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);	\
-	is_static struct srcu_struct name =				\
+# define __DEFINE_SRCU(name, is_static)								\
+	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
+	is_static struct srcu_struct name =							\
 		__SRCU_STRUCT_INIT(name, name##_srcu_data)
 #endif
 #define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
-- 
2.40.0.rc2



* [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-30  4:11   ` Zhang, Qiang1
  2023-03-24  0:19 ` [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update Paul E. McKenney
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

Further shrinking the srcu_struct structure is eased by requiring
that in-module srcu_struct structures rely more heavily on static
initialization.  In particular, this preserves the property that
a module-load-time srcu_struct initialization can fail only due
to memory-allocation failure of the per-CPU srcu_data structures.
It might also slightly improve robustness by keeping the number of memory
allocations that must succeed down to the single percpu_alloc() call.

This is in preparation for splitting an srcu_usage structure out
of the srcu_struct structure.
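
For example (a sketch based on the macros in this patch, using the
made-up name "my_srcu"), an in-module DEFINE_STATIC_SRCU() now expands
to a compile-time-initialized srcu_struct, leaving only the per-CPU
srcu_data allocation to srcu_module_coming():

	/* Sketch of the MODULE-case expansion after this patch. */
	static struct srcu_struct my_srcu = __SRCU_STRUCT_INIT_MODULE(my_srcu);
	extern struct srcu_struct * const __srcu_struct_my_srcu;
	struct srcu_struct * const __srcu_struct_my_srcu
		__section("___srcu_struct_ptrs") = &my_srcu;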

[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 include/linux/srcutree.h | 19 ++++++++++++++-----
 kernel/rcu/srcutree.c    | 19 +++++++++++++------
 2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index ac8af12f93b3..428480152375 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -121,15 +121,24 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
-{												\
-	.sda = &pcpu_name,									\
+#define __SRCU_STRUCT_INIT_COMMON(name)								\
 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
-	__SRCU_DEP_MAP_INIT(name)								\
+	__SRCU_DEP_MAP_INIT(name)
+
+#define __SRCU_STRUCT_INIT_MODULE(name)								\
+{												\
+	__SRCU_STRUCT_INIT_COMMON(name)								\
 }
 
+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
+{												\
+	.sda = &pcpu_name,									\
+	__SRCU_STRUCT_INIT_COMMON(name)								\
+}
+
+
 /*
  * Define and initialize a srcu struct at build time.
  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
@@ -151,7 +160,7 @@ struct srcu_struct {
  */
 #ifdef MODULE
 # define __DEFINE_SRCU(name, is_static)								\
-	is_static struct srcu_struct name;							\
+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
 	extern struct srcu_struct * const __srcu_struct_##name;					\
 	struct srcu_struct * const __srcu_struct_##name						\
 		__section("___srcu_struct_ptrs") = &name
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index cd46fe063e50..7a6d9452a5d0 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1895,13 +1895,14 @@ void __init srcu_init(void)
 static int srcu_module_coming(struct module *mod)
 {
 	int i;
+	struct srcu_struct *ssp;
 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
-	int ret;
 
 	for (i = 0; i < mod->num_srcu_structs; i++) {
-		ret = init_srcu_struct(*(sspp++));
-		if (WARN_ON_ONCE(ret))
-			return ret;
+		ssp = *(sspp++);
+		ssp->sda = alloc_percpu(struct srcu_data);
+		if (WARN_ON_ONCE(!ssp->sda))
+			return -ENOMEM;
 	}
 	return 0;
 }
@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module *mod)
 static void srcu_module_going(struct module *mod)
 {
 	int i;
+	struct srcu_struct *ssp;
 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
 
-	for (i = 0; i < mod->num_srcu_structs; i++)
-		cleanup_srcu_struct(*(sspp++));
+	for (i = 0; i < mod->num_srcu_structs; i++) {
+		ssp = *(sspp++);
+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
+				cleanup_srcu_struct(ssp);
+		free_percpu(ssp->sda);
+	}
 }
 
 /* Handle one module, either coming or going. */
-- 
2.40.0.rc2



* [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU() Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24 19:10   ` Wysocki, Rafael J
  2023-03-24  0:19 ` [PATCH RFC rcu 04/19] srcu: Move ->level from srcu_struct to srcu_usage Paul E. McKenney
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney,
	Rafael J. Wysocki, Michał Mirosław, Dmitry Osipenko

The current srcu_struct structure is on the order of 200 bytes in size
(depending on architecture and .config), which is much better than the
old-style 26K bytes, but still all too inconvenient when one is trying
to achieve good cache locality on a fastpath involving SRCU readers.

However, only a few fields in srcu_struct are used by SRCU readers.
The remaining fields could be offloaded to a new srcu_update
structure, thus shrinking the srcu_struct structure down to a few
tens of bytes.  This commit begins this noble quest, a quest that is
complicated by open-coded initialization of the srcu_struct within the
srcu_notifier_head structure.  This complication is addressed by updating
the srcu_notifier_head structure's open coding, given that there does
not appear to be a straightforward way of abstracting that initialization.

This commit moves only the ->node pointer to srcu_update.  Later commits
will move additional fields.
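
For illustration only (a sketch, not part of this patch), update-side
code now reaches the combining tree through the new ->srcu_sup pointer
rather than directly from the srcu_struct:

	/* Hypothetical helper showing the new indirection. */
	static struct srcu_node *srcu_combining_tree(struct srcu_struct *ssp)
	{
		return ssp->srcu_sup->node;	/* Previously ssp->node. */
	}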

[ paulmck: Fold in qiang1.zhang@intel.com's memory-leak fix. ]

Link: https://lore.kernel.org/all/20230320055751.4120251-1-qiang1.zhang@intel.com/
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
---
 include/linux/notifier.h |  5 ++++-
 include/linux/srcutiny.h |  6 +++---
 include/linux/srcutree.h | 27 ++++++++++++++++++---------
 kernel/rcu/rcu.h         |  6 ++++--
 kernel/rcu/srcutree.c    | 28 +++++++++++++++++++---------
 5 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index aef88c2d1173..2aba75145144 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -73,6 +73,9 @@ struct raw_notifier_head {
 
 struct srcu_notifier_head {
 	struct mutex mutex;
+#ifdef CONFIG_TREE_SRCU
+	struct srcu_usage srcuu;
+#endif
 	struct srcu_struct srcu;
 	struct notifier_block __rcu *head;
 };
@@ -107,7 +110,7 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
 	{							\
 		.mutex = __MUTEX_INITIALIZER(name.mutex),	\
 		.head = NULL,					\
-		.srcu = __SRCU_STRUCT_INIT(name.srcu, pcpu),	\
+		.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
 	}
 
 #define ATOMIC_NOTIFIER_HEAD(name)				\
diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
index 5aa5e0faf6a1..ebd72491af99 100644
--- a/include/linux/srcutiny.h
+++ b/include/linux/srcutiny.h
@@ -31,7 +31,7 @@ struct srcu_struct {
 
 void srcu_drive_gp(struct work_struct *wp);
 
-#define __SRCU_STRUCT_INIT(name, __ignored)				\
+#define __SRCU_STRUCT_INIT(name, __ignored, ___ignored)			\
 {									\
 	.srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq),	\
 	.srcu_cb_tail = &name.srcu_cb_head,				\
@@ -44,9 +44,9 @@ void srcu_drive_gp(struct work_struct *wp);
  * Tree SRCU, which needs some per-CPU data.
  */
 #define DEFINE_SRCU(name) \
-	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
+	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
 #define DEFINE_STATIC_SRCU(name) \
-	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
+	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
 
 void synchronize_srcu(struct srcu_struct *ssp);
 
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 428480152375..2689e64024bb 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -57,11 +57,17 @@ struct srcu_node {
 	int grphi;				/* Biggest CPU for node. */
 };
 
+/*
+ * Per-SRCU-domain structure, update-side data linked from srcu_struct.
+ */
+struct srcu_usage {
+	struct srcu_node *node;			/* Combining tree. */
+};
+
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	struct srcu_node *node;			/* Combining tree. */
 	struct srcu_node *level[RCU_NUM_LVLS + 1];
 						/* First node at each level. */
 	int srcu_size_state;			/* Small-to-big transition state. */
@@ -90,6 +96,7 @@ struct srcu_struct {
 	unsigned long reschedule_count;
 	struct delayed_work work;
 	struct lockdep_map dep_map;
+	struct srcu_usage *srcu_sup;		/* Update-side data. */
 };
 
 // Values for size state variable (->srcu_size_state).  Once the state
@@ -121,24 +128,24 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
-#define __SRCU_STRUCT_INIT_COMMON(name)								\
+#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
+	.srcu_sup = &usage_name,								\
 	__SRCU_DEP_MAP_INIT(name)
 
-#define __SRCU_STRUCT_INIT_MODULE(name)								\
+#define __SRCU_STRUCT_INIT_MODULE(name, usage_name)						\
 {												\
-	__SRCU_STRUCT_INIT_COMMON(name)								\
+	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
 }
 
-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
+#define __SRCU_STRUCT_INIT(name, usage_name, pcpu_name)						\
 {												\
 	.sda = &pcpu_name,									\
-	__SRCU_STRUCT_INIT_COMMON(name)								\
+	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
 }
 
-
 /*
  * Define and initialize a srcu struct at build time.
  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
@@ -160,15 +167,17 @@ struct srcu_struct {
  */
 #ifdef MODULE
 # define __DEFINE_SRCU(name, is_static)								\
-	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
+	static struct srcu_usage name##_srcu_usage;						\
+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
 	extern struct srcu_struct * const __srcu_struct_##name;					\
 	struct srcu_struct * const __srcu_struct_##name						\
 		__section("___srcu_struct_ptrs") = &name
 #else
 # define __DEFINE_SRCU(name, is_static)								\
 	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
+	static struct srcu_usage name##_srcu_usage;						\
 	is_static struct srcu_struct name =							\
-		__SRCU_STRUCT_INIT(name, name##_srcu_data)
+		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
 #endif
 #define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
 #define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index a3adcf9a9919..4a1b9622598b 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -378,11 +378,13 @@ extern void rcu_init_geometry(void);
  * specified state structure (for SRCU) or the only rcu_state structure
  * (for RCU).
  */
-#define srcu_for_each_node_breadth_first(sp, rnp) \
+#define _rcu_for_each_node_breadth_first(sp, rnp) \
 	for ((rnp) = &(sp)->node[0]; \
 	     (rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++)
 #define rcu_for_each_node_breadth_first(rnp) \
-	srcu_for_each_node_breadth_first(&rcu_state, rnp)
+	_rcu_for_each_node_breadth_first(&rcu_state, rnp)
+#define srcu_for_each_node_breadth_first(ssp, rnp) \
+	_rcu_for_each_node_breadth_first(ssp->srcu_sup, rnp)
 
 /*
  * Scan the leaves of the rcu_node hierarchy for the rcu_state structure.
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 7a6d9452a5d0..ad1d5ca42a99 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -173,12 +173,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 
 	/* Initialize geometry if it has not already been initialized. */
 	rcu_init_geometry();
-	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), gfp_flags);
-	if (!ssp->node)
+	ssp->srcu_sup->node = kcalloc(rcu_num_nodes, sizeof(*ssp->srcu_sup->node), gfp_flags);
+	if (!ssp->srcu_sup->node)
 		return false;
 
 	/* Work out the overall tree geometry. */
-	ssp->level[0] = &ssp->node[0];
+	ssp->level[0] = &ssp->srcu_sup->node[0];
 	for (i = 1; i < rcu_num_lvls; i++)
 		ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
 	rcu_init_levelspread(levelspread, num_rcu_lvl);
@@ -195,7 +195,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 		snp->srcu_gp_seq_needed_exp = SRCU_SNP_INIT_SEQ;
 		snp->grplo = -1;
 		snp->grphi = -1;
-		if (snp == &ssp->node[0]) {
+		if (snp == &ssp->srcu_sup->node[0]) {
 			/* Root node, special case. */
 			snp->srcu_parent = NULL;
 			continue;
@@ -236,8 +236,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
  */
 static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 {
+	if (!is_static)
+		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
+	if (!ssp->srcu_sup)
+		return -ENOMEM;
 	ssp->srcu_size_state = SRCU_SIZE_SMALL;
-	ssp->node = NULL;
+	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
@@ -249,8 +253,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	ssp->sda_is_static = is_static;
 	if (!is_static)
 		ssp->sda = alloc_percpu(struct srcu_data);
-	if (!ssp->sda)
+	if (!ssp->sda) {
+		if (!is_static)
+			kfree(ssp->srcu_sup);
 		return -ENOMEM;
+	}
 	init_srcu_struct_data(ssp);
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
@@ -259,6 +266,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 			if (!ssp->sda_is_static) {
 				free_percpu(ssp->sda);
 				ssp->sda = NULL;
+				kfree(ssp->srcu_sup);
 				return -ENOMEM;
 			}
 		} else {
@@ -656,13 +664,15 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
 		return; /* Caller forgot to stop doing call_srcu()? */
 	}
+	kfree(ssp->srcu_sup->node);
+	ssp->srcu_sup->node = NULL;
+	ssp->srcu_size_state = SRCU_SIZE_SMALL;
 	if (!ssp->sda_is_static) {
 		free_percpu(ssp->sda);
 		ssp->sda = NULL;
+		kfree(ssp->srcu_sup);
+		ssp->srcu_sup = NULL;
 	}
-	kfree(ssp->node);
-	ssp->node = NULL;
-	ssp->srcu_size_state = SRCU_SIZE_SMALL;
 }
 EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
 
-- 
2.40.0.rc2



* [PATCH RFC rcu 04/19] srcu: Move ->level from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (2 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 05/19] srcu: Move ->srcu_size_state " Paul E. McKenney
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->level[] array from the srcu_struct structure to
the srcu_usage structure to reduce the size of the former in order to
improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  4 ++--
 kernel/rcu/srcutree.c    | 14 +++++++-------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 2689e64024bb..362c8f39c53d 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -62,14 +62,14 @@ struct srcu_node {
  */
 struct srcu_usage {
 	struct srcu_node *node;			/* Combining tree. */
+	struct srcu_node *level[RCU_NUM_LVLS + 1];
+						/* First node at each level. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	struct srcu_node *level[RCU_NUM_LVLS + 1];
-						/* First node at each level. */
 	int srcu_size_state;			/* Small-to-big transition state. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index ad1d5ca42a99..90d753e10e33 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -178,9 +178,9 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 		return false;
 
 	/* Work out the overall tree geometry. */
-	ssp->level[0] = &ssp->srcu_sup->node[0];
+	ssp->srcu_sup->level[0] = &ssp->srcu_sup->node[0];
 	for (i = 1; i < rcu_num_lvls; i++)
-		ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
+		ssp->srcu_sup->level[i] = ssp->srcu_sup->level[i - 1] + num_rcu_lvl[i - 1];
 	rcu_init_levelspread(levelspread, num_rcu_lvl);
 
 	/* Each pass through this loop initializes one srcu_node structure. */
@@ -202,10 +202,10 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 		}
 
 		/* Non-root node. */
-		if (snp == ssp->level[level + 1])
+		if (snp == ssp->srcu_sup->level[level + 1])
 			level++;
-		snp->srcu_parent = ssp->level[level - 1] +
-				   (snp - ssp->level[level]) /
+		snp->srcu_parent = ssp->srcu_sup->level[level - 1] +
+				   (snp - ssp->srcu_sup->level[level]) /
 				   levelspread[level - 1];
 	}
 
@@ -214,7 +214,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 	 * leaves of the srcu_node tree.
 	 */
 	level = rcu_num_lvls - 1;
-	snp_first = ssp->level[level];
+	snp_first = ssp->srcu_sup->level[level];
 	for_each_possible_cpu(cpu) {
 		sdp = per_cpu_ptr(ssp->sda, cpu);
 		sdp->mynode = &snp_first[cpu / levelspread[level]];
@@ -889,7 +889,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		srcu_for_each_node_breadth_first(ssp, snp) {
 			spin_lock_irq_rcu_node(snp);
 			cbs = false;
-			last_lvl = snp >= ssp->level[rcu_num_lvls - 1];
+			last_lvl = snp >= ssp->srcu_sup->level[rcu_num_lvls - 1];
 			if (last_lvl)
 				cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq;
 			snp->srcu_have_cbs[idx] = gpseq;
-- 
2.40.0.rc2



* [PATCH RFC rcu 05/19] srcu: Move ->srcu_size_state from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (3 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 04/19] srcu: Move ->level from srcu_struct to srcu_usage Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 06/19] srcu: Move ->srcu_cb_mutex " Paul E. McKenney
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_size_state field from the srcu_struct
structure to the srcu_usage structure to reduce the size of the former
in order to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  2 +-
 kernel/rcu/srcutree.c    | 37 +++++++++++++++++++------------------
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 362c8f39c53d..72fb01fb2eb5 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -64,13 +64,13 @@ struct srcu_usage {
 	struct srcu_node *node;			/* Combining tree. */
 	struct srcu_node *level[RCU_NUM_LVLS + 1];
 						/* First node at each level. */
+	int srcu_size_state;			/* Small-to-big transition state. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	int srcu_size_state;			/* Small-to-big transition state. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 90d753e10e33..2717217de136 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -225,7 +225,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
 		}
 		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
 	}
-	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
+	smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
 	return true;
 }
 
@@ -240,7 +240,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
 	if (!ssp->srcu_sup)
 		return -ENOMEM;
-	ssp->srcu_size_state = SRCU_SIZE_SMALL;
+	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_gp_mutex);
@@ -261,7 +261,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	init_srcu_struct_data(ssp);
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
-	if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
+	if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
 			if (!ssp->sda_is_static) {
 				free_percpu(ssp->sda);
@@ -270,7 +270,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 				return -ENOMEM;
 			}
 		} else {
-			WRITE_ONCE(ssp->srcu_size_state, SRCU_SIZE_BIG);
+			WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
 		}
 	}
 	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
@@ -315,7 +315,7 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
 static void __srcu_transition_to_big(struct srcu_struct *ssp)
 {
 	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
-	smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_ALLOC);
+	smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_ALLOC);
 }
 
 /*
@@ -326,10 +326,10 @@ static void srcu_transition_to_big(struct srcu_struct *ssp)
 	unsigned long flags;
 
 	/* Double-checked locking on ->srcu_size-state. */
-	if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL)
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL)
 		return;
 	spin_lock_irqsave_rcu_node(ssp, flags);
-	if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL) {
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL) {
 		spin_unlock_irqrestore_rcu_node(ssp, flags);
 		return;
 	}
@@ -345,7 +345,7 @@ static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp)
 {
 	unsigned long j;
 
-	if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_size_state)
+	if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_sup->srcu_size_state)
 		return;
 	j = jiffies;
 	if (ssp->srcu_size_jiffies != j) {
@@ -666,7 +666,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 	}
 	kfree(ssp->srcu_sup->node);
 	ssp->srcu_sup->node = NULL;
-	ssp->srcu_size_state = SRCU_SIZE_SMALL;
+	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	if (!ssp->sda_is_static) {
 		free_percpu(ssp->sda);
 		ssp->sda = NULL;
@@ -770,7 +770,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 	struct srcu_data *sdp;
 	int state;
 
-	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
 	else
 		sdp = this_cpu_ptr(ssp->sda);
@@ -880,7 +880,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	/* A new grace period can start at this point.  But only one. */
 
 	/* Initiate callback invocation as needed. */
-	ss_state = smp_load_acquire(&ssp->srcu_size_state);
+	ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
 	if (ss_state < SRCU_SIZE_WAIT_BARRIER) {
 		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, get_boot_cpu_id()),
 					cbdelay);
@@ -940,7 +940,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		if (ss_state == SRCU_SIZE_ALLOC)
 			init_srcu_struct_nodes(ssp, GFP_KERNEL);
 		else
-			smp_store_release(&ssp->srcu_size_state, ss_state + 1);
+			smp_store_release(&ssp->srcu_sup->srcu_size_state, ss_state + 1);
 	}
 }
 
@@ -1002,7 +1002,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	unsigned long snp_seq;
 
 	/* Ensure that snp node tree is fully initialized before traversing it */
-	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		snp_leaf = NULL;
 	else
 		snp_leaf = sdp->mynode;
@@ -1229,7 +1229,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	 * sequence number cannot wrap around in the meantime.
 	 */
 	idx = __srcu_read_lock_nmisafe(ssp);
-	ss_state = smp_load_acquire(&ssp->srcu_size_state);
+	ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
 	if (ss_state < SRCU_SIZE_WAIT_CALL)
 		sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
 	else
@@ -1568,7 +1568,7 @@ void srcu_barrier(struct srcu_struct *ssp)
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);
 
 	idx = __srcu_read_lock_nmisafe(ssp);
-	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda,	get_boot_cpu_id()));
 	else
 		for_each_possible_cpu(cpu)
@@ -1806,7 +1806,7 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 	int cpu;
 	int idx;
 	unsigned long s0 = 0, s1 = 0;
-	int ss_state = READ_ONCE(ssp->srcu_size_state);
+	int ss_state = READ_ONCE(ssp->srcu_sup->srcu_size_state);
 	int ss_state_idx = ss_state;
 
 	idx = ssp->srcu_idx & 0x1;
@@ -1893,8 +1893,9 @@ void __init srcu_init(void)
 		ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
 				      work.work.entry);
 		list_del_init(&ssp->work.work.entry);
-		if (SRCU_SIZING_IS(SRCU_SIZING_INIT) && ssp->srcu_size_state == SRCU_SIZE_SMALL)
-			ssp->srcu_size_state = SRCU_SIZE_ALLOC;
+		if (SRCU_SIZING_IS(SRCU_SIZING_INIT) &&
+		    ssp->srcu_sup->srcu_size_state == SRCU_SIZE_SMALL)
+			ssp->srcu_sup->srcu_size_state = SRCU_SIZE_ALLOC;
 		queue_work(rcu_gp_wq, &ssp->work.work);
 	}
 }
-- 
2.40.0.rc2



* [PATCH RFC rcu 06/19] srcu: Move ->srcu_cb_mutex from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (4 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 05/19] srcu: Move ->srcu_size_state " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 07/19] srcu: Move ->lock initialization after srcu_usage allocation Paul E. McKenney
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_cb_mutex field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 2 +-
 kernel/rcu/srcutree.c    | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 72fb01fb2eb5..69d49de87b67 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -65,13 +65,13 @@ struct srcu_usage {
 	struct srcu_node *level[RCU_NUM_LVLS + 1];
 						/* First node at each level. */
 	int srcu_size_state;			/* Small-to-big transition state. */
+	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 	unsigned int srcu_idx;			/* Current rdr array element. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 2717217de136..6b6d92fe41cf 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -242,7 +242,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		return -ENOMEM;
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
-	mutex_init(&ssp->srcu_cb_mutex);
+	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
 	ssp->srcu_gp_seq = 0;
@@ -861,7 +861,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	int ss_state;
 
 	/* Prevent more than one additional grace period. */
-	mutex_lock(&ssp->srcu_cb_mutex);
+	mutex_lock(&ssp->srcu_sup->srcu_cb_mutex);
 
 	/* End the current grace period. */
 	spin_lock_irq_rcu_node(ssp);
@@ -921,7 +921,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		}
 
 	/* Callback initiation done, allow grace periods after next. */
-	mutex_unlock(&ssp->srcu_cb_mutex);
+	mutex_unlock(&ssp->srcu_sup->srcu_cb_mutex);
 
 	/* Start a new grace period if needed. */
 	spin_lock_irq_rcu_node(ssp);
-- 
2.40.0.rc2



* [PATCH RFC rcu 07/19] srcu: Move ->lock initialization after srcu_usage allocation
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (5 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 06/19] srcu: Move ->srcu_cb_mutex " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 08/19] srcu: Move ->lock from srcu_struct to srcu_usage Paul E. McKenney
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

Currently, both __init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=y kernels
and init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=n kernels initialize
the srcu_struct structure's ->lock before the srcu_usage structure has
been allocated.  This of course prevents the ->lock from being moved
to the srcu_usage structure, so this commit moves the initialization
into init_srcu_struct_fields(), after the srcu_usage structure has
been allocated.
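
A sketch of the resulting ordering (the ->lock field itself moves to
srcu_usage only in the next patch):

	/* In init_srcu_struct_fields(), assuming ->lock lives in srcu_usage: */
	if (!is_static)
		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
	if (!ssp->srcu_sup)
		return -ENOMEM;
	if (!is_static)
		spin_lock_init(&ACCESS_PRIVATE(ssp->srcu_sup, lock));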

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 6b6d92fe41cf..f6f37e34a1e0 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -240,6 +240,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
 	if (!ssp->srcu_sup)
 		return -ENOMEM;
+	if (!is_static)
+		spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
@@ -285,7 +287,6 @@ int __init_srcu_struct(struct srcu_struct *ssp, const char *name,
 	/* Don't re-initialize a lock while it is held. */
 	debug_check_no_locks_freed((void *)ssp, sizeof(*ssp));
 	lockdep_init_map(&ssp->dep_map, name, key, 0);
-	spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
 	return init_srcu_struct_fields(ssp, false);
 }
 EXPORT_SYMBOL_GPL(__init_srcu_struct);
@@ -302,7 +303,6 @@ EXPORT_SYMBOL_GPL(__init_srcu_struct);
  */
 int init_srcu_struct(struct srcu_struct *ssp)
 {
-	spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
 	return init_srcu_struct_fields(ssp, false);
 }
 EXPORT_SYMBOL_GPL(init_srcu_struct);
-- 
2.40.0.rc2



* [PATCH RFC rcu 08/19] srcu: Move ->lock from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (6 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 07/19] srcu: Move ->lock initialization after srcu_usage allocation Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 09/19] srcu: Move ->srcu_gp_mutex " Paul E. McKenney
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->lock field from the srcu_struct structure to
the srcu_usage structure to reduce the size of the former in order to
improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 11 +++++---
 kernel/rcu/srcutree.c    | 56 ++++++++++++++++++++--------------------
 2 files changed, 35 insertions(+), 32 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 69d49de87b67..6fe5743c1179 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -66,13 +66,13 @@ struct srcu_usage {
 						/* First node at each level. */
 	int srcu_size_state;			/* Small-to-big transition state. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
+	spinlock_t __private lock;		/* Protect counters and size state. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	spinlock_t __private lock;		/* Protect counters and size state. */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
@@ -129,7 +129,6 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN2	2
 
 #define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
-	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
 	.srcu_sup = &usage_name,								\
@@ -167,7 +166,9 @@ struct srcu_struct {
  */
 #ifdef MODULE
 # define __DEFINE_SRCU(name, is_static)								\
-	static struct srcu_usage name##_srcu_usage;						\
+	static struct srcu_usage name##_srcu_usage = {						\
+		.lock = __SPIN_LOCK_UNLOCKED(name.lock),					\
+	};											\
 	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
 	extern struct srcu_struct * const __srcu_struct_##name;					\
 	struct srcu_struct * const __srcu_struct_##name						\
@@ -175,7 +176,9 @@ struct srcu_struct {
 #else
 # define __DEFINE_SRCU(name, is_static)								\
 	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
-	static struct srcu_usage name##_srcu_usage;						\
+	static struct srcu_usage name##_srcu_usage = {						\
+		.lock = __SPIN_LOCK_UNLOCKED(name.lock),					\
+	};											\
 	is_static struct srcu_struct name =							\
 		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
 #endif
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index f6f37e34a1e0..579c43ff0f4b 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -103,7 +103,7 @@ do {										\
 
 #define spin_trylock_irqsave_rcu_node(p, flags)					\
 ({										\
-	bool ___locked = spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags);	\
+	bool ___locked = spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \
 										\
 	if (___locked)								\
 		smp_mb__after_unlock_lock();					\
@@ -241,7 +241,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	if (!ssp->srcu_sup)
 		return -ENOMEM;
 	if (!is_static)
-		spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
+		spin_lock_init(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
@@ -314,7 +314,7 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
  */
 static void __srcu_transition_to_big(struct srcu_struct *ssp)
 {
-	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
+	lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
 	smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_ALLOC);
 }
 
@@ -328,13 +328,13 @@ static void srcu_transition_to_big(struct srcu_struct *ssp)
 	/* Double-checked locking on ->srcu_size-state. */
 	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL)
 		return;
-	spin_lock_irqsave_rcu_node(ssp, flags);
+	spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags);
 	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL) {
-		spin_unlock_irqrestore_rcu_node(ssp, flags);
+		spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 		return;
 	}
 	__srcu_transition_to_big(ssp);
-	spin_unlock_irqrestore_rcu_node(ssp, flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
 /*
@@ -369,9 +369,9 @@ static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned lon
 
 	if (spin_trylock_irqsave_rcu_node(sdp, *flags))
 		return;
-	spin_lock_irqsave_rcu_node(ssp, *flags);
+	spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags);
 	spin_lock_irqsave_check_contention(ssp);
-	spin_unlock_irqrestore_rcu_node(ssp, *flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, *flags);
 	spin_lock_irqsave_rcu_node(sdp, *flags);
 }
 
@@ -383,9 +383,9 @@ static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned lon
  */
 static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags)
 {
-	if (spin_trylock_irqsave_rcu_node(ssp, *flags))
+	if (spin_trylock_irqsave_rcu_node(ssp->srcu_sup, *flags))
 		return;
-	spin_lock_irqsave_rcu_node(ssp, *flags);
+	spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags);
 	spin_lock_irqsave_check_contention(ssp);
 }
 
@@ -404,13 +404,13 @@ static void check_init_srcu_struct(struct srcu_struct *ssp)
 	/* The smp_load_acquire() pairs with the smp_store_release(). */
 	if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq_needed))) /*^^^*/
 		return; /* Already initialized. */
-	spin_lock_irqsave_rcu_node(ssp, flags);
+	spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags);
 	if (!rcu_seq_state(ssp->srcu_gp_seq_needed)) {
-		spin_unlock_irqrestore_rcu_node(ssp, flags);
+		spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 		return;
 	}
 	init_srcu_struct_fields(ssp, true);
-	spin_unlock_irqrestore_rcu_node(ssp, flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
 /*
@@ -774,7 +774,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 		sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
 	else
 		sdp = this_cpu_ptr(ssp->sda);
-	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
+	lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
 	WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
 	spin_lock_rcu_node(sdp);  /* Interrupts already disabled. */
 	rcu_segcblist_advance(&sdp->srcu_cblist,
@@ -864,7 +864,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	mutex_lock(&ssp->srcu_sup->srcu_cb_mutex);
 
 	/* End the current grace period. */
-	spin_lock_irq_rcu_node(ssp);
+	spin_lock_irq_rcu_node(ssp->srcu_sup);
 	idx = rcu_seq_state(ssp->srcu_gp_seq);
 	WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
 	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
@@ -875,7 +875,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
 	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, gpseq))
 		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, gpseq);
-	spin_unlock_irq_rcu_node(ssp);
+	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	mutex_unlock(&ssp->srcu_gp_mutex);
 	/* A new grace period can start at this point.  But only one. */
 
@@ -924,15 +924,15 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	mutex_unlock(&ssp->srcu_sup->srcu_cb_mutex);
 
 	/* Start a new grace period if needed. */
-	spin_lock_irq_rcu_node(ssp);
+	spin_lock_irq_rcu_node(ssp->srcu_sup);
 	gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
 	if (!rcu_seq_state(gpseq) &&
 	    ULONG_CMP_LT(gpseq, ssp->srcu_gp_seq_needed)) {
 		srcu_gp_start(ssp);
-		spin_unlock_irq_rcu_node(ssp);
+		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 		srcu_reschedule(ssp, 0);
 	} else {
-		spin_unlock_irq_rcu_node(ssp);
+		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	}
 
 	/* Transition to big if needed. */
@@ -975,7 +975,7 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 	spin_lock_irqsave_ssp_contention(ssp, &flags);
 	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
 		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
-	spin_unlock_irqrestore_rcu_node(ssp, flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
 /*
@@ -1064,7 +1064,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 		else if (list_empty(&ssp->work.work.entry))
 			list_add(&ssp->work.work.entry, &srcu_boot_list);
 	}
-	spin_unlock_irqrestore_rcu_node(ssp, flags);
+	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
 /*
@@ -1621,17 +1621,17 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 	 */
 	idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq)); /* ^^^ */
 	if (idx == SRCU_STATE_IDLE) {
-		spin_lock_irq_rcu_node(ssp);
+		spin_lock_irq_rcu_node(ssp->srcu_sup);
 		if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
 			WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq));
-			spin_unlock_irq_rcu_node(ssp);
+			spin_unlock_irq_rcu_node(ssp->srcu_sup);
 			mutex_unlock(&ssp->srcu_gp_mutex);
 			return;
 		}
 		idx = rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq));
 		if (idx == SRCU_STATE_IDLE)
 			srcu_gp_start(ssp);
-		spin_unlock_irq_rcu_node(ssp);
+		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 		if (idx != SRCU_STATE_IDLE) {
 			mutex_unlock(&ssp->srcu_gp_mutex);
 			return; /* Someone else started the grace period. */
@@ -1645,10 +1645,10 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 			return; /* readers present, retry later. */
 		}
 		srcu_flip(ssp);
-		spin_lock_irq_rcu_node(ssp);
+		spin_lock_irq_rcu_node(ssp->srcu_sup);
 		rcu_seq_set_state(&ssp->srcu_gp_seq, SRCU_STATE_SCAN2);
 		ssp->srcu_n_exp_nodelay = 0;
-		spin_unlock_irq_rcu_node(ssp);
+		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	}
 
 	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN2) {
@@ -1732,7 +1732,7 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
 {
 	bool pushgp = true;
 
-	spin_lock_irq_rcu_node(ssp);
+	spin_lock_irq_rcu_node(ssp->srcu_sup);
 	if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
 		if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq))) {
 			/* All requests fulfilled, time to go idle. */
@@ -1742,7 +1742,7 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
 		/* Outstanding request and no GP.  Start one. */
 		srcu_gp_start(ssp);
 	}
-	spin_unlock_irq_rcu_node(ssp);
+	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 
 	if (pushgp)
 		queue_delayed_work(rcu_gp_wq, &ssp->work, delay);
-- 
2.40.0.rc2



* [PATCH RFC rcu 09/19] srcu: Move ->srcu_gp_mutex from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (7 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 08/19] srcu: Move ->lock from srcu_struct to srcu_usage Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 10/19] srcu: Move grace-period fields " Paul E. McKenney
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_gp_mutex field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  2 +-
 kernel/rcu/srcutree.c    | 14 +++++++-------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 6fe5743c1179..6402cd5a93ef 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -67,13 +67,13 @@ struct srcu_usage {
 	int srcu_size_state;			/* Small-to-big transition state. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
+	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 };
 
 /*
  * Per-SRCU-domain structure, similar in function to rcu_state.
  */
 struct srcu_struct {
-	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
 	unsigned long srcu_gp_seq_needed;	/* Latest gp_seq needed. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 579c43ff0f4b..3bf3408c7716 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -245,7 +245,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
 	ssp->srcu_sup->node = NULL;
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
-	mutex_init(&ssp->srcu_gp_mutex);
+	mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
 	ssp->srcu_gp_seq = 0;
 	ssp->srcu_barrier_seq = 0;
@@ -876,7 +876,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, gpseq))
 		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, gpseq);
 	spin_unlock_irq_rcu_node(ssp->srcu_sup);
-	mutex_unlock(&ssp->srcu_gp_mutex);
+	mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 	/* A new grace period can start at this point.  But only one. */
 
 	/* Initiate callback invocation as needed. */
@@ -1607,7 +1607,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 {
 	int idx;
 
-	mutex_lock(&ssp->srcu_gp_mutex);
+	mutex_lock(&ssp->srcu_sup->srcu_gp_mutex);
 
 	/*
 	 * Because readers might be delayed for an extended period after
@@ -1625,7 +1625,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
 			WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq));
 			spin_unlock_irq_rcu_node(ssp->srcu_sup);
-			mutex_unlock(&ssp->srcu_gp_mutex);
+			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return;
 		}
 		idx = rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq));
@@ -1633,7 +1633,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 			srcu_gp_start(ssp);
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 		if (idx != SRCU_STATE_IDLE) {
-			mutex_unlock(&ssp->srcu_gp_mutex);
+			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return; /* Someone else started the grace period. */
 		}
 	}
@@ -1641,7 +1641,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN1) {
 		idx = 1 ^ (ssp->srcu_idx & 1);
 		if (!try_check_zero(ssp, idx, 1)) {
-			mutex_unlock(&ssp->srcu_gp_mutex);
+			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return; /* readers present, retry later. */
 		}
 		srcu_flip(ssp);
@@ -1659,7 +1659,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		 */
 		idx = 1 ^ (ssp->srcu_idx & 1);
 		if (!try_check_zero(ssp, idx, 2)) {
-			mutex_unlock(&ssp->srcu_gp_mutex);
+			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return; /* readers present, retry later. */
 		}
 		ssp->srcu_n_exp_nodelay = 0;
-- 
2.40.0.rc2



* [PATCH RFC rcu 10/19] srcu: Move grace-period fields from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (8 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 09/19] srcu: Move ->srcu_gp_mutex " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 11/19] srcu: Move heuristics " Paul E. McKenney
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed,
->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields
from the srcu_struct structure to the srcu_usage structure to reduce
the size of the former in order to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  25 ++++----
 kernel/rcu/srcutree.c    | 126 +++++++++++++++++++--------------------
 2 files changed, 76 insertions(+), 75 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 6402cd5a93ef..7458df6bbe00 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -68,6 +68,11 @@ struct srcu_usage {
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
 	spinlock_t __private lock;		/* Protect counters and size state. */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
+	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
+	unsigned long srcu_gp_seq_needed;	/* Latest gp_seq needed. */
+	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
+	unsigned long srcu_gp_start;		/* Last GP start timestamp (jiffies) */
+	unsigned long srcu_last_gp_end;		/* Last GP end timestamp (ns) */
 };
 
 /*
@@ -75,11 +80,6 @@ struct srcu_usage {
  */
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
-	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
-	unsigned long srcu_gp_seq_needed;	/* Latest gp_seq needed. */
-	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
-	unsigned long srcu_gp_start;		/* Last GP start timestamp (jiffies) */
-	unsigned long srcu_last_gp_end;		/* Last GP end timestamp (ns) */
 	unsigned long srcu_size_jiffies;	/* Current contention-measurement interval. */
 	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
 	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
@@ -128,8 +128,13 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
-#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
+#define __SRCU_USAGE_INIT(name)									\
+{												\
+	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
+}
+
+#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
 	.srcu_sup = &usage_name,								\
 	__SRCU_DEP_MAP_INIT(name)
@@ -166,9 +171,7 @@ struct srcu_struct {
  */
 #ifdef MODULE
 # define __DEFINE_SRCU(name, is_static)								\
-	static struct srcu_usage name##_srcu_usage = {						\
-		.lock = __SPIN_LOCK_UNLOCKED(name.lock),					\
-	};											\
+	static struct srcu_usage name##_srcu_usage = __SRCU_USAGE_INIT(name##_srcu_usage);	\
 	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
 	extern struct srcu_struct * const __srcu_struct_##name;					\
 	struct srcu_struct * const __srcu_struct_##name						\
@@ -176,9 +179,7 @@ struct srcu_struct {
 #else
 # define __DEFINE_SRCU(name, is_static)								\
 	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
-	static struct srcu_usage name##_srcu_usage = {						\
-		.lock = __SPIN_LOCK_UNLOCKED(name.lock),					\
-	};											\
+	static struct srcu_usage name##_srcu_usage = __SRCU_USAGE_INIT(name##_srcu_usage);	\
 	is_static struct srcu_struct name =							\
 		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
 #endif
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 3bf3408c7716..6c1ac78b1cfc 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -135,8 +135,8 @@ static void init_srcu_struct_data(struct srcu_struct *ssp)
 		spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
 		rcu_segcblist_init(&sdp->srcu_cblist);
 		sdp->srcu_cblist_invoking = false;
-		sdp->srcu_gp_seq_needed = ssp->srcu_gp_seq;
-		sdp->srcu_gp_seq_needed_exp = ssp->srcu_gp_seq;
+		sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq;
+		sdp->srcu_gp_seq_needed_exp = ssp->srcu_sup->srcu_gp_seq;
 		sdp->mynode = NULL;
 		sdp->cpu = cpu;
 		INIT_WORK(&sdp->work, srcu_invoke_callbacks);
@@ -247,7 +247,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
-	ssp->srcu_gp_seq = 0;
+	ssp->srcu_sup->srcu_gp_seq = 0;
 	ssp->srcu_barrier_seq = 0;
 	mutex_init(&ssp->srcu_barrier_mutex);
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
@@ -261,8 +261,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 		return -ENOMEM;
 	}
 	init_srcu_struct_data(ssp);
-	ssp->srcu_gp_seq_needed_exp = 0;
-	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
+	ssp->srcu_sup->srcu_gp_seq_needed_exp = 0;
+	ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
 			if (!ssp->sda_is_static) {
@@ -275,7 +275,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 			WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
 		}
 	}
-	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
+	smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
 	return 0;
 }
 
@@ -402,10 +402,10 @@ static void check_init_srcu_struct(struct srcu_struct *ssp)
 	unsigned long flags;
 
 	/* The smp_load_acquire() pairs with the smp_store_release(). */
-	if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq_needed))) /*^^^*/
+	if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed))) /*^^^*/
 		return; /* Already initialized. */
 	spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags);
-	if (!rcu_seq_state(ssp->srcu_gp_seq_needed)) {
+	if (!rcu_seq_state(ssp->srcu_sup->srcu_gp_seq_needed)) {
 		spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 		return;
 	}
@@ -616,11 +616,11 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 	unsigned long j;
 	unsigned long jbase = SRCU_INTERVAL;
 
-	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
+	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
 		jbase = 0;
-	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq))) {
+	if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq))) {
 		j = jiffies - 1;
-		gpstart = READ_ONCE(ssp->srcu_gp_start);
+		gpstart = READ_ONCE(ssp->srcu_sup->srcu_gp_start);
 		if (time_after(j, gpstart))
 			jbase += j - gpstart;
 		if (!jbase) {
@@ -656,12 +656,12 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 		if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
 			return; /* Forgot srcu_barrier(), so just leak it! */
 	}
-	if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
-	    WARN_ON(rcu_seq_current(&ssp->srcu_gp_seq) != ssp->srcu_gp_seq_needed) ||
+	if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
+	    WARN_ON(rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq) != ssp->srcu_sup->srcu_gp_seq_needed) ||
 	    WARN_ON(srcu_readers_active(ssp))) {
 		pr_info("%s: Active srcu_struct %p read state: %d gp state: %lu/%lu\n",
-			__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)),
-			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
+			__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)),
+			rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq), ssp->srcu_sup->srcu_gp_seq_needed);
 		return; /* Caller forgot to stop doing call_srcu()? */
 	}
 	kfree(ssp->srcu_sup->node);
@@ -775,18 +775,18 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 	else
 		sdp = this_cpu_ptr(ssp->sda);
 	lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
-	WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
+	WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
 	spin_lock_rcu_node(sdp);  /* Interrupts already disabled. */
 	rcu_segcblist_advance(&sdp->srcu_cblist,
-			      rcu_seq_current(&ssp->srcu_gp_seq));
+			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
 	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
-				       rcu_seq_snap(&ssp->srcu_gp_seq));
+				       rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
 	spin_unlock_rcu_node(sdp);  /* Interrupts remain disabled. */
-	WRITE_ONCE(ssp->srcu_gp_start, jiffies);
+	WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
 	WRITE_ONCE(ssp->srcu_n_exp_nodelay, 0);
 	smp_mb(); /* Order prior store to ->srcu_gp_seq_needed vs. GP start. */
-	rcu_seq_start(&ssp->srcu_gp_seq);
-	state = rcu_seq_state(ssp->srcu_gp_seq);
+	rcu_seq_start(&ssp->srcu_sup->srcu_gp_seq);
+	state = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
 	WARN_ON_ONCE(state != SRCU_STATE_SCAN1);
 }
 
@@ -865,16 +865,16 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 
 	/* End the current grace period. */
 	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	idx = rcu_seq_state(ssp->srcu_gp_seq);
+	idx = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
 	WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
-	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
+	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
 		cbdelay = 0;
 
-	WRITE_ONCE(ssp->srcu_last_gp_end, ktime_get_mono_fast_ns());
-	rcu_seq_end(&ssp->srcu_gp_seq);
-	gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
-	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, gpseq))
-		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, gpseq);
+	WRITE_ONCE(ssp->srcu_sup->srcu_last_gp_end, ktime_get_mono_fast_ns());
+	rcu_seq_end(&ssp->srcu_sup->srcu_gp_seq);
+	gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
+	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq))
+		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq);
 	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 	/* A new grace period can start at this point.  But only one. */
@@ -925,9 +925,9 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 
 	/* Start a new grace period if needed. */
 	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
+	gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
 	if (!rcu_seq_state(gpseq) &&
-	    ULONG_CMP_LT(gpseq, ssp->srcu_gp_seq_needed)) {
+	    ULONG_CMP_LT(gpseq, ssp->srcu_sup->srcu_gp_seq_needed)) {
 		srcu_gp_start(ssp);
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 		srcu_reschedule(ssp, 0);
@@ -960,7 +960,7 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 	if (snp)
 		for (; snp != NULL; snp = snp->srcu_parent) {
 			sgsne = READ_ONCE(snp->srcu_gp_seq_needed_exp);
-			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) ||
+			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) ||
 			    (!srcu_invl_snp_seq(sgsne) && ULONG_CMP_GE(sgsne, s)))
 				return;
 			spin_lock_irqsave_rcu_node(snp, flags);
@@ -973,8 +973,8 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 			spin_unlock_irqrestore_rcu_node(snp, flags);
 		}
 	spin_lock_irqsave_ssp_contention(ssp, &flags);
-	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
-		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
+	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s))
+		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s);
 	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
 
@@ -1010,7 +1010,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	if (snp_leaf)
 		/* Each pass through the loop does one level of the srcu_node tree. */
 		for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
-			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) && snp != snp_leaf)
+			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) && snp != snp_leaf)
 				return; /* GP already done and CBs recorded. */
 			spin_lock_irqsave_rcu_node(snp, flags);
 			snp_seq = snp->srcu_have_cbs[idx];
@@ -1037,20 +1037,20 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 
 	/* Top of tree, must ensure the grace period will be started. */
 	spin_lock_irqsave_ssp_contention(ssp, &flags);
-	if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed, s)) {
+	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed, s)) {
 		/*
 		 * Record need for grace period s.  Pair with load
 		 * acquire setting up for initialization.
 		 */
-		smp_store_release(&ssp->srcu_gp_seq_needed, s); /*^^^*/
+		smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, s); /*^^^*/
 	}
-	if (!do_norm && ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
-		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
+	if (!do_norm && ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s))
+		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s);
 
 	/* If grace period not already in progress, start it. */
-	if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) &&
-	    rcu_seq_state(ssp->srcu_gp_seq) == SRCU_STATE_IDLE) {
-		WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
+	if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) &&
+	    rcu_seq_state(ssp->srcu_sup->srcu_gp_seq) == SRCU_STATE_IDLE) {
+		WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
 		srcu_gp_start(ssp);
 
 		// And how can that list_add() in the "else" clause
@@ -1184,18 +1184,18 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp)
 
 	/* First, see if enough time has passed since the last GP. */
 	t = ktime_get_mono_fast_ns();
-	tlast = READ_ONCE(ssp->srcu_last_gp_end);
+	tlast = READ_ONCE(ssp->srcu_sup->srcu_last_gp_end);
 	if (exp_holdoff == 0 ||
 	    time_in_range_open(t, tlast, tlast + exp_holdoff))
 		return false; /* Too soon after last GP. */
 
 	/* Next, check for probable idleness. */
-	curseq = rcu_seq_current(&ssp->srcu_gp_seq);
+	curseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
 	smp_mb(); /* Order ->srcu_gp_seq with ->srcu_gp_seq_needed. */
-	if (ULONG_CMP_LT(curseq, READ_ONCE(ssp->srcu_gp_seq_needed)))
+	if (ULONG_CMP_LT(curseq, READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed)))
 		return false; /* Grace period in progress, so not idle. */
 	smp_mb(); /* Order ->srcu_gp_seq with prior access. */
-	if (curseq != rcu_seq_current(&ssp->srcu_gp_seq))
+	if (curseq != rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq))
 		return false; /* GP # changed, so not idle. */
 	return true; /* With reasonable probability, idle! */
 }
@@ -1238,8 +1238,8 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	if (rhp)
 		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
 	rcu_segcblist_advance(&sdp->srcu_cblist,
-			      rcu_seq_current(&ssp->srcu_gp_seq));
-	s = rcu_seq_snap(&ssp->srcu_gp_seq);
+			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
+	s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
 	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
 	if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
 		sdp->srcu_gp_seq_needed = s;
@@ -1452,7 +1452,7 @@ unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp)
 	// Any prior manipulation of SRCU-protected data must happen
 	// before the load from ->srcu_gp_seq.
 	smp_mb();
-	return rcu_seq_snap(&ssp->srcu_gp_seq);
+	return rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
 }
 EXPORT_SYMBOL_GPL(get_state_synchronize_srcu);
 
@@ -1499,7 +1499,7 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
  */
 bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)
 {
-	if (!rcu_seq_done(&ssp->srcu_gp_seq, cookie))
+	if (!rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, cookie))
 		return false;
 	// Ensure that the end of the SRCU grace period happens before
 	// any subsequent code that the caller might execute.
@@ -1619,16 +1619,16 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 	 * The load-acquire ensures that we see the accesses performed
 	 * by the prior grace period.
 	 */
-	idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq)); /* ^^^ */
+	idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq)); /* ^^^ */
 	if (idx == SRCU_STATE_IDLE) {
 		spin_lock_irq_rcu_node(ssp->srcu_sup);
-		if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
-			WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq));
+		if (ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed)) {
+			WARN_ON_ONCE(rcu_seq_state(ssp->srcu_sup->srcu_gp_seq));
 			spin_unlock_irq_rcu_node(ssp->srcu_sup);
 			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return;
 		}
-		idx = rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq));
+		idx = rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq));
 		if (idx == SRCU_STATE_IDLE)
 			srcu_gp_start(ssp);
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
@@ -1638,7 +1638,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		}
 	}
 
-	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN1) {
+	if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) == SRCU_STATE_SCAN1) {
 		idx = 1 ^ (ssp->srcu_idx & 1);
 		if (!try_check_zero(ssp, idx, 1)) {
 			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
@@ -1646,12 +1646,12 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		}
 		srcu_flip(ssp);
 		spin_lock_irq_rcu_node(ssp->srcu_sup);
-		rcu_seq_set_state(&ssp->srcu_gp_seq, SRCU_STATE_SCAN2);
+		rcu_seq_set_state(&ssp->srcu_sup->srcu_gp_seq, SRCU_STATE_SCAN2);
 		ssp->srcu_n_exp_nodelay = 0;
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	}
 
-	if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN2) {
+	if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) == SRCU_STATE_SCAN2) {
 
 		/*
 		 * SRCU read-side critical sections are normally short,
@@ -1688,7 +1688,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	rcu_cblist_init(&ready_cbs);
 	spin_lock_irq_rcu_node(sdp);
 	rcu_segcblist_advance(&sdp->srcu_cblist,
-			      rcu_seq_current(&ssp->srcu_gp_seq));
+			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
 	if (sdp->srcu_cblist_invoking ||
 	    !rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) {
 		spin_unlock_irq_rcu_node(sdp);
@@ -1716,7 +1716,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	spin_lock_irq_rcu_node(sdp);
 	rcu_segcblist_add_len(&sdp->srcu_cblist, -len);
 	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
-				       rcu_seq_snap(&ssp->srcu_gp_seq));
+				       rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
 	sdp->srcu_cblist_invoking = false;
 	more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
 	spin_unlock_irq_rcu_node(sdp);
@@ -1733,12 +1733,12 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
 	bool pushgp = true;
 
 	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
-		if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq))) {
+	if (ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed)) {
+		if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_sup->srcu_gp_seq))) {
 			/* All requests fulfilled, time to go idle. */
 			pushgp = false;
 		}
-	} else if (!rcu_seq_state(ssp->srcu_gp_seq)) {
+	} else if (!rcu_seq_state(ssp->srcu_sup->srcu_gp_seq)) {
 		/* Outstanding request and no GP.  Start one. */
 		srcu_gp_start(ssp);
 	}
@@ -1784,7 +1784,7 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
 	if (test_type != SRCU_FLAVOR)
 		return;
 	*flags = 0;
-	*gp_seq = rcu_seq_current(&ssp->srcu_gp_seq);
+	*gp_seq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
 }
 EXPORT_SYMBOL_GPL(srcutorture_get_gp_data);
 
@@ -1813,7 +1813,7 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 	if (ss_state < 0 || ss_state >= ARRAY_SIZE(srcu_size_state_name))
 		ss_state_idx = ARRAY_SIZE(srcu_size_state_name) - 1;
 	pr_alert("%s%s Tree SRCU g%ld state %d (%s)",
-		 tt, tf, rcu_seq_current(&ssp->srcu_gp_seq), ss_state,
+		 tt, tf, rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq), ss_state,
 		 srcu_size_state_name[ss_state_idx]);
 	if (!ssp->sda) {
 		// Called after cleanup_srcu_struct(), perhaps.
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 11/19] srcu: Move heuristics fields from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (9 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 10/19] srcu: Move grace-period fields " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 12/19] srcu: Move ->sda_is_static " Paul E. McKenney
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_size_jiffies, ->srcu_n_lock_retries,
and ->srcu_n_exp_nodelay fields from the srcu_struct structure to the
srcu_usage structure to reduce the size of the former in order to improve
cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  6 +++---
 kernel/rcu/srcutree.c    | 18 +++++++++---------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 7458df6bbe00..82d07466da93 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -73,6 +73,9 @@ struct srcu_usage {
 	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
 	unsigned long srcu_gp_start;		/* Last GP start timestamp (jiffies) */
 	unsigned long srcu_last_gp_end;		/* Last GP end timestamp (ns) */
+	unsigned long srcu_size_jiffies;	/* Current contention-measurement interval. */
+	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
+	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
 };
 
 /*
@@ -80,9 +83,6 @@ struct srcu_usage {
  */
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
-	unsigned long srcu_size_jiffies;	/* Current contention-measurement interval. */
-	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
-	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
 	bool sda_is_static;			/* May ->sda be passed to free_percpu()? */
 	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 6c1ac78b1cfc..3f389b3a25aa 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -348,11 +348,11 @@ static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp)
 	if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_sup->srcu_size_state)
 		return;
 	j = jiffies;
-	if (ssp->srcu_size_jiffies != j) {
-		ssp->srcu_size_jiffies = j;
-		ssp->srcu_n_lock_retries = 0;
+	if (ssp->srcu_sup->srcu_size_jiffies != j) {
+		ssp->srcu_sup->srcu_size_jiffies = j;
+		ssp->srcu_sup->srcu_n_lock_retries = 0;
 	}
-	if (++ssp->srcu_n_lock_retries <= small_contention_lim)
+	if (++ssp->srcu_sup->srcu_n_lock_retries <= small_contention_lim)
 		return;
 	__srcu_transition_to_big(ssp);
 }
@@ -624,8 +624,8 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 		if (time_after(j, gpstart))
 			jbase += j - gpstart;
 		if (!jbase) {
-			WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
-			if (READ_ONCE(ssp->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
+			WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) + 1);
+			if (READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
 				jbase = 1;
 		}
 	}
@@ -783,7 +783,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 				       rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
 	spin_unlock_rcu_node(sdp);  /* Interrupts remain disabled. */
 	WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
-	WRITE_ONCE(ssp->srcu_n_exp_nodelay, 0);
+	WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, 0);
 	smp_mb(); /* Order prior store to ->srcu_gp_seq_needed vs. GP start. */
 	rcu_seq_start(&ssp->srcu_sup->srcu_gp_seq);
 	state = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
@@ -1647,7 +1647,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 		srcu_flip(ssp);
 		spin_lock_irq_rcu_node(ssp->srcu_sup);
 		rcu_seq_set_state(&ssp->srcu_sup->srcu_gp_seq, SRCU_STATE_SCAN2);
-		ssp->srcu_n_exp_nodelay = 0;
+		ssp->srcu_sup->srcu_n_exp_nodelay = 0;
 		spin_unlock_irq_rcu_node(ssp->srcu_sup);
 	}
 
@@ -1662,7 +1662,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
 			mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
 			return; /* readers present, retry later. */
 		}
-		ssp->srcu_n_exp_nodelay = 0;
+		ssp->srcu_sup->srcu_n_exp_nodelay = 0;
 		srcu_gp_end(ssp);  /* Releases ->srcu_gp_mutex. */
 	}
 }
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 12/19] srcu: Move ->sda_is_static from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (10 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 11/19] srcu: Move heuristics " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 13/19] srcu: Move srcu_barrier() fields " Paul E. McKenney
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->sda_is_static field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 2 +-
 kernel/rcu/srcutree.c    | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 82d07466da93..ca48b97d9f3b 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -76,6 +76,7 @@ struct srcu_usage {
 	unsigned long srcu_size_jiffies;	/* Current contention-measurement interval. */
 	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
 	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
+	bool sda_is_static;			/* May ->sda be passed to free_percpu()? */
 };
 
 /*
@@ -84,7 +85,6 @@ struct srcu_usage {
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
-	bool sda_is_static;			/* May ->sda be passed to free_percpu()? */
 	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
 	struct mutex srcu_barrier_mutex;	/* Serialize barrier ops. */
 	struct completion srcu_barrier_completion;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 3f389b3a25aa..7c192f7fb559 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -252,7 +252,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	mutex_init(&ssp->srcu_barrier_mutex);
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
 	INIT_DELAYED_WORK(&ssp->work, process_srcu);
-	ssp->sda_is_static = is_static;
+	ssp->srcu_sup->sda_is_static = is_static;
 	if (!is_static)
 		ssp->sda = alloc_percpu(struct srcu_data);
 	if (!ssp->sda) {
@@ -265,7 +265,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
-			if (!ssp->sda_is_static) {
+			if (!ssp->srcu_sup->sda_is_static) {
 				free_percpu(ssp->sda);
 				ssp->sda = NULL;
 				kfree(ssp->srcu_sup);
@@ -667,7 +667,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 	kfree(ssp->srcu_sup->node);
 	ssp->srcu_sup->node = NULL;
 	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
-	if (!ssp->sda_is_static) {
+	if (!ssp->srcu_sup->sda_is_static) {
 		free_percpu(ssp->sda);
 		ssp->sda = NULL;
 		kfree(ssp->srcu_sup);
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 13/19] srcu: Move srcu_barrier() fields from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (11 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 12/19] srcu: Move ->sda_is_static " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 14/19] srcu: Move work-scheduling " Paul E. McKenney
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->srcu_barrier_seq, ->srcu_barrier_mutex,
->srcu_barrier_completion, and ->srcu_barrier_cpu_cnt fields from the
srcu_struct structure to the srcu_usage structure to reduce the size of
the former in order to improve cache locality.
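
None of this is visible to srcu_barrier() callers, which continue to
pass only the srcu_struct pointer.  A minimal usage sketch (hypothetical
names, not taken from this patch):

	DEFINE_STATIC_SRCU(my_srcu);

	static void my_srcu_shutdown(void)
	{
		/* Wait for callbacks from all earlier call_srcu() invocations. */
		srcu_barrier(&my_srcu);
	}

Internally, the barrier bookkeeping is now reached via ssp->srcu_sup, as
the srcu_barrier() hunks below show.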

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h | 14 +++++++-------
 kernel/rcu/srcutree.c    | 38 +++++++++++++++++++-------------------
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index ca48b97d9f3b..70f5a3abe80a 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -77,6 +77,13 @@ struct srcu_usage {
 	unsigned long srcu_n_lock_retries;	/* Contention events in current interval. */
 	unsigned long srcu_n_exp_nodelay;	/* # expedited no-delays in current GP phase. */
 	bool sda_is_static;			/* May ->sda be passed to free_percpu()? */
+	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
+	struct mutex srcu_barrier_mutex;	/* Serialize barrier ops. */
+	struct completion srcu_barrier_completion;
+						/* Awaken barrier rq at end. */
+	atomic_t srcu_barrier_cpu_cnt;		/* # CPUs not yet posting a */
+						/*  callback for the barrier */
+						/*  operation. */
 };
 
 /*
@@ -85,13 +92,6 @@ struct srcu_usage {
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
-	unsigned long srcu_barrier_seq;		/* srcu_barrier seq #. */
-	struct mutex srcu_barrier_mutex;	/* Serialize barrier ops. */
-	struct completion srcu_barrier_completion;
-						/* Awaken barrier rq at end. */
-	atomic_t srcu_barrier_cpu_cnt;		/* # CPUs not yet posting a */
-						/*  callback for the barrier */
-						/*  operation. */
 	unsigned long reschedule_jiffies;
 	unsigned long reschedule_count;
 	struct delayed_work work;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 7c192f7fb559..1df4a1467765 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -248,9 +248,9 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
 	ssp->srcu_sup->srcu_gp_seq = 0;
-	ssp->srcu_barrier_seq = 0;
-	mutex_init(&ssp->srcu_barrier_mutex);
-	atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
+	ssp->srcu_sup->srcu_barrier_seq = 0;
+	mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
+	atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
 	INIT_DELAYED_WORK(&ssp->work, process_srcu);
 	ssp->srcu_sup->sda_is_static = is_static;
 	if (!is_static)
@@ -1518,8 +1518,8 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
 
 	sdp = container_of(rhp, struct srcu_data, srcu_barrier_head);
 	ssp = sdp->ssp;
-	if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
-		complete(&ssp->srcu_barrier_completion);
+	if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
+		complete(&ssp->srcu_sup->srcu_barrier_completion);
 }
 
 /*
@@ -1533,13 +1533,13 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
 static void srcu_barrier_one_cpu(struct srcu_struct *ssp, struct srcu_data *sdp)
 {
 	spin_lock_irq_rcu_node(sdp);
-	atomic_inc(&ssp->srcu_barrier_cpu_cnt);
+	atomic_inc(&ssp->srcu_sup->srcu_barrier_cpu_cnt);
 	sdp->srcu_barrier_head.func = srcu_barrier_cb;
 	debug_rcu_head_queue(&sdp->srcu_barrier_head);
 	if (!rcu_segcblist_entrain(&sdp->srcu_cblist,
 				   &sdp->srcu_barrier_head)) {
 		debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
-		atomic_dec(&ssp->srcu_barrier_cpu_cnt);
+		atomic_dec(&ssp->srcu_sup->srcu_barrier_cpu_cnt);
 	}
 	spin_unlock_irq_rcu_node(sdp);
 }
@@ -1552,20 +1552,20 @@ void srcu_barrier(struct srcu_struct *ssp)
 {
 	int cpu;
 	int idx;
-	unsigned long s = rcu_seq_snap(&ssp->srcu_barrier_seq);
+	unsigned long s = rcu_seq_snap(&ssp->srcu_sup->srcu_barrier_seq);
 
 	check_init_srcu_struct(ssp);
-	mutex_lock(&ssp->srcu_barrier_mutex);
-	if (rcu_seq_done(&ssp->srcu_barrier_seq, s)) {
+	mutex_lock(&ssp->srcu_sup->srcu_barrier_mutex);
+	if (rcu_seq_done(&ssp->srcu_sup->srcu_barrier_seq, s)) {
 		smp_mb(); /* Force ordering following return. */
-		mutex_unlock(&ssp->srcu_barrier_mutex);
+		mutex_unlock(&ssp->srcu_sup->srcu_barrier_mutex);
 		return; /* Someone else did our work for us. */
 	}
-	rcu_seq_start(&ssp->srcu_barrier_seq);
-	init_completion(&ssp->srcu_barrier_completion);
+	rcu_seq_start(&ssp->srcu_sup->srcu_barrier_seq);
+	init_completion(&ssp->srcu_sup->srcu_barrier_completion);
 
 	/* Initial count prevents reaching zero until all CBs are posted. */
-	atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);
+	atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 1);
 
 	idx = __srcu_read_lock_nmisafe(ssp);
 	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
@@ -1576,12 +1576,12 @@ void srcu_barrier(struct srcu_struct *ssp)
 	__srcu_read_unlock_nmisafe(ssp, idx);
 
 	/* Remove the initial count, at which point reaching zero can happen. */
-	if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
-		complete(&ssp->srcu_barrier_completion);
-	wait_for_completion(&ssp->srcu_barrier_completion);
+	if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
+		complete(&ssp->srcu_sup->srcu_barrier_completion);
+	wait_for_completion(&ssp->srcu_sup->srcu_barrier_completion);
 
-	rcu_seq_end(&ssp->srcu_barrier_seq);
-	mutex_unlock(&ssp->srcu_barrier_mutex);
+	rcu_seq_end(&ssp->srcu_sup->srcu_barrier_seq);
+	mutex_unlock(&ssp->srcu_sup->srcu_barrier_mutex);
 }
 EXPORT_SYMBOL_GPL(srcu_barrier);
 
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 14/19] srcu: Move work-scheduling fields from srcu_struct to srcu_usage
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (12 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 13/19] srcu: Move srcu_barrier() fields " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 15/19] srcu: Fix long lines in srcu_get_delay() Paul E. McKenney
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit moves the ->reschedule_jiffies, ->reschedule_count, and
->work fields from the srcu_struct structure to the srcu_usage structure
to reduce the size of the former in order to improve cache locality.

However, this means that the container_of() calls can no longer obtain
a pointer to the srcu_struct, because the ->work field is no longer in
the srcu_struct.  This issue is addressed by adding a ->srcu_ssp field
in the srcu_usage structure that references the corresponding
srcu_struct structure.  And given the presence of a "sup" pointer to the
srcu_usage structure, replace some ssp->srcu_sup-> instances with sup->.
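
The resulting pattern, condensed from the process_srcu() hunk below:

	static void process_srcu(struct work_struct *work)
	{
		struct srcu_struct *ssp;
		struct srcu_usage *sup;

		/* The ->work field now lives in srcu_usage... */
		sup = container_of(work, struct srcu_usage, work.work);
		/* ...so the new back-pointer recovers the srcu_struct. */
		ssp = sup->srcu_ssp;
		...
	}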

[ paulmck: Apply feedback from kernel test robot. ]

Link: https://lore.kernel.org/oe-kbuild-all/202303191400.iO5BOqka-lkp@intel.com/
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/srcutree.h |  9 +++++----
 kernel/rcu/srcutree.c    | 41 +++++++++++++++++++++-------------------
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 70f5a3abe80a..8f3f72480e78 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -84,6 +84,10 @@ struct srcu_usage {
 	atomic_t srcu_barrier_cpu_cnt;		/* # CPUs not yet posting a */
 						/*  callback for the barrier */
 						/*  operation. */
+	unsigned long reschedule_jiffies;
+	unsigned long reschedule_count;
+	struct delayed_work work;
+	struct srcu_struct *srcu_ssp;
 };
 
 /*
@@ -92,9 +96,6 @@ struct srcu_usage {
 struct srcu_struct {
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	struct srcu_data __percpu *sda;		/* Per-CPU srcu_data array. */
-	unsigned long reschedule_jiffies;
-	unsigned long reschedule_count;
-	struct delayed_work work;
 	struct lockdep_map dep_map;
 	struct srcu_usage *srcu_sup;		/* Update-side data. */
 };
@@ -132,10 +133,10 @@ struct srcu_struct {
 {												\
 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
 	.srcu_gp_seq_needed = -1UL,								\
+	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
 }
 
 #define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
-	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
 	.srcu_sup = &usage_name,								\
 	__SRCU_DEP_MAP_INIT(name)
 
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 1df4a1467765..22dc266d2090 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -251,7 +251,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	ssp->srcu_sup->srcu_barrier_seq = 0;
 	mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
 	atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
-	INIT_DELAYED_WORK(&ssp->work, process_srcu);
+	INIT_DELAYED_WORK(&ssp->srcu_sup->work, process_srcu);
 	ssp->srcu_sup->sda_is_static = is_static;
 	if (!is_static)
 		ssp->sda = alloc_percpu(struct srcu_data);
@@ -275,6 +275,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 			WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
 		}
 	}
+	ssp->srcu_sup->srcu_ssp = ssp;
 	smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
 	return 0;
 }
@@ -647,7 +648,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 		return; /* Just leak it! */
 	if (WARN_ON(srcu_readers_active(ssp)))
 		return; /* Just leak it! */
-	flush_delayed_work(&ssp->work);
+	flush_delayed_work(&ssp->srcu_sup->work);
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
 
@@ -1059,10 +1060,10 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 		// can only be executed during early boot when there is only
 		// the one boot CPU running with interrupts still disabled.
 		if (likely(srcu_init_done))
-			queue_delayed_work(rcu_gp_wq, &ssp->work,
+			queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work,
 					   !!srcu_get_delay(ssp));
-		else if (list_empty(&ssp->work.work.entry))
-			list_add(&ssp->work.work.entry, &srcu_boot_list);
+		else if (list_empty(&ssp->srcu_sup->work.work.entry))
+			list_add(&ssp->srcu_sup->work.work.entry, &srcu_boot_list);
 	}
 	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
 }
@@ -1745,7 +1746,7 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
 	spin_unlock_irq_rcu_node(ssp->srcu_sup);
 
 	if (pushgp)
-		queue_delayed_work(rcu_gp_wq, &ssp->work, delay);
+		queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work, delay);
 }
 
 /*
@@ -1756,22 +1757,24 @@ static void process_srcu(struct work_struct *work)
 	unsigned long curdelay;
 	unsigned long j;
 	struct srcu_struct *ssp;
+	struct srcu_usage *sup;
 
-	ssp = container_of(work, struct srcu_struct, work.work);
+	sup = container_of(work, struct srcu_usage, work.work);
+	ssp = sup->srcu_ssp;
 
 	srcu_advance_state(ssp);
 	curdelay = srcu_get_delay(ssp);
 	if (curdelay) {
-		WRITE_ONCE(ssp->reschedule_count, 0);
+		WRITE_ONCE(sup->reschedule_count, 0);
 	} else {
 		j = jiffies;
-		if (READ_ONCE(ssp->reschedule_jiffies) == j) {
-			WRITE_ONCE(ssp->reschedule_count, READ_ONCE(ssp->reschedule_count) + 1);
-			if (READ_ONCE(ssp->reschedule_count) > srcu_max_nodelay)
+		if (READ_ONCE(sup->reschedule_jiffies) == j) {
+			WRITE_ONCE(sup->reschedule_count, READ_ONCE(sup->reschedule_count) + 1);
+			if (READ_ONCE(sup->reschedule_count) > srcu_max_nodelay)
 				curdelay = 1;
 		} else {
-			WRITE_ONCE(ssp->reschedule_count, 1);
-			WRITE_ONCE(ssp->reschedule_jiffies, j);
+			WRITE_ONCE(sup->reschedule_count, 1);
+			WRITE_ONCE(sup->reschedule_jiffies, j);
 		}
 	}
 	srcu_reschedule(ssp, curdelay);
@@ -1870,7 +1873,7 @@ early_initcall(srcu_bootup_announce);
 
 void __init srcu_init(void)
 {
-	struct srcu_struct *ssp;
+	struct srcu_usage *sup;
 
 	/* Decide on srcu_struct-size strategy. */
 	if (SRCU_SIZING_IS(SRCU_SIZING_AUTO)) {
@@ -1890,13 +1893,13 @@ void __init srcu_init(void)
 	 */
 	srcu_init_done = true;
 	while (!list_empty(&srcu_boot_list)) {
-		ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
+		sup = list_first_entry(&srcu_boot_list, struct srcu_usage,
 				      work.work.entry);
-		list_del_init(&ssp->work.work.entry);
+		list_del_init(&sup->work.work.entry);
 		if (SRCU_SIZING_IS(SRCU_SIZING_INIT) &&
-		    ssp->srcu_sup->srcu_size_state == SRCU_SIZE_SMALL)
-			ssp->srcu_sup->srcu_size_state = SRCU_SIZE_ALLOC;
-		queue_work(rcu_gp_wq, &ssp->work.work);
+		    sup->srcu_size_state == SRCU_SIZE_SMALL)
+			sup->srcu_size_state = SRCU_SIZE_ALLOC;
+		queue_work(rcu_gp_wq, &sup->work.work);
 	}
 }
 
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 15/19] srcu: Fix long lines in srcu_get_delay()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (13 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 14/19] srcu: Move work-scheduling " Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 16/19] srcu: Fix long lines in cleanup_srcu_struct() Paul E. McKenney
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 22dc266d2090..f6bb9fbe1b9c 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -616,17 +616,18 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 	unsigned long gpstart;
 	unsigned long j;
 	unsigned long jbase = SRCU_INTERVAL;
+	struct srcu_usage *sup = ssp->srcu_sup;
 
-	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
+	if (ULONG_CMP_LT(READ_ONCE(sup->srcu_gp_seq), READ_ONCE(sup->srcu_gp_seq_needed_exp)))
 		jbase = 0;
-	if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq))) {
+	if (rcu_seq_state(READ_ONCE(sup->srcu_gp_seq))) {
 		j = jiffies - 1;
-		gpstart = READ_ONCE(ssp->srcu_sup->srcu_gp_start);
+		gpstart = READ_ONCE(sup->srcu_gp_start);
 		if (time_after(j, gpstart))
 			jbase += j - gpstart;
 		if (!jbase) {
-			WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) + 1);
-			if (READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
+			WRITE_ONCE(sup->srcu_n_exp_nodelay, READ_ONCE(sup->srcu_n_exp_nodelay) + 1);
+			if (READ_ONCE(sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
 				jbase = 1;
 		}
 	}
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 16/19] srcu: Fix long lines in cleanup_srcu_struct()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (14 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 15/19] srcu: Fix long lines in srcu_get_delay() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 17/19] srcu: Fix long lines in srcu_gp_end() Paul E. McKenney
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index f6bb9fbe1b9c..fd88a98b7254 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -644,12 +644,13 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
 void cleanup_srcu_struct(struct srcu_struct *ssp)
 {
 	int cpu;
+	struct srcu_usage *sup = ssp->srcu_sup;
 
 	if (WARN_ON(!srcu_get_delay(ssp)))
 		return; /* Just leak it! */
 	if (WARN_ON(srcu_readers_active(ssp)))
 		return; /* Just leak it! */
-	flush_delayed_work(&ssp->srcu_sup->work);
+	flush_delayed_work(&sup->work);
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
 
@@ -658,21 +659,21 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 		if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
 			return; /* Forgot srcu_barrier(), so just leak it! */
 	}
-	if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
-	    WARN_ON(rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq) != ssp->srcu_sup->srcu_gp_seq_needed) ||
+	if (WARN_ON(rcu_seq_state(READ_ONCE(sup->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
+	    WARN_ON(rcu_seq_current(&sup->srcu_gp_seq) != sup->srcu_gp_seq_needed) ||
 	    WARN_ON(srcu_readers_active(ssp))) {
 		pr_info("%s: Active srcu_struct %p read state: %d gp state: %lu/%lu\n",
-			__func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)),
-			rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq), ssp->srcu_sup->srcu_gp_seq_needed);
+			__func__, ssp, rcu_seq_state(READ_ONCE(sup->srcu_gp_seq)),
+			rcu_seq_current(&sup->srcu_gp_seq), sup->srcu_gp_seq_needed);
 		return; /* Caller forgot to stop doing call_srcu()? */
 	}
-	kfree(ssp->srcu_sup->node);
-	ssp->srcu_sup->node = NULL;
-	ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
-	if (!ssp->srcu_sup->sda_is_static) {
+	kfree(sup->node);
+	sup->node = NULL;
+	sup->srcu_size_state = SRCU_SIZE_SMALL;
+	if (!sup->sda_is_static) {
 		free_percpu(ssp->sda);
 		ssp->sda = NULL;
-		kfree(ssp->srcu_sup);
+		kfree(sup);
 		ssp->srcu_sup = NULL;
 	}
 }
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 17/19] srcu: Fix long lines in srcu_gp_end()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (15 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 16/19] srcu: Fix long lines in cleanup_srcu_struct() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 18/19] srcu: Fix long lines in srcu_funnel_gp_start() Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 19/19] srcu: Remove extraneous parentheses from srcu_read_lock() etc Paul E. McKenney
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 41 +++++++++++++++++++++--------------------
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index fd88a98b7254..fcb2bac7bb4b 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -862,28 +862,29 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	unsigned long sgsne;
 	struct srcu_node *snp;
 	int ss_state;
+	struct srcu_usage *sup = ssp->srcu_sup;
 
 	/* Prevent more than one additional grace period. */
-	mutex_lock(&ssp->srcu_sup->srcu_cb_mutex);
+	mutex_lock(&sup->srcu_cb_mutex);
 
 	/* End the current grace period. */
-	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	idx = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
+	spin_lock_irq_rcu_node(sup);
+	idx = rcu_seq_state(sup->srcu_gp_seq);
 	WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
-	if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
+	if (ULONG_CMP_LT(READ_ONCE(sup->srcu_gp_seq), READ_ONCE(sup->srcu_gp_seq_needed_exp)))
 		cbdelay = 0;
 
-	WRITE_ONCE(ssp->srcu_sup->srcu_last_gp_end, ktime_get_mono_fast_ns());
-	rcu_seq_end(&ssp->srcu_sup->srcu_gp_seq);
-	gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
-	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq))
-		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq);
-	spin_unlock_irq_rcu_node(ssp->srcu_sup);
-	mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
+	WRITE_ONCE(sup->srcu_last_gp_end, ktime_get_mono_fast_ns());
+	rcu_seq_end(&sup->srcu_gp_seq);
+	gpseq = rcu_seq_current(&sup->srcu_gp_seq);
+	if (ULONG_CMP_LT(sup->srcu_gp_seq_needed_exp, gpseq))
+		WRITE_ONCE(sup->srcu_gp_seq_needed_exp, gpseq);
+	spin_unlock_irq_rcu_node(sup);
+	mutex_unlock(&sup->srcu_gp_mutex);
 	/* A new grace period can start at this point.  But only one. */
 
 	/* Initiate callback invocation as needed. */
-	ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
+	ss_state = smp_load_acquire(&sup->srcu_size_state);
 	if (ss_state < SRCU_SIZE_WAIT_BARRIER) {
 		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, get_boot_cpu_id()),
 					cbdelay);
@@ -892,7 +893,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		srcu_for_each_node_breadth_first(ssp, snp) {
 			spin_lock_irq_rcu_node(snp);
 			cbs = false;
-			last_lvl = snp >= ssp->srcu_sup->level[rcu_num_lvls - 1];
+			last_lvl = snp >= sup->level[rcu_num_lvls - 1];
 			if (last_lvl)
 				cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq;
 			snp->srcu_have_cbs[idx] = gpseq;
@@ -924,18 +925,18 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		}
 
 	/* Callback initiation done, allow grace periods after next. */
-	mutex_unlock(&ssp->srcu_sup->srcu_cb_mutex);
+	mutex_unlock(&sup->srcu_cb_mutex);
 
 	/* Start a new grace period if needed. */
-	spin_lock_irq_rcu_node(ssp->srcu_sup);
-	gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
+	spin_lock_irq_rcu_node(sup);
+	gpseq = rcu_seq_current(&sup->srcu_gp_seq);
 	if (!rcu_seq_state(gpseq) &&
-	    ULONG_CMP_LT(gpseq, ssp->srcu_sup->srcu_gp_seq_needed)) {
+	    ULONG_CMP_LT(gpseq, sup->srcu_gp_seq_needed)) {
 		srcu_gp_start(ssp);
-		spin_unlock_irq_rcu_node(ssp->srcu_sup);
+		spin_unlock_irq_rcu_node(sup);
 		srcu_reschedule(ssp, 0);
 	} else {
-		spin_unlock_irq_rcu_node(ssp->srcu_sup);
+		spin_unlock_irq_rcu_node(sup);
 	}
 
 	/* Transition to big if needed. */
@@ -943,7 +944,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 		if (ss_state == SRCU_SIZE_ALLOC)
 			init_srcu_struct_nodes(ssp, GFP_KERNEL);
 		else
-			smp_store_release(&ssp->srcu_sup->srcu_size_state, ss_state + 1);
+			smp_store_release(&sup->srcu_size_state, ss_state + 1);
 	}
 }
 
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 18/19] srcu: Fix long lines in srcu_funnel_gp_start()
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (16 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 17/19] srcu: Fix long lines in srcu_gp_end() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  2023-03-24  0:19 ` [PATCH RFC rcu 19/19] srcu: Remove extraneous parentheses from srcu_read_lock() etc Paul E. McKenney
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 kernel/rcu/srcutree.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index fcb2bac7bb4b..61dd47981b40 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1004,9 +1004,10 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	struct srcu_node *snp;
 	struct srcu_node *snp_leaf;
 	unsigned long snp_seq;
+	struct srcu_usage *sup = ssp->srcu_sup;
 
 	/* Ensure that snp node tree is fully initialized before traversing it */
-	if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+	if (smp_load_acquire(&sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		snp_leaf = NULL;
 	else
 		snp_leaf = sdp->mynode;
@@ -1014,7 +1015,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	if (snp_leaf)
 		/* Each pass through the loop does one level of the srcu_node tree. */
 		for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
-			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) && snp != snp_leaf)
+			if (WARN_ON_ONCE(rcu_seq_done(&sup->srcu_gp_seq, s)) && snp != snp_leaf)
 				return; /* GP already done and CBs recorded. */
 			spin_lock_irqsave_rcu_node(snp, flags);
 			snp_seq = snp->srcu_have_cbs[idx];
@@ -1041,20 +1042,20 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 
 	/* Top of tree, must ensure the grace period will be started. */
 	spin_lock_irqsave_ssp_contention(ssp, &flags);
-	if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed, s)) {
+	if (ULONG_CMP_LT(sup->srcu_gp_seq_needed, s)) {
 		/*
 		 * Record need for grace period s.  Pair with load
 		 * acquire setting up for initialization.
 		 */
-		smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, s); /*^^^*/
+		smp_store_release(&sup->srcu_gp_seq_needed, s); /*^^^*/
 	}
-	if (!do_norm && ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s))
-		WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s);
+	if (!do_norm && ULONG_CMP_LT(sup->srcu_gp_seq_needed_exp, s))
+		WRITE_ONCE(sup->srcu_gp_seq_needed_exp, s);
 
 	/* If grace period not already in progress, start it. */
-	if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) &&
-	    rcu_seq_state(ssp->srcu_sup->srcu_gp_seq) == SRCU_STATE_IDLE) {
-		WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
+	if (!WARN_ON_ONCE(rcu_seq_done(&sup->srcu_gp_seq, s)) &&
+	    rcu_seq_state(sup->srcu_gp_seq) == SRCU_STATE_IDLE) {
+		WARN_ON_ONCE(ULONG_CMP_GE(sup->srcu_gp_seq, sup->srcu_gp_seq_needed));
 		srcu_gp_start(ssp);
 
 		// And how can that list_add() in the "else" clause
@@ -1063,12 +1064,12 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 		// can only be executed during early boot when there is only
 		// the one boot CPU running with interrupts still disabled.
 		if (likely(srcu_init_done))
-			queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work,
+			queue_delayed_work(rcu_gp_wq, &sup->work,
 					   !!srcu_get_delay(ssp));
-		else if (list_empty(&ssp->srcu_sup->work.work.entry))
-			list_add(&ssp->srcu_sup->work.work.entry, &srcu_boot_list);
+		else if (list_empty(&sup->work.work.entry))
+			list_add(&sup->work.work.entry, &srcu_boot_list);
 	}
-	spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
+	spin_unlock_irqrestore_rcu_node(sup, flags);
 }
 
 /*
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH RFC rcu 19/19] srcu: Remove extraneous parentheses from srcu_read_lock() etc.
  2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
                   ` (17 preceding siblings ...)
  2023-03-24  0:19 ` [PATCH RFC rcu 18/19] srcu: Fix long lines in srcu_funnel_gp_start() Paul E. McKenney
@ 2023-03-24  0:19 ` Paul E. McKenney
  18 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24  0:19 UTC (permalink / raw)
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt, hch, Paul E. McKenney

This commit removes extraneous parentheses from srcu_read_lock(),
srcu_read_lock_nmisafe(), srcu_read_unlock(), and
srcu_read_unlock_nmisafe().  Looks like these functions were once macros.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
---
 include/linux/srcu.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 41c4b26fb1c1..eb92a50a4599 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -212,7 +212,7 @@ static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
 
 	srcu_check_nmi_safety(ssp, false);
 	retval = __srcu_read_lock(ssp);
-	srcu_lock_acquire(&(ssp)->dep_map);
+	srcu_lock_acquire(&ssp->dep_map);
 	return retval;
 }
 
@@ -229,7 +229,7 @@ static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp
 
 	srcu_check_nmi_safety(ssp, true);
 	retval = __srcu_read_lock_nmisafe(ssp);
-	rcu_lock_acquire(&(ssp)->dep_map);
+	rcu_lock_acquire(&ssp->dep_map);
 	return retval;
 }
 
@@ -284,7 +284,7 @@ static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
 {
 	WARN_ON_ONCE(idx & ~0x1);
 	srcu_check_nmi_safety(ssp, false);
-	srcu_lock_release(&(ssp)->dep_map);
+	srcu_lock_release(&ssp->dep_map);
 	__srcu_read_unlock(ssp, idx);
 }
 
@@ -300,7 +300,7 @@ static inline void srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
 {
 	WARN_ON_ONCE(idx & ~0x1);
 	srcu_check_nmi_safety(ssp, true);
-	rcu_lock_release(&(ssp)->dep_map);
+	rcu_lock_release(&ssp->dep_map);
 	__srcu_read_unlock_nmisafe(ssp, idx);
 }
 
-- 
2.40.0.rc2


^ permalink raw reply related	[flat|nested] 30+ messages in thread
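
For background, the parentheses being removed are a leftover from when these were macros: a macro parameter can be an arbitrary expression and must be parenthesized before "->" is applied, while a static inline function receives an ordinary variable, so the parentheses are redundant there. A small stand-alone sketch of the difference, with invented names (struct srcu_like, DEP_MAP_ADDR(), dep_map_addr()) rather than the real SRCU API:

#include <stdio.h>

struct srcu_like {
	int dep_map;
};

/* Macro-era style: the parameter can be any expression, so it must be
 * parenthesized before "->" is applied to it. */
#define DEP_MAP_ADDR(ssp)	(&(ssp)->dep_map)

/* Without the inner parentheses, an argument such as "flag ? &a : &b"
 * would expand to "&flag ? &a : &b->dep_map", which mis-parses (and
 * here would not even compile), because "->" and unary "&" bind more
 * tightly than "?:". */
/* #define DEP_MAP_ADDR_UNSAFE(ssp)	(&ssp->dep_map) */

/* Inline-function style: ssp is an ordinary variable, so no extra
 * parentheses are ever needed -- which is why the patch can drop them. */
static inline int *dep_map_addr(struct srcu_like *ssp)
{
	return &ssp->dep_map;
}

int main(void)
{
	struct srcu_like a = { .dep_map = 1 }, b = { .dep_map = 2 };
	int flag = 0;

	printf("%d\n", *DEP_MAP_ADDR(flag ? &a : &b));	/* prints 2 */
	printf("%d\n", *dep_map_addr(flag ? &a : &b));	/* prints 2 */
	return 0;
}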

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-24  0:19 ` [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update Paul E. McKenney
@ 2023-03-24 19:10   ` Wysocki, Rafael J
  2023-03-24 20:11     ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Wysocki, Rafael J @ 2023-03-24 19:10 UTC (permalink / raw)
  To: Paul E. McKenney, rcu
  Cc: linux-kernel, kernel-team, rostedt, hch,
	Michał Mirosław, Dmitry Osipenko

On 3/24/2023 1:19 AM, Paul E. McKenney wrote:
> The current srcu_struct structure is on the order of 200 bytes in size
> (depending on architecture and .config), which is much better than the
> old-style 26K bytes, but still all too inconvenient when one is trying
> to achieve good cache locality on a fastpath involving SRCU readers.
>
> However, only a few fields in srcu_struct are used by SRCU readers.
> The remaining fields could be offloaded to a new srcu_update
> structure, thus shrinking the srcu_struct structure down to a few
> tens of bytes.  This commit begins this noble quest, a quest that is
> complicated by open-coded initialization of the srcu_struct within the
> srcu_notifier_head structure.  This complication is addressed by updating
> the srcu_notifier_head structure's open coding, given that there does
> not appear to be a straightforward way of abstracting that initialization.
>
> This commit moves only the ->node pointer to srcu_update.  Later commits
> will move additional fields.
>
> [ paulmck: Fold in qiang1.zhang@intel.com's memory-leak fix. ]
>
> Link: https://lore.kernel.org/all/20230320055751.4120251-1-qiang1.zhang@intel.com/
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>

Fine with me.

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


> ---
>   include/linux/notifier.h |  5 ++++-
>   include/linux/srcutiny.h |  6 +++---
>   include/linux/srcutree.h | 27 ++++++++++++++++++---------
>   kernel/rcu/rcu.h         |  6 ++++--
>   kernel/rcu/srcutree.c    | 28 +++++++++++++++++++---------
>   5 files changed, 48 insertions(+), 24 deletions(-)
>
> diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> index aef88c2d1173..2aba75145144 100644
> --- a/include/linux/notifier.h
> +++ b/include/linux/notifier.h
> @@ -73,6 +73,9 @@ struct raw_notifier_head {
>   
>   struct srcu_notifier_head {
>   	struct mutex mutex;
> +#ifdef CONFIG_TREE_SRCU
> +	struct srcu_usage srcuu;
> +#endif
>   	struct srcu_struct srcu;
>   	struct notifier_block __rcu *head;
>   };
> @@ -107,7 +110,7 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
>   	{							\
>   		.mutex = __MUTEX_INITIALIZER(name.mutex),	\
>   		.head = NULL,					\
> -		.srcu = __SRCU_STRUCT_INIT(name.srcu, pcpu),	\
> +		.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
>   	}
>   
>   #define ATOMIC_NOTIFIER_HEAD(name)				\
> diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
> index 5aa5e0faf6a1..ebd72491af99 100644
> --- a/include/linux/srcutiny.h
> +++ b/include/linux/srcutiny.h
> @@ -31,7 +31,7 @@ struct srcu_struct {
>   
>   void srcu_drive_gp(struct work_struct *wp);
>   
> -#define __SRCU_STRUCT_INIT(name, __ignored)				\
> +#define __SRCU_STRUCT_INIT(name, __ignored, ___ignored)			\
>   {									\
>   	.srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq),	\
>   	.srcu_cb_tail = &name.srcu_cb_head,				\
> @@ -44,9 +44,9 @@ void srcu_drive_gp(struct work_struct *wp);
>    * Tree SRCU, which needs some per-CPU data.
>    */
>   #define DEFINE_SRCU(name) \
> -	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
> +	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
>   #define DEFINE_STATIC_SRCU(name) \
> -	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
> +	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
>   
>   void synchronize_srcu(struct srcu_struct *ssp);
>   
> diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> index 428480152375..2689e64024bb 100644
> --- a/include/linux/srcutree.h
> +++ b/include/linux/srcutree.h
> @@ -57,11 +57,17 @@ struct srcu_node {
>   	int grphi;				/* Biggest CPU for node. */
>   };
>   
> +/*
> + * Per-SRCU-domain structure, update-side data linked from srcu_struct.
> + */
> +struct srcu_usage {
> +	struct srcu_node *node;			/* Combining tree. */
> +};
> +
>   /*
>    * Per-SRCU-domain structure, similar in function to rcu_state.
>    */
>   struct srcu_struct {
> -	struct srcu_node *node;			/* Combining tree. */
>   	struct srcu_node *level[RCU_NUM_LVLS + 1];
>   						/* First node at each level. */
>   	int srcu_size_state;			/* Small-to-big transition state. */
> @@ -90,6 +96,7 @@ struct srcu_struct {
>   	unsigned long reschedule_count;
>   	struct delayed_work work;
>   	struct lockdep_map dep_map;
> +	struct srcu_usage *srcu_sup;		/* Update-side data. */
>   };
>   
>   // Values for size state variable (->srcu_size_state).  Once the state
> @@ -121,24 +128,24 @@ struct srcu_struct {
>   #define SRCU_STATE_SCAN1	1
>   #define SRCU_STATE_SCAN2	2
>   
> -#define __SRCU_STRUCT_INIT_COMMON(name)								\
> +#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
>   	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
>   	.srcu_gp_seq_needed = -1UL,								\
>   	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> +	.srcu_sup = &usage_name,								\
>   	__SRCU_DEP_MAP_INIT(name)
>   
> -#define __SRCU_STRUCT_INIT_MODULE(name)								\
> +#define __SRCU_STRUCT_INIT_MODULE(name, usage_name)						\
>   {												\
> -	__SRCU_STRUCT_INIT_COMMON(name)								\
> +	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
>   }
>   
> -#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> +#define __SRCU_STRUCT_INIT(name, usage_name, pcpu_name)						\
>   {												\
>   	.sda = &pcpu_name,									\
> -	__SRCU_STRUCT_INIT_COMMON(name)								\
> +	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
>   }
>   
> -
>   /*
>    * Define and initialize a srcu struct at build time.
>    * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> @@ -160,15 +167,17 @@ struct srcu_struct {
>    */
>   #ifdef MODULE
>   # define __DEFINE_SRCU(name, is_static)								\
> -	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> +	static struct srcu_usage name##_srcu_usage;						\
> +	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
>   	extern struct srcu_struct * const __srcu_struct_##name;					\
>   	struct srcu_struct * const __srcu_struct_##name						\
>   		__section("___srcu_struct_ptrs") = &name
>   #else
>   # define __DEFINE_SRCU(name, is_static)								\
>   	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
> +	static struct srcu_usage name##_srcu_usage;						\
>   	is_static struct srcu_struct name =							\
> -		__SRCU_STRUCT_INIT(name, name##_srcu_data)
> +		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
>   #endif
>   #define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
>   #define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index a3adcf9a9919..4a1b9622598b 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -378,11 +378,13 @@ extern void rcu_init_geometry(void);
>    * specified state structure (for SRCU) or the only rcu_state structure
>    * (for RCU).
>    */
> -#define srcu_for_each_node_breadth_first(sp, rnp) \
> +#define _rcu_for_each_node_breadth_first(sp, rnp) \
>   	for ((rnp) = &(sp)->node[0]; \
>   	     (rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++)
>   #define rcu_for_each_node_breadth_first(rnp) \
> -	srcu_for_each_node_breadth_first(&rcu_state, rnp)
> +	_rcu_for_each_node_breadth_first(&rcu_state, rnp)
> +#define srcu_for_each_node_breadth_first(ssp, rnp) \
> +	_rcu_for_each_node_breadth_first(ssp->srcu_sup, rnp)
>   
>   /*
>    * Scan the leaves of the rcu_node hierarchy for the rcu_state structure.
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 7a6d9452a5d0..ad1d5ca42a99 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -173,12 +173,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
>   
>   	/* Initialize geometry if it has not already been initialized. */
>   	rcu_init_geometry();
> -	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), gfp_flags);
> -	if (!ssp->node)
> +	ssp->srcu_sup->node = kcalloc(rcu_num_nodes, sizeof(*ssp->srcu_sup->node), gfp_flags);
> +	if (!ssp->srcu_sup->node)
>   		return false;
>   
>   	/* Work out the overall tree geometry. */
> -	ssp->level[0] = &ssp->node[0];
> +	ssp->level[0] = &ssp->srcu_sup->node[0];
>   	for (i = 1; i < rcu_num_lvls; i++)
>   		ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
>   	rcu_init_levelspread(levelspread, num_rcu_lvl);
> @@ -195,7 +195,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
>   		snp->srcu_gp_seq_needed_exp = SRCU_SNP_INIT_SEQ;
>   		snp->grplo = -1;
>   		snp->grphi = -1;
> -		if (snp == &ssp->node[0]) {
> +		if (snp == &ssp->srcu_sup->node[0]) {
>   			/* Root node, special case. */
>   			snp->srcu_parent = NULL;
>   			continue;
> @@ -236,8 +236,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
>    */
>   static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>   {
> +	if (!is_static)
> +		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
> +	if (!ssp->srcu_sup)
> +		return -ENOMEM;
>   	ssp->srcu_size_state = SRCU_SIZE_SMALL;
> -	ssp->node = NULL;
> +	ssp->srcu_sup->node = NULL;
>   	mutex_init(&ssp->srcu_cb_mutex);
>   	mutex_init(&ssp->srcu_gp_mutex);
>   	ssp->srcu_idx = 0;
> @@ -249,8 +253,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>   	ssp->sda_is_static = is_static;
>   	if (!is_static)
>   		ssp->sda = alloc_percpu(struct srcu_data);
> -	if (!ssp->sda)
> +	if (!ssp->sda) {
> +		if (!is_static)
> +			kfree(ssp->srcu_sup);
>   		return -ENOMEM;
> +	}
>   	init_srcu_struct_data(ssp);
>   	ssp->srcu_gp_seq_needed_exp = 0;
>   	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
> @@ -259,6 +266,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>   			if (!ssp->sda_is_static) {
>   				free_percpu(ssp->sda);
>   				ssp->sda = NULL;
> +				kfree(ssp->srcu_sup);
>   				return -ENOMEM;
>   			}
>   		} else {
> @@ -656,13 +664,15 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
>   			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
>   		return; /* Caller forgot to stop doing call_srcu()? */
>   	}
> +	kfree(ssp->srcu_sup->node);
> +	ssp->srcu_sup->node = NULL;
> +	ssp->srcu_size_state = SRCU_SIZE_SMALL;
>   	if (!ssp->sda_is_static) {
>   		free_percpu(ssp->sda);
>   		ssp->sda = NULL;
> +		kfree(ssp->srcu_sup);
> +		ssp->srcu_sup = NULL;
>   	}
> -	kfree(ssp->node);
> -	ssp->node = NULL;
> -	ssp->srcu_size_state = SRCU_SIZE_SMALL;
>   }
>   EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
>   

^ permalink raw reply	[flat|nested] 30+ messages in thread
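
A rough userspace model of the split this patch begins, based only on what the commit message above describes: reader-hot fields stay in the small structure, update-side state moves behind a single pointer, build-time definitions get a statically allocated usage block, and run-time initialization allocates one (and is the only step that can fail). All names here (struct domain, struct usage, domain_init(), domain_cleanup()) are placeholders, not kernel APIs:

#include <stdio.h>
#include <stdlib.h>

/* Update-side ("cold") state, touched only on the slow path. */
struct usage {
	unsigned long gp_seq;
	unsigned long gp_seq_needed;
};

/* Reader-facing ("hot") state stays small, so it packs nicely into the
 * cache line of whatever structure embeds it. */
struct domain {
	unsigned int idx;	/* touched by readers */
	struct usage *sup;	/* everything else lives behind this */
};

/* Build-time definition: both pieces are static, mirroring what a
 * DEFINE_SRCU()-style initializer would emit. */
static struct usage  static_usage;
static struct domain static_domain = { .idx = 0, .sup = &static_usage };

/* Run-time init: only the dynamic case allocates the usage block, and
 * allocation failure is the only way this can fail. */
static int domain_init(struct domain *d)
{
	d->sup = calloc(1, sizeof(*d->sup));
	if (!d->sup)
		return -1;
	d->idx = 0;
	return 0;
}

static void domain_cleanup(struct domain *d)
{
	free(d->sup);
	d->sup = NULL;
}

int main(void)
{
	struct domain dyn;

	printf("sizeof(struct domain) = %zu\n", sizeof(struct domain));
	printf("static gp_seq_needed  = %lu\n",
	       static_domain.sup->gp_seq_needed);

	if (domain_init(&dyn))
		return 1;
	dyn.sup->gp_seq_needed = 42;
	domain_cleanup(&dyn);
	return 0;
}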

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-24 19:10   ` Wysocki, Rafael J
@ 2023-03-24 20:11     ` Paul E. McKenney
  2023-03-26 23:18       ` Christoph Hellwig
  0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-24 20:11 UTC (permalink / raw)
  To: Wysocki, Rafael J
  Cc: rcu, linux-kernel, kernel-team, rostedt, hch,
	Michał Mirosław, Dmitry Osipenko

On Fri, Mar 24, 2023 at 08:10:31PM +0100, Wysocki, Rafael J wrote:
> On 3/24/2023 1:19 AM, Paul E. McKenney wrote:
> > The current srcu_struct structure is on the order of 200 bytes in size
> > (depending on architecture and .config), which is much better than the
> > old-style 26K bytes, but still all too inconvenient when one is trying
> > to achieve good cache locality on a fastpath involving SRCU readers.
> > 
> > However, only a few fields in srcu_struct are used by SRCU readers.
> > The remaining fields could be offloaded to a new srcu_update
> > structure, thus shrinking the srcu_struct structure down to a few
> > tens of bytes.  This commit begins this noble quest, a quest that is
> > complicated by open-coded initialization of the srcu_struct within the
> > srcu_notifier_head structure.  This complication is addressed by updating
> > the srcu_notifier_head structure's open coding, given that there does
> > not appear to be a straightforward way of abstracting that initialization.
> > 
> > This commit moves only the ->node pointer to srcu_update.  Later commits
> > will move additional fields.
> > 
> > [ paulmck: Fold in qiang1.zhang@intel.com's memory-leak fix. ]
> > 
> > Link: https://lore.kernel.org/all/20230320055751.4120251-1-qiang1.zhang@intel.com/
> > Suggested-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> > Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
> > Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> 
> Fine with me.
> 
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Thank you!  I will add this on my next rebase.

It is possible that this will be v6.5 rather than v6.4 material.

							Thanx, Paul

> > ---
> >   include/linux/notifier.h |  5 ++++-
> >   include/linux/srcutiny.h |  6 +++---
> >   include/linux/srcutree.h | 27 ++++++++++++++++++---------
> >   kernel/rcu/rcu.h         |  6 ++++--
> >   kernel/rcu/srcutree.c    | 28 +++++++++++++++++++---------
> >   5 files changed, 48 insertions(+), 24 deletions(-)
> > 
> > diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> > index aef88c2d1173..2aba75145144 100644
> > --- a/include/linux/notifier.h
> > +++ b/include/linux/notifier.h
> > @@ -73,6 +73,9 @@ struct raw_notifier_head {
> >   struct srcu_notifier_head {
> >   	struct mutex mutex;
> > +#ifdef CONFIG_TREE_SRCU
> > +	struct srcu_usage srcuu;
> > +#endif
> >   	struct srcu_struct srcu;
> >   	struct notifier_block __rcu *head;
> >   };
> > @@ -107,7 +110,7 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
> >   	{							\
> >   		.mutex = __MUTEX_INITIALIZER(name.mutex),	\
> >   		.head = NULL,					\
> > -		.srcu = __SRCU_STRUCT_INIT(name.srcu, pcpu),	\
> > +		.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
> >   	}
> >   #define ATOMIC_NOTIFIER_HEAD(name)				\
> > diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
> > index 5aa5e0faf6a1..ebd72491af99 100644
> > --- a/include/linux/srcutiny.h
> > +++ b/include/linux/srcutiny.h
> > @@ -31,7 +31,7 @@ struct srcu_struct {
> >   void srcu_drive_gp(struct work_struct *wp);
> > -#define __SRCU_STRUCT_INIT(name, __ignored)				\
> > +#define __SRCU_STRUCT_INIT(name, __ignored, ___ignored)			\
> >   {									\
> >   	.srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq),	\
> >   	.srcu_cb_tail = &name.srcu_cb_head,				\
> > @@ -44,9 +44,9 @@ void srcu_drive_gp(struct work_struct *wp);
> >    * Tree SRCU, which needs some per-CPU data.
> >    */
> >   #define DEFINE_SRCU(name) \
> > -	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
> > +	struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
> >   #define DEFINE_STATIC_SRCU(name) \
> > -	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
> > +	static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
> >   void synchronize_srcu(struct srcu_struct *ssp);
> > diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> > index 428480152375..2689e64024bb 100644
> > --- a/include/linux/srcutree.h
> > +++ b/include/linux/srcutree.h
> > @@ -57,11 +57,17 @@ struct srcu_node {
> >   	int grphi;				/* Biggest CPU for node. */
> >   };
> > +/*
> > + * Per-SRCU-domain structure, update-side data linked from srcu_struct.
> > + */
> > +struct srcu_usage {
> > +	struct srcu_node *node;			/* Combining tree. */
> > +};
> > +
> >   /*
> >    * Per-SRCU-domain structure, similar in function to rcu_state.
> >    */
> >   struct srcu_struct {
> > -	struct srcu_node *node;			/* Combining tree. */
> >   	struct srcu_node *level[RCU_NUM_LVLS + 1];
> >   						/* First node at each level. */
> >   	int srcu_size_state;			/* Small-to-big transition state. */
> > @@ -90,6 +96,7 @@ struct srcu_struct {
> >   	unsigned long reschedule_count;
> >   	struct delayed_work work;
> >   	struct lockdep_map dep_map;
> > +	struct srcu_usage *srcu_sup;		/* Update-side data. */
> >   };
> >   // Values for size state variable (->srcu_size_state).  Once the state
> > @@ -121,24 +128,24 @@ struct srcu_struct {
> >   #define SRCU_STATE_SCAN1	1
> >   #define SRCU_STATE_SCAN2	2
> > -#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > +#define __SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
> >   	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> >   	.srcu_gp_seq_needed = -1UL,								\
> >   	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> > +	.srcu_sup = &usage_name,								\
> >   	__SRCU_DEP_MAP_INIT(name)
> > -#define __SRCU_STRUCT_INIT_MODULE(name)								\
> > +#define __SRCU_STRUCT_INIT_MODULE(name, usage_name)						\
> >   {												\
> > -	__SRCU_STRUCT_INIT_COMMON(name)								\
> > +	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
> >   }
> > -#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> > +#define __SRCU_STRUCT_INIT(name, usage_name, pcpu_name)						\
> >   {												\
> >   	.sda = &pcpu_name,									\
> > -	__SRCU_STRUCT_INIT_COMMON(name)								\
> > +	__SRCU_STRUCT_INIT_COMMON(name, usage_name)						\
> >   }
> > -
> >   /*
> >    * Define and initialize a srcu struct at build time.
> >    * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> > @@ -160,15 +167,17 @@ struct srcu_struct {
> >    */
> >   #ifdef MODULE
> >   # define __DEFINE_SRCU(name, is_static)								\
> > -	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > +	static struct srcu_usage name##_srcu_usage;						\
> > +	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage);	\
> >   	extern struct srcu_struct * const __srcu_struct_##name;					\
> >   	struct srcu_struct * const __srcu_struct_##name						\
> >   		__section("___srcu_struct_ptrs") = &name
> >   #else
> >   # define __DEFINE_SRCU(name, is_static)								\
> >   	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);				\
> > +	static struct srcu_usage name##_srcu_usage;						\
> >   	is_static struct srcu_struct name =							\
> > -		__SRCU_STRUCT_INIT(name, name##_srcu_data)
> > +		__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
> >   #endif
> >   #define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
> >   #define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index a3adcf9a9919..4a1b9622598b 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -378,11 +378,13 @@ extern void rcu_init_geometry(void);
> >    * specified state structure (for SRCU) or the only rcu_state structure
> >    * (for RCU).
> >    */
> > -#define srcu_for_each_node_breadth_first(sp, rnp) \
> > +#define _rcu_for_each_node_breadth_first(sp, rnp) \
> >   	for ((rnp) = &(sp)->node[0]; \
> >   	     (rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++)
> >   #define rcu_for_each_node_breadth_first(rnp) \
> > -	srcu_for_each_node_breadth_first(&rcu_state, rnp)
> > +	_rcu_for_each_node_breadth_first(&rcu_state, rnp)
> > +#define srcu_for_each_node_breadth_first(ssp, rnp) \
> > +	_rcu_for_each_node_breadth_first(ssp->srcu_sup, rnp)
> >   /*
> >    * Scan the leaves of the rcu_node hierarchy for the rcu_state structure.
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 7a6d9452a5d0..ad1d5ca42a99 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -173,12 +173,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
> >   	/* Initialize geometry if it has not already been initialized. */
> >   	rcu_init_geometry();
> > -	ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), gfp_flags);
> > -	if (!ssp->node)
> > +	ssp->srcu_sup->node = kcalloc(rcu_num_nodes, sizeof(*ssp->srcu_sup->node), gfp_flags);
> > +	if (!ssp->srcu_sup->node)
> >   		return false;
> >   	/* Work out the overall tree geometry. */
> > -	ssp->level[0] = &ssp->node[0];
> > +	ssp->level[0] = &ssp->srcu_sup->node[0];
> >   	for (i = 1; i < rcu_num_lvls; i++)
> >   		ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
> >   	rcu_init_levelspread(levelspread, num_rcu_lvl);
> > @@ -195,7 +195,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
> >   		snp->srcu_gp_seq_needed_exp = SRCU_SNP_INIT_SEQ;
> >   		snp->grplo = -1;
> >   		snp->grphi = -1;
> > -		if (snp == &ssp->node[0]) {
> > +		if (snp == &ssp->srcu_sup->node[0]) {
> >   			/* Root node, special case. */
> >   			snp->srcu_parent = NULL;
> >   			continue;
> > @@ -236,8 +236,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
> >    */
> >   static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> >   {
> > +	if (!is_static)
> > +		ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
> > +	if (!ssp->srcu_sup)
> > +		return -ENOMEM;
> >   	ssp->srcu_size_state = SRCU_SIZE_SMALL;
> > -	ssp->node = NULL;
> > +	ssp->srcu_sup->node = NULL;
> >   	mutex_init(&ssp->srcu_cb_mutex);
> >   	mutex_init(&ssp->srcu_gp_mutex);
> >   	ssp->srcu_idx = 0;
> > @@ -249,8 +253,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> >   	ssp->sda_is_static = is_static;
> >   	if (!is_static)
> >   		ssp->sda = alloc_percpu(struct srcu_data);
> > -	if (!ssp->sda)
> > +	if (!ssp->sda) {
> > +		if (!is_static)
> > +			kfree(ssp->srcu_sup);
> >   		return -ENOMEM;
> > +	}
> >   	init_srcu_struct_data(ssp);
> >   	ssp->srcu_gp_seq_needed_exp = 0;
> >   	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
> > @@ -259,6 +266,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> >   			if (!ssp->sda_is_static) {
> >   				free_percpu(ssp->sda);
> >   				ssp->sda = NULL;
> > +				kfree(ssp->srcu_sup);
> >   				return -ENOMEM;
> >   			}
> >   		} else {
> > @@ -656,13 +664,15 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
> >   			rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
> >   		return; /* Caller forgot to stop doing call_srcu()? */
> >   	}
> > +	kfree(ssp->srcu_sup->node);
> > +	ssp->srcu_sup->node = NULL;
> > +	ssp->srcu_size_state = SRCU_SIZE_SMALL;
> >   	if (!ssp->sda_is_static) {
> >   		free_percpu(ssp->sda);
> >   		ssp->sda = NULL;
> > +		kfree(ssp->srcu_sup);
> > +		ssp->srcu_sup = NULL;
> >   	}
> > -	kfree(ssp->node);
> > -	ssp->node = NULL;
> > -	ssp->srcu_size_state = SRCU_SIZE_SMALL;
> >   }
> >   EXPORT_SYMBOL_GPL(cleanup_srcu_struct);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-24 20:11     ` Paul E. McKenney
@ 2023-03-26 23:18       ` Christoph Hellwig
  2023-03-27  3:29         ` Paul E. McKenney
  2023-04-24 21:06         ` Paul E. McKenney
  0 siblings, 2 replies; 30+ messages in thread
From: Christoph Hellwig @ 2023-03-26 23:18 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Wysocki, Rafael J, rcu, linux-kernel, kernel-team, rostedt, hch,
	Michał Mirosław, Dmitry Osipenko

On Fri, Mar 24, 2023 at 01:11:47PM -0700, Paul E. McKenney wrote:
> > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Thank you!  I will add this on my next rebase.
> 
> It is possible that this will be v6.5 rather than v6.4 material.

I was hoping the RCU bits could land in 6.4, so that the block
layer work to take advantage of it can go into 6.5 without cross-tree
dependencies.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-26 23:18       ` Christoph Hellwig
@ 2023-03-27  3:29         ` Paul E. McKenney
  2023-04-24 21:06         ` Paul E. McKenney
  1 sibling, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-27  3:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Wysocki, Rafael J, rcu, linux-kernel, kernel-team, rostedt,
	Michał Mirosław, Dmitry Osipenko

On Mon, Mar 27, 2023 at 01:18:58AM +0200, Christoph Hellwig wrote:
> On Fri, Mar 24, 2023 at 01:11:47PM -0700, Paul E. McKenney wrote:
> > > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Thank you!  I will add this on my next rebase.
> > 
> > It is possible that this will be v6.5 rather than v6.4 material.
> 
> I was hoping the RCU bits could land in 6.4, so that the block
> layer work to take advantage of it can go into 6.5 without cross-tree
> dependencies.

Indeed, that patch series does hit a large chunk of SRCU, so my usual
offer of just giving you the patches is likely to run into trouble.

I will see what we can do about v6.4.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-24  0:19 ` [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct Paul E. McKenney
@ 2023-03-30  4:11   ` Zhang, Qiang1
  2023-03-30 14:55     ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Zhang, Qiang1 @ 2023-03-30  4:11 UTC (permalink / raw)
  To: Paul E. McKenney, rcu
  Cc: linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

>
>Further shrinking the srcu_struct structure is eased by requiring
>that in-module srcu_struct structures rely more heavily on static
>initialization.  In particular, this preserves the property that
>a module-load-time srcu_struct initialization can fail only due
>to memory-allocation failure of the per-CPU srcu_data structures.
>It might also slightly improve robustness by keeping the number of memory
>allocations that must succeed down to the one alloc_percpu() call.
>
>This is in preparation for splitting an srcu_usage structure out
>of the srcu_struct structure.
>
>[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
>
>Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>Cc: Christoph Hellwig <hch@lst.de>
>---
> include/linux/srcutree.h | 19 ++++++++++++++-----
> kernel/rcu/srcutree.c    | 19 +++++++++++++------
> 2 files changed, 27 insertions(+), 11 deletions(-)
>
>diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
>index ac8af12f93b3..428480152375 100644
>--- a/include/linux/srcutree.h
>+++ b/include/linux/srcutree.h
>@@ -121,15 +121,24 @@ struct srcu_struct {
> #define SRCU_STATE_SCAN1	1
> #define SRCU_STATE_SCAN2	2
> 
>-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
>-{												\
>-	.sda = &pcpu_name,									\
>+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> 	.srcu_gp_seq_needed = -1UL,								\
> 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
>-	__SRCU_DEP_MAP_INIT(name)								\
>+	__SRCU_DEP_MAP_INIT(name)
>+
>+#define __SRCU_STRUCT_INIT_MODULE(name)								\
>+{												\
>+	__SRCU_STRUCT_INIT_COMMON(name)								\
> }
> 
>+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
>+{												\
>+	.sda = &pcpu_name,									\
>+	__SRCU_STRUCT_INIT_COMMON(name)								\
>+}
>+
>+
> /*
>  * Define and initialize a srcu struct at build time.
>  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
>@@ -151,7 +160,7 @@ struct srcu_struct {
>  */
> #ifdef MODULE
> # define __DEFINE_SRCU(name, is_static)								\
>-	is_static struct srcu_struct name;							\
>+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> 	extern struct srcu_struct * const __srcu_struct_##name;					\
> 	struct srcu_struct * const __srcu_struct_##name						\
> 		__section("___srcu_struct_ptrs") = &name
>diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>index cd46fe063e50..7a6d9452a5d0 100644
>--- a/kernel/rcu/srcutree.c
>+++ b/kernel/rcu/srcutree.c
>@@ -1895,13 +1895,14 @@ void __init srcu_init(void)
> static int srcu_module_coming(struct module *mod)
> {
> 	int i;
>+	struct srcu_struct *ssp;
> 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
>-	int ret;
> 
> 	for (i = 0; i < mod->num_srcu_structs; i++) {
>-		ret = init_srcu_struct(*(sspp++));
>-		if (WARN_ON_ONCE(ret))
>-			return ret;
>+		ssp = *(sspp++);
>+		ssp->sda = alloc_percpu(struct srcu_data);
>+		if (WARN_ON_ONCE(!ssp->sda))
>+			return -ENOMEM;
> 	}
> 	return 0;
> }
>@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module *mod)
> static void srcu_module_going(struct module *mod)
> {
> 	int i;
>+	struct srcu_struct *ssp;
> 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> 
>-	for (i = 0; i < mod->num_srcu_structs; i++)
>-		cleanup_srcu_struct(*(sspp++));
>+	for (i = 0; i < mod->num_srcu_structs; i++) {
>+		ssp = *(sspp++);
>+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>+				cleanup_srcu_struct(ssp);
>+		free_percpu(ssp->sda);


Hi Paul

About 037b80b8865fb ("srcu: Check for readers at module-exit time ")

--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
                if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
                    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
                        cleanup_srcu_struct(ssp);
-               free_percpu(ssp->sda);
+               else if (!WARN_ON(srcu_readers_active(ssp)))
+                       free_percpu(ssp->sda);

Should the else statement be removed?  like this:

if (!WARN_ON(srcu_readers_active(ssp)))
	free_percpu(ssp->sda);

Thanks
Zqiang


>+	}
> }
> 
> /* Handle one module, either coming or going. */
>-- 
>2.40.0.rc2
>

^ permalink raw reply	[flat|nested] 30+ messages in thread
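
A simplified, self-contained model of the module-load path the quoted commit message describes: with the srcu_struct fully initialized at build time, the module-coming hook only has to perform the one per-CPU-style allocation, so -ENOMEM is the only possible failure. Every identifier below (struct toy_srcu, toy_module_coming(), module_ptrs[]) is invented for illustration and stands in for the real srcu_data/alloc_percpu() machinery:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

/* Toy stand-ins; not kernel types. */
struct toy_data { int ctr; };
struct toy_srcu {
	struct toy_data *sda;		/* the one piece left to runtime */
	unsigned long gp_seq_needed;
};

/* Build-time initialization, mimicking what the static initializer in
 * patch 02 provides: everything but ->sda is already valid. */
static struct toy_srcu module_srcu = { .sda = NULL, .gp_seq_needed = -1UL };

static struct toy_srcu *module_ptrs[] = { &module_srcu };

/* Module-load hook: the only thing that can fail is the per-CPU-style
 * allocation, so the only possible error is -ENOMEM. */
static int toy_module_coming(void)
{
	size_t i;

	for (i = 0; i < sizeof(module_ptrs) / sizeof(module_ptrs[0]); i++) {
		struct toy_srcu *ssp = module_ptrs[i];

		ssp->sda = calloc(4, sizeof(*ssp->sda));	/* pretend 4 CPUs */
		if (!ssp->sda)
			return -ENOMEM;
	}
	return 0;
}

int main(void)
{
	if (toy_module_coming()) {
		fprintf(stderr, "module load would fail with -ENOMEM\n");
		return 1;
	}
	printf("sda allocated, gp_seq_needed = %lu\n",
	       module_srcu.gp_seq_needed);
	free(module_srcu.sda);
	return 0;
}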

* Re: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-30  4:11   ` Zhang, Qiang1
@ 2023-03-30 14:55     ` Paul E. McKenney
  2023-03-30 15:03       ` Zhang, Qiang1
  0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-30 14:55 UTC (permalink / raw)
  To: Zhang, Qiang1
  Cc: rcu, linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

On Thu, Mar 30, 2023 at 04:11:05AM +0000, Zhang, Qiang1 wrote:
> >
> >Further shrinking the srcu_struct structure is eased by requiring
> >that in-module srcu_struct structures rely more heavily on static
> >initialization.  In particular, this preserves the property that
> >a module-load-time srcu_struct initialization can fail only due
> >to memory-allocation failure of the per-CPU srcu_data structures.
> >It might also slightly improve robustness by keeping the number of memory
> >allocations that must succeed down to the one alloc_percpu() call.
> >
> >This is in preparation for splitting an srcu_usage structure out
> >of the srcu_struct structure.
> >
> >[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
> >
> >Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >Cc: Christoph Hellwig <hch@lst.de>
> >---
> > include/linux/srcutree.h | 19 ++++++++++++++-----
> > kernel/rcu/srcutree.c    | 19 +++++++++++++------
> > 2 files changed, 27 insertions(+), 11 deletions(-)
> >
> >diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> >index ac8af12f93b3..428480152375 100644
> >--- a/include/linux/srcutree.h
> >+++ b/include/linux/srcutree.h
> >@@ -121,15 +121,24 @@ struct srcu_struct {
> > #define SRCU_STATE_SCAN1	1
> > #define SRCU_STATE_SCAN2	2
> > 
> >-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >-{												\
> >-	.sda = &pcpu_name,									\
> >+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> > 	.srcu_gp_seq_needed = -1UL,								\
> > 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> >-	__SRCU_DEP_MAP_INIT(name)								\
> >+	__SRCU_DEP_MAP_INIT(name)
> >+
> >+#define __SRCU_STRUCT_INIT_MODULE(name)								\
> >+{												\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > }
> > 
> >+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >+{												\
> >+	.sda = &pcpu_name,									\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> >+}
> >+
> >+
> > /*
> >  * Define and initialize a srcu struct at build time.
> >  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> >@@ -151,7 +160,7 @@ struct srcu_struct {
> >  */
> > #ifdef MODULE
> > # define __DEFINE_SRCU(name, is_static)								\
> >-	is_static struct srcu_struct name;							\
> >+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > 	extern struct srcu_struct * const __srcu_struct_##name;					\
> > 	struct srcu_struct * const __srcu_struct_##name						\
> > 		__section("___srcu_struct_ptrs") = &name
> >diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> >index cd46fe063e50..7a6d9452a5d0 100644
> >--- a/kernel/rcu/srcutree.c
> >+++ b/kernel/rcu/srcutree.c
> >@@ -1895,13 +1895,14 @@ void __init srcu_init(void)
> > static int srcu_module_coming(struct module *mod)
> > {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> >-	int ret;
> > 
> > 	for (i = 0; i < mod->num_srcu_structs; i++) {
> >-		ret = init_srcu_struct(*(sspp++));
> >-		if (WARN_ON_ONCE(ret))
> >-			return ret;
> >+		ssp = *(sspp++);
> >+		ssp->sda = alloc_percpu(struct srcu_data);
> >+		if (WARN_ON_ONCE(!ssp->sda))
> >+			return -ENOMEM;
> > 	}
> > 	return 0;
> > }
> >@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module *mod)
> > static void srcu_module_going(struct module *mod)
> > {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > 
> >-	for (i = 0; i < mod->num_srcu_structs; i++)
> >-		cleanup_srcu_struct(*(sspp++));
> >+	for (i = 0; i < mod->num_srcu_structs; i++) {
> >+		ssp = *(sspp++);
> >+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> >+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> >+				cleanup_srcu_struct(ssp);
> >+		free_percpu(ssp->sda);
> 
> 
> Hi Paul
> 
> About 037b80b8865fb ("srcu: Check for readers at module-exit time ")
> 
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
>                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>                         cleanup_srcu_struct(ssp);
> -               free_percpu(ssp->sda);
> +               else if (!WARN_ON(srcu_readers_active(ssp)))
> +                       free_percpu(ssp->sda);
> 
> Should the else statement be removed?  like this:
> 
> if (!WARN_ON(srcu_readers_active(ssp)))
> 	free_percpu(ssp->sda);

Mightn't that cause us to double-free ssp->sda?  Once in free_percpu(),
and before that in cleanup_srcu_struct()?

							Thanx, Paul

> Thanks
> Zqiang
> 
> 
> >+	}
> > }
> > 
> > /* Handle one module, either coming or going. */
> >-- 
> >2.40.0.rc2
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-30 14:55     ` Paul E. McKenney
@ 2023-03-30 15:03       ` Zhang, Qiang1
  2023-03-30 15:20         ` Zhang, Qiang1
  0 siblings, 1 reply; 30+ messages in thread
From: Zhang, Qiang1 @ 2023-03-30 15:03 UTC (permalink / raw)
  To: paulmck; +Cc: rcu, linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

> >Further shrinking the srcu_struct structure is eased by requiring
> >that in-module srcu_struct structures rely more heavily on static
> >initialization.  In particular, this preserves the property that
> >a module-load-time srcu_struct initialization can fail only due
> >to memory-allocation failure of the per-CPU srcu_data structures.
> >It might also slightly improve robustness by keeping the number of memory
> >allocations that must succeed down to the one alloc_percpu() call.
> >
> >This is in preparation for splitting an srcu_usage structure out
> >of the srcu_struct structure.
> >
> >[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
> >
> >Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >Cc: Christoph Hellwig <hch@lst.de>
> >---
> > include/linux/srcutree.h | 19 ++++++++++++++-----
> > kernel/rcu/srcutree.c    | 19 +++++++++++++------
> > 2 files changed, 27 insertions(+), 11 deletions(-)
> >
> >diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
> >index ac8af12f93b3..428480152375 100644
> >--- a/include/linux/srcutree.h
> >+++ b/include/linux/srcutree.h
> >@@ -121,15 +121,24 @@ struct srcu_struct {
> > #define SRCU_STATE_SCAN1	1
> > #define SRCU_STATE_SCAN2	2
> > 
> >-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >-{												\
> >-	.sda = &pcpu_name,									\
> >+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> > 	.srcu_gp_seq_needed = -1UL,								\
> > 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> >-	__SRCU_DEP_MAP_INIT(name)								\
> >+	__SRCU_DEP_MAP_INIT(name)
> >+
> >+#define __SRCU_STRUCT_INIT_MODULE(name)								\
> >+{												\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > }
> > 
> >+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >+{												\
> >+	.sda = &pcpu_name,									\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> >+}
> >+
> >+
> > /*
> >  * Define and initialize a srcu struct at build time.
> >  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> >@@ -151,7 +160,7 @@ struct srcu_struct {
> >  */
> > #ifdef MODULE
> > # define __DEFINE_SRCU(name, is_static)								\
> >-	is_static struct srcu_struct name;							\
> >+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > 	extern struct srcu_struct * const __srcu_struct_##name;					\
> > 	struct srcu_struct * const __srcu_struct_##name						\
> > 		__section("___srcu_struct_ptrs") = &name
> >diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> >index cd46fe063e50..7a6d9452a5d0 100644
> >--- a/kernel/rcu/srcutree.c
> >+++ b/kernel/rcu/srcutree.c
> >@@ -1895,13 +1895,14 @@ void __init srcu_init(void)
> > static int srcu_module_coming(struct module *mod)
> > {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> >-	int ret;
> > 
> > 	for (i = 0; i < mod->num_srcu_structs; i++) {
> >-		ret = init_srcu_struct(*(sspp++));
> >-		if (WARN_ON_ONCE(ret))
> >-			return ret;
> >+		ssp = *(sspp++);
> >+		ssp->sda = alloc_percpu(struct srcu_data);
> >+		if (WARN_ON_ONCE(!ssp->sda))
> >+			return -ENOMEM;
> > 	}
> > 	return 0;
> > }
> >@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module *mod)
> > static void srcu_module_going(struct module *mod)
> > {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > 
> >-	for (i = 0; i < mod->num_srcu_structs; i++)
> >-		cleanup_srcu_struct(*(sspp++));
> >+	for (i = 0; i < mod->num_srcu_structs; i++) {
> >+		ssp = *(sspp++);
> >+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> >+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> >+				cleanup_srcu_struct(ssp);
> >+		free_percpu(ssp->sda);
> 
> 
> Hi Paul
> 
> About 037b80b8865fb ("srcu: Check for readers at module-exit time ")
> 
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
>                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>                         cleanup_srcu_struct(ssp);


Since srcu_sup->sda_is_static is true here, cleanup_srcu_struct() will not free ssp->sda.


> -               free_percpu(ssp->sda);
> +               else if (!WARN_ON(srcu_readers_active(ssp)))
> +                       free_percpu(ssp->sda);
> 
> Should the else statement be removed?  like this:
> 
> if (!WARN_ON(srcu_readers_active(ssp)))
> 	free_percpu(ssp->sda);
>
>Mightn't that cause us to double-free ssp->sda?  Once in free_percpu(),
>and before that in cleanup_srcu_struct()?

Thanks
Zqiang

>
>							Thanx, Paul
>
> Thanks
> Zqiang
> 
> 
> >+	}
> > }
> > 
> > /* Handle one module, either coming or going. */
> >-- 
> >2.40.0.rc2
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* RE: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-30 15:03       ` Zhang, Qiang1
@ 2023-03-30 15:20         ` Zhang, Qiang1
  2023-03-30 17:05           ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Zhang, Qiang1 @ 2023-03-30 15:20 UTC (permalink / raw)
  To: Zhang, Qiang1, paulmck
  Cc: rcu, linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

> >Further shrinking the srcu_struct structure is eased by requiring 
> >that in-module srcu_struct structures rely more heavily on static 
> >initialization.  In particular, this preserves the property that a 
> >module-load-time srcu_struct initialization can fail only due to 
> >memory-allocation failure of the per-CPU srcu_data structures.
> >It might also slightly improve robustness by keeping the number of
> >memory allocations that must succeed down to the one alloc_percpu() call.
> >
> >This is in preparation for splitting an srcu_usage structure out of 
> >the srcu_struct structure.
> >
> >[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
> >
> >Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >Cc: Christoph Hellwig <hch@lst.de>
> >---
> > include/linux/srcutree.h | 19 ++++++++++++++-----
> > kernel/rcu/srcutree.c    | 19 +++++++++++++------
> > 2 files changed, 27 insertions(+), 11 deletions(-)
> >
> >diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h 
> >index ac8af12f93b3..428480152375 100644
> >--- a/include/linux/srcutree.h
> >+++ b/include/linux/srcutree.h
> >@@ -121,15 +121,24 @@ struct srcu_struct {
> > #define SRCU_STATE_SCAN1	1
> > #define SRCU_STATE_SCAN2	2
> > 
> >-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >-{												\
> >-	.sda = &pcpu_name,									\
> >+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> > 	.srcu_gp_seq_needed = -1UL,								\
> > 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> >-	__SRCU_DEP_MAP_INIT(name)								\
> >+	__SRCU_DEP_MAP_INIT(name)
> >+
> >+#define __SRCU_STRUCT_INIT_MODULE(name)								\
> >+{												\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > }
> > 
> >+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> >+{												\
> >+	.sda = &pcpu_name,									\
> >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> >+}
> >+
> >+
> > /*
> >  * Define and initialize a srcu struct at build time.
> >  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> >@@ -151,7 +160,7 @@ struct srcu_struct {
> >  */
> > #ifdef MODULE
> > # define __DEFINE_SRCU(name, is_static)								\
> >-	is_static struct srcu_struct name;							\
> >+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > 	extern struct srcu_struct * const __srcu_struct_##name;					\
> > 	struct srcu_struct * const __srcu_struct_##name						\
> > 		__section("___srcu_struct_ptrs") = &name diff --git 
> >a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 
> >cd46fe063e50..7a6d9452a5d0 100644
> >--- a/kernel/rcu/srcutree.c
> >+++ b/kernel/rcu/srcutree.c
> >@@ -1895,13 +1895,14 @@ void __init srcu_init(void)  static int 
> >srcu_module_coming(struct module *mod)  {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> >-	int ret;
> > 
> > 	for (i = 0; i < mod->num_srcu_structs; i++) {
> >-		ret = init_srcu_struct(*(sspp++));
> >-		if (WARN_ON_ONCE(ret))
> >-			return ret;
> >+		ssp = *(sspp++);
> >+		ssp->sda = alloc_percpu(struct srcu_data);
> >+		if (WARN_ON_ONCE(!ssp->sda))
> >+			return -ENOMEM;
> > 	}
> > 	return 0;
> > }
> >@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module 
> >*mod)  static void srcu_module_going(struct module *mod)  {
> > 	int i;
> >+	struct srcu_struct *ssp;
> > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > 
> >-	for (i = 0; i < mod->num_srcu_structs; i++)
> >-		cleanup_srcu_struct(*(sspp++));
> >+	for (i = 0; i < mod->num_srcu_structs; i++) {
> >+		ssp = *(sspp++);
> >+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> >+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> >+				cleanup_srcu_struct(ssp);
> >+		free_percpu(ssp->sda);
> 
> 
> Hi Paul
> 
> About 037b80b8865fb ("srcu: Check for readers at module-exit time ")
> 
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
>                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>                         cleanup_srcu_struct(ssp);


Since srcu_sup->sda_is_static is true here, cleanup_srcu_struct() will not free ssp->sda.


> -               free_percpu(ssp->sda);
> +               else if (!WARN_ON(srcu_readers_active(ssp)))
> +                       free_percpu(ssp->sda);
> 
> Should the else statement be removed?  like this:
> 
> if (!WARN_ON(srcu_readers_active(ssp)))
> 	free_percpu(ssp->sda);
>
>Mightn't that cause us to double-free ssp->sda?  Once in free_percpu(), 
>and before that in cleanup_srcu_struct()?


How about this?  Any thoughts?

--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1937,7 +1937,7 @@ static void srcu_module_going(struct module *mod)
                if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
                    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
                        cleanup_srcu_struct(ssp);
-               else if (!WARN_ON(srcu_readers_active(ssp)))
+               if (!WARN_ON(srcu_readers_active(ssp)))
                        free_percpu(ssp->sda);
        }
 }

Thanks
Zqiang

>
>							Thanx, Paul
>
> Thanks
> Zqiang
> 
> 
> >+	}
> > }
> > 
> > /* Handle one module, either coming or going. */
> >--
> >2.40.0.rc2
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct
  2023-03-30 15:20         ` Zhang, Qiang1
@ 2023-03-30 17:05           ` Paul E. McKenney
  0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-03-30 17:05 UTC (permalink / raw)
  To: Zhang, Qiang1
  Cc: rcu, linux-kernel, kernel-team, rostedt, hch, qiang.zhang1211

On Thu, Mar 30, 2023 at 03:20:15PM +0000, Zhang, Qiang1 wrote:
> > >Further shrinking the srcu_struct structure is eased by requiring 
> > >that in-module srcu_struct structures rely more heavily on static 
> > >initialization.  In particular, this preserves the property that a 
> > >module-load-time srcu_struct initialization can fail only due to 
> > >memory-allocation failure of the per-CPU srcu_data structures.
> > >It might also slightly improve robustness by keeping the number of
> > >memory allocations that must succeed down to the one alloc_percpu() call.
> > >
> > >This is in preparation for splitting an srcu_usage structure out of 
> > >the srcu_struct structure.
> > >
> > >[ paulmck: Fold in qiang1.zhang@intel.com feedback. ]
> > >
> > >Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > >Cc: Christoph Hellwig <hch@lst.de>
> > >---
> > > include/linux/srcutree.h | 19 ++++++++++++++-----
> > > kernel/rcu/srcutree.c    | 19 +++++++++++++------
> > > 2 files changed, 27 insertions(+), 11 deletions(-)
> > >
> > >diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h 
> > >index ac8af12f93b3..428480152375 100644
> > >--- a/include/linux/srcutree.h
> > >+++ b/include/linux/srcutree.h
> > >@@ -121,15 +121,24 @@ struct srcu_struct {
> > > #define SRCU_STATE_SCAN1	1
> > > #define SRCU_STATE_SCAN2	2
> > > 
> > >-#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> > >-{												\
> > >-	.sda = &pcpu_name,									\
> > >+#define __SRCU_STRUCT_INIT_COMMON(name)								\
> > > 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),						\
> > > 	.srcu_gp_seq_needed = -1UL,								\
> > > 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),					\
> > >-	__SRCU_DEP_MAP_INIT(name)								\
> > >+	__SRCU_DEP_MAP_INIT(name)
> > >+
> > >+#define __SRCU_STRUCT_INIT_MODULE(name)								\
> > >+{												\
> > >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > > }
> > > 
> > >+#define __SRCU_STRUCT_INIT(name, pcpu_name)							\
> > >+{												\
> > >+	.sda = &pcpu_name,									\
> > >+	__SRCU_STRUCT_INIT_COMMON(name)								\
> > >+}
> > >+
> > >+
> > > /*
> > >  * Define and initialize a srcu struct at build time.
> > >  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
> > >@@ -151,7 +160,7 @@ struct srcu_struct {
> > >  */
> > > #ifdef MODULE
> > > # define __DEFINE_SRCU(name, is_static)								\
> > >-	is_static struct srcu_struct name;							\
> > >+	is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name);			\
> > > 	extern struct srcu_struct * const __srcu_struct_##name;					\
> > > 	struct srcu_struct * const __srcu_struct_##name						\
> > > 		__section("___srcu_struct_ptrs") = &name diff --git 
> > >a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 
> > >cd46fe063e50..7a6d9452a5d0 100644
> > >--- a/kernel/rcu/srcutree.c
> > >+++ b/kernel/rcu/srcutree.c
> > >@@ -1895,13 +1895,14 @@ void __init srcu_init(void)  static int 
> > >srcu_module_coming(struct module *mod)  {
> > > 	int i;
> > >+	struct srcu_struct *ssp;
> > > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > >-	int ret;
> > > 
> > > 	for (i = 0; i < mod->num_srcu_structs; i++) {
> > >-		ret = init_srcu_struct(*(sspp++));
> > >-		if (WARN_ON_ONCE(ret))
> > >-			return ret;
> > >+		ssp = *(sspp++);
> > >+		ssp->sda = alloc_percpu(struct srcu_data);
> > >+		if (WARN_ON_ONCE(!ssp->sda))
> > >+			return -ENOMEM;
> > > 	}
> > > 	return 0;
> > > }
> > >@@ -1910,10 +1911,16 @@ static int srcu_module_coming(struct module 
> > >*mod)  static void srcu_module_going(struct module *mod)  {
> > > 	int i;
> > >+	struct srcu_struct *ssp;
> > > 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > > 
> > >-	for (i = 0; i < mod->num_srcu_structs; i++)
> > >-		cleanup_srcu_struct(*(sspp++));
> > >+	for (i = 0; i < mod->num_srcu_structs; i++) {
> > >+		ssp = *(sspp++);
> > >+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> > >+		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> > >+				cleanup_srcu_struct(ssp);
> > >+		free_percpu(ssp->sda);
> > 
> > 
> > Hi Paul
> > 
> > About 037b80b8865fb ("srcu: Check for readers at module-exit time ")
> > 
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
> >                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> >                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> >                         cleanup_srcu_struct(ssp);
> 
> 
> > Since srcu_sup->sda_is_static is true here, cleanup_srcu_struct() will not free ssp->sda.

Very good, thank you!  I will fold your suggested fix into this commit:

037b80b8865f ("srcu: Check for readers at module-exit time")

							Thanx, Paul

> > -               free_percpu(ssp->sda);
> > +               else if (!WARN_ON(srcu_readers_active(ssp)))
> > +                       free_percpu(ssp->sda);
> > 
> > Should the else statement be removed?  like this:
> > 
> > if (!WARN_ON(srcu_readers_active(ssp)))
> > 	free_percpu(ssp->sda);
> >
> >Mightn't that cause us to double-free ssp->sda?  Once in free_percpu(), 
> >and before that in cleanup_srcu_struct()?
> 
> 
> how about this? any thought?
> 
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1937,7 +1937,7 @@ static void srcu_module_going(struct module *mod)
>                 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
>                     !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
>                         cleanup_srcu_struct(ssp);
> -               else if (!WARN_ON(srcu_readers_active(ssp)))
> +               if (!WARN_ON(srcu_readers_active(ssp)))
>                         free_percpu(ssp->sda);
>         }
>  }
> 
> Thanks
> Zqiang
> 
> >
> >							Thanx, Paul
> >
> > Thanks
> > Zqiang
> > 
> > 
> > >+	}
> > > }
> > > 
> > > /* Handle one module, either coming or going. */
> > >--
> > >2.40.0.rc2
> > >

^ permalink raw reply	[flat|nested] 30+ messages in thread
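
To make the control flow the thread converges on easier to follow, here is a toy userspace model (not the actual srcu_module_going() code; struct toy_srcu, toy_cleanup() and toy_module_going() are made-up names). Because sda_is_static is always set for in-module srcu_structs, cleanup leaves ->sda alone, so freeing it afterwards under the reader check is not a double free and no longer needs to sit in an else branch:

#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for the objects under discussion; not kernel types. */
struct toy_srcu {
	void *sda;          /* models the per-CPU srcu_data array            */
	bool sda_is_static; /* always true for in-module srcu_structs        */
	bool ever_used;     /* models srcu_gp_seq_needed leaving its -1 init  */
	int  readers;       /* models srcu_readers_active()                   */
};

static void toy_cleanup(struct toy_srcu *ssp)
{
	/* With sda_is_static set, cleanup leaves ->sda alone, which is
	 * why freeing it afterwards is not a double free. */
	printf("cleanup: ->sda %s freed here\n",
	       ssp->sda_is_static ? "NOT" : "would be");
}

/* Module-going path in the shape the thread converges on: clean up only
 * if the domain was ever used, but free ->sda whenever no readers remain. */
static void toy_module_going(struct toy_srcu *ssp)
{
	if (ssp->ever_used && ssp->sda_is_static)
		toy_cleanup(ssp);
	if (ssp->readers == 0) {
		free(ssp->sda);
		ssp->sda = NULL;
	} else {
		fprintf(stderr, "WARN: readers still active, leaking ->sda\n");
	}
}

int main(void)
{
	struct toy_srcu used   = { malloc(16), true, true,  0 };
	struct toy_srcu unused = { malloc(16), true, false, 0 };

	toy_module_going(&used);	/* cleanup, then free            */
	toy_module_going(&unused);	/* never used: just free ->sda   */
	return 0;
}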

* Re: [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update
  2023-03-26 23:18       ` Christoph Hellwig
  2023-03-27  3:29         ` Paul E. McKenney
@ 2023-04-24 21:06         ` Paul E. McKenney
  1 sibling, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2023-04-24 21:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Wysocki, Rafael J, rcu, linux-kernel, kernel-team, rostedt,
	Michał Mirosław, Dmitry Osipenko

On Mon, Mar 27, 2023 at 01:18:58AM +0200, Christoph Hellwig wrote:
> On Fri, Mar 24, 2023 at 01:11:47PM -0700, Paul E. McKenney wrote:
> > > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Thank you!  I will add this on my next rebase.
> > 
> > It is possible that this will be v6.5 rather than v6.4 material.
> 
> I was hoping the RCU bits could land in 6.4, so that the block
> layer work to take advantage of it can go into 6.5 without cross-tree
> dependencies.

And this is now in mainline.  Please let me know how it goes!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2023-04-24 21:06 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-24  0:19 [PATCH RFC rcu 0/19] Further shrink srcu_struct to promote cache locality Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 01/19] srcu: Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 02/19] srcu: Use static init for statically allocated in-module srcu_struct Paul E. McKenney
2023-03-30  4:11   ` Zhang, Qiang1
2023-03-30 14:55     ` Paul E. McKenney
2023-03-30 15:03       ` Zhang, Qiang1
2023-03-30 15:20         ` Zhang, Qiang1
2023-03-30 17:05           ` Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 03/19] srcu: Begin offloading srcu_struct fields to srcu_update Paul E. McKenney
2023-03-24 19:10   ` Wysocki, Rafael J
2023-03-24 20:11     ` Paul E. McKenney
2023-03-26 23:18       ` Christoph Hellwig
2023-03-27  3:29         ` Paul E. McKenney
2023-04-24 21:06         ` Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 04/19] srcu: Move ->level from srcu_struct to srcu_usage Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 05/19] srcu: Move ->srcu_size_state " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 06/19] srcu: Move ->srcu_cb_mutex " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 07/19] srcu: Move ->lock initialization after srcu_usage allocation Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 08/19] srcu: Move ->lock from srcu_struct to srcu_usage Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 09/19] srcu: Move ->srcu_gp_mutex " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 10/19] srcu: Move grace-period fields " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 11/19] srcu: Move heuristics " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 12/19] srcu: Move ->sda_is_static " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 13/19] srcu: Move srcu_barrier() fields " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 14/19] srcu: Move work-scheduling " Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 15/19] srcu: Fix long lines in srcu_get_delay() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 16/19] srcu: Fix long lines in cleanup_srcu_struct() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 17/19] srcu: Fix long lines in srcu_gp_end() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 18/19] srcu: Fix long lines in srcu_funnel_gp_start() Paul E. McKenney
2023-03-24  0:19 ` [PATCH RFC rcu 19/19] srcu: Remove extraneous parentheses from srcu_read_lock() etc Paul E. McKenney
