All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] srcu: Fix boot stall v2
@ 2021-04-08 22:38 Frederic Weisbecker
  2021-04-08 22:38 ` [PATCH 1/5] srcu: Unconditionally embed struct lockdep_map Frederic Weisbecker
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Frederic Weisbecker @ 2021-04-08 22:38 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Uladzislau Rezki, Boqun Feng,
	Lai Jiangshan, Neeraj Upadhyay, Josh Triplett, Joel Fernandes


So here is a different approach to solve the early SRCU issue.
Early initialized ssp's are reset on srcu_init() and their callbacks get
requeued.

I expanded testing in the last patch.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	rcu/dev

HEAD: 0dc3551822c74a0f34783cbd64a9199c9758ec6d

Thanks,
	Frederic
---

Frederic Weisbecker (5):
      srcu: Unconditionally embed struct lockdep_map
      srcu: Initialize SRCU after timers
      srcu: Fix broken node geometry after early ssp init
      srcu: Queue a callback in case of early started poll
      srcu: Early test SRCU polling start


 include/linux/srcu.h     |   6 +++
 include/linux/srcutree.h |   4 +-
 init/main.c              |   2 +
 kernel/rcu/rcu.h         |   6 ---
 kernel/rcu/srcutree.c    | 119 ++++++++++++++++++++++++++++++++++++++++++-----
 kernel/rcu/tiny.c        |   1 -
 kernel/rcu/tree.c        |   1 -
 kernel/rcu/update.c      |  12 ++++-
 8 files changed, 129 insertions(+), 22 deletions(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/5] srcu: Unconditionally embed struct lockdep_map
  2021-04-08 22:38 [PATCH 0/5] srcu: Fix boot stall v2 Frederic Weisbecker
@ 2021-04-08 22:38 ` Frederic Weisbecker
  2021-04-08 22:38 ` [PATCH 2/5] srcu: Initialize SRCU after timers Frederic Weisbecker
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Frederic Weisbecker @ 2021-04-08 22:38 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Uladzislau Rezki, Boqun Feng,
	Lai Jiangshan, Neeraj Upadhyay, Josh Triplett, Joel Fernandes

struct lockdep_map is empty when CONFIG_DEBUG_LOCK_ALLOC=n. We can
remove the ifdeffery while adding the lockdep map in struct srcu_struct
without risking consuming space in the off-case. This will also simplify
further manipulations on this field.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
---
 include/linux/srcutree.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 9cfcc8a756ae..cb1f4351e8ba 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -82,9 +82,7 @@ struct srcu_struct {
 						/*  callback for the barrier */
 						/*  operation. */
 	struct delayed_work work;
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map dep_map;
-#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 };
 
 /* Values for state variable (bottom bits of ->srcu_gp_seq). */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/5] srcu: Initialize SRCU after timers
  2021-04-08 22:38 [PATCH 0/5] srcu: Fix boot stall v2 Frederic Weisbecker
  2021-04-08 22:38 ` [PATCH 1/5] srcu: Unconditionally embed struct lockdep_map Frederic Weisbecker
@ 2021-04-08 22:38 ` Frederic Weisbecker
  2021-04-09 20:05   ` Paul E. McKenney
  2021-04-08 22:39 ` [PATCH 3/5] srcu: Fix broken node geometry after early ssp init Frederic Weisbecker
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Frederic Weisbecker @ 2021-04-08 22:38 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Uladzislau Rezki, Boqun Feng,
	Lai Jiangshan, Neeraj Upadhyay, Josh Triplett, Joel Fernandes

Once srcu_init() is called, the SRCU core is free to queue delayed
workqueues, which rely on timers. However init_timers() is called
several steps after rcu_init(). Any call_srcu() in-between would finish
its course inside a dangerously uninitialized timer core.

Make sure we stay in early SRCU mode until everything is well settled.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
---
 include/linux/srcu.h  | 6 ++++++
 init/main.c           | 2 ++
 kernel/rcu/rcu.h      | 6 ------
 kernel/rcu/srcutree.c | 5 +++++
 kernel/rcu/tiny.c     | 1 -
 kernel/rcu/tree.c     | 1 -
 6 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index a0895bbf71ce..e6011a9975af 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -64,6 +64,12 @@ unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp);
 unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp);
 bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie);
 
+#ifdef CONFIG_SRCU
+void srcu_init(void);
+#else /* #ifdef CONFIG_SRCU */
+static inline void srcu_init(void) { }
+#endif /* #else #ifdef CONFIG_SRCU */
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
 /**
diff --git a/init/main.c b/init/main.c
index 53b278845b88..1bc5cc9e52ef 100644
--- a/init/main.c
+++ b/init/main.c
@@ -42,6 +42,7 @@
 #include <linux/profile.h>
 #include <linux/kfence.h>
 #include <linux/rcupdate.h>
+#include <linux/srcu.h>
 #include <linux/moduleparam.h>
 #include <linux/kallsyms.h>
 #include <linux/writeback.h>
@@ -956,6 +957,7 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
 	tick_init();
 	rcu_init_nohz();
 	init_timers();
+	srcu_init();
 	hrtimers_init();
 	softirq_init();
 	timekeeping_init();
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index d64b842f4078..b3af34068051 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -422,12 +422,6 @@ do {									\
 
 #endif /* #if defined(CONFIG_SRCU) || !defined(CONFIG_TINY_RCU) */
 
-#ifdef CONFIG_SRCU
-void srcu_init(void);
-#else /* #ifdef CONFIG_SRCU */
-static inline void srcu_init(void) { }
-#endif /* #else #ifdef CONFIG_SRCU */
-
 #ifdef CONFIG_TINY_RCU
 /* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */
 static inline bool rcu_gp_is_normal(void) { return true; }
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 10e681ea7051..108f9ca06047 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1384,6 +1384,11 @@ void __init srcu_init(void)
 {
 	struct srcu_struct *ssp;
 
+	/*
+	 * Once that is set, call_srcu() can follow the normal path and
+	 * queue delayed work. This must follow RCU workqueues creation
+	 * and timers initialization.
+	 */
 	srcu_init_done = true;
 	while (!list_empty(&srcu_boot_list)) {
 		ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index c8a029fbb114..340b3f8b090d 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -221,5 +221,4 @@ void __init rcu_init(void)
 {
 	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
 	rcu_early_boot_tests();
-	srcu_init();
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5c214705c33f..740f5cd34459 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4714,7 +4714,6 @@ void __init rcu_init(void)
 	WARN_ON(!rcu_gp_wq);
 	rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM, 0);
 	WARN_ON(!rcu_par_gp_wq);
-	srcu_init();
 
 	/* Fill in default value for rcutree.qovld boot parameter. */
 	/* -After- the rcu_node ->lock fields are initialized! */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/5] srcu: Fix broken node geometry after early ssp init
  2021-04-08 22:38 [PATCH 0/5] srcu: Fix boot stall v2 Frederic Weisbecker
  2021-04-08 22:38 ` [PATCH 1/5] srcu: Unconditionally embed struct lockdep_map Frederic Weisbecker
  2021-04-08 22:38 ` [PATCH 2/5] srcu: Initialize SRCU after timers Frederic Weisbecker
@ 2021-04-08 22:39 ` Frederic Weisbecker
  2021-04-08 22:39 ` [PATCH 4/5] srcu: Queue a callback in case of early started poll Frederic Weisbecker
  2021-04-08 22:39 ` [PATCH 5/5] srcu: Early test SRCU polling start Frederic Weisbecker
  4 siblings, 0 replies; 7+ messages in thread
From: Frederic Weisbecker @ 2021-04-08 22:39 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Uladzislau Rezki, Boqun Feng,
	Lai Jiangshan, Neeraj Upadhyay, Josh Triplett, Joel Fernandes

An ssp initialized before rcu_init_geometry() will have its snp hierarchy
based on CONFIG_NR_CPUS.

Once rcu_init_geometry() is called, the nodes distribution is shrinked
and optimized toward meeting the actual possible number of CPUs detected
on boot.

Later on, the ssp that was initialized before rcu_init_geometry() is
confused and sometimes refers to its initial CONFIG_NR_CPUS based node
hierarchy, sometimes to the new num_possible_cpus() based one instead.
For example each of its sdp->mynode remain backward and refer to the
early node leaves that may not exist anymore. On the other hand the
srcu_for_each_node_breadth_first() refers to the new node hierarchy.

There are at least two bad possible outcomes to this:

1) a) A callback enqueued early on an sdp is recorded pending on
      sdp->mynode->srcu_data_have_cbs in srcu_funnel_gp_start() with
      sdp->mynode pointing to a deep leaf (say 3 levels).

   b) The grace period ends after rcu_init_geometry() which shrinks the
      nodes level to a single one. srcu_gp_end() walks through the new
      snp hierarchy without ever reaching the old leaves so the callback
      is never executed.

   This is easily reproduced on an 8 CPUs machine with
   CONFIG_NR_CPUS >= 32 and "rcupdate.rcu_self_test=1". The
   srcu_barrier() after early tests verification never completes and
   the boot hangs:

	[ 5413.141029] INFO: task swapper/0:1 blocked for more than 4915 seconds.
	[ 5413.147564]       Not tainted 5.12.0-rc4+ #28
	[ 5413.151927] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
	[ 5413.159753] task:swapper/0       state:D stack:    0 pid:    1 ppid:     0 flags:0x00004000
	[ 5413.168099] Call Trace:
	[ 5413.170555]  __schedule+0x36c/0x930
	[ 5413.174057]  ? wait_for_completion+0x88/0x110
	[ 5413.178423]  schedule+0x46/0xf0
	[ 5413.181575]  schedule_timeout+0x284/0x380
	[ 5413.185591]  ? wait_for_completion+0x88/0x110
	[ 5413.189957]  ? mark_held_locks+0x61/0x80
	[ 5413.193882]  ? mark_held_locks+0x61/0x80
	[ 5413.197809]  ? _raw_spin_unlock_irq+0x24/0x50
	[ 5413.202173]  ? wait_for_completion+0x88/0x110
	[ 5413.206535]  wait_for_completion+0xb4/0x110
	[ 5413.210724]  ? srcu_torture_stats_print+0x110/0x110
	[ 5413.215610]  srcu_barrier+0x187/0x200
	[ 5413.219277]  ? rcu_tasks_verify_self_tests+0x50/0x50
	[ 5413.224244]  ? rdinit_setup+0x2b/0x2b
	[ 5413.227907]  rcu_verify_early_boot_tests+0x2d/0x40
	[ 5413.232700]  do_one_initcall+0x63/0x310
	[ 5413.236541]  ? rdinit_setup+0x2b/0x2b
	[ 5413.240207]  ? rcu_read_lock_sched_held+0x52/0x80
	[ 5413.244912]  kernel_init_freeable+0x253/0x28f
	[ 5413.249273]  ? rest_init+0x250/0x250
	[ 5413.252846]  kernel_init+0xa/0x110
	[ 5413.256257]  ret_from_fork+0x22/0x30

2) An ssp that gets initialized before rcu_init_geometry() and used
   afterward will always have stale rdp->mynode references, resulting in
   callbacks to be missed in srcu_gp_end(), just like in the previous
   scenario.

Solve this once rcu_init_geometry() is done with resetting the whole
state and node tree layout for each early initialized ssp's. Queued
callbacks are saved then requeued once the ssp reset is done.

Suggested-by: Paul E . McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
---
 include/linux/srcutree.h |  1 +
 kernel/rcu/srcutree.c    | 77 ++++++++++++++++++++++++++++++++++------
 2 files changed, 68 insertions(+), 10 deletions(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index cb1f4351e8ba..a2422c442470 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -83,6 +83,7 @@ struct srcu_struct {
 						/*  operation. */
 	struct delayed_work work;
 	struct lockdep_map dep_map;
+	struct list_head early_init;
 };
 
 /* Values for state variable (bottom bits of ->srcu_gp_seq). */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 108f9ca06047..7ca1bd0067c4 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -39,7 +39,7 @@ static ulong counter_wrap_check = (ULONG_MAX >> 2);
 module_param(counter_wrap_check, ulong, 0444);
 
 /* Early-boot callback-management, so early that no lock is required! */
-static LIST_HEAD(srcu_boot_list);
+static LIST_HEAD(srcu_early_init_list);
 static bool __read_mostly srcu_init_done;
 
 static void srcu_invoke_callbacks(struct work_struct *work);
@@ -174,10 +174,61 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
 	init_srcu_struct_nodes(ssp);
 	ssp->srcu_gp_seq_needed_exp = 0;
 	ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
+	if (!srcu_init_done)
+		list_add_tail(&ssp->early_init, &srcu_early_init_list);
 	smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
 	return 0;
 }
 
+/*
+ * After rcu_init_geometry(), early initialized SRCU nodes can be
+ * in a confused state, with sdp->mynode pointing to pruned nodes.
+ * Reset the ssp struct along with its node hierarchy and requeue its
+ * callbacks.
+ */
+static void reset_srcu_struct(struct srcu_struct *ssp)
+{
+	int cpu;
+	struct lockdep_map dep_map;
+	struct rcu_cblist pendcbs;
+	struct rcu_head *rhp;
+	struct srcu_data __percpu *sda;
+
+	rcu_cblist_init(&pendcbs);
+
+	for_each_possible_cpu(cpu) {
+		struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
+
+		if (cpu == smp_processor_id())
+			rcu_segcblist_extract_pend_cbs(&sdp->srcu_cblist, &pendcbs);
+
+		/*
+		 * We don't expect to have ready callbacks at this stage, nor any
+		 * callback at all on non-boot CPUs.
+		 */
+		WARN_ON_ONCE(!rcu_segcblist_empty(&sdp->srcu_cblist));
+
+		memset(sdp, 0, sizeof(*sdp));
+	}
+
+	/*
+	 * Save the per-CPU states. We don't need to re-allocate them.
+	 * Besides they may even be static.
+	 */
+	sda = ssp->sda;
+	/* Save the lockdep map, it may not suffer double-initialization */
+	dep_map = ssp->dep_map;
+
+	memset(ssp, 0, sizeof(*ssp));
+	ssp->sda = sda;
+	ssp->dep_map = dep_map;
+	spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
+	init_srcu_struct_fields(ssp, true);
+
+	while ((rhp = rcu_cblist_dequeue(&pendcbs)))
+		call_srcu(ssp, rhp, rhp->func);
+}
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
 int __init_srcu_struct(struct srcu_struct *ssp, const char *name,
@@ -675,11 +726,14 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	    rcu_seq_state(ssp->srcu_gp_seq) == SRCU_STATE_IDLE) {
 		WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
 		srcu_gp_start(ssp);
+		/*
+		 * If SRCU isn't initialized yet, the grace period will be
+		 * started again on srcu_init() at which point the work will
+		 * be safely queued to rcu_gp_wq.
+		 */
 		if (likely(srcu_init_done))
 			queue_delayed_work(rcu_gp_wq, &ssp->work,
 					   srcu_get_delay(ssp));
-		else if (list_empty(&ssp->work.work.entry))
-			list_add(&ssp->work.work.entry, &srcu_boot_list);
 	}
 	spin_unlock_irqrestore_rcu_node(ssp, flags);
 }
@@ -1382,7 +1436,9 @@ early_initcall(srcu_bootup_announce);
 
 void __init srcu_init(void)
 {
-	struct srcu_struct *ssp;
+	struct srcu_struct *ssp, *tmp;
+
+	lockdep_assert_irqs_disabled();
 
 	/*
 	 * Once that is set, call_srcu() can follow the normal path and
@@ -1390,12 +1446,13 @@ void __init srcu_init(void)
 	 * and timers initialization.
 	 */
 	srcu_init_done = true;
-	while (!list_empty(&srcu_boot_list)) {
-		ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
-				      work.work.entry);
-		list_del_init(&ssp->work.work.entry);
-		queue_work(rcu_gp_wq, &ssp->work.work);
-	}
+	/*
+	 * ssp's that got initialized before rcu_init_geometry() have a stale
+	 * node hierarchy. Rebuild their node trees and requeue their pending
+	 * callbacks.
+	 */
+	list_for_each_entry_safe(ssp, tmp, &srcu_early_init_list, early_init)
+		reset_srcu_struct(ssp);
 }
 
 #ifdef CONFIG_MODULES
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/5] srcu: Queue a callback in case of early started poll
  2021-04-08 22:38 [PATCH 0/5] srcu: Fix boot stall v2 Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2021-04-08 22:39 ` [PATCH 3/5] srcu: Fix broken node geometry after early ssp init Frederic Weisbecker
@ 2021-04-08 22:39 ` Frederic Weisbecker
  2021-04-08 22:39 ` [PATCH 5/5] srcu: Early test SRCU polling start Frederic Weisbecker
  4 siblings, 0 replies; 7+ messages in thread
From: Frederic Weisbecker @ 2021-04-08 22:39 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Uladzislau Rezki, Boqun Feng,
	Lai Jiangshan, Neeraj Upadhyay, Josh Triplett, Joel Fernandes

If SRCU polling is used before SRCU initialization, it won't benefit
from the callback requeue performed after rcu_init_geometry() in order
to fix the node hierarchy reshuffle. This is because polling start grace
periods that don't rely on callbacks. Therefore the grace period started
may be lost upon srcu_init().

To fix this, queue an empty callback in case of early use of
start_poll_synchronize_srcu() so that it can later get requeued with
the preserved order against other early calls to either call_srcu()
or start_poll_synchronize_srcu().

Since it can be called early any number of time, have at least two
struct rcu_head per ssp dedicated to this early enqueue. The two first
calls to start_poll_synchronize_srcu() will each start one new grace
period, if no call_srcu() happen before or in-between. Any subsequent
early call to start_poll_synchronize_srcu() will wait for the second
grace period so there is no need to queue empty callbacks beyond the
second call.

Suggested-by: Paul E . McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
---
 include/linux/srcutree.h |  1 +
 kernel/rcu/srcutree.c    | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index a2422c442470..9d4fbfc2c109 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -84,6 +84,7 @@ struct srcu_struct {
 	struct delayed_work work;
 	struct lockdep_map dep_map;
 	struct list_head early_init;
+	struct rcu_head early_poll[2];
 };
 
 /* Values for state variable (bottom bits of ->srcu_gp_seq). */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 7ca1bd0067c4..2fa35e5bfbc9 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -190,6 +190,8 @@ static void reset_srcu_struct(struct srcu_struct *ssp)
 {
 	int cpu;
 	struct lockdep_map dep_map;
+	struct rcu_head early_poll[2];
+	int i;
 	struct rcu_cblist pendcbs;
 	struct rcu_head *rhp;
 	struct srcu_data __percpu *sda;
@@ -218,10 +220,16 @@ static void reset_srcu_struct(struct srcu_struct *ssp)
 	sda = ssp->sda;
 	/* Save the lockdep map, it may not suffer double-initialization */
 	dep_map = ssp->dep_map;
+	/* Save the early_poll callback links. They may be queued to pendcbs */
+	for (i = 0; i < ARRAY_SIZE(ssp->early_poll); i++)
+		early_poll[i] = ssp->early_poll[i];
 
 	memset(ssp, 0, sizeof(*ssp));
 	ssp->sda = sda;
 	ssp->dep_map = dep_map;
+	for (i = 0; i < ARRAY_SIZE(ssp->early_poll); i++)
+		ssp->early_poll[i] = early_poll[i];
+
 	spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
 	init_srcu_struct_fields(ssp, true);
 
@@ -1079,6 +1087,10 @@ unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp)
 }
 EXPORT_SYMBOL_GPL(get_state_synchronize_srcu);
 
+static void early_poll_func(struct rcu_head *rhp)
+{
+}
+
 /**
  * start_poll_synchronize_srcu - Provide cookie and start grace period
  * @ssp: srcu_struct to provide cookie for.
@@ -1091,7 +1103,30 @@ EXPORT_SYMBOL_GPL(get_state_synchronize_srcu);
  */
 unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp)
 {
-	return srcu_gp_start_if_needed(ssp, NULL, true);
+	struct rcu_head *rhp = NULL;
+
+	/*
+	 * After rcu_init_geometry(), we need to reset the ssp and restart
+	 * the early started grace periods. Callbacks can be requeued but
+	 * callback-less grace periods are harder to track, especially if
+	 * we want to preserve the order among all the early calls to
+	 * call_srcu() and start_poll_synchronize_srcu(). So queue empty
+	 * callbacks to solve this. We may initialize at most two grace periods
+	 * that early, no need to queue more than two callbacks per ssp, any
+	 * further early call to start_poll_synchronize_srcu() will wait for
+	 * the second grace period.
+	 */
+	if (!srcu_init_done) {
+		int i;
+		for (i = 0; i < ARRAY_SIZE(ssp->early_poll); i++) {
+			if (!ssp->early_poll[i].func) {
+				rhp = &ssp->early_poll[i];
+				rhp->func = early_poll_func;
+				break;
+			}
+		}
+	}
+	return srcu_gp_start_if_needed(ssp, rhp, true);
 }
 EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 5/5] srcu: Early test SRCU polling start
  2021-04-08 22:38 [PATCH 0/5] srcu: Fix boot stall v2 Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2021-04-08 22:39 ` [PATCH 4/5] srcu: Queue a callback in case of early started poll Frederic Weisbecker
@ 2021-04-08 22:39 ` Frederic Weisbecker
  4 siblings, 0 replies; 7+ messages in thread
From: Frederic Weisbecker @ 2021-04-08 22:39 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Uladzislau Rezki, Boqun Feng,
	Lai Jiangshan, Neeraj Upadhyay, Josh Triplett, Joel Fernandes

Test early calls to start_poll_synchronize_srcu(), mixed within the
early test to call_srcu(), and make sure that
poll_state_synchronize_srcu() correctly see the expired grace periods
after the srcu_barrier() on late initcall. Normally srcu_barrier()
doesn't wait for callback-less grace periods but early calls to
start_poll_synchronize_srcu() involve empty callbacks.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
---
 kernel/rcu/update.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index dd94a602a6d2..7ee57d66a327 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -528,6 +528,7 @@ DEFINE_STATIC_SRCU(early_srcu);
 struct early_boot_kfree_rcu {
 	struct rcu_head rh;
 };
+static unsigned long early_cookie[3];
 
 static void early_boot_test_call_rcu(void)
 {
@@ -536,8 +537,14 @@ static void early_boot_test_call_rcu(void)
 	struct early_boot_kfree_rcu *rhp;
 
 	call_rcu(&head, test_callback);
-	if (IS_ENABLED(CONFIG_SRCU))
+	if (IS_ENABLED(CONFIG_SRCU)) {
+		int i;
+		early_cookie[0] = start_poll_synchronize_srcu(&early_srcu);
 		call_srcu(&early_srcu, &shead, test_callback);
+
+		for (i = 1; i < ARRAY_SIZE(early_cookie); i++)
+			early_cookie[i] = start_poll_synchronize_srcu(&early_srcu);
+	}
 	rhp = kmalloc(sizeof(*rhp), GFP_KERNEL);
 	if (!WARN_ON_ONCE(!rhp))
 		kfree_rcu(rhp, rh);
@@ -561,8 +568,11 @@ static int rcu_verify_early_boot_tests(void)
 		early_boot_test_counter++;
 		rcu_barrier();
 		if (IS_ENABLED(CONFIG_SRCU)) {
+			int i;
 			early_boot_test_counter++;
 			srcu_barrier(&early_srcu);
+			for (i = 0; i < ARRAY_SIZE(early_cookie); i++)
+				WARN_ON_ONCE(!poll_state_synchronize_srcu(&early_srcu, early_cookie[i]));
 		}
 	}
 	if (rcu_self_test_counter != early_boot_test_counter) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/5] srcu: Initialize SRCU after timers
  2021-04-08 22:38 ` [PATCH 2/5] srcu: Initialize SRCU after timers Frederic Weisbecker
@ 2021-04-09 20:05   ` Paul E. McKenney
  0 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2021-04-09 20:05 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Uladzislau Rezki, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Joel Fernandes

On Fri, Apr 09, 2021 at 12:38:59AM +0200, Frederic Weisbecker wrote:
> Once srcu_init() is called, the SRCU core is free to queue delayed
> workqueues, which rely on timers. However init_timers() is called
> several steps after rcu_init(). Any call_srcu() in-between would finish
> its course inside a dangerously uninitialized timer core.
> 
> Make sure we stay in early SRCU mode until everything is well settled.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Joel Fernandes <joel@joelfernandes.org>

Good catches!  Apparently not so many people are using SRCU that early
in boot, but good to get it working.  I queued this patch and the
preceding one (1/5).

								
							Thanx, Paul

> ---
>  include/linux/srcu.h  | 6 ++++++
>  init/main.c           | 2 ++
>  kernel/rcu/rcu.h      | 6 ------
>  kernel/rcu/srcutree.c | 5 +++++
>  kernel/rcu/tiny.c     | 1 -
>  kernel/rcu/tree.c     | 1 -
>  6 files changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> index a0895bbf71ce..e6011a9975af 100644
> --- a/include/linux/srcu.h
> +++ b/include/linux/srcu.h
> @@ -64,6 +64,12 @@ unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp);
>  unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp);
>  bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie);
>  
> +#ifdef CONFIG_SRCU
> +void srcu_init(void);
> +#else /* #ifdef CONFIG_SRCU */
> +static inline void srcu_init(void) { }
> +#endif /* #else #ifdef CONFIG_SRCU */
> +
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  
>  /**
> diff --git a/init/main.c b/init/main.c
> index 53b278845b88..1bc5cc9e52ef 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -42,6 +42,7 @@
>  #include <linux/profile.h>
>  #include <linux/kfence.h>
>  #include <linux/rcupdate.h>
> +#include <linux/srcu.h>
>  #include <linux/moduleparam.h>
>  #include <linux/kallsyms.h>
>  #include <linux/writeback.h>
> @@ -956,6 +957,7 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
>  	tick_init();
>  	rcu_init_nohz();
>  	init_timers();
> +	srcu_init();
>  	hrtimers_init();
>  	softirq_init();
>  	timekeeping_init();
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index d64b842f4078..b3af34068051 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -422,12 +422,6 @@ do {									\
>  
>  #endif /* #if defined(CONFIG_SRCU) || !defined(CONFIG_TINY_RCU) */
>  
> -#ifdef CONFIG_SRCU
> -void srcu_init(void);
> -#else /* #ifdef CONFIG_SRCU */
> -static inline void srcu_init(void) { }
> -#endif /* #else #ifdef CONFIG_SRCU */
> -
>  #ifdef CONFIG_TINY_RCU
>  /* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */
>  static inline bool rcu_gp_is_normal(void) { return true; }
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 10e681ea7051..108f9ca06047 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1384,6 +1384,11 @@ void __init srcu_init(void)
>  {
>  	struct srcu_struct *ssp;
>  
> +	/*
> +	 * Once that is set, call_srcu() can follow the normal path and
> +	 * queue delayed work. This must follow RCU workqueues creation
> +	 * and timers initialization.
> +	 */
>  	srcu_init_done = true;
>  	while (!list_empty(&srcu_boot_list)) {
>  		ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
> diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
> index c8a029fbb114..340b3f8b090d 100644
> --- a/kernel/rcu/tiny.c
> +++ b/kernel/rcu/tiny.c
> @@ -221,5 +221,4 @@ void __init rcu_init(void)
>  {
>  	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
>  	rcu_early_boot_tests();
> -	srcu_init();
>  }
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 5c214705c33f..740f5cd34459 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4714,7 +4714,6 @@ void __init rcu_init(void)
>  	WARN_ON(!rcu_gp_wq);
>  	rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM, 0);
>  	WARN_ON(!rcu_par_gp_wq);
> -	srcu_init();
>  
>  	/* Fill in default value for rcutree.qovld boot parameter. */
>  	/* -After- the rcu_node ->lock fields are initialized! */
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-04-09 20:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-08 22:38 [PATCH 0/5] srcu: Fix boot stall v2 Frederic Weisbecker
2021-04-08 22:38 ` [PATCH 1/5] srcu: Unconditionally embed struct lockdep_map Frederic Weisbecker
2021-04-08 22:38 ` [PATCH 2/5] srcu: Initialize SRCU after timers Frederic Weisbecker
2021-04-09 20:05   ` Paul E. McKenney
2021-04-08 22:39 ` [PATCH 3/5] srcu: Fix broken node geometry after early ssp init Frederic Weisbecker
2021-04-08 22:39 ` [PATCH 4/5] srcu: Queue a callback in case of early started poll Frederic Weisbecker
2021-04-08 22:39 ` [PATCH 5/5] srcu: Early test SRCU polling start Frederic Weisbecker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.