linux-kernel.vger.kernel.org archive mirror
* [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14
@ 2021-05-11 22:52 Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 01/19] rcu: Fix typo in comment: kthead -> kthread Paul E. McKenney
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel

Hello!

This series contains miscellaneous fixes, perhaps most notably the
resuscitation of RCU priority boosting.

1.	Fix typo in comment: kthead -> kthread, courtesy of Rolf Eike
	Beer.

2.	Remove the unused rcu_irq_exit_preempt() function.

3.	Improve tree.c comments and add code cleanups, courtesy of
	Zhouyi Zhou.

4.	Invoke rcu_spawn_core_kthreads() from rcu_spawn_gp_kthread().

5.	Add ->rt_priority and ->gp_start to show_rcu_gp_kthreads() output.

6.	Add ->gp_max to show_rcu_gp_kthreads() output.

7.	Explicitly flag likely false-positive report.

8.	Reject RCU_LOCKDEP_WARN() false positives.

9.	Add quiescent states and boost states to show_rcu_gp_kthreads()
	output.

10.	Make RCU priority boosting work on single-CPU rcu_node structures.

11.	Make show_rcu_gp_kthreads() dump rcu_node structures blocking GP.

12.	Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs.

13.	Make rcu_gp_cleanup() be noinline for tracing.

14.	Point to documentation of ordering guarantees.

15.	Create an unrcu_pointer() to remove __rcu from a pointer.

16.	Reconcile rcu_nocbs= and nohz_full=, courtesy of Paul Gortmaker.

17.	Improve comments describing RCU read-side critical sections.

18.	Remove obsolete rcu_read_unlock() deadlock commentary.

19.	Add missing __releases() annotation, courtesy of Jules Irenge.

						Thanx, Paul

------------------------------------------------------------------------

 b/include/linux/rcupdate.h |    2 -
 b/include/linux/rcutiny.h  |    1 
 b/include/linux/rcutree.h  |    1 
 b/kernel/locking/lockdep.c |    6 ++-
 b/kernel/rcu/Kconfig.debug |    2 -
 b/kernel/rcu/srcutree.c    |    3 +
 b/kernel/rcu/tree.c        |   22 ------------
 b/kernel/rcu/tree.h        |    1 
 b/kernel/rcu/tree_plugin.h |    2 -
 b/kernel/rcu/tree_stall.h  |    8 ++--
 b/kernel/rcu/update.c      |    2 -
 b/kernel/sched/isolation.c |    4 --
 b/mm/oom_kill.c            |    2 -
 include/linux/rcupdate.h   |   82 ++++++++++++++++++++++++++-------------------
 kernel/rcu/tree.c          |   74 +++++++++++++++++++++++++---------------
 kernel/rcu/tree.h          |    2 -
 kernel/rcu/tree_plugin.h   |   30 ++++------------
 kernel/rcu/tree_stall.h    |   21 ++++++++---
 18 files changed, 138 insertions(+), 127 deletions(-)

* [PATCH tip/core/rcu 01/19] rcu: Fix typo in comment: kthead -> kthread
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 02/19] rcu: Remove the unused rcu_irq_exit_preempt() function Paul E. McKenney
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Rolf Eike Beer,
	Paul E . McKenney

From: Rolf Eike Beer <eb@emlix.com>

Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_plugin.h | 2 +-
 mm/oom_kill.c            | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index ad0156b86937..2cbe8f8456e6 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1940,7 +1940,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 }
 
 /*
- * Awaken the no-CBs grace-period kthead if needed, either due to it
+ * Awaken the no-CBs grace-period kthread if needed, either due to it
  * legitimately being asleep or due to overload conditions.
  *
  * If warranted, also wake up the kthread servicing this CPUs queues.
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index eefd3f5fde46..54527de9cd2d 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -922,7 +922,7 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
 			continue;
 		}
 		/*
-		 * No kthead_use_mm() user needs to read from the userspace so
+		 * No kthread_use_mm() user needs to read from the userspace so
 		 * we are ok to reap it.
 		 */
 		if (unlikely(p->flags & PF_KTHREAD))
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 02/19] rcu: Remove the unused rcu_irq_exit_preempt() function
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 01/19] rcu: Fix typo in comment: kthead -> kthread Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 03/19] rcu: Improve tree.c comments and add code cleanups Paul E. McKenney
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Commit 9ee01e0f69a9 ("x86/entry: Clean up idtentry_enter/exit()
leftovers") left rcu_irq_exit_preempt() in place in order to avoid
conflicts with the -rcu tree.  Now that this change has long since hit
mainline, this commit removes the no-longer-used rcu_irq_exit_preempt()
function.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/rcutiny.h |  1 -
 include/linux/rcutree.h |  1 -
 kernel/rcu/tree.c       | 22 ----------------------
 3 files changed, 24 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 35e0be326ffc..953e70fafe38 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -86,7 +86,6 @@ static inline void rcu_irq_enter(void) { }
 static inline void rcu_irq_exit_irqson(void) { }
 static inline void rcu_irq_enter_irqson(void) { }
 static inline void rcu_irq_exit(void) { }
-static inline void rcu_irq_exit_preempt(void) { }
 static inline void rcu_irq_exit_check_preempt(void) { }
 #define rcu_is_idle_cpu(cpu) \
 	(is_idle_task(current) && !in_nmi() && !in_irq() && !in_serving_softirq())
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index b89b54130f49..53209d669400 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -49,7 +49,6 @@ void rcu_idle_enter(void);
 void rcu_idle_exit(void);
 void rcu_irq_enter(void);
 void rcu_irq_exit(void);
-void rcu_irq_exit_preempt(void);
 void rcu_irq_enter_irqson(void);
 void rcu_irq_exit_irqson(void);
 bool rcu_is_idle_cpu(int cpu);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8e78b2430c16..f6543b8004c0 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -833,28 +833,6 @@ void noinstr rcu_irq_exit(void)
 	rcu_nmi_exit();
 }
 
-/**
- * rcu_irq_exit_preempt - Inform RCU that current CPU is exiting irq
- *			  towards in kernel preemption
- *
- * Same as rcu_irq_exit() but has a sanity check that scheduling is safe
- * from RCU point of view. Invoked from return from interrupt before kernel
- * preemption.
- */
-void rcu_irq_exit_preempt(void)
-{
-	lockdep_assert_irqs_disabled();
-	rcu_nmi_exit();
-
-	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) <= 0,
-			 "RCU dynticks_nesting counter underflow/zero!");
-	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) !=
-			 DYNTICK_IRQ_NONIDLE,
-			 "Bad RCU  dynticks_nmi_nesting counter\n");
-	RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
-			 "RCU in extended quiescent state!");
-}
-
 #ifdef CONFIG_PROVE_RCU
 /**
  * rcu_irq_exit_check_preempt - Validate that scheduling is possible
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 03/19] rcu: Improve tree.c comments and add code cleanups
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 01/19] rcu: Fix typo in comment: kthead -> kthread Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 02/19] rcu: Remove the unused rcu_irq_exit_preempt() function Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 04/19] rcu: Invoke rcu_spawn_core_kthreads() from rcu_spawn_gp_kthread() Paul E. McKenney
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Zhouyi Zhou, Paul E . McKenney

From: Zhouyi Zhou <zhouzhouyi@gmail.com>

This commit cleans up some comments and code in kernel/rcu/tree.c.

Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f6543b8004c0..06f3de96997c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -202,7 +202,7 @@ EXPORT_SYMBOL_GPL(rcu_get_gp_kthreads_prio);
  * the need for long delays to increase some race probabilities with the
  * need for fast grace periods to increase other race probabilities.
  */
-#define PER_RCU_NODE_PERIOD 3	/* Number of grace periods between delays. */
+#define PER_RCU_NODE_PERIOD 3	/* Number of grace periods between delays for debugging. */
 
 /*
  * Compute the mask of online CPUs for the specified rcu_node structure.
@@ -937,7 +937,7 @@ EXPORT_SYMBOL_GPL(rcu_idle_exit);
  */
 void noinstr rcu_user_exit(void)
 {
-	rcu_eqs_exit(1);
+	rcu_eqs_exit(true);
 }
 
 /**
@@ -1203,7 +1203,7 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
 #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
 
 /*
- * We are reporting a quiescent state on behalf of some other CPU, so
+ * When trying to report a quiescent state on behalf of some other CPU,
  * it is our responsibility to check for and handle potential overflow
  * of the rcu_node ->gp_seq counter with respect to the rcu_data counters.
  * After all, the CPU might be in deep idle state, and thus executing no
@@ -2607,7 +2607,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
  * state, for example, user mode or idle loop.  It also schedules RCU
  * core processing.  If the current grace period has gone on too long,
  * it will ask the scheduler to manufacture a context switch for the sole
- * purpose of providing a providing the needed quiescent state.
+ * purpose of providing the needed quiescent state.
  */
 void rcu_sched_clock_irq(int user)
 {
@@ -3236,7 +3236,7 @@ put_cached_bnode(struct kfree_rcu_cpu *krcp,
 
 /*
  * This function is invoked in workqueue context after a grace period.
- * It frees all the objects queued on ->bhead_free or ->head_free.
+ * It frees all the objects queued on ->bkvhead_free or ->head_free.
  */
 static void kfree_rcu_work(struct work_struct *work)
 {
@@ -3263,7 +3263,7 @@ static void kfree_rcu_work(struct work_struct *work)
 	krwp->head_free = NULL;
 	raw_spin_unlock_irqrestore(&krcp->lock, flags);
 
-	// Handle two first channels.
+	// Handle the first two channels.
 	for (i = 0; i < FREE_N_CHANNELS; i++) {
 		for (; bkvhead[i]; bkvhead[i] = bnext) {
 			bnext = bkvhead[i]->next;
@@ -3530,11 +3530,11 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 }
 
 /*
- * Queue a request for lazy invocation of appropriate free routine after a
- * grace period. Please note there are three paths are maintained, two are the
- * main ones that use array of pointers interface and third one is emergency
- * one, that is used only when the main path can not be maintained temporary,
- * due to memory pressure.
+ * Queue a request for lazy invocation of the appropriate free routine
+ * after a grace period.  Please note that three paths are maintained,
+ * two for the common case using arrays of pointers and a third one that
+ * is used only when the main paths cannot be used, for example, due to
+ * memory pressure.
  *
  * Each kvfree_call_rcu() request is added to a batch. The batch will be drained
  * every KFREE_DRAIN_JIFFIES number of jiffies. All the objects in the batch will
@@ -4708,7 +4708,7 @@ void __init rcu_init(void)
 		rcutree_online_cpu(cpu);
 	}
 
-	/* Create workqueue for expedited GPs and for Tree SRCU. */
+	/* Create workqueue for Tree SRCU and for expedited GPs. */
 	rcu_gp_wq = alloc_workqueue("rcu_gp", WQ_MEM_RECLAIM, 0);
 	WARN_ON(!rcu_gp_wq);
 	rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM, 0);
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 04/19] rcu: Invoke rcu_spawn_core_kthreads() from rcu_spawn_gp_kthread()
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (2 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 03/19] rcu: Improve tree.c comments and add code cleanups Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 05/19] rcu: Add ->rt_priority and ->gp_start to show_rcu_gp_kthreads() output Paul E. McKenney
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Currently, rcu_spawn_core_kthreads() is invoked via an early_initcall(),
which works, except that rcu_spawn_gp_kthread() is also invoked via an
early_initcall() and rcu_spawn_core_kthreads() relies on adjustments to
kthread_prio that are carried out by rcu_spawn_gp_kthread().  There is
no guarantee of ordering among early_initcall() handlers, and thus no
guarantee that kthread_prio will be properly checked and range-limited
at the time that rcu_spawn_core_kthreads() needs it.

In most cases, this bug is harmless.  After all, the only reason that
rcu_spawn_gp_kthread() adjusts the value of kthread_prio is if the user
specified a nonsensical value for this boot parameter, which experience
indicates is rare.

Nevertheless, a bug is a bug.  This commit therefore causes the
rcu_spawn_core_kthreads() function to be invoked directly from
rcu_spawn_gp_kthread() after any needed adjustments to kthread_prio have
been carried out.

Fixes: 48d07c04b4cc ("rcu: Enable elimination of Tree-RCU softirq processing")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
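
For anyone who wants to see the ordering hazard from the commit log in
isolation, here is a rough userspace model.  It is only a sketch: the
*_model names and the clamp to the range [1, 99] are assumptions made
for illustration and are not the kernel's actual code.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kthread_prio boot parameter. */
static int kthread_prio_model = 500;	/* nonsensical user-supplied value */
static int core_prio_seen = -1;		/* priority the consumer observed */

/* Clamp a requested priority into [1, 99] (assumed range, for illustration). */
int sanitize_prio(int prio)
{
	if (prio < 1)
		return 1;
	if (prio > 99)
		return 99;
	return prio;
}

/* Consumer: records whatever priority is current at the time it runs. */
void spawn_core_kthreads_model(void)
{
	core_prio_seen = kthread_prio_model;
}

/* Producer: sanitizes the priority, then invokes the consumer directly. */
void spawn_gp_kthread_model(void)
{
	kthread_prio_model = sanitize_prio(kthread_prio_model);
	spawn_core_kthreads_model();	/* always runs after the clamp */
}

/* Print what the consumer saw, for ad-hoc experiments. */
void report_model(void)
{
	printf("core kthreads saw priority %d\n", core_prio_seen);
}
```

If spawn_core_kthreads_model() runs as an independent init handler, it
may observe the unsanitized value.  Invoking it from the end of
spawn_gp_kthread_model() makes the ordering explicit instead of relying
on the order in which handlers happen to run.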

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 06f3de96997c..2532e584e95f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2889,7 +2889,6 @@ static int __init rcu_spawn_core_kthreads(void)
 		  "%s: Could not start rcuc kthread, OOM is now expected behavior\n", __func__);
 	return 0;
 }
-early_initcall(rcu_spawn_core_kthreads);
 
 /*
  * Handle any core-RCU processing required by a call_rcu() invocation.
@@ -4450,6 +4449,7 @@ static int __init rcu_spawn_gp_kthread(void)
 	wake_up_process(t);
 	rcu_spawn_nocb_kthreads();
 	rcu_spawn_boost_kthreads();
+	rcu_spawn_core_kthreads();
 	return 0;
 }
 early_initcall(rcu_spawn_gp_kthread);
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 05/19] rcu: Add ->rt_priority and ->gp_start to show_rcu_gp_kthreads() output
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (3 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 04/19] rcu: Invoke rcu_spawn_core_kthreads() from rcu_spawn_gp_kthread() Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 06/19] rcu: Add ->gp_max " Paul E. McKenney
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

This commit adds ->rt_priority and ->gp_start to show_rcu_gp_kthreads()
output in order to better diagnose RCU priority boosting failures.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_stall.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 59b95cc5cbdf..fb4702570316 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -726,6 +726,7 @@ void show_rcu_gp_kthreads(void)
 	unsigned long j;
 	unsigned long ja;
 	unsigned long jr;
+	unsigned long js;
 	unsigned long jw;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
@@ -734,11 +735,12 @@ void show_rcu_gp_kthreads(void)
 	j = jiffies;
 	ja = j - data_race(rcu_state.gp_activity);
 	jr = j - data_race(rcu_state.gp_req_activity);
+	js = j - data_race(rcu_state.gp_start);
 	jw = j - data_race(rcu_state.gp_wake_time);
-	pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n",
+	pr_info("%s: wait state: %s(%d) ->state: %#lx ->rt_priority %u delta ->gp_start %lu ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n",
 		rcu_state.name, gp_state_getname(rcu_state.gp_state),
-		rcu_state.gp_state, t ? t->state : 0x1ffffL,
-		ja, jr, jw, (long)data_race(rcu_state.gp_wake_seq),
+		rcu_state.gp_state, t ? t->state : 0x1ffffL, t ? t->rt_priority : 0xffU,
+		js, ja, jr, jw, (long)data_race(rcu_state.gp_wake_seq),
 		(long)data_race(rcu_state.gp_seq),
 		(long)data_race(rcu_get_root()->gp_seq_needed),
 		data_race(rcu_state.gp_flags));
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 06/19] rcu: Add ->gp_max to show_rcu_gp_kthreads() output
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (4 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 05/19] rcu: Add ->rt_priority and ->gp_start to show_rcu_gp_kthreads() output Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 07/19] lockdep: Explicitly flag likely false-positive report Paul E. McKenney
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

This commit adds ->gp_max to show_rcu_gp_kthreads() output in order to
better diagnose RCU priority boosting failures.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_stall.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index fb4702570316..a4e2bb3bdce7 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -737,12 +737,13 @@ void show_rcu_gp_kthreads(void)
 	jr = j - data_race(rcu_state.gp_req_activity);
 	js = j - data_race(rcu_state.gp_start);
 	jw = j - data_race(rcu_state.gp_wake_time);
-	pr_info("%s: wait state: %s(%d) ->state: %#lx ->rt_priority %u delta ->gp_start %lu ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n",
+	pr_info("%s: wait state: %s(%d) ->state: %#lx ->rt_priority %u delta ->gp_start %lu ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_max %lu ->gp_flags %#x\n",
 		rcu_state.name, gp_state_getname(rcu_state.gp_state),
 		rcu_state.gp_state, t ? t->state : 0x1ffffL, t ? t->rt_priority : 0xffU,
 		js, ja, jr, jw, (long)data_race(rcu_state.gp_wake_seq),
 		(long)data_race(rcu_state.gp_seq),
 		(long)data_race(rcu_get_root()->gp_seq_needed),
+		data_race(rcu_state.gp_max),
 		data_race(rcu_state.gp_flags));
 	rcu_for_each_node_breadth_first(rnp) {
 		if (ULONG_CMP_GE(READ_ONCE(rcu_state.gp_seq),
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 07/19] lockdep: Explicitly flag likely false-positive report
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (5 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 06/19] rcu: Add ->gp_max " Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 08/19] rcu: Reject RCU_LOCKDEP_WARN() false positives Paul E. McKenney
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney, Boqun Feng

lockdep_rcu_suspicious() prints the value of debug_locks because a
value of zero indicates a likely false positive.  This works, but is a
bit obscure.  This commit therefore explicitly calls out the possibility
of a false positive.

Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/locking/lockdep.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 48d736aa03b2..d6c3c987009d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -6393,6 +6393,7 @@ asmlinkage __visible void lockdep_sys_exit(void)
 void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 {
 	struct task_struct *curr = current;
+	int dl = READ_ONCE(debug_locks);
 
 	/* Note: the following can be executed concurrently, so be careful. */
 	pr_warn("\n");
@@ -6402,11 +6403,12 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 	pr_warn("-----------------------------\n");
 	pr_warn("%s:%d %s!\n", file, line, s);
 	pr_warn("\nother info that might help us debug this:\n\n");
-	pr_warn("\n%srcu_scheduler_active = %d, debug_locks = %d\n",
+	pr_warn("\n%srcu_scheduler_active = %d, debug_locks = %d\n%s",
 	       !rcu_lockdep_current_cpu_online()
 			? "RCU used illegally from offline CPU!\n"
 			: "",
-	       rcu_scheduler_active, debug_locks);
+	       rcu_scheduler_active, dl,
+	       dl ? "" : "Possible false positive due to lockdep disabling via debug_locks = 0\n");
 
 	/*
 	 * If a CPU is in the RCU-free window in idle (ie: in the section
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 08/19] rcu: Reject RCU_LOCKDEP_WARN() false positives
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (6 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 07/19] lockdep: Explicitly flag likely false-positive report Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 09/19] rcu: Add quiescent states and boost states to show_rcu_gp_kthreads() output Paul E. McKenney
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney,
	syzbot+dde0cc33951735441301, Matthew Wilcox,
	syzbot+88e4f02896967fe1ab0d, Boqun Feng

If another lockdep report runs concurrently with an RCU lockdep report
from RCU_LOCKDEP_WARN(), the following sequence of events can occur:

1.	debug_lockdep_rcu_enabled() sees that lockdep is enabled
	when called from (say) synchronize_rcu().

2.	Lockdep is disabled by a concurrent lockdep report.

3.	RCU_LOCKDEP_WARN() evaluates its lockdep-expression
	argument, for example, lock_is_held(&rcu_bh_lock_map).

4.	Because lockdep is now disabled, lock_is_held() plays it safe and
	returns the constant 1.

5.	But in this case, the constant 1 is not safe, because invoking
	synchronize_rcu() under rcu_read_lock_bh() is disallowed.

6.	RCU_LOCKDEP_WARN() wrongly invokes lockdep_rcu_suspicious(),
	resulting in a false-positive splat.

This commit therefore changes RCU_LOCKDEP_WARN() to check
debug_lockdep_rcu_enabled() after checking the lockdep expression,
so that any "safe" returns from lock_is_held() are rejected by
debug_lockdep_rcu_enabled().  This requires memory ordering, which is
supplied by READ_ONCE(debug_locks).  The resulting volatile accesses
prevent the compiler from reordering, and the fact that only one variable
is being accessed prevents the underlying hardware from reordering.
The combination works even on IA64, which can reorder reads to the same
location: that reordering is defeated by the volatile accesses, which
compile to load instructions that provide ordering.

Reported-by: syzbot+dde0cc33951735441301@syzkaller.appspotmail.com
Reported-by: Matthew Wilcox <willy@infradead.org>
Reported-by: syzbot+88e4f02896967fe1ab0d@syzkaller.appspotmail.com
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/rcupdate.h | 2 +-
 kernel/rcu/update.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 9455476c5ba2..1199ffd305d1 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -315,7 +315,7 @@ static inline int rcu_read_lock_any_held(void)
 #define RCU_LOCKDEP_WARN(c, s)						\
 	do {								\
 		static bool __section(".data.unlikely") __warned;	\
-		if (debug_lockdep_rcu_enabled() && !__warned && (c)) {	\
+		if ((c) && debug_lockdep_rcu_enabled() && !__warned) {	\
 			__warned = true;				\
 			lockdep_rcu_suspicious(__FILE__, __LINE__, s);	\
 		}							\
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index b95ae86c40a7..dd94a602a6d2 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -277,7 +277,7 @@ EXPORT_SYMBOL_GPL(rcu_callback_map);
 
 noinstr int notrace debug_lockdep_rcu_enabled(void)
 {
-	return rcu_scheduler_active != RCU_SCHEDULER_INACTIVE && debug_locks &&
+	return rcu_scheduler_active != RCU_SCHEDULER_INACTIVE && READ_ONCE(debug_locks) &&
 	       current->lockdep_recursion == 0;
 }
 EXPORT_SYMBOL_GPL(debug_lockdep_rcu_enabled);
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 09/19] rcu: Add quiescent states and boost states to show_rcu_gp_kthreads() output
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (7 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 08/19] rcu: Reject RCU_LOCKDEP_WARN() false positives Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 10/19] rcu: Make RCU priority boosting work on single-CPU rcu_node structures Paul E. McKenney
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

This commit adds each rcu_node structure's ->qsmask and "bBEG" output
indicating whether: (1) There is a boost kthread, (2) A reader needs
to be (or is in the process of being) boosted, (3) A reader is blocking
an expedited grace period, and (4) A reader is blocking a normal grace
period.  This helps diagnose RCU priority boosting failures.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.h        |  1 +
 kernel/rcu/tree_plugin.h |  1 +
 kernel/rcu/tree_stall.h  | 12 +++++++++---
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 71821d59d95c..5fd0c443517e 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -115,6 +115,7 @@ struct rcu_node {
 				/*  boosting for this rcu_node structure. */
 	unsigned int boost_kthread_status;
 				/* State of boost_kthread_task for tracing. */
+	unsigned long n_boosts;	/* Number of boosts for this rcu_node structure. */
 #ifdef CONFIG_RCU_NOCB_CPU
 	struct swait_queue_head nocb_gp_wq[2];
 				/* Place for rcu_nocb_kthread() to wait GP. */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 2cbe8f8456e6..ef004cc7101d 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1098,6 +1098,7 @@ static int rcu_boost(struct rcu_node *rnp)
 	/* Lock only for side effect: boosts task t's priority. */
 	rt_mutex_lock(&rnp->boost_mtx);
 	rt_mutex_unlock(&rnp->boost_mtx);  /* Then keep lockdep happy. */
+	rnp->n_boosts++;
 
 	return READ_ONCE(rnp->exp_tasks) != NULL ||
 	       READ_ONCE(rnp->boost_tasks) != NULL;
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index a4e2bb3bdce7..c1f83864a18e 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -749,9 +749,15 @@ void show_rcu_gp_kthreads(void)
 		if (ULONG_CMP_GE(READ_ONCE(rcu_state.gp_seq),
 				 READ_ONCE(rnp->gp_seq_needed)))
 			continue;
-		pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld\n",
-			rnp->grplo, rnp->grphi, (long)data_race(rnp->gp_seq),
-			(long)data_race(rnp->gp_seq_needed));
+		pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld ->qsmask %#lx %c%c%c%c ->n_boosts %ld\n",
+			rnp->grplo, rnp->grphi,
+			(long)data_race(rnp->gp_seq), (long)data_race(rnp->gp_seq_needed),
+			data_race(rnp->qsmask),
+			".b"[!!data_race(rnp->boost_kthread_task)],
+			".B"[!!data_race(rnp->boost_tasks)],
+			".E"[!!data_race(rnp->exp_tasks)],
+			".G"[!!data_race(rnp->gp_tasks)],
+			data_race(rnp->n_boosts));
 		if (!rcu_is_leaf_node(rnp))
 			continue;
 		for_each_leaf_node_possible_cpu(rnp, cpu) {
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 10/19] rcu: Make RCU priority boosting work on single-CPU rcu_node structures
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (8 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 09/19] rcu: Add quiescent states and boost states to show_rcu_gp_kthreads() output Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 11/19] rcu: Make show_rcu_gp_kthreads() dump rcu_node structures blocking GP Paul E. McKenney
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney,
	Sebastian Andrzej Siewior, Scott Wood

When any CPU comes online, it checks to see if an RCU-boost kthread has
already been created for that CPU's leaf rcu_node structure, and if
not, it creates one.  Unfortunately, it also verifies that this leaf
rcu_node structure actually has at least one online CPU, and if not,
it declines to create the kthread.  Although this behavior makes sense
during early boot, especially on systems that claim far more CPUs than
they actually have, it makes no sense for the first CPU to come online
for a given rcu_node structure.  There is no point in checking because
we know there is a CPU on its way in.

The problem is that timing differences can cause this incoming CPU to not
yet be reflected in the various bit masks even at rcutree_online_cpu()
time, and there is no chance of it being reflected at rcutree_prepare_cpu()
time.  In addition, it would be better to create the RCU-boost kthread at
rcutree_prepare_cpu() time to handle the case where the CPU is involved
in an RCU priority inversion very shortly after it comes online.

This commit therefore moves the checking to rcu_prepare_kthreads(), which
is called only at early boot, when the check is appropriate.  In addition,
it makes rcutree_prepare_cpu() invoke rcu_spawn_one_boost_kthread(), which
no longer does any checking for online CPUs.

With this change, RCU priority boosting tests now pass for short rcutorture
runs, even with single-CPU leaf rcu_node structures.

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Scott Wood <swood@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c        |  2 +-
 kernel/rcu/tree.h        |  2 +-
 kernel/rcu/tree_plugin.h | 29 +++++++----------------------
 3 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2532e584e95f..00a3ebca70b8 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4166,7 +4166,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	rdp->rcu_iw_gp_seq = rdp->gp_seq - 1;
 	trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuonl"));
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
-	rcu_prepare_kthreads(cpu);
+	rcu_spawn_one_boost_kthread(rnp);
 	rcu_spawn_cpu_nocb_kthread(cpu);
 	WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus + 1);
 
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 5fd0c443517e..b5508f44ff29 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -418,8 +418,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags);
 static void rcu_preempt_boost_start_gp(struct rcu_node *rnp);
 static bool rcu_is_callbacks_kthread(void);
 static void rcu_cpu_kthread_setup(unsigned int cpu);
+static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp);
 static void __init rcu_spawn_boost_kthreads(void);
-static void rcu_prepare_kthreads(int cpu);
 static void rcu_cleanup_after_idle(void);
 static void rcu_prepare_for_idle(void);
 static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index ef004cc7101d..3c90dad00d3c 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1198,22 +1198,16 @@ static void rcu_preempt_boost_start_gp(struct rcu_node *rnp)
  */
 static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
 {
-	int rnp_index = rnp - rcu_get_root();
 	unsigned long flags;
+	int rnp_index = rnp - rcu_get_root();
 	struct sched_param sp;
 	struct task_struct *t;
 
-	if (!IS_ENABLED(CONFIG_PREEMPT_RCU))
-		return;
-
-	if (!rcu_scheduler_fully_active || rcu_rnp_online_cpus(rnp) == 0)
+	if (rnp->boost_kthread_task || !rcu_scheduler_fully_active)
 		return;
 
 	rcu_state.boost = 1;
 
-	if (rnp->boost_kthread_task != NULL)
-		return;
-
 	t = kthread_create(rcu_boost_kthread, (void *)rnp,
 			   "rcub/%d", rnp_index);
 	if (WARN_ON_ONCE(IS_ERR(t)))
@@ -1265,17 +1259,8 @@ static void __init rcu_spawn_boost_kthreads(void)
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rnp)
-		rcu_spawn_one_boost_kthread(rnp);
-}
-
-static void rcu_prepare_kthreads(int cpu)
-{
-	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
-	struct rcu_node *rnp = rdp->mynode;
-
-	/* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */
-	if (rcu_scheduler_fully_active)
-		rcu_spawn_one_boost_kthread(rnp);
+		if (rcu_rnp_online_cpus(rnp))
+			rcu_spawn_one_boost_kthread(rnp);
 }
 
 #else /* #ifdef CONFIG_RCU_BOOST */
@@ -1295,15 +1280,15 @@ static void rcu_preempt_boost_start_gp(struct rcu_node *rnp)
 {
 }
 
-static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
+static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
 {
 }
 
-static void __init rcu_spawn_boost_kthreads(void)
+static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
 {
 }
 
-static void rcu_prepare_kthreads(int cpu)
+static void __init rcu_spawn_boost_kthreads(void)
 {
 }
 
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 11/19] rcu: Make show_rcu_gp_kthreads() dump rcu_node structures blocking GP
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (9 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 10/19] rcu: Make RCU priority boosting work on single-CPU rcu_node structures Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 12/19] rcu: Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs Paul E. McKenney
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Currently, show_rcu_gp_kthreads() only dumps rcu_node structures that
have outdated ideas of the current grace-period number.  This commit
also dumps those that are in any way blocking the current grace period.
This helps diagnose RCU priority boosting failures.
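
For illustration only, the new skip condition can be sketched in
userspace C.  The struct and function names below are hypothetical
stand-ins, not the kernel's struct rcu_node, and the READ_ONCE() and
data_race() markings are omitted:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Wrap-safe "a >= b" for unsigned long counters, as in the kernel. */
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))

/* Hypothetical, simplified stand-in for struct rcu_node. */
struct demo_node {
	unsigned long gp_seq_needed;	/* Furthest grace period requested. */
	unsigned long qsmask;		/* CPUs still owing a quiescent state. */
	void *boost_tasks;		/* Non-NULL if tasks need boosting. */
	void *exp_tasks;		/* Non-NULL if blocking an expedited GP. */
	void *gp_tasks;			/* Non-NULL if blocking the normal GP. */
};

/* True if the node would be skipped (not dumped) under the new condition. */
static bool demo_node_skipped(unsigned long gp_seq, const struct demo_node *n)
{
	return ULONG_CMP_GE(gp_seq, n->gp_seq_needed) &&
	       !n->qsmask && !n->boost_tasks && !n->exp_tasks && !n->gp_tasks;
}
```

In this sketch, a node that is up to date but still has a bit set in
->qsmask is no longer skipped, which is the case that matters when
chasing a boosting failure.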

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_stall.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index c1f83864a18e..e6bd518e0bc4 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -746,8 +746,9 @@ void show_rcu_gp_kthreads(void)
 		data_race(rcu_state.gp_max),
 		data_race(rcu_state.gp_flags));
 	rcu_for_each_node_breadth_first(rnp) {
-		if (ULONG_CMP_GE(READ_ONCE(rcu_state.gp_seq),
-				 READ_ONCE(rnp->gp_seq_needed)))
+		if (ULONG_CMP_GE(READ_ONCE(rcu_state.gp_seq), READ_ONCE(rnp->gp_seq_needed)) &&
+		    !data_race(rnp->qsmask) && !data_race(rnp->boost_tasks) &&
+		    !data_race(rnp->exp_tasks) && !data_race(rnp->gp_tasks))
 			continue;
 		pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld ->qsmask %#lx %c%c%c%c ->n_boosts %ld\n",
 			rnp->grplo, rnp->grphi,
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 12/19] rcu: Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (10 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 11/19] rcu: Make show_rcu_gp_kthreads() dump rcu_node structures blocking GP Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 13/19] rcu: Make rcu_gp_cleanup() be noinline for tracing Paul E. McKenney
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Kernels built with CONFIG_RCU_STRICT_GRACE_PERIOD=y can experience
significant lock contention due to RCU's resulting focus on ending grace
periods as soon as possible.  This is OK, but only if there are not very
many CPUs.  This commit therefore puts this Kconfig option off-limits
to systems with more than four CPUs.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/Kconfig.debug | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
index 1942c1f1bb65..4fd64999300f 100644
--- a/kernel/rcu/Kconfig.debug
+++ b/kernel/rcu/Kconfig.debug
@@ -116,7 +116,7 @@ config RCU_EQS_DEBUG
 
 config RCU_STRICT_GRACE_PERIOD
 	bool "Provide debug RCU implementation with short grace periods"
-	depends on DEBUG_KERNEL && RCU_EXPERT
+	depends on DEBUG_KERNEL && RCU_EXPERT && NR_CPUS <= 4
 	default n
 	select PREEMPT_COUNT if PREEMPT=n
 	help
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 13/19] rcu: Make rcu_gp_cleanup() be noinline for tracing
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (11 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 12/19] rcu: Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:52 ` [PATCH tip/core/rcu 14/19] rcu: Point to documentation of ordering guarantees Paul E. McKenney
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Although there are trace events for RCU grace periods, these are only
enabled in CONFIG_RCU_TRACE=y kernels.  This commit therefore marks
rcu_gp_cleanup() noinline in order to provide a function that can be
traced that is invoked near the end of each grace period.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 00a3ebca70b8..6eb64e44bdcd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2026,7 +2026,7 @@ static void rcu_gp_fqs_loop(void)
 /*
  * Clean up after the old grace period.
  */
-static void rcu_gp_cleanup(void)
+static noinline void rcu_gp_cleanup(void)
 {
 	int cpu;
 	bool needgp = false;
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 14/19] rcu: Point to documentation of ordering guarantees
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (12 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 13/19] rcu: Make rcu_gp_cleanup() be noinline for tracing Paul E. McKenney
@ 2021-05-11 22:52 ` Paul E. McKenney
  2021-05-11 22:53 ` [PATCH tip/core/rcu 15/19] rcu: Create an unrcu_pointer() to remove __rcu from a pointer Paul E. McKenney
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:52 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Add comments to synchronize_rcu() and friends that point to
Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/srcutree.c |  3 +++
 kernel/rcu/tree.c     | 20 ++++++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index e26547b34ad3..f8340c3b1c00 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1000,6 +1000,9 @@ EXPORT_SYMBOL_GPL(synchronize_srcu_expedited);
  * synchronize_srcu(), srcu_read_lock(), and srcu_read_unlock() are
  * passed the same srcu_struct structure.
  *
+ * Implementation of these memory-ordering guarantees is similar to
+ * that of synchronize_rcu().
+ *
  * If SRCU is likely idle, expedite the first request.  This semantic
  * was provided by Classic SRCU, and is relied upon by its users, so TREE
  * SRCU must also provide it.  Note that detecting idleness is heuristic
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 6eb64e44bdcd..2437960a2795 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3084,6 +3084,9 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
  * between the call to call_rcu() and the invocation of "func()" -- even
  * if CPU A and CPU B are the same CPU (but again only if the system has
  * more than one CPU).
+ *
+ * Implementation of these memory-ordering guarantees is described here:
+ * Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst.
  */
 void call_rcu(struct rcu_head *head, rcu_callback_t func)
 {
@@ -3751,6 +3754,9 @@ static int rcu_blocking_is_gp(void)
  * to have executed a full memory barrier during the execution of
  * synchronize_rcu() -- even if CPU A and CPU B are the same CPU (but
  * again only if the system has more than one CPU).
+ *
+ * Implementation of these memory-ordering guarantees is described here:
+ * Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst.
  */
 void synchronize_rcu(void)
 {
@@ -3821,7 +3827,7 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_rcu);
 /**
  * poll_state_synchronize_rcu - Conditionally wait for an RCU grace period
  *
- * @oldstate: return from call to get_state_synchronize_rcu() or start_poll_synchronize_rcu()
+ * @oldstate: value from get_state_synchronize_rcu() or start_poll_synchronize_rcu()
  *
  * If a full RCU grace period has elapsed since the earlier call from
  * which oldstate was obtained, return @true, otherwise return @false.
@@ -3837,6 +3843,11 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_rcu);
  * (many hours even on 32-bit systems) should check them occasionally
  * and either refresh them or set a flag indicating that the grace period
  * has completed.
+ *
+ * This function provides the same memory-ordering guarantees that
+ * would be provided by a synchronize_rcu() that was invoked at the call
+ * to the function that provided @oldstate, and that returned at the end
+ * of this function.
  */
 bool poll_state_synchronize_rcu(unsigned long oldstate)
 {
@@ -3851,7 +3862,7 @@ EXPORT_SYMBOL_GPL(poll_state_synchronize_rcu);
 /**
  * cond_synchronize_rcu - Conditionally wait for an RCU grace period
  *
- * @oldstate: return value from earlier call to get_state_synchronize_rcu()
+ * @oldstate: value from get_state_synchronize_rcu() or start_poll_synchronize_rcu()
  *
  * If a full RCU grace period has elapsed since the earlier call to
  * get_state_synchronize_rcu() or start_poll_synchronize_rcu(), just return.
@@ -3861,6 +3872,11 @@ EXPORT_SYMBOL_GPL(poll_state_synchronize_rcu);
  * counter wrap is harmless.  If the counter wraps, we have waited for
  * more than 2 billion grace periods (and way more on a 64-bit system!),
  * so waiting for one additional grace period should be just fine.
+ *
+ * This function provides the same memory-ordering guarantees that
+ * would be provided by a synchronize_rcu() that was invoked at the call
+ * to the function that provided @oldstate, and that returned at the end
+ * of this function.
  */
 void cond_synchronize_rcu(unsigned long oldstate)
 {
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 15/19] rcu: Create an unrcu_pointer() to remove __rcu from a pointer
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (13 preceding siblings ...)
  2021-05-11 22:52 ` [PATCH tip/core/rcu 14/19] rcu: Point to documentation of ordering guarantees Paul E. McKenney
@ 2021-05-11 22:53 ` Paul E. McKenney
  2021-05-11 22:53 ` [PATCH tip/core/rcu 16/19] sched/isolation: reconcile rcu_nocbs= and nohz_full= Paul E. McKenney
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:53 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney,
	Toke Høiland-Jørgensen

The xchg() and cmpxchg() functions are sometimes used to carry out RCU
updates.  Unfortunately, this can result in sparse warnings for both
the old-value and new-value arguments, as well as for the return value.
The arguments can be dealt with using RCU_INITIALIZER():

	old_p = xchg(&p, RCU_INITIALIZER(new_p));

But a sparse warning still remains due to assigning the __rcu pointer
returned from xchg() to the (most likely) non-__rcu pointer old_p.

This commit therefore provides an unrcu_pointer() macro that strips
the __rcu.  This macro can be used as follows:

	old_p = unrcu_pointer(xchg(&p, RCU_INITIALIZER(new_p)));
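
As a rough userspace sketch of this usage pattern (not the kernel
implementation): outside of sparse the annotations expand to nothing,
so the macros below are simplified stand-ins.  The xchg() here is
assumed to map onto the GCC/Clang __atomic_exchange_n() builtin, and
struct foo, gp, and demo_replace() are hypothetical names:

```c
#include <stddef.h>

/* Outside of sparse (__CHECKER__), these annotations expand to nothing. */
#define __rcu
#define __force
#define __kernel

/* Userspace stand-ins (assumptions, not the kernel definitions). */
#define RCU_INITIALIZER(v) ((__typeof__(*(v)) __force __rcu *)(v))
#define xchg(ptr, val) __atomic_exchange_n((ptr), (val), __ATOMIC_SEQ_CST)
#define unrcu_pointer(p) ((__typeof__(*(p)) __force __kernel *)(p))

struct foo {
	int a;
};

static struct foo __rcu *gp;	/* Hypothetical RCU-protected pointer. */

/* Publish new_p and return the previous value as a plain pointer. */
static struct foo *demo_replace(struct foo *new_p)
{
	return unrcu_pointer(xchg(&gp, RCU_INITIALIZER(new_p)));
}
```

The caller would still need to wait for a grace period before freeing
the returned old pointer; that part is outside this sketch.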

Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/rcupdate.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 1199ffd305d1..a10480f2b4ef 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -363,6 +363,20 @@ static inline void rcu_preempt_sleep_check(void) { }
 #define rcu_check_sparse(p, space)
 #endif /* #else #ifdef __CHECKER__ */
 
+/**
+ * unrcu_pointer - mark a pointer as not being RCU protected
+ * @p: pointer needing to lose its __rcu property
+ *
+ * Converts @p from an __rcu pointer to a __kernel pointer.
+ * This allows an __rcu pointer to be used with xchg() and friends.
+ */
+#define unrcu_pointer(p)						\
+({									\
+	typeof(*p) *_________p1 = (typeof(*p) *__force)(p);		\
+	rcu_check_sparse(p, __rcu); 					\
+	((typeof(*p) __force __kernel *)(_________p1)); 		\
+})
+
 #define __rcu_access_pointer(p, space) \
 ({ \
 	typeof(*p) *_________p1 = (typeof(*p) *__force)READ_ONCE(p); \
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 16/19] sched/isolation: reconcile rcu_nocbs= and nohz_full=
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (14 preceding siblings ...)
  2021-05-11 22:53 ` [PATCH tip/core/rcu 15/19] rcu: Create an unrcu_pointer() to remove __rcu from a pointer Paul E. McKenney
@ 2021-05-11 22:53 ` Paul E. McKenney
  2021-05-11 22:53 ` [PATCH tip/core/rcu 17/19] rcu: Improve comments describing RCU read-side critical sections Paul E. McKenney
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:53 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul Gortmaker, Ingo Molnar,
	Paul E . McKenney, Frederic Weisbecker

From: Paul Gortmaker <paul.gortmaker@windriver.com>

We have a mismatch between RCU and isolation -- in relation to what is
considered the maximum valid CPU number.

This matters because nohz_full= and rcu_nocbs= are joined at the hip; in
fact the former will enforce the latter.  So we don't want a CPU mask to
be valid for one and denied for the other.

The difference first appeared as of v4.15; further details are below.

As it is confusing to anyone who isn't looking at the code regularly, a
reminder is in order; three values exist here:

CONFIG_NR_CPUS	- compiled in maximum cap on number of CPUs supported.
nr_cpu_ids 	- possible # of CPUs (typically reflects what ACPI says)
cpus_present	- actual number of present/detected/installed CPUs.

For this example, I'll refer to NR_CPUS=64 from "make defconfig" and
nr_cpu_ids=6 for ACPI reporting on a board that could run a six core,
and present=4 for a quad that is physically in the socket.  From dmesg:

 smpboot: Allowing 6 CPUs, 2 hotplug CPUs
 setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:6 nr_node_ids:1
 rcu: 	RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=6.
 smp: Brought up 1 node, 4 CPUs

And from userspace, see:

   paul@trash:/sys/devices/system/cpu$ cat present
   0-3
   paul@trash:/sys/devices/system/cpu$ cat possible
   0-5
   paul@trash:/sys/devices/system/cpu$ cat kernel_max
   63

Everything is fine if we boot 5x5 for rcu/nohz:

  Command line: BOOT_IMAGE=/boot/bzImage nohz_full=2-5 rcu_nocbs=2-5 root=/dev/sda1 ro
  NO_HZ: Full dynticks CPUs: 2-5.
  rcu: 	Offload RCU callbacks from CPUs: 2-5.

...even though there is no CPU 4 or 5.  Both RCU and nohz_full are OK.
Now we push that beyond 6 but below NR_CPUS, and with 15x15 we get:

  Command line: BOOT_IMAGE=/boot/bzImage rcu_nocbs=2-15 nohz_full=2-15 root=/dev/sda1 ro
  rcu: 	Note: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.
  rcu: 	Offload RCU callbacks from CPUs: 2-5.

These are both functionally equivalent, as we are only changing flags on
phantom CPUs that don't exist, but note the kernel interpretation changes.
And worse, it only changes for one of the two - which is the problem.

RCU doesn't care if you want to restrict the flags on phantom CPUs but
clearly nohz_full does after this change from v4.15 (edb9382175c3):

-       if (cpulist_parse(str, non_housekeeping_mask) < 0) {
-               pr_warn("Housekeeping: Incorrect nohz_full cpumask\n");
+       err = cpulist_parse(str, non_housekeeping_mask);
+       if (err < 0 || cpumask_last(non_housekeeping_mask) >= nr_cpu_ids) {
+               pr_warn("Housekeeping: nohz_full= or isolcpus= incorrect CPU range\n");

To be clear, the sanity check on "possible" (nr_cpu_ids) is new here.

The goal was reasonable: not wanting housekeeping to land on a
not-possible CPU.  But note two things:

1) this is an exclusion list, not an inclusion list; we are tracking
non_housekeeping CPUs, not ones that are explicitly assigned housekeeping.

2) we went one further in 9219565aa890 - ensuring that housekeeping was
sanity checking against present and not just possible CPUs.

To be clear, this means the check added in v4.15 is doubly redundant.
And more importantly, overly strict/restrictive.

We care now, because the bitmap boot arg parsing now knows that a value
of "N" is NR_CPUS; the size of the bitmap, but the bitmap code doesn't
know anything about the subtleties of our max/possible/present CPU
specifics as outlined above.

So drop the check added in v4.15 (edb9382175c3) and bring RCU and
nohz_full back into alignment on NR_CPUS so that "N" works for both;
they can then fall back to nr_cpu_ids internally just as before.

  Command line: BOOT_IMAGE=/boot/bzImage nohz_full=2-N rcu_nocbs=2-N root=/dev/sda1 ro
  NO_HZ: Full dynticks CPUs: 2-5.
  rcu: 	Offload RCU callbacks from CPUs: 2-5.

As shown above, with this change, RCU and nohz_full are in sync, even
with the use of the "N" placeholder.  The same result is achieved with "15".
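
The parse-then-clip behavior described above can be sketched in
userspace C as follows.  This is a simplified illustration and not
the kernel's cpulist_parse(): it handles a single "lo-hi" range in a
64-bit mask, takes "N" to mean the last bit of the bitmap, and all
demo_* names are hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_NR_CPUS 64	/* Hypothetical; mirrors NR_CPUS=64 in the example. */

/*
 * Parse a single "lo-hi" range, where "N" stands for DEMO_NR_CPUS - 1.
 * Returns 0 and fills *mask on success, -1 if malformed or out of range.
 */
static int demo_parse_range(const char *s, uint64_t *mask)
{
	char *end;
	unsigned long lo, hi;

	lo = strtoul(s, &end, 10);
	if (end == s || *end != '-')
		return -1;
	s = end + 1;
	if (s[0] == 'N' && s[1] == '\0') {
		hi = DEMO_NR_CPUS - 1;
	} else {
		hi = strtoul(s, &end, 10);
		if (end == s || *end != '\0')
			return -1;
	}
	if (lo > hi || hi >= DEMO_NR_CPUS)
		return -1;
	*mask = 0;
	for (unsigned long cpu = lo; cpu <= hi; cpu++)
		*mask |= UINT64_C(1) << cpu;
	return 0;
}

/* Clip a parsed mask to the possible CPUs, 0 .. nr_cpu_ids - 1. */
static uint64_t demo_clip(uint64_t mask, unsigned int nr_cpu_ids)
{
	uint64_t possible = nr_cpu_ids >= 64 ? UINT64_MAX
					     : (UINT64_C(1) << nr_cpu_ids) - 1;

	return mask & possible;
}
```

With nr_cpu_ids=6, both "2-N" and "2-15" parse successfully and clip
to CPUs 2-5 (mask 0x3c), whereas only a range that exceeds the bitmap
itself is rejected at parse time.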

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/sched/isolation.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 5a6ea03f9882..7f06eaf12818 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -81,11 +81,9 @@ static int __init housekeeping_setup(char *str, enum hk_flags flags)
 {
 	cpumask_var_t non_housekeeping_mask;
 	cpumask_var_t tmp;
-	int err;
 
 	alloc_bootmem_cpumask_var(&non_housekeeping_mask);
-	err = cpulist_parse(str, non_housekeeping_mask);
-	if (err < 0 || cpumask_last(non_housekeeping_mask) >= nr_cpu_ids) {
+	if (cpulist_parse(str, non_housekeeping_mask) < 0) {
 		pr_warn("Housekeeping: nohz_full= or isolcpus= incorrect CPU range\n");
 		free_bootmem_cpumask_var(non_housekeeping_mask);
 		return 0;
-- 
2.31.1.189.g2e36527f23


* [PATCH tip/core/rcu 17/19] rcu: Improve comments describing RCU read-side critical sections
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (15 preceding siblings ...)
  2021-05-11 22:53 ` [PATCH tip/core/rcu 16/19] sched/isolation: reconcile rcu_nocbs= and nohz_full= Paul E. McKenney
@ 2021-05-11 22:53 ` Paul E. McKenney
  2021-05-11 22:53 ` [PATCH tip/core/rcu 18/19] rcu: Remove obsolete rcu_read_unlock() deadlock commentary Paul E. McKenney
  2021-05-11 22:53 ` [PATCH tip/core/rcu 19/19] rcu: Add missing __releases() annotation Paul E. McKenney
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:53 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney,
	Michel Lespinasse

There are a number of places that call out the fact that preempt-disable
regions of code now act as RCU read-side critical sections, where
preempt-disable regions of code include irq-disable regions of code,
bh-disable regions of code, hardirq handlers, and NMI handlers.  However,
someone relying solely on (for example) the call_rcu() header comment
might well have no idea that preempt-disable regions of code have RCU
semantics.

This commit therefore updates the header comments for
call_rcu(), synchronize_rcu(), rcu_dereference_bh_check(), and
rcu_dereference_sched_check() to call out these new(ish) forms of RCU
readers.

Reported-by: Michel Lespinasse <michel@lespinasse.org>
[ paulmck: Apply Matthew Wilcox and Michel Lespinasse feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/rcupdate.h | 35 ++++++++++++++++++++++++++++-------
 kernel/rcu/tree.c        | 24 ++++++++++++++----------
 2 files changed, 42 insertions(+), 17 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index a10480f2b4ef..45e58f14b1ce 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -532,7 +532,12 @@ do {									      \
  * @p: The pointer to read, prior to dereferencing
  * @c: The conditions under which the dereference will take place
  *
- * This is the RCU-bh counterpart to rcu_dereference_check().
+ * This is the RCU-bh counterpart to rcu_dereference_check().  However,
+ * please note that starting in v5.0 kernels, vanilla RCU grace periods
+ * wait for local_bh_disable() regions of code in addition to regions of
+ * code demarked by rcu_read_lock() and rcu_read_unlock().  This means
+ * that synchronize_rcu(), call_rcu, and friends all take not only
+ * rcu_read_lock() but also rcu_read_lock_bh() into account.
  */
 #define rcu_dereference_bh_check(p, c) \
 	__rcu_dereference_check((p), (c) || rcu_read_lock_bh_held(), __rcu)
@@ -543,6 +548,11 @@ do {									      \
  * @c: The conditions under which the dereference will take place
  *
  * This is the RCU-sched counterpart to rcu_dereference_check().
+ * However, please note that starting in v5.0 kernels, vanilla RCU grace
+ * periods wait for preempt_disable() regions of code in addition to
+ * regions of code demarked by rcu_read_lock() and rcu_read_unlock().
+ * This means that synchronize_rcu(), call_rcu, and friends all take not
+ * only rcu_read_lock() but also rcu_read_lock_sched() into account.
  */
 #define rcu_dereference_sched_check(p, c) \
 	__rcu_dereference_check((p), (c) || rcu_read_lock_sched_held(), \
@@ -634,6 +644,12 @@ do {									      \
  * sections, invocation of the corresponding RCU callback is deferred
  * until after the all the other CPUs exit their critical sections.
  *
+ * In v5.0 and later kernels, synchronize_rcu() and call_rcu() also
+ * wait for regions of code with preemption disabled, including regions of
+ * code with interrupts or softirqs disabled.  In pre-v5.0 kernels, which
+ * define synchronize_sched(), only code enclosed within rcu_read_lock()
+ * and rcu_read_unlock() are guaranteed to be waited for.
+ *
  * Note, however, that RCU callbacks are permitted to run concurrently
  * with new RCU read-side critical sections.  One way that this can happen
  * is via the following sequence of events: (1) CPU 0 enters an RCU
@@ -728,9 +744,11 @@ static inline void rcu_read_unlock(void)
 /**
  * rcu_read_lock_bh() - mark the beginning of an RCU-bh critical section
  *
- * This is equivalent of rcu_read_lock(), but also disables softirqs.
- * Note that anything else that disables softirqs can also serve as
- * an RCU read-side critical section.
+ * This is equivalent to rcu_read_lock(), but also disables softirqs.
+ * Note that anything else that disables softirqs can also serve as an RCU
+ * read-side critical section.  However, please note that this equivalence
+ * applies only to v5.0 and later.  Before v5.0, rcu_read_lock() and
+ * rcu_read_lock_bh() were unrelated.
  *
  * Note that rcu_read_lock_bh() and the matching rcu_read_unlock_bh()
  * must occur in the same context, for example, it is illegal to invoke
@@ -763,9 +781,12 @@ static inline void rcu_read_unlock_bh(void)
 /**
  * rcu_read_lock_sched() - mark the beginning of a RCU-sched critical section
  *
- * This is equivalent of rcu_read_lock(), but disables preemption.
- * Read-side critical sections can also be introduced by anything else
- * that disables preemption, including local_irq_disable() and friends.
+ * This is equivalent to rcu_read_lock(), but also disables preemption.
+ * Read-side critical sections can also be introduced by anything else that
+ * disables preemption, including local_irq_disable() and friends.  However,
+ * please note that the equivalence to rcu_read_lock() applies only to
+ * v5.0 and later.  Before v5.0, rcu_read_lock() and rcu_read_lock_sched()
+ * were unrelated.
  *
  * Note that rcu_read_lock_sched() and the matching rcu_read_unlock_sched()
  * must occur in the same context, for example, it is illegal to invoke
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2437960a2795..4b00e4fbfa10 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3059,12 +3059,14 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
  * period elapses, in other words after all pre-existing RCU read-side
  * critical sections have completed.  However, the callback function
  * might well execute concurrently with RCU read-side critical sections
- * that started after call_rcu() was invoked.  RCU read-side critical
- * sections are delimited by rcu_read_lock() and rcu_read_unlock(), and
- * may be nested.  In addition, regions of code across which interrupts,
- * preemption, or softirqs have been disabled also serve as RCU read-side
- * critical sections.  This includes hardware interrupt handlers, softirq
- * handlers, and NMI handlers.
+ * that started after call_rcu() was invoked.
+ *
+ * RCU read-side critical sections are delimited by rcu_read_lock()
+ * and rcu_read_unlock(), and may be nested.  In addition, but only in
+ * v5.0 and later, regions of code across which interrupts, preemption,
+ * or softirqs have been disabled also serve as RCU read-side critical
+ * sections.  This includes hardware interrupt handlers, softirq handlers,
+ * and NMI handlers.
  *
  * Note that all CPUs must agree that the grace period extended beyond
  * all pre-existing RCU read-side critical section.  On systems with more
@@ -3730,10 +3732,12 @@ static int rcu_blocking_is_gp(void)
  * read-side critical sections have completed.  Note, however, that
  * upon return from synchronize_rcu(), the caller might well be executing
  * concurrently with new RCU read-side critical sections that began while
- * synchronize_rcu() was waiting.  RCU read-side critical sections are
- * delimited by rcu_read_lock() and rcu_read_unlock(), and may be nested.
- * In addition, regions of code across which interrupts, preemption, or
- * softirqs have been disabled also serve as RCU read-side critical
+ * synchronize_rcu() was waiting.
+ *
+ * RCU read-side critical sections are delimited by rcu_read_lock()
+ * and rcu_read_unlock(), and may be nested.  In addition, but only in
+ * v5.0 and later, regions of code across which interrupts, preemption,
+ * or softirqs have been disabled also serve as RCU read-side critical
  * sections.  This includes hardware interrupt handlers, softirq handlers,
  * and NMI handlers.
  *
-- 
2.31.1.189.g2e36527f23
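
Editorial note on the comments changed above: the nesting rule and the v5.0 consolidation can be illustrated with a small userspace model. This is only a sketch, not kernel code. Every model_* name is hypothetical, and the "consolidated" flag is a simplification standing in for "v5.0 and later", where a preemption-disabled region also counts as an RCU read-side critical section.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task state; not the kernel's real data structures. */
struct model_task {
	int rcu_nesting;	/* depth of model_rcu_read_lock() nesting */
	int preempt_count;	/* depth of model_preempt_disable() nesting */
};

static inline void model_rcu_read_lock(struct model_task *t)
{
	t->rcu_nesting++;
}

static inline void model_rcu_read_unlock(struct model_task *t)
{
	assert(t->rcu_nesting > 0);	/* unbalanced unlock is a bug */
	t->rcu_nesting--;
}

static inline void model_preempt_disable(struct model_task *t)
{
	t->preempt_count++;
}

static inline void model_preempt_enable(struct model_task *t)
{
	assert(t->preempt_count > 0);
	t->preempt_count--;
}

/*
 * Would this task hold up a modeled synchronize_rcu()?
 * consolidated == true models v5.0 and later (assumption of this sketch);
 * consolidated == false models a pre-v5.0 kernel in which only
 * rcu_read_lock() readers are waited for.
 */
static inline bool model_blocks_grace_period(const struct model_task *t,
					     bool consolidated)
{
	if (t->rcu_nesting > 0)
		return true;	/* the outermost reader has not yet ended */
	return consolidated && t->preempt_count > 0;
}
```

In this model a nested reader keeps blocking the grace period until the outermost unlock, and a region that only disables preemption blocks it solely when "consolidated" is true.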



* [PATCH tip/core/rcu 18/19] rcu: Remove obsolete rcu_read_unlock() deadlock commentary
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (16 preceding siblings ...)
  2021-05-11 22:53 ` [PATCH tip/core/rcu 17/19] rcu: Improve comments describing RCU read-side critical sections Paul E. McKenney
@ 2021-05-11 22:53 ` Paul E. McKenney
  2021-05-11 22:53 ` [PATCH tip/core/rcu 19/19] rcu: Add missing __releases() annotation Paul E. McKenney
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:53 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

The deferred quiescent states resulting from the consolidation of RCU-bh
and RCU-sched into RCU mean that rcu_read_unlock() will no longer attempt
to acquire scheduler locks if interrupts were disabled across that call
to rcu_read_unlock().  The cautions in the rcu_read_unlock() header
comment are therefore obsolete, so this commit removes them.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/rcupdate.h | 33 ++++++---------------------------
 1 file changed, 6 insertions(+), 27 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 45e58f14b1ce..323954363389 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -702,33 +702,12 @@ static __always_inline void rcu_read_lock(void)
 /**
  * rcu_read_unlock() - marks the end of an RCU read-side critical section.
  *
- * In most situations, rcu_read_unlock() is immune from deadlock.
- * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
- * is responsible for deboosting, which it does via rt_mutex_unlock().
- * Unfortunately, this function acquires the scheduler's runqueue and
- * priority-inheritance spinlocks.  This means that deadlock could result
- * if the caller of rcu_read_unlock() already holds one of these locks or
- * any lock that is ever acquired while holding them.
- *
- * That said, RCU readers are never priority boosted unless they were
- * preempted.  Therefore, one way to avoid deadlock is to make sure
- * that preemption never happens within any RCU read-side critical
- * section whose outermost rcu_read_unlock() is called with one of
- * rt_mutex_unlock()'s locks held.  Such preemption can be avoided in
- * a number of ways, for example, by invoking preempt_disable() before
- * critical section's outermost rcu_read_lock().
- *
- * Given that the set of locks acquired by rt_mutex_unlock() might change
- * at any time, a somewhat more future-proofed approach is to make sure
- * that that preemption never happens within any RCU read-side critical
- * section whose outermost rcu_read_unlock() is called with irqs disabled.
- * This approach relies on the fact that rt_mutex_unlock() currently only
- * acquires irq-disabled locks.
- *
- * The second of these two approaches is best in most situations,
- * however, the first approach can also be useful, at least to those
- * developers willing to keep abreast of the set of locks acquired by
- * rt_mutex_unlock().
+ * In almost all situations, rcu_read_unlock() is immune from deadlock.
+ * In recent kernels that have consolidated synchronize_sched() and
+ * synchronize_rcu_bh() into synchronize_rcu(), this deadlock immunity
+ * also extends to the scheduler's runqueue and priority-inheritance
+ * spinlocks, courtesy of the quiescent-state deferral that is carried
+ * out when rcu_read_unlock() is invoked with interrupts disabled.
  *
  * See rcu_read_lock() for more information.
  */
-- 
2.31.1.189.g2e36527f23
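
Editorial note on the deferral described above: the idea that an rcu_read_unlock() invoked with interrupts disabled postpones its cleanup instead of taking scheduler locks can be sketched as a userspace model. This is an illustration under stated assumptions, not the kernel's implementation. All model_* names and fields are hypothetical, and the explicit flush function stands in for whatever later point the kernel actually uses to process the deferred quiescent state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for this sketch only. */
struct model_cpu {
	bool irqs_disabled;	/* modeled interrupt-disable state */
	bool needs_special;	/* reader needs cleanup at unlock time */
	bool deferred_qs;	/* cleanup was postponed */
	int sched_lock_acquisitions;	/* modeled scheduler-lock use */
};

/* Cleanup that, in this model, requires the scheduler's locks. */
static inline void model_report_qs(struct model_cpu *c)
{
	c->sched_lock_acquisitions++;
	c->needs_special = false;
	c->deferred_qs = false;
}

/* Outermost unlock: defer the cleanup if interrupts are disabled. */
static inline void model_read_unlock(struct model_cpu *c)
{
	if (!c->needs_special)
		return;			/* common fast path */
	if (c->irqs_disabled) {
		c->deferred_qs = true;	/* no scheduler locks taken here */
		return;
	}
	model_report_qs(c);
}

/* Later, with interrupts enabled, process any deferred cleanup. */
static inline void model_flush_deferred_qs(struct model_cpu *c)
{
	assert(!c->irqs_disabled);
	if (c->deferred_qs)
		model_report_qs(c);
}
```

The point of the model is only that the unlock path itself never touches the modeled scheduler locks while interrupts are disabled, which is the property the new header comment relies on.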



* [PATCH tip/core/rcu 19/19] rcu: Add missing __releases() annotation
  2021-05-11 22:52 [PATCH tip/core/rcu 0/19] Miscellaneous fixes for v5.14 Paul E. McKenney
                   ` (17 preceding siblings ...)
  2021-05-11 22:53 ` [PATCH tip/core/rcu 18/19] rcu: Remove obsolete rcu_read_unlock() deadlock commentary Paul E. McKenney
@ 2021-05-11 22:53 ` Paul E. McKenney
  18 siblings, 0 replies; 20+ messages in thread
From: Paul E. McKenney @ 2021-05-11 22:53 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Jules Irenge, Paul E. McKenney

From: Jules Irenge <jbi.octave@gmail.com>

Sparse reports a warning at rcu_print_task_stall():

"warning: context imbalance in rcu_print_task_stall - unexpected unlock"

The root cause is a missing annotation on rcu_print_task_stall().

This commit therefore adds the missing __releases(rnp->lock) annotation.

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_stall.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index e6bd518e0bc4..ffb8cf6c6437 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -314,6 +314,7 @@ static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
  * tasks blocked within RCU read-side critical sections.
  */
 static int rcu_print_task_stall(struct rcu_node *rnp, unsigned long flags)
+	__releases(rnp->lock)
 {
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	return 0;
-- 
2.31.1.189.g2e36527f23
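
Editorial note on the annotation added above: a function that releases a lock acquired by its caller looks unbalanced to sparse unless it is annotated. The userspace sketch below shows the pattern. The macro definitions mirror the kernel's as best recalled and should be treated as an assumption; the model_* names and the boolean "lock" are hypothetical stand-ins for the rcu_node structure and its spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Under sparse (__CHECKER__) these carry context information;
 * for an ordinary compiler they expand to nothing. */
#ifdef __CHECKER__
# define __acquires(x)	__attribute__((context(x, 0, 1)))
# define __releases(x)	__attribute__((context(x, 1, 0)))
#else
# define __acquires(x)
# define __releases(x)
#endif

/* Hypothetical node with a toy lock; true means "held". */
struct model_node {
	bool lock;
};

/* Acquires the lock and returns with it held. */
static inline void model_node_lock(struct model_node *n)
	__acquires(n->lock)
{
	assert(!n->lock);
	n->lock = true;
}

/*
 * Entered with n->lock held by the caller and returns with it
 * released, which is what the annotation tells the checker.
 */
static inline int model_print_task_stall(struct model_node *n)
	__releases(n->lock)
{
	assert(n->lock);
	n->lock = false;
	return 0;
}
```

A caller would invoke model_node_lock() and then hand the held lock to model_print_task_stall(), matching the calling convention of the function annotated in the patch.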


