* [PATCH tip/core/rcu 0/12] NOCB updates for v5.13
@ 2021-03-04  0:22 Paul E. McKenney
  2021-03-04  0:23 ` [PATCH tip/core/rcu 01/12] rcu/nocb: Detect unsafe checks for offloaded rdp paulmck
                   ` (11 more replies)
  0 siblings, 12 replies; 15+ messages in thread
From: Paul E. McKenney @ 2021-03-04  0:22 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel

Hello!

This series contains updates for no-callbacks (NOCBs) CPUs:

1.	Detect unsafe checks for offloaded rdp, courtesy of Frederic
	Weisbecker.

2.	Report ignored local enqueue in nohz mode, courtesy of Frederic
	Weisbecker.

3.	Comment the reason behind BH disablement on batch processing,
	courtesy of Frederic Weisbecker.

4.	Forbid NOCB toggling on offline CPUs, courtesy of Frederic
	Weisbecker.

5.	Avoid confusing double write of rdp->nocb_cb_sleep, courtesy of
	Frederic Weisbecker.

6.	Only (re-)initialize segcblist when needed on CPU up, courtesy
	of Frederic Weisbecker.

7.	Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading,
	courtesy of Frederic Weisbecker.

8.	Make nocb_nobypass_lim_per_jiffy static, courtesy of Jiapeng
	Chong.

9.	Fix missed nocb_timer requeue, courtesy of Frederic Weisbecker.

10.	Disable bypass when CPU isn't completely offloaded, courtesy of
	Frederic Weisbecker.

11.	Remove stale comment above rcu_segcblist_offload(), courtesy of
	Frederic Weisbecker.

12.	Move trace_rcu_nocb_wake() calls outside nocb_lock when possible,
	courtesy of Frederic Weisbecker.

						Thanx, Paul

------------------------------------------------------------------------

 include/linux/rcu_segcblist.h |    7 -
 kernel/rcu/rcu_segcblist.c    |    3 
 kernel/rcu/tree.c             |   33 +++---
 kernel/rcu/tree_plugin.h      |  222 ++++++++++++++++++++++++++++--------------
 kernel/sched/core.c           |   24 ++++
 5 files changed, 194 insertions(+), 95 deletions(-)


* [PATCH tip/core/rcu 01/12] rcu/nocb: Detect unsafe checks for offloaded rdp
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 02/12] timer: Report ignored local enqueue in nohz mode paulmck
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Neeraj Upadhyay, Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

Provide CONFIG_PROVE_RCU sanity checks to ensure that the offloaded
state of an rdp is always read in a safe and stable way, that is, in a
way that prevents its value from changing under us.  We must either
hold the barrier mutex, the CPU-hotplug lock (read or write), or the
nocb lock.  Local non-preemptible reads are also safe.  NOCB kthreads
and timers have their own means of synchronization against the
offloaded state updaters.
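
For illustration only (not part of this patch), here is a minimal sketch
of one way a caller can satisfy these conditions, namely a local read
with preemption disabled.  The example_ function name is hypothetical:

	static bool example_local_offload_check(void)
	{
		bool offloaded;

		/*
		 * Local non-preemptible read: rdp == this_cpu_ptr(&rcu_data)
		 * and preemption is disabled, satisfying the lockdep check.
		 */
		preempt_disable();
		offloaded = rcu_rdp_is_offloaded(this_cpu_ptr(&rcu_data));
		preempt_enable();

		return offloaded;
	}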

Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c        | 21 +++++------
 kernel/rcu/tree_plugin.h | 90 ++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 87 insertions(+), 24 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index da6f521..03503e2 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -156,6 +156,7 @@ static void invoke_rcu_core(void);
 static void rcu_report_exp_rdp(struct rcu_data *rdp);
 static void sync_sched_exp_online_cleanup(int cpu);
 static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
+static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
 
 /* rcuc/rcub kthread realtime priority */
 static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
@@ -1672,7 +1673,7 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
 {
 	bool ret = false;
 	bool need_qs;
-	const bool offloaded = rcu_segcblist_is_offloaded(&rdp->cblist);
+	const bool offloaded = rcu_rdp_is_offloaded(rdp);
 
 	raw_lockdep_assert_held_rcu_node(rnp);
 
@@ -2128,7 +2129,7 @@ static void rcu_gp_cleanup(void)
 		needgp = true;
 	}
 	/* Advance CBs to reduce false positives below. */
-	offloaded = rcu_segcblist_is_offloaded(&rdp->cblist);
+	offloaded = rcu_rdp_is_offloaded(rdp);
 	if ((offloaded || !rcu_accelerate_cbs(rnp, rdp)) && needgp) {
 		WRITE_ONCE(rcu_state.gp_flags, RCU_GP_FLAG_INIT);
 		WRITE_ONCE(rcu_state.gp_req_activity, jiffies);
@@ -2327,7 +2328,7 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
 	unsigned long flags;
 	unsigned long mask;
 	bool needwake = false;
-	const bool offloaded = rcu_segcblist_is_offloaded(&rdp->cblist);
+	const bool offloaded = rcu_rdp_is_offloaded(rdp);
 	struct rcu_node *rnp;
 
 	WARN_ON_ONCE(rdp->cpu != smp_processor_id());
@@ -2497,7 +2498,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	int div;
 	bool __maybe_unused empty;
 	unsigned long flags;
-	const bool offloaded = rcu_segcblist_is_offloaded(&rdp->cblist);
+	const bool offloaded = rcu_rdp_is_offloaded(rdp);
 	struct rcu_head *rhp;
 	struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
 	long bl, count = 0;
@@ -3066,7 +3067,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
 	trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
 
 	/* Go handle any RCU core processing required. */
-	if (unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
+	if (unlikely(rcu_rdp_is_offloaded(rdp))) {
 		__call_rcu_nocb_wake(rdp, was_alldone, flags); /* unlocks */
 	} else {
 		__call_rcu_core(rdp, head, flags);
@@ -3843,13 +3844,13 @@ static int rcu_pending(int user)
 		return 1;
 
 	/* Does this CPU have callbacks ready to invoke? */
-	if (!rcu_segcblist_is_offloaded(&rdp->cblist) &&
+	if (!rcu_rdp_is_offloaded(rdp) &&
 	    rcu_segcblist_ready_cbs(&rdp->cblist))
 		return 1;
 
 	/* Has RCU gone idle with this CPU needing another grace period? */
 	if (!gp_in_progress && rcu_segcblist_is_enabled(&rdp->cblist) &&
-	    !rcu_segcblist_is_offloaded(&rdp->cblist) &&
+	    !rcu_rdp_is_offloaded(rdp) &&
 	    !rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL))
 		return 1;
 
@@ -3968,7 +3969,7 @@ void rcu_barrier(void)
 	for_each_possible_cpu(cpu) {
 		rdp = per_cpu_ptr(&rcu_data, cpu);
 		if (cpu_is_offline(cpu) &&
-		    !rcu_segcblist_is_offloaded(&rdp->cblist))
+		    !rcu_rdp_is_offloaded(rdp))
 			continue;
 		if (rcu_segcblist_n_cbs(&rdp->cblist) && cpu_online(cpu)) {
 			rcu_barrier_trace(TPS("OnlineQ"), cpu,
@@ -4291,7 +4292,7 @@ void rcutree_migrate_callbacks(int cpu)
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	bool needwake;
 
-	if (rcu_segcblist_is_offloaded(&rdp->cblist) ||
+	if (rcu_rdp_is_offloaded(rdp) ||
 	    rcu_segcblist_empty(&rdp->cblist))
 		return;  /* No callbacks to migrate. */
 
@@ -4309,7 +4310,7 @@ void rcutree_migrate_callbacks(int cpu)
 	rcu_segcblist_disable(&rdp->cblist);
 	WARN_ON_ONCE(rcu_segcblist_empty(&my_rdp->cblist) !=
 		     !rcu_segcblist_n_cbs(&my_rdp->cblist));
-	if (rcu_segcblist_is_offloaded(&my_rdp->cblist)) {
+	if (rcu_rdp_is_offloaded(my_rdp)) {
 		raw_spin_unlock_rcu_node(my_rnp); /* irqs remain disabled. */
 		__call_rcu_nocb_wake(my_rdp, true, flags);
 	} else {
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 2d60377..cd513ea 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -16,8 +16,70 @@
 #ifdef CONFIG_RCU_NOCB_CPU
 static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */
 static bool __read_mostly rcu_nocb_poll;    /* Offload kthread are to poll. */
+static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
+{
+	return lockdep_is_held(&rdp->nocb_lock);
+}
+
+static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
+{
+	/* Race on early boot between thread creation and assignment */
+	if (!rdp->nocb_cb_kthread || !rdp->nocb_gp_kthread)
+		return true;
+
+	if (current == rdp->nocb_cb_kthread || current == rdp->nocb_gp_kthread)
+		if (in_task())
+			return true;
+	return false;
+}
+
+static inline bool rcu_running_nocb_timer(struct rcu_data *rdp)
+{
+	return (timer_curr_running(&rdp->nocb_timer) && !in_irq());
+}
+#else
+static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
+{
+	return 0;
+}
+
+static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
+{
+	return false;
+}
+
+static inline bool rcu_running_nocb_timer(struct rcu_data *rdp)
+{
+	return false;
+}
+
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
 
+static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
+{
+	/*
+	 * In order to read the offloaded state of an rdp in a safe
+	 * and stable way and prevent its value from being changed
+	 * under us, we must either hold the barrier mutex, the cpu
+	 * hotplug lock (read or write) or the nocb lock. Local
+	 * non-preemptible reads are also safe. NOCB kthreads and
+	 * timers have their own means of synchronization against the
+	 * offloaded state updaters.
+	 */
+	RCU_LOCKDEP_WARN(
+		!(lockdep_is_held(&rcu_state.barrier_mutex) ||
+		  (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) ||
+		  rcu_lockdep_is_held_nocb(rdp) ||
+		  (rdp == this_cpu_ptr(&rcu_data) &&
+		   !(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible())) ||
+		  rcu_current_is_nocb_kthread(rdp) ||
+		  rcu_running_nocb_timer(rdp)),
+		"Unsafe read of RCU_NOCB offloaded state"
+	);
+
+	return rcu_segcblist_is_offloaded(&rdp->cblist);
+}
+
 /*
  * Check the RCU kernel configuration parameters and print informative
  * messages about anything out of the ordinary.
@@ -1257,7 +1319,7 @@ int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 {
 	*nextevt = KTIME_MAX;
 	return !rcu_segcblist_empty(&this_cpu_ptr(&rcu_data)->cblist) &&
-	       !rcu_segcblist_is_offloaded(&this_cpu_ptr(&rcu_data)->cblist);
+		!rcu_rdp_is_offloaded(this_cpu_ptr(&rcu_data));
 }
 
 /*
@@ -1352,7 +1414,7 @@ int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 
 	/* If no non-offloaded callbacks, RCU doesn't need the CPU. */
 	if (rcu_segcblist_empty(&rdp->cblist) ||
-	    rcu_segcblist_is_offloaded(&this_cpu_ptr(&rcu_data)->cblist)) {
+	    rcu_rdp_is_offloaded(rdp)) {
 		*nextevt = KTIME_MAX;
 		return 0;
 	}
@@ -1388,7 +1450,7 @@ static void rcu_prepare_for_idle(void)
 	int tne;
 
 	lockdep_assert_irqs_disabled();
-	if (rcu_segcblist_is_offloaded(&rdp->cblist))
+	if (rcu_rdp_is_offloaded(rdp))
 		return;
 
 	/* Handle nohz enablement switches conservatively. */
@@ -1429,7 +1491,7 @@ static void rcu_cleanup_after_idle(void)
 	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 
 	lockdep_assert_irqs_disabled();
-	if (rcu_segcblist_is_offloaded(&rdp->cblist))
+	if (rcu_rdp_is_offloaded(rdp))
 		return;
 	if (rcu_try_advance_all_cbs())
 		invoke_rcu_core();
@@ -1560,7 +1622,7 @@ static void rcu_nocb_bypass_unlock(struct rcu_data *rdp)
 static void rcu_nocb_lock(struct rcu_data *rdp)
 {
 	lockdep_assert_irqs_disabled();
-	if (!rcu_segcblist_is_offloaded(&rdp->cblist))
+	if (!rcu_rdp_is_offloaded(rdp))
 		return;
 	raw_spin_lock(&rdp->nocb_lock);
 }
@@ -1571,7 +1633,7 @@ static void rcu_nocb_lock(struct rcu_data *rdp)
  */
 static void rcu_nocb_unlock(struct rcu_data *rdp)
 {
-	if (rcu_segcblist_is_offloaded(&rdp->cblist)) {
+	if (rcu_rdp_is_offloaded(rdp)) {
 		lockdep_assert_irqs_disabled();
 		raw_spin_unlock(&rdp->nocb_lock);
 	}
@@ -1584,7 +1646,7 @@ static void rcu_nocb_unlock(struct rcu_data *rdp)
 static void rcu_nocb_unlock_irqrestore(struct rcu_data *rdp,
 				       unsigned long flags)
 {
-	if (rcu_segcblist_is_offloaded(&rdp->cblist)) {
+	if (rcu_rdp_is_offloaded(rdp)) {
 		lockdep_assert_irqs_disabled();
 		raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 	} else {
@@ -1596,7 +1658,7 @@ static void rcu_nocb_unlock_irqrestore(struct rcu_data *rdp,
 static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
 {
 	lockdep_assert_irqs_disabled();
-	if (rcu_segcblist_is_offloaded(&rdp->cblist))
+	if (rcu_rdp_is_offloaded(rdp))
 		lockdep_assert_held(&rdp->nocb_lock);
 }
 
@@ -1690,7 +1752,7 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 {
 	struct rcu_cblist rcl;
 
-	WARN_ON_ONCE(!rcu_segcblist_is_offloaded(&rdp->cblist));
+	WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
 	rcu_lockdep_assert_cblist_protected(rdp);
 	lockdep_assert_held(&rdp->nocb_bypass_lock);
 	if (rhp && !rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
@@ -1718,7 +1780,7 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 				  unsigned long j)
 {
-	if (!rcu_segcblist_is_offloaded(&rdp->cblist))
+	if (!rcu_rdp_is_offloaded(rdp))
 		return true;
 	rcu_lockdep_assert_cblist_protected(rdp);
 	rcu_nocb_bypass_lock(rdp);
@@ -1732,7 +1794,7 @@ static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j)
 {
 	rcu_lockdep_assert_cblist_protected(rdp);
-	if (!rcu_segcblist_is_offloaded(&rdp->cblist) ||
+	if (!rcu_rdp_is_offloaded(rdp) ||
 	    !rcu_nocb_bypass_trylock(rdp))
 		return;
 	WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j));
@@ -1764,7 +1826,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 	unsigned long j = jiffies;
 	long ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
 
-	if (!rcu_segcblist_is_offloaded(&rdp->cblist)) {
+	if (!rcu_rdp_is_offloaded(rdp)) {
 		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
 		return false; /* Not offloaded, no bypassing. */
 	}
@@ -2397,7 +2459,7 @@ int rcu_nocb_cpu_deoffload(int cpu)
 	}
 	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
-	if (rcu_segcblist_is_offloaded(&rdp->cblist)) {
+	if (rcu_rdp_is_offloaded(rdp)) {
 		if (cpu_online(cpu))
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
 		else
@@ -2472,7 +2534,7 @@ int rcu_nocb_cpu_offload(int cpu)
 
 	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
-	if (!rcu_segcblist_is_offloaded(&rdp->cblist)) {
+	if (!rcu_rdp_is_offloaded(rdp)) {
 		if (cpu_online(cpu))
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
 		else
-- 
2.9.5



* [PATCH tip/core/rcu 02/12] timer: Report ignored local enqueue in nohz mode
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
  2021-03-04  0:23 ` [PATCH tip/core/rcu 01/12] rcu/nocb: Detect unsafe checks for offloaded rdp paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04 11:58   ` Rafael J. Wysocki
  2021-03-04  0:23 ` [PATCH tip/core/rcu 03/12] rcu/nocb: Comment the reason behind BH disablement on batch processing paulmck
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Rafael J . Wysocki, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

Enqueuing a local timer after the tick has been stopped will result in
the timer being ignored until the next random interrupt.

Perform sanity checks to report these situations.
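
For illustration only (not part of this patch), this is the kind of
pattern the new checks are meant to catch.  The example_ identifiers
are hypothetical:

	/* Hypothetical timer; its callback does nothing interesting. */
	static void example_timer_fn(struct timer_list *t) { }
	static DEFINE_TIMER(example_timer, example_timer_fn);

	static void example_late_local_enqueue(void)
	{
		/*
		 * A purely local enqueue on a CPU whose tick has already
		 * been stopped: nothing re-programs the tick, so this
		 * timer may not fire until some unrelated interrupt
		 * happens to arrive.
		 */
		mod_timer(&example_timer, jiffies + 1);
	}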

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/sched/core.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ca2bb62..4822371 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -674,6 +674,26 @@ int get_nohz_timer_target(void)
 	return cpu;
 }
 
+static void wake_idle_assert_possible(void)
+{
+#ifdef CONFIG_SCHED_DEBUG
+	/* Timers are re-evaluated after idle IRQs */
+	if (in_hardirq())
+		return;
+	/*
+	 * Same as hardirqs, assuming they are executing
+	 * on the IRQ tail.  Ksoftirqd shouldn't reach here
+	 * as the timer base wouldn't be idle.  And inline
+	 * softirq processing after a call to local_bh_enable()
+	 * within the idle loop sounds too fun to be considered here.
+	 */
+	if (in_serving_softirq())
+		return;
+
+	WARN_ON_ONCE("Late timer enqueue may be ignored\n");
+#endif
+}
+
 /*
  * When add_timer_on() enqueues a timer into the timer wheel of an
  * idle CPU then this timer might expire before the next timer event
@@ -688,8 +708,10 @@ static void wake_up_idle_cpu(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 
-	if (cpu == smp_processor_id())
+	if (cpu == smp_processor_id()) {
+		wake_idle_assert_possible();
 		return;
+	}
 
 	if (set_nr_and_not_polling(rq->idle))
 		smp_send_reschedule(cpu);
-- 
2.9.5



* [PATCH tip/core/rcu 03/12] rcu/nocb: Comment the reason behind BH disablement on batch processing
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
  2021-03-04  0:23 ` [PATCH tip/core/rcu 01/12] rcu/nocb: Detect unsafe checks for offloaded rdp paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 02/12] timer: Report ignored local enqueue in nohz mode paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 04/12] rcu/nocb: Forbid NOCB toggling on offline CPUs paulmck
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Neeraj Upadhyay, Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

This commit explains why softirqs need to be disabled while invoking
callbacks, even when callback processing has been offloaded.  After
all, invoking callbacks concurrently is one thing, but concurrently
invoking the same callback is quite another.
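
For illustration only (not part of this patch), consider a hypothetical
self-requeuing callback:

	static void example_requeue_cb(struct rcu_head *rhp)
	{
		/* ... some hypothetical per-grace-period work ... */
		call_rcu(rhp, example_requeue_cb);	/* requeue itself */
	}

While a CPU is transitioning to or from NOCB mode, one instance of such
a callback could be queued for softirq invocation while its requeued
instance becomes ready for the rcuo kthread.  Disabling BH around
rcu_do_batch() keeps the two from running concurrently on that CPU.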

Reported-by: Boqun Feng <boqun.feng@gmail.com>
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_plugin.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index cd513ea..013142d 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2235,6 +2235,12 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 	local_irq_save(flags);
 	rcu_momentary_dyntick_idle();
 	local_irq_restore(flags);
+	/*
+	 * Disable BH to provide the expected environment.  Also, when
+	 * transitioning to/from NOCB mode, a self-requeuing callback might
+	 * be invoked from softirq.  A short grace period could cause both
+	 * instances of this callback to execute concurrently.
+	 */
 	local_bh_disable();
 	rcu_do_batch(rdp);
 	local_bh_enable();
-- 
2.9.5



* [PATCH tip/core/rcu 04/12] rcu/nocb: Forbid NOCB toggling on offline CPUs
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
                   ` (2 preceding siblings ...)
  2021-03-04  0:23 ` [PATCH tip/core/rcu 03/12] rcu/nocb: Comment the reason behind BH disablement on batch processing paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 05/12] rcu/nocb: Avoid confusing double write of rdp->nocb_cb_sleep paulmck
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Neeraj Upadhyay, Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

It makes no sense to de-offload an offline CPU because that CPU will never
invoke any remaining callbacks.  It also makes little sense to offload an
offline CPU because any pending RCU callbacks were migrated when that CPU
went offline.  Yes, it is in theory possible to use a number of tricks
to permit offloading and deoffloading offline CPUs in certain cases, but
in practice it is far better to have the simple and deterministic rule
"Toggling the offload state of an offline CPU is forbidden".

For but one example, consider that an offloaded offline CPU might have
millions of callbacks queued.  Best to just say "no".

This commit therefore forbids toggling of the offloaded state of
offline CPUs.
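
For illustration only (not part of this patch), a caller attempting to
toggle an offline CPU now gets a clean error.  The example_ function and
the pr_info() text are hypothetical:

	static void example_toggle_offline(int cpu)
	{
		int ret;

		ret = rcu_nocb_cpu_deoffload(cpu);	/* cpu is currently offline */
		if (ret == -EINVAL)
			pr_info("CPU %d is offline, not de-offloading\n", cpu);
	}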

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c        |  3 +--
 kernel/rcu/tree_plugin.h | 57 ++++++++++++++++++------------------------------
 2 files changed, 22 insertions(+), 38 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 03503e2..ee77858 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4086,8 +4086,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
 	/*
 	 * Lock in case the CB/GP kthreads are still around handling
-	 * old callbacks (longer term we should flush all callbacks
-	 * before completing CPU offline)
+	 * old callbacks.
 	 */
 	rcu_nocb_lock(rdp);
 	if (rcu_segcblist_empty(&rdp->cblist)) /* No early-boot CBs? */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 013142d..9fd8588 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2399,23 +2399,18 @@ static int rdp_offload_toggle(struct rcu_data *rdp,
 	return 0;
 }
 
-static int __rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
+static long rcu_nocb_rdp_deoffload(void *arg)
 {
+	struct rcu_data *rdp = arg;
 	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
 	int ret;
 
+	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
+
 	pr_info("De-offloading %d\n", rdp->cpu);
 
 	rcu_nocb_lock_irqsave(rdp, flags);
-	/*
-	 * If there are still pending work offloaded, the offline
-	 * CPU won't help much handling them.
-	 */
-	if (cpu_is_offline(rdp->cpu) && !rcu_segcblist_empty(&rdp->cblist)) {
-		rcu_nocb_unlock_irqrestore(rdp, flags);
-		return -EBUSY;
-	}
 
 	ret = rdp_offload_toggle(rdp, false, flags);
 	swait_event_exclusive(rdp->nocb_state_wq,
@@ -2446,14 +2441,6 @@ static int __rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
 	return ret;
 }
 
-static long rcu_nocb_rdp_deoffload(void *arg)
-{
-	struct rcu_data *rdp = arg;
-
-	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
-	return __rcu_nocb_rdp_deoffload(rdp);
-}
-
 int rcu_nocb_cpu_deoffload(int cpu)
 {
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
@@ -2466,12 +2453,14 @@ int rcu_nocb_cpu_deoffload(int cpu)
 	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
 	if (rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu))
+		if (cpu_online(cpu)) {
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
-		else
-			ret = __rcu_nocb_rdp_deoffload(rdp);
-		if (!ret)
-			cpumask_clear_cpu(cpu, rcu_nocb_mask);
+			if (!ret)
+				cpumask_clear_cpu(cpu, rcu_nocb_mask);
+		} else {
+			pr_info("NOCB: Can't CB-deoffload an offline CPU\n");
+			ret = -EINVAL;
+		}
 	}
 	cpus_read_unlock();
 	mutex_unlock(&rcu_state.barrier_mutex);
@@ -2480,12 +2469,14 @@ int rcu_nocb_cpu_deoffload(int cpu)
 }
 EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload);
 
-static int __rcu_nocb_rdp_offload(struct rcu_data *rdp)
+static long rcu_nocb_rdp_offload(void *arg)
 {
+	struct rcu_data *rdp = arg;
 	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
 	int ret;
 
+	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
 	/*
 	 * For now we only support re-offload, ie: the rdp must have been
 	 * offloaded on boot first.
@@ -2525,14 +2516,6 @@ static int __rcu_nocb_rdp_offload(struct rcu_data *rdp)
 	return ret;
 }
 
-static long rcu_nocb_rdp_offload(void *arg)
-{
-	struct rcu_data *rdp = arg;
-
-	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
-	return __rcu_nocb_rdp_offload(rdp);
-}
-
 int rcu_nocb_cpu_offload(int cpu)
 {
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
@@ -2541,12 +2524,14 @@ int rcu_nocb_cpu_offload(int cpu)
 	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
 	if (!rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu))
+		if (cpu_online(cpu)) {
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
-		else
-			ret = __rcu_nocb_rdp_offload(rdp);
-		if (!ret)
-			cpumask_set_cpu(cpu, rcu_nocb_mask);
+			if (!ret)
+				cpumask_set_cpu(cpu, rcu_nocb_mask);
+		} else {
+			pr_info("NOCB: Can't CB-offload an offline CPU\n");
+			ret = -EINVAL;
+		}
 	}
 	cpus_read_unlock();
 	mutex_unlock(&rcu_state.barrier_mutex);
-- 
2.9.5



* [PATCH tip/core/rcu 05/12] rcu/nocb: Avoid confusing double write of rdp->nocb_cb_sleep
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
                   ` (3 preceding siblings ...)
  2021-03-04  0:23 ` [PATCH tip/core/rcu 04/12] rcu/nocb: Forbid NOCB toggling on offline CPUs paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 06/12] rcu/nocb: Only (re-)initialize segcblist when needed on CPU up paulmck
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Neeraj Upadhyay, Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

The nocb_cb_wait() function first sets the rdp->nocb_cb_sleep flag to
true after invoking the callbacks, and then sets it back to false if
it finds more callbacks that are ready to invoke.

This is confusing and will become unsafe if this flag is ever read
locklessly.  This commit therefore writes it only once, based on the
state after both callback invocation and checking.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_plugin.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 9fd8588..6a7f77d 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2230,6 +2230,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 	unsigned long flags;
 	bool needwake_state = false;
 	bool needwake_gp = false;
+	bool can_sleep = true;
 	struct rcu_node *rnp = rdp->mynode;
 
 	local_irq_save(flags);
@@ -2253,8 +2254,6 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 		raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
 	}
 
-	WRITE_ONCE(rdp->nocb_cb_sleep, true);
-
 	if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
 		if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB)) {
 			rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_CB);
@@ -2262,7 +2261,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 				needwake_state = true;
 		}
 		if (rcu_segcblist_ready_cbs(cblist))
-			WRITE_ONCE(rdp->nocb_cb_sleep, false);
+			can_sleep = false;
 	} else {
 		/*
 		 * De-offloading. Clear our flag and notify the de-offload worker.
@@ -2275,6 +2274,8 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 			needwake_state = true;
 	}
 
+	WRITE_ONCE(rdp->nocb_cb_sleep, can_sleep);
+
 	if (rdp->nocb_cb_sleep)
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("CBSleep"));
 
-- 
2.9.5



* [PATCH tip/core/rcu 06/12] rcu/nocb: Only (re-)initialize segcblist when needed on CPU up
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
                   ` (4 preceding siblings ...)
  2021-03-04  0:23 ` [PATCH tip/core/rcu 05/12] rcu/nocb: Avoid confusing double write of rdp->nocb_cb_sleep paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 07/12] rcu/nocb: Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading paulmck
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Neeraj Upadhyay, Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

At the start of a CPU-hotplug operation, the incoming CPU's callback
list can be in a number of states:

1.	Disabled and empty.  This is the case when the boot CPU has
	not invoked call_rcu(), when a non-boot CPU first comes online,
	and when a non-offloaded CPU comes back online.  In this case,
	it is both necessary and permissible to initialize ->cblist.
	Because either the CPU is currently running with interrupts
	disabled (boot CPU) or is not yet running at all (other CPUs),
	it is not necessary to acquire ->nocb_lock.

	In this case, initialization is required.

2.	Disabled and non-empty.  This cannot occur, because early boot
	call_rcu() invocations enable the callback list before enqueuing
	their callback.

3.	Enabled, whether empty or not.	In this case, the callback
	list has already been initialized.  This case occurs when the
	boot CPU has executed an early boot call_rcu() and also when
	an offloaded CPU comes back online.  In both cases, there is
	no need to initialize the callback list: In the boot-CPU case,
	the CPU has not (yet) gone offline, and in the offloaded case,
	the rcuo kthreads are taking care of business.

	Because it is not necessary to initialize the callback list,
	it is also not necessary to acquire ->nocb_lock.

Therefore, checking if the segcblist is enabled suffices.  This commit
therefore initializes the callback list at rcutree_prepare_cpu() time
only if that list is disabled.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index ee77858..402ea36 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4084,14 +4084,13 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	rdp->dynticks_nesting = 1;	/* CPU not up, no tearing. */
 	rcu_dynticks_eqs_online();
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
+
 	/*
-	 * Lock in case the CB/GP kthreads are still around handling
-	 * old callbacks.
+	 * Only non-NOCB CPUs that didn't have early-boot callbacks need to be
+	 * (re-)initialized.
 	 */
-	rcu_nocb_lock(rdp);
-	if (rcu_segcblist_empty(&rdp->cblist)) /* No early-boot CBs? */
+	if (!rcu_segcblist_is_enabled(&rdp->cblist))
 		rcu_segcblist_init(&rdp->cblist);  /* Re-enable callbacks. */
-	rcu_nocb_unlock(rdp);
 
 	/*
 	 * Add CPU to leaf rcu_node pending-online bitmask.  Any needed
-- 
2.9.5



* [PATCH tip/core/rcu 07/12] rcu/nocb: Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
                   ` (5 preceding siblings ...)
  2021-03-04  0:23 ` [PATCH tip/core/rcu 06/12] rcu/nocb: Only (re-)initialize segcblist when needed on CPU up paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 08/12] rcu: Make nocb_nobypass_lim_per_jiffy static paulmck
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Neeraj Upadhyay, Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

The name nocb_gp_update_state() is unenlightening, so this commit changes
it to nocb_gp_update_state_deoffloading().  This function now does what
its name says: it updates state and returns true if the CPU corresponding
to the specified rcu_data structure is in the process of being de-offloaded.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_plugin.h | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 6a7f77d..93d3938 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2016,7 +2016,8 @@ static inline bool nocb_gp_enabled_cb(struct rcu_data *rdp)
 	return rcu_segcblist_test_flags(&rdp->cblist, flags);
 }
 
-static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_state)
+static inline bool nocb_gp_update_state_deoffloading(struct rcu_data *rdp,
+						     bool *needwake_state)
 {
 	struct rcu_segcblist *cblist = &rdp->cblist;
 
@@ -2026,7 +2027,7 @@ static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_sta
 			if (rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
 				*needwake_state = true;
 		}
-		return true;
+		return false;
 	}
 
 	/*
@@ -2037,7 +2038,7 @@ static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_sta
 	rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP);
 	if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
 		*needwake_state = true;
-	return false;
+	return true;
 }
 
 
@@ -2075,7 +2076,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 			continue;
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
 		rcu_nocb_lock_irqsave(rdp, flags);
-		if (!nocb_gp_update_state(rdp, &needwake_state)) {
+		if (nocb_gp_update_state_deoffloading(rdp, &needwake_state)) {
 			rcu_nocb_unlock_irqrestore(rdp, flags);
 			if (needwake_state)
 				swake_up_one(&rdp->nocb_state_wq);
-- 
2.9.5



* [PATCH tip/core/rcu 08/12] rcu: Make nocb_nobypass_lim_per_jiffy static
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
                   ` (6 preceding siblings ...)
  2021-03-04  0:23 ` [PATCH tip/core/rcu 07/12] rcu/nocb: Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 09/12] rcu/nocb: Fix missed nocb_timer requeue paulmck
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Jiapeng Chong, Paul E . McKenney

From: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>

RCU triggers the following sparse warning:

kernel/rcu/tree_plugin.h:1497:5: warning: symbol
'nocb_nobypass_lim_per_jiffy' was not declared. Should it be static?

This commit therefore makes this variable static.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_plugin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 93d3938..a1a17ad 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1556,7 +1556,7 @@ early_param("rcu_nocb_poll", parse_rcu_nocb_poll);
  * After all, the main point of bypassing is to avoid lock contention
  * on ->nocb_lock, which only can happen at high call_rcu() rates.
  */
-int nocb_nobypass_lim_per_jiffy = 16 * 1000 / HZ;
+static int nocb_nobypass_lim_per_jiffy = 16 * 1000 / HZ;
 module_param(nocb_nobypass_lim_per_jiffy, int, 0);
 
 /*
-- 
2.9.5



* [PATCH tip/core/rcu 09/12] rcu/nocb: Fix missed nocb_timer requeue
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
                   ` (7 preceding siblings ...)
  2021-03-04  0:23 ` [PATCH tip/core/rcu 08/12] rcu: Make nocb_nobypass_lim_per_jiffy static paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 10/12] rcu/nocb: Disable bypass when CPU isn't completely offloaded paulmck
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker, stable,
	Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

This sequence of events can lead to a failure to requeue a CPU's
->nocb_timer:

1.	There are no callbacks queued for any CPU covered by CPU 0-2's
	->nocb_gp_kthread.  Note that ->nocb_gp_kthread is associated
	with CPU 0.

2.	CPU 1 enqueues its first callback with interrupts disabled, and
	thus must defer awakening its ->nocb_gp_kthread.  It therefore
	queues its rcu_data structure's ->nocb_timer.  At this point,
	CPU 1's rdp->nocb_defer_wakeup is RCU_NOCB_WAKE.

3.	CPU 2, which shares the same ->nocb_gp_kthread, also enqueues a
	callback, but with interrupts enabled, allowing it to directly
	awaken the ->nocb_gp_kthread.

4.	The newly awakened ->nocb_gp_kthread associates both CPU 1's
	and CPU 2's callbacks with a future grace period and arranges
	for that grace period to be started.

5.	This ->nocb_gp_kthread goes to sleep waiting for the end of this
	future grace period.

6.	This grace period elapses before CPU 1's timer fires.  This
	is normally improbable given that the timer is set for only
	one jiffy, but timers can be delayed.  Besides, it is possible
	that the kernel was built with CONFIG_RCU_STRICT_GRACE_PERIOD=y.

7.	The grace period ends, so rcu_gp_kthread awakens the
	->nocb_gp_kthread, which in turn awakens both CPU 1's and
	CPU 2's ->nocb_cb_kthread.  Then ->nocb_gp_kthread sleeps
	waiting for more newly queued callbacks.

8.	CPU 1's ->nocb_cb_kthread invokes its callback, then sleeps
	waiting for more invocable callbacks.

9.	Note that neither kthread updated any ->nocb_timer state,
	so CPU 1's ->nocb_defer_wakeup is still set to RCU_NOCB_WAKE.

10.	CPU 1 enqueues its second callback, this time with interrupts
	enabled so it can directly wake its ->nocb_gp_kthread.  It does
	so by calling wake_nocb_gp(), which also cancels the pending
	timer that was queued in step 2.  But this does not reset CPU 1's
	->nocb_defer_wakeup, which is still set to RCU_NOCB_WAKE.  So
	CPU 1's ->nocb_defer_wakeup and its ->nocb_timer are now
	desynchronized.

11.	->nocb_gp_kthread associates the callback queued in step 10 with
	a new grace period, arranges for that grace period to start, and
	sleeps waiting for it to complete.

12.	The grace period ends, so rcu_gp_kthread awakens ->nocb_gp_kthread,
	which in turn wakes up CPU 1's ->nocb_cb_kthread, which then
	invokes the callback queued in step 10.

13.	CPU 1 enqueues its third callback, this time with interrupts
	disabled so it must queue a timer for a deferred wakeup.  However,
	the value of its ->nocb_defer_wakeup is RCU_NOCB_WAKE, which
	incorrectly indicates that a timer is already queued.  Instead,
	CPU 1's ->nocb_timer was cancelled in step 10.  CPU 1 therefore
	fails to queue the ->nocb_timer.

14.	CPU 1 now has a pending callback that may go unnoticed until
	some other CPU eventually wakes up ->nocb_gp_kthread or until
	CPU 1 invokes an explicit deferred wakeup, for example, during
	idle entry.

This commit fixes this bug by resetting rdp->nocb_defer_wakeup every
time the ->nocb_timer is deleted.

It is quite possible that there is a similar scenario involving
->nocb_bypass_timer and ->nocb_defer_wakeup.  However, despite some
effort from several people, a failure scenario has not yet been located.
That by no means guarantees that no such scenario exists.  Finding a
failure scenario is left as an exercise for the reader, and the "Fixes:"
tag below relates to ->nocb_bypass_timer instead of ->nocb_timer.

Fixes: d1b222c6be1f ("rcu/nocb: Add bypass callback queueing")
Cc: <stable@vger.kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_plugin.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index a1a17ad..e392bd1 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1708,7 +1708,11 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
 		rcu_nocb_unlock_irqrestore(rdp, flags);
 		return false;
 	}
-	del_timer(&rdp->nocb_timer);
+
+	if (READ_ONCE(rdp->nocb_defer_wakeup) > RCU_NOCB_WAKE_NOT) {
+		WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
+		del_timer(&rdp->nocb_timer);
+	}
 	rcu_nocb_unlock_irqrestore(rdp, flags);
 	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
 	if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) {
@@ -2335,7 +2339,6 @@ static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
 		return false;
 	}
 	ndw = READ_ONCE(rdp->nocb_defer_wakeup);
-	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
 	ret = wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
 	trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake"));
 
-- 
2.9.5



* [PATCH tip/core/rcu 10/12] rcu/nocb: Disable bypass when CPU isn't completely offloaded
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
                   ` (8 preceding siblings ...)
  2021-03-04  0:23 ` [PATCH tip/core/rcu 09/12] rcu/nocb: Fix missed nocb_timer requeue paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 11/12] rcu/nocb: Remove stale comment above rcu_segcblist_offload() paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 12/12] rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible paulmck
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Neeraj Upadhyay, Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

Currently, the bypass is flushed at the very last moment in the
deoffloading procedure.  However, this approach leads to a larger state
space than would be preferred.  This commit therefore disables the
bypass as soon as the deoffloading procedure begins, then flushes it.
This guarantees that the bypass remains empty and thus out of the way
of the deoffloading procedure.

Symmetrically, this commit waits to enable the bypass until the offloading
procedure has completed.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/rcu_segcblist.h |  7 ++++---
 kernel/rcu/tree_plugin.h      | 38 +++++++++++++++++++++++++++++---------
 2 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
index 8afe886..3db96c4 100644
--- a/include/linux/rcu_segcblist.h
+++ b/include/linux/rcu_segcblist.h
@@ -109,7 +109,7 @@ struct rcu_cblist {
  *  |                           SEGCBLIST_KTHREAD_GP                           |
  *  |                                                                          |
  *  |   Kthreads handle callbacks holding nocb_lock, local rcu_core() stops    |
- *  |   handling callbacks.                                                    |
+ *  |   handling callbacks. Enable bypass queueing.                            |
  *  ----------------------------------------------------------------------------
  */
 
@@ -125,7 +125,7 @@ struct rcu_cblist {
  *  |                           SEGCBLIST_KTHREAD_GP                           |
  *  |                                                                          |
  *  |   CB/GP kthreads handle callbacks holding nocb_lock, local rcu_core()    |
- *  |   ignores callbacks.                                                     |
+ *  |   ignores callbacks. Bypass enqueue is enabled.                          |
  *  ----------------------------------------------------------------------------
  *                                      |
  *                                      v
@@ -134,7 +134,8 @@ struct rcu_cblist {
  *  |                           SEGCBLIST_KTHREAD_GP                           |
  *  |                                                                          |
  *  |   CB/GP kthreads and local rcu_core() handle callbacks concurrently      |
- *  |   holding nocb_lock. Wake up CB and GP kthreads if necessary.            |
+ *  |   holding nocb_lock. Wake up CB and GP kthreads if necessary. Disable    |
+ *  |   bypass enqueue.                                                        |
  *  ----------------------------------------------------------------------------
  *                                      |
  *                                      v
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index e392bd1..b08564b 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1830,11 +1830,22 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 	unsigned long j = jiffies;
 	long ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
 
+	lockdep_assert_irqs_disabled();
+
+	// Pure softirq/rcuc based processing: no bypassing, no
+	// locking.
 	if (!rcu_rdp_is_offloaded(rdp)) {
 		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+		return false;
+	}
+
+	// In the process of (de-)offloading: no bypassing, but
+	// locking.
+	if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
+		rcu_nocb_lock(rdp);
+		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
 		return false; /* Not offloaded, no bypassing. */
 	}
-	lockdep_assert_irqs_disabled();
 
 	// Don't use ->nocb_bypass during early boot.
 	if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) {
@@ -2416,7 +2427,16 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	pr_info("De-offloading %d\n", rdp->cpu);
 
 	rcu_nocb_lock_irqsave(rdp, flags);
-
+	/*
+	 * Flush once and for all now. This suffices because we are
+	 * running on the target CPU holding ->nocb_lock (thus having
+	 * interrupts disabled), and because rdp_offload_toggle()
+	 * invokes rcu_segcblist_offload(), which clears SEGCBLIST_OFFLOADED.
+	 * Thus future calls to rcu_segcblist_completely_offloaded() will
+	 * return false, which means that future calls to rcu_nocb_try_bypass()
+	 * will refuse to put anything into the bypass.
+	 */
+	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
 	ret = rdp_offload_toggle(rdp, false, flags);
 	swait_event_exclusive(rdp->nocb_state_wq,
 			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
@@ -2428,21 +2448,21 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	del_timer_sync(&rdp->nocb_timer);
 
 	/*
-	 * Flush bypass. While IRQs are disabled and once we set
-	 * SEGCBLIST_SOFTIRQ_ONLY, no callback is supposed to be
-	 * enqueued on bypass.
+	 * Theoretically we could set SEGCBLIST_SOFTIRQ_ONLY with CB unlocked
+	 * and IRQs disabled but let's be paranoid.
 	 */
 	rcu_nocb_lock_irqsave(rdp, flags);
-	rcu_nocb_flush_bypass(rdp, NULL, jiffies);
 	rcu_segcblist_set_flags(cblist, SEGCBLIST_SOFTIRQ_ONLY);
 	/*
 	 * With SEGCBLIST_SOFTIRQ_ONLY, we can't use
-	 * rcu_nocb_unlock_irqrestore() anymore. Theoretically we
-	 * could set SEGCBLIST_SOFTIRQ_ONLY with cb unlocked and IRQs
-	 * disabled now, but let's be paranoid.
+	 * rcu_nocb_unlock_irqrestore() anymore.
 	 */
 	raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 
+	/* Sanity check */
+	WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
+
+
 	return ret;
 }
 
-- 
2.9.5



* [PATCH tip/core/rcu 11/12] rcu/nocb: Remove stale comment above rcu_segcblist_offload()
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
                   ` (9 preceding siblings ...)
  2021-03-04  0:23 ` [PATCH tip/core/rcu 10/12] rcu/nocb: Disable bypass when CPU isn't completely offloaded paulmck
@ 2021-03-04  0:23 ` paulmck
  2021-03-04  0:23 ` [PATCH tip/core/rcu 12/12] rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible paulmck
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Neeraj Upadhyay, Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

This commit removes a stale comment claiming that the cblist must be
empty before changing the offloading state.  This claim was correct back
when the offloaded state was defined exclusively at boot.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/rcu_segcblist.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 7f181c9..aaa1112 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -261,8 +261,7 @@ void rcu_segcblist_disable(struct rcu_segcblist *rsclp)
 }
 
 /*
- * Mark the specified rcu_segcblist structure as offloaded.  This
- * structure must be empty.
+ * Mark the specified rcu_segcblist structure as offloaded.
  */
 void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload)
 {
-- 
2.9.5



* [PATCH tip/core/rcu 12/12] rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible
  2021-03-04  0:22 [PATCH tip/core/rcu 0/12] NOCB updates for v5.13 Paul E. McKenney
                   ` (10 preceding siblings ...)
  2021-03-04  0:23 ` [PATCH tip/core/rcu 11/12] rcu/nocb: Remove stale comment above rcu_segcblist_offload() paulmck
@ 2021-03-04  0:23 ` paulmck
  11 siblings, 0 replies; 15+ messages in thread
From: paulmck @ 2021-03-04  0:23 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Neeraj Upadhyay, Boqun Feng, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

Those tracing calls don't need to be under ->nocb_lock.  This commit
therefore moves them outside of that lock.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_plugin.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index b08564b..9846c8a 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1703,9 +1703,9 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
 
 	lockdep_assert_held(&rdp->nocb_lock);
 	if (!READ_ONCE(rdp_gp->nocb_gp_kthread)) {
+		rcu_nocb_unlock_irqrestore(rdp, flags);
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 				    TPS("AlreadyAwake"));
-		rcu_nocb_unlock_irqrestore(rdp, flags);
 		return false;
 	}
 
@@ -1955,9 +1955,9 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 	// If we are being polled or there is no kthread, just leave.
 	t = READ_ONCE(rdp->nocb_gp_kthread);
 	if (rcu_nocb_poll || !t) {
+		rcu_nocb_unlock_irqrestore(rdp, flags);
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 				    TPS("WakeNotPoll"));
-		rcu_nocb_unlock_irqrestore(rdp, flags);
 		return;
 	}
 	// Need to actually to a wakeup.
@@ -1992,8 +1992,8 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 					   TPS("WakeOvfIsDeferred"));
 		rcu_nocb_unlock_irqrestore(rdp, flags);
 	} else {
-		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
 		rcu_nocb_unlock_irqrestore(rdp, flags);
+		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
 	}
 	return;
 }
-- 
2.9.5



* Re: [PATCH tip/core/rcu 02/12] timer: Report ignored local enqueue in nohz mode
  2021-03-04  0:23 ` [PATCH tip/core/rcu 02/12] timer: Report ignored local enqueue in nohz mode paulmck
@ 2021-03-04 11:58   ` Rafael J. Wysocki
  2021-03-04 20:06     ` Paul E. McKenney
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2021-03-04 11:58 UTC (permalink / raw)
  To: paulmck, rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker

On 3/4/2021 1:23 AM, paulmck@kernel.org wrote:
> From: Frederic Weisbecker <frederic@kernel.org>
>
> Enqueuing a local timer after the tick has been stopped will result in
> the timer being ignored until the next random interrupt.
>
> Perform sanity checks to report these situations.
>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


> ---
>   kernel/sched/core.c | 24 +++++++++++++++++++++++-
>   1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ca2bb62..4822371 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -674,6 +674,26 @@ int get_nohz_timer_target(void)
>   	return cpu;
>   }
>   
> +static void wake_idle_assert_possible(void)
> +{
> +#ifdef CONFIG_SCHED_DEBUG
> +	/* Timers are re-evaluated after idle IRQs */
> +	if (in_hardirq())
> +		return;
> +	/*
> +	 * Same as hardirqs, assuming they are executing
> +	 * on the IRQ tail.  Ksoftirqd shouldn't reach here
> +	 * as the timer base wouldn't be idle.  And inline
> +	 * softirq processing after a call to local_bh_enable()
> +	 * within the idle loop sounds too fun to be considered here.
> +	 */
> +	if (in_serving_softirq())
> +		return;
> +
> +	WARN_ON_ONCE("Late timer enqueue may be ignored\n");
> +#endif
> +}
> +
>   /*
>    * When add_timer_on() enqueues a timer into the timer wheel of an
>    * idle CPU then this timer might expire before the next timer event
> @@ -688,8 +708,10 @@ static void wake_up_idle_cpu(int cpu)
>   {
>   	struct rq *rq = cpu_rq(cpu);
>   
> -	if (cpu == smp_processor_id())
> +	if (cpu == smp_processor_id()) {
> +		wake_idle_assert_possible();
>   		return;
> +	}
>   
>   	if (set_nr_and_not_polling(rq->idle))
>   		smp_send_reschedule(cpu);




* Re: [PATCH tip/core/rcu 02/12] timer: Report ignored local enqueue in nohz mode
  2021-03-04 11:58   ` Rafael J. Wysocki
@ 2021-03-04 20:06     ` Paul E. McKenney
  0 siblings, 0 replies; 15+ messages in thread
From: Paul E. McKenney @ 2021-03-04 20:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker

On Thu, Mar 04, 2021 at 12:58:54PM +0100, Rafael J. Wysocki wrote:
> On 3/4/2021 1:23 AM, paulmck@kernel.org wrote:
> > From: Frederic Weisbecker <frederic@kernel.org>
> > 
> > Enqueuing a local timer after the tick has been stopped will result in
> > the timer being ignored until the next random interrupt.
> > 
> > Perform sanity checks to report these situations.
> > 
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Applied, thank you!

							Thanx, Paul

> > ---
> >   kernel/sched/core.c | 24 +++++++++++++++++++++++-
> >   1 file changed, 23 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index ca2bb62..4822371 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -674,6 +674,26 @@ int get_nohz_timer_target(void)
> >   	return cpu;
> >   }
> > +static void wake_idle_assert_possible(void)
> > +{
> > +#ifdef CONFIG_SCHED_DEBUG
> > +	/* Timers are re-evaluated after idle IRQs */
> > +	if (in_hardirq())
> > +		return;
> > +	/*
> > +	 * Same as hardirqs, assuming they are executing
> > +	 * on the IRQ tail.  Ksoftirqd shouldn't reach here
> > +	 * as the timer base wouldn't be idle.  And inline
> > +	 * softirq processing after a call to local_bh_enable()
> > +	 * within the idle loop sounds too fun to be considered here.
> > +	 */
> > +	if (in_serving_softirq())
> > +		return;
> > +
> > +	WARN_ON_ONCE("Late timer enqueue may be ignored\n");
> > +#endif
> > +}
> > +
> >   /*
> >    * When add_timer_on() enqueues a timer into the timer wheel of an
> >    * idle CPU then this timer might expire before the next timer event
> > @@ -688,8 +708,10 @@ static void wake_up_idle_cpu(int cpu)
> >   {
> >   	struct rq *rq = cpu_rq(cpu);
> > -	if (cpu == smp_processor_id())
> > +	if (cpu == smp_processor_id()) {
> > +		wake_idle_assert_possible();
> >   		return;
> > +	}
> >   	if (set_nr_and_not_polling(rq->idle))
> >   		smp_send_reschedule(cpu);
> 
> 

