linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/16] rcu/nocb updates
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

So this set has grown bigger than I expected.

This addresses most reviews from Paul and also consolidates the nocb
timers code.

Please note that the very first patch is a stable bugfix.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	rcu/dev

HEAD: 75991420c246c26f598602da1a70947b5bdf77b6

Thanks,
	Frederic
---

Frederic Weisbecker (16):
      rcu/nocb: Fix potential missed nocb_timer rearm
      rcu/nocb: Comment the reason behind BH disablement on batch processing
      rcu/nocb: Forbid NOCB toggling on offline CPUs
      rcu/nocb: Only (re-)initialize segcblist when needed on CPU up
      rcu/nocb: Disable bypass when CPU isn't completely offloaded
      rcu/nocb: Avoid confusing double write of rdp->nocb_cb_sleep
      rcu/nocb: Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading
      rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible
      rcu/nocb: Merge nocb_timer to the rdp leader
      rcu/nocb: Directly call __wake_nocb_gp() from bypass timer
      rcu/nocb: Allow de-offloading rdp leader
      rcu/nocb: Cancel nocb_timer upon nocb_gp wakeup
      rcu/nocb: Delete bypass_timer upon nocb_gp wakeup
      rcu/nocb: Only cancel nocb timer if not polling
      rcu/nocb: Prepare for finegrained deferred wakeup
      rcu/nocb: Unify timers


 include/linux/rcu_segcblist.h |   7 +-
 include/trace/events/rcu.h    |   1 +
 kernel/rcu/tree.c             |  12 +-
 kernel/rcu/tree.h             |   9 +-
 kernel/rcu/tree_plugin.h      | 280 ++++++++++++++++++++++--------------------
 5 files changed, 163 insertions(+), 146 deletions(-)

* [PATCH 01/16] rcu/nocb: Fix potential missed nocb_timer rearm
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

The "nocb_bypass_timer" ends up calling wake_nocb_gp() which deletes
the pending "nocb_timer" (note they are not the same timers) for the
given rdp without resetting the matching state stored in nocb_defer
wakeup.

As a result, a future call_rcu() on that rdp may be fooled and think the
timer is armed when it's not, missing a deferred nocb_gp wakeup.

Fix this with resetting rdp->nocb_defer_wakeup when we disarm the timer.
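
For illustration, a simplified sketch of the race being fixed (not part
of the patch; function names follow the kernel source):

	nocb_bypass_timer fires:
	  wake_nocb_gp()
	    del_timer(&rdp->nocb_timer)
	    // rdp->nocb_defer_wakeup still reads RCU_NOCB_WAKE

	later, on the same rdp:
	  call_rcu()
	    __call_rcu_nocb_wake()
	      wake_nocb_gp_defer()
	        // nocb_defer_wakeup != RCU_NOCB_WAKE_NOT, so the
	        // mod_timer() rearm is skipped: the deferred wakeup
	        // now relies on a timer that is no longer pending

With the fix, nocb_defer_wakeup is reset to RCU_NOCB_WAKE_NOT before
del_timer(), so the next wake_nocb_gp_defer() rearms the timer as
expected.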

Fixes: d1b222c6be1f ("rcu/nocb: Add bypass callback queueing")
Cc: Stable <stable@vger.kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree_plugin.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 7e33dae0e6ee..a44f80d7661b 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1705,6 +1705,8 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
 		rcu_nocb_unlock_irqrestore(rdp, flags);
 		return false;
 	}
+
+	rdp->nocb_defer_wakeup = RCU_NOCB_WAKE_NOT;
 	del_timer(&rdp->nocb_timer);
 	rcu_nocb_unlock_irqrestore(rdp, flags);
 	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
-- 
2.25.1


* [PATCH 02/16] rcu/nocb: Comment the reason behind BH disablement on batch processing
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

Explain why we need to disable softirqs while processing callbacks in
an offloaded fashion. This subtle reason should not be forgotten.
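
A rough sketch of the scenario that the new comment guards against
(illustrative only; it assumes a CPU midway through an offloading
transition, where both the rcuo kthread and rcu_core() may run
callbacks):

	rcuo kthread                        RCU softirq on the same CPU
	----------------------------        ----------------------------
	rcu_do_batch(rdp)
	  invokes callback A, which
	  requeues itself via call_rcu()
	  <softirq interrupts> -----------> rcu_core()
	                                      rcu_do_batch(rdp)
	                                        may invoke callback A again
	                                        while the first invocation
	                                        is still on the stack

Disabling BH around the kthread's rcu_do_batch() call prevents the
softirq from running on that CPU in the middle of the batch.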

Reported-by: Boqun Feng <boqun.feng@gmail.com>
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree_plugin.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index a44f80d7661b..dcfae03eb9e9 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2235,6 +2235,12 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 	local_irq_save(flags);
 	rcu_momentary_dyntick_idle();
 	local_irq_restore(flags);
+	/*
+	 * While transitioning to/from NOCB mode, a CPU might execute the same
+	 * callback concurrently if it requeues itself and the softirq interrupts
+	 * the offloaded callback processing. Make sure to disable BH to
+	 * prevent that.
+	 */
 	local_bh_disable();
 	rcu_do_batch(rdp);
 	local_bh_enable();
-- 
2.25.1


* [PATCH 03/16] rcu/nocb: Forbid NOCB toggling on offline CPUs
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

Toggling the NOCB state of a CPU while it is offline implies specific
issues to handle, especially making sure that the kthreads have handled
all the remaining callbacks and the bypass queue before the
corresponding CPU can be set as non-offloaded while it is offline.

To avoid such complications, simply forbid offline CPUs from performing
NOCB-mode toggling. It's a simple rule to observe, and after all it
doesn't make much sense to switch a non-working CPU to/from the
offloaded state.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree.c        |  3 +--
 kernel/rcu/tree_plugin.h | 57 +++++++++++++++-------------------------
 2 files changed, 22 insertions(+), 38 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e7a226abff0d..4c5a1ac54fa6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4068,8 +4068,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
 	/*
 	 * Lock in case the CB/GP kthreads are still around handling
-	 * old callbacks (longer term we should flush all callbacks
-	 * before completing CPU offline)
+	 * old callbacks.
 	 */
 	rcu_nocb_lock(rdp);
 	if (rcu_segcblist_empty(&rdp->cblist)) /* No early-boot CBs? */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index dcfae03eb9e9..732942a15f2b 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2399,23 +2399,18 @@ static int rdp_offload_toggle(struct rcu_data *rdp,
 	return 0;
 }
 
-static int __rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
+static long rcu_nocb_rdp_deoffload(void *arg)
 {
+	struct rcu_data *rdp = arg;
 	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
 	int ret;
 
+	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
+
 	pr_info("De-offloading %d\n", rdp->cpu);
 
 	rcu_nocb_lock_irqsave(rdp, flags);
-	/*
-	 * If there are still pending work offloaded, the offline
-	 * CPU won't help much handling them.
-	 */
-	if (cpu_is_offline(rdp->cpu) && !rcu_segcblist_empty(&rdp->cblist)) {
-		rcu_nocb_unlock_irqrestore(rdp, flags);
-		return -EBUSY;
-	}
 
 	ret = rdp_offload_toggle(rdp, false, flags);
 	swait_event_exclusive(rdp->nocb_state_wq,
@@ -2446,14 +2441,6 @@ static int __rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
 	return ret;
 }
 
-static long rcu_nocb_rdp_deoffload(void *arg)
-{
-	struct rcu_data *rdp = arg;
-
-	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
-	return __rcu_nocb_rdp_deoffload(rdp);
-}
-
 int rcu_nocb_cpu_deoffload(int cpu)
 {
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
@@ -2466,12 +2453,14 @@ int rcu_nocb_cpu_deoffload(int cpu)
 	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
 	if (rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu))
+		if (cpu_online(cpu)) {
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
-		else
-			ret = __rcu_nocb_rdp_deoffload(rdp);
-		if (!ret)
-			cpumask_clear_cpu(cpu, rcu_nocb_mask);
+			if (!ret)
+				cpumask_clear_cpu(cpu, rcu_nocb_mask);
+		} else {
+			pr_info("NOCB: Can't CB-deoffload an offline CPU\n");
+			ret = -EINVAL;
+		}
 	}
 	cpus_read_unlock();
 	mutex_unlock(&rcu_state.barrier_mutex);
@@ -2480,12 +2469,14 @@ int rcu_nocb_cpu_deoffload(int cpu)
 }
 EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload);
 
-static int __rcu_nocb_rdp_offload(struct rcu_data *rdp)
+static long rcu_nocb_rdp_offload(void *arg)
 {
+	struct rcu_data *rdp = arg;
 	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
 	int ret;
 
+	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
 	/*
 	 * For now we only support re-offload, ie: the rdp must have been
 	 * offloaded on boot first.
@@ -2525,14 +2516,6 @@ static int __rcu_nocb_rdp_offload(struct rcu_data *rdp)
 	return ret;
 }
 
-static long rcu_nocb_rdp_offload(void *arg)
-{
-	struct rcu_data *rdp = arg;
-
-	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
-	return __rcu_nocb_rdp_offload(rdp);
-}
-
 int rcu_nocb_cpu_offload(int cpu)
 {
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
@@ -2541,12 +2524,14 @@ int rcu_nocb_cpu_offload(int cpu)
 	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
 	if (!rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu))
+		if (cpu_online(cpu)) {
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
-		else
-			ret = __rcu_nocb_rdp_offload(rdp);
-		if (!ret)
-			cpumask_set_cpu(cpu, rcu_nocb_mask);
+			if (!ret)
+				cpumask_set_cpu(cpu, rcu_nocb_mask);
+		} else {
+			pr_info("NOCB: Can't CB-offload an offline CPU\n");
+			ret = -EINVAL;
+		}
 	}
 	cpus_read_unlock();
 	mutex_unlock(&rcu_state.barrier_mutex);
-- 
2.25.1


* [PATCH 04/16] rcu/nocb: Only (re-)initialize segcblist when needed on CPU up
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

Simply checking whether the segcblist is enabled is enough to know
whether it needs to be initialized or not. It's safe to perform this
check from within the hotplug machinery.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/rcu/tree.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 4c5a1ac54fa6..f74a9ba62c12 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4066,14 +4066,13 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	rdp->dynticks_nesting = 1;	/* CPU not up, no tearing. */
 	rcu_dynticks_eqs_online();
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
+
 	/*
-	 * Lock in case the CB/GP kthreads are still around handling
-	 * old callbacks.
+	 * Only non-NOCB CPUs that didn't have early-boot callbacks need to be
+	 * (re-)initialized.
 	 */
-	rcu_nocb_lock(rdp);
-	if (rcu_segcblist_empty(&rdp->cblist)) /* No early-boot CBs? */
+	if (!rcu_segcblist_is_enabled(&rdp->cblist))
 		rcu_segcblist_init(&rdp->cblist);  /* Re-enable callbacks. */
-	rcu_nocb_unlock(rdp);
 
 	/*
 	 * Add CPU to leaf rcu_node pending-online bitmask.  Any needed
-- 
2.25.1


* [PATCH 05/16] rcu/nocb: Disable bypass when CPU isn't completely offloaded
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

Instead of flushing the bypass at the very last moment of the
deoffloading process, disable bypass enqueue as soon as the deoffloading
process starts and flush the pending bypass early. This is less fragile
and leaves some time for the kthreads and softirqs to process quietly.

Symmetrically, only enable the bypass once the offloading process has
safely completed.
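
The resulting enqueue-time policy boils down to something like the
following (a simplified sketch of the rcu_nocb_try_bypass() entry
checks after this patch, not a literal copy):

	if (!rcu_rdp_is_offloaded(rdp)) {
		/* Pure softirq/rcuc processing: no bypass, no locking. */
		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
		return false;
	}
	if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
		/* (De-)offloading in progress: no bypass, but locking. */
		rcu_nocb_lock(rdp);
		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
		return false;
	}
	/* Fully offloaded: the bypass list may be used. */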

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 include/linux/rcu_segcblist.h |  7 ++++---
 kernel/rcu/tree_plugin.h      | 31 +++++++++++++++++++++++--------
 2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
index 8afe886e85f1..5a2d6dadd720 100644
--- a/include/linux/rcu_segcblist.h
+++ b/include/linux/rcu_segcblist.h
@@ -109,7 +109,7 @@ struct rcu_cblist {
  *  |                           SEGCBLIST_KTHREAD_GP                           |
  *  |                                                                          |
  *  |   Kthreads handle callbacks holding nocb_lock, local rcu_core() stops    |
- *  |   handling callbacks.                                                    |
+ *  |   handling callbacks. Allow bypass enqueue.                              |
  *  ----------------------------------------------------------------------------
  */
 
@@ -125,7 +125,7 @@ struct rcu_cblist {
  *  |                           SEGCBLIST_KTHREAD_GP                           |
  *  |                                                                          |
  *  |   CB/GP kthreads handle callbacks holding nocb_lock, local rcu_core()    |
- *  |   ignores callbacks.                                                     |
+ *  |   ignores callbacks. Bypass enqueue is enabled.                          |
  *  ----------------------------------------------------------------------------
  *                                      |
  *                                      v
@@ -134,7 +134,8 @@ struct rcu_cblist {
  *  |                           SEGCBLIST_KTHREAD_GP                           |
  *  |                                                                          |
  *  |   CB/GP kthreads and local rcu_core() handle callbacks concurrently      |
- *  |   holding nocb_lock. Wake up CB and GP kthreads if necessary.            |
+ *  |   holding nocb_lock. Wake up CB and GP kthreads if necessary. Disable    |
+ *  |   bypass enqueue.                                                        |
  *  ----------------------------------------------------------------------------
  *                                      |
  *                                      v
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 732942a15f2b..7781830a3cf1 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1825,11 +1825,22 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 	unsigned long j = jiffies;
 	long ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
 
+	lockdep_assert_irqs_disabled();
+
+	// Pure softirq/rcuc based processing: no bypassing, no
+	// locking.
 	if (!rcu_rdp_is_offloaded(rdp)) {
+		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+		return false;
+	}
+
+	// In the process of (de-)offloading: no bypassing, but
+	// locking.
+	if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
+		rcu_nocb_lock(rdp);
 		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
 		return false; /* Not offloaded, no bypassing. */
 	}
-	lockdep_assert_irqs_disabled();
 
 	// Don't use ->nocb_bypass during early boot.
 	if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) {
@@ -2412,6 +2423,7 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 
 	rcu_nocb_lock_irqsave(rdp, flags);
 
+	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
 	ret = rdp_offload_toggle(rdp, false, flags);
 	swait_event_exclusive(rdp->nocb_state_wq,
 			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
@@ -2422,19 +2434,22 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	rcu_nocb_unlock_irqrestore(rdp, flags);
 	del_timer_sync(&rdp->nocb_timer);
 
+	/* Sanity check */
+	WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
+
 	/*
-	 * Flush bypass. While IRQs are disabled and once we set
-	 * SEGCBLIST_SOFTIRQ_ONLY, no callback is supposed to be
-	 * enqueued on bypass.
+	 * Lock one last time so we see latest updates from kthreads and timer
+	 * so that we can later run callbacks locally without the lock.
 	 */
 	rcu_nocb_lock_irqsave(rdp, flags);
-	rcu_nocb_flush_bypass(rdp, NULL, jiffies);
+	/*
+	 * Theoretically we could set SEGCBLIST_SOFTIRQ_ONLY after the nocb
+	 * LOCK/UNLOCK but let's be paranoid.
+	 */
 	rcu_segcblist_set_flags(cblist, SEGCBLIST_SOFTIRQ_ONLY);
 	/*
 	 * With SEGCBLIST_SOFTIRQ_ONLY, we can't use
-	 * rcu_nocb_unlock_irqrestore() anymore. Theoretically we
-	 * could set SEGCBLIST_SOFTIRQ_ONLY with cb unlocked and IRQs
-	 * disabled now, but let's be paranoid.
+	 * rcu_nocb_unlock_irqrestore() anymore.
 	 */
 	raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 
-- 
2.25.1


* [PATCH 06/16] rcu/nocb: Avoid confusing double write of rdp->nocb_cb_sleep
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

rdp->nocb_cb_sleep is first set to true by default after processing
the callbacks, then set back to false if ready callbacks are still
found to invoke.

This is confusing and even unsafe if it ever happens to be read
locklessly at some point. So make sure it is written only once per
nocb_cb_wait() loop.
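
A sketch of what a lockless reader could observe with the old
true-then-false sequence (hypothetical reader, for illustration only):

	nocb_cb_wait()                         lockless reader
	-------------------------------------  ----------------------------
	WRITE_ONCE(rdp->nocb_cb_sleep, true);
	                                       READ_ONCE(rdp->nocb_cb_sleep)
	                                       // sees a transient "true"
	WRITE_ONCE(rdp->nocb_cb_sleep, false); // ready callbacks found

Computing the value into a local can_sleep variable and issuing a
single WRITE_ONCE() removes that transient state.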

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree_plugin.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 7781830a3cf1..53ff99a18ab1 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2241,6 +2241,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 	unsigned long flags;
 	bool needwake_state = false;
 	bool needwake_gp = false;
+	bool can_sleep = true;
 	struct rcu_node *rnp = rdp->mynode;
 
 	local_irq_save(flags);
@@ -2264,8 +2265,6 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 		raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
 	}
 
-	WRITE_ONCE(rdp->nocb_cb_sleep, true);
-
 	if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
 		if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB)) {
 			rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_CB);
@@ -2273,7 +2272,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 				needwake_state = true;
 		}
 		if (rcu_segcblist_ready_cbs(cblist))
-			WRITE_ONCE(rdp->nocb_cb_sleep, false);
+			can_sleep = false;
 	} else {
 		/*
 		 * De-offloading. Clear our flag and notify the de-offload worker.
@@ -2286,6 +2285,8 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 			needwake_state = true;
 	}
 
+	WRITE_ONCE(rdp->nocb_cb_sleep, can_sleep);
+
 	if (rdp->nocb_cb_sleep)
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("CBSleep"));
 
-- 
2.25.1


* [PATCH 07/16] rcu/nocb: Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

Clarify the name of this function, which suggests that it returns true
when the state is updated. It actually returns true when the rdp is in
the process of deoffloading and must be ignored.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree_plugin.h | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 53ff99a18ab1..86c3bcceede6 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2026,7 +2026,8 @@ static inline bool nocb_gp_enabled_cb(struct rcu_data *rdp)
 	return rcu_segcblist_test_flags(&rdp->cblist, flags);
 }
 
-static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_state)
+static inline bool nocb_gp_update_state_deoffloading(struct rcu_data *rdp,
+						     bool *needwake_state)
 {
 	struct rcu_segcblist *cblist = &rdp->cblist;
 
@@ -2036,7 +2037,7 @@ static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_sta
 			if (rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
 				*needwake_state = true;
 		}
-		return true;
+		return false;
 	}
 
 	/*
@@ -2047,7 +2048,7 @@ static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_sta
 	rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP);
 	if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
 		*needwake_state = true;
-	return false;
+	return true;
 }
 
 
@@ -2085,7 +2086,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 			continue;
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
 		rcu_nocb_lock_irqsave(rdp, flags);
-		if (!nocb_gp_update_state(rdp, &needwake_state)) {
+		if (nocb_gp_update_state_deoffloading(rdp, &needwake_state)) {
 			rcu_nocb_unlock_irqrestore(rdp, flags);
 			if (needwake_state)
 				swake_up_one(&rdp->nocb_state_wq);
-- 
2.25.1


* [PATCH 08/16] rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

Those tracing calls don't need to be under the nocb lock. Move them
outside.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/rcu/tree_plugin.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 86c3bcceede6..8c5fea58ee80 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1700,9 +1700,9 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
 
 	lockdep_assert_held(&rdp->nocb_lock);
 	if (!READ_ONCE(rdp_gp->nocb_gp_kthread)) {
+		rcu_nocb_unlock_irqrestore(rdp, flags);
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 				    TPS("AlreadyAwake"));
-		rcu_nocb_unlock_irqrestore(rdp, flags);
 		return false;
 	}
 
@@ -1950,9 +1950,9 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 	// If we are being polled or there is no kthread, just leave.
 	t = READ_ONCE(rdp->nocb_gp_kthread);
 	if (rcu_nocb_poll || !t) {
+		rcu_nocb_unlock_irqrestore(rdp, flags);
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 				    TPS("WakeNotPoll"));
-		rcu_nocb_unlock_irqrestore(rdp, flags);
 		return;
 	}
 	// Need to actually to a wakeup.
@@ -1987,8 +1987,8 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 					   TPS("WakeOvfIsDeferred"));
 		rcu_nocb_unlock_irqrestore(rdp, flags);
 	} else {
+		rcu_nocb_unlock_irqrestore(rdp, flags);
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
-		rcu_nocb_unlock_irqrestore(rdp, flags);
 	}
 	return;
 }
-- 
2.25.1


* [PATCH 09/16] rcu/nocb: Merge nocb_timer to the rdp leader
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

Currently each offloaded rdp has its own nocb_timer, armed when the
nocb_gp wakeup must be deferred. This layout has many drawbacks,
compared to a solution based on a single timer per rdp group:

* It's a lot of timers to maintain.

* The per-rdp nocb lock must be held to arm and cancel the timer, and
  it's already quite contended.

* Firing one timer doesn't cancel the other rdps' timers in the same
  group:
  - They may conflict and end up with spurious wake ups.
  - Each rdp that armed a timer needs to lock both nocb_lock and then
    nocb_gp_lock upon exit to idle/user/guest mode.

* We can't cancel all of them if we detect an unflushed bypass in
  nocb_gp_wait(). In fact, currently we only ever cancel the nocb_timer
  of the leader group.

* The leader group's nocb_timer is cancelled without locking nocb_lock
  in nocb_gp_wait(). It appears to be safe currently, but it is very
  error-prone and hairy for reviewers.

* Since the timer takes the nocb_lock, it requires extra care in the
  NOCB (de-)offloading process, needing the timer to be either enabled
  or disabled and flushed.

Use a single per-rdp-leader timer instead. It is based on nocb_gp_lock,
which is _way_ less contended and stays so after this change. As a
matter of fact, the nocb_timer almost never fires and the deferred
wakeup is mostly handled on idle/user/guest entry. Now the early check
performed at this point in do_nocb_deferred_wakeup() is done on
rdp_gp->nocb_defer_wakeup, which is of course racy. It doesn't matter
though, because we only need the guarantee of seeing the timer armed if
we were the last one to arm it. Any other situation (another rdp has
armed it and we either see it or not) is fine.

This solves all the issues listed above.
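
The racy early check mentioned above amounts to something like this
(simplified sketch, not a literal copy of the patch):

	static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
	{
		/*
		 * Lockless read of the group leader's state: it may miss
		 * a concurrent arming by another rdp, but is guaranteed
		 * to observe an arming performed by this CPU itself.
		 */
		if (rcu_nocb_need_deferred_wakeup(rdp->nocb_gp_rdp))
			return do_nocb_deferred_wakeup_common(rdp);
		return false;
	}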

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/rcu/tree.h        |   1 -
 kernel/rcu/tree_plugin.h | 119 ++++++++++++++++++++++-----------------
 2 files changed, 68 insertions(+), 52 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 71821d59d95c..b280a843bd2c 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -257,7 +257,6 @@ struct rcu_data {
 };
 
 /* Values for nocb_defer_wakeup field in struct rcu_data. */
-#define RCU_NOCB_WAKE_OFF	-1
 #define RCU_NOCB_WAKE_NOT	0
 #define RCU_NOCB_WAKE		1
 #define RCU_NOCB_WAKE_FORCE	2
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 8c5fea58ee80..5e83ea380bec 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1687,41 +1687,48 @@ bool rcu_is_nocb_cpu(int cpu)
 	return false;
 }
 
-/*
- * Kick the GP kthread for this NOCB group.  Caller holds ->nocb_lock
- * and this function releases it.
- */
-static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
-			 unsigned long flags)
-	__releases(rdp->nocb_lock)
+static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
+			   struct rcu_data *rdp,
+			   bool force, unsigned long flags)
+	__releases(rdp_gp->nocb_gp_lock)
 {
 	bool needwake = false;
-	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
 
-	lockdep_assert_held(&rdp->nocb_lock);
 	if (!READ_ONCE(rdp_gp->nocb_gp_kthread)) {
-		rcu_nocb_unlock_irqrestore(rdp, flags);
+		raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 				    TPS("AlreadyAwake"));
 		return false;
 	}
 
-	rdp->nocb_defer_wakeup = RCU_NOCB_WAKE_NOT;
-	del_timer(&rdp->nocb_timer);
-	rcu_nocb_unlock_irqrestore(rdp, flags);
-	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+	rdp_gp->nocb_defer_wakeup = RCU_NOCB_WAKE_NOT;
+	del_timer(&rdp_gp->nocb_timer);
+
 	if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) {
 		WRITE_ONCE(rdp_gp->nocb_gp_sleep, false);
 		needwake = true;
+	}
+	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
+	if (needwake) {
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake"));
-	}
-	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
-	if (needwake)
 		wake_up_process(rdp_gp->nocb_gp_kthread);
+	}
 
 	return needwake;
 }
 
+/*
+ * Kick the GP kthread for this NOCB group.
+ */
+static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
+{
+	unsigned long flags;
+	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
+
+	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+	return __wake_nocb_gp(rdp_gp, rdp, force, flags);
+}
+
 /*
  * Arrange to wake the GP kthread for this NOCB group at some future
  * time when it is safe to do so.
@@ -1729,12 +1736,18 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
 static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
 			       const char *reason)
 {
-	if (rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_OFF)
-		return;
-	if (rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT)
-		mod_timer(&rdp->nocb_timer, jiffies + 1);
-	if (rdp->nocb_defer_wakeup < waketype)
-		WRITE_ONCE(rdp->nocb_defer_wakeup, waketype);
+	unsigned long flags;
+	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
+
+	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+
+	if (rdp_gp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT)
+		mod_timer(&rdp_gp->nocb_timer, jiffies + 1);
+	if (rdp_gp->nocb_defer_wakeup < waketype)
+		WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
+
+	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
+
 	trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, reason);
 }
 
@@ -1961,13 +1974,14 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 		rdp->qlen_last_fqs_check = len;
 		if (!irqs_disabled_flags(flags)) {
 			/* ... if queue was empty ... */
-			wake_nocb_gp(rdp, false, flags);
+			rcu_nocb_unlock_irqrestore(rdp, flags);
+			wake_nocb_gp(rdp, false);
 			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 					    TPS("WakeEmpty"));
 		} else {
+			rcu_nocb_unlock_irqrestore(rdp, flags);
 			wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE,
 					   TPS("WakeEmptyIsDeferred"));
-			rcu_nocb_unlock_irqrestore(rdp, flags);
 		}
 	} else if (len > rdp->qlen_last_fqs_check + qhimark) {
 		/* ... or if many callbacks queued. */
@@ -1982,10 +1996,14 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 		smp_mb(); /* Enqueue before timer_pending(). */
 		if ((rdp->nocb_cb_sleep ||
 		     !rcu_segcblist_ready_cbs(&rdp->cblist)) &&
-		    !timer_pending(&rdp->nocb_bypass_timer))
+		    !timer_pending(&rdp->nocb_bypass_timer)) {
+			rcu_nocb_unlock_irqrestore(rdp, flags);
 			wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_FORCE,
 					   TPS("WakeOvfIsDeferred"));
-		rcu_nocb_unlock_irqrestore(rdp, flags);
+		} else {
+			rcu_nocb_unlock_irqrestore(rdp, flags);
+			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
+		}
 	} else {
 		rcu_nocb_unlock_irqrestore(rdp, flags);
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
@@ -2111,11 +2129,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 			bypass = true;
 		}
 		rnp = rdp->mynode;
-		if (bypass) {  // Avoid race with first bypass CB.
-			WRITE_ONCE(my_rdp->nocb_defer_wakeup,
-				   RCU_NOCB_WAKE_NOT);
-			del_timer(&my_rdp->nocb_timer);
-		}
+
 		// Advance callbacks if helpful and low contention.
 		needwake_gp = false;
 		if (!rcu_segcblist_restempty(&rdp->cblist,
@@ -2161,11 +2175,16 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 	my_rdp->nocb_gp_bypass = bypass;
 	my_rdp->nocb_gp_gp = needwait_gp;
 	my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
-	if (bypass && !rcu_nocb_poll) {
-		// At least one child with non-empty ->nocb_bypass, so set
-		// timer in order to avoid stranding its callbacks.
+	if (bypass) {
 		raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
-		mod_timer(&my_rdp->nocb_bypass_timer, j + 2);
+		// Avoid race with first bypass CB.
+		WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
+		del_timer(&my_rdp->nocb_timer);
+		if (!rcu_nocb_poll) {
+			// At least one child with non-empty ->nocb_bypass, so set
+			// timer in order to avoid stranding its callbacks.
+			mod_timer(&my_rdp->nocb_bypass_timer, j + 2);
+		}
 		raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
 	}
 	if (rcu_nocb_poll) {
@@ -2339,16 +2358,17 @@ static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
 {
 	unsigned long flags;
 	int ndw;
+	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
 	int ret;
 
-	rcu_nocb_lock_irqsave(rdp, flags);
-	if (!rcu_nocb_need_deferred_wakeup(rdp)) {
-		rcu_nocb_unlock_irqrestore(rdp, flags);
+	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+
+	if (!rcu_nocb_need_deferred_wakeup(rdp_gp)) {
+		raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 		return false;
 	}
-	ndw = READ_ONCE(rdp->nocb_defer_wakeup);
-	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
-	ret = wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
+	ndw = READ_ONCE(rdp_gp->nocb_defer_wakeup);
+	ret = __wake_nocb_gp(rdp_gp, rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
 	trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake"));
 
 	return ret;
@@ -2369,7 +2389,10 @@ static void do_nocb_deferred_wakeup_timer(struct timer_list *t)
  */
 static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
 {
-	if (rcu_nocb_need_deferred_wakeup(rdp))
+	if (!rdp->nocb_gp_rdp)
+		return false;
+
+	if (rcu_nocb_need_deferred_wakeup(rdp->nocb_gp_rdp))
 		return do_nocb_deferred_wakeup_common(rdp);
 	return false;
 }
@@ -2430,18 +2453,12 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	swait_event_exclusive(rdp->nocb_state_wq,
 			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
 							SEGCBLIST_KTHREAD_GP));
-	rcu_nocb_lock_irqsave(rdp, flags);
-	/* Make sure nocb timer won't stay around */
-	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_OFF);
-	rcu_nocb_unlock_irqrestore(rdp, flags);
-	del_timer_sync(&rdp->nocb_timer);
-
 	/* Sanity check */
 	WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
 
 	/*
-	 * Lock one last time so we see latest updates from kthreads and timer
-	 * so that we can later run callbacks locally without the lock.
+	 * Lock one last time so we see latest updates from kthreads
+	 * and we can later run callbacks locally without the lock.
 	 */
 	rcu_nocb_lock_irqsave(rdp, flags);
 	/*
-- 
2.25.1


* [PATCH 10/16] rcu/nocb: Directly call __wake_nocb_gp() from bypass timer
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

The bypass timer calls __call_rcu_nocb_wake() instead of directly
calling __wake_nocb_gp(). The only difference here is that
rdp->qlen_last_fqs_check gets overridden. But resetting the deferred
force-quiescent-state base shouldn't be relevant for that timer. In
fact, the bypass queue in question can belong to any rdp in the group,
not necessarily the rdp leader to which the bypass timer is attached.

Therefore, simply call __wake_nocb_gp() directly. This way we don't
even need to take the nocb_lock.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/rcu/tree_plugin.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 5e83ea380bec..28ace1ae83d6 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2018,9 +2018,10 @@ static void do_nocb_bypass_wakeup_timer(struct timer_list *t)
 	struct rcu_data *rdp = from_timer(rdp, t, nocb_bypass_timer);
 
 	trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Timer"));
-	rcu_nocb_lock_irqsave(rdp, flags);
+
+	raw_spin_lock_irqsave(&rdp->nocb_gp_lock, flags);
 	smp_mb__after_spinlock(); /* Timer expire before wakeup. */
-	__call_rcu_nocb_wake(rdp, true, flags);
+	__wake_nocb_gp(rdp, rdp, false, flags);
 }
 
 /*
-- 
2.25.1


* [PATCH 11/16] rcu/nocb: Allow de-offloading rdp leader
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

The only thing that prevented an rdp leader from being de-offloaded was
the nocb_bypass_timer, which used to take the nocb_lock of the rdp
leader.

If an rdp gets de-offloaded, it will subtly ignore rcu_nocb_lock()
calls and the timer will do its job unsafely. Worse yet: if it gets
re-offloaded in the middle of the timer, rcu_nocb_unlock() would try to
unlock a lock that was never taken, leaving it imbalanced.

Now that the nocb_bypass_timer doesn't use the nocb_lock anymore, we
can safely allow the rdp leader to be de-offloaded.
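
A sketch of the imbalance the old timer could run into (illustrative;
it relies on rcu_nocb_lock()/rcu_nocb_unlock() being conditional on the
current offloaded state):

	nocb_bypass_timer handler              (de-)offloading worker
	------------------------------------   -------------------------
	rcu_nocb_lock_irqsave(rdp, flags);
	  // rdp currently de-offloaded:
	  // nocb_lock is NOT taken
	                                        rdp switches to offloaded
	rcu_nocb_unlock_irqrestore(rdp, flags);
	  // rdp now looks offloaded:
	  // unlocks a lock it never took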

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree_plugin.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 28ace1ae83d6..dae892ac570a 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2481,10 +2481,6 @@ int rcu_nocb_cpu_deoffload(int cpu)
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	int ret = 0;
 
-	if (rdp == rdp->nocb_gp_rdp) {
-		pr_info("Can't deoffload an rdp GP leader (yet)\n");
-		return -EINVAL;
-	}
 	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
 	if (rcu_rdp_is_offloaded(rdp)) {
-- 
2.25.1


* [PATCH 12/16] rcu/nocb: Cancel nocb_timer upon nocb_gp wakeup
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

When we wake up in nocb_gp_wait(), there is no need to keep the
nocb_timer around, as we are going to go through the whole rdp list
again. Any update performed before the timer was armed will now be
visible after the nocb_gp_lock acquire.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/rcu/tree_plugin.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index dae892ac570a..39fb792704ed 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2212,6 +2212,10 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 		raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
 		if (bypass)
 			del_timer(&my_rdp->nocb_bypass_timer);
+		if (my_rdp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
+			WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
+			del_timer(&my_rdp->nocb_timer);
+		}
 		WRITE_ONCE(my_rdp->nocb_gp_sleep, true);
 		raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
 	}
-- 
2.25.1


* [PATCH 13/16] rcu/nocb: Delete bypass_timer upon nocb_gp wakeup
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

A NOCB-gp wakeup can safely delete the nocb_bypass_timer:
nocb_gp_wait() is going to check the bypass state again and rearm the
bypass timer if necessary.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/rcu/tree_plugin.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 39fb792704ed..eb8614577a2c 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1703,6 +1703,7 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
 
 	rdp_gp->nocb_defer_wakeup = RCU_NOCB_WAKE_NOT;
 	del_timer(&rdp_gp->nocb_timer);
+	del_timer(&rdp_gp->nocb_bypass_timer);
 
 	if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) {
 		WRITE_ONCE(rdp_gp->nocb_gp_sleep, false);
-- 
2.25.1


* [PATCH 14/16] rcu/nocb: Only cancel nocb timer if not polling
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

No need to disarm the nocb_timer if rcu_nocb_poll is set, because in
that case the timer shouldn't be armed in the first place.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/rcu/tree_plugin.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index eb8614577a2c..b1f76f341c66 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2178,16 +2178,16 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 	my_rdp->nocb_gp_gp = needwait_gp;
 	my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
 	if (bypass) {
-		raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
-		// Avoid race with first bypass CB.
-		WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
-		del_timer(&my_rdp->nocb_timer);
 		if (!rcu_nocb_poll) {
+			raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
+			// Avoid race with first bypass CB.
+			WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
+			del_timer(&my_rdp->nocb_timer);
 			// At least one child with non-empty ->nocb_bypass, so set
 			// timer in order to avoid stranding its callbacks.
 			mod_timer(&my_rdp->nocb_bypass_timer, j + 2);
+			raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
 		}
-		raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
 	}
 	if (rcu_nocb_poll) {
 		/* Polling, so trace if first poll in the series. */
-- 
2.25.1


* [PATCH 15/16] rcu/nocb: Prepare for finegrained deferred wakeup
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

Provide a way to tune the deferred wakeup level that a safe wakeup
point wants to perform. Currently those sites are:

* nocb_timer
* user/idle/guest entry
* CPU down
* softirq/rcuc

All of these sites perform the wakeup for both RCU_NOCB_WAKE and
RCU_NOCB_WAKE_FORCE.

In order to merge nocb_timer and nocb_bypass_timer together, we plan to
add a new RCU_NOCB_WAKE_BYPASS level that really wants to be deferred
until a timer fires, so that we don't wake up the NOCB-gp kthread too
early.

To prepare for that, integrate a wakeup level/limit that each callsite
is allowed to perform.
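
For reference, a sketch of how the levels are meant to line up once the
next patch adds the bypass level (RCU_NOCB_WAKE_BYPASS does not exist
yet at this point in the series):

	#define RCU_NOCB_WAKE_NOT	0
	#define RCU_NOCB_WAKE_BYPASS	1	/* added by the next patch */
	#define RCU_NOCB_WAKE		2
	#define RCU_NOCB_WAKE_FORCE	3

	/*
	 * The timer passes a low level and thus handles every deferment;
	 * idle/user/guest entry passes RCU_NOCB_WAKE and thus leaves
	 * pure bypass deferments to the timer.
	 */
	rcu_nocb_need_deferred_wakeup(rdp_gp, level); /* state >= level? */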

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/rcu/tree.c        |  2 +-
 kernel/rcu/tree.h        |  2 +-
 kernel/rcu/tree_plugin.h | 15 ++++++++-------
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f74a9ba62c12..e31b1d8fde93 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3813,7 +3813,7 @@ static int rcu_pending(int user)
 	check_cpu_stall(rdp);
 
 	/* Does this CPU need a deferred NOCB wakeup? */
-	if (rcu_nocb_need_deferred_wakeup(rdp))
+	if (rcu_nocb_need_deferred_wakeup(rdp, RCU_NOCB_WAKE))
 		return 1;
 
 	/* Is this a nohz_full CPU in userspace or idle?  (Ignore RCU if so.) */
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index b280a843bd2c..2510e86265c1 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -433,7 +433,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 				bool *was_alldone, unsigned long flags);
 static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty,
 				 unsigned long flags);
-static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp);
+static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level);
 static bool do_nocb_deferred_wakeup(struct rcu_data *rdp);
 static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp);
 static void rcu_spawn_cpu_nocb_kthread(int cpu);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index b1f76f341c66..162dda3714f1 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2354,13 +2354,14 @@ static int rcu_nocb_cb_kthread(void *arg)
 }
 
 /* Is a deferred wakeup of rcu_nocb_kthread() required? */
-static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp)
+static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level)
 {
-	return READ_ONCE(rdp->nocb_defer_wakeup) > RCU_NOCB_WAKE_NOT;
+	return READ_ONCE(rdp->nocb_defer_wakeup) >= level;
 }
 
 /* Do a deferred wakeup of rcu_nocb_kthread(). */
-static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
+static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp,
+					   int level)
 {
 	unsigned long flags;
 	int ndw;
@@ -2369,7 +2370,7 @@ static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
 
 	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
 
-	if (!rcu_nocb_need_deferred_wakeup(rdp_gp)) {
+	if (!rcu_nocb_need_deferred_wakeup(rdp_gp, level)) {
 		raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 		return false;
 	}
@@ -2385,7 +2386,7 @@ static void do_nocb_deferred_wakeup_timer(struct timer_list *t)
 {
 	struct rcu_data *rdp = from_timer(rdp, t, nocb_timer);
 
-	do_nocb_deferred_wakeup_common(rdp);
+	do_nocb_deferred_wakeup_common(rdp, RCU_NOCB_WAKE);
 }
 
 /*
@@ -2398,8 +2399,8 @@ static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
 	if (!rdp->nocb_gp_rdp)
 		return false;
 
-	if (rcu_nocb_need_deferred_wakeup(rdp->nocb_gp_rdp))
-		return do_nocb_deferred_wakeup_common(rdp);
+	if (rcu_nocb_need_deferred_wakeup(rdp->nocb_gp_rdp, RCU_NOCB_WAKE))
+		return do_nocb_deferred_wakeup_common(rdp, RCU_NOCB_WAKE);
 	return false;
 }
 
-- 
2.25.1


* [PATCH 16/16] rcu/nocb: Unify timers
From: Frederic Weisbecker @ 2021-01-28 17:12 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Boqun Feng, Lai Jiangshan,
	Neeraj Upadhyay, Josh Triplett, Stable, Joel Fernandes

Now that nocb_timer and nocb_bypass_timer have become very similar,
merge them together. A new RCU_NOCB_WAKE_BYPASS wake level is
introduced. As a result, timers perform all kinds of deferred wakeups,
but other deferred-wakeup callsites only handle non-bypass wakeups in
order not to wake up rcuo too early.
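
The resulting deferral policy in wake_nocb_gp_defer() can be
summarized as follows (sketch of the behavior introduced below):

	/*
	 *   waketype               timer deadline   nocb_defer_wakeup
	 *   RCU_NOCB_WAKE_BYPASS   jiffies + 2      overrides any previous
	 *                                           deferment (may lower it)
	 *   RCU_NOCB_WAKE          jiffies + 1      raised if lower
	 *   RCU_NOCB_WAKE_FORCE    jiffies + 1      raised if lower
	 */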

The timer now also always performs the full memory barrier that orders
timer_pending() against callback enqueue, although the only path that
makes use of it, the one performing RCU_NOCB_WAKE_FORCE, is debatable:
it should also test against the rdp leader instead of the current rdp.

The permanent full barrier shouldn't bring visible overhead since the
timers almost never fire.
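
To illustrate the resulting semantics, here is a quick standalone userspace
model (a sketch only: wake_defer(), defer_level and timer_expires are
made-up stand-ins for wake_nocb_gp_defer() and the rdp_gp fields, not the
kernel code in the diff below):

#include <assert.h>

/* Models RCU_NOCB_WAKE_NOT < RCU_NOCB_WAKE_BYPASS < RCU_NOCB_WAKE
 * < RCU_NOCB_WAKE_FORCE. */
enum { WAKE_NOT, WAKE_BYPASS, WAKE, WAKE_FORCE };

static int defer_level;		/* stands in for rdp_gp->nocb_defer_wakeup */
static long timer_expires;	/* stands in for the nocb_timer deadline */

static void wake_defer(int waketype, long now)
{
	if (waketype == WAKE_BYPASS) {
		/* Bypass wakeup overrides previous deferments: a callback
		 * storm is in progress, so no need to wake up too early. */
		timer_expires = now + 2;
		defer_level = waketype;
	} else {
		/* Arm the sooner timer unless a non-bypass deferment is
		 * already pending, and never downgrade the wake level. */
		if (defer_level < WAKE)
			timer_expires = now + 1;
		if (defer_level < waketype)
			defer_level = waketype;
	}
}

int main(void)
{
	wake_defer(WAKE, 0);
	assert(defer_level == WAKE && timer_expires == 1);
	wake_defer(WAKE_BYPASS, 0);	/* overrides, timer pushed to now + 2 */
	assert(defer_level == WAKE_BYPASS && timer_expires == 2);
	wake_defer(WAKE_FORCE, 0);	/* escalates, timer pulled back in */
	assert(defer_level == WAKE_FORCE && timer_expires == 1);
	return 0;
}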

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 include/trace/events/rcu.h |  1 +
 kernel/rcu/tree.h          |  6 +--
 kernel/rcu/tree_plugin.h   | 88 ++++++++++++++++----------------------
 3 files changed, 42 insertions(+), 53 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 5fc29400e1a2..c16cb7d78f51 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -278,6 +278,7 @@ TRACE_EVENT_RCU(rcu_exp_funnel_lock,
  * "WakeNot": Don't wake rcuo kthread.
  * "WakeNotPoll": Don't wake rcuo kthread because it is polling.
  * "WakeOvfIsDeferred": Wake rcuo kthread later, CB list is huge.
+ * "WakeBypassIsDeferred": Wake rcuo kthread later, bypass list is contended.
  * "WokeEmpty": rcuo CB kthread woke to find empty list.
  */
 TRACE_EVENT_RCU(rcu_nocb_wake,
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 2510e86265c1..9a16487edfca 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -218,7 +218,6 @@ struct rcu_data {
 
 	/* The following fields are used by GP kthread, hence own cacheline. */
 	raw_spinlock_t nocb_gp_lock ____cacheline_internodealigned_in_smp;
-	struct timer_list nocb_bypass_timer; /* Force nocb_bypass flush. */
 	u8 nocb_gp_sleep;		/* Is the nocb GP thread asleep? */
 	u8 nocb_gp_bypass;		/* Found a bypass on last scan? */
 	u8 nocb_gp_gp;			/* GP to wait for on last scan? */
@@ -258,8 +257,9 @@ struct rcu_data {
 
 /* Values for nocb_defer_wakeup field in struct rcu_data. */
 #define RCU_NOCB_WAKE_NOT	0
-#define RCU_NOCB_WAKE		1
-#define RCU_NOCB_WAKE_FORCE	2
+#define RCU_NOCB_WAKE_BYPASS	1
+#define RCU_NOCB_WAKE		2
+#define RCU_NOCB_WAKE_FORCE	3
 
 #define RCU_JIFFIES_TILL_FORCE_QS (1 + (HZ > 250) + (HZ > 500))
 					/* For jiffies_till_first_fqs and */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 162dda3714f1..516bacbea7b9 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1703,7 +1703,6 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
 
 	rdp_gp->nocb_defer_wakeup = RCU_NOCB_WAKE_NOT;
 	del_timer(&rdp_gp->nocb_timer);
-	del_timer(&rdp_gp->nocb_bypass_timer);
 
 	if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) {
 		WRITE_ONCE(rdp_gp->nocb_gp_sleep, false);
@@ -1742,10 +1741,19 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
 
 	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
 
-	if (rdp_gp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT)
-		mod_timer(&rdp_gp->nocb_timer, jiffies + 1);
-	if (rdp_gp->nocb_defer_wakeup < waketype)
+	/*
+	 * Bypass wakeup overrides previous deferments. In case
+	 * of callback storm, no need to wake up too early.
+	 */
+	if (waketype == RCU_NOCB_WAKE_BYPASS) {
+		mod_timer(&rdp_gp->nocb_timer, jiffies + 2);
 		WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
+	} else {
+		if (rdp_gp->nocb_defer_wakeup < RCU_NOCB_WAKE)
+			mod_timer(&rdp_gp->nocb_timer, jiffies + 1);
+		if (rdp_gp->nocb_defer_wakeup < waketype)
+			WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
+	}
 
 	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 
@@ -1997,7 +2005,7 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 		smp_mb(); /* Enqueue before timer_pending(). */
 		if ((rdp->nocb_cb_sleep ||
 		     !rcu_segcblist_ready_cbs(&rdp->cblist)) &&
-		    !timer_pending(&rdp->nocb_bypass_timer)) {
+		    !timer_pending(&rdp->nocb_timer)) {
 			rcu_nocb_unlock_irqrestore(rdp, flags);
 			wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_FORCE,
 					   TPS("WakeOvfIsDeferred"));
@@ -2012,19 +2020,6 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 	return;
 }
 
-/* Wake up the no-CBs GP kthread to flush ->nocb_bypass. */
-static void do_nocb_bypass_wakeup_timer(struct timer_list *t)
-{
-	unsigned long flags;
-	struct rcu_data *rdp = from_timer(rdp, t, nocb_bypass_timer);
-
-	trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Timer"));
-
-	raw_spin_lock_irqsave(&rdp->nocb_gp_lock, flags);
-	smp_mb__after_spinlock(); /* Timer expire before wakeup. */
-	__wake_nocb_gp(rdp, rdp, false, flags);
-}
-
 /*
  * Check if we ignore this rdp.
  *
@@ -2177,17 +2172,11 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 	my_rdp->nocb_gp_bypass = bypass;
 	my_rdp->nocb_gp_gp = needwait_gp;
 	my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
-	if (bypass) {
-		if (!rcu_nocb_poll) {
-			raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
-			// Avoid race with first bypass CB.
-			WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
-			del_timer(&my_rdp->nocb_timer);
-			// At least one child with non-empty ->nocb_bypass, so set
-			// timer in order to avoid stranding its callbacks.
-			mod_timer(&my_rdp->nocb_bypass_timer, j + 2);
-			raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
-		}
+	if (bypass && !rcu_nocb_poll) {
+		// At least one child with non-empty ->nocb_bypass, so set
+		// timer in order to avoid stranding its callbacks.
+		wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
+				   TPS("WakeBypassIsDeferred"));
 	}
 	if (rcu_nocb_poll) {
 		/* Polling, so trace if first poll in the series. */
@@ -2211,8 +2200,6 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 	}
 	if (!rcu_nocb_poll) {
 		raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
-		if (bypass)
-			del_timer(&my_rdp->nocb_bypass_timer);
 		if (my_rdp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
 			WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
 			del_timer(&my_rdp->nocb_timer);
@@ -2360,16 +2347,14 @@ static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level)
 }
 
 /* Do a deferred wakeup of rcu_nocb_kthread(). */
-static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp,
-					   int level)
+static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp_gp,
+					   struct rcu_data *rdp, int level,
+					   unsigned long flags)
+	__releases(rdp_gp->nocb_gp_lock)
 {
-	unsigned long flags;
 	int ndw;
-	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
 	int ret;
 
-	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
-
 	if (!rcu_nocb_need_deferred_wakeup(rdp_gp, level)) {
 		raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 		return false;
@@ -2384,9 +2369,15 @@ static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp,
 /* Do a deferred wakeup of rcu_nocb_kthread() from a timer handler. */
 static void do_nocb_deferred_wakeup_timer(struct timer_list *t)
 {
+	unsigned long flags;
 	struct rcu_data *rdp = from_timer(rdp, t, nocb_timer);
 
-	do_nocb_deferred_wakeup_common(rdp, RCU_NOCB_WAKE);
+	WARN_ON_ONCE(rdp->nocb_gp_rdp != rdp);
+	trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Timer"));
+
+	raw_spin_lock_irqsave(&rdp->nocb_gp_lock, flags);
+	smp_mb__after_spinlock(); /* Timer expire before wakeup. */
+	do_nocb_deferred_wakeup_common(rdp, rdp, RCU_NOCB_WAKE_BYPASS, flags);
 }
 
 /*
@@ -2396,12 +2387,14 @@ static void do_nocb_deferred_wakeup_timer(struct timer_list *t)
  */
 static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
 {
-	if (!rdp->nocb_gp_rdp)
+	unsigned long flags;
+	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
+
+	if (!rdp_gp || !rcu_nocb_need_deferred_wakeup(rdp_gp, RCU_NOCB_WAKE))
 		return false;
 
-	if (rcu_nocb_need_deferred_wakeup(rdp->nocb_gp_rdp, RCU_NOCB_WAKE))
-		return do_nocb_deferred_wakeup_common(rdp, RCU_NOCB_WAKE);
-	return false;
+	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+	return do_nocb_deferred_wakeup_common(rdp_gp, rdp, RCU_NOCB_WAKE, flags);
 }
 
 void rcu_nocb_flush_deferred_wakeup(void)
@@ -2636,7 +2629,6 @@ static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
 	raw_spin_lock_init(&rdp->nocb_bypass_lock);
 	raw_spin_lock_init(&rdp->nocb_gp_lock);
 	timer_setup(&rdp->nocb_timer, do_nocb_deferred_wakeup_timer, 0);
-	timer_setup(&rdp->nocb_bypass_timer, do_nocb_bypass_wakeup_timer, 0);
 	rcu_cblist_init(&rdp->nocb_bypass);
 }
 
@@ -2795,13 +2787,12 @@ static void show_rcu_nocb_gp_state(struct rcu_data *rdp)
 {
 	struct rcu_node *rnp = rdp->mynode;
 
-	pr_info("nocb GP %d %c%c%c%c%c%c %c[%c%c] %c%c:%ld rnp %d:%d %lu %c CPU %d%s\n",
+	pr_info("nocb GP %d %c%c%c%c%c %c[%c%c] %c%c:%ld rnp %d:%d %lu %c CPU %d%s\n",
 		rdp->cpu,
 		"kK"[!!rdp->nocb_gp_kthread],
 		"lL"[raw_spin_is_locked(&rdp->nocb_gp_lock)],
 		"dD"[!!rdp->nocb_defer_wakeup],
 		"tT"[timer_pending(&rdp->nocb_timer)],
-		"bB"[timer_pending(&rdp->nocb_bypass_timer)],
 		"sS"[!!rdp->nocb_gp_sleep],
 		".W"[swait_active(&rdp->nocb_gp_wq)],
 		".W"[swait_active(&rnp->nocb_gp_wq[0])],
@@ -2822,7 +2813,6 @@ static void show_rcu_nocb_state(struct rcu_data *rdp)
 	char bufr[20];
 	struct rcu_segcblist *rsclp = &rdp->cblist;
 	bool waslocked;
-	bool wastimer;
 	bool wassleep;
 
 	if (rdp->nocb_gp_rdp == rdp)
@@ -2859,15 +2849,13 @@ static void show_rcu_nocb_state(struct rcu_data *rdp)
 		return;
 
 	waslocked = raw_spin_is_locked(&rdp->nocb_gp_lock);
-	wastimer = timer_pending(&rdp->nocb_bypass_timer);
 	wassleep = swait_active(&rdp->nocb_gp_wq);
-	if (!rdp->nocb_gp_sleep && !waslocked && !wastimer && !wassleep)
+	if (!rdp->nocb_gp_sleep && !waslocked && !wassleep)
 		return;  /* Nothing untowards. */
 
-	pr_info("   nocb GP activity on CB-only CPU!!! %c%c%c%c %c\n",
+	pr_info("   nocb GP activity on CB-only CPU!!! %c%c%c %c\n",
 		"lL"[waslocked],
 		"dD"[!!rdp->nocb_defer_wakeup],
-		"tT"[wastimer],
 		"sS"[!!rdp->nocb_gp_sleep],
 		".W"[wassleep]);
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 03/16] rcu/nocb: Forbid NOCB toggling on offline CPUs
  2021-01-28 17:12 ` [PATCH 03/16] rcu/nocb: Forbid NOCB toggling on offline CPUs Frederic Weisbecker
@ 2021-01-28 19:52   ` Paul E. McKenney
  0 siblings, 0 replies; 28+ messages in thread
From: Paul E. McKenney @ 2021-01-28 19:52 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 06:12:09PM +0100, Frederic Weisbecker wrote:
> Toggling the NOCB state of a CPU when it is offline implies some specific
> issues to handle, especially making sure that the kthreads have handled
> all the remaining callbacks and bypass before the corresponding CPU can
> be set as non-offloaded while it is offline.
> 
> To prevent such complications, simply forbid offline CPUs from
> performing NOCB-mode toggling. It's a simple rule to observe, and after all
> it doesn't make much sense to switch a non-working CPU to/from offloaded
> state.
> 
> Reported-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Very nice, cuts out a world of hurt, thank you!  Queued for testing and
further review, the usual wordsmithing applied and the usual request
for you to check to see if I messed anything up.

							Thanx, Paul

------------------------------------------------------------------------

commit 4fbfeec81533e6ea9811e57cc848ee30522c0517
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Thu Jan 28 18:12:09 2021 +0100

    rcu/nocb: Forbid NOCB toggling on offline CPUs
    
    It makes no sense to de-offload an offline CPU because that CPU will never
    invoke any remaining callbacks.  It also makes little sense to offload an
    offline CPU because any pending RCU callbacks were migrated when that CPU
    went offline.  Yes, it is in theory possible to use a number of tricks
    to permit offloading and deoffloading offline CPUs in certain cases, but
    in practice it is far better to have the simple and deterministic rule
    "Toggling the offload state of an offline CPU is forbidden".
    
    For but one example, consider that an offloaded offline CPU might have
    millions of callbacks queued.  Best to just say "no".
    
    This commit therefore forbids toggling of the offloaded state of
    offline CPUs.
    
    Reported-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: Josh Triplett <josh@joshtriplett.org>
    Cc: Lai Jiangshan <jiangshanlai@gmail.com>
    Cc: Joel Fernandes <joel@joelfernandes.org>
    Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
    Cc: Boqun Feng <boqun.feng@gmail.com>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8bb8da2..00059df 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4066,8 +4066,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
 	/*
 	 * Lock in case the CB/GP kthreads are still around handling
-	 * old callbacks (longer term we should flush all callbacks
-	 * before completing CPU offline)
+	 * old callbacks.
 	 */
 	rcu_nocb_lock(rdp);
 	if (rcu_segcblist_empty(&rdp->cblist)) /* No early-boot CBs? */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index f1ebe67..c61613a 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2398,23 +2398,18 @@ static int rdp_offload_toggle(struct rcu_data *rdp,
 	return 0;
 }
 
-static int __rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
+static long rcu_nocb_rdp_deoffload(void *arg)
 {
+	struct rcu_data *rdp = arg;
 	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
 	int ret;
 
+	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
+
 	pr_info("De-offloading %d\n", rdp->cpu);
 
 	rcu_nocb_lock_irqsave(rdp, flags);
-	/*
-	 * If there are still pending work offloaded, the offline
-	 * CPU won't help much handling them.
-	 */
-	if (cpu_is_offline(rdp->cpu) && !rcu_segcblist_empty(&rdp->cblist)) {
-		rcu_nocb_unlock_irqrestore(rdp, flags);
-		return -EBUSY;
-	}
 
 	ret = rdp_offload_toggle(rdp, false, flags);
 	swait_event_exclusive(rdp->nocb_state_wq,
@@ -2445,14 +2440,6 @@ static int __rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
 	return ret;
 }
 
-static long rcu_nocb_rdp_deoffload(void *arg)
-{
-	struct rcu_data *rdp = arg;
-
-	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
-	return __rcu_nocb_rdp_deoffload(rdp);
-}
-
 int rcu_nocb_cpu_deoffload(int cpu)
 {
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
@@ -2465,12 +2452,14 @@ int rcu_nocb_cpu_deoffload(int cpu)
 	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
 	if (rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu))
+		if (cpu_online(cpu)) {
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
-		else
-			ret = __rcu_nocb_rdp_deoffload(rdp);
-		if (!ret)
-			cpumask_clear_cpu(cpu, rcu_nocb_mask);
+			if (!ret)
+				cpumask_clear_cpu(cpu, rcu_nocb_mask);
+		} else {
+			pr_info("NOCB: Can't CB-deoffload an offline CPU\n");
+			ret = -EINVAL;
+		}
 	}
 	cpus_read_unlock();
 	mutex_unlock(&rcu_state.barrier_mutex);
@@ -2479,12 +2468,14 @@ int rcu_nocb_cpu_deoffload(int cpu)
 }
 EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload);
 
-static int __rcu_nocb_rdp_offload(struct rcu_data *rdp)
+static long rcu_nocb_rdp_offload(void *arg)
 {
+	struct rcu_data *rdp = arg;
 	struct rcu_segcblist *cblist = &rdp->cblist;
 	unsigned long flags;
 	int ret;
 
+	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
 	/*
 	 * For now we only support re-offload, ie: the rdp must have been
 	 * offloaded on boot first.
@@ -2524,14 +2515,6 @@ static int __rcu_nocb_rdp_offload(struct rcu_data *rdp)
 	return ret;
 }
 
-static long rcu_nocb_rdp_offload(void *arg)
-{
-	struct rcu_data *rdp = arg;
-
-	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
-	return __rcu_nocb_rdp_offload(rdp);
-}
-
 int rcu_nocb_cpu_offload(int cpu)
 {
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
@@ -2540,12 +2523,14 @@ int rcu_nocb_cpu_offload(int cpu)
 	mutex_lock(&rcu_state.barrier_mutex);
 	cpus_read_lock();
 	if (!rcu_rdp_is_offloaded(rdp)) {
-		if (cpu_online(cpu))
+		if (cpu_online(cpu)) {
 			ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
-		else
-			ret = __rcu_nocb_rdp_offload(rdp);
-		if (!ret)
-			cpumask_set_cpu(cpu, rcu_nocb_mask);
+			if (!ret)
+				cpumask_set_cpu(cpu, rcu_nocb_mask);
+		} else {
+			pr_info("NOCB: Can't CB-offload an offline CPU\n");
+			ret = -EINVAL;
+		}
 	}
 	cpus_read_unlock();
 	mutex_unlock(&rcu_state.barrier_mutex);

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 01/16] rcu/nocb: Fix potential missed nocb_timer rearm
       [not found]   ` <20210128184834.GP2743@paulmck-ThinkPad-P72>
@ 2021-01-28 21:23     ` Frederic Weisbecker
  0 siblings, 0 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2021-01-28 21:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 10:48:34AM -0800, Paul E. McKenney wrote:
> On Thu, Jan 28, 2021 at 06:12:07PM +0100, Frederic Weisbecker wrote:
> > The "nocb_bypass_timer" ends up calling wake_nocb_gp(), which deletes
> > the pending "nocb_timer" (note that they are not the same timer) for the
> > given rdp without resetting the matching state stored in
> > rdp->nocb_defer_wakeup.
> > 
> > As a result, a future call_rcu() on that rdp may be fooled and think the
> > timer is armed when it's not, missing a deferred nocb_gp wakeup.
> > 
> > Fix this by resetting rdp->nocb_defer_wakeup when we disarm the timer.
> > 
> > Fixes: d1b222c6be1f (rcu/nocb: Add bypass callback queueing)
> > Cc: Stable <stable@vger.kernel.org>
> > Cc: Josh Triplett <josh@joshtriplett.org>
> > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> > Cc: Boqun Feng <boqun.feng@gmail.com>
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >  kernel/rcu/tree_plugin.h | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 7e33dae0e6ee..a44f80d7661b 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -1705,6 +1705,8 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
> >  		rcu_nocb_unlock_irqrestore(rdp, flags);
> >  		return false;
> >  	}
> > +
> > +	rdp->nocb_defer_wakeup = RCU_NOCB_WAKE_NOT;
> 
> Given this change, does it make sense to remove the
> setting of ->nocb_defer_wakeup to RCU_NOCB_WAKE_NOT from the
> do_nocb_deferred_wakeup_common() function?

I do it later in "[PATCH 09/16] rcu/nocb: Merge nocb_timer to the rdp leader"

> Does the above assignment need
> to be WRITE_ONCE(), in other words, are all reads of ->nocb_defer_wakeup
> done with either ->nocb_lock or ->nocb_gp_lock held?  (I do not believe
> that this is the case.)

Ah, indeed it should probably be done with WRITE_ONCE() because it's read
locklessly in many places.
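
Something like this, I suppose (a sketch of the pattern only, not the final
patch; do_wakeup() is a made-up placeholder for the various callers):

	/* Updater, under ->nocb_gp_lock: */
	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);

	/* ... which pairs with the lockless readers elsewhere: */
	if (READ_ONCE(rdp->nocb_defer_wakeup) > RCU_NOCB_WAKE_NOT)
		do_wakeup();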

Thanks.

> 
> 							Thanx, Paul
> 
> >  	del_timer(&rdp->nocb_timer);
> >  	rcu_nocb_unlock_irqrestore(rdp, flags);
> >  	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
> > -- 
> > 2.25.1
> > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 05/16] rcu/nocb: Disable bypass when CPU isn't completely offloaded
  2021-01-28 17:12 ` [PATCH 05/16] rcu/nocb: Disable bypass when CPU isn't completely offloaded Frederic Weisbecker
@ 2021-01-28 21:31   ` Paul E. McKenney
  2021-01-28 22:25     ` Frederic Weisbecker
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2021-01-28 21:31 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 06:12:11PM +0100, Frederic Weisbecker wrote:
> Instead of flushing the bypass at the very last moment in the deoffloading
> process, just disable bypass enqueue as soon as we start the deoffloading
> process and flush the pending bypass early. It's less fragile, and it
> leaves the kthreads and softirqs some time to process quietly.
> 
> Symmetrically, only enable bypass once we safely complete the offloading
> process.
> 
> Reported-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Looks plausible, thank you!  Some questions and comments below.

> ---
>  include/linux/rcu_segcblist.h |  7 ++++---
>  kernel/rcu/tree_plugin.h      | 31 +++++++++++++++++++++++--------
>  2 files changed, 27 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
> index 8afe886e85f1..5a2d6dadd720 100644
> --- a/include/linux/rcu_segcblist.h
> +++ b/include/linux/rcu_segcblist.h
> @@ -109,7 +109,7 @@ struct rcu_cblist {
>   *  |                           SEGCBLIST_KTHREAD_GP                           |
>   *  |                                                                          |
>   *  |   Kthreads handle callbacks holding nocb_lock, local rcu_core() stops    |
> - *  |   handling callbacks.                                                    |
> + *  |   handling callbacks. Allow bypass enqueue.                              |

"Allow bypass enqueue" as in bypass was disabled and entering this
state causes it to be enabled, correct?  If so, "Enable bypass
queueing" would be less ambiguous and would match the change below.

>   *  ----------------------------------------------------------------------------
>   */
>  
> @@ -125,7 +125,7 @@ struct rcu_cblist {
>   *  |                           SEGCBLIST_KTHREAD_GP                           |
>   *  |                                                                          |
>   *  |   CB/GP kthreads handle callbacks holding nocb_lock, local rcu_core()    |
> - *  |   ignores callbacks.                                                     |
> + *  |   ignores callbacks. Bypass enqueue is enabled.                          |
>   *  ----------------------------------------------------------------------------
>   *                                      |
>   *                                      v
> @@ -134,7 +134,8 @@ struct rcu_cblist {
>   *  |                           SEGCBLIST_KTHREAD_GP                           |
>   *  |                                                                          |
>   *  |   CB/GP kthreads and local rcu_core() handle callbacks concurrently      |
> - *  |   holding nocb_lock. Wake up CB and GP kthreads if necessary.            |
> + *  |   holding nocb_lock. Wake up CB and GP kthreads if necessary. Disable    |
> + *  |   bypass enqueue.                                                        |
>   *  ----------------------------------------------------------------------------
>   *                                      |
>   *                                      v
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 732942a15f2b..7781830a3cf1 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -1825,11 +1825,22 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>  	unsigned long j = jiffies;
>  	long ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
>  
> +	lockdep_assert_irqs_disabled();
> +
> +	// Pure softirq/rcuc based processing: no bypassing, no
> +	// locking.
>  	if (!rcu_rdp_is_offloaded(rdp)) {
> +		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
> +		return false;
> +	}
> +
> +	// In the process of (de-)offloading: no bypassing, but
> +	// locking.
> +	if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
> +		rcu_nocb_lock(rdp);
>  		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
>  		return false; /* Not offloaded, no bypassing. */
>  	}
> -	lockdep_assert_irqs_disabled();
>  
>  	// Don't use ->nocb_bypass during early boot.
>  	if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) {
> @@ -2412,6 +2423,7 @@ static long rcu_nocb_rdp_deoffload(void *arg)
>  
>  	rcu_nocb_lock_irqsave(rdp, flags);
>  
> +	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));

This flush suffices because we are running on the target CPU
holding ->nocb_lock (thus having interrupts disabled), and
because rdp_offload_toggle() invokes rcu_segcblist_offload(),
which clears SEGCBLIST_OFFLOADED.  Thus future calls to
rcu_segcblist_completely_offloaded() will return false,
which means that future calls to rcu_nocb_try_bypass() will
refuse to put anything into the bypass.

I believe that this deserves a comment, particularly if I am
confused about what is really happening here.  ;-)

On another topic, since I saw it along the way...

The header comment for rcu_segcblist_offload() says that the
structure must be empty, but that isn't really the case, is it?

>  	ret = rdp_offload_toggle(rdp, false, flags);
>  	swait_event_exclusive(rdp->nocb_state_wq,
>  			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
> @@ -2422,19 +2434,22 @@ static long rcu_nocb_rdp_deoffload(void *arg)
>  	rcu_nocb_unlock_irqrestore(rdp, flags);
>  	del_timer_sync(&rdp->nocb_timer);
>  
> +	/* Sanity check */
> +	WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> +
>  	/*
> -	 * Flush bypass. While IRQs are disabled and once we set
> -	 * SEGCBLIST_SOFTIRQ_ONLY, no callback is supposed to be
> -	 * enqueued on bypass.
> +	 * Lock one last time so we see latest updates from kthreads and timer

You lost me here.  What updates are we seeing from kthreads and timers?

> +	 * so that we can later run callbacks locally without the lock.
>  	 */
>  	rcu_nocb_lock_irqsave(rdp, flags);
> -	rcu_nocb_flush_bypass(rdp, NULL, jiffies);
> +	/*
> +	 * Theoretically we could set SEGCBLIST_SOFTIRQ_ONLY after the nocb
> +	 * LOCK/UNLOCK but let's be paranoid.
> +	 */
>  	rcu_segcblist_set_flags(cblist, SEGCBLIST_SOFTIRQ_ONLY);

As long as we are being paranoid, should we also check that the
bypass remains empty?

>  	/*
>  	 * With SEGCBLIST_SOFTIRQ_ONLY, we can't use
> -	 * rcu_nocb_unlock_irqrestore() anymore. Theoretically we
> -	 * could set SEGCBLIST_SOFTIRQ_ONLY with cb unlocked and IRQs
> -	 * disabled now, but let's be paranoid.
> +	 * rcu_nocb_unlock_irqrestore() anymore.
>  	 */
>  	raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
>  
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 04/16] rcu/nocb: Only (re-)initialize segcblist when needed on CPU up
       [not found]   ` <20210128191228.GQ2743@paulmck-ThinkPad-P72>
@ 2021-01-28 21:34     ` Frederic Weisbecker
  2021-01-28 21:45       ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2021-01-28 21:34 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 11:12:28AM -0800, Paul E. McKenney wrote:
> On Thu, Jan 28, 2021 at 06:12:10PM +0100, Frederic Weisbecker wrote:
> > Simply checking whether the segcblist is enabled is enough to know if we
> > need to initialize it or not. It's safe to check within the hotplug
> > machinery.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Josh Triplett <josh@joshtriplett.org>
> > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> > Cc: Boqun Feng <boqun.feng@gmail.com>
> 
> Hmmm...
> 
> At the start of a CPU-hotplug operation, an incoming CPU's callback
> list can be in a number of states:
> 
> 1.	Disabled and empty.  This is the case when the boot CPU has
> 	not done call_rcu(), when a non-boot CPU first comes online,
> 	and when a non-offloaded CPU comes back online.  In this case,
> 	it is permissible to initialize ->cblist.  Because either the
> 	CPU is currently running with interrupts disabled (boot CPU)
> 	or is not yet running at all (other CPUs), it is not necessary
> 	to acquire ->nocb_lock.
> 
> 2.	Disabled and non-empty.  This is the case when the boot CPU has
> 	done call_rcu().  It is not permissible to initialize ->cblist
> 	because doing so will leak any callbacks posted by early boot
> 	invocations of call_rcu().

I don't think that's possible. In this case __call_rcu() has called
rcu_segcblist_init() and has enabled the segcblist.

> 
> 	Test for the possibility of leaking by building with
> 	CONFIG_PROVE_RCU=y and booting with rcupdate.rcu_self_test=1.
> 
> 3.	Enabled, whether empty or not.  This is the case when an
> 	offloaded CPU comes back online.  This is the only case where
> 	the ->nocb_lock must be held to modify ->cblist.  However,
> 	it is not necessary to modify ->cblist because the rcuoc
> 	kthread is on the job.
> 
> So I believe that it is necessary to check for both disabled and empty.
> But don't take my word for it!  Build with CONFIG_PROVE_RCU=y and boot
> with rcupdate.rcu_self_test=1.  ;-)

I'm trying that :-)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 06/16] rcu/nocb: Avoid confusing double write of rdp->nocb_cb_sleep
  2021-01-28 17:12 ` [PATCH 06/16] rcu/nocb: Avoid confusing double write of rdp->nocb_cb_sleep Frederic Weisbecker
@ 2021-01-28 21:42   ` Paul E. McKenney
  0 siblings, 0 replies; 28+ messages in thread
From: Paul E. McKenney @ 2021-01-28 21:42 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 06:12:12PM +0100, Frederic Weisbecker wrote:
> rdp->nocb_cb_sleep is first set to true by default after processing
> the callbacks, then set back to false if we still find ready callbacks
> to invoke.
> 
> This is confusing and even unsafe if it ever happens to be read
> locklessly at some point. So make sure we write it only once per
> nocb_cb_wait() loop.
> 
> Reported-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Nice, queued, thank you!  The usual wordsmithing &c...

							Thanx, Paul

------------------------------------------------------------------------

commit cbc3fbfe8424edc90668d5878eb493ae2ff1b888
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Thu Jan 28 18:12:12 2021 +0100

    rcu/nocb: Avoid confusing double write of rdp->nocb_cb_sleep
    
    The nocb_cb_wait() function first sets the rdp->nocb_cb_sleep flag to
    true after invoking the callbacks, and then sets it back to false if
    it finds more callbacks that are ready to invoke.
    
    This is confusing and will become unsafe if this flag is ever read
    locklessly.  This commit therefore writes it only once, based on the
    state after both callback invocation and checking.
    
    Reported-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: Josh Triplett <josh@joshtriplett.org>
    Cc: Lai Jiangshan <jiangshanlai@gmail.com>
    Cc: Joel Fernandes <joel@joelfernandes.org>
    Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
    Cc: Boqun Feng <boqun.feng@gmail.com>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index c61613a..a3db700 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2229,6 +2229,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 	unsigned long flags;
 	bool needwake_state = false;
 	bool needwake_gp = false;
+	bool can_sleep = true;
 	struct rcu_node *rnp = rdp->mynode;
 
 	local_irq_save(flags);
@@ -2252,8 +2253,6 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 		raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
 	}
 
-	WRITE_ONCE(rdp->nocb_cb_sleep, true);
-
 	if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
 		if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB)) {
 			rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_CB);
@@ -2261,7 +2260,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 				needwake_state = true;
 		}
 		if (rcu_segcblist_ready_cbs(cblist))
-			WRITE_ONCE(rdp->nocb_cb_sleep, false);
+			can_sleep = false;
 	} else {
 		/*
 		 * De-offloading. Clear our flag and notify the de-offload worker.
@@ -2274,6 +2273,8 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 			needwake_state = true;
 	}
 
+	WRITE_ONCE(rdp->nocb_cb_sleep, can_sleep);
+
 	if (rdp->nocb_cb_sleep)
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("CBSleep"));
 

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 04/16] rcu/nocb: Only (re-)initialize segcblist when needed on CPU up
  2021-01-28 21:34     ` Frederic Weisbecker
@ 2021-01-28 21:45       ` Paul E. McKenney
  2021-01-29  0:26         ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2021-01-28 21:45 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 10:34:13PM +0100, Frederic Weisbecker wrote:
> On Thu, Jan 28, 2021 at 11:12:28AM -0800, Paul E. McKenney wrote:
> > On Thu, Jan 28, 2021 at 06:12:10PM +0100, Frederic Weisbecker wrote:
> > > Simply checking whether the segcblist is enabled is enough to know if we
> > > need to initialize it or not. It's safe to check within the hotplug
> > > machinery.
> > > 
> > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > Cc: Josh Triplett <josh@joshtriplett.org>
> > > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > > Cc: Joel Fernandes <joel@joelfernandes.org>
> > > Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> > > Cc: Boqun Feng <boqun.feng@gmail.com>
> > 
> > Hmmm...
> > 
> > At the start of a CPU-hotplug operation, an incoming CPU's callback
> > list can be in a number of states:
> > 
> > 1.	Disabled and empty.  This is the case when the boot CPU has
> > 	not done call_rcu(), when a non-boot CPU first comes online,
> > 	and when a non-offloaded CPU comes back online.  In this case,
> > 	it is permissible to initialize ->cblist.  Because either the
> > 	CPU is currently running with interrupts disabled (boot CPU)
> > 	or is not yet running at all (other CPUs), it is not necessary
> > 	to acquire ->nocb_lock.
> > 
> > 2.	Disabled and non-empty.  This is the case when the boot CPU has
> > 	done call_rcu().  It is not permissible to initialize ->cblist
> > 	because doing so will leak any callbacks posted by early boot
> > 	invocations of call_rcu().
> 
> I don't think that's possible. In this case __call_rcu() has called
> rcu_segcblist_init() and has enabled the segcblist.

You are right, rcu_segcblist_init() would have been called in that
case and it has: rcu_segcblist_set_flags(rsclp, SEGCBLIST_ENABLED).

> > 	Test for the possibility of leaking by building with
> > 	CONFIG_PROVE_RCU=y and booting with rcupdate.rcu_self_test=1.
> > 
> > 3.	Enabled, whether empty or not.  This is the case when an
> > 	offloaded CPU comes back online.  This is the only case where
> > 	the ->nocb_lock must be held to modify ->cblist.  However,
> > 	it is not necessarily to modify ->cblist because the rcuoc
> > 	kthread is on the job.
> > 
> > So I believe that it is necessary to check for both disabled and empty.
> > But don't take my word for it!  Build with CONFIG_PROVE_RCU=y and boot
> > with rcupdate.rcu_self_test=1.  ;-)
> 
> I'm trying that :-)

Even better!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 05/16] rcu/nocb: Disable bypass when CPU isn't completely offloaded
  2021-01-28 21:31   ` Paul E. McKenney
@ 2021-01-28 22:25     ` Frederic Weisbecker
  2021-01-29  0:19       ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2021-01-28 22:25 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 01:31:33PM -0800, Paul E. McKenney wrote:
> On Thu, Jan 28, 2021 at 06:12:11PM +0100, Frederic Weisbecker wrote:
> > ---
> >  include/linux/rcu_segcblist.h |  7 ++++---
> >  kernel/rcu/tree_plugin.h      | 31 +++++++++++++++++++++++--------
> >  2 files changed, 27 insertions(+), 11 deletions(-)
> > 
> > diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
> > index 8afe886e85f1..5a2d6dadd720 100644
> > --- a/include/linux/rcu_segcblist.h
> > +++ b/include/linux/rcu_segcblist.h
> > @@ -109,7 +109,7 @@ struct rcu_cblist {
> >   *  |                           SEGCBLIST_KTHREAD_GP                           |
> >   *  |                                                                          |
> >   *  |   Kthreads handle callbacks holding nocb_lock, local rcu_core() stops    |
> > - *  |   handling callbacks.                                                    |
> > + *  |   handling callbacks. Allow bypass enqueue.                              |
> 
> "Allow bypass enqueue" as in bypass was disabled and entering this
> state causes it to be enabled, correct?

Right.

> If so, "Enable bypass
> queueing" would be less ambiguous and would match the change below.

Ok I'll fix that.

> > @@ -2412,6 +2423,7 @@ static long rcu_nocb_rdp_deoffload(void *arg)
> >  
> >  	rcu_nocb_lock_irqsave(rdp, flags);
> >  
> > +	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
> 
> This flush suffices because we are running on the target CPU
> holding ->nocb_lock (thus having interrupts disabled), and
> because rdp_offload_toggle() invokes rcu_segcblist_offload(),
> which clears SEGCBLIST_OFFLOADED.  Thus future calls to
> rcu_segcblist_completely_offloaded() will return false,
> which means that future calls to rcu_nocb_try_bypass() will
> refuse to put anything into the bypass.

Exactly!

> I believe that this deserves a comment, particularly if I am
> confused about what is really happening here.  ;-)

Yes indeed I've been greedy there, will comment this :o)

> 
> On another topic, since I saw it along the way...
> 
> The header comment for rcu_segcblist_offload() says that the
> structure must be empty, but that isn't really the case, is it?

Ah strange indeed, must be a leftover. I'll remove it.

> 
> >  	ret = rdp_offload_toggle(rdp, false, flags);
> >  	swait_event_exclusive(rdp->nocb_state_wq,
> >  			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
> > @@ -2422,19 +2434,22 @@ static long rcu_nocb_rdp_deoffload(void *arg)
> >  	rcu_nocb_unlock_irqrestore(rdp, flags);
> >  	del_timer_sync(&rdp->nocb_timer);
> >  
> > +	/* Sanity check */
> > +	WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > +
> >  	/*
> > -	 * Flush bypass. While IRQs are disabled and once we set
> > -	 * SEGCBLIST_SOFTIRQ_ONLY, no callback is supposed to be
> > -	 * enqueued on bypass.
> > +	 * Lock one last time so we see latest updates from kthreads and timer
> 
> You lost me here.  What updates are we seeing from kthreads and timers?

We want to make sure that whatever changes the kthreads have made to the
segcblist, such as updating its length or dequeuing callbacks, are visible
on the current CPU. The swait_event_exclusive() doesn't guarantee that
everything from the kthreads is visible here, as the flags are checked
locklessly and I haven't added specific barriers.

That said, right after the swait_event there is a nocb_lock LOCK/UNLOCK cycle
to disable the timer, so that's enough for the local CPU to see those updates.
What remains are the updates made by pending timers flushed in
del_timer_sync(). There is nothing special there that needs to be visible
here, but out of paranoia...

In fact this matters later in the series, as the above timer disablement and
flush will disappear, along with the LOCK/UNLOCK cycle that comes with them.
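
To put it in userspace terms (a pthread analogy only, with the mutex standing
in for ->nocb_lock, not the kernel code):

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long cbs_len = 1;	/* stands in for segcblist state the kthreads touch */

/* Kthread side: all updates happen inside the critical section. */
static void *kthread_side(void *arg)
{
	pthread_mutex_lock(&lock);
	cbs_len--;			/* e.g. a callback was dequeued */
	pthread_mutex_unlock(&lock);	/* publishes the update */
	return arg;
}

/* De-offloading side: one last LOCK/UNLOCK cycle synchronizes with the
 * kthreads' earlier unlocks, so the plain read below sees their updates
 * (assuming, as in the de-offload path, that they are done writing). */
static long deoffload_side(void)
{
	pthread_mutex_lock(&lock);
	pthread_mutex_unlock(&lock);
	return cbs_len;			/* safe to read locklessly now */
}

int main(void)
{
	pthread_t tid;

	pthread_create(&tid, NULL, kthread_side, NULL);
	pthread_join(&tid, NULL);
	return (int)deoffload_side();	/* 0: the kthread's update is visible */
}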

> 
> > +	 * so that we can later run callbacks locally without the lock.
> >  	 */
> >  	rcu_nocb_lock_irqsave(rdp, flags);
> > -	rcu_nocb_flush_bypass(rdp, NULL, jiffies);
> > +	/*
> > +	 * Theoretically we could set SEGCBLIST_SOFTIRQ_ONLY after the nocb
> > +	 * LOCK/UNLOCK but let's be paranoid.
> > +	 */
> >  	rcu_segcblist_set_flags(cblist, SEGCBLIST_SOFTIRQ_ONLY);
> 
> As long as we are being paranoid, should we also check that the
> bypass remains empty?

You missed it, check above for sanity check :-)

Thanks.

> 
> >  	/*
> >  	 * With SEGCBLIST_SOFTIRQ_ONLY, we can't use
> > -	 * rcu_nocb_unlock_irqrestore() anymore. Theoretically we
> > -	 * could set SEGCBLIST_SOFTIRQ_ONLY with cb unlocked and IRQs
> > -	 * disabled now, but let's be paranoid.
> > +	 * rcu_nocb_unlock_irqrestore() anymore.
> >  	 */
> >  	raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
> >  
> > -- 
> > 2.25.1
> > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 05/16] rcu/nocb: Disable bypass when CPU isn't completely offloaded
  2021-01-28 22:25     ` Frederic Weisbecker
@ 2021-01-29  0:19       ` Paul E. McKenney
  0 siblings, 0 replies; 28+ messages in thread
From: Paul E. McKenney @ 2021-01-29  0:19 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 11:25:31PM +0100, Frederic Weisbecker wrote:
> On Thu, Jan 28, 2021 at 01:31:33PM -0800, Paul E. McKenney wrote:
> > On Thu, Jan 28, 2021 at 06:12:11PM +0100, Frederic Weisbecker wrote:
> > > ---
> > >  include/linux/rcu_segcblist.h |  7 ++++---
> > >  kernel/rcu/tree_plugin.h      | 31 +++++++++++++++++++++++--------
> > >  2 files changed, 27 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
> > > index 8afe886e85f1..5a2d6dadd720 100644
> > > --- a/include/linux/rcu_segcblist.h
> > > +++ b/include/linux/rcu_segcblist.h
> > > @@ -109,7 +109,7 @@ struct rcu_cblist {
> > >   *  |                           SEGCBLIST_KTHREAD_GP                           |
> > >   *  |                                                                          |
> > >   *  |   Kthreads handle callbacks holding nocb_lock, local rcu_core() stops    |
> > > - *  |   handling callbacks.                                                    |
> > > + *  |   handling callbacks. Allow bypass enqueue.                              |
> > 
> > "Allow bypass enqueue" as in bypass was disabled and entering this
> > state causes it to be enabled, correct?
> 
> Right.
> 
> > If so, "Enable bypass
> > queueing" would be less ambiguous and would match the change below.
> 
> Ok I'll fix that.
> 
> > > @@ -2412,6 +2423,7 @@ static long rcu_nocb_rdp_deoffload(void *arg)
> > >  
> > >  	rcu_nocb_lock_irqsave(rdp, flags);
> > >  
> > > +	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
> > 
> > This flush suffices because we are running on the target CPU
> > holding ->nocb_lock (thus having interrupts disabled), and
> > because rdp_offload_toggle() invokes rcu_segcblist_offload(),
> > which clears SEGCBLIST_OFFLOADED.  Thus future calls to
> > rcu_segcblist_completely_offloaded() will return false,
> > which means that future calls to rcu_nocb_try_bypass() will
> > refuse to put anything into the bypass.
> 
> Exactly!

Whew!  ;-)

> > I believe that this deserves a comment, particularly if I am
> > confused about what is really happening here.  ;-)
> 
> Yes indeed I've been greedy there, will comment this :o)

Very good!

> > On another topic, since I saw it along the way...
> > 
> > The header comment for rcu_segcblist_offload() says that the
> > structure must be empty, but that isn't really the case, is it?
> 
> Ah strange indeed, must be a leftover. I'll remove it.

I should have spotted it earlier, shouldn't I have?  ;-)

> > >  	ret = rdp_offload_toggle(rdp, false, flags);
> > >  	swait_event_exclusive(rdp->nocb_state_wq,
> > >  			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
> > > @@ -2422,19 +2434,22 @@ static long rcu_nocb_rdp_deoffload(void *arg)
> > >  	rcu_nocb_unlock_irqrestore(rdp, flags);
> > >  	del_timer_sync(&rdp->nocb_timer);
> > >  
> > > +	/* Sanity check */
> > > +	WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > > +
> > >  	/*
> > > -	 * Flush bypass. While IRQs are disabled and once we set
> > > -	 * SEGCBLIST_SOFTIRQ_ONLY, no callback is supposed to be
> > > -	 * enqueued on bypass.
> > > +	 * Lock one last time so we see latest updates from kthreads and timer
> > 
> > You lost me here.  What updates are we seeing from kthreads and timers?
> 
> We want to make sure that whatever changes the kthreads have made to the
> segcblist, such as updating its length or dequeuing callbacks, are visible
> on the current CPU. The swait_event_exclusive() doesn't guarantee that
> everything from the kthreads is visible here, as the flags are checked
> locklessly and I haven't added specific barriers.
> 
> That said, right after the swait_event there is a nocb_lock LOCK/UNLOCK cycle
> to disable the timer, so that's enough for the local CPU to see those updates.
> What remains are the updates made by pending timers flushed in
> del_timer_sync(). There is nothing special there that needs to be visible
> here, but out of paranoia...
> 
> In fact this matters later in the series, as the above timer disablement and
> flush will disappear, along with the LOCK/UNLOCK cycle that comes with them.

OK, so the point is that any future manipulations of this callback
list will see the desired stable state, correct?

> > > +	 * so that we can later run callbacks locally without the lock.
> > >  	 */
> > >  	rcu_nocb_lock_irqsave(rdp, flags);
> > > -	rcu_nocb_flush_bypass(rdp, NULL, jiffies);
> > > +	/*
> > > +	 * Theoretically we could set SEGCBLIST_SOFTIRQ_ONLY after the nocb
> > > +	 * LOCK/UNLOCK but let's be paranoid.
> > > +	 */
> > >  	rcu_segcblist_set_flags(cblist, SEGCBLIST_SOFTIRQ_ONLY);
> > 
> > As long as we are being paranoid, should we also check that the
> > bypass remains empty?
> 
> You missed it, check above for sanity check :-)

I did see that, but...  Why not place it as late as possible, like
just before releasing the ->nocb_lock?  Or is there some way that
a callback can sneak into the bypass list after the sanity check but
before ->nocb_lock is acquired?

							Thanx, Paul

> Thanks.
> 
> > 
> > >  	/*
> > >  	 * With SEGCBLIST_SOFTIRQ_ONLY, we can't use
> > > -	 * rcu_nocb_unlock_irqrestore() anymore. Theoretically we
> > > -	 * could set SEGCBLIST_SOFTIRQ_ONLY with cb unlocked and IRQs
> > > -	 * disabled now, but let's be paranoid.
> > > +	 * rcu_nocb_unlock_irqrestore() anymore.
> > >  	 */
> > >  	raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
> > >  
> > > -- 
> > > 2.25.1
> > > 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 04/16] rcu/nocb: Only (re-)initialize segcblist when needed on CPU up
  2021-01-28 21:45       ` Paul E. McKenney
@ 2021-01-29  0:26         ` Paul E. McKenney
  0 siblings, 0 replies; 28+ messages in thread
From: Paul E. McKenney @ 2021-01-29  0:26 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 01:45:24PM -0800, Paul E. McKenney wrote:
> On Thu, Jan 28, 2021 at 10:34:13PM +0100, Frederic Weisbecker wrote:
> > On Thu, Jan 28, 2021 at 11:12:28AM -0800, Paul E. McKenney wrote:
> > > On Thu, Jan 28, 2021 at 06:12:10PM +0100, Frederic Weisbecker wrote:
> > > > Simply checking whether the segcblist is enabled is enough to know if we
> > > > need to initialize it or not. It's safe to check within the hotplug
> > > > machinery.
> > > > 
> > > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > > Cc: Josh Triplett <josh@joshtriplett.org>
> > > > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > > > Cc: Joel Fernandes <joel@joelfernandes.org>
> > > > Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> > > > Cc: Boqun Feng <boqun.feng@gmail.com>
> > > 
> > > Hmmm...
> > > 
> > > At the start of a CPU-hotplug operation, an incoming CPU's callback
> > > list can be in a number of states:
> > > 
> > > 1.	Disabled and empty.  This is the case when the boot CPU has
> > > 	not done call_rcu(), when a non-boot CPU first comes online,
> > > 	and when a non-offloaded CPU comes back online.  In this case,
> > > 	it is permissible to initialize ->cblist.  Because either the
> > > 	CPU is currently running with interrupts disabled (boot CPU)
> > > 	or is not yet running at all (other CPUs), it is not necessary
> > > 	to acquire ->nocb_lock.
> > > 
> > > 2.	Disabled and non-empty.  This is the case when the boot CPU has
> > > 	done call_rcu().  It is not permissible to initialize ->cblist
> > > 	because doing so will leak any callbacks posted by early boot
> > > 	invocations of call_rcu().
> > 
> > I don't think that's possible. In this case __call_rcu() has called
> > rcu_segcblist_init() and has enabled the segcblist.
> 
> You are right, rcu_segcblist_init() would have been called in that
> case and it has: rcu_segcblist_set_flags(rsclp, SEGCBLIST_ENABLED).
> 
> > > 	Test for the possibility of leaking by building with
> > > 	CONFIG_PROVE_RCU=y and booting with rcupdate.rcu_self_test=1.
> > > 
> > > 3.	Enabled, whether empty or not.  This is the case when an
> > > 	offloaded CPU comes back online.  This is the only case where
> > > 	the ->nocb_lock must be held to modify ->cblist.  However,
> > > 	it is not necessary to modify ->cblist because the rcuoc
> > > 	kthread is on the job.
> > > 
> > > So I believe that it is necessary to check for both disabled and empty.
> > > But don't take my word for it!  Build with CONFIG_PROVE_RCU=y and boot
> > > with rcupdate.rcu_self_test=1.  ;-)
> > 
> > I'm trying that :-)
> 
> Even better!

I applied this patch (with the usual wordsmithing as shown below), then
ran this:

tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE05" --bootargs "rcu_nocbs=0-7" --trust-make

The resulting console.log file says "Running RCU self tests" and the test
runs to completion without complaint.  So good show!

								Thanx, Paul

------------------------------------------------------------------------

commit 0a43799de756a486e45eba8d9d4286a577e746cd
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Thu Jan 28 18:12:10 2021 +0100

    rcu/nocb: Only (re-)initialize segcblist when needed on CPU up
    
    At the start of a CPU-hotplug operation, the incoming CPU's callback
    list can be in a number of states:
    
    1.      Disabled and empty.  This is the case when the boot CPU has
            not invoked call_rcu(), when a non-boot CPU first comes online,
            and when a non-offloaded CPU comes back online.  In this case,
            it is both necessary and permissible to initialize ->cblist.
            Because either the CPU is currently running with interrupts
            disabled (boot CPU) or is not yet running at all (other CPUs),
            it is not necessary to acquire ->nocb_lock.
    
            In this case, initialization is required.
    
    2.      Disabled and non-empty.  This cannot occur, because early boot
            call_rcu() invocations enable the callback list before enqueuing
            their callback.
    
    3.      Enabled, whether empty or not.  In this case, the callback
            list has already been initialized.  This case occurs when the
            boot CPU has executed an early boot call_rcu() and also when
            an offloaded CPU comes back online.  In both cases, there is
            no need to initialize the callback list: In the boot-CPU case,
            the CPU has not (yet) gone offline, and in the offloaded case,
            the rcuo kthreads are taking care of business.
    
            Because it is not necessary to initialize the callback list,
            it is also not necessary to acquire ->nocb_lock.
    
    Therefore, checking if the segcblist is enabled suffices.  This commit
    therefore initializes the callback list at rcutree_prepare_cpu() time
    only if that list is disabled.
    
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Cc: Josh Triplett <josh@joshtriplett.org>
    Cc: Lai Jiangshan <jiangshanlai@gmail.com>
    Cc: Joel Fernandes <joel@joelfernandes.org>
    Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
    Cc: Boqun Feng <boqun.feng@gmail.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 00059df..70ddc33 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4064,14 +4064,13 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	rdp->dynticks_nesting = 1;	/* CPU not up, no tearing. */
 	rcu_dynticks_eqs_online();
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
+
 	/*
-	 * Lock in case the CB/GP kthreads are still around handling
-	 * old callbacks.
+	 * Only non-NOCB CPUs that didn't have early-boot callbacks need to be
+	 * (re-)initialized.
 	 */
-	rcu_nocb_lock(rdp);
-	if (rcu_segcblist_empty(&rdp->cblist)) /* No early-boot CBs? */
+	if (!rcu_segcblist_is_enabled(&rdp->cblist))
 		rcu_segcblist_init(&rdp->cblist);  /* Re-enable callbacks. */
-	rcu_nocb_unlock(rdp);
 
 	/*
 	 * Add CPU to leaf rcu_node pending-online bitmask.  Any needed

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 07/16] rcu/nocb: Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading
  2021-01-28 17:12 ` [PATCH 07/16] rcu/nocb: Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading Frederic Weisbecker
@ 2021-01-29  0:49   ` Paul E. McKenney
  0 siblings, 0 replies; 28+ messages in thread
From: Paul E. McKenney @ 2021-01-29  0:49 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 06:12:13PM +0100, Frederic Weisbecker wrote:
> Clarify the name of this function a bit: it suggests returning true
> when the state is updated, whereas it actually returns true when the rdp
> is in the process of deoffloading and must be ignored.
> 
> Reported-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Fair point, thank you!  I have queued this one for further review and
testing, with the usual wordsmithing shown below.

							Thanx, Paul

------------------------------------------------------------------------

commit 142d159f544763140e0f498936bca8f71563e0e0
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Thu Jan 28 18:12:13 2021 +0100

    rcu/nocb: Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading
    
    The name nocb_gp_update_state() is unenlightening, so this commit changes
    it to nocb_gp_update_state_deoffloading().  This function now does what
    its name says: it updates state and returns true if the CPU corresponding to
    the specified rcu_data structure is in the process of being de-offloaded.
    
    Reported-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: Josh Triplett <josh@joshtriplett.org>
    Cc: Lai Jiangshan <jiangshanlai@gmail.com>
    Cc: Joel Fernandes <joel@joelfernandes.org>
    Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
    Cc: Boqun Feng <boqun.feng@gmail.com>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index a3db700..9c0ee82 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2014,7 +2014,8 @@ static inline bool nocb_gp_enabled_cb(struct rcu_data *rdp)
 	return rcu_segcblist_test_flags(&rdp->cblist, flags);
 }
 
-static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_state)
+static inline bool nocb_gp_update_state_deoffloading(struct rcu_data *rdp,
+						     bool *needwake_state)
 {
 	struct rcu_segcblist *cblist = &rdp->cblist;
 
@@ -2024,7 +2025,7 @@ static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_sta
 			if (rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
 				*needwake_state = true;
 		}
-		return true;
+		return false;
 	}
 
 	/*
@@ -2035,7 +2036,7 @@ static inline bool nocb_gp_update_state(struct rcu_data *rdp, bool *needwake_sta
 	rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP);
 	if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
 		*needwake_state = true;
-	return false;
+	return true;
 }
 
 
@@ -2073,7 +2074,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 			continue;
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
 		rcu_nocb_lock_irqsave(rdp, flags);
-		if (!nocb_gp_update_state(rdp, &needwake_state)) {
+		if (nocb_gp_update_state_deoffloading(rdp, &needwake_state)) {
 			rcu_nocb_unlock_irqrestore(rdp, flags);
 			if (needwake_state)
 				swake_up_one(&rdp->nocb_state_wq);
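
To make the inverted polarity concrete, here is a minimal userspace
sketch of how the call site reads after the rename.  The struct and
function names below are illustrative only, not the kernel code: the
point is that naming a predicate for what a "true" return means lets
"if (predicate()) continue;" read naturally.

/* build with: cc -std=c99 sketch.c */
#include <stdbool.h>
#include <stdio.h>

struct rdp {
	bool deoffloading;	/* stand-in for the SEGCBLIST state bits */
};

/* Named for what a "true" return means: this rdp is de-offloading. */
static bool update_state_deoffloading(struct rdp *rdp)
{
	/* ...state updates would go here... */
	return rdp->deoffloading;	/* true: caller must skip this rdp */
}

int main(void)
{
	struct rdp rdps[] = { { .deoffloading = false },
			      { .deoffloading = true } };

	for (unsigned int i = 0; i < 2; i++) {
		if (update_state_deoffloading(&rdps[i]))
			continue;	/* skip rdps being de-offloaded */
		printf("processing rdp %u\n", i);
	}
	return 0;
}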

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 08/16] rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible
  2021-01-28 17:12 ` [PATCH 08/16] rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible Frederic Weisbecker
@ 2021-01-29  0:51   ` Paul E. McKenney
  0 siblings, 0 replies; 28+ messages in thread
From: Paul E. McKenney @ 2021-01-29  0:51 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay, Josh Triplett,
	Stable, Joel Fernandes

On Thu, Jan 28, 2021 at 06:12:14PM +0100, Frederic Weisbecker wrote:
> Those tracing calls don't need to be under the nocb lock. Move them
> outside.

This one looks fine (give or take the usual wordsmithing), but does
not apply, presumably due to my not having yet taken one of the earlier
patches.  So deferred for now, and please include it in your next version.

							Thanx, Paul

> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> ---
>  kernel/rcu/tree_plugin.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 86c3bcceede6..8c5fea58ee80 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -1700,9 +1700,9 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
>  
>  	lockdep_assert_held(&rdp->nocb_lock);
>  	if (!READ_ONCE(rdp_gp->nocb_gp_kthread)) {
> +		rcu_nocb_unlock_irqrestore(rdp, flags);
>  		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>  				    TPS("AlreadyAwake"));
> -		rcu_nocb_unlock_irqrestore(rdp, flags);
>  		return false;
>  	}
>  
> @@ -1950,9 +1950,9 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
>  	// If we are being polled or there is no kthread, just leave.
>  	t = READ_ONCE(rdp->nocb_gp_kthread);
>  	if (rcu_nocb_poll || !t) {
> +		rcu_nocb_unlock_irqrestore(rdp, flags);
>  		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>  				    TPS("WakeNotPoll"));
> -		rcu_nocb_unlock_irqrestore(rdp, flags);
>  		return;
>  	}
> 	// Need to actually do a wakeup.
> @@ -1987,8 +1987,8 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
>  					   TPS("WakeOvfIsDeferred"));
>  		rcu_nocb_unlock_irqrestore(rdp, flags);
>  	} else {
> +		rcu_nocb_unlock_irqrestore(rdp, flags);
>  		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
> -		rcu_nocb_unlock_irqrestore(rdp, flags);
>  	}
>  	return;
>  }
> -- 
> 2.25.1
> 
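
As an aside, the pattern applied above is generic: do only the protected
work while holding the lock, snapshot anything the trace needs, and emit
the trace after the unlock so its cost is not added to the lock hold
time.  A minimal pthread-based sketch, purely illustrative and not
kernel code:

/* build with: cc -pthread sketch.c */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int state;

static void update_and_trace(int new_state)
{
	pthread_mutex_lock(&lock);
	int old = state;	/* snapshot what the trace will need */
	state = new_state;	/* the actual protected work */
	pthread_mutex_unlock(&lock);

	/* Tracing uses only local snapshots, so post-unlock is safe. */
	printf("state: %d -> %d\n", old, new_state);
}

int main(void)
{
	update_and_trace(1);
	update_and_trace(2);
	return 0;
}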

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-01-29  0:53 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-28 17:12 [PATCH 00/16] rcu/nocb updates Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 01/16] rcu/nocb: Fix potential missed nocb_timer rearm Frederic Weisbecker
     [not found]   ` <20210128184834.GP2743@paulmck-ThinkPad-P72>
2021-01-28 21:23     ` Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 02/16] rcu/nocb: Comment the reason behind BH disablement on batch processing Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 03/16] rcu/nocb: Forbid NOCB toggling on offline CPUs Frederic Weisbecker
2021-01-28 19:52   ` Paul E. McKenney
2021-01-28 17:12 ` [PATCH 04/16] rcu/nocb: Only (re-)initialize segcblist when needed on CPU up Frederic Weisbecker
     [not found]   ` <20210128191228.GQ2743@paulmck-ThinkPad-P72>
2021-01-28 21:34     ` Frederic Weisbecker
2021-01-28 21:45       ` Paul E. McKenney
2021-01-29  0:26         ` Paul E. McKenney
2021-01-28 17:12 ` [PATCH 05/16] rcu/nocb: Disable bypass when CPU isn't completely offloaded Frederic Weisbecker
2021-01-28 21:31   ` Paul E. McKenney
2021-01-28 22:25     ` Frederic Weisbecker
2021-01-29  0:19       ` Paul E. McKenney
2021-01-28 17:12 ` [PATCH 06/16] rcu/nocb: Avoid confusing double write of rdp->nocb_cb_sleep Frederic Weisbecker
2021-01-28 21:42   ` Paul E. McKenney
2021-01-28 17:12 ` [PATCH 07/16] rcu/nocb: Rename nocb_gp_update_state to nocb_gp_update_state_deoffloading Frederic Weisbecker
2021-01-29  0:49   ` Paul E. McKenney
2021-01-28 17:12 ` [PATCH 08/16] rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible Frederic Weisbecker
2021-01-29  0:51   ` Paul E. McKenney
2021-01-28 17:12 ` [PATCH 09/16] rcu/nocb: Merge nocb_timer to the rdp leader Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 10/16] rcu/nocb: Directly call __wake_nocb_gp() from bypass timer Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 11/16] rcu/nocb: Allow de-offloading rdp leader Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 12/16] rcu/nocb: Cancel nocb_timer upon nocb_gp wakeup Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 13/16] rcu/nocb: Delete bypass_timer " Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 14/16] rcu/nocb: Only cancel nocb timer if not polling Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 15/16] rcu/nocb: Prepare for finegrained deferred wakeup Frederic Weisbecker
2021-01-28 17:12 ` [PATCH 16/16] rcu/nocb: Unify timers Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).