* [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5
@ 2019-10-03  1:38 Paul E. McKenney
  2019-10-03  1:38 ` [PATCH tip/core/rcu 01/12] nohz: Add TICK_DEP_BIT_RCU paulmck
                   ` (11 more replies)
  0 siblings, 12 replies; 21+ messages in thread
From: Paul E. McKenney @ 2019-10-03  1:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel

Hello!

This series contains various fixes for NO_HZ and NO_HZ_FULL problems,
including re-enabling the tick during long-term kernel-mode execution.

1.	Add TICK_DEP_BIT_RCU (which allows RCU-specific tick re-enabling),
	courtesy of Frederic Weisbecker.

2.	Export tick start/stop functions for rcutorture.

3.	Force on tick when invoking lots of callbacks.

4.	Force on tick for rcutorture readers and callback flooders.

5.	Provide RCU quiescent state in multi_cpu_stop().

6.	Make CPU-hotplug removal operations enable tick.

7.	Use {READ,WRITE}_ONCE() for multi_cpu_stop() ->state.

8.	Force tick on for nohz_full CPUs not reaching quiescent states.

9.	Force nohz_full tick on upon irq enter instead of exit.

10.	Reset CPU hints when reporting a quiescent state, courtesy of
	Joel Fernandes.

11.	Confine ->core_needs_qs accesses to the corresponding CPU.

12.	Make kernel-mode nohz_full CPUs invoke the RCU core processing.

							Thanx, Paul

------------------------------------------------------------------------

 include/linux/rcutree.h      |    1 
 include/linux/tick.h         |    7 ++
 include/trace/events/timer.h |    3 -
 kernel/rcu/rcutorture.c      |   20 +++++---
 kernel/rcu/tree.c            |  105 ++++++++++++++++++++++++++++++-------------
 kernel/rcu/tree.h            |    1 
 kernel/stop_machine.c        |    7 +-
 kernel/time/tick-sched.c     |   11 ++++
 8 files changed, 114 insertions(+), 41 deletions(-)


* [PATCH tip/core/rcu 01/12] nohz: Add TICK_DEP_BIT_RCU
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
@ 2019-10-03  1:38 ` paulmck
  2019-10-03  1:38 ` [PATCH tip/core/rcu 02/12] time: Export tick start/stop functions for rcutorture paulmck
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: paulmck @ 2019-10-03  1:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Frederic Weisbecker,
	Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

If a nohz_full CPU is looping in the kernel, the scheduling-clock tick
might nevertheless remain disabled.  In !PREEMPT kernels, this can
prevent RCU's attempts to enlist the aid of that CPU's executions of
cond_resched(), which can in turn result in an arbitrarily delayed grace
period and thus an OOM.  RCU therefore needs a way to enable a holdout
nohz_full CPU's scheduler-clock interrupt.

This commit therefore provides a new TICK_DEP_BIT_RCU value which RCU can
pass to tick_dep_set_cpu() and friends to force on the scheduler-clock
interrupt for a specified CPU or task.  In some cases, rcutorture needs
to turn on the scheduler-clock tick, so this commit also exports the
relevant symbols to GPL-licensed modules.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
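For illustration, a minimal sketch of how a caller might use the new bit
(the "cpu" variable is assumed; this usage is implied by the patch rather
than being part of it):

	/*
	 * Sketch: keep the scheduler-clock tick running on CPU "cpu"
	 * until RCU sees the quiescent state it has been waiting for.
	 */
	tick_dep_set_cpu(cpu, TICK_DEP_BIT_RCU);	/* Tick cannot stop on cpu. */
	/* ... wait for the holdout CPU to reach a quiescent state ... */
	tick_dep_clear_cpu(cpu, TICK_DEP_BIT_RCU);	/* Tick may stop again. */
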
 include/linux/tick.h         | 7 ++++++-
 include/trace/events/timer.h | 3 ++-
 kernel/time/tick-sched.c     | 7 +++++++
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index f92a10b..39eb445 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -108,7 +108,8 @@ enum tick_dep_bits {
 	TICK_DEP_BIT_POSIX_TIMER	= 0,
 	TICK_DEP_BIT_PERF_EVENTS	= 1,
 	TICK_DEP_BIT_SCHED		= 2,
-	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3
+	TICK_DEP_BIT_CLOCK_UNSTABLE	= 3,
+	TICK_DEP_BIT_RCU		= 4
 };
 
 #define TICK_DEP_MASK_NONE		0
@@ -116,6 +117,7 @@ enum tick_dep_bits {
 #define TICK_DEP_MASK_PERF_EVENTS	(1 << TICK_DEP_BIT_PERF_EVENTS)
 #define TICK_DEP_MASK_SCHED		(1 << TICK_DEP_BIT_SCHED)
 #define TICK_DEP_MASK_CLOCK_UNSTABLE	(1 << TICK_DEP_BIT_CLOCK_UNSTABLE)
+#define TICK_DEP_MASK_RCU		(1 << TICK_DEP_BIT_RCU)
 
 #ifdef CONFIG_NO_HZ_COMMON
 extern bool tick_nohz_enabled;
@@ -268,6 +270,9 @@ static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
 
+static inline void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) { }
+static inline void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit) { }
+
 static inline void tick_dep_set(enum tick_dep_bits bit) { }
 static inline void tick_dep_clear(enum tick_dep_bits bit) { }
 static inline void tick_dep_set_cpu(int cpu, enum tick_dep_bits bit) { }
diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index b7a9048..295517f 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -367,7 +367,8 @@ TRACE_EVENT(itimer_expire,
 		tick_dep_name(POSIX_TIMER)		\
 		tick_dep_name(PERF_EVENTS)		\
 		tick_dep_name(SCHED)			\
-		tick_dep_name_end(CLOCK_UNSTABLE)
+		tick_dep_name(CLOCK_UNSTABLE)		\
+		tick_dep_name_end(RCU)
 
 #undef tick_dep_name
 #undef tick_dep_mask_name
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 9558517..d1b0a84 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -198,6 +198,11 @@ static bool check_tick_dependency(atomic_t *dep)
 		return true;
 	}
 
+	if (val & TICK_DEP_MASK_RCU) {
+		trace_tick_stop(0, TICK_DEP_MASK_RCU);
+		return true;
+	}
+
 	return false;
 }
 
@@ -324,6 +329,7 @@ void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit)
 		preempt_enable();
 	}
 }
+EXPORT_SYMBOL_GPL(tick_nohz_dep_set_cpu);
 
 void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit)
 {
@@ -331,6 +337,7 @@ void tick_nohz_dep_clear_cpu(int cpu, enum tick_dep_bits bit)
 
 	atomic_andnot(BIT(bit), &ts->tick_dep_mask);
 }
+EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_cpu);
 
 /*
  * Set a per-task tick dependency. Posix CPU timers need this in order to elapse
-- 
2.9.5



* [PATCH tip/core/rcu 02/12] time: Export tick start/stop functions for rcutorture
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
  2019-10-03  1:38 ` [PATCH tip/core/rcu 01/12] nohz: Add TICK_DEP_BIT_RCU paulmck
@ 2019-10-03  1:38 ` paulmck
  2019-10-03  1:38 ` [PATCH tip/core/rcu 03/12] rcu: Force on tick when invoking lots of callbacks paulmck
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: paulmck @ 2019-10-03  1:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

It turns out that rcutorture needs to ensure that the scheduling-clock
interrupt is enabled in CONFIG_NO_HZ_FULL kernels before starting on
CPU-bound in-kernel processing.  This commit therefore exports
tick_nohz_dep_set_task(), tick_nohz_dep_clear_task(), and
tick_nohz_full_setup() to GPL kernel modules.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
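For illustration, a GPL module would bracket its CPU-bound loops roughly
as follows.  This is a sketch using the tick.h wrappers, which resolve to
the functions exported here when CONFIG_NO_HZ_FULL is enabled:

	/* Sketch: hold the tick on while this kthread runs CPU-bound. */
	tick_dep_set_task(current, TICK_DEP_BIT_RCU);
	/* ... extended in-kernel processing ... */
	tick_dep_clear_task(current, TICK_DEP_BIT_RCU);
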
 kernel/time/tick-sched.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index d1b0a84..1ffdb4b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -172,6 +172,7 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
 #ifdef CONFIG_NO_HZ_FULL
 cpumask_var_t tick_nohz_full_mask;
 bool tick_nohz_full_running;
+EXPORT_SYMBOL_GPL(tick_nohz_full_running);
 static atomic_t tick_dep_mask;
 
 static bool check_tick_dependency(atomic_t *dep)
@@ -351,11 +352,13 @@ void tick_nohz_dep_set_task(struct task_struct *tsk, enum tick_dep_bits bit)
 	 */
 	tick_nohz_dep_set_all(&tsk->tick_dep_mask, bit);
 }
+EXPORT_SYMBOL_GPL(tick_nohz_dep_set_task);
 
 void tick_nohz_dep_clear_task(struct task_struct *tsk, enum tick_dep_bits bit)
 {
 	atomic_andnot(BIT(bit), &tsk->tick_dep_mask);
 }
+EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_task);
 
 /*
  * Set a per-taskgroup tick dependency. Posix CPU timers need this in order to elapse
@@ -404,6 +407,7 @@ void __init tick_nohz_full_setup(cpumask_var_t cpumask)
 	cpumask_copy(tick_nohz_full_mask, cpumask);
 	tick_nohz_full_running = true;
 }
+EXPORT_SYMBOL_GPL(tick_nohz_full_setup);
 
 static int tick_nohz_cpu_down(unsigned int cpu)
 {
-- 
2.9.5



* [PATCH tip/core/rcu 03/12] rcu: Force on tick when invoking lots of callbacks
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
  2019-10-03  1:38 ` [PATCH tip/core/rcu 01/12] nohz: Add TICK_DEP_BIT_RCU paulmck
  2019-10-03  1:38 ` [PATCH tip/core/rcu 02/12] time: Export tick start/stop functions for rcutorture paulmck
@ 2019-10-03  1:38 ` paulmck
  2019-10-03 14:10   ` Frederic Weisbecker
  2019-10-03  1:38 ` [PATCH tip/core/rcu 04/12] rcutorture: Force on tick for readers and callback flooders paulmck
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: paulmck @ 2019-10-03  1:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

Callback invocation can run for a significant time period, and within
CONFIG_NO_HZ_FULL=y kernels, this period will be devoid of scheduler-clock
interrupts.  In-kernel execution without such interrupts can cause all
manner of malfunction, with RCU CPU stall warnings being but one result.

This commit therefore forces scheduling-clock interrupts on whenever more
than a few RCU callbacks are invoked.  Because offloaded callback invocation
can be preempted, this forcing is withdrawn on each context switch.  This
in turn requires that the loop invoking RCU callbacks reiterate the forcing
periodically.

[ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
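The "reiterate the forcing" requirement can be pictured as follows.  This
is an illustrative sketch rather than the tree.c code, and invoke_one_cb()
and the reassertion period are hypothetical:

	struct rcu_head *rhp;
	int count = 0;

	/* Re-assert the tick dependency from time to time because a
	 * context switch during preemptible callback invocation
	 * withdraws the forcing. */
	tick_dep_set_task(current, TICK_DEP_BIT_RCU);
	while ((rhp = rcu_cblist_dequeue(&rcl)) != NULL) {
		invoke_one_cb(rhp);		/* Hypothetical helper. */
		if (!(++count & 0x3f))		/* Arbitrary period. */
			tick_dep_set_task(current, TICK_DEP_BIT_RCU);
	}
	tick_dep_clear_task(current, TICK_DEP_BIT_RCU);
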
 kernel/rcu/tree.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8110514..db673ae 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2151,6 +2151,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	rcu_nocb_unlock_irqrestore(rdp, flags);
 
 	/* Invoke callbacks. */
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_dep_set_task(current, TICK_DEP_BIT_RCU);
 	rhp = rcu_cblist_dequeue(&rcl);
 	for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
 		debug_rcu_head_unqueue(rhp);
@@ -2217,6 +2219,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	/* Re-invoke RCU core processing if there are callbacks remaining. */
 	if (!offloaded && rcu_segcblist_ready_cbs(&rdp->cblist))
 		invoke_rcu_core();
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_dep_clear_task(current, TICK_DEP_BIT_RCU);
 }
 
 /*
-- 
2.9.5



* [PATCH tip/core/rcu 04/12] rcutorture: Force on tick for readers and callback flooders
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
                   ` (2 preceding siblings ...)
  2019-10-03  1:38 ` [PATCH tip/core/rcu 03/12] rcu: Force on tick when invoking lots of callbacks paulmck
@ 2019-10-03  1:38 ` paulmck
  2019-10-03 14:14   ` Frederic Weisbecker
  2019-10-03  1:38 ` [PATCH tip/core/rcu 05/12] stop_machine: EXP Provide RCU quiescent state in multi_cpu_stop() paulmck
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: paulmck @ 2019-10-03  1:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

Readers and callback flooders in the rcutorture stress-test suite run for
extended time periods by design.  They do take pains to relinquish the
CPU from time to time, but in some cases this relies on the scheduler
being active, which in turn relies on the scheduler-clock interrupt
firing from time to time.

This commit therefore forces scheduling-clock interrupts within
these loops.  While in the area, this commit also prevents
rcu_torture_reader()'s occasional timed sleeps from delaying shutdown.

[ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 kernel/rcu/rcutorture.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 3c9feca..1ce6a7e 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -44,6 +44,7 @@
 #include <linux/sched/debug.h>
 #include <linux/sched/sysctl.h>
 #include <linux/oom.h>
+#include <linux/tick.h>
 
 #include "rcu.h"
 
@@ -1363,15 +1364,16 @@ rcu_torture_reader(void *arg)
 	set_user_nice(current, MAX_NICE);
 	if (irqreader && cur_ops->irq_capable)
 		timer_setup_on_stack(&t, rcu_torture_timer, 0);
-
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_dep_set_task(current, TICK_DEP_BIT_RCU);
 	do {
 		if (irqreader && cur_ops->irq_capable) {
 			if (!timer_pending(&t))
 				mod_timer(&t, jiffies + 1);
 		}
-		if (!rcu_torture_one_read(&rand))
+		if (!rcu_torture_one_read(&rand) && !torture_must_stop())
 			schedule_timeout_interruptible(HZ);
-		if (time_after(jiffies, lastsleep)) {
+		if (time_after(jiffies, lastsleep) && !torture_must_stop()) {
 			schedule_timeout_interruptible(1);
 			lastsleep = jiffies + 10;
 		}
@@ -1383,6 +1385,8 @@ rcu_torture_reader(void *arg)
 		del_timer_sync(&t);
 		destroy_timer_on_stack(&t);
 	}
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_dep_clear_task(current, TICK_DEP_BIT_RCU);
 	torture_kthread_stopping("rcu_torture_reader");
 	return 0;
 }
@@ -1729,10 +1733,10 @@ static void rcu_torture_fwd_prog_cond_resched(unsigned long iter)
 		// Real call_rcu() floods hit userspace, so emulate that.
 		if (need_resched() || (iter & 0xfff))
 			schedule();
-	} else {
-		// No userspace emulation: CB invocation throttles call_rcu()
-		cond_resched();
+		return;
 	}
+	// No userspace emulation: CB invocation throttles call_rcu()
+	cond_resched();
 }
 
 /*
@@ -1865,6 +1869,8 @@ static void rcu_torture_fwd_prog_cr(void)
 	cver = READ_ONCE(rcu_torture_current_version);
 	gps = cur_ops->get_gp_seq();
 	rcu_launder_gp_seq_start = gps;
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_dep_set_task(current, TICK_DEP_BIT_RCU);
 	while (time_before(jiffies, stopat) &&
 	       !shutdown_time_arrived() &&
 	       !READ_ONCE(rcu_fwd_emergency_stop) && !torture_must_stop()) {
@@ -1911,6 +1917,8 @@ static void rcu_torture_fwd_prog_cr(void)
 		rcu_torture_fwd_cb_hist();
 	}
 	schedule_timeout_uninterruptible(HZ); /* Let CBs drain. */
+	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
+		tick_dep_clear_task(current, TICK_DEP_BIT_RCU);
 	WRITE_ONCE(rcu_fwd_cb_nodelay, false);
 }
 
-- 
2.9.5



* [PATCH tip/core/rcu 05/12] stop_machine: EXP Provide RCU quiescent state in multi_cpu_stop()
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
                   ` (3 preceding siblings ...)
  2019-10-03  1:38 ` [PATCH tip/core/rcu 04/12] rcutorture: Force on tick for readers and callback flooders paulmck
@ 2019-10-03  1:38 ` paulmck
  2019-10-03  1:38 ` [PATCH tip/core/rcu 06/12] rcu: Make CPU-hotplug removal operations enable tick paulmck
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: paulmck @ 2019-10-03  1:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

When multi_cpu_stop() loops waiting for other tasks, it can trigger an RCU
CPU stall warning.  This can be misleading because what is instead needed
is information on whatever task is blocking multi_cpu_stop().  This commit
therefore inserts an RCU quiescent state into the multi_cpu_stop()
function's waitloop.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
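For illustration, the usage pattern that this export enables looks roughly
like the following sketch.  The "done" flag is hypothetical, and per the
function's header comment, the caller must have disabled interrupts and
must not be idle:

	unsigned long flags;

	/* Sketch: a long in-kernel wait loop that would otherwise deny
	 * RCU any quiescent state from this CPU. */
	local_irq_save(flags);
	while (!READ_ONCE(done)) {
		cpu_relax();
		rcu_momentary_dyntick_idle();	/* Momentary QS for RCU. */
	}
	local_irq_restore(flags);
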
 include/linux/rcutree.h | 1 +
 kernel/rcu/tree.c       | 2 +-
 kernel/stop_machine.c   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 18b1ed9..c5147de 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -37,6 +37,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
 
 void rcu_barrier(void);
 bool rcu_eqs_special_set(int cpu);
+void rcu_momentary_dyntick_idle(void);
 unsigned long get_state_synchronize_rcu(void);
 void cond_synchronize_rcu(unsigned long oldstate);
 
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index db673ae..f708d54 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -364,7 +364,7 @@ bool rcu_eqs_special_set(int cpu)
  *
  * The caller must have disabled interrupts and must not be idle.
  */
-static void __maybe_unused rcu_momentary_dyntick_idle(void)
+void rcu_momentary_dyntick_idle(void)
 {
 	int special;
 
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index c7031a2..34c4f11 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -233,6 +233,7 @@ static int multi_cpu_stop(void *data)
 			 */
 			touch_nmi_watchdog();
 		}
+		rcu_momentary_dyntick_idle();
 	} while (curstate != MULTI_STOP_EXIT);
 
 	local_irq_restore(flags);
-- 
2.9.5



* [PATCH tip/core/rcu 06/12] rcu: Make CPU-hotplug removal operations enable tick
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
                   ` (4 preceding siblings ...)
  2019-10-03  1:38 ` [PATCH tip/core/rcu 05/12] stop_machine: EXP Provide RCU quiescent state in multi_cpu_stop() paulmck
@ 2019-10-03  1:38 ` paulmck
  2019-10-03 14:34   ` Frederic Weisbecker
  2019-10-03  1:38 ` [PATCH tip/core/rcu 07/12] stop_machine: Use {READ,WRITE}_ONCE() for multi_cpu_stop() ->state paulmck
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: paulmck @ 2019-10-03  1:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

CPU-hotplug removal operations run the multi_cpu_stop() function, which
relies on the scheduler to gain control from whatever is running on the
various online CPUs, including any nohz_full CPUs running long loops in
kernel-mode code.  Lack of the scheduler-clock interrupt on such CPUs
can delay multi_cpu_stop() for several minutes and can also result in
RCU CPU stall warnings.  This commit therefore causes CPU-hotplug removal
operations to enable the scheduler-clock interrupt on all online CPUs.

[ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 kernel/rcu/tree.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f708d54..74bf5c65 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2091,6 +2091,7 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
  */
 int rcutree_dead_cpu(unsigned int cpu)
 {
+	int c;
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	struct rcu_node *rnp = rdp->mynode;  /* Outgoing CPU's rdp & rnp. */
 
@@ -2101,6 +2102,10 @@ int rcutree_dead_cpu(unsigned int cpu)
 	rcu_boost_kthread_setaffinity(rnp, -1);
 	/* Do any needed no-CB deferred wakeups from this CPU. */
 	do_nocb_deferred_wakeup(per_cpu_ptr(&rcu_data, cpu));
+
+	// Stop-machine done, so allow nohz_full to disable tick.
+	for_each_online_cpu(c)
+		tick_dep_clear_cpu(c, TICK_DEP_BIT_RCU);
 	return 0;
 }
 
@@ -3074,6 +3079,7 @@ static void rcutree_affinity_setting(unsigned int cpu, int outgoing)
  */
 int rcutree_online_cpu(unsigned int cpu)
 {
+	int c;
 	unsigned long flags;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
@@ -3087,6 +3093,10 @@ int rcutree_online_cpu(unsigned int cpu)
 		return 0; /* Too early in boot for scheduler work. */
 	sync_sched_exp_online_cleanup(cpu);
 	rcutree_affinity_setting(cpu, -1);
+
+	// Stop-machine done, so allow nohz_full to disable tick.
+	for_each_online_cpu(c)
+		tick_dep_clear_cpu(c, TICK_DEP_BIT_RCU);
 	return 0;
 }
 
@@ -3096,6 +3106,7 @@ int rcutree_online_cpu(unsigned int cpu)
  */
 int rcutree_offline_cpu(unsigned int cpu)
 {
+	int c;
 	unsigned long flags;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
@@ -3107,6 +3118,10 @@ int rcutree_offline_cpu(unsigned int cpu)
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 
 	rcutree_affinity_setting(cpu, cpu);
+
+	// nohz_full CPUs need the tick for stop-machine to work quickly
+	for_each_online_cpu(c)
+		tick_dep_set_cpu(c, TICK_DEP_BIT_RCU);
 	return 0;
 }
 
-- 
2.9.5



* [PATCH tip/core/rcu 07/12] stop_machine: Use {READ,WRITE}_ONCE() for multi_cpu_stop() ->state
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
                   ` (5 preceding siblings ...)
  2019-10-03  1:38 ` [PATCH tip/core/rcu 06/12] rcu: Make CPU-hotplug removal operations enable tick paulmck
@ 2019-10-03  1:38 ` paulmck
  2019-10-03  1:38 ` [PATCH tip/core/rcu 08/12] rcu: Force tick on for nohz_full CPUs not reaching quiescent states paulmck
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: paulmck @ 2019-10-03  1:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

The multi_stop_data structure's ->state field is updated and read
concurrently, so this commit replaces the current C-language accesses
with READ_ONCE() and WRITE_ONCE().

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
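The hazard being closed is easiest to see in the reader's double load.
The following is a sketch of the compiler's latitude with plain accesses,
for illustration only:

	/*
	 * With plain C-language accesses, the compiler may tear, fuse,
	 * or re-load msdata->state, so these two loads need not agree:
	 *
	 *	if (msdata->state != curstate)		// Load #1.
	 *		curstate = msdata->state;	// Load #2, may differ.
	 *
	 * READ_ONCE() makes each access a single untorn load, and
	 * WRITE_ONCE() likewise prevents store tearing on the update side.
	 */
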
 kernel/stop_machine.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 34c4f11..c02c56e 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -167,7 +167,7 @@ static void set_state(struct multi_stop_data *msdata,
 	/* Reset ack counter. */
 	atomic_set(&msdata->thread_ack, msdata->num_threads);
 	smp_wmb();
-	msdata->state = newstate;
+	WRITE_ONCE(msdata->state, newstate);
 }
 
 /* Last one to ack a state moves to the next state. */
@@ -210,8 +210,8 @@ static int multi_cpu_stop(void *data)
 	do {
 		/* Chill out and ensure we re-read multi_stop_state. */
 		stop_machine_yield(cpumask);
-		if (msdata->state != curstate) {
-			curstate = msdata->state;
+		if (READ_ONCE(msdata->state) != curstate) {
+			curstate = READ_ONCE(msdata->state);
 			switch (curstate) {
 			case MULTI_STOP_DISABLE_IRQ:
 				local_irq_disable();
-- 
2.9.5



* [PATCH tip/core/rcu 08/12] rcu: Force tick on for nohz_full CPUs not reaching quiescent states
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
                   ` (6 preceding siblings ...)
  2019-10-03  1:38 ` [PATCH tip/core/rcu 07/12] stop_machine: Use {READ,WRITE}_ONCE() for multi_cpu_stop() ->state paulmck
@ 2019-10-03  1:38 ` paulmck
  2019-10-03 14:50   ` Frederic Weisbecker
  2019-10-03  1:39 ` [PATCH tip/core/rcu 09/12] rcu: Force nohz_full tick on upon irq enter instead of exit paulmck
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: paulmck @ 2019-10-03  1:38 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

CPUs running for long time periods in the kernel in nohz_full mode
might leave the scheduling-clock interrupt disabled for the full
duration of their in-kernel execution.  This can (among other things)
delay grace periods.  This commit therefore forces the tick back on
for any nohz_full CPU that is failing to pass through a quiescent state
upon return from interrupt, which the resched_cpu() will induce.

Reported-by: Joel Fernandes <joel@joelfernandes.org>
[ paulmck: Clear ->rcu_forced_tick as reported by Joel Fernandes testing. ]
[ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
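In outline, the forced-tick lifecycle added by this patch is as follows.
This is a sketch only, simplified from the hunks below and written with
the TICK_DEP_BIT_RCU spelling:

	/* Interrupt exit on a nohz_full CPU that owes a quiescent state: */
	if (rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
		rdp->rcu_forced_tick = true;
		tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
	}

	/* Once that CPU's quiescent state is finally reported: */
	if (tick_nohz_full_cpu(rdp->cpu) && rdp->rcu_forced_tick) {
		tick_dep_clear_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
		rdp->rcu_forced_tick = false;
	}
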
 kernel/rcu/tree.c | 38 +++++++++++++++++++++++++++++++-------
 kernel/rcu/tree.h |  1 +
 2 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 74bf5c65..621cc06 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -650,6 +650,12 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
 	 */
 	if (rdp->dynticks_nmi_nesting != 1) {
 		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2, rdp->dynticks);
+		if (tick_nohz_full_cpu(rdp->cpu) &&
+		    rdp->dynticks_nmi_nesting == 2 &&
+		    rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
+			rdp->rcu_forced_tick = true;
+			tick_dep_set_cpu(rdp->cpu, TICK_DEP_MASK_RCU);
+		}
 		WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */
 			   rdp->dynticks_nmi_nesting - 2);
 		return;
@@ -885,6 +891,18 @@ void rcu_irq_enter_irqson(void)
 	local_irq_restore(flags);
 }
 
+/*
+ * If the scheduler-clock interrupt was enabled on a nohz_full CPU
+ * in order to get to a quiescent state, disable it.
+ */
+void rcu_disable_tick_upon_qs(struct rcu_data *rdp)
+{
+	if (tick_nohz_full_cpu(rdp->cpu) && rdp->rcu_forced_tick) {
+		tick_dep_clear_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
+		rdp->rcu_forced_tick = false;
+	}
+}
+
 /**
  * rcu_is_watching - see if RCU thinks that the current CPU is not idle
  *
@@ -1979,6 +1997,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
 		if (!offloaded)
 			needwake = rcu_accelerate_cbs(rnp, rdp);
 
+		rcu_disable_tick_upon_qs(rdp);
 		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
 		/* ^^^ Released rnp->lock */
 		if (needwake)
@@ -2268,6 +2287,7 @@ static void force_qs_rnp(int (*f)(struct rcu_data *rdp))
 	int cpu;
 	unsigned long flags;
 	unsigned long mask;
+	struct rcu_data *rdp;
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rnp) {
@@ -2292,8 +2312,11 @@ static void force_qs_rnp(int (*f)(struct rcu_data *rdp))
 		for_each_leaf_node_possible_cpu(rnp, cpu) {
 			unsigned long bit = leaf_node_cpu_bit(rnp, cpu);
 			if ((rnp->qsmask & bit) != 0) {
-				if (f(per_cpu_ptr(&rcu_data, cpu)))
+				rdp = per_cpu_ptr(&rcu_data, cpu);
+				if (f(rdp)) {
 					mask |= bit;
+					rcu_disable_tick_upon_qs(rdp);
+				}
 			}
 		}
 		if (mask != 0) {
@@ -2321,7 +2344,7 @@ void rcu_force_quiescent_state(void)
 	rnp = __this_cpu_read(rcu_data.mynode);
 	for (; rnp != NULL; rnp = rnp->parent) {
 		ret = (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) ||
-		      !raw_spin_trylock(&rnp->fqslock);
+		       !raw_spin_trylock(&rnp->fqslock);
 		if (rnp_old != NULL)
 			raw_spin_unlock(&rnp_old->fqslock);
 		if (ret)
@@ -2854,7 +2877,7 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
 {
 	if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) {
 		rcu_barrier_trace(TPS("LastCB"), -1,
-				   rcu_state.barrier_sequence);
+				  rcu_state.barrier_sequence);
 		complete(&rcu_state.barrier_completion);
 	} else {
 		rcu_barrier_trace(TPS("CB"), -1, rcu_state.barrier_sequence);
@@ -2878,7 +2901,7 @@ static void rcu_barrier_func(void *unused)
 	} else {
 		debug_rcu_head_unqueue(&rdp->barrier_head);
 		rcu_barrier_trace(TPS("IRQNQ"), -1,
-				   rcu_state.barrier_sequence);
+				  rcu_state.barrier_sequence);
 	}
 	rcu_nocb_unlock(rdp);
 }
@@ -2905,7 +2928,7 @@ void rcu_barrier(void)
 	/* Did someone else do our work for us? */
 	if (rcu_seq_done(&rcu_state.barrier_sequence, s)) {
 		rcu_barrier_trace(TPS("EarlyExit"), -1,
-				   rcu_state.barrier_sequence);
+				  rcu_state.barrier_sequence);
 		smp_mb(); /* caller's subsequent code after above check. */
 		mutex_unlock(&rcu_state.barrier_mutex);
 		return;
@@ -2937,11 +2960,11 @@ void rcu_barrier(void)
 			continue;
 		if (rcu_segcblist_n_cbs(&rdp->cblist)) {
 			rcu_barrier_trace(TPS("OnlineQ"), cpu,
-					   rcu_state.barrier_sequence);
+					  rcu_state.barrier_sequence);
 			smp_call_function_single(cpu, rcu_barrier_func, NULL, 1);
 		} else {
 			rcu_barrier_trace(TPS("OnlineNQ"), cpu,
-					   rcu_state.barrier_sequence);
+					  rcu_state.barrier_sequence);
 		}
 	}
 	put_online_cpus();
@@ -3167,6 +3190,7 @@ void rcu_cpu_starting(unsigned int cpu)
 	rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
 	rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
 	if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */
+		rcu_disable_tick_upon_qs(rdp);
 		/* Report QS -after- changing ->qsmaskinitnext! */
 		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
 	} else {
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index c612f30..055c317 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -181,6 +181,7 @@ struct rcu_data {
 	atomic_t dynticks;		/* Even value for idle, else odd. */
 	bool rcu_need_heavy_qs;		/* GP old, so heavy quiescent state! */
 	bool rcu_urgent_qs;		/* GP old need light quiescent state. */
+	bool rcu_forced_tick;		/* Forced tick to provide QS. */
 #ifdef CONFIG_RCU_FAST_NO_HZ
 	bool all_lazy;			/* All CPU's CBs lazy at idle start? */
 	unsigned long last_accelerate;	/* Last jiffy CBs were accelerated. */
-- 
2.9.5



* [PATCH tip/core/rcu 09/12] rcu: Force nohz_full tick on upon irq enter instead of exit
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
                   ` (7 preceding siblings ...)
  2019-10-03  1:38 ` [PATCH tip/core/rcu 08/12] rcu: Force tick on for nohz_full CPUs not reaching quiescent states paulmck
@ 2019-10-03  1:39 ` paulmck
  2019-10-03  1:39 ` [PATCH tip/core/rcu 10/12] rcu: Reset CPU hints when reporting a quiescent state paulmck
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: paulmck @ 2019-10-03  1:39 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

There is interrupt-exit code that forces on the tick for nohz_full CPUs
failing to respond to the current grace period in a timely fashion.
However, this code must compare ->dynticks_nmi_nesting to the value 2
in the interrupt-exit fastpath.  This commit therefore moves this code
to the interrupt-entry fastpath, where a lighter-weight comparison to
zero may be used.

Reported-by: Joel Fernandes <joel@joelfernandes.org>
[ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 kernel/rcu/tree.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 621cc06..1601fa6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -650,12 +650,6 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
 	 */
 	if (rdp->dynticks_nmi_nesting != 1) {
 		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2, rdp->dynticks);
-		if (tick_nohz_full_cpu(rdp->cpu) &&
-		    rdp->dynticks_nmi_nesting == 2 &&
-		    rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
-			rdp->rcu_forced_tick = true;
-			tick_dep_set_cpu(rdp->cpu, TICK_DEP_MASK_RCU);
-		}
 		WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */
 			   rdp->dynticks_nmi_nesting - 2);
 		return;
@@ -830,6 +824,11 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
 			rcu_cleanup_after_idle();
 
 		incby = 1;
+	} else if (tick_nohz_full_cpu(rdp->cpu) &&
+		   rdp->dynticks_nmi_nesting == DYNTICK_IRQ_NONIDLE &&
+		   rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
+		rdp->rcu_forced_tick = true;
+		tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
 	}
 	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
 			  rdp->dynticks_nmi_nesting,
-- 
2.9.5



* [PATCH tip/core/rcu 10/12] rcu: Reset CPU hints when reporting a quiescent state
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
                   ` (8 preceding siblings ...)
  2019-10-03  1:39 ` [PATCH tip/core/rcu 09/12] rcu: Force nohz_full tick on upon irq enter instead of exit paulmck
@ 2019-10-03  1:39 ` paulmck
  2019-10-03  1:39 ` [PATCH tip/core/rcu 11/12] rcu: Confine ->core_needs_qs accesses to the corresponding CPU paulmck
  2019-10-03  1:39 ` [PATCH tip/core/rcu 12/12] rcu: Make kernel-mode nohz_full CPUs invoke the RCU core processing paulmck
  11 siblings, 0 replies; 21+ messages in thread
From: paulmck @ 2019-10-03  1:39 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

In some cases, tracing shows that need_heavy_qs is still set even though
urgent_qs was cleared upon reporting of a quiescent state.  One such
case is when the softirq reports that a CPU has passed a quiescent state.

Commit 671a63517cf9 ("rcu: Avoid unnecessary softirq when system is
idle") fixed a bug where core_needs_qs was not being cleared.  In order
to avoid running into similar situations with the urgent-grace-period
flags, this commit causes rcu_disable_urgency_upon_qs(), previously
rcu_disable_tick_upon_qs(), to clear the urgency hints, ->rcu_urgent_qs
and ->rcu_need_heavy_qs.  Note that it is possible for CPUs to go
offline with these urgency hints still set.  This is handled because
rcu_disable_urgency_upon_qs() is also invoked during the online process.

Because these hints can be cleared both by the corresponding CPU and by
the grace-period kthread, this commit also adds a number of READ_ONCE()
and WRITE_ONCE() calls.

Tested overnight with rcutorture running for 60 minutes on all
configurations of RCU.

Signed-off-by: "Joel Fernandes (Google)" <joel@joelfernandes.org>
[ paulmck: Clear urgency flags in rcu_disable_urgency_upon_qs(). ]
[ paulmck: Remove ->core_needs_qs from the set cleared at quiescent state. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1601fa6..59527b0 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -826,7 +826,7 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
 		incby = 1;
 	} else if (tick_nohz_full_cpu(rdp->cpu) &&
 		   rdp->dynticks_nmi_nesting == DYNTICK_IRQ_NONIDLE &&
-		   rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
+		   READ_ONCE(rdp->rcu_urgent_qs) && !rdp->rcu_forced_tick) {
 		rdp->rcu_forced_tick = true;
 		tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
 	}
@@ -891,11 +891,14 @@ void rcu_irq_enter_irqson(void)
 }
 
 /*
- * If the scheduler-clock interrupt was enabled on a nohz_full CPU
- * in order to get to a quiescent state, disable it.
+ * If any sort of urgency was applied to the current CPU (for example,
+ * the scheduler-clock interrupt was enabled on a nohz_full CPU) in order
+ * to get to a quiescent state, disable it.
  */
-void rcu_disable_tick_upon_qs(struct rcu_data *rdp)
+void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
 {
+	WRITE_ONCE(rdp->rcu_urgent_qs, false);
+	WRITE_ONCE(rdp->rcu_need_heavy_qs, false);
 	if (tick_nohz_full_cpu(rdp->cpu) && rdp->rcu_forced_tick) {
 		tick_dep_clear_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
 		rdp->rcu_forced_tick = false;
@@ -1996,7 +1999,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
 		if (!offloaded)
 			needwake = rcu_accelerate_cbs(rnp, rdp);
 
-		rcu_disable_tick_upon_qs(rdp);
+		rcu_disable_urgency_upon_qs(rdp);
 		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
 		/* ^^^ Released rnp->lock */
 		if (needwake)
@@ -2314,7 +2317,7 @@ static void force_qs_rnp(int (*f)(struct rcu_data *rdp))
 				rdp = per_cpu_ptr(&rcu_data, cpu);
 				if (f(rdp)) {
 					mask |= bit;
-					rcu_disable_tick_upon_qs(rdp);
+					rcu_disable_urgency_upon_qs(rdp);
 				}
 			}
 		}
@@ -3189,7 +3192,7 @@ void rcu_cpu_starting(unsigned int cpu)
 	rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
 	rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
 	if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */
-		rcu_disable_tick_upon_qs(rdp);
+		rcu_disable_urgency_upon_qs(rdp);
 		/* Report QS -after- changing ->qsmaskinitnext! */
 		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
 	} else {
-- 
2.9.5



* [PATCH tip/core/rcu 11/12] rcu: Confine ->core_needs_qs accesses to the corresponding CPU
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
                   ` (9 preceding siblings ...)
  2019-10-03  1:39 ` [PATCH tip/core/rcu 10/12] rcu: Reset CPU hints when reporting a quiescent state paulmck
@ 2019-10-03  1:39 ` paulmck
  2019-10-03  1:39 ` [PATCH tip/core/rcu 12/12] rcu: Make kernel-mode nohz_full CPUs invoke the RCU core processing paulmck
  11 siblings, 0 replies; 21+ messages in thread
From: paulmck @ 2019-10-03  1:39 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

Commit 671a63517cf9 ("rcu: Avoid unnecessary softirq when system
is idle") fixed a bug that could result in an indefinite number of
unnecessary invocations of the RCU_SOFTIRQ handler at the trailing edge
of a scheduler-clock interrupt.  However, the fix introduced off-CPU
stores to ->core_needs_qs.  These writes did not conflict with the
on-CPU stores because the CPU's leaf rcu_node structure's ->lock was
held across all such stores.  However, the loads from ->core_needs_qs
were not promoted to READ_ONCE() and, worse yet, the code loading from
->core_needs_qs was written assuming that it was only ever updated by
the corresponding CPU.  So operation has been robust, but only by luck.
This situation is therefore an accident waiting to happen.

This commit therefore takes a different approach.  Instead of clearing
->core_needs_qs from the grace-period kthread's force-quiescent-state
processing, it modifies the rcu_pending() function to suppress the
rcu_sched_clock_irq() function's call to invoke_rcu_core() if there is no
grace period in progress.  This avoids the infinite needless RCU_SOFTIRQ
handlers while still keeping all accesses to ->core_needs_qs local to
the corresponding CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 kernel/rcu/tree.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 59527b0..1b250d4 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1988,7 +1988,6 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
 		return;
 	}
 	mask = rdp->grpmask;
-	rdp->core_needs_qs = false;
 	if ((rnp->qsmask & mask) == 0) {
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	} else {
@@ -2822,6 +2821,7 @@ EXPORT_SYMBOL_GPL(cond_synchronize_rcu);
  */
 static int rcu_pending(void)
 {
+	bool gp_in_progress;
 	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 	struct rcu_node *rnp = rdp->mynode;
 
@@ -2837,7 +2837,8 @@ static int rcu_pending(void)
 		return 0;
 
 	/* Is the RCU core waiting for a quiescent state from this CPU? */
-	if (rdp->core_needs_qs && !rdp->cpu_no_qs.b.norm)
+	gp_in_progress = rcu_gp_in_progress();
+	if (rdp->core_needs_qs && !rdp->cpu_no_qs.b.norm && gp_in_progress)
 		return 1;
 
 	/* Does this CPU have callbacks ready to invoke? */
@@ -2845,8 +2846,7 @@ static int rcu_pending(void)
 		return 1;
 
 	/* Has RCU gone idle with this CPU needing another grace period? */
-	if (!rcu_gp_in_progress() &&
-	    rcu_segcblist_is_enabled(&rdp->cblist) &&
+	if (!gp_in_progress && rcu_segcblist_is_enabled(&rdp->cblist) &&
 	    (!IS_ENABLED(CONFIG_RCU_NOCB_CPU) ||
 	     !rcu_segcblist_is_offloaded(&rdp->cblist)) &&
 	    !rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL))
-- 
2.9.5



* [PATCH tip/core/rcu 12/12] rcu: Make kernel-mode nohz_full CPUs invoke the RCU core processing
  2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
                   ` (10 preceding siblings ...)
  2019-10-03  1:39 ` [PATCH tip/core/rcu 11/12] rcu: Confine ->core_needs_qs accesses to the corresponding CPU paulmck
@ 2019-10-03  1:39 ` paulmck
  11 siblings, 0 replies; 21+ messages in thread
From: paulmck @ 2019-10-03  1:39 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.ibm.com>

If a nohz_full CPU is idle or executing in userspace, it makes good sense
to keep it out of RCU core processing.  After all, the RCU grace-period
kthread can see its quiescent states and all of its callbacks are
offloaded, so there is nothing for RCU core processing to do.

However, if a nohz_full CPU is executing in kernel space, the RCU
grace-period kthread cannot do anything for it, so such a CPU must report
its own quiescent states.  This commit therefore makes nohz_full CPUs
skip RCU core processing only if the scheduler-clock interrupt caught
them in idle or in userspace.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 kernel/rcu/tree.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1b250d4..9ffe503 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -496,7 +496,7 @@ module_param_cb(jiffies_till_next_fqs, &next_fqs_jiffies_ops, &jiffies_till_next
 module_param(rcu_kick_kthreads, bool, 0644);
 
 static void force_qs_rnp(int (*f)(struct rcu_data *rdp));
-static int rcu_pending(void);
+static int rcu_pending(int user);
 
 /*
  * Return the number of RCU GPs completed thus far for debug & stats.
@@ -2270,7 +2270,7 @@ void rcu_sched_clock_irq(int user)
 		__this_cpu_write(rcu_data.rcu_urgent_qs, false);
 	}
 	rcu_flavor_sched_clock_irq(user);
-	if (rcu_pending())
+	if (rcu_pending(user))
 		invoke_rcu_core();
 
 	trace_rcu_utilization(TPS("End scheduler-tick"));
@@ -2819,7 +2819,7 @@ EXPORT_SYMBOL_GPL(cond_synchronize_rcu);
  * CPU-local state are performed first.  However, we must check for CPU
  * stalls first, else we might not get a chance.
  */
-static int rcu_pending(void)
+static int rcu_pending(int user)
 {
 	bool gp_in_progress;
 	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
@@ -2832,8 +2832,8 @@ static int rcu_pending(void)
 	if (rcu_nocb_need_deferred_wakeup(rdp))
 		return 1;
 
-	/* Is this CPU a NO_HZ_FULL CPU that should ignore RCU? */
-	if (rcu_nohz_full_cpu())
+	/* Is this a nohz_full CPU in userspace or idle?  (Ignore RCU if so.) */
+	if ((user || rcu_is_cpu_rrupt_from_idle()) && rcu_nohz_full_cpu())
 		return 0;
 
 	/* Is the RCU core waiting for a quiescent state from this CPU? */
-- 
2.9.5



* Re: [PATCH tip/core/rcu 03/12] rcu: Force on tick when invoking lots of callbacks
  2019-10-03  1:38 ` [PATCH tip/core/rcu 03/12] rcu: Force on tick when invoking lots of callbacks paulmck
@ 2019-10-03 14:10   ` Frederic Weisbecker
  2019-10-05 16:42     ` Paul E. McKenney
  0 siblings, 1 reply; 21+ messages in thread
From: Frederic Weisbecker @ 2019-10-03 14:10 UTC (permalink / raw)
  To: paulmck
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

On Wed, Oct 02, 2019 at 06:38:54PM -0700, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@linux.ibm.com>
> 
> Callback invocation can run for a significant time period, and within
> CONFIG_NO_HZ_FULL=y kernels, this period will be devoid of scheduler-clock
> interrupts.  In-kernel execution without such interrupts can cause all
> manner of malfunction, with RCU CPU stall warnings being but one result.
> 
> This commit therefore forces scheduling-clock interrupts on whenever more
> than a few RCU callbacks are invoked.  Because offloaded callback invocation
> can be preempted, this forcing is withdrawn on each context switch.  This
> in turn requires that the loop invoking RCU callbacks reiterate the forcing
> periodically.
> 
> [ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> ---
>  kernel/rcu/tree.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 8110514..db673ae 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2151,6 +2151,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  	rcu_nocb_unlock_irqrestore(rdp, flags);
>  
>  	/* Invoke callbacks. */
> +	if (IS_ENABLED(CONFIG_NO_HZ_FULL))

No need for the IS_ENABLED(), the API takes care of that.

> +		tick_dep_set_task(current, TICK_DEP_BIT_RCU);
>  	rhp = rcu_cblist_dequeue(&rcl);
>  	for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
>  		debug_rcu_head_unqueue(rhp);
> @@ -2217,6 +2219,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  	/* Re-invoke RCU core processing if there are callbacks remaining. */
>  	if (!offloaded && rcu_segcblist_ready_cbs(&rdp->cblist))
>  		invoke_rcu_core();
> +	if (IS_ENABLED(CONFIG_NO_HZ_FULL))

Same here.

Thanks.

> +		tick_dep_clear_task(current, TICK_DEP_BIT_RCU);
>  }
>  
>  /*
> -- 
> 2.9.5
> 


* Re: [PATCH tip/core/rcu 04/12] rcutorture: Force on tick for readers and callback flooders
  2019-10-03  1:38 ` [PATCH tip/core/rcu 04/12] rcutorture: Force on tick for readers and callback flooders paulmck
@ 2019-10-03 14:14   ` Frederic Weisbecker
  2019-10-05 16:52     ` Paul E. McKenney
  0 siblings, 1 reply; 21+ messages in thread
From: Frederic Weisbecker @ 2019-10-03 14:14 UTC (permalink / raw)
  To: paulmck
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

On Wed, Oct 02, 2019 at 06:38:55PM -0700, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@linux.ibm.com>
> 
> Readers and callback flooders in the rcutorture stress-test suite run for
> extended time periods by design.  They do take pains to relinquish the
> CPU from time to time, but in some cases this relies on the scheduler
> being active, which in turn relies on the scheduler-clock interrupt
> firing from time to time.
> 
> This commit therefore forces scheduling-clock interrupts within
> these loops.  While in the area, this commit also prevents
> rcu_torture_reader()'s occasional timed sleeps from delaying shutdown.
> 
> [ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>

You can also remove all the IS_ENABLED here.

Thanks!


* Re: [PATCH tip/core/rcu 06/12] rcu: Make CPU-hotplug removal operations enable tick
  2019-10-03  1:38 ` [PATCH tip/core/rcu 06/12] rcu: Make CPU-hotplug removal operations enable tick paulmck
@ 2019-10-03 14:34   ` Frederic Weisbecker
  2019-10-05 17:17     ` Paul E. McKenney
  0 siblings, 1 reply; 21+ messages in thread
From: Frederic Weisbecker @ 2019-10-03 14:34 UTC (permalink / raw)
  To: paulmck
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

On Wed, Oct 02, 2019 at 06:38:57PM -0700, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@linux.ibm.com>
> 
> CPU-hotplug removal operations run the multi_cpu_stop() function, which
> relies on the scheduler to gain control from whatever is running on the
> various online CPUs, including any nohz_full CPUs running long loops in
> kernel-mode code.  Lack of the scheduler-clock interrupt on such CPUs
> can delay multi_cpu_stop() for several minutes and can also result in
> RCU CPU stall warnings.  This commit therefore causes CPU-hotplug removal
> operations to enable the scheduler-clock interrupt on all online CPUs.

So, like Peter said back then, there must be an issue in the scheduler
such as a missing or mishandled preemption point.

> 
> [ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> ---
>  kernel/rcu/tree.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f708d54..74bf5c65 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2091,6 +2091,7 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
>   */
>  int rcutree_dead_cpu(unsigned int cpu)
>  {
> +	int c;
>  	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
>  	struct rcu_node *rnp = rdp->mynode;  /* Outgoing CPU's rdp & rnp. */
>  
> @@ -2101,6 +2102,10 @@ int rcutree_dead_cpu(unsigned int cpu)
>  	rcu_boost_kthread_setaffinity(rnp, -1);
>  	/* Do any needed no-CB deferred wakeups from this CPU. */
>  	do_nocb_deferred_wakeup(per_cpu_ptr(&rcu_data, cpu));
> +
> +	// Stop-machine done, so allow nohz_full to disable tick.
> +	for_each_online_cpu(c)
> +		tick_dep_clear_cpu(c, TICK_DEP_BIT_RCU);

Just use tick_dep_clear() without for_each_online_cpu().

>  	return 0;
>  }
>  
> @@ -3074,6 +3079,7 @@ static void rcutree_affinity_setting(unsigned int cpu, int outgoing)
>   */
>  int rcutree_online_cpu(unsigned int cpu)
>  {
> +	int c;
>  	unsigned long flags;
>  	struct rcu_data *rdp;
>  	struct rcu_node *rnp;
> @@ -3087,6 +3093,10 @@ int rcutree_online_cpu(unsigned int cpu)
>  		return 0; /* Too early in boot for scheduler work. */
>  	sync_sched_exp_online_cleanup(cpu);
>  	rcutree_affinity_setting(cpu, -1);
> +
> +	// Stop-machine done, so allow nohz_full to disable tick.
> +	for_each_online_cpu(c)
> +		tick_dep_clear_cpu(c, TICK_DEP_BIT_RCU);

Same here.

>  	return 0;
>  }
>  
> @@ -3096,6 +3106,7 @@ int rcutree_online_cpu(unsigned int cpu)
>   */
>  int rcutree_offline_cpu(unsigned int cpu)
>  {
> +	int c;
>  	unsigned long flags;
>  	struct rcu_data *rdp;
>  	struct rcu_node *rnp;
> @@ -3107,6 +3118,10 @@ int rcutree_offline_cpu(unsigned int cpu)
>  	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
>  
>  	rcutree_affinity_setting(cpu, cpu);
> +
> +	// nohz_full CPUs need the tick for stop-machine to work quickly
> +	for_each_online_cpu(c)
> +		tick_dep_set_cpu(c, TICK_DEP_BIT_RCU);

And here you only need tick_dep_set() without for_each_online_cpu().

Thanks.

>  	return 0;
>  }
>  
> -- 
> 2.9.5
> 


* Re: [PATCH tip/core/rcu 08/12] rcu: Force tick on for nohz_full CPUs not reaching quiescent states
  2019-10-03  1:38 ` [PATCH tip/core/rcu 08/12] rcu: Force tick on for nohz_full CPUs not reaching quiescent states paulmck
@ 2019-10-03 14:50   ` Frederic Weisbecker
  2019-10-05 17:21     ` Paul E. McKenney
  0 siblings, 1 reply; 21+ messages in thread
From: Frederic Weisbecker @ 2019-10-03 14:50 UTC (permalink / raw)
  To: paulmck
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

On Wed, Oct 02, 2019 at 06:38:59PM -0700, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@linux.ibm.com>
> 
> CPUs running for long time periods in the kernel in nohz_full mode
> might leave the scheduling-clock interrupt disabled for the full
> duration of their in-kernel execution.  This can (among other things)
> delay grace periods.  This commit therefore forces the tick back on
> for any nohz_full CPU that is failing to pass through a quiescent state
> upon return from interrupt, which the resched_cpu() will induce.
> 
> Reported-by: Joel Fernandes <joel@joelfernandes.org>
> [ paulmck: Clear ->rcu_forced_tick as reported by Joel Fernandes testing. ]
> [ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> ---
>  kernel/rcu/tree.c | 38 +++++++++++++++++++++++++++++++-------
>  kernel/rcu/tree.h |  1 +
>  2 files changed, 32 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 74bf5c65..621cc06 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -650,6 +650,12 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
>  	 */
>  	if (rdp->dynticks_nmi_nesting != 1) {
>  		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2, rdp->dynticks);
> +		if (tick_nohz_full_cpu(rdp->cpu) &&
> +		    rdp->dynticks_nmi_nesting == 2 &&
> +		    rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
> +			rdp->rcu_forced_tick = true;
> +			tick_dep_set_cpu(rdp->cpu, TICK_DEP_MASK_RCU);

I understand rdp->cpu is always smp_processor_id() here, right? Because calling
tick_dep_set_cpu() on a remote CPU while in NMI wouldn't be safe. It would warn anyway.


* Re: [PATCH tip/core/rcu 03/12] rcu: Force on tick when invoking lots of callbacks
  2019-10-03 14:10   ` Frederic Weisbecker
@ 2019-10-05 16:42     ` Paul E. McKenney
  0 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2019-10-05 16:42 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel

On Thu, Oct 03, 2019 at 04:10:52PM +0200, Frederic Weisbecker wrote:
> On Wed, Oct 02, 2019 at 06:38:54PM -0700, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@linux.ibm.com>
> > 
> > Callback invocation can run for a significant time period, and within
> > CONFIG_NO_HZ_FULL=y kernels, this period will be devoid of scheduler-clock
> > interrupts.  In-kernel execution without such interrupts can cause all
> > manner of malfunction, with RCU CPU stall warnings being but one result.
> > 
> > This commit therefore forces scheduling-clock interrupts on whenever more
> > than a few RCU callbacks are invoked.  Because offloaded callback invocation
> > can be preempted, this forcing is withdrawn on each context switch.  This
> > in turn requires that the loop invoking RCU callbacks reiterate the forcing
> > periodically.
> > 
> > [ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
> > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> > ---
> >  kernel/rcu/tree.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 8110514..db673ae 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2151,6 +2151,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
> >  	rcu_nocb_unlock_irqrestore(rdp, flags);
> >  
> >  	/* Invoke callbacks. */
> > +	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
> 
> No need for the IS_ENABLED(), the API takes care of that.
> 
> > +		tick_dep_set_task(current, TICK_DEP_BIT_RCU);
> >  	rhp = rcu_cblist_dequeue(&rcl);
> >  	for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
> >  		debug_rcu_head_unqueue(rhp);
> > @@ -2217,6 +2219,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
> >  	/* Re-invoke RCU core processing if there are callbacks remaining. */
> >  	if (!offloaded && rcu_segcblist_ready_cbs(&rdp->cblist))
> >  		invoke_rcu_core();
> > +	if (IS_ENABLED(CONFIG_NO_HZ_FULL))
> 
> Same here.

Good catches!  Applied, thank you!

							Thanx, Paul

> Thanks.
> 
> > +		tick_dep_clear_task(current, TICK_DEP_BIT_RCU);
> >  }
> >  
> >  /*
> > -- 
> > 2.9.5
> > 
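
For reference, the reason the IS_ENABLED() guards are unnecessary:
include/linux/tick.h already compiles the tick_dep_*() helpers down to
no-ops when CONFIG_NO_HZ_FULL is not set. A condensed sketch of that
header pattern (abridged, not the verbatim header):

	#ifdef CONFIG_NO_HZ_FULL
	extern void tick_nohz_dep_set_task(struct task_struct *tsk,
					   enum tick_dep_bits bit);

	static inline void tick_dep_set_task(struct task_struct *tsk,
					     enum tick_dep_bits bit)
	{
		/* Even with NO_HZ_FULL built in, this is a no-op unless
		 * nohz_full= was actually specified on the boot line. */
		if (tick_nohz_full_enabled())
			tick_nohz_dep_set_task(tsk, bit);
	}
	#else
	static inline void tick_dep_set_task(struct task_struct *tsk,
					     enum tick_dep_bits bit) { }
	#endif

So callers such as rcu_do_batch() can invoke tick_dep_set_task()
unconditionally.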

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH tip/core/rcu 04/12] rcutorture: Force on tick for readers and callback flooders
  2019-10-03 14:14   ` Frederic Weisbecker
@ 2019-10-05 16:52     ` Paul E. McKenney
  0 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2019-10-05 16:52 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel

On Thu, Oct 03, 2019 at 04:14:58PM +0200, Frederic Weisbecker wrote:
> On Wed, Oct 02, 2019 at 06:38:55PM -0700, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@linux.ibm.com>
> > 
> > Readers and callback flooders in the rcutorture stress-test suite run for
> > extended time periods by design.  They do take pains to relinquish the
> > CPU from time to time, but in some cases this relies on the scheduler
> > being active, which in turn relies on the scheduler-clock interrupt
> > firing from time to time.
> > 
> > This commit therefore forces scheduling-clock interrupts within
> > these loops.  While in the area, this commit also prevents
> > rcu_torture_reader()'s occasional timed sleeps from delaying shutdown.
> > 
> > [ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
> > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> 
> You can also remove all the IS_ENABLED here.

Again, good catch and fixed, thank you!

							Thanx, Paul
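
The resulting pattern in the rcutorture kthreads is roughly the
following sketch; the loop body and thread name here are placeholders
rather than the actual rcutorture code:

	/* Illustrative only: a long-running torture kthread keeps the
	 * scheduler-clock tick alive for itself around CPU-bound work. */
	static int example_flooder(void *arg)
	{
		do {
			tick_dep_set_task(current, TICK_DEP_BIT_RCU);
			do_one_flood_pass();	/* placeholder CPU-bound work */
			tick_dep_clear_task(current, TICK_DEP_BIT_RCU);
			stutter_wait("example_flooder");
		} while (!torture_must_stop());
		return 0;
	}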

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH tip/core/rcu 06/12] rcu: Make CPU-hotplug removal operations enable tick
  2019-10-03 14:34   ` Frederic Weisbecker
@ 2019-10-05 17:17     ` Paul E. McKenney
  0 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2019-10-05 17:17 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel

On Thu, Oct 03, 2019 at 04:34:09PM +0200, Frederic Weisbecker wrote:
> On Wed, Oct 02, 2019 at 06:38:57PM -0700, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@linux.ibm.com>
> > 
> > CPU-hotplug removal operations run the multi_cpu_stop() function, which
> > relies on the scheduler to gain control from whatever is running on the
> > various online CPUs, including any nohz_full CPUs running long loops in
> > kernel-mode code.  Lack of the scheduler-clock interrupt on such CPUs
> > can delay multi_cpu_stop() for several minutes and can also result in
> > RCU CPU stall warnings.  This commit therefore causes CPU-hotplug removal
> > operations to enable the scheduler-clock interrupt on all online CPUs.
> 
> So, like Peter said back then, there must be an issue in the scheduler
> such as a missing or mishandled preemption point.

Fair enough, but this is useful in the meantime.

> > [ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
> > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> > ---
> >  kernel/rcu/tree.c | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index f708d54..74bf5c65 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2091,6 +2091,7 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
> >   */
> >  int rcutree_dead_cpu(unsigned int cpu)
> >  {
> > +	int c;
> >  	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> >  	struct rcu_node *rnp = rdp->mynode;  /* Outgoing CPU's rdp & rnp. */
> >  
> > @@ -2101,6 +2102,10 @@ int rcutree_dead_cpu(unsigned int cpu)
> >  	rcu_boost_kthread_setaffinity(rnp, -1);
> >  	/* Do any needed no-CB deferred wakeups from this CPU. */
> >  	do_nocb_deferred_wakeup(per_cpu_ptr(&rcu_data, cpu));
> > +
> > +	// Stop-machine done, so allow nohz_full to disable tick.
> > +	for_each_online_cpu(c)
> > +		tick_dep_clear_cpu(c, TICK_DEP_BIT_RCU);
> 
> Just use tick_dep_clear() without for_each_online_cpu().
> 
> >  	return 0;
> >  }
> >  
> > @@ -3074,6 +3079,7 @@ static void rcutree_affinity_setting(unsigned int cpu, int outgoing)
> >   */
> >  int rcutree_online_cpu(unsigned int cpu)
> >  {
> > +	int c;
> >  	unsigned long flags;
> >  	struct rcu_data *rdp;
> >  	struct rcu_node *rnp;
> > @@ -3087,6 +3093,10 @@ int rcutree_online_cpu(unsigned int cpu)
> >  		return 0; /* Too early in boot for scheduler work. */
> >  	sync_sched_exp_online_cleanup(cpu);
> >  	rcutree_affinity_setting(cpu, -1);
> > +
> > +	// Stop-machine done, so allow nohz_full to disable tick.
> > +	for_each_online_cpu(c)
> > +		tick_dep_clear_cpu(c, TICK_DEP_BIT_RCU);
> 
> Same here.
> 
> >  	return 0;
> >  }
> >  
> > @@ -3096,6 +3106,7 @@ int rcutree_online_cpu(unsigned int cpu)
> >   */
> >  int rcutree_offline_cpu(unsigned int cpu)
> >  {
> > +	int c;
> >  	unsigned long flags;
> >  	struct rcu_data *rdp;
> >  	struct rcu_node *rnp;
> > @@ -3107,6 +3118,10 @@ int rcutree_offline_cpu(unsigned int cpu)
> >  	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> >  
> >  	rcutree_affinity_setting(cpu, cpu);
> > +
> > +	// nohz_full CPUs need the tick for stop-machine to work quickly
> > +	for_each_online_cpu(c)
> > +		tick_dep_set_cpu(c, TICK_DEP_BIT_RCU);
> 
> And here you only need tick_dep_set() without for_each_online_cpu().

Thank you!  I applied all three simplifications.

							Thanx, Paul

> Thanks.
> 
> >  	return 0;
> >  }
> >  
> > -- 
> > 2.9.5
> > 
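
With all three simplifications applied, the hooks reduce to roughly the
following sketch (bodies abridged; this is not the final committed code):

	int rcutree_offline_cpu(unsigned int cpu)
	{
		/* ... existing offline handling unchanged ... */

		// nohz_full CPUs need the tick for stop-machine to work quickly.
		tick_dep_set(TICK_DEP_BIT_RCU);	/* global, covers all CPUs */
		return 0;
	}

	int rcutree_dead_cpu(unsigned int cpu)
	{
		/* ... existing dead-CPU handling unchanged ... */

		// Stop-machine done, so allow nohz_full to disable tick.
		tick_dep_clear(TICK_DEP_BIT_RCU);
		return 0;
	}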

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH tip/core/rcu 08/12] rcu: Force tick on for nohz_full CPUs not reaching quiescent states
  2019-10-03 14:50   ` Frederic Weisbecker
@ 2019-10-05 17:21     ` Paul E. McKenney
  0 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2019-10-05 17:21 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel

On Thu, Oct 03, 2019 at 04:50:55PM +0200, Frederic Weisbecker wrote:
> On Wed, Oct 02, 2019 at 06:38:59PM -0700, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@linux.ibm.com>
> > 
> > CPUs running for long time periods in the kernel in nohz_full mode
> > might leave the scheduling-clock interrupt disabled for the full
> > duration of their in-kernel execution.  This can (among other things)
> > delay grace periods.  This commit therefore forces the tick back on
> > for any nohz_full CPU that is failing to pass through a quiescent state
> > upon return from interrupt, which resched_cpu() will induce.
> > 
> > Reported-by: Joel Fernandes <joel@joelfernandes.org>
> > [ paulmck: Clear ->rcu_forced_tick as reported by Joel Fernandes' testing. ]
> > [ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ]
> > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> > ---
> >  kernel/rcu/tree.c | 38 +++++++++++++++++++++++++++++++-------
> >  kernel/rcu/tree.h |  1 +
> >  2 files changed, 32 insertions(+), 7 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 74bf5c65..621cc06 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -650,6 +650,12 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> >  	 */
> >  	if (rdp->dynticks_nmi_nesting != 1) {
> >  		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2, rdp->dynticks);
> > +		if (tick_nohz_full_cpu(rdp->cpu) &&
> > +		    rdp->dynticks_nmi_nesting == 2 &&
> > +		    rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
> > +			rdp->rcu_forced_tick = true;
> > +			tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
> 
> I understand rdp->cpu is always smp_processor_id() here, right? Because calling
> tick_dep_set_cpu() for a remote CPU while in NMI wouldn't be safe. It would warn anyway.

Yes, this is always invoked on the CPU whose ID is rdp->cpu, but thank
you for checking!

							Thanx, Paul
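
To make that invariant explicit: rdp in rcu_nmi_exit_common() comes from
this_cpu_ptr(), so a hypothetical assertion (illustrative only, not in
the patch) would never fire:

	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);

	/* Always holds here: rcu_nmi_exit_common() runs on the CPU whose
	 * rcu_data it manipulates, so the tick dependency below is only
	 * ever set for the local CPU, which is safe from irq/NMI context. */
	WARN_ON_ONCE(rdp->cpu != smp_processor_id());
	tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);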

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2019-10-05 17:21 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-03  1:38 [PATCH tip/core/rcu 0/12] NO_HZ fixes for v5.5 Paul E. McKenney
2019-10-03  1:38 ` [PATCH tip/core/rcu 01/12] nohz: Add TICK_DEP_BIT_RCU paulmck
2019-10-03  1:38 ` [PATCH tip/core/rcu 02/12] time: Export tick start/stop functions for rcutorture paulmck
2019-10-03  1:38 ` [PATCH tip/core/rcu 03/12] rcu: Force on tick when invoking lots of callbacks paulmck
2019-10-03 14:10   ` Frederic Weisbecker
2019-10-05 16:42     ` Paul E. McKenney
2019-10-03  1:38 ` [PATCH tip/core/rcu 04/12] rcutorture: Force on tick for readers and callback flooders paulmck
2019-10-03 14:14   ` Frederic Weisbecker
2019-10-05 16:52     ` Paul E. McKenney
2019-10-03  1:38 ` [PATCH tip/core/rcu 05/12] stop_machine: EXP Provide RCU quiescent state in multi_cpu_stop() paulmck
2019-10-03  1:38 ` [PATCH tip/core/rcu 06/12] rcu: Make CPU-hotplug removal operations enable tick paulmck
2019-10-03 14:34   ` Frederic Weisbecker
2019-10-05 17:17     ` Paul E. McKenney
2019-10-03  1:38 ` [PATCH tip/core/rcu 07/12] stop_machine: Use {READ,WRITE}_ONCE() for multi_cpu_stop() ->state paulmck
2019-10-03  1:38 ` [PATCH tip/core/rcu 08/12] rcu: Force tick on for nohz_full CPUs not reaching quiescent states paulmck
2019-10-03 14:50   ` Frederic Weisbecker
2019-10-05 17:21     ` Paul E. McKenney
2019-10-03  1:39 ` [PATCH tip/core/rcu 09/12] rcu: Force nohz_full tick on upon irq enter instead of exit paulmck
2019-10-03  1:39 ` [PATCH tip/core/rcu 10/12] rcu: Reset CPU hints when reporting a quiescent state paulmck
2019-10-03  1:39 ` [PATCH tip/core/rcu 11/12] rcu: Confine ->core_needs_qs accesses to the corresponding CPU paulmck
2019-10-03  1:39 ` [PATCH tip/core/rcu 12/12] rcu: Make kernel-mode nohz_full CPUs invoke the RCU core processing paulmck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).