rcu.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4
@ 2019-08-01 22:37 Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 01/12] lockdep: Make print_lock() address visible Paul E. McKenney
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel

Hello!

This series contains miscellaneous fixes:

1.	Make lockdep's print_lock() address visible.

2.	Fix tick_broadcast_offline() lockdep complaint.

3.	Restore barrier() to rcu_read_lock() and rcu_read_unlock().

4.	Add kernel parameter to dump trace after RCU CPU stall warning.

5.	Add destroy_work_on_stack() to match INIT_WORK_ONSTACK().

6.	Avoid srcutorture security-based pointer obfuscation.

7.	Change return type of rcu_spawn_one_boost_kthread(), courtesy of
	Byungchul Park.

8.	Add rcutree.kthread_prio pointer to stallwarn.txt.

9.	Prevent late-arriving interrupts from disrupting offline, courtesy
	of Peter Zijlstra.

10.	Remove redundant "if" condition from rcu_gp_is_expedited().

11.	Use common outgoing-CPU-notification code for arm.

12.	Fix spelling mistake "greate"->"great", courtesy of Mukesh Ojha.

							Thanx, Paul

------------------------------------------------------------------------

 Documentation/RCU/Design/Requirements/Requirements.html |   71 ++++++++++++++++
 Documentation/RCU/stallwarn.txt                         |    6 +
 Documentation/admin-guide/kernel-parameters.txt         |    4 
 arch/arm/kernel/smp.c                                   |    6 -
 include/linux/tick.h                                    |   10 --
 kernel/locking/lockdep.c                                |    2 
 kernel/rcu/rcu.h                                        |    1 
 kernel/rcu/rcu_segcblist.h                              |   21 ----
 kernel/rcu/srcutree.c                                   |    5 -
 kernel/rcu/tree_exp.h                                   |    8 +
 kernel/rcu/tree_plugin.h                                |   31 ++----
 kernel/rcu/tree_stall.h                                 |    4 
 kernel/rcu/update.c                                     |    5 -
 kernel/sched/core.c                                     |   57 +++++++++++-
 kernel/sched/idle.c                                     |    5 -
 15 files changed, 164 insertions(+), 72 deletions(-)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 01/12] lockdep: Make print_lock() address visible
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 02/12] time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint Paul E. McKenney
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Security is a wonderful thing, but so is the ability to debug based on
lockdep warnings.  This commit therefore makes lockdep lock addresses
visible in the clear.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 kernel/locking/lockdep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 4861cf8e274b..4aca3f4379d2 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -620,7 +620,7 @@ static void print_lock(struct held_lock *hlock)
 		return;
 	}
 
-	printk(KERN_CONT "%p", hlock->instance);
+	printk(KERN_CONT "%px", hlock->instance);
 	print_lock_name(lock);
 	printk(KERN_CONT ", at: %pS\n", (void *)hlock->acquire_ip);
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 02/12] time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 01/12] lockdep: Make print_lock() address visible Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 03/12] rcu: Restore barrier() to rcu_read_lock() and rcu_read_unlock() Paul E. McKenney
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney, Ingo Molnar

The TASKS03 and TREE04 rcutorture scenarios produce the following
lockdep complaint:

------------------------------------------------------------------------

================================
WARNING: inconsistent lock state
5.2.0-rc1+ #513 Not tainted
--------------------------------
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
migration/1/14 [HC0[0]:SC0[0]:HE1:SE1] takes:
(____ptrval____) (tick_broadcast_lock){?...}, at: tick_broadcast_offline+0xf/0x70
{IN-HARDIRQ-W} state was registered at:
  lock_acquire+0xb0/0x1c0
  _raw_spin_lock_irqsave+0x3c/0x50
  tick_broadcast_switch_to_oneshot+0xd/0x40
  tick_switch_to_oneshot+0x4f/0xd0
  hrtimer_run_queues+0xf3/0x130
  run_local_timers+0x1c/0x50
  update_process_times+0x1c/0x50
  tick_periodic+0x26/0xc0
  tick_handle_periodic+0x1a/0x60
  smp_apic_timer_interrupt+0x80/0x2a0
  apic_timer_interrupt+0xf/0x20
  _raw_spin_unlock_irqrestore+0x4e/0x60
  rcu_nocb_gp_kthread+0x15d/0x590
  kthread+0xf3/0x130
  ret_from_fork+0x3a/0x50
irq event stamp: 171
hardirqs last  enabled at (171): [<ffffffff8a201a37>] trace_hardirqs_on_thunk+0x1a/0x1c
hardirqs last disabled at (170): [<ffffffff8a201a53>] trace_hardirqs_off_thunk+0x1a/0x1c
softirqs last  enabled at (0): [<ffffffff8a264ee0>] copy_process.part.56+0x650/0x1cb0
softirqs last disabled at (0): [<0000000000000000>] 0x0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(tick_broadcast_lock);
  <Interrupt>
    lock(tick_broadcast_lock);

 *** DEADLOCK ***

1 lock held by migration/1/14:
 #0: (____ptrval____) (clockevents_lock){+.+.}, at: tick_offline_cpu+0xf/0x30

stack backtrace:
CPU: 1 PID: 14 Comm: migration/1 Not tainted 5.2.0-rc1+ #513
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x5e/0x8b
 print_usage_bug+0x1fc/0x216
 ? print_shortest_lock_dependencies+0x1b0/0x1b0
 mark_lock+0x1f2/0x280
 __lock_acquire+0x1e0/0x18f0
 ? __lock_acquire+0x21b/0x18f0
 ? _raw_spin_unlock_irqrestore+0x4e/0x60
 lock_acquire+0xb0/0x1c0
 ? tick_broadcast_offline+0xf/0x70
 _raw_spin_lock+0x33/0x40
 ? tick_broadcast_offline+0xf/0x70
 tick_broadcast_offline+0xf/0x70
 tick_offline_cpu+0x16/0x30
 take_cpu_down+0x7d/0xa0
 multi_cpu_stop+0xa2/0xe0
 ? cpu_stop_queue_work+0xc0/0xc0
 cpu_stopper_thread+0x6d/0x100
 smpboot_thread_fn+0x169/0x240
 kthread+0xf3/0x130
 ? sort_range+0x20/0x20
 ? kthread_cancel_delayed_work_sync+0x10/0x10
 ret_from_fork+0x3a/0x50

------------------------------------------------------------------------

To reproduce, run the following rcutorture test:

        tools/testing/selftests/rcutorture/bin/kvm.sh --duration 5 --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --configs "TASKS03 TREE04"

It turns out that tick_broadcast_offline() was an innocent bystander.
After all, interrupts are supposed to be disabled throughout
take_cpu_down(), and therefore should have been disabled upon entry to
tick_offline_cpu() and thus to tick_broadcast_offline().  This suggests
that one of the CPU-hotplug notifiers was incorrectly enabling interrupts,
and leaving them enabled on return.

Some debugging code showed that the culprit was sched_cpu_dying().
It had irqs enabled after return from sched_tick_stop().  Which in turn
had irqs enabled after return from cancel_delayed_work_sync().  Which is a
wrapper around __cancel_work_timer().  Which can sleep in the case where
something else is concurrently trying to cancel the same delayed work,
and as Thomas Gleixner pointed out on IRC, sleeping is a decidedly bad
idea when you are invoked from take_cpu_down(), regardless of the state
you leave interrupts in upon return.

Code inspection located no reason why the delayed work absolutely
needed to be canceled from sched_tick_stop():  The work is not
bound to the outgoing CPU by design, given that the whole point is
to collect statistics without disturbing the outgoing CPU.

This commit therefore simply drops the cancel_delayed_work_sync() from
sched_tick_stop().  Instead, a new ->state field is added to the tick_work
structure so that the delayed-work handler function sched_tick_remote()
can avoid reposting itself.  A cpu_is_offline() check is also added to
sched_tick_remote() to avoid mucking with the state of an offlined CPU
(though it does appear safe to do so).  The sched_tick_start() and
sched_tick_stop() functions also update ->state, and sched_tick_start()
also schedules the delayed work if ->state indicates that it is not
already in flight.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
[ paulmck: Apply Peter Zijlstra and Frederic Weisbecker atomics feedback. ]
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/sched/core.c | 57 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2b037f195473..0b22e55cebe8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3486,8 +3486,36 @@ void scheduler_tick(void)
 
 struct tick_work {
 	int			cpu;
+	atomic_t		state;
 	struct delayed_work	work;
 };
+/* Values for ->state, see diagram below. */
+#define TICK_SCHED_REMOTE_OFFLINE	0
+#define TICK_SCHED_REMOTE_OFFLINING	1
+#define TICK_SCHED_REMOTE_RUNNING	2
+
+/*
+ * State diagram for ->state:
+ *
+ *
+ *          TICK_SCHED_REMOTE_OFFLINE
+ *                    |   ^
+ *                    |   |
+ *                    |   | sched_tick_remote()
+ *                    |   |
+ *                    |   |
+ *                    +--TICK_SCHED_REMOTE_OFFLINING
+ *                    |   ^
+ *                    |   |
+ * sched_tick_start() |   | sched_tick_stop()
+ *                    |   |
+ *                    V   |
+ *          TICK_SCHED_REMOTE_RUNNING
+ *
+ *
+ * Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
+ * and sched_tick_start() are happy to leave the state in RUNNING.
+ */
 
 static struct tick_work __percpu *tick_work_cpu;
 
@@ -3500,6 +3528,7 @@ static void sched_tick_remote(struct work_struct *work)
 	struct task_struct *curr;
 	struct rq_flags rf;
 	u64 delta;
+	int os;
 
 	/*
 	 * Handle the tick only if it appears the remote CPU is running in full
@@ -3513,7 +3542,7 @@ static void sched_tick_remote(struct work_struct *work)
 
 	rq_lock_irq(rq, &rf);
 	curr = rq->curr;
-	if (is_idle_task(curr))
+	if (is_idle_task(curr) || cpu_is_offline(cpu))
 		goto out_unlock;
 
 	update_rq_clock(rq);
@@ -3533,13 +3562,18 @@ static void sched_tick_remote(struct work_struct *work)
 	/*
 	 * Run the remote tick once per second (1Hz). This arbitrary
 	 * frequency is large enough to avoid overload but short enough
-	 * to keep scheduler internal stats reasonably up to date.
+	 * to keep scheduler internal stats reasonably up to date.  But
+	 * first update state to reflect hotplug activity if required.
 	 */
-	queue_delayed_work(system_unbound_wq, dwork, HZ);
+	os = atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING);
+	WARN_ON_ONCE(os == TICK_SCHED_REMOTE_OFFLINE);
+	if (os == TICK_SCHED_REMOTE_RUNNING)
+		queue_delayed_work(system_unbound_wq, dwork, HZ);
 }
 
 static void sched_tick_start(int cpu)
 {
+	int os;
 	struct tick_work *twork;
 
 	if (housekeeping_cpu(cpu, HK_FLAG_TICK))
@@ -3548,15 +3582,20 @@ static void sched_tick_start(int cpu)
 	WARN_ON_ONCE(!tick_work_cpu);
 
 	twork = per_cpu_ptr(tick_work_cpu, cpu);
-	twork->cpu = cpu;
-	INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
-	queue_delayed_work(system_unbound_wq, &twork->work, HZ);
+	os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
+	WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
+	if (os == TICK_SCHED_REMOTE_OFFLINE) {
+		twork->cpu = cpu;
+		INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
+		queue_delayed_work(system_unbound_wq, &twork->work, HZ);
+	}
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
 static void sched_tick_stop(int cpu)
 {
 	struct tick_work *twork;
+	int os;
 
 	if (housekeeping_cpu(cpu, HK_FLAG_TICK))
 		return;
@@ -3564,7 +3603,10 @@ static void sched_tick_stop(int cpu)
 	WARN_ON_ONCE(!tick_work_cpu);
 
 	twork = per_cpu_ptr(tick_work_cpu, cpu);
-	cancel_delayed_work_sync(&twork->work);
+	/* There cannot be competing actions, but don't rely on stop-machine. */
+	os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_OFFLINING);
+	WARN_ON_ONCE(os != TICK_SCHED_REMOTE_RUNNING);
+	/* Don't cancel, as this would mess up the state machine. */
 }
 #endif /* CONFIG_HOTPLUG_CPU */
 
@@ -3572,7 +3614,6 @@ int __init sched_tick_offload_init(void)
 {
 	tick_work_cpu = alloc_percpu(struct tick_work);
 	BUG_ON(!tick_work_cpu);
-
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 03/12] rcu: Restore barrier() to rcu_read_lock() and rcu_read_unlock()
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 01/12] lockdep: Make print_lock() address visible Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 02/12] time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 04/12] rcu: Add kernel parameter to dump trace after RCU CPU stall warning Paul E. McKenney
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Commit bb73c52bad36 ("rcu: Don't disable preemption for Tiny and Tree
RCU readers") removed the barrier() calls from rcu_read_lock() and
rcu_write_lock() in CONFIG_PREEMPT=n&&CONFIG_PREEMPT_COUNT=n kernels.
Within RCU, this commit was OK, but it failed to account for things like
get_user() that can pagefault and that can be reordered by the compiler.
Lack of the barrier() calls in rcu_read_lock() and rcu_read_unlock()
can cause these page faults to migrate into RCU read-side critical
sections, which in CONFIG_PREEMPT=n kernels could result in too-short
grace periods and arbitrary misbehavior.  Please see commit 386afc91144b
("spinlocks and preemption points need to be at least compiler barriers")
and Linus's commit 66be4e66a7f4 ("rcu: locking and unlocking need to
always be at least barriers"), this last of which restores the barrier()
call to both rcu_read_lock() and rcu_read_unlock().

This commit removes barrier() calls that are no longer needed given that
the addition of them in Linus's commit noted above.  The combination of
this commit and Linus's commit effectively reverts commit bb73c52bad36
("rcu: Don't disable preemption for Tiny and Tree RCU readers").

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
[ paulmck: Fix embarrassing typo located by Alan Stern. ]
---
 .../RCU/Design/Requirements/Requirements.html | 71 +++++++++++++++++++
 kernel/rcu/tree_plugin.h                      | 11 ---
 2 files changed, 71 insertions(+), 11 deletions(-)

diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html
index 5a9238a2883c..f04c467e55c5 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.html
+++ b/Documentation/RCU/Design/Requirements/Requirements.html
@@ -2129,6 +2129,8 @@ Some of the relevant points of interest are as follows:
 <li>	<a href="#Hotplug CPU">Hotplug CPU</a>.
 <li>	<a href="#Scheduler and RCU">Scheduler and RCU</a>.
 <li>	<a href="#Tracing and RCU">Tracing and RCU</a>.
+<li>	<a href="#Accesses to User Memory and RCU">
+Accesses to User Memory and RCU</a>.
 <li>	<a href="#Energy Efficiency">Energy Efficiency</a>.
 <li>	<a href="#Scheduling-Clock Interrupts and RCU">
 	Scheduling-Clock Interrupts and RCU</a>.
@@ -2521,6 +2523,75 @@ cannot be used.
 The tracing folks both located the requirement and provided the
 needed fix, so this surprise requirement was relatively painless.
 
+<h3><a name="Accesses to User Memory and RCU">
+Accesses to User Memory and RCU</a></h3>
+
+<p>
+The kernel needs to access user-space memory, for example, to access
+data referenced by system-call parameters.
+The <tt>get_user()</tt> macro does this job.
+
+<p>
+However, user-space memory might well be paged out, which means
+that <tt>get_user()</tt> might well page-fault and thus block while
+waiting for the resulting I/O to complete.
+It would be a very bad thing for the compiler to reorder
+a <tt>get_user()</tt> invocation into an RCU read-side critical
+section.
+For example, suppose that the source code looked like this:
+
+<blockquote>
+<pre>
+ 1 rcu_read_lock();
+ 2 p = rcu_dereference(gp);
+ 3 v = p-&gt;value;
+ 4 rcu_read_unlock();
+ 5 get_user(user_v, user_p);
+ 6 do_something_with(v, user_v);
+</pre>
+</blockquote>
+
+<p>
+The compiler must not be permitted to transform this source code into
+the following:
+
+<blockquote>
+<pre>
+ 1 rcu_read_lock();
+ 2 p = rcu_dereference(gp);
+ 3 get_user(user_v, user_p); // BUG: POSSIBLE PAGE FAULT!!!
+ 4 v = p-&gt;value;
+ 5 rcu_read_unlock();
+ 6 do_something_with(v, user_v);
+</pre>
+</blockquote>
+
+<p>
+If the compiler did make this transformation in a
+<tt>CONFIG_PREEMPT=n</tt> kernel build, and if <tt>get_user()</tt> did
+page fault, the result would be a quiescent state in the middle
+of an RCU read-side critical section.
+This misplaced quiescent state could result in line&nbsp;4 being
+a use-after-free access, which could be bad for your kernel's
+actuarial statistics.
+Similar examples can be constructed with the call to <tt>get_user()</tt>
+preceding the <tt>rcu_read_lock()</tt>.
+
+<p>
+Unfortunately, <tt>get_user()</tt> doesn't have any particular
+ordering properties, and in some architectures the underlying <tt>asm</tt>
+isn't even marked <tt>volatile</tt>.
+And even if it was marked <tt>volatile</tt>, the above access to
+<tt>p-&gt;value</tt> is not volatile, so the compiler would not have any
+reason to keep those two accesses in order.
+
+<p>
+Therefore, the Linux-kernel definitions of <tt>rcu_read_lock()</tt>
+and <tt>rcu_read_unlock()</tt> must act as compiler barriers,
+at least for outermost instances of <tt>rcu_read_lock()</tt> and
+<tt>rcu_read_unlock()</tt> within a nested set of RCU read-side critical
+sections.
+
 <h3><a name="Energy Efficiency">Energy Efficiency</a></h3>
 
 <p>
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index acb225023ed1..3f1b5041de9b 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -288,7 +288,6 @@ void rcu_note_context_switch(bool preempt)
 	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 	struct rcu_node *rnp;
 
-	barrier(); /* Avoid RCU read-side critical sections leaking down. */
 	trace_rcu_utilization(TPS("Start context switch"));
 	lockdep_assert_irqs_disabled();
 	WARN_ON_ONCE(!preempt && t->rcu_read_lock_nesting > 0);
@@ -340,7 +339,6 @@ void rcu_note_context_switch(bool preempt)
 	if (rdp->exp_deferred_qs)
 		rcu_report_exp_rdp(rdp);
 	trace_rcu_utilization(TPS("End context switch"));
-	barrier(); /* Avoid RCU read-side critical sections leaking up. */
 }
 EXPORT_SYMBOL_GPL(rcu_note_context_switch);
 
@@ -828,11 +826,6 @@ static void rcu_qs(void)
  * dyntick-idle quiescent state visible to other CPUs, which will in
  * some cases serve for expedited as well as normal grace periods.
  * Either way, register a lightweight quiescent state.
- *
- * The barrier() calls are redundant in the common case when this is
- * called externally, but just in case this is called from within this
- * file.
- *
  */
 void rcu_all_qs(void)
 {
@@ -847,14 +840,12 @@ void rcu_all_qs(void)
 		return;
 	}
 	this_cpu_write(rcu_data.rcu_urgent_qs, false);
-	barrier(); /* Avoid RCU read-side critical sections leaking down. */
 	if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) {
 		local_irq_save(flags);
 		rcu_momentary_dyntick_idle();
 		local_irq_restore(flags);
 	}
 	rcu_qs();
-	barrier(); /* Avoid RCU read-side critical sections leaking up. */
 	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(rcu_all_qs);
@@ -864,7 +855,6 @@ EXPORT_SYMBOL_GPL(rcu_all_qs);
  */
 void rcu_note_context_switch(bool preempt)
 {
-	barrier(); /* Avoid RCU read-side critical sections leaking down. */
 	trace_rcu_utilization(TPS("Start context switch"));
 	rcu_qs();
 	/* Load rcu_urgent_qs before other flags. */
@@ -877,7 +867,6 @@ void rcu_note_context_switch(bool preempt)
 		rcu_tasks_qs(current);
 out:
 	trace_rcu_utilization(TPS("End context switch"));
-	barrier(); /* Avoid RCU read-side critical sections leaking up. */
 }
 EXPORT_SYMBOL_GPL(rcu_note_context_switch);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 04/12] rcu: Add kernel parameter to dump trace after RCU CPU stall warning
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
                   ` (2 preceding siblings ...)
  2019-08-01 22:37 ` [PATCH tip/core/rcu 03/12] rcu: Restore barrier() to rcu_read_lock() and rcu_read_unlock() Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 05/12] rcu: Add destroy_work_on_stack() to match INIT_WORK_ONSTACK() Paul E. McKenney
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

This commit adds a rcu_cpu_stall_ftrace_dump kernel boot parameter, that,
when set, causes the trace buffer to be dumped after an RCU CPU stall
warning is printed.  This kernel boot parameter is disabled by default,
maintaining compatibility with previous behavior.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 4 ++++
 kernel/rcu/rcu.h                                | 1 +
 kernel/rcu/tree_stall.h                         | 4 ++++
 kernel/rcu/update.c                             | 2 ++
 4 files changed, 11 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7ccd158b3894..f3fcd6140ee1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4047,6 +4047,10 @@
 	rcutorture.verbose= [KNL]
 			Enable additional printk() statements.
 
+	rcupdate.rcu_cpu_stall_ftrace_dump= [KNL]
+			Dump ftrace buffer after reporting RCU CPU
+			stall warning.
+
 	rcupdate.rcu_cpu_stall_suppress= [KNL]
 			Suppress RCU CPU stall warning messages.
 
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 5290b01de534..8fd4f82c9b3d 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -227,6 +227,7 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
 
 #ifdef CONFIG_RCU_STALL_COMMON
 
+extern int rcu_cpu_stall_ftrace_dump;
 extern int rcu_cpu_stall_suppress;
 extern int rcu_cpu_stall_timeout;
 int rcu_jiffies_till_stall_check(void);
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 065183391f75..0627a66699a6 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -527,6 +527,8 @@ static void check_cpu_stall(struct rcu_data *rdp)
 
 		/* We haven't checked in, so go dump stack. */
 		print_cpu_stall();
+		if (rcu_cpu_stall_ftrace_dump)
+			rcu_ftrace_dump(DUMP_ALL);
 
 	} else if (rcu_gp_in_progress() &&
 		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
@@ -534,6 +536,8 @@ static void check_cpu_stall(struct rcu_data *rdp)
 
 		/* They had a few time units to dump stack, so complain. */
 		print_other_cpu_stall(gs2);
+		if (rcu_cpu_stall_ftrace_dump)
+			rcu_ftrace_dump(DUMP_ALL);
 	}
 }
 
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 61df2bf08563..249517058b13 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -437,6 +437,8 @@ EXPORT_SYMBOL_GPL(rcutorture_sched_setaffinity);
 #endif
 
 #ifdef CONFIG_RCU_STALL_COMMON
+int rcu_cpu_stall_ftrace_dump __read_mostly;
+module_param(rcu_cpu_stall_ftrace_dump, int, 0644);
 int rcu_cpu_stall_suppress __read_mostly; /* 1 = suppress stall warnings. */
 EXPORT_SYMBOL_GPL(rcu_cpu_stall_suppress);
 module_param(rcu_cpu_stall_suppress, int, 0644);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 05/12] rcu: Add destroy_work_on_stack() to match INIT_WORK_ONSTACK()
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
                   ` (3 preceding siblings ...)
  2019-08-01 22:37 ` [PATCH tip/core/rcu 04/12] rcu: Add kernel parameter to dump trace after RCU CPU stall warning Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 06/12] srcu: Avoid srcutorture security-based pointer obfuscation Paul E. McKenney
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

The synchronize_rcu_expedited() function has an INIT_WORK_ONSTACK(),
but lacks the corresponding destroy_work_on_stack().  This commit
therefore adds destroy_work_on_stack().

Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
---
 kernel/rcu/tree_exp.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index af7e7b9c86af..513b403b683b 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -792,6 +792,7 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
  */
 void synchronize_rcu_expedited(void)
 {
+	bool boottime = (rcu_scheduler_active == RCU_SCHEDULER_INIT);
 	struct rcu_exp_work rew;
 	struct rcu_node *rnp;
 	unsigned long s;
@@ -817,7 +818,7 @@ void synchronize_rcu_expedited(void)
 		return;  /* Someone else did our work for us. */
 
 	/* Ensure that load happens before action based on it. */
-	if (unlikely(rcu_scheduler_active == RCU_SCHEDULER_INIT)) {
+	if (unlikely(boottime)) {
 		/* Direct call during scheduler init and early_initcalls(). */
 		rcu_exp_sel_wait_wake(s);
 	} else {
@@ -835,5 +836,8 @@ void synchronize_rcu_expedited(void)
 
 	/* Let the next expedited grace period start. */
 	mutex_unlock(&rcu_state.exp_mutex);
+
+	if (likely(!boottime))
+		destroy_work_on_stack(&rew.rew_work);
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 06/12] srcu: Avoid srcutorture security-based pointer obfuscation
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
                   ` (4 preceding siblings ...)
  2019-08-01 22:37 ` [PATCH tip/core/rcu 05/12] rcu: Add destroy_work_on_stack() to match INIT_WORK_ONSTACK() Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 07/12] rcu: Change return type of rcu_spawn_one_boost_kthread() Paul E. McKenney
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Because pointer output is now obfuscated, and because what you really
want to know is whether or not the callback lists are empty, this commit
replaces the srcu_data structure's head callback pointer printout with
a single character that is "." is the callback list is empty or "C"
otherwise.

This is the only remaining user of rcu_segcblist_head(), so this
commit also removes this function's definition.  It also turns out that
rcu_segcblist_tail() no longer has any callers, so this commit removes
that function's definition while in the area.  They were both marked
"Interim", and their end has come.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 kernel/rcu/rcu_segcblist.h | 21 ---------------------
 kernel/rcu/srcutree.c      |  5 +++--
 2 files changed, 3 insertions(+), 23 deletions(-)

diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index 71b64648464e..822a39da0533 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -76,27 +76,6 @@ static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg)
 	return !*rsclp->tails[seg];
 }
 
-/*
- * Interim function to return rcu_segcblist head pointer.  Longer term, the
- * rcu_segcblist will be used more pervasively, removing the need for this
- * function.
- */
-static inline struct rcu_head *rcu_segcblist_head(struct rcu_segcblist *rsclp)
-{
-	return rsclp->head;
-}
-
-/*
- * Interim function to return rcu_segcblist head pointer.  Longer term, the
- * rcu_segcblist will be used more pervasively, removing the need for this
- * function.
- */
-static inline struct rcu_head **rcu_segcblist_tail(struct rcu_segcblist *rsclp)
-{
-	WARN_ON_ONCE(rcu_segcblist_empty(rsclp));
-	return rsclp->tails[RCU_NEXT_TAIL];
-}
-
 void rcu_segcblist_init(struct rcu_segcblist *rsclp);
 void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
 bool rcu_segcblist_ready_cbs(struct rcu_segcblist *rsclp);
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index cf0e886314f2..5dffade2d7cd 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1279,8 +1279,9 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 
 		c0 = l0 - u0;
 		c1 = l1 - u1;
-		pr_cont(" %d(%ld,%ld %1p)",
-			cpu, c0, c1, rcu_segcblist_head(&sdp->srcu_cblist));
+		pr_cont(" %d(%ld,%ld %c)",
+			cpu, c0, c1,
+			"C."[rcu_segcblist_empty(&sdp->srcu_cblist)]);
 		s0 += c0;
 		s1 += c1;
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 07/12] rcu: Change return type of rcu_spawn_one_boost_kthread()
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
                   ` (5 preceding siblings ...)
  2019-08-01 22:37 ` [PATCH tip/core/rcu 06/12] srcu: Avoid srcutorture security-based pointer obfuscation Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 08/12] doc: Add rcutree.kthread_prio pointer to stallwarn.txt Paul E. McKenney
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Byungchul Park,
	Paul E . McKenney

From: Byungchul Park <byungchul.park@lge.com>

The return value of rcu_spawn_one_boost_kthread() is not used any longer.
This commit therefore changes its return type from int to void, and
removes the cast to void from its callers.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 kernel/rcu/tree_plugin.h | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 3f1b5041de9b..307ae6ebb804 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1123,7 +1123,7 @@ static void rcu_preempt_boost_start_gp(struct rcu_node *rnp)
  * already exist.  We only create this kthread for preemptible RCU.
  * Returns zero if all is well, a negated errno otherwise.
  */
-static int rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
+static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
 {
 	int rnp_index = rnp - rcu_get_root();
 	unsigned long flags;
@@ -1131,25 +1131,27 @@ static int rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
 	struct task_struct *t;
 
 	if (!IS_ENABLED(CONFIG_PREEMPT_RCU))
-		return 0;
+		return;
 
 	if (!rcu_scheduler_fully_active || rcu_rnp_online_cpus(rnp) == 0)
-		return 0;
+		return;
 
 	rcu_state.boost = 1;
+
 	if (rnp->boost_kthread_task != NULL)
-		return 0;
+		return;
+
 	t = kthread_create(rcu_boost_kthread, (void *)rnp,
 			   "rcub/%d", rnp_index);
-	if (IS_ERR(t))
-		return PTR_ERR(t);
+	if (WARN_ON_ONCE(IS_ERR(t)))
+		return;
+
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	rnp->boost_kthread_task = t;
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	sp.sched_priority = kthread_prio;
 	sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
 	wake_up_process(t); /* get to TASK_INTERRUPTIBLE quickly. */
-	return 0;
 }
 
 /*
@@ -1190,7 +1192,7 @@ static void __init rcu_spawn_boost_kthreads(void)
 	struct rcu_node *rnp;
 
 	rcu_for_each_leaf_node(rnp)
-		(void)rcu_spawn_one_boost_kthread(rnp);
+		rcu_spawn_one_boost_kthread(rnp);
 }
 
 static void rcu_prepare_kthreads(int cpu)
@@ -1200,7 +1202,7 @@ static void rcu_prepare_kthreads(int cpu)
 
 	/* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */
 	if (rcu_scheduler_fully_active)
-		(void)rcu_spawn_one_boost_kthread(rnp);
+		rcu_spawn_one_boost_kthread(rnp);
 }
 
 #else /* #ifdef CONFIG_RCU_BOOST */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 08/12] doc: Add rcutree.kthread_prio pointer to stallwarn.txt
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
                   ` (6 preceding siblings ...)
  2019-08-01 22:37 ` [PATCH tip/core/rcu 07/12] rcu: Change return type of rcu_spawn_one_boost_kthread() Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 09/12] idle: Prevent late-arriving interrupts from disrupting offline Paul E. McKenney
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

This commit adds mention of the rcutree.kthread_prio kernel boot parameter
to the discussion of how high-priority real-time tasks can result in
RCU CPU stall warnings.  (However, this does not necessarily help when
the high-priority real-time tasks are using dubious deadlines.)

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 Documentation/RCU/stallwarn.txt | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 13e88fc00f01..f48f4621ccbc 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -57,6 +57,12 @@ o	A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
 	CONFIG_PREEMPT_RCU case, you might see stall-warning
 	messages.
 
+	You can use the rcutree.kthread_prio kernel boot parameter to
+	increase the scheduling priority of RCU's kthreads, which can
+	help avoid this problem.  However, please note that doing this
+	can increase your system's context-switch rate and thus degrade
+	performance.
+
 o	A periodic interrupt whose handler takes longer than the time
 	interval between successive pairs of interrupts.  This can
 	prevent RCU's kthreads and softirq handlers from running.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 09/12] idle: Prevent late-arriving interrupts from disrupting offline
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
                   ` (7 preceding siblings ...)
  2019-08-01 22:37 ` [PATCH tip/core/rcu 08/12] doc: Add rcutree.kthread_prio pointer to stallwarn.txt Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 10/12] rcu: Remove redundant "if" condition from rcu_gp_is_expedited() Paul E. McKenney
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E . McKenney

From: Peter Zijlstra <peterz@infradead.org>

Scheduling-clock interrupts can arrive late in the CPU-offline process,
after idle entry and the subsequent call to cpuhp_report_idle_dead().
Once execution passes the call to rcu_report_dead(), RCU is ignoring
the CPU, which results in lockdep complaints when the interrupt handler
uses RCU:

------------------------------------------------------------------------

=============================
WARNING: suspicious RCU usage
5.2.0-rc1+ #681 Not tainted
-----------------------------
kernel/sched/fair.c:9542 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from offline CPU!
rcu_scheduler_active = 2, debug_locks = 1
no locks held by swapper/5/0.

stack backtrace:
CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.2.0-rc1+ #681
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
Call Trace:
 <IRQ>
 dump_stack+0x5e/0x8b
 trigger_load_balance+0xa8/0x390
 ? tick_sched_do_timer+0x60/0x60
 update_process_times+0x3b/0x50
 tick_sched_handle+0x2f/0x40
 tick_sched_timer+0x32/0x70
 __hrtimer_run_queues+0xd3/0x3b0
 hrtimer_interrupt+0x11d/0x270
 ? sched_clock_local+0xc/0x74
 smp_apic_timer_interrupt+0x79/0x200
 apic_timer_interrupt+0xf/0x20
 </IRQ>
RIP: 0010:delay_tsc+0x22/0x50
Code: ff 0f 1f 80 00 00 00 00 65 44 8b 05 18 a7 11 48 0f ae e8 0f 31 48 89 d6 48 c1 e6 20 48 09 c6 eb 0e f3 90 65 8b 05 fe a6 11 48 <41> 39 c0 75 18 0f ae e8 0f 31 48 c1 e2 20 48 09 c2 48 89 d0 48 29
RSP: 0000:ffff8f92c0157ed0 EFLAGS: 00000212 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000005 RBX: ffff8c861f356400 RCX: ffff8f92c0157e64
RDX: 000000321214c8cc RSI: 00000032120daa7f RDI: 0000000000260f15
RBP: 0000000000000005 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8c861ee18000 R15: ffff8c861ee18000
 cpuhp_report_idle_dead+0x31/0x60
 do_idle+0x1d5/0x200
 ? _raw_spin_unlock_irqrestore+0x2d/0x40
 cpu_startup_entry+0x14/0x20
 start_secondary+0x151/0x170
 secondary_startup_64+0xa4/0xb0

------------------------------------------------------------------------

This happens rarely, but can be forced by happen more often by
placing delays in cpuhp_report_idle_dead() following the call to
rcu_report_dead().  With this in place, the following rcutorture
scenario reproduces the problem within a few minutes:

tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --duration 5 --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --configs "TREE04"

This commit uses the crude but effective expedient of moving the disabling
of interrupts within the idle loop to precede the cpu_is_offline()
check.  It also invokes tick_nohz_idle_stop_tick() instead of
tick_nohz_idle_stop_tick_protected() to shut off the scheduling-clock
interrupt, and then removes the latter because there are no other callers.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 include/linux/tick.h | 10 ----------
 kernel/sched/idle.c  |  5 +++--
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index f92a10b5e112..196a0a7bfc4f 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -134,14 +134,6 @@ extern unsigned long tick_nohz_get_idle_calls(void);
 extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
 extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
 extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
-
-static inline void tick_nohz_idle_stop_tick_protected(void)
-{
-	local_irq_disable();
-	tick_nohz_idle_stop_tick();
-	local_irq_enable();
-}
-
 #else /* !CONFIG_NO_HZ_COMMON */
 #define tick_nohz_enabled (0)
 static inline int tick_nohz_tick_stopped(void) { return 0; }
@@ -164,8 +156,6 @@ static inline ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
 }
 static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
 static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
-
-static inline void tick_nohz_idle_stop_tick_protected(void) { }
 #endif /* !CONFIG_NO_HZ_COMMON */
 
 #ifdef CONFIG_NO_HZ_FULL
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 80940939b733..e4bc4aa739b8 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -241,13 +241,14 @@ static void do_idle(void)
 		check_pgt_cache();
 		rmb();
 
+		local_irq_disable();
+
 		if (cpu_is_offline(cpu)) {
-			tick_nohz_idle_stop_tick_protected();
+			tick_nohz_idle_stop_tick();
 			cpuhp_report_idle_dead();
 			arch_cpu_idle_dead();
 		}
 
-		local_irq_disable();
 		arch_cpu_idle_enter();
 
 		/*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 10/12] rcu: Remove redundant "if" condition from rcu_gp_is_expedited()
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
                   ` (8 preceding siblings ...)
  2019-08-01 22:37 ` [PATCH tip/core/rcu 09/12] idle: Prevent late-arriving interrupts from disrupting offline Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 11/12] arm: Use common outgoing-CPU-notification code Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 12/12] rcu: Fix spelling mistake "greate"->"great" Paul E. McKenney
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney

Because rcu_expedited_nesting is initialized to 1 and not decremented
until just before init is spawned, rcu_expedited_nesting is guaranteed
to be non-zero whenever rcu_scheduler_active == RCU_SCHEDULER_INIT.
This commit therefore removes this redundant "if" equality test.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/update.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 249517058b13..64e9cc8609e7 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -136,8 +136,7 @@ static atomic_t rcu_expedited_nesting = ATOMIC_INIT(1);
  */
 bool rcu_gp_is_expedited(void)
 {
-	return rcu_expedited || atomic_read(&rcu_expedited_nesting) ||
-	       rcu_scheduler_active == RCU_SCHEDULER_INIT;
+	return rcu_expedited || atomic_read(&rcu_expedited_nesting);
 }
 EXPORT_SYMBOL_GPL(rcu_gp_is_expedited);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 11/12] arm: Use common outgoing-CPU-notification code
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
                   ` (9 preceding siblings ...)
  2019-08-01 22:37 ` [PATCH tip/core/rcu 10/12] rcu: Remove redundant "if" condition from rcu_gp_is_expedited() Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  2019-08-01 22:37 ` [PATCH tip/core/rcu 12/12] rcu: Fix spelling mistake "greate"->"great" Paul E. McKenney
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Paul E. McKenney,
	linux-arm-kernel, Russell King, Mark Rutland, Dietmar Eggemann

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

This commit removes the open-coded CPU-offline notification with new
common code.  In particular, this change avoids calling scheduler code
using RCU from an offline CPU that RCU is ignoring.  This is a minimal
change.  A more intrusive change might invoke the cpu_check_up_prepare()
and cpu_set_state_online() functions at CPU-online time, which would
allow onlining throw an error if the CPU did not go offline properly.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/kernel/smp.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index aab8ba40ce38..4b0bab2607e4 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -264,15 +264,13 @@ int __cpu_disable(void)
 	return 0;
 }
 
-static DECLARE_COMPLETION(cpu_died);
-
 /*
  * called on the thread which is asking for a CPU to be shutdown -
  * waits until shutdown has completed, or it is timed out.
  */
 void __cpu_die(unsigned int cpu)
 {
-	if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
+	if (!cpu_wait_death(cpu, 5)) {
 		pr_err("CPU%u: cpu didn't die\n", cpu);
 		return;
 	}
@@ -319,7 +317,7 @@ void arch_cpu_idle_dead(void)
 	 * this returns, power and/or clocks can be removed at any point
 	 * from this CPU and its cache by platform_cpu_kill().
 	 */
-	complete(&cpu_died);
+	(void)cpu_report_death();
 
 	/*
 	 * Ensure that the cache lines associated with that completion are
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH tip/core/rcu 12/12] rcu: Fix spelling mistake "greate"->"great"
  2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
                   ` (10 preceding siblings ...)
  2019-08-01 22:37 ` [PATCH tip/core/rcu 11/12] arm: Use common outgoing-CPU-notification code Paul E. McKenney
@ 2019-08-01 22:37 ` Paul E. McKenney
  11 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2019-08-01 22:37 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, Mukesh Ojha, Paul E . McKenney

From: Mukesh Ojha <mojha@codeaurora.org>

This commit fixes a spelling mistake in file tree_exp.h.

Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
---
 kernel/rcu/tree_exp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 513b403b683b..d632cd019597 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -781,7 +781,7 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
  * other hand, if the CPU is not in an RCU read-side critical section,
  * the IPI handler reports the quiescent state immediately.
  *
- * Although this is a greate improvement over previous expedited
+ * Although this is a great improvement over previous expedited
  * implementations, it is still unfriendly to real-time workloads, so is
  * thus not recommended for any sort of common-case code.  In fact, if
  * you are using synchronize_rcu_expedited() in a loop, please restructure
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-08-01 22:39 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-01 22:37 [PATCH tip/core/rcu 0/12] Miscellaneous fixes for v5.4 Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 01/12] lockdep: Make print_lock() address visible Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 02/12] time/tick-broadcast: Fix tick_broadcast_offline() lockdep complaint Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 03/12] rcu: Restore barrier() to rcu_read_lock() and rcu_read_unlock() Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 04/12] rcu: Add kernel parameter to dump trace after RCU CPU stall warning Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 05/12] rcu: Add destroy_work_on_stack() to match INIT_WORK_ONSTACK() Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 06/12] srcu: Avoid srcutorture security-based pointer obfuscation Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 07/12] rcu: Change return type of rcu_spawn_one_boost_kthread() Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 08/12] doc: Add rcutree.kthread_prio pointer to stallwarn.txt Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 09/12] idle: Prevent late-arriving interrupts from disrupting offline Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 10/12] rcu: Remove redundant "if" condition from rcu_gp_is_expedited() Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 11/12] arm: Use common outgoing-CPU-notification code Paul E. McKenney
2019-08-01 22:37 ` [PATCH tip/core/rcu 12/12] rcu: Fix spelling mistake "greate"->"great" Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).