All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC tip/core/rcu] Restore barrier() to rcu_read_lock() and rcu_read_unlock()
@ 2019-06-06 13:19 Paul E. McKenney
  2019-06-06 13:42 ` Herbert Xu
  0 siblings, 1 reply; 3+ messages in thread
From: Paul E. McKenney @ 2019-06-06 13:19 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, herbert, torvalds

Commit bb73c52bad36 ("rcu: Don't disable preemption for Tiny and Tree
RCU readers") removed the barrier() calls from rcu_read_lock() and
rcu_write_lock() in CONFIG_PREEMPT=n&&CONFIG_PREEMPT_COUNT=n kernels.
Within RCU, this commit was OK, but it failed to account for things like
get_user() that can pagefault and that can be reordered by the compiler.
Lack of the barrier() calls in rcu_read_lock() and rcu_read_unlock()
can cause these page faults to migrate into RCU read-side critical
sections, which in CONFIG_PREEMPT=n kernels could result in too-short
grace periods and arbitrary misbehavior.  Please see commit 386afc91144b
("spinlocks and preemption points need to be at least compiler barriers")
for more details.

This commit therefore restores the barrier() call to both rcu_read_lock()
and rcu_read_unlock().  It also removes them from places in the RCU update
machinery that used to need compensatory barrier() calls, effectively
reverting commit bb73c52bad36 ("rcu: Don't disable preemption for Tiny
and Tree RCU readers").

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>

diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html
index 5a9238a2883c..080b39cc1dbb 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.html
+++ b/Documentation/RCU/Design/Requirements/Requirements.html
@@ -2129,6 +2129,8 @@ Some of the relevant points of interest are as follows:
 <li>	<a href="#Hotplug CPU">Hotplug CPU</a>.
 <li>	<a href="#Scheduler and RCU">Scheduler and RCU</a>.
 <li>	<a href="#Tracing and RCU">Tracing and RCU</a>.
+<li>	<a href="#Accesses to User Mamory and RCU">
+Accesses to User Mamory and RCU</a>.
 <li>	<a href="#Energy Efficiency">Energy Efficiency</a>.
 <li>	<a href="#Scheduling-Clock Interrupts and RCU">
 	Scheduling-Clock Interrupts and RCU</a>.
@@ -2521,6 +2523,75 @@ cannot be used.
 The tracing folks both located the requirement and provided the
 needed fix, so this surprise requirement was relatively painless.
 
+<h3><a name="Accesses to User Mamory and RCU">
+Accesses to User Mamory and RCU</a></h3>
+
+<p>
+The kernel needs to access user-space memory, for example, to access
+data referenced by system-call parameters.
+The <tt>get_user()</tt> macro does this job.
+
+<p>
+However, user-space memory might well be paged out, which means
+that <tt>get_user()</tt> might well page-fault and thus block while
+waiting for the resulting I/O to complete.
+It would be a very bad thing for the compiler to reorder
+a <tt>get_user()</tt> invocation into an RCU read-side critical
+section.
+For example, suppose that the source code looked like this:
+
+<blockquote>
+<pre>
+ 1 rcu_read_lock();
+ 2 p = rcu_dereference(gp);
+ 3 v = p-&gt;value;
+ 4 rcu_read_unlock();
+ 5 get_user(user_v, user_p);
+ 6 do_something_with(v, user_v);
+</pre>
+</blockquote>
+
+<p>
+The compiler must not be permitted to transform this source code into
+the following:
+
+<blockquote>
+<pre>
+ 1 rcu_read_lock();
+ 2 p = rcu_dereference(gp);
+ 3 get_user(user_v, user_p); // BUG: POSSIBLE PAGE FAULT!!!
+ 4 v = p-&gt;value;
+ 5 rcu_read_unlock();
+ 6 do_something_with(v, user_v);
+</pre>
+</blockquote>
+
+<p>
+If the compiler did make this transformation in a
+<tt>CONFIG_PREEMPT=n</tt> kernel build, and if <tt>get_user()</tt> did
+page fault, the result would be a quiescent state in the middle
+of an RCU read-side critical section.
+This misplaced quiescent state could result in line&nbsp;4 being
+a use-after-free access, which could be bad for your kernel's
+actuarial statistics.
+Similar examples can be constructed with the call to <tt>get_user()</tt>
+preceding the <tt>rcu_read_lock()</tt>.
+
+<p>
+Unfortunately, <tt>get_user()</tt> doesn't have any particular
+ordering properties, and in some architectures the underlying <tt>asm</tt>
+isn't even marked <tt>volatile</tt>.
+And even if it was marked <tt>volatile</tt>, the above access to
+<tt>p-&gt;value</tt> is not volatile, so the compiler would not have any
+reason to keep those two accesses in order.
+
+<p>
+Therefore, the Linux-kernel definitions of <tt>rcu_read_lock()</tt>
+and <tt>rcu_read_unlock()</tt> must act as compiler barriers,
+at least for outermost instances of <tt>rcu_read_lock()</tt> and
+<tt>rcu_read_unlock()</tt> within a nested set of RCU read-side critical
+sections.
+
 <h3><a name="Energy Efficiency">Energy Efficiency</a></h3>
 
 <p>
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 0c9b92799abc..8f7167478c1d 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -56,14 +56,12 @@ void __rcu_read_unlock(void);
 
 static inline void __rcu_read_lock(void)
 {
-	if (IS_ENABLED(CONFIG_PREEMPT_COUNT))
-		preempt_disable();
+	preempt_disable();
 }
 
 static inline void __rcu_read_unlock(void)
 {
-	if (IS_ENABLED(CONFIG_PREEMPT_COUNT))
-		preempt_enable();
+	preempt_enable();
 }
 
 static inline int rcu_preempt_depth(void)
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index acb225023ed1..3f1b5041de9b 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -288,7 +288,6 @@ void rcu_note_context_switch(bool preempt)
 	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 	struct rcu_node *rnp;
 
-	barrier(); /* Avoid RCU read-side critical sections leaking down. */
 	trace_rcu_utilization(TPS("Start context switch"));
 	lockdep_assert_irqs_disabled();
 	WARN_ON_ONCE(!preempt && t->rcu_read_lock_nesting > 0);
@@ -340,7 +339,6 @@ void rcu_note_context_switch(bool preempt)
 	if (rdp->exp_deferred_qs)
 		rcu_report_exp_rdp(rdp);
 	trace_rcu_utilization(TPS("End context switch"));
-	barrier(); /* Avoid RCU read-side critical sections leaking up. */
 }
 EXPORT_SYMBOL_GPL(rcu_note_context_switch);
 
@@ -828,11 +826,6 @@ static void rcu_qs(void)
  * dyntick-idle quiescent state visible to other CPUs, which will in
  * some cases serve for expedited as well as normal grace periods.
  * Either way, register a lightweight quiescent state.
- *
- * The barrier() calls are redundant in the common case when this is
- * called externally, but just in case this is called from within this
- * file.
- *
  */
 void rcu_all_qs(void)
 {
@@ -847,14 +840,12 @@ void rcu_all_qs(void)
 		return;
 	}
 	this_cpu_write(rcu_data.rcu_urgent_qs, false);
-	barrier(); /* Avoid RCU read-side critical sections leaking down. */
 	if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) {
 		local_irq_save(flags);
 		rcu_momentary_dyntick_idle();
 		local_irq_restore(flags);
 	}
 	rcu_qs();
-	barrier(); /* Avoid RCU read-side critical sections leaking up. */
 	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(rcu_all_qs);
@@ -864,7 +855,6 @@ EXPORT_SYMBOL_GPL(rcu_all_qs);
  */
 void rcu_note_context_switch(bool preempt)
 {
-	barrier(); /* Avoid RCU read-side critical sections leaking down. */
 	trace_rcu_utilization(TPS("Start context switch"));
 	rcu_qs();
 	/* Load rcu_urgent_qs before other flags. */
@@ -877,7 +867,6 @@ void rcu_note_context_switch(bool preempt)
 		rcu_tasks_qs(current);
 out:
 	trace_rcu_utilization(TPS("End context switch"));
-	barrier(); /* Avoid RCU read-side critical sections leaking up. */
 }
 EXPORT_SYMBOL_GPL(rcu_note_context_switch);
 


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH RFC tip/core/rcu] Restore barrier() to rcu_read_lock() and rcu_read_unlock()
  2019-06-06 13:19 [PATCH RFC tip/core/rcu] Restore barrier() to rcu_read_lock() and rcu_read_unlock() Paul E. McKenney
@ 2019-06-06 13:42 ` Herbert Xu
  2019-06-06 13:51   ` Paul E. McKenney
  0 siblings, 1 reply; 3+ messages in thread
From: Herbert Xu @ 2019-06-06 13:42 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, torvalds

On Thu, Jun 06, 2019 at 06:19:33AM -0700, Paul E. McKenney wrote:
> Commit bb73c52bad36 ("rcu: Don't disable preemption for Tiny and Tree
> RCU readers") removed the barrier() calls from rcu_read_lock() and
> rcu_write_lock() in CONFIG_PREEMPT=n&&CONFIG_PREEMPT_COUNT=n kernels.
> Within RCU, this commit was OK, but it failed to account for things like
> get_user() that can pagefault and that can be reordered by the compiler.
> Lack of the barrier() calls in rcu_read_lock() and rcu_read_unlock()
> can cause these page faults to migrate into RCU read-side critical
> sections, which in CONFIG_PREEMPT=n kernels could result in too-short
> grace periods and arbitrary misbehavior.  Please see commit 386afc91144b
> ("spinlocks and preemption points need to be at least compiler barriers")
> for more details.
> 
> This commit therefore restores the barrier() call to both rcu_read_lock()
> and rcu_read_unlock().  It also removes them from places in the RCU update
> machinery that used to need compensatory barrier() calls, effectively
> reverting commit bb73c52bad36 ("rcu: Don't disable preemption for Tiny
> and Tree RCU readers").
> 
> Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>

Paul, Linus has already commited his patch:

commit 66be4e66a7f422128748e3c3ef6ee72b20a6197b
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Mon Jun 3 13:26:20 2019 -0700

    rcu: locking and unlocking need to always be at least barriers

So you'll need to rebase this.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH RFC tip/core/rcu] Restore barrier() to rcu_read_lock() and rcu_read_unlock()
  2019-06-06 13:42 ` Herbert Xu
@ 2019-06-06 13:51   ` Paul E. McKenney
  0 siblings, 0 replies; 3+ messages in thread
From: Paul E. McKenney @ 2019-06-06 13:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: rcu, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
	edumazet, fweisbec, oleg, joel, torvalds

On Thu, Jun 06, 2019 at 09:42:33PM +0800, Herbert Xu wrote:
> On Thu, Jun 06, 2019 at 06:19:33AM -0700, Paul E. McKenney wrote:
> > Commit bb73c52bad36 ("rcu: Don't disable preemption for Tiny and Tree
> > RCU readers") removed the barrier() calls from rcu_read_lock() and
> > rcu_write_lock() in CONFIG_PREEMPT=n&&CONFIG_PREEMPT_COUNT=n kernels.
> > Within RCU, this commit was OK, but it failed to account for things like
> > get_user() that can pagefault and that can be reordered by the compiler.
> > Lack of the barrier() calls in rcu_read_lock() and rcu_read_unlock()
> > can cause these page faults to migrate into RCU read-side critical
> > sections, which in CONFIG_PREEMPT=n kernels could result in too-short
> > grace periods and arbitrary misbehavior.  Please see commit 386afc91144b
> > ("spinlocks and preemption points need to be at least compiler barriers")
> > for more details.
> > 
> > This commit therefore restores the barrier() call to both rcu_read_lock()
> > and rcu_read_unlock().  It also removes them from places in the RCU update
> > machinery that used to need compensatory barrier() calls, effectively
> > reverting commit bb73c52bad36 ("rcu: Don't disable preemption for Tiny
> > and Tree RCU readers").
> > 
> > Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
> > Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> 
> Paul, Linus has already commited his patch:
> 
> commit 66be4e66a7f422128748e3c3ef6ee72b20a6197b
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Mon Jun 3 13:26:20 2019 -0700
> 
>     rcu: locking and unlocking need to always be at least barriers
> 
> So you'll need to rebase this.

Thank you for letting me know.  Easy enough to do, just remove those
two hunks from my patch.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-06-06 13:52 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-06 13:19 [PATCH RFC tip/core/rcu] Restore barrier() to rcu_read_lock() and rcu_read_unlock() Paul E. McKenney
2019-06-06 13:42 ` Herbert Xu
2019-06-06 13:51   ` Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.