[PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching
@ 2017-09-23 20:56 Steven Rostedt
  2017-09-23 20:56 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Steven Rostedt @ 2017-09-23 20:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney

Linus,

Stack tracing and RCU has been having issues with each other and lockdep
has been pointing out constant problems. The changes have been going into
the stack tracer, but it has been discovered that the problem isn't
with the stack tracer itself, but it is with calling save_stack_trace()
from within the internals of RCU. The stack tracer is the one that
can trigger the issue the easiest, but examining the problem further,
it could also happen from a WARN() in the wrong place, or even if
an NMI happened in this area and it did an rcu_read_lock().

The critical area is where RCU is not watching. Which can happen while
going to and from idle, or bringing up or taking down a CPU.

The final fix was to put the protection in kernel_text_address() as it
is the one that requires RCU to be watching while doing the stack trace.

To make this work properly, Paul had to allow rcu_irq_enter() happen after
rcu_nmi_enter(). This should have been done anyway, since an NMI can
page fault (reading vmalloc area), and a page fault triggers rcu_irq_enter().

One patch is just a consolidation of code so that the fix only needed
to be done in one location.

Please pull the latest trace-v4.14-rc1-2 tree, which can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-v4.14-rc1-2

Tag SHA1: cdd226e99fbb8f22e5317e7968b37387108ce568
Head SHA1: 15516c89acce948debc4c598e03c3fee53045797

Paul E. McKenney (1):
      rcu: Allow for page faults in NMI handlers

Steven Rostedt (VMware) (3):
      extable: Consolidate *kernel_text_address() functions
      extable: Enable RCU if it is not watching in kernel_text_address()
      tracing: Remove RCU work arounds from stack tracer

----
 kernel/extable.c           | 45 +++++++++++++++++++++++++++++++--------------
 kernel/rcu/tree.c          | 10 ++++++++++
 kernel/trace/trace_stack.c | 15 ---------------
 3 files changed, 41 insertions(+), 29 deletions(-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
@ 2017-09-23 20:56 ` Steven Rostedt
  2017-09-24 19:42   ` Linus Torvalds
  2017-09-23 20:56 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2017-09-23 20:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney, stable

[-- Attachment #1: 0001-rcu-Allow-for-page-faults-in-NMI-handlers.patch --]
[-- Type: text/plain, Size: 2269 bytes --]

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

A number of architecture invoke rcu_irq_enter() on exception entry in
order to allow RCU read-side critical sections in the exception handler
when the exception is from an idle or nohz_full CPU.  This works, at
least unless the exception happens in an NMI handler.  In that case,
rcu_nmi_enter() would already have exited the extended quiescent state,
which would mean that rcu_irq_enter() would (incorrectly) cause RCU
to think that it is again in an extended quiescent state.  This will
in turn result in lockdep splats in response to later RCU read-side
critical sections.

This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
take no action if there is an rcu_nmi_enter() in effect, thus avoiding
the unscheduled return to RCU quiescent state.  This in turn should
make the kernel safe for on-demand RCU voyeurism.

Link: http://lkml.kernel.org/r/20170922211022.GA18084@linux.vnet.ibm.com

Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/rcu/tree.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 51d4c3acf32d..63bee8e1b193 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -888,6 +888,11 @@ void rcu_irq_exit(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (READ_ONCE(rdtp->dynticks_nmi_nesting))
+		return;
+
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
 		     rdtp->dynticks_nesting < 1);
 	if (rdtp->dynticks_nesting <= 1) {
@@ -1020,6 +1025,11 @@ void rcu_irq_enter(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_enter() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (READ_ONCE(rdtp->dynticks_nmi_nesting))
+		return;
+
 	oldval = rdtp->dynticks_nesting;
 	rdtp->dynticks_nesting++;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] extable: Consolidate *kernel_text_address() functions
  2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
  2017-09-23 20:56 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
@ 2017-09-23 20:56 ` Steven Rostedt
  2017-09-23 20:56 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
  2017-09-23 20:56 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
  3 siblings, 0 replies; 17+ messages in thread
From: Steven Rostedt @ 2017-09-23 20:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney, stable

[-- Attachment #1: 0002-extable-Consolidate-kernel_text_address-functions.patch --]
[-- Type: text/plain, Size: 1493 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The functionality between kernel_text_address() and _kernel_text_address()
is the same except that _kernel_text_address() does a little more (that
function needs a rename, but that can be done another time). Instead of
having duplicate code in both, simply have _kernel_text_address() calls
kernel_text_address() instead.

This is marked for stable because there's an RCU bug that can happen if
one of these functions gets called while RCU is not watching. That fix
depends on this fix to keep from having to write the fix twice.

Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/extable.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/kernel/extable.c b/kernel/extable.c
index 38c2412401a1..a7024a494faf 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -102,15 +102,7 @@ int core_kernel_data(unsigned long addr)

 int __kernel_text_address(unsigned long addr)
 {
-	if (core_kernel_text(addr))
-		return 1;
-	if (is_module_text_address(addr))
-		return 1;
-	if (is_ftrace_trampoline(addr))
-		return 1;
-	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
-		return 1;
-	if (is_bpf_text_address(addr))
+	if (kernel_text_address(addr))
 		return 1;
 	/*
 	 * There might be init symbols in saved stacktraces.
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
  2017-09-23 20:56 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
  2017-09-23 20:56 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
@ 2017-09-23 20:56 ` Steven Rostedt
  2017-09-23 20:56 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
  3 siblings, 0 replies; 17+ messages in thread
From: Steven Rostedt @ 2017-09-23 20:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney, stable

[-- Attachment #1: 0003-extable-Enable-RCU-if-it-is-not-watching-in-kernel_t.patch --]
[-- Type: text/plain, Size: 2290 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

If kernel_text_address() is called when RCU is not watching, it can cause an
RCU bug because is_module_text_address(), the is_kprobe_*insn_slot()
and is_bpf_text_address() functions require the use of RCU.

Only enable RCU if it is not currently watching before it calls
is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
because kernel_text_address() can happen pretty much anywhere (like an NMI),
and even from within an NMI. It is called via save_stack_trace() that can be
called by any WARN() or tracing function, which can happen while RCU is not
watching (for example, going to or coming from idle, or during CPU take down
or bring up).

Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/extable.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/kernel/extable.c b/kernel/extable.c
index a7024a494faf..9aa1cc41ecf7 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -119,17 +119,42 @@ int __kernel_text_address(unsigned long addr)

 int kernel_text_address(unsigned long addr)
 {
+	bool no_rcu;
+	int ret = 1;
+
 	if (core_kernel_text(addr))
 		return 1;
+
+	/*
+	 * If a stack dump happens while RCU is not watching, then
+	 * RCU needs to be notified that it requires to start
+	 * watching again. This can happen either by tracing that
+	 * triggers a stack trace, or a WARN() that happens during
+	 * coming back from idle, or cpu on or offlining.
+	 *
+	 * is_module_text_address() as well as the kprobe slots
+	 * and is_bpf_text_address() require RCU to be watching.
+	 */
+	no_rcu = !rcu_is_watching();
+
+	/* Treat this like an NMI as it can happen anywhere */
+	if (no_rcu)
+		rcu_nmi_enter();
+
 	if (is_module_text_address(addr))
-		return 1;
+		goto out;
 	if (is_ftrace_trampoline(addr))
-		return 1;
+		goto out;
 	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
-		return 1;
+		goto out;
 	if (is_bpf_text_address(addr))
-		return 1;
-	return 0;
+		goto out;
+	ret = 0;
+out:
+	if (no_rcu)
+		rcu_nmi_exit();
+
+	return ret;
 }

 /*
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
  2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
                   ` (2 preceding siblings ...)
  2017-09-23 20:56 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
@ 2017-09-23 20:56 ` Steven Rostedt
  3 siblings, 0 replies; 17+ messages in thread
From: Steven Rostedt @ 2017-09-23 20:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney, stable

[-- Attachment #1: 0004-tracing-Remove-RCU-work-arounds-from-stack-tracer.patch --]
[-- Type: text/plain, Size: 1826 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Currently the stack tracer calls rcu_irq_enter() to make sure RCU
is watching when it records a stack trace. But if the stack tracer
is triggered while tracing inside of a rcu_irq_enter(), calling
rcu_irq_enter() unconditionally can be problematic.

The reason for having rcu_irq_enter() in the first place has been
fixed from within the saving of the stack trace code, and there's no
reason for doing it in the stack tracer itself. Just remove it.

Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index a4df67cbc711..49cb41412eec 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -96,23 +96,9 @@ check_stack(unsigned long ip, unsigned long *stack)
 	if (in_nmi())
 		return;
 
-	/*
-	 * There's a slight chance that we are tracing inside the
-	 * RCU infrastructure, and rcu_irq_enter() will not work
-	 * as expected.
-	 */
-	if (unlikely(rcu_irq_enter_disabled()))
-		return;
-
 	local_irq_save(flags);
 	arch_spin_lock(&stack_trace_max_lock);
 
-	/*
-	 * RCU may not be watching, make it see us.
-	 * The stack trace code uses rcu_sched.
-	 */
-	rcu_irq_enter();
-
 	/* In case another CPU set the tracer_frame on us */
 	if (unlikely(!frame_size))
 		this_size -= tracer_frame;
@@ -205,7 +191,6 @@ check_stack(unsigned long ip, unsigned long *stack)
 	}
 
  out:
-	rcu_irq_exit();
 	arch_spin_unlock(&stack_trace_max_lock);
 	local_irq_restore(flags);
 }
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-23 20:56 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
@ 2017-09-24 19:42   ` Linus Torvalds
  2017-09-25  0:03     ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2017-09-24 19:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linux Kernel Mailing List, Ingo Molnar, Andrew Morton,
	Paul E. McKenney, stable

On Sat, Sep 23, 2017 at 1:56 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> +
> +       /* Page faults can happen in NMI handlers, so check... */
> +       if (READ_ONCE(rdtp->dynticks_nmi_nesting))
> +               return;
> +

What is the reason for the READ_ONCE() here (and in the other case)?

It doesn't seem to have any actual reason.  It's a "stable" per-cpu
value in that even if an NMI were to happen, it gets incremented and
then decremented, so there is nothing really volatile about it
anywhere that I can see.

So the READ_ONCE() seems to be just pure confusion.

What am I missing?

               Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-24 19:42   ` Linus Torvalds
@ 2017-09-25  0:03     ` Paul E. McKenney
  2017-09-25  0:12       ` Linus Torvalds
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2017-09-25  0:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Linux Kernel Mailing List, Ingo Molnar,
	Andrew Morton, stable

On Sun, Sep 24, 2017 at 12:42:32PM -0700, Linus Torvalds wrote:
> On Sat, Sep 23, 2017 at 1:56 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > +
> > +       /* Page faults can happen in NMI handlers, so check... */
> > +       if (READ_ONCE(rdtp->dynticks_nmi_nesting))
> > +               return;
> > +
> 
> What is the reason for the READ_ONCE() here (and in the other case)?
> 
> It doesn't seem to have any actual reason.  It's a "stable" per-cpu
> value in that even if an NMI were to happen, it gets incremented and
> then decremented, so there is nothing really volatile about it
> anywhere that I can see.
> 
> So the READ_ONCE() seems to be just pure confusion.
> 
> What am I missing?

Mostly just paranoia on my part.  I would be happy to remove it if
you prefer.  Or you or Steve can do so if that is more convenient.

And yes, consistency would dictate that the uses in rcu_nmi_enter()
and rcu_nmi_exit() should be _ONCE(), particularly the stores to
->dynticks_nmi_nesting.  But I am not too worried about that right now
because I suspect that I should be able to combine rcu_irq_{enter,exit}()
and rcu_nmi_{enter,exit}(), which would be a good simplification.

							Thanx, aul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-25  0:03     ` Paul E. McKenney
@ 2017-09-25  0:12       ` Linus Torvalds
  2017-09-25  0:26         ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2017-09-25  0:12 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Steven Rostedt, Linux Kernel Mailing List, Ingo Molnar,
	Andrew Morton, stable

On Sun, Sep 24, 2017 at 5:03 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
>
> Mostly just paranoia on my part.  I would be happy to remove it if
> you prefer.  Or you or Steve can do so if that is more convenient.

I really don't think it's warranted. The values are *stable*. There's
no subtle lack of locking, or some optimistic access to a value that
can change.

The compiler can generate code to read the value fifteen billion
times, and it will always get the same value.

Yes, maybe in between the different accesses, an NMI will happen, and
the value will be incremented, but then as the NMI exits, it will
decrement again, so the code that got interrupted will not actually
see the change.

So the READ_ONCE() isn't "paranoia". It's just confusing.

> And yes, consistency would dictate that the uses in rcu_nmi_enter()
> and rcu_nmi_exit() should be _ONCE(), particularly the stores to
> ->dynticks_nmi_nesting.

NO.

That would be just more of that confusion.

That value is STABLE. It's stable even within an NMI handler. The NMI
code can read it, modify it, write it back, do a little dance, all
without having to care. There's no "_ONCE()" about it - not for the
readers, not for the writers, not for _anybody_.

So adding even more READ/WRITE_ONCE() accesses wouldn't be
"consistent", it would just be insanity.

Now, if an NMI happens and the value would be different on entry than
it is on exit, that would be something else. Then it really wouldn't
be stable wrt random users. But that would also be a major bug in the
NMI handler, as far as I can tell.

So the reason I'm objecting to that READ_ONCE() is that it isn't
"paranoia", it's "voodoo programming". And we don't do voodoo
programming.

            Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-25  0:12       ` Linus Torvalds
@ 2017-09-25  0:26         ` Paul E. McKenney
  2017-09-25  0:34           ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2017-09-25  0:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Linux Kernel Mailing List, Ingo Molnar,
	Andrew Morton, stable

On Sun, Sep 24, 2017 at 05:12:13PM -0700, Linus Torvalds wrote:
> On Sun, Sep 24, 2017 at 5:03 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> >
> > Mostly just paranoia on my part.  I would be happy to remove it if
> > you prefer.  Or you or Steve can do so if that is more convenient.
> 
> I really don't think it's warranted. The values are *stable*. There's
> no subtle lack of locking, or some optimistic access to a value that
> can change.
> 
> The compiler can generate code to read the value fifteen billion
> times, and it will always get the same value.
> 
> Yes, maybe in between the different accesses, an NMI will happen, and
> the value will be incremented, but then as the NMI exits, it will
> decrement again, so the code that got interrupted will not actually
> see the change.
> 
> So the READ_ONCE() isn't "paranoia". It's just confusing.
> 
> > And yes, consistency would dictate that the uses in rcu_nmi_enter()
> > and rcu_nmi_exit() should be _ONCE(), particularly the stores to
> > ->dynticks_nmi_nesting.
> 
> NO.
> 
> That would be just more of that confusion.
> 
> That value is STABLE. It's stable even within an NMI handler. The NMI
> code can read it, modify it, write it back, do a little dance, all
> without having to care. There's no "_ONCE()" about it - not for the
> readers, not for the writers, not for _anybody_.
> 
> So adding even more READ/WRITE_ONCE() accesses wouldn't be
> "consistent", it would just be insanity.
> 
> Now, if an NMI happens and the value would be different on entry than
> it is on exit, that would be something else. Then it really wouldn't
> be stable wrt random users. But that would also be a major bug in the
> NMI handler, as far as I can tell.
> 
> So the reason I'm objecting to that READ_ONCE() is that it isn't
> "paranoia", it's "voodoo programming". And we don't do voodoo
> programming.

I already agreed that the READ_ONCE() can be removed.

But without the WRITE_ONCE(), the compiler could theoretically tear
the store.  Now we might be asserting that our compilers don't do that,
and that if they ever do, we will file a bug or whatever.

So are we asserting that our compilers won't ever do store tearing?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-25  0:26         ` Paul E. McKenney
@ 2017-09-25  0:34           ` Paul E. McKenney
  2017-09-25  4:41             ` Steven Rostedt
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2017-09-25  0:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Linux Kernel Mailing List, Ingo Molnar,
	Andrew Morton, stable

On Sun, Sep 24, 2017 at 05:26:53PM -0700, Paul E. McKenney wrote:
> On Sun, Sep 24, 2017 at 05:12:13PM -0700, Linus Torvalds wrote:
> > On Sun, Sep 24, 2017 at 5:03 PM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > Mostly just paranoia on my part.  I would be happy to remove it if
> > > you prefer.  Or you or Steve can do so if that is more convenient.
> > 
> > I really don't think it's warranted. The values are *stable*. There's
> > no subtle lack of locking, or some optimistic access to a value that
> > can change.
> > 
> > The compiler can generate code to read the value fifteen billion
> > times, and it will always get the same value.
> > 
> > Yes, maybe in between the different accesses, an NMI will happen, and
> > the value will be incremented, but then as the NMI exits, it will
> > decrement again, so the code that got interrupted will not actually
> > see the change.
> > 
> > So the READ_ONCE() isn't "paranoia". It's just confusing.
> > 
> > > And yes, consistency would dictate that the uses in rcu_nmi_enter()
> > > and rcu_nmi_exit() should be _ONCE(), particularly the stores to
> > > ->dynticks_nmi_nesting.
> > 
> > NO.
> > 
> > That would be just more of that confusion.
> > 
> > That value is STABLE. It's stable even within an NMI handler. The NMI
> > code can read it, modify it, write it back, do a little dance, all
> > without having to care. There's no "_ONCE()" about it - not for the
> > readers, not for the writers, not for _anybody_.
> > 
> > So adding even more READ/WRITE_ONCE() accesses wouldn't be
> > "consistent", it would just be insanity.
> > 
> > Now, if an NMI happens and the value would be different on entry than
> > it is on exit, that would be something else. Then it really wouldn't
> > be stable wrt random users. But that would also be a major bug in the
> > NMI handler, as far as I can tell.
> > 
> > So the reason I'm objecting to that READ_ONCE() is that it isn't
> > "paranoia", it's "voodoo programming". And we don't do voodoo
> > programming.
> 
> I already agreed that the READ_ONCE() can be removed.

And for whatever it is worth, here is the updated patch.

							Thanx, Paul

------------------------------------------------------------------------

commit 3e2baa988b9c13095995c46c51e0e32c0b6a7d43
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Fri Sep 22 13:14:42 2017 -0700

    rcu: Allow for page faults in NMI handlers
    
    A number of architecture invoke rcu_irq_enter() on exception entry in
    order to allow RCU read-side critical sections in the exception handler
    when the exception is from an idle or nohz_full CPU.  This works, at
    least unless the exception happens in an NMI handler.  In that case,
    rcu_nmi_enter() would already have exited the extended quiescent state,
    which would mean that rcu_irq_enter() would (incorrectly) cause RCU
    to think that it is again in an extended quiescent state.  This will
    in turn result in lockdep splats in response to later RCU read-side
    critical sections.
    
    This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
    take no action if there is an rcu_nmi_enter() in effect, thus avoiding
    the unscheduled return to RCU quiescent state.  This in turn should
    make the kernel safe for on-demand RCU voyeurism.
    
    Reported-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    [ paulmck: Remove READ_ONCE() per Linux Torvalds feedback. ]

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index db5eb8c3f7af..e4fe06d42385 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -891,6 +891,11 @@ void rcu_irq_exit(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (rdtp->dynticks_nmi_nesting)
+		return;
+
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
 		     rdtp->dynticks_nesting < 1);
 	if (rdtp->dynticks_nesting <= 1) {
@@ -1036,6 +1041,11 @@ void rcu_irq_enter(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_enter() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (rdtp->dynticks_nmi_nesting)
+		return;
+
 	oldval = rdtp->dynticks_nesting;
 	rdtp->dynticks_nesting++;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-25  0:34           ` Paul E. McKenney
@ 2017-09-25  4:41             ` Steven Rostedt
  2017-09-25  4:56               ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2017-09-25  4:41 UTC (permalink / raw)
  To: paulmck, Paul E. McKenney, Linus Torvalds
  Cc: Linux Kernel Mailing List, Ingo Molnar, Andrew Morton, stable

Sorry for the top post, currently on a train to Paris.

This series already went through all my testing, and I would hate to rebase it for this reason. Can you just add a patch to remove the READ_ONCE()s?

Thanks,

-- Steve


On September 25, 2017 2:34:56 AM GMT+02:00, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>On Sun, Sep 24, 2017 at 05:26:53PM -0700, Paul E. McKenney wrote:
>> On Sun, Sep 24, 2017 at 05:12:13PM -0700, Linus Torvalds wrote:
>> > On Sun, Sep 24, 2017 at 5:03 PM, Paul E. McKenney
>> > <paulmck@linux.vnet.ibm.com> wrote:
>> > >
>> > > Mostly just paranoia on my part.  I would be happy to remove it
>if
>> > > you prefer.  Or you or Steve can do so if that is more
>convenient.
>> > 
>> > I really don't think it's warranted. The values are *stable*.
>There's
>> > no subtle lack of locking, or some optimistic access to a value
>that
>> > can change.
>> > 
>> > The compiler can generate code to read the value fifteen billion
>> > times, and it will always get the same value.
>> > 
>> > Yes, maybe in between the different accesses, an NMI will happen,
>and
>> > the value will be incremented, but then as the NMI exits, it will
>> > decrement again, so the code that got interrupted will not actually
>> > see the change.
>> > 
>> > So the READ_ONCE() isn't "paranoia". It's just confusing.
>> > 
>> > > And yes, consistency would dictate that the uses in
>rcu_nmi_enter()
>> > > and rcu_nmi_exit() should be _ONCE(), particularly the stores to
>> > > ->dynticks_nmi_nesting.
>> > 
>> > NO.
>> > 
>> > That would be just more of that confusion.
>> > 
>> > That value is STABLE. It's stable even within an NMI handler. The
>NMI
>> > code can read it, modify it, write it back, do a little dance, all
>> > without having to care. There's no "_ONCE()" about it - not for the
>> > readers, not for the writers, not for _anybody_.
>> > 
>> > So adding even more READ/WRITE_ONCE() accesses wouldn't be
>> > "consistent", it would just be insanity.
>> > 
>> > Now, if an NMI happens and the value would be different on entry
>than
>> > it is on exit, that would be something else. Then it really
>wouldn't
>> > be stable wrt random users. But that would also be a major bug in
>the
>> > NMI handler, as far as I can tell.
>> > 
>> > So the reason I'm objecting to that READ_ONCE() is that it isn't
>> > "paranoia", it's "voodoo programming". And we don't do voodoo
>> > programming.
>> 
>> I already agreed that the READ_ONCE() can be removed.
>
>And for whatever it is worth, here is the updated patch.
>
>							Thanx, Paul
>
>------------------------------------------------------------------------
>
>commit 3e2baa988b9c13095995c46c51e0e32c0b6a7d43
>Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>Date:   Fri Sep 22 13:14:42 2017 -0700
>
>    rcu: Allow for page faults in NMI handlers
>    
>  A number of architecture invoke rcu_irq_enter() on exception entry in
>order to allow RCU read-side critical sections in the exception handler
>   when the exception is from an idle or nohz_full CPU.  This works, at
>   least unless the exception happens in an NMI handler.  In that case,
>rcu_nmi_enter() would already have exited the extended quiescent state,
>    which would mean that rcu_irq_enter() would (incorrectly) cause RCU
>   to think that it is again in an extended quiescent state.  This will
>    in turn result in lockdep splats in response to later RCU read-side
>    critical sections.
>    
>    This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
> take no action if there is an rcu_nmi_enter() in effect, thus avoiding
>    the unscheduled return to RCU quiescent state.  This in turn should
>    make the kernel safe for on-demand RCU voyeurism.
>    
>    Reported-by: Steven Rostedt <rostedt@goodmis.org>
>    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>    [ paulmck: Remove READ_ONCE() per Linux Torvalds feedback. ]
>
>diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>index db5eb8c3f7af..e4fe06d42385 100644
>--- a/kernel/rcu/tree.c
>+++ b/kernel/rcu/tree.c
>@@ -891,6 +891,11 @@ void rcu_irq_exit(void)
> 
>	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs
>enabled!!!");
> 	rdtp = this_cpu_ptr(&rcu_dynticks);
>+
>+	/* Page faults can happen in NMI handlers, so check... */
>+	if (rdtp->dynticks_nmi_nesting)
>+		return;
>+
> 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> 		     rdtp->dynticks_nesting < 1);
> 	if (rdtp->dynticks_nesting <= 1) {
>@@ -1036,6 +1041,11 @@ void rcu_irq_enter(void)
> 
>	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_enter() invoked with irqs
>enabled!!!");
> 	rdtp = this_cpu_ptr(&rcu_dynticks);
>+
>+	/* Page faults can happen in NMI handlers, so check... */
>+	if (rdtp->dynticks_nmi_nesting)
>+		return;
>+
> 	oldval = rdtp->dynticks_nesting;
> 	rdtp->dynticks_nesting++;
> 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-25  4:41             ` Steven Rostedt
@ 2017-09-25  4:56               ` Paul E. McKenney
  2017-09-26  3:19                 ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2017-09-25  4:56 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Linux Kernel Mailing List, Ingo Molnar,
	Andrew Morton, stable

On Mon, Sep 25, 2017 at 06:41:30AM +0200, Steven Rostedt wrote:
> Sorry for the top post, currently on a train to Paris.
> 
> This series already went through all my testing, and I would hate to rebase it for this reason. Can you just add a patch to remove the READ_ONCE()s?

If Linus accepts the original series, easy enough.

							Thanx, Paul

> Thanks,
> 
> -- Steve
> 
> 
> On September 25, 2017 2:34:56 AM GMT+02:00, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >On Sun, Sep 24, 2017 at 05:26:53PM -0700, Paul E. McKenney wrote:
> >> On Sun, Sep 24, 2017 at 05:12:13PM -0700, Linus Torvalds wrote:
> >> > On Sun, Sep 24, 2017 at 5:03 PM, Paul E. McKenney
> >> > <paulmck@linux.vnet.ibm.com> wrote:
> >> > >
> >> > > Mostly just paranoia on my part.  I would be happy to remove it
> >if
> >> > > you prefer.  Or you or Steve can do so if that is more
> >convenient.
> >> > 
> >> > I really don't think it's warranted. The values are *stable*.
> >There's
> >> > no subtle lack of locking, or some optimistic access to a value
> >that
> >> > can change.
> >> > 
> >> > The compiler can generate code to read the value fifteen billion
> >> > times, and it will always get the same value.
> >> > 
> >> > Yes, maybe in between the different accesses, an NMI will happen,
> >and
> >> > the value will be incremented, but then as the NMI exits, it will
> >> > decrement again, so the code that got interrupted will not actually
> >> > see the change.
> >> > 
> >> > So the READ_ONCE() isn't "paranoia". It's just confusing.
> >> > 
> >> > > And yes, consistency would dictate that the uses in
> >rcu_nmi_enter()
> >> > > and rcu_nmi_exit() should be _ONCE(), particularly the stores to
> >> > > ->dynticks_nmi_nesting.
> >> > 
> >> > NO.
> >> > 
> >> > That would be just more of that confusion.
> >> > 
> >> > That value is STABLE. It's stable even within an NMI handler. The
> >NMI
> >> > code can read it, modify it, write it back, do a little dance, all
> >> > without having to care. There's no "_ONCE()" about it - not for the
> >> > readers, not for the writers, not for _anybody_.
> >> > 
> >> > So adding even more READ/WRITE_ONCE() accesses wouldn't be
> >> > "consistent", it would just be insanity.
> >> > 
> >> > Now, if an NMI happens and the value would be different on entry
> >than
> >> > it is on exit, that would be something else. Then it really
> >wouldn't
> >> > be stable wrt random users. But that would also be a major bug in
> >the
> >> > NMI handler, as far as I can tell.
> >> > 
> >> > So the reason I'm objecting to that READ_ONCE() is that it isn't
> >> > "paranoia", it's "voodoo programming". And we don't do voodoo
> >> > programming.
> >> 
> >> I already agreed that the READ_ONCE() can be removed.
> >
> >And for whatever it is worth, here is the updated patch.
> >
> >							Thanx, Paul
> >
> >------------------------------------------------------------------------
> >
> >commit 3e2baa988b9c13095995c46c51e0e32c0b6a7d43
> >Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >Date:   Fri Sep 22 13:14:42 2017 -0700
> >
> >    rcu: Allow for page faults in NMI handlers
> >    
> >  A number of architecture invoke rcu_irq_enter() on exception entry in
> >order to allow RCU read-side critical sections in the exception handler
> >   when the exception is from an idle or nohz_full CPU.  This works, at
> >   least unless the exception happens in an NMI handler.  In that case,
> >rcu_nmi_enter() would already have exited the extended quiescent state,
> >    which would mean that rcu_irq_enter() would (incorrectly) cause RCU
> >   to think that it is again in an extended quiescent state.  This will
> >    in turn result in lockdep splats in response to later RCU read-side
> >    critical sections.
> >    
> >    This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
> > take no action if there is an rcu_nmi_enter() in effect, thus avoiding
> >    the unscheduled return to RCU quiescent state.  This in turn should
> >    make the kernel safe for on-demand RCU voyeurism.
> >    
> >    Reported-by: Steven Rostedt <rostedt@goodmis.org>
> >    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >    [ paulmck: Remove READ_ONCE() per Linux Torvalds feedback. ]
> >
> >diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> >index db5eb8c3f7af..e4fe06d42385 100644
> >--- a/kernel/rcu/tree.c
> >+++ b/kernel/rcu/tree.c
> >@@ -891,6 +891,11 @@ void rcu_irq_exit(void)
> > 
> >	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs
> >enabled!!!");
> > 	rdtp = this_cpu_ptr(&rcu_dynticks);
> >+
> >+	/* Page faults can happen in NMI handlers, so check... */
> >+	if (rdtp->dynticks_nmi_nesting)
> >+		return;
> >+
> > 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> > 		     rdtp->dynticks_nesting < 1);
> > 	if (rdtp->dynticks_nesting <= 1) {
> >@@ -1036,6 +1041,11 @@ void rcu_irq_enter(void)
> > 
> >	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_enter() invoked with irqs
> >enabled!!!");
> > 	rdtp = this_cpu_ptr(&rcu_dynticks);
> >+
> >+	/* Page faults can happen in NMI handlers, so check... */
> >+	if (rdtp->dynticks_nmi_nesting)
> >+		return;
> >+
> > 	oldval = rdtp->dynticks_nesting;
> > 	rdtp->dynticks_nesting++;
> > 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> 
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-25  4:56               ` Paul E. McKenney
@ 2017-09-26  3:19                 ` Paul E. McKenney
  0 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2017-09-26  3:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Linux Kernel Mailing List, Ingo Molnar,
	Andrew Morton, stable

On Sun, Sep 24, 2017 at 09:56:32PM -0700, Paul E. McKenney wrote:
> On Mon, Sep 25, 2017 at 06:41:30AM +0200, Steven Rostedt wrote:
> > Sorry for the top post, currently on a train to Paris.
> > 
> > This series already went through all my testing, and I would hate to rebase it for this reason. Can you just add a patch to remove the READ_ONCE()s?
> 
> If Linus accepts the original series, easy enough.

And he did, so here is the READ_ONCE()-removal commit that I have queued
for the next merge window.  If anyone feels it is needed sooner, please
let me know.  (Can't see why it is urgent myself, but who knows...)

							Thanx, Paul

------------------------------------------------------------------------

commit 79e6337d3f3a629be48cd45d5075d058788ce90f
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Sep 25 20:07:49 2017 -0700

    rcu: Remove extraneous READ_ONCE()s from rcu_irq_{enter,exit}()
    
    The read of ->dynticks_nmi_nesting in rcu_irq_enter() and rcu_irq_exit()
    is currently protected with READ_ONCE().  However, this protection is
    unnecessary because (1) ->dynticks_nmi_nesting is updated only by the
    current CPU, (2) Although NMI handlers can update this field, they reset
    it back to its old value before return, and (3) Interrupts are disabled,
    so nothing else can modify it.  The value of ->dynticks_nmi_nesting is
    thus effectively constant, and so no protection is required.
    
    This commit therefore removes the READ_ONCE() protection from these
    two accesses.
    
    Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 872d20cee00a..e4fe06d42385 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -893,7 +893,7 @@ void rcu_irq_exit(void)
 	rdtp = this_cpu_ptr(&rcu_dynticks);
 
 	/* Page faults can happen in NMI handlers, so check... */
-	if (READ_ONCE(rdtp->dynticks_nmi_nesting))
+	if (rdtp->dynticks_nmi_nesting)
 		return;
 
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
@@ -1043,7 +1043,7 @@ void rcu_irq_enter(void)
 	rdtp = this_cpu_ptr(&rcu_dynticks);
 
 	/* Page faults can happen in NMI handlers, so check... */
-	if (READ_ONCE(rdtp->dynticks_nmi_nesting))
+	if (rdtp->dynticks_nmi_nesting)
 		return;
 
 	oldval = rdtp->dynticks_nesting;

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-22 22:44   ` Paul E. McKenney
@ 2017-09-23  1:09     ` Steven Rostedt
  0 siblings, 0 replies; 17+ messages in thread
From: Steven Rostedt @ 2017-09-23  1:09 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Fri, 22 Sep 2017 15:44:06 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Sep 22, 2017 at 06:15:46PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > If kernel_text_address() is called when RCU is not watching, it can cause an
> > RCU bug because is_module_text_address() and the is_kprobe_*insn_slot()
> > functions require the use of RCU.
> > 
> > Only enable RCU if it is not currently watching before it calls
> > is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
> > because kernel_text_address() can happen pretty much anywhere (like an NMI),
> > and even from within an NMI. It is called via save_stack_trace() that can be
> > called by any WARN() or tracing function, which can happen while RCU is not
> > watching (for example, going to or coming from idle, or during CPU take down
> > or bring up).
> > 
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: stable@vger.kernel.org  
> 
> Do we need something calling out the dependency on the first two patches,
> or is that implied these days?

I believe the stable folks will simply apply patches marked for stable
in the order they are applied. This is a patch series, so I don't think
it will be an issue for them.

> 
> > Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> 
> Give or take Josh's feedback on the commit log and comment:

Yep.

> 
> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
  2017-09-22 22:28   ` Josh Poimboeuf
@ 2017-09-22 22:44   ` Paul E. McKenney
  2017-09-23  1:09     ` Steven Rostedt
  1 sibling, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2017-09-22 22:44 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Fri, Sep 22, 2017 at 06:15:46PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> If kernel_text_address() is called when RCU is not watching, it can cause an
> RCU bug because is_module_text_address() and the is_kprobe_*insn_slot()
> functions require the use of RCU.
> 
> Only enable RCU if it is not currently watching before it calls
> is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
> because kernel_text_address() can happen pretty much anywhere (like an NMI),
> and even from within an NMI. It is called via save_stack_trace() that can be
> called by any WARN() or tracing function, which can happen while RCU is not
> watching (for example, going to or coming from idle, or during CPU take down
> or bring up).
> 
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: stable@vger.kernel.org

Do we need something calling out the dependency on the first two patches,
or is that implied these days?

> Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Give or take Josh's feedback on the commit log and comment:

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  kernel/extable.c | 35 ++++++++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/extable.c b/kernel/extable.c
> index a7024a494faf..89c4668376a6 100644
> --- a/kernel/extable.c
> +++ b/kernel/extable.c
> @@ -119,17 +119,42 @@ int __kernel_text_address(unsigned long addr)
> 
>  int kernel_text_address(unsigned long addr)
>  {
> +	bool no_rcu;
> +	int ret = 1;
> +
>  	if (core_kernel_text(addr))
>  		return 1;
> +
> +	/*
> +	 * If a stack dump happens while RCU is not watching, then
> +	 * RCU needs to be notified that it requires to start
> +	 * watching again. This can happen either by tracing that
> +	 * triggers a stack trace, or a WARN() that happens during
> +	 * coming back from idle, or cpu on or offlining.
> +	 *
> +	 * is_module_text_address() as well as the kprobe slots
> +	 * require RCU to be watching.
> +	 */
> +	no_rcu = !rcu_is_watching();
> +
> +	/* Treat this like an NMI as it can happen anywhere */
> +	if (no_rcu)
> +		rcu_nmi_enter();
> +
>  	if (is_module_text_address(addr))
> -		return 1;
> +		goto out;
>  	if (is_ftrace_trampoline(addr))
> -		return 1;
> +		goto out;
>  	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
> -		return 1;
> +		goto out;
>  	if (is_bpf_text_address(addr))
> -		return 1;
> -	return 0;
> +		goto out;
> +	ret = 0;
> +out:
> +	if (no_rcu)
> +		rcu_nmi_exit();
> +
> +	return ret;
>  }
> 
>  /*
> -- 
> 2.13.2
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
@ 2017-09-22 22:28   ` Josh Poimboeuf
  2017-09-22 22:44   ` Paul E. McKenney
  1 sibling, 0 replies; 17+ messages in thread
From: Josh Poimboeuf @ 2017-09-22 22:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, ngo Molnar, Andrew Morton, Paul E. McKenney, stable

On Fri, Sep 22, 2017 at 06:15:46PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> If kernel_text_address() is called when RCU is not watching, it can cause an
> RCU bug because is_module_text_address() and the is_kprobe_*insn_slot()
> functions require the use of RCU.

is_bpf_text_address() also requires RCU.

> Only enable RCU if it is not currently watching before it calls
> is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
> because kernel_text_address() can happen pretty much anywhere (like an NMI),
> and even from within an NMI. It is called via save_stack_trace() that can be
> called by any WARN() or tracing function, which can happen while RCU is not
> watching (for example, going to or coming from idle, or during CPU take down
> or bring up).
> 
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: stable@vger.kernel.org
> Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/extable.c | 35 ++++++++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/extable.c b/kernel/extable.c
> index a7024a494faf..89c4668376a6 100644
> --- a/kernel/extable.c
> +++ b/kernel/extable.c
> @@ -119,17 +119,42 @@ int __kernel_text_address(unsigned long addr)
>  
>  int kernel_text_address(unsigned long addr)
>  {
> +	bool no_rcu;
> +	int ret = 1;
> +
>  	if (core_kernel_text(addr))
>  		return 1;
> +
> +	/*
> +	 * If a stack dump happens while RCU is not watching, then
> +	 * RCU needs to be notified that it requires to start
> +	 * watching again. This can happen either by tracing that
> +	 * triggers a stack trace, or a WARN() that happens during
> +	 * coming back from idle, or cpu on or offlining.
> +	 *
> +	 * is_module_text_address() as well as the kprobe slots
> +	 * require RCU to be watching.

Ditto here.

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
@ 2017-09-22 22:15 ` Steven Rostedt
  2017-09-22 22:28   ` Josh Poimboeuf
  2017-09-22 22:44   ` Paul E. McKenney
  0 siblings, 2 replies; 17+ messages in thread
From: Steven Rostedt @ 2017-09-22 22:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: ngo Molnar, Andrew Morton, Paul E. McKenney, stable

[-- Attachment #1: 0003-extable-Enable-RCU-if-it-is-not-watching-in-kernel_t.patch --]
[-- Type: text/plain, Size: 2235 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

If kernel_text_address() is called when RCU is not watching, it can cause an
RCU bug because is_module_text_address() and the is_kprobe_*insn_slot()
functions require the use of RCU.

Only enable RCU if it is not currently watching before it calls
is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
because kernel_text_address() can happen pretty much anywhere (like an NMI),
and even from within an NMI. It is called via save_stack_trace() that can be
called by any WARN() or tracing function, which can happen while RCU is not
watching (for example, going to or coming from idle, or during CPU take down
or bring up).

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/extable.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/kernel/extable.c b/kernel/extable.c
index a7024a494faf..89c4668376a6 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -119,17 +119,42 @@ int __kernel_text_address(unsigned long addr)
 
 int kernel_text_address(unsigned long addr)
 {
+	bool no_rcu;
+	int ret = 1;
+
 	if (core_kernel_text(addr))
 		return 1;
+
+	/*
+	 * If a stack dump happens while RCU is not watching, then
+	 * RCU needs to be notified that it requires to start
+	 * watching again. This can happen either by tracing that
+	 * triggers a stack trace, or a WARN() that happens during
+	 * coming back from idle, or cpu on or offlining.
+	 *
+	 * is_module_text_address() as well as the kprobe slots
+	 * require RCU to be watching.
+	 */
+	no_rcu = !rcu_is_watching();
+
+	/* Treat this like an NMI as it can happen anywhere */
+	if (no_rcu)
+		rcu_nmi_enter();
+
 	if (is_module_text_address(addr))
-		return 1;
+		goto out;
 	if (is_ftrace_trampoline(addr))
-		return 1;
+		goto out;
 	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
-		return 1;
+		goto out;
 	if (is_bpf_text_address(addr))
-		return 1;
-	return 0;
+		goto out;
+	ret = 0;
+out:
+	if (no_rcu)
+		rcu_nmi_exit();
+
+	return ret;
 }
 
 /*
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-09-26  3:19 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
2017-09-23 20:56 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
2017-09-24 19:42   ` Linus Torvalds
2017-09-25  0:03     ` Paul E. McKenney
2017-09-25  0:12       ` Linus Torvalds
2017-09-25  0:26         ` Paul E. McKenney
2017-09-25  0:34           ` Paul E. McKenney
2017-09-25  4:41             ` Steven Rostedt
2017-09-25  4:56               ` Paul E. McKenney
2017-09-26  3:19                 ` Paul E. McKenney
2017-09-23 20:56 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
2017-09-23 20:56 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
2017-09-23 20:56 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
  -- strict thread matches above, loose matches on Subject: below --
2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
2017-09-22 22:28   ` Josh Poimboeuf
2017-09-22 22:44   ` Paul E. McKenney
2017-09-23  1:09     ` Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.