All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching
@ 2017-09-22 22:15 Steven Rostedt
  2017-09-22 22:15 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-09-22 22:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: ngo Molnar, Andrew Morton, Paul E. McKenney

I'm currently running this through my test suite, but wanted to get
review or acks, so that I can send this right off to Linus before rc2.
I need these to base my new work on.

Basically if a stack trace happens (say from a WARN()) when RCU is not
watching (like going to or coming from idle, or bringing down or up
a CPU), the stack trace can cause RCU issues because it requires RCU to
be watching.

We use rcu_nmi_enter() for this case, as these cases behave similar to
an NMI (can happen pretty much anywhere).


Paul E. McKenney (1):
      rcu: Allow for page faults in NMI handlers

Steven Rostedt (VMware) (3):
      extable: Consolidate *kernel_text_address() functions
      extable: Enable RCU if it is not watching in kernel_text_address()
      tracing: Remove RCU work arounds from stack tracer

----
 kernel/extable.c           | 45 +++++++++++++++++++++++++++++++--------------
 kernel/rcu/tree.c          | 10 ++++++++++
 kernel/trace/trace_stack.c | 15 ---------------
 3 files changed, 41 insertions(+), 29 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] rcu: Allow for page faults in NMI handlers
  2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
@ 2017-09-22 22:15 ` Steven Rostedt
  2017-09-22 22:15 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-09-22 22:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: ngo Molnar, Andrew Morton, Paul E. McKenney

[-- Attachment #1: 0001-rcu-Allow-for-page-faults-in-NMI-handlers.patch --]
[-- Type: text/plain, Size: 2182 bytes --]

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

A number of architecture invoke rcu_irq_enter() on exception entry in
order to allow RCU read-side critical sections in the exception handler
when the exception is from an idle or nohz_full CPU.  This works, at
least unless the exception happens in an NMI handler.  In that case,
rcu_nmi_enter() would already have exited the extended quiescent state,
which would mean that rcu_irq_enter() would (incorrectly) cause RCU
to think that it is again in an extended quiescent state.  This will
in turn result in lockdep splats in response to later RCU read-side
critical sections.

This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
take no action if there is an rcu_nmi_enter() in effect, thus avoiding
the unscheduled return to RCU quiescent state.  This in turn should
make the kernel safe for on-demand RCU voyeurism.

Link: http://lkml.kernel.org/r/20170922211022.GA18084@linux.vnet.ibm.com

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/rcu/tree.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 51d4c3acf32d..63bee8e1b193 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -888,6 +888,11 @@ void rcu_irq_exit(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (READ_ONCE(rdtp->dynticks_nmi_nesting))
+		return;
+
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
 		     rdtp->dynticks_nesting < 1);
 	if (rdtp->dynticks_nesting <= 1) {
@@ -1020,6 +1025,11 @@ void rcu_irq_enter(void)
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_enter() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/* Page faults can happen in NMI handlers, so check... */
+	if (READ_ONCE(rdtp->dynticks_nmi_nesting))
+		return;
+
 	oldval = rdtp->dynticks_nesting;
 	rdtp->dynticks_nesting++;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] extable: Consolidate *kernel_text_address() functions
  2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
  2017-09-22 22:15 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
@ 2017-09-22 22:15 ` Steven Rostedt
  2017-09-22 22:40   ` Paul E. McKenney
  2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
  2017-09-22 22:15 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
  3 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2017-09-22 22:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: ngo Molnar, Andrew Morton, Paul E. McKenney, stable

[-- Attachment #1: 0002-extable-Consolidate-kernel_text_address-functions.patch --]
[-- Type: text/plain, Size: 1487 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The functionality between kernel_text_address() and _kernel_text_address()
is the same except that _kernel_text_address() does a little more (that
function needs a rename, but that can be done another time). Instead of
having duplicate code in both, simply have _kernel_text_address() calls
kernel_text_address() instead.

This is marked for stable because there's an RCU bug that can happen if
one of these functions gets called while RCU is not watching. That fix
depends on this fix to keep from having to write the fix twice.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/extable.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/kernel/extable.c b/kernel/extable.c
index 38c2412401a1..a7024a494faf 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -102,15 +102,7 @@ int core_kernel_data(unsigned long addr)
 
 int __kernel_text_address(unsigned long addr)
 {
-	if (core_kernel_text(addr))
-		return 1;
-	if (is_module_text_address(addr))
-		return 1;
-	if (is_ftrace_trampoline(addr))
-		return 1;
-	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
-		return 1;
-	if (is_bpf_text_address(addr))
+	if (kernel_text_address(addr))
 		return 1;
 	/*
 	 * There might be init symbols in saved stacktraces.
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
  2017-09-22 22:15 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
  2017-09-22 22:15 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
@ 2017-09-22 22:15 ` Steven Rostedt
  2017-09-22 22:28   ` Josh Poimboeuf
  2017-09-22 22:44   ` [PATCH 3/4] " Paul E. McKenney
  2017-09-22 22:15 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
  3 siblings, 2 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-09-22 22:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: ngo Molnar, Andrew Morton, Paul E. McKenney, stable

[-- Attachment #1: 0003-extable-Enable-RCU-if-it-is-not-watching-in-kernel_t.patch --]
[-- Type: text/plain, Size: 2235 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

If kernel_text_address() is called when RCU is not watching, it can cause an
RCU bug because is_module_text_address() and the is_kprobe_*insn_slot()
functions require the use of RCU.

Only enable RCU if it is not currently watching before it calls
is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
because kernel_text_address() can happen pretty much anywhere (like an NMI),
and even from within an NMI. It is called via save_stack_trace() that can be
called by any WARN() or tracing function, which can happen while RCU is not
watching (for example, going to or coming from idle, or during CPU take down
or bring up).

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/extable.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/kernel/extable.c b/kernel/extable.c
index a7024a494faf..89c4668376a6 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -119,17 +119,42 @@ int __kernel_text_address(unsigned long addr)
 
 int kernel_text_address(unsigned long addr)
 {
+	bool no_rcu;
+	int ret = 1;
+
 	if (core_kernel_text(addr))
 		return 1;
+
+	/*
+	 * If a stack dump happens while RCU is not watching, then
+	 * RCU needs to be notified that it requires to start
+	 * watching again. This can happen either by tracing that
+	 * triggers a stack trace, or a WARN() that happens during
+	 * coming back from idle, or cpu on or offlining.
+	 *
+	 * is_module_text_address() as well as the kprobe slots
+	 * require RCU to be watching.
+	 */
+	no_rcu = !rcu_is_watching();
+
+	/* Treat this like an NMI as it can happen anywhere */
+	if (no_rcu)
+		rcu_nmi_enter();
+
 	if (is_module_text_address(addr))
-		return 1;
+		goto out;
 	if (is_ftrace_trampoline(addr))
-		return 1;
+		goto out;
 	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
-		return 1;
+		goto out;
 	if (is_bpf_text_address(addr))
-		return 1;
-	return 0;
+		goto out;
+	ret = 0;
+out:
+	if (no_rcu)
+		rcu_nmi_exit();
+
+	return ret;
 }
 
 /*
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
  2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
                   ` (2 preceding siblings ...)
  2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
@ 2017-09-22 22:15 ` Steven Rostedt
  2017-09-22 22:54   ` Paul E. McKenney
  3 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2017-09-22 22:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: ngo Molnar, Andrew Morton, Paul E. McKenney, stable

[-- Attachment #1: 0004-tracing-Remove-RCU-work-arounds-from-stack-tracer.patch --]
[-- Type: text/plain, Size: 2063 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

While debugging some RCU issues with the stack tracer, it was discovered
that the problem was much more than with the stack tracer itself, but with
the saving of the stack trace, which could happen from any WARN() as well.
The problem was fixed within kernel_text_address().

One of the bugs that was discovered was that the stack tracer called
rcu_enter_irq() unconditionally. Paul McKenney said that could cause issues
as well. Instead of adding logic to only call rcu_enter_irq() if RCU is not
watching from within the stack tracer, since the core issue has been fixed
(within save_stack_trace()), we can simply remove all the logic in the stack
tracer that deals with RCU work arounds.

Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index a4df67cbc711..49cb41412eec 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -96,23 +96,9 @@ check_stack(unsigned long ip, unsigned long *stack)
 	if (in_nmi())
 		return;
 
-	/*
-	 * There's a slight chance that we are tracing inside the
-	 * RCU infrastructure, and rcu_irq_enter() will not work
-	 * as expected.
-	 */
-	if (unlikely(rcu_irq_enter_disabled()))
-		return;
-
 	local_irq_save(flags);
 	arch_spin_lock(&stack_trace_max_lock);
 
-	/*
-	 * RCU may not be watching, make it see us.
-	 * The stack trace code uses rcu_sched.
-	 */
-	rcu_irq_enter();
-
 	/* In case another CPU set the tracer_frame on us */
 	if (unlikely(!frame_size))
 		this_size -= tracer_frame;
@@ -205,7 +191,6 @@ check_stack(unsigned long ip, unsigned long *stack)
 	}
 
  out:
-	rcu_irq_exit();
 	arch_spin_unlock(&stack_trace_max_lock);
 	local_irq_restore(flags);
 }
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
@ 2017-09-22 22:28   ` Josh Poimboeuf
  2017-09-23  1:12     ` [PATCH 3/4 v2] " Steven Rostedt
  2017-09-22 22:44   ` [PATCH 3/4] " Paul E. McKenney
  1 sibling, 1 reply; 16+ messages in thread
From: Josh Poimboeuf @ 2017-09-22 22:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, ngo Molnar, Andrew Morton, Paul E. McKenney, stable

On Fri, Sep 22, 2017 at 06:15:46PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> If kernel_text_address() is called when RCU is not watching, it can cause an
> RCU bug because is_module_text_address() and the is_kprobe_*insn_slot()
> functions require the use of RCU.

is_bpf_text_address() also requires RCU.

> Only enable RCU if it is not currently watching before it calls
> is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
> because kernel_text_address() can happen pretty much anywhere (like an NMI),
> and even from within an NMI. It is called via save_stack_trace() that can be
> called by any WARN() or tracing function, which can happen while RCU is not
> watching (for example, going to or coming from idle, or during CPU take down
> or bring up).
> 
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: stable@vger.kernel.org
> Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/extable.c | 35 ++++++++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/extable.c b/kernel/extable.c
> index a7024a494faf..89c4668376a6 100644
> --- a/kernel/extable.c
> +++ b/kernel/extable.c
> @@ -119,17 +119,42 @@ int __kernel_text_address(unsigned long addr)
>  
>  int kernel_text_address(unsigned long addr)
>  {
> +	bool no_rcu;
> +	int ret = 1;
> +
>  	if (core_kernel_text(addr))
>  		return 1;
> +
> +	/*
> +	 * If a stack dump happens while RCU is not watching, then
> +	 * RCU needs to be notified that it requires to start
> +	 * watching again. This can happen either by tracing that
> +	 * triggers a stack trace, or a WARN() that happens during
> +	 * coming back from idle, or cpu on or offlining.
> +	 *
> +	 * is_module_text_address() as well as the kprobe slots
> +	 * require RCU to be watching.

Ditto here.

-- 
Josh

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] extable: Consolidate *kernel_text_address() functions
  2017-09-22 22:15 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
@ 2017-09-22 22:40   ` Paul E. McKenney
  0 siblings, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2017-09-22 22:40 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Fri, Sep 22, 2017 at 06:15:45PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> The functionality between kernel_text_address() and _kernel_text_address()
> is the same except that _kernel_text_address() does a little more (that
> function needs a rename, but that can be done another time). Instead of
> having duplicate code in both, simply have _kernel_text_address() calls
> kernel_text_address() instead.
> 
> This is marked for stable because there's an RCU bug that can happen if
> one of these functions gets called while RCU is not watching. That fix
> depends on this fix to keep from having to write the fix twice.
> 
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: stable@vger.kernel.org
> Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Not my area, but straightforward transformation and nice reduction
in code size.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  kernel/extable.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/kernel/extable.c b/kernel/extable.c
> index 38c2412401a1..a7024a494faf 100644
> --- a/kernel/extable.c
> +++ b/kernel/extable.c
> @@ -102,15 +102,7 @@ int core_kernel_data(unsigned long addr)
> 
>  int __kernel_text_address(unsigned long addr)
>  {
> -	if (core_kernel_text(addr))
> -		return 1;
> -	if (is_module_text_address(addr))
> -		return 1;
> -	if (is_ftrace_trampoline(addr))
> -		return 1;
> -	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
> -		return 1;
> -	if (is_bpf_text_address(addr))
> +	if (kernel_text_address(addr))
>  		return 1;
>  	/*
>  	 * There might be init symbols in saved stacktraces.
> -- 
> 2.13.2
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
  2017-09-22 22:28   ` Josh Poimboeuf
@ 2017-09-22 22:44   ` Paul E. McKenney
  2017-09-23  1:09     ` Steven Rostedt
  1 sibling, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2017-09-22 22:44 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Fri, Sep 22, 2017 at 06:15:46PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> If kernel_text_address() is called when RCU is not watching, it can cause an
> RCU bug because is_module_text_address() and the is_kprobe_*insn_slot()
> functions require the use of RCU.
> 
> Only enable RCU if it is not currently watching before it calls
> is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
> because kernel_text_address() can happen pretty much anywhere (like an NMI),
> and even from within an NMI. It is called via save_stack_trace() that can be
> called by any WARN() or tracing function, which can happen while RCU is not
> watching (for example, going to or coming from idle, or during CPU take down
> or bring up).
> 
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: stable@vger.kernel.org

Do we need something calling out the dependency on the first two patches,
or is that implied these days?

> Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Give or take Josh's feedback on the commit log and comment:

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  kernel/extable.c | 35 ++++++++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/extable.c b/kernel/extable.c
> index a7024a494faf..89c4668376a6 100644
> --- a/kernel/extable.c
> +++ b/kernel/extable.c
> @@ -119,17 +119,42 @@ int __kernel_text_address(unsigned long addr)
> 
>  int kernel_text_address(unsigned long addr)
>  {
> +	bool no_rcu;
> +	int ret = 1;
> +
>  	if (core_kernel_text(addr))
>  		return 1;
> +
> +	/*
> +	 * If a stack dump happens while RCU is not watching, then
> +	 * RCU needs to be notified that it requires to start
> +	 * watching again. This can happen either by tracing that
> +	 * triggers a stack trace, or a WARN() that happens during
> +	 * coming back from idle, or cpu on or offlining.
> +	 *
> +	 * is_module_text_address() as well as the kprobe slots
> +	 * require RCU to be watching.
> +	 */
> +	no_rcu = !rcu_is_watching();
> +
> +	/* Treat this like an NMI as it can happen anywhere */
> +	if (no_rcu)
> +		rcu_nmi_enter();
> +
>  	if (is_module_text_address(addr))
> -		return 1;
> +		goto out;
>  	if (is_ftrace_trampoline(addr))
> -		return 1;
> +		goto out;
>  	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
> -		return 1;
> +		goto out;
>  	if (is_bpf_text_address(addr))
> -		return 1;
> -	return 0;
> +		goto out;
> +	ret = 0;
> +out:
> +	if (no_rcu)
> +		rcu_nmi_exit();
> +
> +	return ret;
>  }
> 
>  /*
> -- 
> 2.13.2
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
  2017-09-22 22:15 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
@ 2017-09-22 22:54   ` Paul E. McKenney
  2017-09-23  1:27     ` Steven Rostedt
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2017-09-22 22:54 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Fri, Sep 22, 2017 at 06:15:47PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> While debugging some RCU issues with the stack tracer, it was discovered
> that the problem was much more than with the stack tracer itself, but with
> the saving of the stack trace, which could happen from any WARN() as well.
> The problem was fixed within kernel_text_address().
> 
> One of the bugs that was discovered was that the stack tracer called
> rcu_enter_irq() unconditionally. Paul McKenney said that could cause issues
> as well. Instead of adding logic to only call rcu_enter_irq() if RCU is not
> watching from within the stack tracer, since the core issue has been fixed
> (within save_stack_trace()), we can simply remove all the logic in the stack
> tracer that deals with RCU work arounds.

I must confess that I am having some difficulty parsing this paragraph,
especially the last sentence...

Does this capture it?

	One problem is that the stack tracer called rcu_irq_enter()
	unconditionally, which is problematic if RCU's last
	not-watching-to-watching transition was carried out by
	rcu_nmi_enter.	In that case, rcu_irq_enter() actually switches
	RCU back to the not-watching state for this CPU, which results
	in lockdep splats complaining about rcu_read_lock() being
	used on an idle (not-watched) CPU.  The first patch of this
	series addressed this problem by having rcu_irq_enter() and
	rcu_irq_exit() refrain from doing anything when rcu_nmi_enter()
	caused RCU to start watching this CPU.	The third patch in this
	series caused save_stack_trace() to invoke rcu_nmi_enter() and
	rcu_nmi_exit() as needed, so this fourth patch now removes the
	rcu_irq_enter() and rcu_irq_exit() from within the stack tracer.

One further question...  Can I now remove the rcu_irq_enter_disabled()
logic?

> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: stable@vger.kernel.org
> Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

With the hard-to-parse paragraph fixed:

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  kernel/trace/trace_stack.c | 15 ---------------
>  1 file changed, 15 deletions(-)
> 
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index a4df67cbc711..49cb41412eec 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -96,23 +96,9 @@ check_stack(unsigned long ip, unsigned long *stack)
>  	if (in_nmi())
>  		return;
> 
> -	/*
> -	 * There's a slight chance that we are tracing inside the
> -	 * RCU infrastructure, and rcu_irq_enter() will not work
> -	 * as expected.
> -	 */
> -	if (unlikely(rcu_irq_enter_disabled()))
> -		return;
> -
>  	local_irq_save(flags);
>  	arch_spin_lock(&stack_trace_max_lock);
> 
> -	/*
> -	 * RCU may not be watching, make it see us.
> -	 * The stack trace code uses rcu_sched.
> -	 */
> -	rcu_irq_enter();
> -
>  	/* In case another CPU set the tracer_frame on us */
>  	if (unlikely(!frame_size))
>  		this_size -= tracer_frame;
> @@ -205,7 +191,6 @@ check_stack(unsigned long ip, unsigned long *stack)
>  	}
> 
>   out:
> -	rcu_irq_exit();
>  	arch_spin_unlock(&stack_trace_max_lock);
>  	local_irq_restore(flags);
>  }
> -- 
> 2.13.2
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-22 22:44   ` [PATCH 3/4] " Paul E. McKenney
@ 2017-09-23  1:09     ` Steven Rostedt
  0 siblings, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-09-23  1:09 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Fri, 22 Sep 2017 15:44:06 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Sep 22, 2017 at 06:15:46PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > If kernel_text_address() is called when RCU is not watching, it can cause an
> > RCU bug because is_module_text_address() and the is_kprobe_*insn_slot()
> > functions require the use of RCU.
> > 
> > Only enable RCU if it is not currently watching before it calls
> > is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
> > because kernel_text_address() can happen pretty much anywhere (like an NMI),
> > and even from within an NMI. It is called via save_stack_trace() that can be
> > called by any WARN() or tracing function, which can happen while RCU is not
> > watching (for example, going to or coming from idle, or during CPU take down
> > or bring up).
> > 
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: stable@vger.kernel.org  
> 
> Do we need something calling out the dependency on the first two patches,
> or is that implied these days?

I believe the stable folks will simply apply patches marked for stable
in the order they are applied. This is a patch series, so I don't think
it will be an issue for them.

> 
> > Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> 
> Give or take Josh's feedback on the commit log and comment:

Yep.

> 
> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 3/4 v2] extable: Enable RCU if it is not watching in kernel_text_address()
  2017-09-22 22:28   ` Josh Poimboeuf
@ 2017-09-23  1:12     ` Steven Rostedt
  0 siblings, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-09-23  1:12 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney, stable

>From 32b9a1b584549478af9e55e2008d50c58a98535f Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Date: Fri, 22 Sep 2017 17:36:32 -0400
Subject: [PATCH] extable: Enable RCU if it is not watching in
 kernel_text_address()

If kernel_text_address() is called when RCU is not watching, it can cause an
RCU bug because is_module_text_address(), the is_kprobe_*insn_slot()
and is_bpf_text_address() functions require the use of RCU.

Only enable RCU if it is not currently watching before it calls
is_module_text_address(). The use of rcu_nmi_enter() is used to enable RCU
because kernel_text_address() can happen pretty much anywhere (like an NMI),
and even from within an NMI. It is called via save_stack_trace() that can be
called by any WARN() or tracing function, which can happen while RCU is not
watching (for example, going to or coming from idle, or during CPU take down
or bring up).

Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/extable.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/kernel/extable.c b/kernel/extable.c
index a7024a4..9aa1cc4 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -119,17 +119,42 @@ int __kernel_text_address(unsigned long addr)
 
 int kernel_text_address(unsigned long addr)
 {
+	bool no_rcu;
+	int ret = 1;
+
 	if (core_kernel_text(addr))
 		return 1;
+
+	/*
+	 * If a stack dump happens while RCU is not watching, then
+	 * RCU needs to be notified that it requires to start
+	 * watching again. This can happen either by tracing that
+	 * triggers a stack trace, or a WARN() that happens during
+	 * coming back from idle, or cpu on or offlining.
+	 *
+	 * is_module_text_address() as well as the kprobe slots
+	 * and is_bpf_text_address() require RCU to be watching.
+	 */
+	no_rcu = !rcu_is_watching();
+
+	/* Treat this like an NMI as it can happen anywhere */
+	if (no_rcu)
+		rcu_nmi_enter();
+
 	if (is_module_text_address(addr))
-		return 1;
+		goto out;
 	if (is_ftrace_trampoline(addr))
-		return 1;
+		goto out;
 	if (is_kprobe_optinsn_slot(addr) || is_kprobe_insn_slot(addr))
-		return 1;
+		goto out;
 	if (is_bpf_text_address(addr))
-		return 1;
-	return 0;
+		goto out;
+	ret = 0;
+out:
+	if (no_rcu)
+		rcu_nmi_exit();
+
+	return ret;
 }
 
 /*
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
  2017-09-22 22:54   ` Paul E. McKenney
@ 2017-09-23  1:27     ` Steven Rostedt
  2017-09-23  6:07       ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2017-09-23  1:27 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Fri, 22 Sep 2017 15:54:55 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Sep 22, 2017 at 06:15:47PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > While debugging some RCU issues with the stack tracer, it was discovered
> > that the problem was much more than with the stack tracer itself, but with
> > the saving of the stack trace, which could happen from any WARN() as well.
> > The problem was fixed within kernel_text_address().
> > 
> > One of the bugs that was discovered was that the stack tracer called
> > rcu_enter_irq() unconditionally. Paul McKenney said that could cause issues
> > as well. Instead of adding logic to only call rcu_enter_irq() if RCU is not
> > watching from within the stack tracer, since the core issue has been fixed
> > (within save_stack_trace()), we can simply remove all the logic in the stack
> > tracer that deals with RCU work arounds.  
> 
> I must confess that I am having some difficulty parsing this paragraph,
> especially the last sentence...
> 
> Does this capture it?
> 
> 	One problem is that the stack tracer called rcu_irq_enter()
> 	unconditionally, which is problematic if RCU's last
> 	not-watching-to-watching transition was carried out by
> 	rcu_nmi_enter.	In that case, rcu_irq_enter() actually switches

I thought the rcu_irq_enter() after rcu_nmi_enter() was a separate bug.
Your original complaint was that I called rcu_irq_enter()
unconditionally, and wanted me to only call it if RCU wasn't watching.

But, the new code could possibly have this get called after
rcu_nmi_enter() because we are calling it without in_nmi() being set.


> 	RCU back to the not-watching state for this CPU, which results
> 	in lockdep splats complaining about rcu_read_lock() being
> 	used on an idle (not-watched) CPU.  The first patch of this
> 	series addressed this problem by having rcu_irq_enter() and
> 	rcu_irq_exit() refrain from doing anything when rcu_nmi_enter()
> 	caused RCU to start watching this CPU.	The third patch in this
> 	series caused save_stack_trace() to invoke rcu_nmi_enter() and
> 	rcu_nmi_exit() as needed, so this fourth patch now removes the
> 	rcu_irq_enter() and rcu_irq_exit() from within the stack tracer.
> 
> One further question...  Can I now remove the rcu_irq_enter_disabled()
> logic?

After this goes in. Yes. But that doesn't need to be a stable change.

> 
> > Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > Cc: stable@vger.kernel.org
> > Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> > Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> 
> With the hard-to-parse paragraph fixed:
> 
> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Thanks, but we may need to go back and forth a bit to get the change
log correct.

Remember the first bug. The one that was fixed by changing
rcu_irq_enter() to rcu_nmi_enter()?

=============================
 WARNING: suspicious RCU usage
 4.13.0-rc7-test+ #117 Tainted: G        W      
 -----------------------------
 /work/git/linux-trace.git/arch/x86/kernel/traps.c:305 entry code didn't wake RCU!
 
 other info that might help us debug this:

 
 RCU used illegally from idle CPU!
 rcu_scheduler_active = 2, debug_locks = 1
 RCU used illegally from extended quiescent state!
 no locks held by swapper/1/0.
 
 stack backtrace:
 CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.13.0-rc7-test+ #117
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 Call Trace:
  dump_stack+0x86/0xcf
  lockdep_rcu_suspicious+0xc5/0x100
  do_error_trap+0x125/0x130
  ? do_error_trap+0x5/0x130
  ? trace_hardirqs_off_thunk+0x1a/0x1c
  ? do_invalid_op+0x5/0x30
  do_invalid_op+0x20/0x30
  invalid_op+0x1e/0x30
 RIP: 0010:module_assert_mutex_or_preempt+0x34/0x40
 RSP: 0018:ffffc900006abc58 EFLAGS: 00010046
 RAX: 0000000000000000 RBX: ffffffffa000a077 RCX: 0000000000000002
 RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000046
 RBP: ffffc900006abc58 R08: ffffc900006abf40 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: ffff8801188d8040 R15: ffffffff81ed5720
  ? 0xffffffffa000a077
  ? module_assert_mutex_or_preempt+0x30/0x40
  __module_address+0x2c/0xf0
  ? 0xffffffffa000a077
  __module_text_address+0x12/0x60
  ? 0xffffffffa000a077
  is_module_text_address+0x1f/0x50
  ? 0xffffffffa000a077
  __kernel_text_address+0x30/0x90
  unwind_get_return_address+0x1f/0x30
  __save_stack_trace+0x83/0xd0
  ? 0xffffffffa000a077
  ? rcu_dynticks_eqs_exit+0x5/0x40
  save_stack_trace+0x1b/0x20
  check_stack+0xf8/0x2f0
  ? rcu_dynticks_eqs_enter+0x30/0x30
  stack_trace_call+0x6e/0x80
  0xffffffffa000a077
  ? ftrace_graph_caller+0x78/0xa8
  ? rcu_dynticks_eqs_exit+0x5/0x40
  rcu_dynticks_eqs_exit+0x5/0x40
  rcu_idle_exit+0xdf/0xf0
  ? rcu_dynticks_eqs_exit+0x5/0x40
  ? rcu_idle_exit+0xdf/0xf0
  do_idle+0x128/0x200
  cpu_startup_entry+0x1d/0x20
  start_secondary+0x108/0x130
  secondary_startup_64+0x9f/0x9f

This was caused by just using rcu_irq_enter(). Not sure if this will
still be an issue or not. But because we now add an rcu_nmi_enter()
without being in_nmi(), we probably should do this. This code doesn't
run if in_nmi() is true, but it could run from the stack trace dump
itself, and that now calls rcu_nmi_enter().

Actually, thinking about this more, this doesn't need to go in stable.
As recursive rcu_irq_enter() calls should not hurt, and you now allow
rcu_irq_enter() to be called even after a rcu_nmi_enter() right?

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
  2017-09-23  1:27     ` Steven Rostedt
@ 2017-09-23  6:07       ` Paul E. McKenney
  2017-09-23 11:22         ` Steven Rostedt
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2017-09-23  6:07 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Fri, Sep 22, 2017 at 09:27:39PM -0400, Steven Rostedt wrote:
> On Fri, 22 Sep 2017 15:54:55 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Sep 22, 2017 at 06:15:47PM -0400, Steven Rostedt wrote:
> > > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > > 
> > > While debugging some RCU issues with the stack tracer, it was discovered
> > > that the problem was much more than with the stack tracer itself, but with
> > > the saving of the stack trace, which could happen from any WARN() as well.
> > > The problem was fixed within kernel_text_address().
> > > 
> > > One of the bugs that was discovered was that the stack tracer called
> > > rcu_enter_irq() unconditionally. Paul McKenney said that could cause issues
> > > as well. Instead of adding logic to only call rcu_enter_irq() if RCU is not
> > > watching from within the stack tracer, since the core issue has been fixed
> > > (within save_stack_trace()), we can simply remove all the logic in the stack
> > > tracer that deals with RCU work arounds.  
> > 
> > I must confess that I am having some difficulty parsing this paragraph,
> > especially the last sentence...
> > 
> > Does this capture it?
> > 
> > 	One problem is that the stack tracer called rcu_irq_enter()
> > 	unconditionally, which is problematic if RCU's last
> > 	not-watching-to-watching transition was carried out by
> > 	rcu_nmi_enter.	In that case, rcu_irq_enter() actually switches
> 
> I thought the rcu_irq_enter() after rcu_nmi_enter() was a separate bug.
> Your original complaint was that I called rcu_irq_enter()
> unconditionally, and wanted me to only call it if RCU wasn't watching.
> 
> But, the new code could possibly have this get called after
> rcu_nmi_enter() because we are calling it without in_nmi() being set.

You are correct.

The initial bug was rcu_irq_enter() being invoked recursively due
to tracing of some of its called functions, which caused RCU to not
have been watching when it should have.  I then advised you to switch
to rcu_nmi_enter(), unaware that there would be calls to rcu_irq_enter()
between the rcu_nmi_enter() and its matching rcu_nmi_exit().  Which
led us to the second problem, described in my suggested paragraph.

> > 	RCU back to the not-watching state for this CPU, which results
> > 	in lockdep splats complaining about rcu_read_lock() being
> > 	used on an idle (not-watched) CPU.  The first patch of this
> > 	series addressed this problem by having rcu_irq_enter() and
> > 	rcu_irq_exit() refrain from doing anything when rcu_nmi_enter()
> > 	caused RCU to start watching this CPU.	The third patch in this
> > 	series caused save_stack_trace() to invoke rcu_nmi_enter() and
> > 	rcu_nmi_exit() as needed, so this fourth patch now removes the
> > 	rcu_irq_enter() and rcu_irq_exit() from within the stack tracer.
> > 
> > One further question...  Can I now remove the rcu_irq_enter_disabled()
> > logic?
> 
> After this goes in. Yes. But that doesn't need to be a stable change.

Good, at least a little simplification out of this.  ;-)

> > > Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > Cc: stable@vger.kernel.org
> > > Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
> > > Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> > 
> > With the hard-to-parse paragraph fixed:
> > 
> > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Thanks, but we may need to go back and forth a bit to get the change
> log correct.
> 
> Remember the first bug. The one that was fixed by changing
> rcu_irq_enter() to rcu_nmi_enter()?
> 
> =============================
>  WARNING: suspicious RCU usage
>  4.13.0-rc7-test+ #117 Tainted: G        W      
>  -----------------------------
>  /work/git/linux-trace.git/arch/x86/kernel/traps.c:305 entry code didn't wake RCU!
> 
>  other info that might help us debug this:
> 
> 
>  RCU used illegally from idle CPU!
>  rcu_scheduler_active = 2, debug_locks = 1
>  RCU used illegally from extended quiescent state!
>  no locks held by swapper/1/0.
> 
>  stack backtrace:
>  CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.13.0-rc7-test+ #117
>  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
>  Call Trace:
>   dump_stack+0x86/0xcf
>   lockdep_rcu_suspicious+0xc5/0x100
>   do_error_trap+0x125/0x130
>   ? do_error_trap+0x5/0x130
>   ? trace_hardirqs_off_thunk+0x1a/0x1c
>   ? do_invalid_op+0x5/0x30
>   do_invalid_op+0x20/0x30
>   invalid_op+0x1e/0x30
>  RIP: 0010:module_assert_mutex_or_preempt+0x34/0x40
>  RSP: 0018:ffffc900006abc58 EFLAGS: 00010046
>  RAX: 0000000000000000 RBX: ffffffffa000a077 RCX: 0000000000000002
>  RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000046
>  RBP: ffffc900006abc58 R08: ffffc900006abf40 R09: 0000000000000000
>  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>  R13: 0000000000000000 R14: ffff8801188d8040 R15: ffffffff81ed5720
>   ? 0xffffffffa000a077
>   ? module_assert_mutex_or_preempt+0x30/0x40
>   __module_address+0x2c/0xf0
>   ? 0xffffffffa000a077
>   __module_text_address+0x12/0x60
>   ? 0xffffffffa000a077
>   is_module_text_address+0x1f/0x50
>   ? 0xffffffffa000a077
>   __kernel_text_address+0x30/0x90
>   unwind_get_return_address+0x1f/0x30
>   __save_stack_trace+0x83/0xd0
>   ? 0xffffffffa000a077
>   ? rcu_dynticks_eqs_exit+0x5/0x40
>   save_stack_trace+0x1b/0x20
>   check_stack+0xf8/0x2f0
>   ? rcu_dynticks_eqs_enter+0x30/0x30
>   stack_trace_call+0x6e/0x80
>   0xffffffffa000a077
>   ? ftrace_graph_caller+0x78/0xa8
>   ? rcu_dynticks_eqs_exit+0x5/0x40
>   rcu_dynticks_eqs_exit+0x5/0x40
>   rcu_idle_exit+0xdf/0xf0
>   ? rcu_dynticks_eqs_exit+0x5/0x40
>   ? rcu_idle_exit+0xdf/0xf0
>   do_idle+0x128/0x200
>   cpu_startup_entry+0x1d/0x20
>   start_secondary+0x108/0x130
>   secondary_startup_64+0x9f/0x9f
> 
> This was caused by just using rcu_irq_enter(). Not sure if this will
> still be an issue or not. But because we now add an rcu_nmi_enter()
> without being in_nmi(), we probably should do this. This code doesn't
> run if in_nmi() is true, but it could run from the stack trace dump
> itself, and that now calls rcu_nmi_enter().

OK, how about the following?

	It turns out that functions called from rcu_irq_enter() can
	be subject to various kinds of tracing, which can result in
	rcu_irq_enter() being invoked recursively.  This recursion
	causes RCU to not have been watching when it should have,
	resulting in lockdep-RCU splats.  Switching from rcu_irq_enter()
	to rcu_nmi_enter() still resulted in failures because of calls
	to rcu_irq_enter() between the rcu_nmi_enter() and its matching
	rcu_nmi_exit().  Such calls again cause RCU to not be watching
	when it should have been.

	In particular, the stack tracer called rcu_irq_enter()
	unconditionally, which is problematic when RCU's last
	not-watching-to-watching transition was carried out by
	rcu_nmi_enter(), as will be the case when tracing uses
	rcu_nmi_enter() to cause RCU to start watching the current CPU.
	In that case, rcu_irq_enter() actually switches RCU back to
	the not-watching state for this CPU, which results in lockdep
	splats complaining about rcu_read_lock() being used on an idle
	(not-watched) CPU.  The first patch of this series addressed
	this problem by having rcu_irq_enter() and rcu_irq_exit()
	refrain from doing anything when rcu_nmi_enter() caused RCU to
	start watching this CPU.  The third patch in this series caused
	save_stack_trace() to invoke rcu_nmi_enter() and rcu_nmi_exit()
	as needed, so this fourth patch now removes the rcu_irq_enter()
	and rcu_irq_exit() from within the stack tracer.

> Actually, thinking about this more, this doesn't need to go in stable.
> As recursive rcu_irq_enter() calls should not hurt, and you now allow
> rcu_irq_enter() to be called even after a rcu_nmi_enter() right?

Yes, it is now the case that rcu_irq_enter() can be called even after
an rcu_nmi_enter() exited idle, because rcu_irq_enter() now checks for
this and takes an early exit if so.

But what is it about older kernels prevents the tracing-induced recursive
calls to rcu_irq_enter()?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
  2017-09-23  6:07       ` Paul E. McKenney
@ 2017-09-23 11:22         ` Steven Rostedt
  2017-09-23 17:15           ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2017-09-23 11:22 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Fri, 22 Sep 2017 23:07:37 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> OK, how about the following?
> 
> 	It turns out that functions called from rcu_irq_enter() can
> 	be subject to various kinds of tracing, which can result in
> 	rcu_irq_enter() being invoked recursively.  This recursion
> 	causes RCU to not have been watching when it should have,
> 	resulting in lockdep-RCU splats.  Switching from rcu_irq_enter()
> 	to rcu_nmi_enter() still resulted in failures because of calls
> 	to rcu_irq_enter() between the rcu_nmi_enter() and its matching
> 	rcu_nmi_exit().  Such calls again cause RCU to not be watching
> 	when it should have been.
> 
> 	In particular, the stack tracer called rcu_irq_enter()
> 	unconditionally, which is problematic when RCU's last
> 	not-watching-to-watching transition was carried out by
> 	rcu_nmi_enter(), as will be the case when tracing uses
> 	rcu_nmi_enter() to cause RCU to start watching the current CPU.
> 	In that case, rcu_irq_enter() actually switches RCU back to
> 	the not-watching state for this CPU, which results in lockdep
> 	splats complaining about rcu_read_lock() being used on an idle
> 	(not-watched) CPU.  The first patch of this series addressed
> 	this problem by having rcu_irq_enter() and rcu_irq_exit()
> 	refrain from doing anything when rcu_nmi_enter() caused RCU to
> 	start watching this CPU.  The third patch in this series caused
> 	save_stack_trace() to invoke rcu_nmi_enter() and rcu_nmi_exit()
> 	as needed, so this fourth patch now removes the rcu_irq_enter()
> 	and rcu_irq_exit() from within the stack tracer.

I think it's a bit too much ;-)  I believe it talks too much about the
RCU internals, and the bug will be lost on us mere mortals.

> 
> > Actually, thinking about this more, this doesn't need to go in stable.
> > As recursive rcu_irq_enter() calls should not hurt, and you now allow
> > rcu_irq_enter() to be called even after a rcu_nmi_enter() right?  
> 
> Yes, it is now the case that rcu_irq_enter() can be called even after
> an rcu_nmi_enter() exited idle, because rcu_irq_enter() now checks for
> this and takes an early exit if so.
> 
> But what is it about older kernels prevents the tracing-induced recursive
> calls to rcu_irq_enter()?

Ah, the bug is if rcu_irq_enter() is called, and the stack trace
triggers then. OK, lets keep it simple, and just say this.


    Currently the stack tracer calls rcu_irq_enter() to make sure RCU
    is watching when it records a stack trace. But if the stack tracer
    is triggered while tracing inside of a rcu_irq_enter(), calling
    rcu_irq_enter() unconditionally can be problematic.

    The reason for having rcu_irq_enter() in the first place has been
    fixed from within the saving of the stack trace code, and there's no
    reason for doing it in the stack tracer itself. Just remove it.

Simple and to the point.

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
  2017-09-23 11:22         ` Steven Rostedt
@ 2017-09-23 17:15           ` Paul E. McKenney
  0 siblings, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2017-09-23 17:15 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, ngo Molnar, Andrew Morton, stable

On Sat, Sep 23, 2017 at 07:22:04AM -0400, Steven Rostedt wrote:
> On Fri, 22 Sep 2017 23:07:37 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > OK, how about the following?
> > 
> > 	It turns out that functions called from rcu_irq_enter() can
> > 	be subject to various kinds of tracing, which can result in
> > 	rcu_irq_enter() being invoked recursively.  This recursion
> > 	causes RCU to not have been watching when it should have,
> > 	resulting in lockdep-RCU splats.  Switching from rcu_irq_enter()
> > 	to rcu_nmi_enter() still resulted in failures because of calls
> > 	to rcu_irq_enter() between the rcu_nmi_enter() and its matching
> > 	rcu_nmi_exit().  Such calls again cause RCU to not be watching
> > 	when it should have been.
> > 
> > 	In particular, the stack tracer called rcu_irq_enter()
> > 	unconditionally, which is problematic when RCU's last
> > 	not-watching-to-watching transition was carried out by
> > 	rcu_nmi_enter(), as will be the case when tracing uses
> > 	rcu_nmi_enter() to cause RCU to start watching the current CPU.
> > 	In that case, rcu_irq_enter() actually switches RCU back to
> > 	the not-watching state for this CPU, which results in lockdep
> > 	splats complaining about rcu_read_lock() being used on an idle
> > 	(not-watched) CPU.  The first patch of this series addressed
> > 	this problem by having rcu_irq_enter() and rcu_irq_exit()
> > 	refrain from doing anything when rcu_nmi_enter() caused RCU to
> > 	start watching this CPU.  The third patch in this series caused
> > 	save_stack_trace() to invoke rcu_nmi_enter() and rcu_nmi_exit()
> > 	as needed, so this fourth patch now removes the rcu_irq_enter()
> > 	and rcu_irq_exit() from within the stack tracer.
> 
> I think it's a bit too much ;-)  I believe it talks too much about the
> RCU internals, and the bug will be lost on us mere mortals.
> 
> > 
> > > Actually, thinking about this more, this doesn't need to go in stable.
> > > As recursive rcu_irq_enter() calls should not hurt, and you now allow
> > > rcu_irq_enter() to be called even after a rcu_nmi_enter() right?  
> > 
> > Yes, it is now the case that rcu_irq_enter() can be called even after
> > an rcu_nmi_enter() exited idle, because rcu_irq_enter() now checks for
> > this and takes an early exit if so.
> > 
> > But what is it about older kernels prevents the tracing-induced recursive
> > calls to rcu_irq_enter()?
> 
> Ah, the bug is if rcu_irq_enter() is called, and the stack trace
> triggers then. OK, lets keep it simple, and just say this.
> 
> 
>     Currently the stack tracer calls rcu_irq_enter() to make sure RCU
>     is watching when it records a stack trace. But if the stack tracer
>     is triggered while tracing inside of a rcu_irq_enter(), calling
>     rcu_irq_enter() unconditionally can be problematic.
> 
>     The reason for having rcu_irq_enter() in the first place has been
>     fixed from within the saving of the stack trace code, and there's no
>     reason for doing it in the stack tracer itself. Just remove it.
> 
> Simple and to the point.

Works for me!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer
  2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
@ 2017-09-23 20:56 ` Steven Rostedt
  0 siblings, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-09-23 20:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney, stable

[-- Attachment #1: 0004-tracing-Remove-RCU-work-arounds-from-stack-tracer.patch --]
[-- Type: text/plain, Size: 1826 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Currently the stack tracer calls rcu_irq_enter() to make sure RCU
is watching when it records a stack trace. But if the stack tracer
is triggered while tracing inside of a rcu_irq_enter(), calling
rcu_irq_enter() unconditionally can be problematic.

The reason for having rcu_irq_enter() in the first place has been
fixed from within the saving of the stack trace code, and there's no
reason for doing it in the stack tracer itself. Just remove it.

Cc: stable@vger.kernel.org
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index a4df67cbc711..49cb41412eec 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -96,23 +96,9 @@ check_stack(unsigned long ip, unsigned long *stack)
 	if (in_nmi())
 		return;
 
-	/*
-	 * There's a slight chance that we are tracing inside the
-	 * RCU infrastructure, and rcu_irq_enter() will not work
-	 * as expected.
-	 */
-	if (unlikely(rcu_irq_enter_disabled()))
-		return;
-
 	local_irq_save(flags);
 	arch_spin_lock(&stack_trace_max_lock);
 
-	/*
-	 * RCU may not be watching, make it see us.
-	 * The stack trace code uses rcu_sched.
-	 */
-	rcu_irq_enter();
-
 	/* In case another CPU set the tracer_frame on us */
 	if (unlikely(!frame_size))
 		this_size -= tracer_frame;
@@ -205,7 +191,6 @@ check_stack(unsigned long ip, unsigned long *stack)
 	}
 
  out:
-	rcu_irq_exit();
 	arch_spin_unlock(&stack_trace_max_lock);
 	local_irq_restore(flags);
 }
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-09-23 20:58 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-22 22:15 [PATCH 0/4] rcu/tracing/extable: Fix stack dump when RCU is not watching Steven Rostedt
2017-09-22 22:15 ` [PATCH 1/4] rcu: Allow for page faults in NMI handlers Steven Rostedt
2017-09-22 22:15 ` [PATCH 2/4] extable: Consolidate *kernel_text_address() functions Steven Rostedt
2017-09-22 22:40   ` Paul E. McKenney
2017-09-22 22:15 ` [PATCH 3/4] extable: Enable RCU if it is not watching in kernel_text_address() Steven Rostedt
2017-09-22 22:28   ` Josh Poimboeuf
2017-09-23  1:12     ` [PATCH 3/4 v2] " Steven Rostedt
2017-09-22 22:44   ` [PATCH 3/4] " Paul E. McKenney
2017-09-23  1:09     ` Steven Rostedt
2017-09-22 22:15 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt
2017-09-22 22:54   ` Paul E. McKenney
2017-09-23  1:27     ` Steven Rostedt
2017-09-23  6:07       ` Paul E. McKenney
2017-09-23 11:22         ` Steven Rostedt
2017-09-23 17:15           ` Paul E. McKenney
2017-09-23 20:56 [PATCH 0/4] [GIT PULL] tracing/rcu: Fix save_stack_trace() called when RCU is not watching Steven Rostedt
2017-09-23 20:56 ` [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.