Re: [BUG] stack tracing causes: kernel/module.c:271 module_assert_mutex_or_preempt

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] stack tracing causes: kernel/module.c:271 module_assert_mutex_or_preempt
Date: Wed, 5 Apr 2017 10:59:25 -0700	[thread overview]
Message-ID: <20170405175925.GG1600@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170405124539.730a2365@gandalf.local.home>

On Wed, Apr 05, 2017 at 12:45:39PM -0400, Steven Rostedt wrote:
> On Wed, 5 Apr 2017 09:25:15 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > Thoughts?  
> > 
> > Ouch!!!
> > 
> > I really blew it telling you that you can use rcu_irq_enter()!!!
> > 
> > RCU's eqs code disables interrupts, but has a call to the trace code
> > exactly where it cannot handle an interrupt.  Your rcu_irq_enter()
> > therefore breaks things -- the ->dynticks_nesting has been set to zero,
> > but ->dynticks still indicates non-idle.  Therefore, rcu_irq_enter() sees
> > zero nesting, and thus incorrectly increments ->dynticks, which tells RCU
> > "idle".  The later rcu_is_watching() checks therefore complain bitterly.
> > 
> > I really don't want you to have to hack around this.  So I should adjust
> > RCU to avoid doing tracing during this period of time, which would allow
> > you to continue using rcu_irq_enter() as you are now.  (You don't trace
> > from NMIs, do you?  If you do, I guess I need to somehow disable tracing
> > during the vulnerable period?)
> 
> Note, this is the stack tracer. It attaches itself to the function
> tracer, which traces all functions and checks the stack size for each
> function it calls. It performs a stack trace when it sees a stack
> larger than any stack it has seen before. It just so happens that it
> saw a larger stack when tracing inside the rcu functions.
> 
> Note, this has nothing to do with tracepoints or TRACE_EVENT()s. Nor
> does it have any issue with function tracing itself and yes, function
> tracing does trace NMIs, but doesn't need RCU. And yes, stack tracing
> just punts (returns doing nothing) if it sees that it is in an NMI.
> Especially since stack tracing grabs locks.

Whew!!!  ;-)

> The problem is that the recording of the stack (save_stack_trace) does
> utilize RCU, and that's where it complains.
> 
> We don't need to stop tracing, we just need to stop stack tracing. I'm
> fine with adding another check to see if we are in a critical RCU
> section, and just not trace the stack there.
> 
> > 
> > One simple-and-stupid fix would be for me to move rcu_eqs_enter_common()'s
> > trace_rcu_dyntick() statement into all its callers.  Another would be to
> > give rcu_eqs_enter_common() an additional argument that is the amount to
> > subtract from ->dynticks_nesting, and have rcu_eqs_enter_common() do the
> > decrment after the tracing.  The latter seems most reasonable at first
> > glance, especially given that it confines the risky data-inconsistent
> > period to a single contiguous piece of code.
> 
> Note, this has nothing to do with trace_rcu_dyntick(). It's the
> function tracer tracing inside RCU, calling the stack tracer to record
> a new stack if it sees its larger than any stack before. All I need is
> a way to tell the stack tracer to not record a stack if it is in this
> RCU critical section.
> 
> If you can add a "in_rcu_critical_section()" function, that the stack
> tracer can test, and simply exit out like it does if in_nmi() is set,
> that would work too. Below is my current work around.

Except that the rcu_irq_enter() would already have triggered the bug
that was (allegedly) fixed by my earlier patch.  So, yes, the check for
rcu_is_watching() would work around this bug, but the hope is that
with my earlier fix, this workaround would not be needed.

So could you please test my earlier patch?

This patch does not conflict with anything on -rcu, so you could
carry it if that helps.

> > So how about the following lightly tested patch?
> > 
> > I will check to see if something similar is needed for eqs exit.

I did verify that the eqs-exit path does not have this problem, FWIW.

> > Could you please let me know if tracing happens in NMI handlers?
> > If so, a bit of additional code will be needed.
> > 
> > 							Thanx, Paul
> > 
> > PS.  Which reminds me, any short-term uses of RCU_TASKS?  This represents
> >      3 of my 16 test scenarios, which is getting hard to justify for
> >      something that isn't used.  Especially given that I will need to
> >      add more scenarios for parallel-callbacks SRCU...
> 
> The RCU_TASK implementation is next on my todo list. Yes, there's going
> to be plenty of users very soon. Not for 4.12 but definitely for 4.13.
> 
> Sorry for the delay in implementing that :-/

OK, I will wait a few months before checking again...

							Thanx, Paul

> -- Steve
> 
> 
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 5fb1f2c87e6b..ac7241cf8c9c 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -105,6 +105,10 @@ check_stack(unsigned long ip, unsigned long *stack)
>  	 */
>  	rcu_irq_enter();
> 
> +	/* If we traced rcu internals, rcu may still not be watching */
> +	if (unlikely(!rcu_is_watching()))
> +		goto out;
> +
>  	/* In case another CPU set the tracer_frame on us */
>  	if (unlikely(!frame_size))
>  		this_size -= tracer_frame;
>