linux-kernel.vger.kernel.org archive mirror
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: There is a Tasks RCU stall warning
Date: Wed, 12 Apr 2017 08:18:17 -0700
Message-ID: <20170412151817.GG3956@linux.vnet.ibm.com>
In-Reply-To: <20170412104255.26bb17d4@gandalf.local.home>

On Wed, Apr 12, 2017 at 10:42:55AM -0400, Steven Rostedt wrote:
> On Wed, 12 Apr 2017 07:19:36 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Apr 12, 2017 at 09:18:21AM -0400, Steven Rostedt wrote:
> > > On Tue, 11 Apr 2017 20:23:07 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > But another question...
> > > > 
> > > > Suppose someone traced or probed or whatever a call to (say)
> > > > cond_resched_rcu_qs().  Wouldn't that put the call to this
> > > > function in the trampoline itself?  Of course, if this happened,
> > > > life would be hard when the trampoline was freed due to
> > > > cond_resched_rcu_qs() being a quiescent state.  
> > > 
> > > Not at all, because the trampoline happens at the beginning of the
> > > function. Not in the guts of it (unless something in the guts was
> > > traced). But even then, it should be fine as the change was already
> > > made.
> > > 
> > > 	/* unhook trampoline from function calls */
> > > 	unregister_ftrace_function(my_ops);
> > > 
> > > 	synchronize_rcu_tasks();
> > > 
> > > 	kfree(my_ops->trampoline);
> > > 
> > > 
> > > Thus, once the unregister_ftrace_function() is called, no new entries
> > > into the trampoline can happen. The synchronize_rcu_tasks() is to move
> > > those that are currently on a trampoline off.  
> > 
> > OK, good!  (I thought that these things could appear anywhere.)
> 
> Well the trampolines pretty much can, but they are removed before
> calling synchronize_rcu_tasks(), and nothing can enter the trampoline
> when that is called.

Color me confused...

So you can have an arbitrary function call within a trampoline?

If not, agreed, no problem.  Otherwise, it seems like we have a big
problem remaining.  Unless the functions called from a trampoline are
guaranteed never to do a context switch.

So what exactly is the trampoline code allowed to do?  ;-)
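
To make the ordering under discussion concrete, here is a minimal user-space
sketch of the unhook / wait / free sequence quoted above.  It is not kernel
code: struct trampoline, hook, qs_count, synchronize_sketch() and run_demo()
are all hypothetical names, and the per-thread counter is only a crude
stand-in for what synchronize_rcu_tasks() waits for.  It also assumes the
case that would be safe, namely that nothing called from the trampoline
passes through a quiescent state.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NWORKERS 4

/* Hypothetical stand-in for a dynamically allocated ftrace trampoline. */
struct trampoline {
	void (*callback)(void);
};

static _Atomic(struct trampoline *) hook;	/* NULL: function is not hooked */
static atomic_ulong qs_count[NWORKERS];		/* per-worker quiescent-state counts */
static atomic_ulong hits;			/* number of callback invocations */
static atomic_bool stop;

static void count_hit(void)
{
	atomic_fetch_add(&hits, 1);
}

static void traced_function(void)
{
	struct trampoline *t = atomic_load(&hook);

	if (t)
		t->callback();	/* "on the trampoline": no quiescent state here */
}

static void *worker(void *arg)
{
	int id = (int)(intptr_t)arg;

	while (!atomic_load(&stop)) {
		traced_function();
		/* Voluntary quiescent state, reported only while off the trampoline. */
		atomic_fetch_add(&qs_count[id], 1);
	}
	return NULL;
}

/* Assumed stand-in for a grace period: wait for every worker to report once. */
static void synchronize_sketch(void)
{
	for (int i = 0; i < NWORKERS; i++) {
		unsigned long snap = atomic_load(&qs_count[i]);

		while (atomic_load(&qs_count[i]) == snap)
			sched_yield();
	}
}

/* Returns 0 if no callback ran after the grace period ended, -1 otherwise. */
int run_demo(void)
{
	pthread_t tid[NWORKERS];
	struct trampoline *t = malloc(sizeof(*t));
	unsigned long before;

	if (!t)
		return -1;
	t->callback = count_hit;
	atomic_store(&hook, t);

	for (intptr_t i = 0; i < NWORKERS; i++) {
		if (pthread_create(&tid[i], NULL, worker, (void *)i))
			return -1;
	}
	while (atomic_load(&hits) == 0)
		sched_yield();

	atomic_store(&hook, (struct trampoline *)NULL);	/* unhook: no new entries */
	synchronize_sketch();				/* move current users off */
	before = atomic_load(&hits);
	free(t);					/* nobody can still be using it */

	atomic_store(&stop, true);
	for (int i = 0; i < NWORKERS; i++)
		pthread_join(tid[i], NULL);

	return atomic_load(&hits) == before ? 0 : -1;
}
```

Calling run_demo() from a main() should return 0.  The sketch is only safe
because the callback never bumps qs_count itself.  If it did (the analog of a
context switch inside something the trampoline calls), the wait could finish
while a worker was still inside the callback, and the free would race with it.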

> > If it ever becomes necessary, I suppose you could have a function
> > call as the very last thing on a trampoline.  Do the (off-trampoline)
> > return-address push, jump at the function, and that is the last need
> > for the trampoline.
> 
> The point of trampolines is to optimize the function hooks; added
> features will kill that optimization. But then it gets even more
> complex. The trampolines are written in assembly and do special reg
> savings in order to call C code. And it needs to restore back to the
> original state before calling back to the function being traced. Thus,
> anything at the end of the trampoline will need to be written in
> assembly. Not sure writing RCU code in assembly would be much fun.

Writing RCU code as assembly code would indeed not be my first choice!

> > Assuming that the called function doesn't try accessing the code
> > surrounding the call, but that would be a problem in any case.
> > 
> > > Is there a way that a task could be in the middle of
> > > cond_resched_rcu_qs() and get preempted by something while on the
> > > ftrace trampoline, then the above "unregister_ftrace_function()" and
> > > "synchronize_rcu_tasks()" can be called and finish, while the one task
> > > is still on the trampoline and never finished the cond_resched_rcu_qs()?  
> > 
> > Well, if the kernel being ftraced is a guest OS and the hypervisor
> > preempts it at just that point...
> 
> Not sure what you mean by the above. You mean the hypervisor running
> ftrace on the guest OS? Or just a long pause on the guest OS (could
> also be an NMI). But in any case, we don't care about long pauses. We
> care about tasks going to sleep while on the trampoline, and the ftrace
> code that does the schedule_on_each_cpu() missing that task, because it
> was preempted, and not affected by the schedule_on_each_cpu() call.

The guest doing ftrace and the hypervisor preempting it.  But yes,
same thing as NMI.

> > > > Or is there something that takes care to avoid putting calls to
> > > > this sort of function (and calls to any function calling this sort
> > > > of function, directly or indirectly) into a trampoline?  
> > > 
> > > The question is, if it's on the trampoline in one of these functions
> > > when synchronize_rcu_tasks() is called, will it still be on the
> > > trampoline when that returns?  
> > 
> > If the function's return address is within the trampoline, it seems to
> > me that bad things could happen.
> 
> Not sure what you mean by the above. One should never be tracing within
> a trampoline, or calling synchronize_rcu_tasks() in one. The trampoline
> could be called from any context, including NMI.

My problem is that I have no idea what can and cannot be included in
trampoline code.  In absence of that information, my RCU-honed reflexes
jump immediately to the worst case that I can think of.  ;-)

							Thanx, Paul
