All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Oleg Nesterov <oleg@redhat.com>,
	linux-kernel@vger.kernel.org, mingo@kernel.org,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, tglx@linutronix.de, dhowells@redhat.com,
	edumazet@google.com, dvhart@linux.intel.com, fweisbec@gmail.com,
	bobby.prani@gmail.com, masami.hiramatsu.pt@hitachi.com
Subject: Re: [PATCH v3 tip/core/rcu 3/9] rcu: Add synchronous grace-period waiting for RCU-tasks
Date: Fri, 8 Aug 2014 10:58:58 -0400	[thread overview]
Message-ID: <20140808105858.171da847@gandalf.local.home> (raw)
In-Reply-To: <20140808143413.GB9918@twins.programming.kicks-ass.net>

On Fri, 8 Aug 2014 16:34:13 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Aug 08, 2014 at 10:12:21AM -0400, Steven Rostedt wrote:
> > > Ok, so they're purely used in the function prologue/epilogue callchain.
> > 
> > No, they are also used by optimized kprobes. This is why optimized
> > kprobes depend on !CONFIG_PREEMPT. [ added Masami to the discussion ].
> 
> How do those work? Is that one where the INT3 relocates the instruction
> stream into an alternative 'text' and that JMPs back into the original
> stream at the end?

No, it's where we replace the 'int3' with a jump to a trampoline that
simulates an INT3. Speeds things up quite a bit.

> 
> And what is there to make sure the kprobe itself doesn't do 'funny'?

Well, kprobes, like function callbacks are just restricted like
interrupt handlers are. If they break, they break. They should know
better ;-)

> 
> > Which reminds me. On !CONFIG_PREEMPT, call_rcu_task() should be
> > equivalent to call_rcu_sched().
> 
> Sure, as long as you make absolutely sure none of that code ends up
> calling cond_resched()/might_sleep() etc. Which I think you already said
> was true, so no worries there.

Right. There's no guarantees that someone wont do such a stupid thing.
But then, there's no guarantees that someone wont register an NMI
callback with the same code too.

> 
> > > And you don't want to use synchronize_tasks() because registering a trace
> > > functions is atomic ?
> > 
> > No. Has nothing to do with registering the trace function. The issue is
> > that we have no idea when a task happens to be on a trampoline after it
> > is registered. For example:
> > 
> > ops adds a callback to sys_read:
> > 
> > sys_read() {
> >  call trampoline ->
> >     set up regs for function call.
> >     <interrupt>
> >       preempt_schedule();
> > 
> >       [ new task runs for long time ]
> > 
> > 
> > While this new task is running, we remove the trampoline and want to
> > free it. Say this new task keeps the other task from running for
> > minutes! We call synchronize_sched() or any other rcu call, and all
> > grace periods finish and we free the trampoline. The sys_read() no
> > longer calls our trampoline. Doesn't matter, because that task is still
> > on it. Now we schedule that task back. It's on a trampoline that has
> > just been freed! BOOM. It's executing code that no longer exits.
> 
> Sure, I get that part. What I was getting as is _WHY_ you need
> call_rcu_task(), why isn't synchronize_tasks() good enough?

Oh, because that synchronize_tasks() may take minutes. And that means
we wont be able to return for a long time. The only thing I can really
see using call_rcu_task() is something that needs to free its data. Why
wait around when all you're going to do is call free? It's basically
just a garbage collector.

> 
> > > No need for extra allocations and fancy means of getting rid of them,
> > > and only a few bytes extra wrt the existing function.
> > 
> > This doesn't address the issue we want to solve.
> > 
> > Say we have 1000 functions we want to trace with 1000 different
> > callbacks. Each of theses functions has one call back. How do you solve
> > that with your solution? Today, we do the list for every function. That
> > is, for each of these 1000 functions, we run through 1000 ops looking
> > for the ops that registered for this function. Not very efficient is it?
> 
> Ah, but you didn't say that, didn't you :-)

I just thought it was implied ;-)

> 
> > What we want to do today, is to create a dynamic trampoline for each of
> > theses 1000 functions. Each function will call a separate trampoline
> > that will only call the function that was registered to it. That way,
> > we can have 1000 different ops registered to 1000 different functions
> > and still have the same performance.
> 
> And how will you limit the amount of memory tied up in this? This looks
> like a good way to tie up an immense amount of memory fast.

Well, these operations are currently only allowed by root. Thus, it's
the thing that root should be careful about. The trampolines are small,
and it will take a hell of a lot of callbacks to cause issues.

The thing I'm worried about is to make sure they get freed. Otherwise a
leak will cause more issues than anything else. Which also means we
need to have a way to expedite call_rcu_tasks() if need be.

-- Steve

  reply	other threads:[~2014-08-08 14:59 UTC|newest]

Thread overview: 122+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-31 21:54 [PATCH v3 tip/core/rcu 0/9 Paul E. McKenney
2014-07-31 21:55 ` [PATCH v3 tip/core/rcu 1/9] rcu: Add call_rcu_tasks() Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 2/9] rcu: Provide cond_resched_rcu_qs() to force quiescent states in long loops Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 3/9] rcu: Add synchronous grace-period waiting for RCU-tasks Paul E. McKenney
2014-08-01 15:09     ` Oleg Nesterov
2014-08-01 18:32       ` Paul E. McKenney
2014-08-01 19:44         ` Paul E. McKenney
2014-08-02 14:47           ` Oleg Nesterov
2014-08-02 22:58             ` Paul E. McKenney
2014-08-06  0:57               ` Steven Rostedt
2014-08-06  1:21                 ` Paul E. McKenney
2014-08-06  8:47                   ` Peter Zijlstra
2014-08-06 12:09                     ` Paul E. McKenney
2014-08-06 16:30                       ` Peter Zijlstra
2014-08-06 22:45                         ` Paul E. McKenney
2014-08-07  8:45                           ` Peter Zijlstra
2014-08-07 15:00                             ` Paul E. McKenney
2014-08-07 15:26                               ` Peter Zijlstra
2014-08-07 17:27                                 ` Peter Zijlstra
2014-08-07 18:46                                   ` Peter Zijlstra
2014-08-07 19:49                                     ` Steven Rostedt
2014-08-07 19:53                                       ` Steven Rostedt
2014-08-07 20:08                                         ` Peter Zijlstra
2014-08-07 21:18                                           ` Steven Rostedt
2014-08-08  6:40                                             ` Peter Zijlstra
2014-08-08 14:12                                               ` Steven Rostedt
2014-08-08 14:28                                                 ` Paul E. McKenney
2014-08-09 10:56                                                   ` Masami Hiramatsu
2014-08-08 14:34                                                 ` Peter Zijlstra
2014-08-08 14:58                                                   ` Steven Rostedt [this message]
2014-08-08 15:16                                                     ` Peter Zijlstra
2014-08-08 15:39                                                       ` Steven Rostedt
2014-08-08 16:01                                                         ` Peter Zijlstra
2014-08-08 16:10                                                           ` Steven Rostedt
2014-08-08 16:17                                                         ` Peter Zijlstra
2014-08-08 16:40                                                           ` Steven Rostedt
2014-08-08 16:52                                                             ` Peter Zijlstra
2014-08-08 16:27                                                     ` Peter Zijlstra
2014-08-08 16:39                                                       ` Paul E. McKenney
2014-08-08 16:49                                                         ` Steven Rostedt
2014-08-08 16:51                                                         ` Peter Zijlstra
2014-08-08 17:09                                                           ` Paul E. McKenney
2014-08-08 16:43                                                       ` Steven Rostedt
2014-08-08 16:50                                                         ` Peter Zijlstra
2014-08-08 17:27                                                       ` Steven Rostedt
2014-08-09 10:36                                                         ` Masami Hiramatsu
2014-08-07 20:06                                       ` Peter Zijlstra
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 4/9] rcu: Export RCU-tasks APIs to GPL modules Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 5/9] rcutorture: Add torture tests for RCU-tasks Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 6/9] rcutorture: Add RCU-tasks test cases Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 7/9] rcu: Add stall-warning checks for RCU-tasks Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 8/9] rcu: Improve RCU-tasks energy efficiency Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 9/9] documentation: Add verbiage on RCU-tasks stall warning messages Paul E. McKenney
2014-07-31 23:57   ` [PATCH v3 tip/core/rcu 1/9] rcu: Add call_rcu_tasks() Frederic Weisbecker
2014-08-01  2:04     ` Paul E. McKenney
2014-08-01 15:06       ` Frederic Weisbecker
2014-08-01  1:15   ` Lai Jiangshan
2014-08-01  1:59     ` Paul E. McKenney
2014-08-01  1:31   ` Lai Jiangshan
2014-08-01  2:11     ` Paul E. McKenney
2014-08-01 14:11   ` Oleg Nesterov
2014-08-01 18:28     ` Paul E. McKenney
2014-08-01 18:40       ` Oleg Nesterov
2014-08-02 23:00         ` Paul E. McKenney
2014-08-03 12:57           ` Oleg Nesterov
2014-08-03 22:03             ` Paul E. McKenney
2014-08-04 13:29               ` Oleg Nesterov
2014-08-04 13:48                 ` Paul E. McKenney
2014-08-01 18:57   ` Oleg Nesterov
2014-08-02 22:50     ` Paul E. McKenney
2014-08-02 14:56   ` Oleg Nesterov
2014-08-02 22:57     ` Paul E. McKenney
2014-08-03 13:33       ` Oleg Nesterov
2014-08-03 22:05         ` Paul E. McKenney
2014-08-04  0:37           ` Lai Jiangshan
2014-08-04  1:09             ` Paul E. McKenney
2014-08-04 13:25               ` Oleg Nesterov
2014-08-04 13:51                 ` Paul E. McKenney
2014-08-04 13:52                   ` Paul E. McKenney
2014-08-04 13:32           ` Oleg Nesterov
2014-08-04 19:28             ` Paul E. McKenney
2014-08-04 19:32               ` Oleg Nesterov
2014-08-04  1:28   ` Lai Jiangshan
2014-08-04  7:46     ` Peter Zijlstra
2014-08-04  8:18       ` Lai Jiangshan
2014-08-04 11:50         ` Paul E. McKenney
2014-08-04 12:25           ` Peter Zijlstra
2014-08-04 12:37             ` Paul E. McKenney
2014-08-04 14:56             ` Peter Zijlstra
2014-08-05  0:47               ` Lai Jiangshan
2014-08-05 21:55                 ` Paul E. McKenney
2014-08-06  0:27                   ` Lai Jiangshan
2014-08-06  0:48                     ` Paul E. McKenney
2014-08-06  0:33                   ` Lai Jiangshan
2014-08-06  0:51                     ` Paul E. McKenney
2014-08-06 22:48                       ` Paul E. McKenney
2014-08-07  8:49                   ` Peter Zijlstra
2014-08-07 15:43                     ` Paul E. McKenney
2014-08-07 16:32                       ` Peter Zijlstra
2014-08-07 17:48                         ` Paul E. McKenney
2014-08-08 19:13   ` Peter Zijlstra
2014-08-08 20:58     ` Paul E. McKenney
2014-08-09  6:15       ` Peter Zijlstra
2014-08-09 12:44         ` Steven Rostedt
2014-08-09 16:05           ` Paul E. McKenney
2014-08-09 16:01         ` Paul E. McKenney
2014-08-09 18:19           ` Peter Zijlstra
2014-08-09 18:24             ` Peter Zijlstra
2014-08-10  1:29               ` Paul E. McKenney
2014-08-10  8:14                 ` Peter Zijlstra
2014-08-11  3:30                   ` Paul E. McKenney
2014-08-11 11:57                     ` Peter Zijlstra
2014-08-11 16:15                       ` Paul E. McKenney
2014-08-10  1:26             ` Paul E. McKenney
2014-08-10  8:12               ` Peter Zijlstra
2014-08-10 16:46                 ` Peter Zijlstra
2014-08-11  3:28                   ` Paul E. McKenney
2014-08-11  3:23                 ` Paul E. McKenney
2014-08-09 18:33       ` Peter Zijlstra
2014-08-10  1:38         ` Paul E. McKenney
2014-08-10 15:00           ` Peter Zijlstra
2014-08-11  3:37             ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140808105858.171da847@gandalf.local.home \
    --to=rostedt@goodmis.org \
    --cc=akpm@linux-foundation.org \
    --cc=bobby.prani@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=dipankar@in.ibm.com \
    --cc=dvhart@linux.intel.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.