LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Masami Hiramatsu <mhiramat@kernel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Alexei Starovoitov <ast@kernel.org>, paulmck <paulmck@kernel.org>,
	"Joel Fernandes, Google" <joel@joelfernandes.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Jason Wessel <jason.wessel@windriver.com>
Subject: Re: Instrumentation and RCU
Date: Wed, 11 Mar 2020 09:18:15 +0900
Message-ID: <20200311091815.fce458348bb7641b60f600d9@kernel.org> (raw)
In-Reply-To: <1760242532.23694.1583857291763.JavaMail.zimbra@efficios.com>

Hi Mathieu,

On Tue, 10 Mar 2020 12:21:31 -0400 (EDT)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> ----- On Mar 10, 2020, at 11:46 AM, rostedt rostedt@goodmis.org wrote:
> 
> > On Tue, 10 Mar 2020 11:31:51 -0400 (EDT)
> > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> > 
> >> I think there are two distinct problems we are trying to solve here,
> >> and it would be good to spell them out to see which pieces of technical
> >> solution apply to which.
> >> 
> >> Problem #1) Tracer invoked from partially initialized kernel context
> >> 
> >>   - Moving the early/late entry/exit points into sections invisible from
> >>     instrumentation seems to make tons of sense for this.
> >> 
> >> Problem #2) Tracer recursion
> >> 
> >>   - I'm much less convinced that hiding entry points from instrumentation
> >>     works for this. As an example, with the isntr_begin/end() approach you
> >>     propose above, as soon as you have a tracer recursing into itself because
> >>     something below do_stuff() has been instrumented, having hidden the entry
> >>     point did not help at all.
> >> 
> >> So I would be tempted to use the "hide entry/exit points" with explicit
> >> instr begin/end annotation to solve Problem #1, but I'm still thinking there
> >> is value in the per recursion context "in_tracing" flag to prevent tracer
> >> recursion.
> > 
> > The only recursion issue that I've seen discussed is breakpoints. And
> > that's outside of the tracer infrastructure. Basically, if someone added a
> > breakpoint for a kprobe on something that gets called in the int3 code
> > before kprobes is called we have (let's say rcu_nmi_enter()):
> > 
> > 
> > rcu_nmi_enter();
> >  <int3>
> >     do_int3() {
> >        rcu_nmi_enter();
> >          <int3>
> >             do_int3();
> >                [..]
> > 
> > Where would a "in_tracer" flag help here? Perhaps a "in_breakpoint" could?
> 
> An approach where the "in_tracer" flag is tested and set by the instrumentation
> (function tracer, kprobes, tracepoints) would work here. Let's say the beginning
> of the int3 ISR is part of the code which is invisible to instrumentation, and
> before we issue rcu_nmi_enter(), we handle the in_tracer flag:
> 
> rcu_nmi_enter();
>  <int3>
>     (recursion_ctx->in_tracer == false)
>     set recursion_ctx->in_tracer = true
>     do_int3() {
>        rcu_nmi_enter();
>          <int3>
>             if (recursion_ctx->in_tracer == true)
>                 iret
> 
> We can change "in_tracer" for "in_breakpoint", "in_tracepoint" and
> "in_function_trace" if we ever want to allow different types of instrumentation
> to nest. I'm not sure whether this is useful or not through.

Kprobes already has its own "in_kprobe" flag, and the recursion path is
not so simple. Since the int3 replaces the original instruction, we have to
execute the original instruction with single-step and fixup.

This means it involves do_debug() too. Thus, we can not do iret directly
from do_int3 like above, but if recursion happens, we have no way to
recover to origonal execution path (and call BUG()).

As my previous email, I showed a patch which is something like
"bust_kprobes()" for oops path. That is not safe but no other way to escape
from this recursion hell. (Maybe we can try to call it instead of calling
BUG() so that the kernel can continue to run, but I'm not sure we can
safely make the pagetable to readonly again.)

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

  reply index

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-09 17:02 Thomas Gleixner
2020-03-09 18:15 ` Steven Rostedt
2020-03-09 18:42   ` Joel Fernandes
2020-03-09 19:07     ` Steven Rostedt
2020-03-09 19:20       ` Mathieu Desnoyers
2020-03-16 15:02       ` Joel Fernandes
2020-03-09 18:59   ` Thomas Gleixner
2020-03-10  8:09     ` Masami Hiramatsu
2020-03-10 11:43       ` Thomas Gleixner
2020-03-10 15:31         ` Mathieu Desnoyers
2020-03-10 15:46           ` Steven Rostedt
2020-03-10 16:21             ` Mathieu Desnoyers
2020-03-11  0:18               ` Masami Hiramatsu [this message]
2020-03-11  0:37                 ` Mathieu Desnoyers
2020-03-11  7:48                   ` Masami Hiramatsu
2020-03-10 16:06         ` Masami Hiramatsu
2020-03-12 13:53         ` Peter Zijlstra
2020-03-10 15:24       ` Mathieu Desnoyers
2020-03-10 17:05       ` Daniel Thompson
2020-03-09 18:37 ` Mathieu Desnoyers
2020-03-09 18:44   ` Steven Rostedt
2020-03-09 18:52     ` Mathieu Desnoyers
2020-03-09 19:09       ` Steven Rostedt
2020-03-09 19:25         ` Mathieu Desnoyers
2020-03-09 19:52   ` Thomas Gleixner
2020-03-10 15:03     ` Mathieu Desnoyers
2020-03-10 16:48       ` Thomas Gleixner
2020-03-10 17:40         ` Mathieu Desnoyers
2020-03-10 18:31           ` Thomas Gleixner
2020-03-10 18:37             ` Mathieu Desnoyers
2020-03-10  1:40   ` Alexei Starovoitov
2020-03-10  8:02     ` Thomas Gleixner
2020-03-10 16:54     ` Paul E. McKenney
2020-03-17 17:56     ` Joel Fernandes
2020-03-09 20:18 ` Peter Zijlstra
2020-03-09 20:47 ` Paul E. McKenney
2020-03-09 20:58   ` Steven Rostedt
2020-03-09 21:25     ` Paul E. McKenney
2020-03-09 23:52   ` Frederic Weisbecker
2020-03-10  2:26     ` Paul E. McKenney
2020-03-10 15:13   ` Mathieu Desnoyers
2020-03-10 16:49     ` Paul E. McKenney
2020-03-10 17:22       ` Mathieu Desnoyers
2020-03-10 17:26         ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200311091815.fce458348bb7641b60f600d9@kernel.org \
    --to=mhiramat@kernel.org \
    --cc=ast@kernel.org \
    --cc=frederic@kernel.org \
    --cc=jason.wessel@windriver.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git