From: "Paul E. McKenney" <paulmck@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Joel Fernandes <joel@joelfernandes.org>,
Frederic Weisbecker <frederic@kernel.org>
Subject: Re: Instrumentation and RCU
Date: Mon, 9 Mar 2020 13:47:10 -0700
Message-ID: <20200309204710.GU2935@paulmck-ThinkPad-P72>
In-Reply-To: <87mu8p797b.fsf@nanos.tec.linutronix.de>

On Mon, Mar 09, 2020 at 06:02:32PM +0100, Thomas Gleixner wrote:
> Folks,
>
> I'm starting a new conversation because there are about 20 different
> threads which look at that problem in various ways and the information
> is so scattered that creating a coherent picture is pretty much
> impossible.
>
> There are several problems to solve:
>
> 1) Fragile low level entry code
>
> 2) Breakpoint utilization
>
> 3) RCU idle
>
> 4) Callchain protection
>
> #1 Fragile low level entry code
>
> While I understand the desire of instrumentation to observe
> everything we really have to ask the question whether it is worth the
> trouble especially with entry trainwrecks like x86, PTI and other
> horrors in that area.
>
> I don't think so and we really should just bite the bullet and forbid
> any instrumentation in that code unless it is explicitly designed
> for that case, makes sense and has a real value from an observation
> perspective.
>
> This is very much related to #3.
>
> #2 Breakpoint utilization
>
> As recent findings have shown, breakpoint utilization needs to be
> extremely careful about not creating infinite breakpoint recursions.
>
> I think that's pretty much obvious, but falls into the overall
> question of how to protect callchains.
>
> #3) RCU idle
>
> Being able to trace code inside RCU idle sections is very similar to
> the question raised in #1.
>
> Assume all of the instrumentation would be doing conditional RCU
> schemes, i.e.:
>
>     if (rcuidle)
>         ....
>     else
>         rcu_read_lock_sched()
>
> before invoking the actual instrumentation functions and of course
> undoing that right after it, that really begs the question whether
> it's worth it.
>
> Especially constructs like:
>
>     trace_hardirqs_off()
>       idx = srcu_read_lock()
>       rcu_irq_enter_irqson();
>       ...
>       rcu_irq_exit_irqson();
>       srcu_read_unlock(idx);
>
>     if (user_mode)
>         user_exit_irqsoff();
>     else
>         rcu_irq_enter();
>
> are really more than questionable. For 99.9999% of instrumentation
> users it's absolutely irrelevant whether this traces the interrupt
> disabled time of user_exit_irqsoff() or rcu_irq_enter() or not.
>
> But what's relevant is the tracer overhead which is e.g. inflicted
> with today's trace_hardirqs_off/on() implementation because that
> unconditionally uses the rcuidle variant with the srcu/rcu_irq dance
> around every tracepoint.
>
> Even if the tracepoint sits in the ASM code it just covers about ~20
> low level ASM instructions more. The tracer invocation, which is
> even done twice when coming from user space on x86 (the second call
> is optimized in the tracer C-code), costs definitely way more
> cycles. When you take the srcu/rcu_irq dance into account it's a
> complete disaster performance wise.

Suppose that we had a variant of RCU that had about the same read-side
overhead as Preempt-RCU, but which could be used from idle as well as
from CPUs in the process of coming online or going offline?  I have not
thought through the irq/NMI/exception entry/exit cases, but I don't see
why that would be a problem.

This would have explicit critical-section entry/exit code, so it would
not be any help for trampolines.

Would such a variant of RCU help?

Yeah, I know. Just what the kernel doesn't need, yet another variant
of RCU...
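
To make "explicit critical-section entry/exit code" a bit more concrete,
here is a single-threaded user-space toy model in C.  It is only a sketch
under stated assumptions: every name in it (toy_read_lock(),
toy_read_unlock(), toy_readers_quiescent(), struct toy_cpu) is hypothetical
and is not a kernel API, and it leaves out the memory ordering, per-CPU
data, and grace-period machinery that any real variant would need.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical per-CPU state; none of these names are kernel APIs. */
struct toy_cpu {
	bool idle;           /* CPU is in the idle loop */
	int  reader_nesting; /* depth of explicit toy read-side sections */
};

static struct toy_cpu toy_cpus[TOY_NR_CPUS];

/* Explicit entry marker: allowed even when the CPU is marked idle. */
static void toy_read_lock(int cpu)
{
	toy_cpus[cpu].reader_nesting++;
}

/* Explicit exit marker: must pair with an earlier toy_read_lock(). */
static void toy_read_unlock(int cpu)
{
	assert(toy_cpus[cpu].reader_nesting > 0);
	toy_cpus[cpu].reader_nesting--;
}

/*
 * True when no CPU is inside a toy reader.  The idle flag is
 * deliberately ignored: an idle CPU with an active reader still
 * holds off the (toy) updater.
 */
static bool toy_readers_quiescent(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_cpus[cpu].reader_nesting != 0)
			return false;
	return true;
}
```

In this model, a reader on a CPU with idle set still makes
toy_readers_quiescent() return false until the matching
toy_read_unlock().  Code running in a trampoline has no place to call the
two markers, which is why explicit markers do not help there.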

							Thanx, Paul

> #4 Protecting call chains
>
> Our current approach of annotating functions with notrace/noprobe is
> pretty much broken.
>
> Functions which are marked NOPROBE or notrace call out into functions
> which are not marked and while this might be ok, there are enough
> places where it is not. But we have no way to verify that.
>
> That's just a recipe for disaster. We really cannot request from
> sysadmins who want to use instrumentation to stare at the code first
> whether they can place/enable an instrumentation point somewhere.
> That'd be just a bad joke.
>
> I really think we need to have proper text sections which are off
> limits for any form of instrumentation and have tooling to analyze the
> calls into other sections. These calls need to be annotated as safe
> and intentional.
>
> Thoughts?
>
> Thanks,
>
> tglx