From: Peter Zijlstra <firstname.lastname@example.org> To: Steven Rostedt <email@example.com> Cc: Alexei Starovoitov <firstname.lastname@example.org>, Masami Hiramatsu <email@example.com>, Nikolay Borisov <firstname.lastname@example.org>, LKML <email@example.com>, Alexei Starovoitov <firstname.lastname@example.org>, bpf <email@example.com>, Josh Poimboeuf <firstname.lastname@example.org> Subject: Re: kprobes broken since 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()") Date: Tue, 2 Feb 2021 11:45:41 +0100 Message-ID: <YBktVT+z7sV/vEPU@hirez.programming.kicks-ass.net> (raw) In-Reply-To: <email@example.com> On Sat, Jan 30, 2021 at 07:44:10AM -0500, Steven Rostedt wrote: > On Sat, 30 Jan 2021 09:28:32 +0100 > Peter Zijlstra <firstname.lastname@example.org> wrote: > > > On Fri, Jan 29, 2021 at 04:24:54PM -0500, Steven Rostedt wrote: > > > Specifically, kprobe and ftrace callbacks may have this: > > > > > > if (in_nmi()) > > > return; > > > > > > raw_spin_lock_irqsave(&lock, flags); > > > [..] > > > raw_spin_unlock_irqrestore(&lock, flags); > > > > > > Which is totally fine to have, > > > > Why? There's a distinct lack of explaining here. > > > > Note that we ripped out all such dodgy locking from kretprobes. > > Actually, I think you helped explain the distinction. You mention > "kretpobes" do you mean the infrastructure of kretprobes or all its > users? kretprobe infra. > The infrastructure of ftrace and kprobes can work in any context, it > does not mean that the callbacks must. Again, these are more like > exceptions. Why have "in_nmi()"? If anything that can be called by an > NMI should just work, right? That's basically your argument for having > ftrace and kprobes set "in_nmi". > > You can have locking in NMIs if the locking is *only* in NMI handlers, > right? If that's the case, then so should ftrace and kprobe callbacks. Which is still dodgy as heck. NMIs _can_ nest. Now mostly it doesn't happen, and the few sites that do use spinlocks from NMI context are sure to use it from a specific NMI 'handler' context which typically don't nest. But I still utterly detest the idea of using spinlocks from NMI. It's inherently fragile. > The stack tracer checks the size of the stack, compares it to the > largest recorded size, and if it's bigger, it will save the stack. But > if this happens on two CPUs at the same time, only one can do the > recording at the same time. To synchronize this, a spin lock must be > taken. Similar to spin locks in an NMI. That sounds like something cmpxchg() should be able to do. Have a per-cpu stack trace buffer and a global max one, when cpu local exceeds previous max, cmpxchg the buffer. > But the problem here is, the callbacks can also be done from an NMI > context, so if we are in NMI, we don't want to take any locks, and > simply don't record the stack traces from NMIs. Which is obviously shit :-) The NMI might have interesting stack usage. > The more I think about it, the more I hate the idea that ftrace > callbacks and kprobes are considered NMIs. Simply because they are not! Yet they happen when IRQs are off, so they are ;-) Also, given how everything can nest, it had better all be lockless anyway. You can get your regular function trace interrupted, which can hit a #DB, which can function trace, which can #BP which can function trace again which can get #NMI etc.. Many wonderfun nestings possible. And god knows what these handlers end up calling. The only sane approach is treating it all as NMI and having it all lockless.
next prev parent reply index Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top [not found] <email@example.com> [not found] ` <firstname.lastname@example.org> [not found] ` <email@example.com> [not found] ` <firstname.lastname@example.org> [not found] ` <email@example.com> 2021-01-28 3:38 ` Masami Hiramatsu 2021-01-28 7:11 ` Nikolay Borisov 2021-01-28 16:12 ` Nikolay Borisov 2021-01-28 16:45 ` Nikolay Borisov 2021-01-28 16:50 ` Josh Poimboeuf 2021-01-28 21:52 ` [PATCH] x86: Disable CET instrumentation in the kernel Josh Poimboeuf 2021-01-29 6:23 ` Nikolay Borisov 2021-01-29 10:21 ` Borislav Petkov [not found] ` <20210129151034.iba4eaa2fuxsipqa@treble> 2021-01-29 16:30 ` Borislav Petkov 2021-01-29 16:49 ` Josh Poimboeuf 2021-01-29 16:54 ` Nikolay Borisov 2021-01-29 17:03 ` Josh Poimboeuf 2021-01-29 17:07 ` Borislav Petkov 2021-01-29 17:58 ` Seth Forshee 2021-01-28 18:24 ` kprobes broken since 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()") Peter Zijlstra 2021-01-29 1:34 ` Alexei Starovoitov 2021-01-29 6:36 ` Nikolay Borisov [not found] ` <YBPNyRyrkzw2echi@hirez.programming.kicks-ass.net> [not found] ` <firstname.lastname@example.org> [not found] ` <email@example.com> 2021-01-29 16:24 ` Peter Zijlstra 2021-01-29 17:45 ` Alexei Starovoitov 2021-01-29 17:59 ` Peter Zijlstra 2021-01-29 19:01 ` Steven Rostedt 2021-01-29 21:05 ` Alexei Starovoitov 2021-01-30 1:41 ` Masami Hiramatsu 2021-01-29 21:24 ` Steven Rostedt 2021-01-30 8:28 ` Peter Zijlstra 2021-01-30 12:44 ` Steven Rostedt 2021-02-02 10:45 ` Peter Zijlstra [this message] 2021-02-02 14:52 ` Steven Rostedt 2021-02-02 16:45 ` Peter Zijlstra 2021-02-02 16:56 ` Steven Rostedt 2021-02-02 18:30 ` Peter Zijlstra 2021-02-02 21:05 ` Steven Rostedt 2021-02-03 13:33 ` Masami Hiramatsu 2021-02-03 13:52 ` Steven Rostedt 2021-01-30 2:02 ` Masami Hiramatsu 2021-01-30 3:08 ` Alexei Starovoitov 2021-01-30 12:10 ` Masami Hiramatsu
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=YBktVT+z7sV/vEPU@hirez.programming.kicks-ass.net \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
BPF Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \ email@example.com public-inbox-index bpf Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.bpf AGPL code for this site: git clone https://public-inbox.org/public-inbox.git