bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Masami Hiramatsu <masami.hiramatsu@gmail.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nikolay Borisov <nborisov@suse.com>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	bpf@vger.kernel.org
Subject: Re: kprobes broken since 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")
Date: Thu, 28 Jan 2021 12:38:42 +0900	[thread overview]
Message-ID: <20210128123842.c9e33949e62f504b84bfadf5@gmail.com> (raw)
In-Reply-To: <20210128103415.d90be51ec607bb6123b2843c@kernel.org>

Hi,

On Thu, 28 Jan 2021 10:34:15 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> On Wed, 27 Jan 2021 19:57:56 +0200
> Nikolay Borisov <nborisov@suse.com> wrote:
> 
> > 
> > 
> > On 27.01.21 г. 17:24 ч., Masami Hiramatsu wrote:
> > > On Thu, 28 Jan 2021 00:13:53 +0900
> > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > 
> > >> Hi Nikolay,
> > >>
> > >> On Wed, 27 Jan 2021 15:43:29 +0200
> > >> Nikolay Borisov <nborisov@suse.com> wrote:
> > >>
> > >>> Hello,
> > >>>
> > >>> I'm currently seeing latest Linus' master being somewhat broken w.r.t
> > >>> krpobes. In particular I have the following test-case:
> > >>>
> > >>> #!/bin/bash
> > >>>
> > >>> mkfs.btrfs -f /dev/vdc &> /dev/null
> > >>> mount /dev/vdc /media/scratch/
> > >>>
> > >>> bpftrace -e 'kprobe:btrfs_sync_file {printf("kprobe: %s\n", kstack());}'
> > >>> &>bpf-output &
> > >>> bpf_trace_pid=$!
> > >>>
> > >>> # force btrfs_sync_file to be called
> > >>> sleep 2
> > >>> xfs_io -f -c "pwrite 0 4m" -c "fsync" /media/scratch/file5
> > >>>
> > >>> kill $bpf_trace_pid
> > >>> sleep 1
> > >>>
> > >>> grep -q kprobe bpf-output
> > >>> retval=$?
> > >>> rm -f bpf-output
> > >>> umount /media/scratch
> > >>>
> > >>> exit $retval
> > >>>
> > >>> It traces btrfs_sync_file which is called when fsync is executed on a
> > >>> btrfs file, however I don't see the stacktrace being printed i.e the
> > >>> kprobe doesn't fire at all. The following alternative program:
> > >>>
> > >>> bpftrace -e 'tracepoint:btrfs:btrfs_sync_file {printf("tracepoint:
> > >>> %s\n", kstack());} kprobe:btrfs_sync_file {printf("kprobe: %s\n",
> > >>> kstack());}'
> > >>>
> > >>> only prints the stack from the tracepoint and not from the kprobe, given
> > >>> that the tracepoint is called from the btrfs_sync_file function.
> > >>
> > >> Thank you for reporting!
> > >>
> > >> If you don't mind, could you confirm it with ftrace (tracefs)?
> > >> bpftrace etc. involves too many things. It is better to test with
> > >> simpler way to test it.
> > >> I'm not familer with the bpftrace, but I think you can check it with
> > >>
> > >> # cd /sys/kernel/debug/tracing
> > >> # echo p:myevent btrfs_sync_file >> kprobe_events
> > >> # echo stacktrace > events/kprobes/myevent/trigger
> > >>  (or echo 1 > options/stacktrace , if trigger file doesn't exist)
> > > 
> > > Of course, also you have to enable the event.
> > >  # echo 1 > events/kprobes/myevent/enable
> > > 
> > > And check the results
> > > 
> > >  # cat trace
> > > 
> > > 
> > >> Could you also share your kernel config, so that we can reproduce it?
> > > 
> > 
> > I've attached the config and indeed with the scenario you proposed it
> > seems to works. I see:
> > 
> >        xfs_io-20280   [000] d.Z.  9900.748633: myevent:
> > (btrfs_sync_file+0x0/0x580)
> >           xfs_io-20280   [000] d.Z.  9900.748647: <stack trace>
> >  => kprobe_trace_func
> >  => kprobe_dispatcher
> >  => kprobe_int3_handler
> >  => exc_int3
> >  => asm_exc_int3
> >  => btrfs_sync_file
> >  => do_fsync
> >  => __x64_sys_fsync
> >  => do_syscall_64
> >  => entry_SYSCALL_64_after_hwframe
> 
> Hmm, then there might be a problem in bpftrace or ebpf (need more info).
> At least kprobes itself isn't broken.
> I guess they check "in_nmi()" and skip such event?

Yeah, there is. Nikolay, could you try this tentative patch?

Of course this just drops the NMI check from the handler, so alternative
checker is required. But I'm not sure what the original code concerns.
As far as I can see, there seems no re-entrant block flag, nor locks
among ebpf programs in runtime.

Alexei, could you tell me what is the concerning situation for bpf?

Thank you,

From c5cd0e5f60ef6494c9e1579ec1b82b7344c41f9a Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <mhiramat@kernel.org>
Date: Thu, 28 Jan 2021 12:31:02 +0900
Subject: [PATCH] tracing: bpf: Remove in_nmi() check from kprobe handler

Since commit 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()") has
changed the kprobe handler to run in the NMI context, in_nmi() always returns
true. This means the bpf events on kprobes always skipped.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 kernel/trace/bpf_trace.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 6c0018abe68a..764400260eb6 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -96,9 +96,6 @@ unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
 {
 	unsigned int ret;
 
-	if (in_nmi()) /* not supported yet */
-		return 1;
-
 	cant_sleep();
 
 	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
-- 
2.25.1

-- 
Masami Hiramatsu

       reply	other threads:[~2021-01-28  4:10 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <25cd2608-03c2-94b8-7760-9de9935fde64@suse.com>
     [not found] ` <20210128001353.66e7171b395473ef992d6991@kernel.org>
     [not found]   ` <20210128002452.a79714c236b69ab9acfa986c@kernel.org>
     [not found]     ` <a35a6f15-9ab1-917c-d443-23d3e78f2d73@suse.com>
     [not found]       ` <20210128103415.d90be51ec607bb6123b2843c@kernel.org>
2021-01-28  3:38         ` Masami Hiramatsu [this message]
2021-01-28  7:11           ` kprobes broken since 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()") Nikolay Borisov
2021-01-28 16:12           ` Nikolay Borisov
2021-01-28 16:45             ` Nikolay Borisov
2021-01-28 16:50               ` Josh Poimboeuf
2021-01-28 21:52                 ` [PATCH] x86: Disable CET instrumentation in the kernel Josh Poimboeuf
2021-01-29  6:23                   ` Nikolay Borisov
2021-01-29 10:21                   ` Borislav Petkov
     [not found]                     ` <20210129151034.iba4eaa2fuxsipqa@treble>
2021-01-29 16:30                       ` Borislav Petkov
2021-01-29 16:49                         ` Josh Poimboeuf
2021-01-29 16:54                           ` Nikolay Borisov
2021-01-29 17:03                             ` Josh Poimboeuf
2021-01-29 17:07                               ` Borislav Petkov
2021-01-29 17:58                                 ` Seth Forshee
2021-01-28 18:24               ` kprobes broken since 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()") Peter Zijlstra
2021-01-29  1:34                 ` Alexei Starovoitov
2021-01-29  6:36                   ` Nikolay Borisov
     [not found]                   ` <YBPNyRyrkzw2echi@hirez.programming.kicks-ass.net>
     [not found]                     ` <20210129224011.81bcdb3eba1227c414e69e1f@kernel.org>
     [not found]                       ` <20210129105952.74dc8464@gandalf.local.home>
2021-01-29 16:24                         ` Peter Zijlstra
2021-01-29 17:45                           ` Alexei Starovoitov
2021-01-29 17:59                             ` Peter Zijlstra
2021-01-29 19:01                               ` Steven Rostedt
2021-01-29 21:05                                 ` Alexei Starovoitov
2021-01-30  1:41                                   ` Masami Hiramatsu
2021-01-29 21:24                                 ` Steven Rostedt
2021-01-30  8:28                                   ` Peter Zijlstra
2021-01-30 12:44                                     ` Steven Rostedt
2021-02-02 10:45                                       ` Peter Zijlstra
2021-02-02 14:52                                         ` Steven Rostedt
2021-02-02 16:45                                           ` Peter Zijlstra
2021-02-02 16:56                                             ` Steven Rostedt
2021-02-02 18:30                                               ` Peter Zijlstra
2021-02-02 21:05                                                 ` Steven Rostedt
2021-02-03 13:33                                                   ` Masami Hiramatsu
2021-02-03 13:52                                                     ` Steven Rostedt
2021-01-30  2:02                               ` Masami Hiramatsu
2021-01-30  3:08                                 ` Alexei Starovoitov
2021-01-30 12:10                                   ` Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210128123842.c9e33949e62f504b84bfadf5@gmail.com \
    --to=masami.hiramatsu@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=nborisov@suse.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).