* [PATCH bpf-next] bpf: mark kprobe_multi_link_prog_run as always inlined function
@ 2024-03-20 20:06 Andrii Nakryiko
  2024-03-21  6:55 ` Alexei Starovoitov
  0 siblings, 1 reply; 4+ messages in thread
From: Andrii Nakryiko @ 2024-03-20 20:06 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

kprobe_multi_link_prog_run() is called both for multi-kprobe and
multi-kretprobe BPF programs from kprobe_multi_link_handler() and
kprobe_multi_link_exit_handler(), respectively.
kprobe_multi_link_prog_run() is doing all the relevant work, with those
wrappers just satisfying ftrace's interfaces (kprobe callback is
supposed to return int, while kretprobe returns void).

With this structure, the compiler performs tail-call optimization:

Dump of assembler code for function kprobe_multi_link_exit_handler:
   0xffffffff8122f1e0 <+0>:     add    $0xffffffffffffffc0,%rdi
   0xffffffff8122f1e4 <+4>:     mov    %rcx,%rdx
   0xffffffff8122f1e7 <+7>:     jmp    0xffffffff81230080 <kprobe_multi_link_prog_run>

This means that when trying to capture LBR that traces all indirect branches
we are wasting an entry just to record that kprobe_multi_link_exit_handler
called/jumped into kprobe_multi_link_prog_run.

LBR entries are especially sparse on AMD CPUs (just 16 entries on latest CPUs
vs typically 32 on latest Intel CPUs), and every entry counts (and we already
have a bunch of other LBR entries spent getting to a BPF program), so it would
be great to not waste any more than necessary.

Marking it as just `static inline` doesn't change anything; the compiler
still performs only the tail-call optimization. But by marking
kprobe_multi_link_prog_run() as __always_inline we ensure that the
compiler fully inlines it, avoiding the jump:

Dump of assembler code for function kprobe_multi_link_exit_handler:
   0xffffffff8122f4e0 <+0>:     push   %r15
   0xffffffff8122f4e2 <+2>:     push   %r14
   0xffffffff8122f4e4 <+4>:     push   %r13
   0xffffffff8122f4e6 <+6>:     push   %r12
   0xffffffff8122f4e8 <+8>:     push   %rbx
   0xffffffff8122f4e9 <+9>:     sub    $0x10,%rsp
   0xffffffff8122f4ed <+13>:    mov    %rdi,%r14
   0xffffffff8122f4f0 <+16>:    lea    -0x40(%rdi),%rax

   ...

   0xffffffff8122f590 <+176>:   call   0xffffffff8108e420 <sched_clock>
   0xffffffff8122f595 <+181>:   sub    %r14,%rax
   0xffffffff8122f598 <+184>:   add    %rax,0x8(%rbx,%r13,1)
   0xffffffff8122f59d <+189>:   jmp    0xffffffff8122f541 <kprobe_multi_link_exit_handler+97>

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/trace/bpf_trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 434e3ece6688..0bebd6f02e17 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2796,7 +2796,7 @@ static u64 bpf_kprobe_multi_entry_ip(struct bpf_run_ctx *ctx)
 	return run_ctx->entry_ip;
 }
 
-static int
+static __always_inline int
 kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
 			   unsigned long entry_ip, struct pt_regs *regs)
 {
-- 
2.43.0



* Re: [PATCH bpf-next] bpf: mark kprobe_multi_link_prog_run as always inlined function
  2024-03-20 20:06 [PATCH bpf-next] bpf: mark kprobe_multi_link_prog_run as always inlined function Andrii Nakryiko
@ 2024-03-21  6:55 ` Alexei Starovoitov
  2024-03-21  7:02   ` Alexei Starovoitov
  0 siblings, 1 reply; 4+ messages in thread
From: Alexei Starovoitov @ 2024-03-21  6:55 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Kernel Team

On Wed, Mar 20, 2024 at 1:06 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Dump of assembler code for function kprobe_multi_link_exit_handler:
>    0xffffffff8122f1e0 <+0>:     add    $0xffffffffffffffc0,%rdi
>    0xffffffff8122f1e4 <+4>:     mov    %rcx,%rdx
>    0xffffffff8122f1e7 <+7>:     jmp    0xffffffff81230080 <kprobe_multi_link_prog_run>
>
> This means that when trying to capture LBR that traces all indirect branches
> we are wasting an entry just to record that kprobe_multi_link_exit_handler
> called/jumped into kprobe_multi_link_prog_run.

I don't follow.
If LBR was configured to trace indirect branches then this direct jmp
shouldn't be recorded.
If LBR is capturing stack frames (those should be call/ret instructions)
then this jmp also won't be recorded.
If LBR is actually capturing all indirect, direct and conditional
jmps, and calls then saving an entry won't make a difference.

Maybe it's an LBR configuration issue on the retsnoop side?


* Re: [PATCH bpf-next] bpf: mark kprobe_multi_link_prog_run as always inlined function
  2024-03-21  6:55 ` Alexei Starovoitov
@ 2024-03-21  7:02   ` Alexei Starovoitov
  2024-03-21 16:04     ` Andrii Nakryiko
  0 siblings, 1 reply; 4+ messages in thread
From: Alexei Starovoitov @ 2024-03-21  7:02 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Kernel Team

On Wed, Mar 20, 2024 at 11:55 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Mar 20, 2024 at 1:06 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Dump of assembler code for function kprobe_multi_link_exit_handler:
> >    0xffffffff8122f1e0 <+0>:     add    $0xffffffffffffffc0,%rdi
> >    0xffffffff8122f1e4 <+4>:     mov    %rcx,%rdx
> >    0xffffffff8122f1e7 <+7>:     jmp    0xffffffff81230080 <kprobe_multi_link_prog_run>
> >
> > This means that when trying to capture LBR that traces all indirect branches
> > we are wasting an entry just to record that kprobe_multi_link_exit_handler
> > called/jumped into kprobe_multi_link_prog_run.
>
> I don't follow.
> If LBR was configured to trace indirect branches then this direct jmp
> shouldn't be recorded.
> If LBR is capturing stack frames (those should be call/ret instructions)
> then this jmp also won't be recorded.
> If LBR is actually capturing all indirect, direct and conditional
> jmps, and calls then saving an entry won't make a difference.
>
> Maybe it's an LBR configuration issue on the retsnoop side?

furthermore.
This 'jmp kprobe_multi_link_prog_run' is not any different than
"goto" in C code and other unconditional jmp-s within functions.
Something doesn't add up that this particular jmp stands out
from other similar jmps before and after. From cpu pov
the tail call jmp is the same as jmp within a function.


* Re: [PATCH bpf-next] bpf: mark kprobe_multi_link_prog_run as always inlined function
  2024-03-21  7:02   ` Alexei Starovoitov
@ 2024-03-21 16:04     ` Andrii Nakryiko
  0 siblings, 0 replies; 4+ messages in thread
From: Andrii Nakryiko @ 2024-03-21 16:04 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team

On Thu, Mar 21, 2024 at 12:02 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Mar 20, 2024 at 11:55 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Mar 20, 2024 at 1:06 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> > >
> > > Dump of assembler code for function kprobe_multi_link_exit_handler:
> > >    0xffffffff8122f1e0 <+0>:     add    $0xffffffffffffffc0,%rdi
> > >    0xffffffff8122f1e4 <+4>:     mov    %rcx,%rdx
> > >    0xffffffff8122f1e7 <+7>:     jmp    0xffffffff81230080 <kprobe_multi_link_prog_run>
> > >
> > > This means that when trying to capture LBR that traces all indirect branches
> > > we are wasting an entry just to record that kprobe_multi_link_exit_handler
> > > called/jumped into kprobe_multi_link_prog_run.
> >
> > I don't follow.
> > If LBR was configured to trace indirect branches then this direct jmp
> > shouldn't be recorded.
> > If LBR is capturing stack frames (those should be call/ret instructions)
> > then this jmp also won't be recorded.
> > If LBR is actually capturing all indirect, direct and conditional
> > jmps, and calls then saving an entry won't make a difference.

I was imprecise with my wording, sorry. It's not "indirect branches",
it's any change of linear execution of code. So any jump (conditional
or not) and any call add LBR record, as you mention in this case.

This is the --lbr=any option in retsnoop, which is used to "look" into
the last function that fails to see which condition (out of many)
caused an error. Very sensitive but extremely useful mode, so
minimizing the amount of wasted LBR records is very important.

Not sure why you are saying it won't make any difference? I'm working
on some other changes to eliminate a few more calls/jmps/branches. It
all adds up.


> >
> > Maybe it's an LBR configuration issue on the retsnoop side?
>
> furthermore.
> This 'jmp kprobe_multi_link_prog_run' is not any different than
> "goto" in C code and other unconditional jmp-s within functions.
> Something doesn't add up that this particular jmp stands out
> from other similar jmps before and after. From cpu pov
> the tail call jmp is the same as jmp within a function.

