All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiri Olsa <jolsa@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: "Steven Rostedt" <rostedt@goodmis.org>,
	"Masami Hiramatsu" <mhiramat@kernel.org>,
	"Andrii Nakryiko" <andrii.nakryiko@gmail.com>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Andrii Nakryiko" <andriin@fb.com>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	"Martin KaFai Lau" <kafai@fb.com>,
	"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"KP Singh" <kpsingh@chromium.org>, "Daniel Xu" <dxu@dxuuu.xyz>,
	"Jesper Brouer" <jbrouer@redhat.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Viktor Malik" <vmalik@redhat.com>
Subject: Re: [PATCHv2 RFC bpf-next 0/7] bpf: Add support for ftrace probe
Date: Wed, 21 Apr 2021 15:40:37 +0200	[thread overview]
Message-ID: <YIArVa6IE37vsazU@krava> (raw)
In-Reply-To: <CAADnVQLh3tCWi=TiWnJVaMrYhJ=j-xSrJ72+XnZDP8CMZM+1mQ@mail.gmail.com>

On Tue, Apr 20, 2021 at 04:38:45PM -0700, Alexei Starovoitov wrote:

SNIP

> > >
> > > I don't see how you can do it without BTF.
> > > The mass-attach feature should prepare generic 6 or so arguments
> > > from all functions it attached to.
> > > On x86-64 it's trivial because 6 regs are the same.
> > > On arm64 is now more challenging since return value regs overlaps with
> > > first argument, so bpf trampoline (when it's ready for arm64) will look
> > > a bit different than bpf trampoline on x86-64 to preserve arg0, arg1,
> > > ..arg6, ret
> > > 64-bit values that bpf prog expects to see.
> > > On x86-32 it's even more trickier, since the same 6 args need to be copied
> > > from a combination of regs and stack.
> > > This is not some hypothetical case. We already use BTF in x86-32 JIT
> > > and btf_func_model was introduced specifically to handle such cases.
> > > So I really don't see how ftrace can do that just yet. It has to understand BTF
> > > of all of the funcs it attaches to otherwise it's just saving all regs.
> > > That approach was a pain to deal with.
> >
> > ok, my idea was to get regs from the ftrace and have arch specific code
> > to prepare 6 (or less) args for ebpf program.. that part would be
> > already in bpf code
> >
> > so you'd like to see this functionality directly in ftrace, so we don't
> > save unneeded regs, is that right?
> 
> What do you mean by "already in bpf code" ?

that it would not be part of ftrace code

> 
> The main question is an api across layers.
> If ftrace doesn't use BTF it has to prepare all regs that could be used.
> Meaning on x86-64 that has to be 6 regs for args, 1 reg for return and
> stack pointer.
> That would be enough to discover input args and return value in fexit.
> On arm64 that has to be similar, but while x86-64 can do with single pt_regs
> where %rax is updated on fexit, arm64 cannot do so, since the same register
> is used as arg1 and as a return value.
> The most generic api between ftrace and bpf layers would be two sets of
> pt_regs. One on entry and one on exit, but that's going to be very expensive.

that's what I was going for and I think it's the only way if
we use ftrace graph_ops for mass attaching

> On x86-32 it would have to be 3 regs plus stack pointer and another 2 regs
> to cover all input args and return value.
> So there will be plenty of per-arch differences.
> 
> Jiri, if you're thinking of a bpf helper like:
> u64 bpf_read_argN(pt_regs, ip, arg_num)
> that will do lookup of btf_id from ip, then it will parse btf_id and
> function proto,
> then it will translate that to btf_func_model and finally will extract the right
> argument value from a combination of stack and regs ?
> That's doable, but it's a lot of run-time overhead.
> It would be usable by bpf progs that don't care much about run-time perf
> and don't care that they're not usable 24/7 on production systems.
> Such tools exist and they're useful,
> but I'd like this mass-attach facility to be usable everywhere
> including the production and 24/7 tracing.

I did not think of this option, but yep, seems also expensive

> Hence I think it's better to do this per-arch translation during bpf
> prog attach.
> That's exactly what bpf trampoline is doing.
> Currently it's doing for single btf_id, single trampoline, and single bpf prog.
> To make the same logic work across N attach points the trampoline logic
> would need to iterate all btf_func_model-s of all btf_id-s and generate
> M trampolines (where M < N) for a combination of possible argument passing.
> On x86-64 the M will be equal to 1. On arm64 it will be equal to 1 as well.
> But on x86-32 it will depend on a set of btf_ids. It could be 1,2,..10.
> Since bpf doesn't allow to attach to struct-by-value it's only 32-bit and 64-bit
> integers to deal with and number of combinations of possible calling conventions
> is actually very small. I suspect it won't be more than 10.
> This way there will be no additional run-time overhead and bpf programs
> can be portable. They will work as-is on x86-64, x86-32, arm64.
> Just like fentry/fexit work today. Or rather they will be portable
> when bpf trampoline is supported on these archs.
> This portability is the key feature of bpf trampoline design. The bpf trampoline
> was implemented for x86-64 only so far. Arm64 patches are still wip.
> btf_func_model is used by both x86-64 and x86-32 JITs.

ok, I understand why this would be the best solution for calling
the program from multiple probes

I think it's the 'attach' layer which is the source of problems

currently there is ftrace's fgraph_ops support that allows fast mass
attach and calls callbacks for functions entry and exit:
  https://lore.kernel.org/lkml/20190525031633.811342628@goodmis.org/

these callbacks get ip/parent_ip and can get pt_regs (that's not
implemented at the moment)

but that gets us to the situation of having full pt_regs on both
entry/exit callbacks that you described above and want to avoid,
but I think it's the price for having this on top of generic
tracing layer

the way ftrace's fgraph_ops is implemented, I'm not sure it can
be as fast as current bpf entry/exit trampoline

but to better understand the pain points I think I'll try to implement
the 'mass trampolines' call to the bpf program you described above and
attach it for now to fgraph_ops callbacks

perhaps this is a good topic to discuss in one of the Thursday's BPF mtg?

thanks,
jirka


  reply	other threads:[~2021-04-21 13:41 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-13 12:15 [PATCHv2 RFC bpf-next 0/7] bpf: Add support for ftrace probe Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 1/7] bpf: Move bpf_prog_start/end functions to generic place Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 2/7] bpf: Add bpf_functions object Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 3/7] bpf: Add support to attach program to ftrace probe Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 4/7] libbpf: Add btf__find_by_pattern_kind function Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 5/7] libbpf: Add support to load and attach ftrace probe Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 6/7] selftests/bpf: Add ftrace probe to fentry test Jiri Olsa
2021-04-13 12:15 ` [PATCHv2 RFC bpf-next 7/7] selftests/bpf: Add ftrace probe test Jiri Olsa
2021-04-14  1:04 ` [PATCHv2 RFC bpf-next 0/7] bpf: Add support for ftrace probe Andrii Nakryiko
2021-04-14 12:19   ` Jiri Olsa
2021-04-14 22:46     ` Andrii Nakryiko
2021-04-15 14:00       ` Jiri Olsa
2021-04-15 15:10       ` Steven Rostedt
2021-04-15 17:39         ` Jiri Olsa
2021-04-15 18:18           ` Steven Rostedt
2021-04-15 18:21             ` Steven Rostedt
2021-04-15 21:49               ` Jiri Olsa
2021-04-15 23:30                 ` Steven Rostedt
2021-04-19 20:51                   ` Jiri Olsa
2021-04-19 22:00                     ` Steven Rostedt
2021-04-15 18:31             ` Yonghong Song
2021-04-15 20:45         ` Andrii Nakryiko
2021-04-15 21:00           ` Steven Rostedt
2021-04-16 15:03             ` Masami Hiramatsu
2021-04-16 16:48               ` Steven Rostedt
2021-04-19 14:29                 ` Masami Hiramatsu
2021-04-20 12:51                 ` Jiri Olsa
2021-04-20 15:33                   ` Alexei Starovoitov
2021-04-20 16:33                     ` Steven Rostedt
2021-04-20 16:52                     ` Jiri Olsa
2021-04-20 23:38                       ` Alexei Starovoitov
2021-04-21 13:40                         ` Jiri Olsa [this message]
2021-04-21 14:05                           ` Steven Rostedt
2021-04-21 18:52                             ` Andrii Nakryiko
2021-04-21 19:18                               ` Jiri Olsa
2021-04-22 14:24                                 ` Steven Rostedt
2021-04-21 21:37                             ` Jiri Olsa
2021-04-22  1:17                               ` Steven Rostedt
2021-04-20  4:51               ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YIArVa6IE37vsazU@krava \
    --to=jolsa@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dxu@dxuuu.xyz \
    --cc=jbrouer@redhat.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kafai@fb.com \
    --cc=kpsingh@chromium.org \
    --cc=mhiramat@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=songliubraving@fb.com \
    --cc=toke@redhat.com \
    --cc=vmalik@redhat.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.