From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Jiri Olsa <jolsa@redhat.com>
Cc: "Steven Rostedt" <rostedt@goodmis.org>,
"Jiri Olsa" <jolsa@kernel.org>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Andrii Nakryiko" <andriin@fb.com>,
Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
"Martin KaFai Lau" <kafai@fb.com>,
"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"KP Singh" <kpsingh@chromium.org>, "Daniel Xu" <dxu@dxuuu.xyz>,
"Jesper Brouer" <jbrouer@redhat.com>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
"Viktor Malik" <vmalik@redhat.com>
Subject: Re: [RFC bpf-next 09/16] bpf: Add BPF_TRAMPOLINE_BATCH_ATTACH support
Date: Mon, 26 Oct 2020 16:19:30 -0700
Message-ID: <CAEf4Bza0+MHuRneepidvXFZGZJ+hnMtaJpCq7EU=pvZHW7FD9w@mail.gmail.com>
In-Reply-To: <20201025194123.GD2681365@krava>
On Sun, Oct 25, 2020 at 12:41 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Oct 23, 2020 at 03:23:10PM -0700, Andrii Nakryiko wrote:
> > On Fri, Oct 23, 2020 at 1:31 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > On Fri, 23 Oct 2020 13:03:22 -0700
> > > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > > Basically, maybe ftrace subsystem could provide a set of APIs to
> > > > prepare a set of functions to attach to. Then BPF subsystem would just
> > > > do what it does today, except instead of attaching to a specific
> > > > kernel function, it would attach to ftrace's placeholder. I don't know
> > > > anything about ftrace implementation, so this might be far off. But I
> > > > thought that looking at this problem from a bit of a different angle
> > > > would benefit the discussion. Thoughts?
> > >
> > > I probably understand bpf internals as much as you understand ftrace
> > > internals ;-)
> > >
> >
> > Heh :) But while we are here, what do you think about this idea of
> > preparing a no-op trampoline that a bunch (thousands, potentially) of
> > function entries will jump to? Once all of that is ready and patched
> > into the kernel functions' entry points, we'd allow attaching a BPF
> > program or ftrace callback (if I get the terminology right) in one
> > fast and simple operation. For users that would mean that they will
> > either get calls for all or none of the attached kfuncs, with simple
> > and reliable semantics.
>
> so the main pain point the batch interface is addressing, is that
> every attach (BPF_RAW_TRACEPOINT_OPEN command) calls register_ftrace_direct,
> and you'll need to do the same for nop trampoline, no?
I guess I was hoping that if we know we are installing a nop, we can
do it without the extra waiting, which should speed it up quite a
bit.
>
> I wonder if we could create some 'transaction object' represented
> by fd and add it to bpf_attr::raw_tracepoint
>
> then attach (BPF_RAW_TRACEPOINT_OPEN command) would add program to this
> new 'transaction object' instead of updating ftrace directly
>
> and when the collection is done (all BPF_RAW_TRACEPOINT_OPEN commands
> are executed), we'd call a new bpf syscall command on that transaction
> and it would call the ftrace interface
>
This is conceptually close to what I had in mind, but I was thinking
of a single BPF program attached to many kernel functions, which is
impossible today, as you mentioned in another thread.
> something like:
>
> bpf(TRANSACTION_NEW) = fd
> bpf(BPF_RAW_TRACEPOINT_OPEN) for prog_fd_1, fd
> bpf(BPF_RAW_TRACEPOINT_OPEN) for prog_fd_2, fd
> ...
> bpf(TRANSACTION_DONE) for fd
>
> jirka
>
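To make the transaction flow above concrete, here is a minimal userspace
model in C. It is only a sketch under assumptions: struct txn, txn_new(),
txn_add(), txn_done() and the ftrace_updates counter are hypothetical names
invented for illustration and do not exist in the kernel or libbpf. The
point it shows is that each "open" only queues a program, and the single
expensive ftrace update happens once at commit time.

```c
#include <stddef.h>

/* Hypothetical model; none of these names are real kernel/libbpf APIs. */
#define TXN_MAX 8

struct txn {
	int prog_fds[TXN_MAX];	/* programs queued by the "open" step */
	size_t count;		/* number of queued programs */
	int committed;		/* set once txn_done() has run */
};

/* Counts simulated (expensive) ftrace updates. */
static int ftrace_updates;

/* Models bpf(TRANSACTION_NEW): start with an empty batch. */
static void txn_new(struct txn *t)
{
	t->count = 0;
	t->committed = 0;
}

/* Models BPF_RAW_TRACEPOINT_OPEN with a transaction fd: queue only. */
static int txn_add(struct txn *t, int prog_fd)
{
	if (t->committed || t->count >= TXN_MAX)
		return -1;
	t->prog_fds[t->count++] = prog_fd;
	return 0;
}

/* Models bpf(TRANSACTION_DONE): one ftrace update for the whole batch. */
static int txn_done(struct txn *t)
{
	if (t->committed)
		return -1;
	ftrace_updates++;
	t->committed = 1;
	return (int)t->count;
}
```

In this model, adding two programs leaves ftrace_updates at 0, and
txn_done() bumps it to 1 regardless of how many programs were queued.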
> >
> > Something like this, where bpf_prog attachment (which replaces nop)
> > happens as step 2:
> >
> > +------------+  +----------+  +----------+
> > |   kfunc1   |  |  kfunc2  |  |  kfunc3  |
> > +------+-----+  +----+-----+  +----+-----+
> >        |             |             |
> >        |             |             |
> >        +---------------------------+
> >                      |
> >                      v
> >                  +---+---+           +-----------+
> >                  |  nop  +----------->  bpf_prog |
> >                  +-------+           +-----------+
> >
> >
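The two-step idea in the diagram can be modelled in plain userspace C.
This is a sketch under assumptions, not how ftrace or the BPF trampoline
works internally: shared_slot, kfunc_entry(), attach_all() and the hook
functions are hypothetical names. Every "patched" entry jumps through one
shared slot that initially holds a nop; attaching is then a single atomic
store, so callers see the hook for all functions or for none.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; this only models the diagram. */
typedef void (*hook_fn)(int func_id);

/* Step 1 target: a hook that does nothing. */
static void nop_hook(int func_id)
{
	(void)func_id;
}

/* Shared slot that every patched function entry jumps through. */
static _Atomic(hook_fn) shared_slot = nop_hook;

/* Number of times the "real" hook ran. */
static int calls;

/* Stand-in for a BPF program or ftrace callback. */
static void counting_hook(int func_id)
{
	(void)func_id;
	calls++;
}

/* Models a patched kernel function entry (kfunc1, kfunc2, ...). */
static void kfunc_entry(int func_id)
{
	hook_fn fn = atomic_load(&shared_slot);

	fn(func_id);
}

/* Step 2: replace the nop for all entries with one atomic store. */
static void attach_all(hook_fn fn)
{
	atomic_store(&shared_slot, fn);
}
```

Before attach_all(counting_hook) is called, kfunc_entry() only reaches the
nop; afterwards every entry reaches counting_hook without any per-function
patching.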
> > > Anyway, what I'm currently working on, is a fast way to get to the
> > > arguments of a function. For now, I'm just focused on x86_64, and only add
> > > 6 arguments.
> > >
> > > The main issue that Alexei had with using the ftrace trampoline, was that
> > > the only way to get to the arguments was to set the "REGS" flag, which
> > > would give a regs parameter that contained a full pt_regs. The problem with
> > > this approach is that it required saving *all* regs for every function
> > > traced. Alexei felt that this was too much overhead.
> > >
> > > Looking at Jiri's patch, I took a look at the creation of the bpf
> > > trampoline, and noticed that it's copying the regs on a stack (at least
> > > what is used, which I think could be an issue).
> >
> > Right. And BPF doesn't get access to the entire pt_regs struct, so it
> > doesn't have to pay the price of saving it.
> >
> > But just FYI. Alexei is out till next week, so don't expect him to
> > reply in the next few days. But he's probably best to discuss these
> > nitty-gritty details with :)
> >
> > >
> > > For tracing a function, one must store all argument registers used, and
> > > restore them, as that's how they are passed from caller to callee. And
> > > since they are stored anyway, I figure, that should also be sent to the
> > > function callbacks, so that they have access to them too.
> > >
> > > I'm working on a set of patches to make this a reality.
> > >
> > > -- Steve
> >
>