From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Jiri Olsa <jolsa@redhat.com>
Cc: "Steven Rostedt" <rostedt@goodmis.org>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Andrii Nakryiko" <andriin@fb.com>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	"Martin KaFai Lau" <kafai@fb.com>,
	"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"KP Singh" <kpsingh@chromium.org>, "Daniel Xu" <dxu@dxuuu.xyz>,
	"Jesper Brouer" <jbrouer@redhat.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Viktor Malik" <vmalik@redhat.com>
Subject: Re: [RFC bpf-next 09/16] bpf: Add BPF_TRAMPOLINE_BATCH_ATTACH support
Date: Mon, 26 Oct 2020 16:19:30 -0700
Message-ID: <CAEf4Bza0+MHuRneepidvXFZGZJ+hnMtaJpCq7EU=pvZHW7FD9w@mail.gmail.com>
In-Reply-To: <20201025194123.GD2681365@krava>

On Sun, Oct 25, 2020 at 12:41 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Oct 23, 2020 at 03:23:10PM -0700, Andrii Nakryiko wrote:
> > On Fri, Oct 23, 2020 at 1:31 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > On Fri, 23 Oct 2020 13:03:22 -0700
> > > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > > Basically, maybe ftrace subsystem could provide a set of APIs to
> > > > prepare a set of functions to attach to. Then BPF subsystem would just
> > > > do what it does today, except instead of attaching to a specific
> > > > kernel function, it would attach to ftrace's placeholder. I don't know
> > > > anything about ftrace implementation, so this might be far off. But I
> > > > thought that looking at this problem from a bit of a different angle
> > > > would benefit the discussion. Thoughts?
> > >
> > > I probably understand bpf internals as much as you understand ftrace
> > > internals ;-)
> > >
> >
> > Heh :) But while we are here, what do you think about this idea of
> > preparing a no-op trampoline that a bunch (thousands, potentially) of
> > function entries will jump to? Once all of that is ready and patched
> > into the kernel functions' entry points, we would allow attaching a BPF
> > program or ftrace callback (if I get the terminology right) in one
> > fast and simple operation. For users that would mean that they will
> > either get calls for all or none of the attached kfuncs, with simple
> > and reliable semantics.
>
> so the main pain point the batch interface is addressing, is that
> every attach (BPF_RAW_TRACEPOINT_OPEN command) calls register_ftrace_direct,
> and you'll need to do the same for nop trampoline, no?

I was hoping that, since we know we are installing a nop, we could do
it without the extra waiting, which should speed things up quite a bit.

>
> I wonder if we could create some 'transaction object' represented
> by fd and add it to bpf_attr::raw_tracepoint
>
> then attach (BPF_RAW_TRACEPOINT_OPEN command) would add program to this
> new 'transaction object' instead of updating ftrace directly
>
> and when the collection is done (all BPF_RAW_TRACEPOINT_OPEN commands
> are executed), we'd call a new bpf syscall command on that transaction
> and it would call the ftrace interface
>

This is conceptually close to what I had in mind, except that I was
thinking of a single BPF program attached to many kernel functions,
something that's impossible today, as you mentioned in another thread.

> something like:
>
>   bpf(TRANSACTION_NEW) = fd
>   bpf(BPF_RAW_TRACEPOINT_OPEN) for prog_fd_1, fd
>   bpf(BPF_RAW_TRACEPOINT_OPEN) for prog_fd_2, fd
>   ...
>   bpf(TRANSACTION_DONE) for fd
>
> jirka
>
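
To make the transaction flow quoted above concrete, here is a toy
userspace model of it. This is only a sketch: FakeFtrace, Transaction,
register_batch and the other names are made up for illustration and do
not correspond to any existing kernel or libbpf API. The only point is
that attach requests get queued and the expensive ftrace update happens
once, at commit time.

```python
class FakeFtrace:
    """Hypothetical stand-in for the ftrace side; counts update rounds."""

    def __init__(self):
        self.update_rounds = 0
        self.attached = {}  # function name -> program name

    def register_batch(self, pairs):
        # One update round, however many (function, program) pairs arrive.
        self.update_rounds += 1
        self.attached.update(pairs)


class Transaction:
    """Hypothetical 'transaction object' that collects attach requests."""

    def __init__(self, ftrace):
        self.ftrace = ftrace
        self.pending = []
        self.done = False

    def raw_tracepoint_open(self, prog, func):
        # Models BPF_RAW_TRACEPOINT_OPEN with a transaction fd:
        # queue the request instead of touching ftrace right away.
        if self.done:
            raise RuntimeError("transaction already committed")
        self.pending.append((func, prog))

    def commit(self):
        # Models TRANSACTION_DONE: a single call into the ftrace side.
        self.ftrace.register_batch(self.pending)
        self.pending = []
        self.done = True


ftrace = FakeFtrace()
txn = Transaction(ftrace)                    # bpf(TRANSACTION_NEW) = fd
txn.raw_tracepoint_open("prog_1", "kfunc1")  # queued only
txn.raw_tracepoint_open("prog_2", "kfunc2")  # queued only
assert ftrace.attached == {}                 # nothing attached yet
txn.commit()                                 # bpf(TRANSACTION_DONE)
print(ftrace.update_rounds, sorted(ftrace.attached))
```

In this model the number of update rounds stays at one regardless of
how many programs are queued, which is the property the batch interface
is after.
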
> >
> > Something like this, where bpf_prog attachment (which replaces nop)
> > happens as step 2:
> >
> > +------------+  +----------+  +----------+
> > |  kfunc1    |  |  kfunc2  |  |  kfunc3  |
> > +------+-----+  +----+-----+  +----+-----+
> >        |             |             |
> >        |             |             |
> >        +---------------------------+
> >                      |
> >                      v
> >                  +---+---+           +-----------+
> >                  |  nop  +----------->  bpf_prog |
> >                  +-------+           +-----------+
> >
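
The two-step idea in the diagram above can also be modelled in a few
lines. Again, this is just an illustrative sketch with hypothetical
names (SharedTrampoline, patch_entry), not how ftrace or BPF trampolines
are implemented: step 1 patches every function entry to go through one
shared slot that starts out as a nop, and step 2 is a single assignment
that makes all of them call the program.

```python
def nop(*args):
    """Placeholder target: does nothing."""
    return None


class SharedTrampoline:
    """Hypothetical shared slot that every patched entry jumps to."""

    def __init__(self):
        self.target = nop

    def __call__(self, func_name, *args):
        return self.target(func_name, *args)


def patch_entry(func, name, trampoline):
    """Step 1: route the function entry through the shared trampoline."""
    def patched(*args):
        trampoline(name, *args)
        return func(*args)
    return patched


calls = []


def bpf_prog(func_name, *args):
    # Stand-in for the attached program: record which function was hit.
    calls.append((func_name, args))


tramp = SharedTrampoline()
kfuncs = {
    name: patch_entry(lambda x: x + 1, name, tramp)
    for name in ("kfunc1", "kfunc2", "kfunc3")
}

# Step 1 is done: entries are patched, but the slot is still a nop,
# so calling the functions produces no program invocations.
for f in kfuncs.values():
    f(1)
assert calls == []

# Step 2: one assignment switches every patched function at once.
tramp.target = bpf_prog
for f in kfuncs.values():
    f(1)
```

Before the switch no function reports to the program, and after it all
of them do, which is the "all or none" behaviour described earlier.
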
> >
> > > Anyway, what I'm currently working on, is a fast way to get to the
> > > arguments of a function. For now, I'm just focused on x86_64, and only add
> > > 6 arguments.
> > >
> > > The main issue that Alexei had with using the ftrace trampoline, was that
> > > the only way to get to the arguments was to set the "REGS" flag, which
> > > would give a regs parameter that contained a full pt_regs. The problem with
> > > this approach is that it required saving *all* regs for every function
> > > traced. Alexei felt that this was too much overhead.
> > >
> > > Looking at Jiri's patch, I took a look at the creation of the bpf
> > > trampoline, and noticed that it's copying the regs on a stack (at least
> > > what is used, which I think could be an issue).
> >
> > Right. And BPF doesn't get access to the entire pt_regs struct, so it
> > doesn't have to pay the price of saving it.
> >
> > But just FYI, Alexei is out till next week, so don't expect him to
> > reply in the next few days. He's probably the best person to discuss
> > these nitty-gritty details with, though :)
> >
> > >
> > > For tracing a function, one must store all argument registers used, and
> > > restore them, as that's how they are passed from caller to callee. And
> > > since they are stored anyway, I figure, that should also be sent to the
> > > function callbacks, so that they have access to them too.
> > >
> > > I'm working on a set of patches to make this a reality.
> > >
> > > -- Steve
> >
>
