All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alan Maguire <alan.maguire@oracle.com>
Cc: Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Jonathan Lemon <jonathan.lemon@gmail.com>
Subject: Re: [PATCH bpf-next 1/6] bpf: implement BPF ring buffer and verifier support for it
Date: Wed, 13 May 2020 22:59:05 -0700	[thread overview]
Message-ID: <CAEf4BzZpa3Xjevy-tV2oD2Yoxhf=Sm1EPNZdWsv0CoUCSmuF9w@mail.gmail.com> (raw)
In-Reply-To: <alpine.LRH.2.21.2005132231450.1535@localhost>

On Wed, May 13, 2020 at 2:59 PM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On Wed, 13 May 2020, Andrii Nakryiko wrote:
>
> > This commits adds a new MPSC ring buffer implementation into BPF ecosystem,
> > which allows multiple CPUs to submit data to a single shared ring buffer. On
> > the consumption side, only single consumer is assumed.
> >
> > Motivation
> > ----------
> > There are two distinctive motivators for this work, which are not satisfied by
> > existing perf buffer, which prompted creation of a new ring buffer
> > implementation.
> >   - more efficient memory utilization by sharing ring buffer across CPUs;
> >   - preserving ordering of events that happen sequentially in time, even
> >   across multiple CPUs (e.g., fork/exec/exit events for a task).
> >
> > These two problems are independent, but perf buffer fails to satisfy both.
> > Both are a result of a choice to have per-CPU perf ring buffer.  Both can be
> > also solved by having an MPSC implementation of ring buffer. The ordering
> > problem could technically be solved for perf buffer with some in-kernel
> > counting, but given the first one requires an MPSC buffer, the same solution
> > would solve the second problem automatically.
> >
>
> This looks great Andrii! One potentially interesting side-effect of
> the way this is implemented is that it could (I think) support speculative
> tracing.
>
> Say I want to record some tracing info when I enter function foo(), but
> I only care about cases where that function later returns an error value.
> I _think_ your implementation could support that via a scheme like
> this:
>
> - attach a kprobe program to record the data via bpf_ringbuf_reserve(),
>   and store the reserved pointer value in a per-task keyed hashmap.
>   Then record the values of interest in the reserved space. This is our
>   speculative data as we don't know whether we want to commit it yet.
>
> - attach a kretprobe program that picks up our reserved pointer and
>   commit()s or discard()s the associated data based on the return value.
>
> - the consumer should (I think) then only read the committed data, so in
>   this case just the data of interest associated with the failure case.
>
> I'm curious if that sort of ringbuf access pattern across multiple
> programs would work? Thanks!


Right now it's not allowed. Similar to spin lock and socket reference,
verifier will enforce that reserved record is committed or discarded
within the same BPF program invocation. Technically, nothing prevents
us from relaxing this and allowing to store this pointer in a map, but
that's probably way too dangerous and not necessary for most common
cases.

But all your troubles with this is due to using a pair of
kprobe+kretprobe. What I think should solve your problem is a single
fexit program. It can read input arguments *and* return value of
traced function. So there won't be any need for additional map and
storing speculative data (and no speculation as well, because you'll
just know beforehand if you even need to capture data). Does this work
for your case?

>
> Alan
>

[...]

no one seems to like trimming emails ;)

  reply	other threads:[~2020-05-14  5:59 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-13 19:25 [PATCH bpf-next 0/6] BPF ring buffer Andrii Nakryiko
2020-05-13 19:25 ` [PATCH bpf-next 1/6] bpf: implement BPF ring buffer and verifier support for it Andrii Nakryiko
2020-05-13 20:57   ` kbuild test robot
2020-05-13 20:57     ` kbuild test robot
2020-05-13 21:58   ` Alan Maguire
2020-05-14  5:59     ` Andrii Nakryiko [this message]
2020-05-14 22:25       ` Alan Maguire
2020-05-13 22:16   ` kbuild test robot
2020-05-13 22:16     ` kbuild test robot
2020-05-14 16:50   ` Jonathan Lemon
2020-05-14 20:11     ` Andrii Nakryiko
2020-05-14 17:33   ` sdf
2020-05-14 20:18     ` Andrii Nakryiko
2020-05-14 20:53       ` sdf
2020-05-14 21:13         ` Andrii Nakryiko
2020-05-14 21:56           ` Stanislav Fomichev
2020-05-14 19:06   ` Alexei Starovoitov
2020-05-14 20:49     ` Andrii Nakryiko
2020-05-14 19:18   ` Jakub Kicinski
2020-05-14 19:18     ` Jakub Kicinski
2020-05-14 20:39     ` Thomas Gleixner
2020-05-14 21:30       ` Andrii Nakryiko
2020-05-14 22:13         ` Paul E. McKenney
2020-05-14 22:56         ` Alexei Starovoitov
2020-05-14 23:06           ` Andrii Nakryiko
2020-05-13 19:25 ` [PATCH bpf-next 2/6] tools/memory-model: add BPF ringbuf MPSC litmus tests Andrii Nakryiko
2020-05-13 19:25 ` [PATCH bpf-next 3/6] bpf: track reference type in verifier Andrii Nakryiko
2020-05-13 19:25 ` [PATCH bpf-next 4/6] libbpf: add BPF ring buffer support Andrii Nakryiko
2020-05-13 19:25 ` [PATCH bpf-next 5/6] selftests/bpf: add BPF ringbuf selftests Andrii Nakryiko
2020-05-13 19:25 ` [PATCH bpf-next 6/6] bpf: add BPF ringbuf and perf buffer benchmarks Andrii Nakryiko
2020-05-13 22:49 ` [PATCH bpf-next 0/6] BPF ring buffer Jonathan Lemon
2020-05-14  6:08   ` Andrii Nakryiko
2020-05-14 16:30     ` Jonathan Lemon
2020-05-14 20:06       ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEf4BzZpa3Xjevy-tV2oD2Yoxhf=Sm1EPNZdWsv0CoUCSmuF9w@mail.gmail.com' \
    --to=andrii.nakryiko@gmail.com \
    --cc=alan.maguire@oracle.com \
    --cc=andriin@fb.com \
    --cc=ast@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jonathan.lemon@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.