From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Jakub Kicinski <kuba@kernel.org>,
	Andrii Nakryiko <andriin@fb.com>,
	linux-arch@vger.kernel.org, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Jonathan Lemon <jonathan.lemon@gmail.com>
Subject: Re: [PATCH bpf-next 1/6] bpf: implement BPF ring buffer and verifier support for it
Date: Thu, 14 May 2020 14:30:11 -0700
Message-ID: <CAEf4Bzbj-WvRkoGxkSFtK5_1JfQxthoFid398C97RM0ppBb0dA@mail.gmail.com>
In-Reply-To: <87h7wixndi.fsf@nanos.tec.linutronix.de>

On Thu, May 14, 2020 at 1:39 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Jakub Kicinski <kuba@kernel.org> writes:
>
> > On Wed, 13 May 2020 12:25:27 -0700 Andrii Nakryiko wrote:
> >> One interesting implementation bit, that significantly simplifies (and thus
> >> speeds up as well) implementation of both producers and consumers is how data
> >> area is mapped twice contiguously back-to-back in the virtual memory. This
> >> allows to not take any special measures for samples that have to wrap around
> >> at the end of the circular buffer data area, because the next page after the
> >> last data page would be first data page again, and thus the sample will still
> >> appear completely contiguous in virtual memory. See comment and a simple ASCII
> >> diagram showing this visually in bpf_ringbuf_area_alloc().
> >
> > Out of curiosity - is this 100% okay to do in the kernel and user space
> > these days? Is this bit part of the uAPI in case we need to back out of
> > it?
> >
> > In the olden days virtually mapped/tagged caches could get confused
> > seeing the same physical memory have two active virtual mappings, or
> > at least that's what I've been told in school :)
>
> Yes, caching the same thing twice causes coherency problems.
>
> VIVT can be found in ARMv5, MIPS, NDS32 and Unicore32.
>
> > Checking with Paul - he says that could have been the case for Itanium
> > and PA-RISC CPUs.
>
> Itanium: PIPT L1/L2.
> PA-RISC: VIPT L1 and PIPT L2
>
> Thanks,
>

Jakub, thanks for bringing this up.

Thomas, Paul, what kind of problems are we talking about here? What
are the possible problems in practice?

Just for context: all the metadata (the record header), which is
written/read under lock and with smp_store_release/smp_load_acquire, is
written through one set of page mappings (the first one). Only
part of the sample payload might go into the second set of mapped pages.
Does this mean that user space might read stale payload data in such
a case?
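
For illustration, the header-publish pattern described above can be sketched in user-space C. This is only a sketch under assumptions: C11 release/acquire atomics stand in for the kernel's smp_store_release/smp_load_acquire, and `struct record`, `BUSY_BIT`, `record_commit()` and `record_consume()` are hypothetical names, not the patch's actual layout or API.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define BUSY_BIT (1u << 31)	/* hypothetical "not committed yet" flag */
#define PAYLOAD_MAX 56u

struct record {
	_Atomic uint32_t len;	/* header word, always accessed atomically */
	uint32_t pad;
	uint8_t payload[PAYLOAD_MAX];
};

/* Producer: fill the payload first, then publish the header with release. */
static int record_commit(struct record *r, const void *src, uint32_t n)
{
	if (n > PAYLOAD_MAX)
		return -1;
	memcpy(r->payload, src, n);
	/* Release: payload stores are ordered before the header store. */
	atomic_store_explicit(&r->len, n, memory_order_release);
	return 0;
}

/* Consumer: load the header with acquire, then read the payload. */
static int record_consume(struct record *r, void *dst, uint32_t cap)
{
	uint32_t len = atomic_load_explicit(&r->len, memory_order_acquire);

	if ((len & BUSY_BIT) || len > cap)
		return -1;	/* not committed yet, or dst is too small */
	memcpy(dst, r->payload, len);
	return (int)len;
}
```

The ordering above only covers what the memory model guarantees; it says nothing about whether two virtual aliases of the same page stay coherent in a VIVT cache, which is the open question here.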

I could work around that in user space by mmap()'ing the same
range twice, one mapping right after the other (the second mmap() would
use the MAP_FIXED flag, of course). So that's not a big deal.
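
A minimal sketch of that user-space double mapping, assuming a plain temporary file as a stand-in for the ring buffer's file descriptor. `map_twice()` and `demo_wraparound()` are hypothetical helper names used only for this illustration, and reserving the address range with a PROT_NONE mapping first is one possible way to make the MAP_FIXED calls safe.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the first `size` bytes of `fd` twice, back to back.
 * `size` must be a non-zero multiple of the page size. Returns NULL on failure.
 */
static void *map_twice(int fd, size_t size)
{
	char *base;

	assert(size > 0 && size % (size_t)sysconf(_SC_PAGESIZE) == 0);

	/* Reserve 2 * size bytes of contiguous address space. */
	base = mmap(NULL, 2 * size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return NULL;

	/* Overlay both halves of the reservation with the same file range. */
	if (mmap(base, size, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED ||
	    mmap(base + size, size, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED) {
		munmap(base, 2 * size);
		return NULL;
	}
	return base;
}

/* Returns 0 if a write crossing the end of the area shows up at its start. */
static int demo_wraparound(void)
{
	size_t size = (size_t)sysconf(_SC_PAGESIZE);
	FILE *f = tmpfile();	/* stand-in for the real ring buffer fd */
	char *base;
	int ret;

	if (!f)
		return -1;
	if (ftruncate(fileno(f), (off_t)size) != 0) {
		fclose(f);
		return -1;
	}
	base = map_twice(fileno(f), size);
	if (!base) {
		fclose(f);
		return -1;
	}

	/* 8-byte write that starts 4 bytes before the end of the first copy. */
	memcpy(base + size - 4, "ABCDEFGH", 8);
	ret = (memcmp(base, "EFGH", 4) == 0 &&
	       memcmp(base + size - 4, "ABCD", 4) == 0) ? 0 : -1;

	munmap(base, 2 * size);
	fclose(f);
	return ret;
}
```

Note that this still creates two virtual aliases of the same physical pages, just from user space instead of from the kernel.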

But on the kernel side it's a crucial property, because it allows BPF
programs to work with data under the assumption that all of it is
linearly mapped. If we can't do that, the reserve() API is impossible to
implement. So in that case, I'd rather enable the BPF ring buffer only on
platforms that don't have these problems, instead of removing the
reserve/commit API altogether.

Well, another way is to just "discard" the remaining space at the end if
it's not sufficient for the entire record. That's doable: there will
always be at least 8 bytes available for a record header, so that's not a
problem in that regard. But I would appreciate it if you could help me
understand the full implications of caching the same physical memory twice.
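
The "discard the tail" alternative could look roughly like the sketch below. It is an assumption-laden illustration, not the patch's code: `struct ring`, `struct rec_hdr`, `DISCARD_BIT` and `ring_reserve()` are hypothetical, the buffer size is assumed to be a power of two, records are rounded up to 8 bytes (which is what guarantees at least 8 bytes for a header at the end), and locking and memory barriers are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_SZ 8u
#define DISCARD_BIT (1u << 30)	/* hypothetical "skip this record" flag */

struct rec_hdr {
	uint32_t len;		/* payload length, possibly with DISCARD_BIT */
	uint32_t pad;
};

struct ring {
	uint8_t *data;
	uint64_t size;		/* power of two, at most 2^30 bytes */
	uint64_t prod_pos;	/* monotonically increasing byte counters */
	uint64_t cons_pos;
};

static uint64_t round_up8(uint64_t n)
{
	return (n + 7) & ~7ull;
}

/* Returns a pointer to a contiguous payload area, or NULL if there is no room. */
static void *ring_reserve(struct ring *rb, uint32_t payload_len)
{
	uint64_t total = round_up8((uint64_t)HDR_SZ + payload_len);
	uint64_t off = rb->prod_pos & (rb->size - 1);
	uint64_t tail = rb->size - off;		/* bytes left before the end */
	uint64_t skip = total > tail ? tail : 0;

	if (total > rb->size ||
	    rb->prod_pos + skip + total - rb->cons_pos > rb->size)
		return NULL;

	if (skip) {
		/* Record does not fit: mark the tail as a discarded record. */
		struct rec_hdr pad = { .len = (uint32_t)(tail - HDR_SZ) | DISCARD_BIT };

		memcpy(rb->data + off, &pad, sizeof(pad));
		rb->prod_pos += skip;
		off = 0;
	}

	struct rec_hdr hdr = { .len = payload_len };

	memcpy(rb->data + off, &hdr, sizeof(hdr));
	rb->prod_pos += total;
	return rb->data + off + HDR_SZ;
}
```

The cost of this scheme is wasted space: up to one record's worth of bytes at the end of the buffer is thrown away every time a record would otherwise wrap.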

Also, just for my education: with VIVT caches, if a user-space
application mmap()s the same region of memory twice (without MAP_FIXED),
wouldn't that cause similar problems? Can't this happen today with the
mmap() API? Why is that not a problem?


>         tglx
