bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Lorenz Bauer <lmb@cloudflare.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andriin@fb.com>, Alexei Starovoitov <ast@fb.com>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>,
	Jakub Kicinski <jakub.kicinski@netronome.com>,
	Joe Stringer <joe@wand.net.nz>
Subject: Re: [PATCH v2 bpf-next 00/11] BTF-defined BPF map definitions
Date: Fri, 21 Jun 2019 10:56:47 -0700	[thread overview]
Message-ID: <CAEf4BzYtM-41Y5w0-6kWhaeQUh7it5Zd_OZBj4kvMYou3TwQ+A@mail.gmail.com> (raw)
In-Reply-To: <CACAyw98JqwZbcTdpRNcG_fT6A-ekEqn9D5Zx4myB8oiX73uZkw@mail.gmail.com>

On Fri, Jun 21, 2019 at 3:29 AM Lorenz Bauer <lmb@cloudflare.com> wrote:
>
> On Fri, 21 Jun 2019 at 05:20, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Jun 20, 2019 at 7:49 AM Lorenz Bauer <lmb@cloudflare.com> wrote:
> > >
> > > On Tue, 18 Jun 2019 at 22:37, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > > > I would just drop the object-scope pinning. We avoided using it and I'm not
> > > > > aware if anyone else make use. It also has the ugly side-effect that this
> > > > > relies on AF_ALG which e.g. on some cloud provider shipped kernels is disabled.
> > > > > The pinning attribute should be part of the standard set of map attributes for
> > > > > libbpf though as it's generally useful for networking applications.
> > > >
> > > > Sounds good. I'll do some more surveying of use cases inside FB to see
> > > > if anyone needs object-scope pinning, just to be sure we are not
> > > > short-cutting anyone.
> > >
> > > I'm also curious what the use cases for declarative pinning are. From my
> > > limited POV it doesn't seem that useful? There are a couple of factors:
> >
> > Cilium is using it pretty extensively, so there are clearly use cases.
> > The most straigtforward use case is using a map created and shared by
> > another BPF program (to communicate, read stats, what have you).
>
> I think Cilium is in the quirky position that it has a persistent daemon, but
> shells out to tc for loading programs. They are probably also the most
> advanced (open-source) users of BPF out there. If I understood their comments
> correctly they want to move to using a library for loading their ELF. At that
> point whether something is possible in a declarative way is less important,
> because you have the much more powerful APIs at your disposal.
>
> Maybe Daniel or someone else from the Cilium team can chime in here?

Yep, curious about their perspective on that.

>
> > > * Systemd mounts the default location only accessible to root, so I have to
> > >   used my own bpffs mount.
> > > * Since I don't want to hard code that, I put it in a config file.
> > > * After loading the ELF we pin maps from the daemon managing the XDP.
> >
> > So mounting root would be specified per bpf_object, before maps are
> > created, so user-land driving application will have an opportunity to
> > tune everything. Declarative is only the per-map decision of whether
> > that map should be exposed to outer world (for sharing) or not.
>
> So `tc filter add bpf obj foo.elf pin-root /gobbledygook`?

I meant something like:

bpf_object_open_attr attr;
attr.file = "path/to/my/object.o";
attr.pin_root_path = "/my/fancy/bpffs/root";
bpf_object__open_xattr(&attr);

Then tools can adopt they when necessary.

>
> > Then check tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c
> > for more crazy syntax ;)
> >
> > typedef char * (* const (* const fn_ptr_arr2_t[5])())(char * (*)(int));
>
> Not on a Friday ;P
>
> > > What if this did
> > >
> > >   __type(value, struct my_value)[1000];
> > >   struct my_value __member(value)[1000]; // alternative
> > >
> > > instead, and skipped max_entries?
> >
> > I considered that, but decided for now to keep all those attributes
> > orthogonal for more flexibility and uniformity. This syntax might be
> > considered a nice "syntax sugar" and can be added in the future, if
> > necessary.
>
> Ack.
>
> > > At that point you have to understand that value is a pointer so all of
> > > our efforts
> > > are for naught. I suspect there is other weirdness like this, but I need to play
> > > with it a little bit more.
> >
> > Yes, C can let you do crazy stuff, if you wish, but I think that
> > shouldn't be a blocker for this proposal. I haven't seen any BPF
> > program doing that, usually you duplicate the type of inner value
> > inside your function anyway, so there is no point in taking
> > sizeof(map.value) from BPF program side. From outside, though, all the
> > types will make sense, as expected.
>
> Right, but in my mind that is a bit of a cop out. I like BTF map definitions,
> and I want them to be as unsurprising as possible, so that they are
> easy to use and adopt.


Right, but there are limit on what you can do with C syntax and it's
type system. Having fancy extra features like you described (e.g,
sizeof(map.value), etc) is pretty low on a priority list.

>
> If a type encodes all the information we need via the array dimension hack,
> couldn't we make the map variable itself a pointer, and drop the inner pointers?
>
> struct my_map_def {
>   int type[BPF_MAP_TYPE_HASH];
>   int value;
>   struct foo key;

This is bad because it potentially uses lots of space. If `struct foo`
is big, if max_entries is big, even for type, it's still a bunch of
extra space wasted. That's why we have pointers everywhere, as they
allow to encode everything with fixed space overhead of 8 bytes for a
pointer.


>   ...
> }
>
> struct my_map_def *my_map;
>
> --
> Lorenz Bauer  |  Systems Engineer
> 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK
>
> www.cloudflare.com

  reply	other threads:[~2019-06-21 17:57 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-17 19:26 [PATCH v2 bpf-next 00/11] BTF-defined BPF map definitions Andrii Nakryiko
2019-06-17 19:26 ` [PATCH v2 bpf-next 01/11] libbpf: add common min/max macro to libbpf_internal.h Andrii Nakryiko
2019-06-17 19:26 ` [PATCH v2 bpf-next 02/11] libbpf: extract BTF loading logic Andrii Nakryiko
2019-06-17 19:40   ` Song Liu
2019-06-17 19:26 ` [PATCH v2 bpf-next 03/11] libbpf: streamline ELF parsing error-handling Andrii Nakryiko
2019-06-17 19:46   ` Song Liu
2019-06-17 19:26 ` [PATCH v2 bpf-next 04/11] libbpf: refactor map initialization Andrii Nakryiko
2019-06-17 19:39   ` Song Liu
2019-06-26 14:48   ` Matt Hart
2019-06-26 18:28     ` Andrii Nakryiko
2019-06-27 15:11       ` Matt Hart
2019-06-17 19:26 ` [PATCH v2 bpf-next 05/11] libbpf: identify maps by section index in addition to offset Andrii Nakryiko
2019-06-17 19:26 ` [PATCH v2 bpf-next 06/11] libbpf: split initialization and loading of BTF Andrii Nakryiko
2019-06-17 19:26 ` [PATCH v2 bpf-next 07/11] libbpf: allow specifying map definitions using BTF Andrii Nakryiko
2019-06-17 19:43   ` Song Liu
2019-06-17 19:26 ` [PATCH v2 bpf-next 08/11] selftests/bpf: add test for BTF-defined maps Andrii Nakryiko
2019-06-17 19:26 ` [PATCH v2 bpf-next 09/11] selftests/bpf: switch BPF_ANNOTATE_KV_PAIR tests to " Andrii Nakryiko
2019-06-17 21:41   ` Song Liu
2019-06-17 22:20     ` Daniel Borkmann
2019-06-17 19:26 ` [PATCH v2 bpf-next 10/11] selftests/bpf: convert tests w/ custom values " Andrii Nakryiko
2019-06-17 19:27 ` [PATCH v2 bpf-next 11/11] selftests/bpf: convert remaining selftests " Andrii Nakryiko
2019-06-17 21:17 ` [PATCH v2 bpf-next 00/11] BTF-defined BPF map definitions Daniel Borkmann
2019-06-17 22:17   ` Daniel Borkmann
2019-06-18 21:37   ` Andrii Nakryiko
2019-06-20 14:49     ` Lorenz Bauer
2019-06-21  4:19       ` Andrii Nakryiko
2019-06-21 10:29         ` Lorenz Bauer
2019-06-21 17:56           ` Andrii Nakryiko [this message]
2019-06-25 18:14             ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4BzYtM-41Y5w0-6kWhaeQUh7it5Zd_OZBj4kvMYou3TwQ+A@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=ast@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jakub.kicinski@netronome.com \
    --cc=joe@wand.net.nz \
    --cc=kernel-team@fb.com \
    --cc=lmb@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).