From: Stanislav Fomichev <sdf@fomichev.me>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>,
	Andrii Nakryiko <andriin@fb.com>,
	LKML <linux-kernel@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next 11/15] bpftool: add skeleton codegen command
Date: Wed, 11 Dec 2019 11:15:18 -0800
Message-ID: <20191211191518.GD3105713@mini-arch>
In-Reply-To: <CAEf4Bzb+3b-ypP8YJVA=ogQgp1KXx2xPConOswA0EiGXsmfJow@mail.gmail.com>

On 12/11, Andrii Nakryiko wrote:
> On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > On 12/10, Andrii Nakryiko wrote:
> > > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > > >
> > > > On 12/10, Andrii Nakryiko wrote:
> > > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > > > > >
> > > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > struct <object-name> {
> > > > > > > > > >       /* used by libbpf's skeleton API */
> > > > > > > > > >       struct bpf_object_skeleton *skeleton;
> > > > > > > > > >       /* bpf_object for libbpf APIs */
> > > > > > > > > >       struct bpf_object *obj;
> > > > > > > > > >       struct {
> > > > > > > > > >               /* for every defined map in BPF object: */
> > > > > > > > > >               struct bpf_map *<map-name>;
> > > > > > > > > >       } maps;
> > > > > > > > > >       struct {
> > > > > > > > > >               /* for every program in BPF object: */
> > > > > > > > > >               struct bpf_program *<program-name>;
> > > > > > > > > >       } progs;
> > > > > > > > > >       struct {
> > > > > > > > > >               /* for every program in BPF object: */
> > > > > > > > > >               struct bpf_link *<program-name>;
> > > > > > > > > >       } links;
> > > > > > > > > >       /* for every present global data section: */
> > > > > > > > > >       struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > >               /* memory layout of corresponding data section,
> > > > > > > > > >                * with every defined variable represented as a struct field
> > > > > > > > > >                * with exactly the same type, but without const/volatile
> > > > > > > > > >                * modifiers, e.g.:
> > > > > > > > > >                */
> > > > > > > > > >                int *my_var_1;
> > > > > > > > > >                ...
> > > > > > > > > >       } *<one of bss, data, or rodata>;
> > > > > > > > > > };
> > > > > > > > >
> > > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > > > loading the ELF is the majority of the code would be better off being
> > > > > > > > > written in a dynamic language like python?  Would it perhaps be a
> > > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > > time writing code gens and working around limitations of C?
> > > > > > > >
> > > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > > >
> > > > > > > This "do it yourself" shit is not really funny :/
> > > > > > >
> > > > > > > > I'll stop providing feedback on BPF patches if you guys keep saying
> > > > > > > that :/ Maybe that's what you want.
> > > > > > >
> > > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > > work, see few of the last patches in this series.
> > > > > > >
> > > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > > to that goal.
> > > > > > >
> > > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > > *nod*
> > > > > >
> > > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > > >
> > > > > > Python/C++ bindings and API wrappers are orthogonal concerns here.
> > > > > I personally think it would be great to have both Python and C++
> > > > > specific API that uses libbpf under the cover. The only debatable
> > > > > thing is the logistics: where the source code lives, how it's kept in
> > > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > > something is hard or inconvenient to adapt w/ Python, etc.
> > > >
> > > > [..]
> > > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > > think you can solve it without code generation for C++. How do you
> > > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > > data sections in such a way, where it's type safe (to the degree that
> > > > > language allows that, of course) and is not "stringly-based" API? This
> > > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > > work with global data from userspace pretty much at the same level of
> > > > > performance and convenience, as from BPF side. How can you achieve
> > > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > > dynamic lookups based on just the name of property/method, but amount
> > > > > of overheads is not acceptable for all applications (and Python itself
> > > > > is not acceptable for those applications). In addition to that, C is
> > > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > > Rust.
> > > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > > straightforward thing to do. Have a way to "open" a section and
> > > > a way to find a symbol in it. Yes, it's a string-based API,
> > > > but there is nothing wrong with it. IMO, this is easier to
> > > > use/understand and I suppose Python/C++ wrappers are trivial.
> > >
> > > Without digging through libbpf source code (or actually, look at code,
> > > but don't run any test program), what's the name of the map
> > > corresponding to .bss section, if object file is
> > > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > > multiple attempts to memorize the pattern), how much time did you
> > > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > > you use anonymous structs for your global vars, good luck maintaining
> > > two copies of that: one for BPF side and one for userspace.
> > As your average author of BPF programs I don't really care
> > > which section my symbol ends up in. Just give me an API
> > to mmap all "global" sections (or a call per section which does all the
> > naming magic inside) and lookup symbol by name; I can cast it to a proper
> > type and set it.
> 
> I'd like to not have to know about bss/rodata/data as well, but that's
> how things are done for global variables. In skeleton we can try to
> make an illusion like they are part of one big datasection/struct, but
> that seems like a bit too much magic at this point. But then again,
> one of the reasons I want this as an experimental feature, so that we
> can actually judge from real experience how inconvenient some things
> are, and not just based on "I think it would be ...".
> 
> re: "Just give me ...". Following the spirit of "C is hard" from your
> previous arguments, you already have that API: mmap() syscall. C
> programmers have to be able to figure out the rest ;) But on the
> serious note, this auto-generated code in skeleton actually addresses
> all concerns (and more) that you mentioned: mmaping, knowing offsets,
> knowing names and types, etc. And it doesn't preclude adding more
> "conventional" additional APIs to do everything more dynamically,
> based on string names.
We have a different understanding of what's difficult :-)

To me, doing a transparent data/rodata/bss mmap in bpf_object__load and
then adding a single libbpf API call to look up a symbol by string name is
simple (both from the user's perspective and in terms of libbpf code
complexity). The codegen is not, because in order to use it I need to
teach our build system to spit it out (which means I need to add bpftool
to it and keep it updated, etc.). You can use that as an example of "real
experience how inconvenient some things are".
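
A minimal sketch of what such a dlsym()-style lookup could look like, and
how it compares to a struct with fixed offsets. Everything named demo_* is
hypothetical and is not a libbpf API: the section is a plain C struct
standing in for the mmap'ed .bss, and the symbol table is written by hand
where a real implementation would presumably derive it from BTF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the mmap'ed .bss of a bpf.c with "int a; int b;". */
struct demo_bss {
	int a;
	int b;
};

/* Hypothetical symbol table entry; a real one would come from BTF. */
struct demo_sym {
	const char *name;
	size_t off;
	size_t size;
};

static const struct demo_sym demo_syms[] = {
	{ "a", offsetof(struct demo_bss, a), sizeof(int) },
	{ "b", offsetof(struct demo_bss, b), sizeof(int) },
};

/*
 * Hypothetical string-based lookup: returns a pointer into the section,
 * or NULL if the name is unknown or the expected size does not match.
 */
static void *demo_lookup(void *section, const char *name, size_t size)
{
	size_t i;

	for (i = 0; i < sizeof(demo_syms) / sizeof(demo_syms[0]); i++) {
		if (strcmp(demo_syms[i].name, name) == 0 &&
		    demo_syms[i].size == size)
			return (char *)section + demo_syms[i].off;
	}
	return NULL;
}

/* Sets 'b' via the string-based lookup, then reads it back via the typed view. */
static int demo_set_b(struct demo_bss *bss, int val)
{
	int *b = demo_lookup(bss, "b", sizeof(*b));

	assert(b != NULL);
	*b = val;
	return bss->b;
}
```

The caller does the cast and pays for one name lookup up front; after that
the pointer can be cached, so steady-state access costs the same as going
through the typed struct. The size check is the only type safety this
variant offers.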

> > RE anonymous structs: maybe don't use them if you want to share the data
> > between bpf and userspace?
> 
> Alright.
> 
> >
> > > I never said there is anything wrong with current straightforward
> > > libbpf API, but I also never said it's the easiest and most
> > > user-friendly way to work with BPF either. So we'll have both
> > > code-generated interface and existing API. Furthermore, they are
> > > interoperable (you can pass skel->maps.whatever to any of the existing
> > > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > > that can beat performance and usability of code-generated .data, .bss,
> > > .rodata (and now .extern) layout.
> > I haven't looked closely enough, but is there a libbpf api to get
> > an offset of a variable? Suppose I have the following in bpf.c:
> >
> >         int a;
> >         int b;
> >
> > Can I get an offset of 'b' in the .bss without manually parsing BTF?
> 
> No, there isn't right now. There isn't even an API to know that there
> is such a variable called "b". Except for this skeleton, of course.
> 
> >
> > TBH, I don't buy the performance argument for these global maps.
> > When you did the mmap patchset for the array, you said it yourself
> > that it's about convenience and not performance.
> 
> Yes, it's first and foremost about convenience, addressing exactly the
> problems you mentioned above. But performance is critical for some use
> cases, and nothing can beat memory-mapped view of BPF map for those.
> Think about the case of frequently polling (or even atomically
> exchanging) some stats from userspace, as one possible example. E.g.,
> like some map statistics (number of filled elements, p50 of whatever
> of those elements, etc). I'm not sure what's there to buy: doing
> syscall to get **entire** global data map contents vs just fetching
> single integer from memory-mapped region, guess which one is cheaper?
My understanding was that when you were talking about performance, you
were talking about doing symbol offset lookup at runtime vs having a
generated struct with fixed offsets; not about mmap vs old api with copy
(this debate is settled since your patches are accepted).

But to your original reply: you do understand that if you have multiple
threads that write to this global data you have a bigger problem, right?
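
To make the concern concrete, here is a small userspace-only sketch,
assuming C11 atomics, of a counter that stays consistent when several
writers update it and a reader drains it. The demo_* names are
hypothetical; in a real setup the slot would live in the memory-mapped
global data, and the BPF side would need its own atomic update for the
same guarantee to hold.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stats slot; stands in for a field in mmap'ed global data. */
struct demo_stats {
	_Atomic uint64_t packets;
};

/* Writer side: an atomic add, so concurrent writers do not lose updates. */
static void demo_count(struct demo_stats *s, uint64_t n)
{
	atomic_fetch_add_explicit(&s->packets, n, memory_order_relaxed);
}

/* Reader side: fetch the current value and reset it to zero in one step. */
static uint64_t demo_drain(struct demo_stats *s)
{
	return atomic_exchange_explicit(&s->packets, 0, memory_order_relaxed);
}
```

A plain "s->packets += n" from two threads could drop increments, and a
separate read followed by a store of zero could discard updates that land
in between; the exchange avoids both.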
