Re: [RFC][PATCH bpf-next] libbpf: add bpf_object__open_{file,mem} w/ sized opts

From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	KP Singh <kpsingh@chromium.org>, Kernel Team <kernel-team@fb.com>
Subject: Re: [RFC][PATCH bpf-next] libbpf: add bpf_object__open_{file,mem} w/ sized opts
Date: Tue, 1 Oct 2019 16:43:30 -0700	[thread overview]
Message-ID: <CAEf4BzZJFBdjCSAzJ3-rOrCkkaTJmPSDhx_0xKJt4+Vg2TEFwg@mail.gmail.com> (raw)
In-Reply-To: <87lfu4t9up.fsf@toke.dk>

On Tue, Oct 1, 2019 at 2:49 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Tue, Oct 1, 2019 at 1:42 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Andrii Nakryiko <andriin@fb.com> writes:
> >>
> >> > Add new set of bpf_object__open APIs using new approach to optional
> >> > parameters extensibility allowing simpler ABI compatibility approach.
> >> >
> >> > This patch demonstrates an approach to implementing libbpf APIs that
> >> > makes it easy to extend existing APIs with extra optional parameters in
> >> > such a way, that ABI compatibility is preserved without having to do
> >> > symbol versioning and generating lots of boilerplate code to handle it.
> >> > To facilitate succinct code for working with options, add OPTS_VALID,
> >> > OPTS_HAS, and OPTS_GET macros that hide all the NULL and size checks.
> >> >
> >> > Additionally, newly added libbpf APIs are encouraged to follow similar
> >> > pattern of having all mandatory parameters as formal function parameters
> >> > and always have optional (NULL-able) xxx_opts struct, which should
> >> > always have real struct size as a first field and the rest would be
> >> > optional parameters added over time, which tune the behavior of existing
> >> > API, if specified by user.
> >>
> >> I think this is a reasonable idea. It does require some care when adding
> >> new options, though. They have to be truly optional. I.e., I could
> >> imagine that we will have cases where the behaviour might need to be
> >> different if a program doesn't understand a particular option (I am
> >> working on such a case in the kernel ATM). You could conceivably use the
> >> OPTS_HAS() macro to test for this case in the code, but that breaks if a
> >> program is recompiled with no functional change: then it would *appear*
> >> to "understand" that option, but not react properly to it.
> >
> > So let me double-check I'm understanding this correctly.
> >
> > Let's say we have some initial options like:
> >
> > // VERSION 1
> > struct bla_opts {
> >     size_t sz;
> > };
> >
> > // VERSION 2
> > Then in newer version we add new field:
> > struct bla_opts {
> >     int awesomeness_trigger;
> > };
> >
> > Are you saying that if program was built with VERSION 1 in mind (so sz
> > = 8 for bla_opts, so awesomeness_trigger can't be even specified),
> > then that should be different from the program built against VERSION 2
> > and specifying .awesomeness_trigger = 0?
> > Do I get this right? I'm not sure how to otherwise interpret what you
> > are saying, so please elaborate if I didn't get the idea.
> >
> > If that's what you are saying, then I think we shouldn't (and we
> > really can't, see Jesper's remark about padding) distinguish between
> > whether field was not "physically" there or whether it was just set to
> > default 0 value. Treating this uniformly as 0 makes libbpf logic
> > simpler and consistent and behavior much less surprising.
>
> Indeed. My point was that we should make sure we don't try to do this :)

Ah, cool, glad we are in agreement then :)

>
> >> In other words, this should only be used for truly optional bits (like
> >> flags) where the default corresponds to unchanged behaviour relative to
> >> when the option was added.
> >
> > This I agree 100%, furthermore, any added new option has to behave
> > like this. If that's not the case, then it has to be a new API
> > function or at least another symbol version.
>
> Exactly!
>
> >>
> >> A few comments on the syntax below...
> >>
> >>
> >> > +static struct bpf_object *
> >> > +__bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
> >> > +                    struct bpf_object_open_opts *opts, bool enforce_kver)
> >>
> >> I realise this is an internal function, but why does it have a
> >> non-optional parameter *after* the opts?
> >
> > Oh, no reason, added it later and I'm hoping to remove it completely.
> > Current bpf_object__open_buffer always enforces kver presence in a
> > program, which differs from bpf_object__open behavior (where it
> > depends on provided .prog_type argument), so with this I tried to
> > preserve existing behavior. But in the final version of this patch I
> > think I'll just make this kver archaic business in libbpf not
> > enforced. It's been deleted from kernel long time ago, there is no
> > good reason to keep enforcing this in libbpf. If someone is running
> > against old kernel and didn't specify kver, they'll get error anyway.
> > Libbpf will just need to make sure to pass kver through, if it's
> > specified. Thoughts?
>
> Not many. Enforcing anything on kernel version seems brittle anyway, so
> off the top of my head, yeah, let's nuke it (in a backwards-compatible
> way, of course :)).

Yep, that's what I'm intending to do.

>
> >>
> >> >       char tmp_name[64];
> >> > +     const char *name;
> >> >
> >> > -     /* param validation */
> >> > -     if (!obj_buf || obj_buf_sz <= 0)
> >> > -             return NULL;
> >> > +     if (!OPTS_VALID(opts) || !obj_buf || obj_buf_sz == 0)
> >> > +             return ERR_PTR(-EINVAL);
> >> >
> >> > +     name = OPTS_GET(opts, object_name, NULL);
> >> >       if (!name) {
> >> >               snprintf(tmp_name, sizeof(tmp_name), "%lx-%lx",
> >> >                        (unsigned long)obj_buf,
> >> >                        (unsigned long)obj_buf_sz);
> >> >               name = tmp_name;
> >> >       }
> >> > +
> >> >       pr_debug("loading object '%s' from buffer\n", name);
> >> >
> >> > -     return __bpf_object__open(name, obj_buf, obj_buf_sz, true, true);
> >> > +     return __bpf_object__open(name, obj_buf, obj_buf_sz, enforce_kver, 0);
> >> > +}
> >> > +
> >> > +struct bpf_object *
> >> > +bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
> >> > +                  struct bpf_object_open_opts *opts)
> >> > +{
> >> > +     return __bpf_object__open_mem(obj_buf, obj_buf_sz, opts, false);
> >> > +}
> >> > +
> >> > +struct bpf_object *
> >> > +bpf_object__open_buffer(const void *obj_buf, size_t obj_buf_sz, const char *name)
> >> > +{
> >> > +     struct bpf_object_open_opts opts = {
> >> > +             .sz = sizeof(struct bpf_object_open_opts),
> >> > +             .object_name = name,
> >> > +     };
> >>
> >> I think this usage with the "size in struct" model is really awkward.
> >> Could we define a macro to help hide it? E.g.,
> >>
> >> #define BPF_OPTS_TYPE(type) struct bpf_ ## type ## _opts
> >> #define DECLARE_BPF_OPTS(var, type) BPF_OPTS_TYPE(type) var = { .sz = sizeof(BPF_OPTS_TYPE(type)); }
> >
> > We certainly could (though I'd maintain that type specified full
> > struct name, makes it easier to navigate/grep code), but then we'll be
> > preventing this nice syntax of initializing structs, which makes me
> > very said because I love that syntax.
> >>
> >> Then the usage code could be:
> >>
> >> DECLARE_BPF_OPTS(opts, object_open);
> >> opts.object_name = name;
> >>
> >> Still not ideal, but at least it's less boiler plate for the caller, and
> >> people will be less likely to mess up by forgetting to add the size.
> >
> > What do you think about this?
> >
> > #define BPF_OPTS(type, name, ...) \
> >         struct type name = { \
> >                 .sz = sizeof(struct type), \
> >                 __VA_ARGS__ \
> >         }
> >
> > struct bla_opts {
> >         size_t sz;
> >         int opt1;
> >         void *opt2;
> >         const char *opt3;
> > };
> >
> > int main() {
> >         BPF_OPTS(bla_opts, opts,
> >                 .opt1 = 123,
> >                 .opt2 = NULL,
> >                 .opt3 = "fancy",
> >         );
> >
> >         /* then also */
> >         BPF_OPTS(bla_opts, old_school);
> >         old_school.opt1 = 256;
> >
> >         return opts.opt1;
> > }
>
> Sure, LGTM! Should we still keep the bit where it expands _opts in the
> struct name as part of the macro, or does that become too obtuse?

For me it's a question of code navigation. When I'll have a code

LIBBPF_OPTS(bpf_object_open, <whatever>);

I'll want to jump to the definition of "bpf_object_open" (e.g., w/
cscope)... and will find nothing, because it's actually
bpf_object_open_opts. So I prefer user to spell it out exactly and in
full, this is more maintainable in the long run, IMO.

I'll work on all of that in non-RFC v1 then.

>
> > Thanks a lot for a thoughtful feedback, Toke!
>
> You're very welcome! And thanks for working on these API issues!
>
> -Toke
>