Re: [PATCH bpf-next 2/4] bpftool: add support for subskeletons

From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Delyan Kratunov <delyank@fb.com>
Cc: "daniel@iogearbox.net" <daniel@iogearbox.net>,
	"ast@kernel.org" <ast@kernel.org>,
	"andrii@kernel.org" <andrii@kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 2/4] bpftool: add support for subskeletons
Date: Fri, 4 Mar 2022 11:29:20 -0800	[thread overview]
Message-ID: <CAEf4BzbuQ+7vkKw0ozkwX7E1D7ygfTbyhaUMJitxTgiYq9y7Fg@mail.gmail.com> (raw)
In-Reply-To: <9028e60f908d30d5f156064cebfd5af8d66c5c9c.camel@fb.com>

On Thu, Mar 3, 2022 at 10:57 AM Delyan Kratunov <delyank@fb.com> wrote:
>
> Thanks Andrii for taking a thorough look!
>
> On Wed, 2022-03-02 at 17:46 -0800, Andrii Nakryiko wrote:
> > There is a need (I needed it in libusdt, for example). Let's add it
> > from the very beginning, especially that it can be done in *exactly*
> > the same form as for skeletons.
>
> Absolutely, I'll get that into the next version!
>
> > seems like subskel's datasec codegen is considerably different (and
> > simpler, really) than skeleton's, I'd add a separate function for it
> > instead of all this if (subskel) special-casing. Main concern for
> > skeleton is generating correct memory layout, while for subskel we
> > just need to generate a list of pointers without any regard to memory
> > layout.
>
> I'm not 100% convinced given how much code would end up being duplicated but I
> can go in that direction, if you'd prefer it.
>
> >
>
> > it's unfortunate to have to modify original BTF just to have this '*'
> > added.  If I remember correctly, C syntax for pointers has special
> > case only for arrays and func prototypes, so it should work with this
> > logic (not tested, obviously)
> >
> > 1. if top variable type is array or func_proto, set var_ident to "(*<variable>)"
> > 2. otherwise set var_ident to "*<variable>"
> >
> > we'd need to test it for array and func_proto, but it definitely
> > should work for all other cases
>
> A couple of thoughts here:
>
> 1. We are not modifing the original BTF, we are layering in-memory-only types on
> top of it. This ends up working transparently through btf_dump code, which is
> the source of truth of what "correct" is. I think this is strictly better than
> the alternative textual modifications to var_ident.

Yeah, I noticed that you are creating split BTF later. Ok, I don't
mind, let's do it this way (due to the horrible pointer-to-array
syntax inconsistency). Please leave the comment somewhere here to make
it obvious that we are not modifying original BTF

>
> 2. I guess we see the change differently - to me it's not just about adding an
> asterisk but instead working with derivative types. This might come in handy in
> other contexts that we haven't envisioned yet and I feel is a direction worth
> supporting overall.

It's not "just about adding an asterisk", it's about generating a
pointer to the type. Split BTF added on top makes it a bit more
tolerable (though there is still a bunch of unnecessary complexity and
overhead just for that pesky asterisk).

Another alternative would be:

typeof(char[123]) *my_ptr;

This can be done without generating extra BTF. For complex types it's
actually even easier to parse, tbh. I initially didn't like it, but
now I'm thinking maybe for arrays and func_protos we should do just
this? WDYT?

>
> 3. We have a full type system with layering and mixed file- and memory-based
> storage. Why limit ourselves to templating instead of using it in the codegen?
> If I were writing this from scratch, much of codegen_datasecs would instead
> create in-memory BTF types and have btf_dump emit them (but that's not the
> bikeshed I want to paint here!).

Maybe, but at the time this code was written we didn't have either
split BTF nor BTF writing APIs.

>
> > > +       char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SKEL_H__")];
> >
> > use __SUBSKEL_H__ here?
>
> Sure, I can introduce the .subskel.h suffix everywhere.
>
> >
> > > +                       strncpy(obj_name, *argv, MAX_OBJ_NAME_LEN - 1);
> > > +                       obj_name[MAX_OBJ_NAME_LEN - 1] = '\0';
> >
> > we should probably force obj_name to be an empty string, so that all
> > map names match their proper section names
>
> Ah, maybe this is why bpf_map__name was not doing the right thing before. I
> don't really like that we're relying on side effects of the empty obj_name but
> I'll try it and see if anything breaks (the templating for one will need it
> anyway).

Yeah, it's due to the fact that we are using "object name" as part of
.data, .rodata, and .bss maps (and .kconfig, probably). Historic
decision, too bad, but we have what we have. We probably should clean
this up in libbpf 1.0, it's too confusing throughout.

>
> >
> > > +       if (verifier_logs)
> > > +               /* log_level1 + log_level2 + stats, but not stable UAPI */
> > > +               opts.kernel_log_level = 1 + 2 + 4;
> >
> > hm.. we shouldn't need this, we are not loading anything into kernel
> > and don't use light skeleton
>
> You're right, will remove.
>
> >
> > > +       obj = bpf_object__open_mem(obj_data, file_sz, &opts);
> > > +       err = libbpf_get_error(obj);
> >
> > no need for libbpf_get_error() anymore, bpftool is in libbpf 1.0 mode,
> > so it will get NULL on error and error itself will be in errno
>
> Ah, yes, I won't add new callsites.
>
> >
> > >
> > > +               map_type_id = btf__find_by_name_kind(btf, bpf_map__section_name(map), BTF_KIND_DATASEC);
> >
> > if we set obj_name to "", bpf_map__name() should return ELF section
> > name here, so no need to expose this as an API
> >
> >
> > oh, but also bpf_map__btf_value_type_id() should give you this ID directly
>
> TIL, that's not obvious at all. There's a few places in gen.c that could be
> simplified then - find_type_for_map goes through slicing the complete name and
> walking over every BTF type to match on the slice. Is there some compatibility
> reason to do that or is btf_value_type_id always there?

No legacy reason, we should use btf_value_type_id if
necessary/possible (it's going to be DATASEC type for all global var
maps, I think). But I just double checked and it seems that we fill
out btf_value_type_id only during map creation (that is, during
bpf_object__load()). But I don't see any reason to postpone it so
late, so let's just move it to the open phase. See
bpf_map_find_btf_info() and where it's called.

>
> >
> > > +               for (i = 0, var = btf_var_secinfos(map_type), len = btf_vlen(map_type);
> > > +                    i < len;
> > > +                    i++, var++) {
> >
> > nit: move those long one-time assignments out of the for() loop and
> > keep it single-line?
>
> Yeah, I hate that structure too, I'll clean it up.
>
> >
> > >
> > > +                       if (!subskel) {                                     \n\
> > > +                               errno = ENOMEM;                             \n\
> > > +                               return NULL;                                \n\
> >
> > leaking obj here
>
> Yeah, I noticed that I didn't use __destroy in the subskel, will fix for v2.
>
>
> > > +       /* walk through each symbol and emit the runtime representation
> > > +        */
> >
> > nit: keep this comment single-lined?
>
> I did initially and checkpatch scolded me :)

checkpatch still loves 80 character lines, but it was relaxed to 100 a
while ago. Checkpatch.pl is not an authority, it's just a suggestion
for a lot of cases.

>
> >
> > > +       bpf_object__for_each_map(map, obj) {
> > > +               if (!bpf_map__is_internal(map))
> > > +                       continue;
> > > +
> > > +               if (!(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
> > > +                       continue;
> > > +
> > > +               if (!get_map_ident(map, ident, sizeof(ident)))
> > > +                       continue;
> >
> > this sequence of ifs seems so frequently repeated that it's probably
> > time to add a helper function?
>
> It is and it's annoying me too. I'll look at the whole iteration pattern more
> closely.
>
> >
> >
> > > +                       codegen("\
> > > +                       \n\
> > > +                               syms[%4$d].name = \"%1$s\";                 \n\
> > > +                               syms[%4$d].section = \"%3$s\";              \n\
> > > +                               syms[%4$d].addr = (void**) &obj->%2$s.%1$s; \n\
> > > +                       ", var_ident, ident, bpf_map__section_name(map), sym_idx);
> > > +               }
> > > +       }
> >
> > why not assign subskel->sym_cnt here using sym_idx and avoid that
> > extra loop over all variables above?
>
> Good call, will do.
> >
> > Quentin will remind you that you should also update the man page and
> > bash completion script :)
>
> Ah, yes, glad to see it's rst and I don't have to suffer groff/mdoc flashbacks!
>
> Thanks,
> Delyan
>