bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiri Olsa <jolsa@redhat.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>,
	Martin KaFai Lau <kafai@fb.com>, David Miller <davem@redhat.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Wenbo Zhang <ethercflow@gmail.com>,
	KP Singh <kpsingh@chromium.org>, Andrii Nakryiko <andriin@fb.com>,
	Brendan Gregg <bgregg@netflix.com>,
	Florent Revest <revest@chromium.org>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH 01/11] bpf: Add btfid tool to resolve BTF IDs in ELF object
Date: Fri, 19 Jun 2020 15:03:54 +0200	[thread overview]
Message-ID: <20200619130354.GB2465907@krava> (raw)
In-Reply-To: <CAEf4BzbB0ZMfWHrhiPhv79sMVZ9L0gMj54uXKn_-+mTawPiBqw@mail.gmail.com>

On Thu, Jun 18, 2020 at 05:38:03PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 16, 2020 at 3:06 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > The btfid tool scans Elf object for .BTF_ids section and
> > resolves its symbols with BTF IDs.
> 
> naming is hard and subjective, I know. But given this actively
> modifies ELF file it probably should indicate this in the name. So
> something like patch_btfids or resolve_btfids would be a bit more
> accurate and for people not in the know will still trigger the
> "warning, tool can modify something" flag, if there are any problems.

resolve_btfids sounds good to me

> 
> >
> > It will be used to during linking time to resolve arrays
> > of BTF IDs used in verifier, so these IDs do not need to
> > be resolved in runtime.
> >
> > The expected layout of .BTF_ids section is described
> > in btfid.c header. Related kernel changes are coming in
> > following changes.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  tools/bpf/btfid/Build    |  26 ++
> >  tools/bpf/btfid/Makefile |  71 +++++
> >  tools/bpf/btfid/btfid.c  | 627 +++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 724 insertions(+)
> >  create mode 100644 tools/bpf/btfid/Build
> >  create mode 100644 tools/bpf/btfid/Makefile
> >  create mode 100644 tools/bpf/btfid/btfid.c
> >
> 
> [...]
> 
> > diff --git a/tools/bpf/btfid/btfid.c b/tools/bpf/btfid/btfid.c
> > new file mode 100644
> > index 000000000000..7cdf39bfb150
> > --- /dev/null
> > +++ b/tools/bpf/btfid/btfid.c
> > @@ -0,0 +1,627 @@
> > +// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> > +#define  _GNU_SOURCE
> > +
> > +/*
> > + * btfid scans Elf object for .BTF_ids section and resolves
> > + * its symbols with BTF IDs.
> > + *
> > + * Each symbol points to 4 bytes data and is expected to have
> > + * following name syntax:
> > + *
> > + * __BTF_ID__<type>__<symbol>[__<id>]
> 
> This ___<id> thingy is just disambiguation between multiple places in
> the code that could have BTF_ID macro, right? Or it has extra meaning?

it's there so you could multiple BTF_ID instances with the same
symbol name

> 
> > + *
> > + * type is:
> > + *
> > + *   func   - lookup BTF_KIND_FUNC symbol with <symbol> name
> > + *            and put its ID into its data
> > + *
> > + *             __BTF_ID__func__vfs_close__1:
> > + *             .zero 4
> > + *
> > + *   struct - lookup BTF_KIND_STRUCT symbol with <symbol> name
> > + *            and put its ID into its data
> > + *
> > + *             __BTF_ID__struct__sk_buff__1:
> > + *             .zero 4
> > + *
> > + *   sort   - put symbol size into data area and sort following
> 
> Oh, I finally got what "put symbol size" means :) It's quite unclear,
> to be honest. Also, is this size in bytes or number of IDs? Clarifying
> would be helpful (I'll probably get this from reading further down the
> code, but still..)

I 'put' ;-) the documentation mainly to kernel/bpf/btf_ids.h,

so there are 2 types of lists, first one defines
just IDs as they go:

 BTF_ID_LIST(list1)
 BTF_ID(type1, name1)
 BTF_ID(type2, name2)

and it's used for helpers btf_id array

2nd one provides count and is sorted:

 BTF_WHITELIST_ENTRY(list2)
 BTF_ID(type1, name1)
 BTF_ID(type2, name2)
 BTF_WHITELIST_END(list)

and it's used for d_path whitelist so far

SNIP

> > +               if (sym.st_shndx != obj->efile.idlist_shndx)
> > +                       continue;
> > +
> > +               name = elf_strptr(obj->efile.elf, obj->efile.strtabidx,
> > +                                 sym.st_name);
> > +
> > +               if (!is_btf_id(name))
> > +                       continue;
> > +
> > +               /*
> > +                * __BTF_ID__TYPE__vfs_truncate__0
> > +                * prefix =  ^
> > +                */
> > +               prefix = name + sizeof(BTF_ID) - 1;
> > +
> > +               if (!strncmp(prefix, BTF_STRUCT, sizeof(BTF_STRUCT) - 1)) {
> > +                       id = add_struct(obj, prefix);
> > +               } else if (!strncmp(prefix, BTF_FUNC, sizeof(BTF_FUNC) - 1)) {
> > +                       id = add_func(obj, prefix);
> > +               } else if (!strncmp(prefix, BTF_SORT, sizeof(BTF_SORT) - 1)) {
> > +                       id = add_sort(obj, prefix);
> > +
> > +                       /*
> > +                        * SORT objects store list's count, which is encoded
> > +                        * in symbol's size.
> > +                        */
> > +                       if (id)
> > +                               id->cnt = sym.st_size / sizeof(int);
> 
> doesn't sym.st_size also include extra 4 bytes for length prefix?

no, count is excluded from the size

SNIP

> > +       btf = btf__parse_elf(obj->path, NULL);
> > +       err = libbpf_get_error(btf);
> > +       if (err) {
> > +               pr_err("FAILED: load BTF from %s: %s",
> > +                       obj->path, strerror(err));
> > +               return -1;
> > +       }
> > +
> > +       nr = btf__get_nr_types(btf);
> > +
> > +       /*
> > +        * Iterate all the BTF types and search for collected symbol IDs.
> > +        */
> > +       for (type_id = 0; type_id < nr; type_id++) {
> 
> common gotcha: type_id <= nr, you can also skip type_id == 0 (always VOID)

ugh, yep.. thanks ;-)

> 
> > +               const struct btf_type *type;
> > +               struct rb_root *root = NULL;
> > +               struct btf_id *id;
> > +               const char *str;
> > +               int *nr;
> > +
> > +               type = btf__type_by_id(btf, type_id);
> > +               if (!type)
> > +                       continue;
> 
> This ought to be an error...

ok, something like "BTF malformed error"

> 
> > +
> > +               /* We support func/struct types. */
> > +               if (BTF_INFO_KIND(type->info) == BTF_KIND_FUNC && nr_funcs) {
> 
> see libbpf's btf.h: btf_is_func(type)

ok 

> 
> > +                       root = &obj->funcs;
> > +                       nr = &nr_funcs;
> > +               } else if (BTF_INFO_KIND(type->info) == BTF_KIND_STRUCT && nr_structs) {
> 
> same as above: btf_is_struct
> 
> But I think you also need to support unions?
> 
> Also what about typedefs? A lot of types are typedefs to struct/func_proto/etc.

I added only types which are needed at the moment, but maybe
we can add the basic types now, so we don't need to bother later,
when we forget how this all work ;-)

> 
> > +                       root = &obj->structs;
> > +                       nr = &nr_structs;
> > +               } else {
> > +                       continue;
> > +               }
> > +
> > +               str = btf__name_by_offset(btf, type->name_off);
> > +               if (!str)
> > +                       continue;
> 
> error, shouldn't happen

ok

> 
> > +
> > +               id = btf_id__find(root, str);
> > +               if (id) {
> 
> isn't it an error, if not found?

no, at this point we are checking if this BTF type was collected
as a symbol for struct/func in some list.. if not, we continue the
iteration to next BTF type

> 
> > +                       id->id = type_id;
> > +                       (*nr)--;
> > +               }
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> 
> [...]
> 
> > +
> > +       /*
> > +        * We do proper cleanup and file close
> > +        * intentionally only on success.
> > +        */
> > +       if (elf_collect(&obj))
> > +               return -1;
> > +
> > +       if (symbols_collect(&obj))
> > +               return -1;
> > +
> > +       if (symbols_resolve(&obj))
> > +               return -1;
> > +
> > +       if (symbols_patch(&obj))
> > +               return -1;
> 
> nit: should these elf_end/close properly on error?

I wrote in the comment above that I intentionaly do not cleanup
on error path.. I wanted to save some time, but actualy I think
that would not be so expensive, I can add it

thanks,
jirka

> 
> 
> > +
> > +       elf_end(obj.efile.elf);
> > +       close(obj.efile.fd);
> > +       return 0;
> > +}
> > --
> > 2.25.4
> >
> 


  reply	other threads:[~2020-06-19 13:04 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-16 10:05 [PATCHv3 0/9] bpf: Add d_path helper Jiri Olsa
2020-06-16 10:05 ` [PATCH 01/11] bpf: Add btfid tool to resolve BTF IDs in ELF object Jiri Olsa
2020-06-19  0:38   ` Andrii Nakryiko
2020-06-19 13:03     ` Jiri Olsa [this message]
2020-06-19 18:12       ` Andrii Nakryiko
2020-06-22  8:59         ` Jiri Olsa
2020-06-16 10:05 ` [PATCH 02/11] bpf: Compile btfid tool at kernel compilation start Jiri Olsa
2020-06-18 20:40   ` John Fastabend
2020-06-18 21:17     ` John Fastabend
2020-06-19 13:23     ` Jiri Olsa
2020-06-19  0:40   ` Andrii Nakryiko
2020-06-19  0:47     ` Arnaldo Carvalho de Melo
2020-06-19  2:08       ` Alexei Starovoitov
2020-06-19  3:51         ` Andrii Nakryiko
2020-06-16 10:05 ` [PATCH 03/11] bpf: Add btf_ids object Jiri Olsa
2020-06-19  0:56   ` Andrii Nakryiko
2020-06-19  1:06     ` Andrii Nakryiko
2020-06-19 13:16       ` Jiri Olsa
2020-06-19 13:13     ` Jiri Olsa
2020-06-19 18:15       ` Andrii Nakryiko
2020-06-19  1:02   ` Andrii Nakryiko
2020-06-19 13:05     ` Jiri Olsa
2020-06-16 10:05 ` [PATCH 04/11] bpf: Resolve BTF IDs in vmlinux image Jiri Olsa
2020-06-16 10:05 ` [PATCH 05/11] bpf: Remove btf_id helpers resolving Jiri Olsa
2020-06-19  1:10   ` Andrii Nakryiko
2020-06-19 13:18     ` Jiri Olsa
2020-06-16 10:05 ` [PATCH 06/11] bpf: Do not pass enum bpf_access_type to btf_struct_access Jiri Olsa
2020-06-19  3:58   ` Andrii Nakryiko
2020-06-19 13:23     ` Jiri Olsa
2020-06-16 10:05 ` [PATCH 07/11] bpf: Allow nested BTF object to be refferenced by BTF object + offset Jiri Olsa
2020-06-16 10:05 ` [PATCH 08/11] bpf: Add BTF whitelist support Jiri Olsa
2020-06-19  4:29   ` Andrii Nakryiko
2020-06-19 13:29     ` Jiri Olsa
2020-06-16 10:05 ` [PATCH 09/11] bpf: Add d_path helper Jiri Olsa
2020-06-19  4:35   ` Andrii Nakryiko
2020-06-19 13:31     ` Jiri Olsa
2020-06-19 18:25       ` Andrii Nakryiko
2020-06-22  9:02         ` Jiri Olsa
2020-06-22 19:18           ` Andrii Nakryiko
2020-06-23 10:02             ` Jiri Olsa
2020-06-23 18:58               ` Andrii Nakryiko
2020-06-23 20:14                 ` Jiri Olsa
2020-06-23 20:17                   ` Andrii Nakryiko
2020-06-16 10:05 ` [PATCH 10/11] selftests/bpf: Add verifier test for " Jiri Olsa
2020-06-19  4:38   ` Andrii Nakryiko
2020-06-19 13:32     ` Jiri Olsa
2020-06-16 10:05 ` [PATCH 11/11] selftests/bpf: Add " Jiri Olsa
2020-06-19  4:44   ` Andrii Nakryiko
2020-06-19 13:34     ` Jiri Olsa
2020-06-18 20:57 ` [PATCHv3 0/9] bpf: Add " John Fastabend
2020-06-19 12:35   ` Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200619130354.GB2465907@krava \
    --to=jolsa@redhat.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=ast@kernel.org \
    --cc=bgregg@netflix.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@redhat.com \
    --cc=ethercflow@gmail.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kafai@fb.com \
    --cc=kpsingh@chromium.org \
    --cc=netdev@vger.kernel.org \
    --cc=revest@chromium.org \
    --cc=songliubraving@fb.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).