bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alan Maguire <alan.maguire@oracle.com>,
	Alexei Starovoitov <ast@kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	 Arnaldo Carvalho de Melo <acme@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>,  Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	 Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>,
	 Quentin Monnet <quentin@isovalent.com>,
	Mykola Lysenko <mykolal@fb.com>, bpf <bpf@vger.kernel.org>
Subject: Re: [RFC bpf-next 1/8] btf: add kind metadata encoding to UAPI
Date: Fri, 2 Jun 2023 09:34:28 -0700	[thread overview]
Message-ID: <CAEf4BzYSdq4J=TJBOmSVeY86SvJVWS+4OLV112qR9dLi=qm0QA@mail.gmail.com> (raw)
In-Reply-To: <CAEf4BzZaUEqYnyBs6OqX2_L_X=U4zjrKF9nPeyyKp7tRNVLMww@mail.gmail.com>

On Fri, Jun 2, 2023 at 9:32 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Jun 1, 2023 at 9:54 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Jun 1, 2023 at 3:38 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > >
> > > On 01/06/2023 04:53, Alexei Starovoitov wrote:
> > > > On Wed, May 31, 2023 at 09:19:28PM +0100, Alan Maguire wrote:
> > > >> BTF kind metadata provides information to parse BTF kinds.
> > > >> By separating parsing BTF from using all the information
> > > >> it provides, we allow BTF to encode new features even if
> > > >> they cannot be used.  This is helpful in particular for
> > > >> cases where newer tools for BTF generation run on an
> > > >> older kernel; BTF kinds may be present that the kernel
> > > >> cannot yet use, but at least it can parse the BTF
> > > >> provided.  Meanwhile userspace tools with newer libbpf
> > > >> may be able to use the newer information.
> > > >>
> > > >> The intent is to support encoding of kind metadata
> > > >> optionally so that tools like pahole can add this
> > > >> information.  So for each kind we record
> > > >>
> > > >> - a kind name string
> > > >> - kind-related flags
> > > >> - length of singular element following struct btf_type
> > > >> - length of each of the btf_vlen() elements following
> > > >>
> > > >> In addition we make space in the metadata for
> > > >> CRC32s computed over the BTF along with a CRC for
> > > >> the base BTF; this allows split BTF to identify
> > > >> a mismatch explicitly.  Finally we provide an
> > > >> offset for an optional description string.
> > > >>
> > > >> The ideas here were discussed at [1] hence
> > > >>
> > > >> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > > >> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > > >>
> > > >> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/
> > > >> ---
> > > >>  include/uapi/linux/btf.h       | 29 +++++++++++++++++++++++++++++
> > > >>  tools/include/uapi/linux/btf.h | 29 +++++++++++++++++++++++++++++
> > > >>  2 files changed, 58 insertions(+)
> > > >>
> > > >> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
> > > >> index ec1798b6d3ff..94c1f4518249 100644
> > > >> --- a/include/uapi/linux/btf.h
> > > >> +++ b/include/uapi/linux/btf.h
> > > >> @@ -8,6 +8,34 @@
> > > >>  #define BTF_MAGIC   0xeB9F
> > > >>  #define BTF_VERSION 1
> > > >>
> > > >> +/* is this information required? If so it cannot be sanitized safely. */
> > > >> +#define BTF_KIND_META_OPTIONAL              (1 << 0)
>
> Another flag I was thinking about was a flag whether struct btf_type's
> type/size field is a type or a size (or something else). E.g., let's
> say we haven't had btf_type_tag yet and were adding it after we had
> this new metadata. We could say that type_tag's type/size field is
> actually a type ID, and generic tools like bpftool could basically
> skip type_tag and resolve to underlying type. This way, optional
> modifier/decorator KINDs won't even have to break applications using
> old libbpf's when it comes to calculating type sizes and resolving
> them.
>
> > > >> +
> > > >> +struct btf_kind_meta {
> > > >> +    __u32 name_off;         /* kind name string offset */
>
> I'm not sure why we'd need to record this for every KIND? The tool
> that doesn't know about this new kind can't do much about it anyways,
> so whether it knows that this is "KIND_NEW_FANCY" or just its ID #123
> doesn't make much difference?
>
> > > >> +    __u16 flags;            /* see BTF_KIND_META_* values above */
> > > >> +    __u8 info_sz;           /* size of singular element after btf_type */
> > > >> +    __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
>
> on the other hand, reserving __u32 as "extra" might be useful if we
> ever want to have some more stuff here (worst case we can define it as
> string offset which describes some more metadata
>
>
> > > >> +};
> > > >> +
> > > >> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
> > > >> +#define BTF_META_CRC_SET            (1 << 0)
> > > >> +#define BTF_META_BASE_CRC_SET               (1 << 1)
> > > >> +
> > > >> +struct btf_metadata {
> > > >> +    __u8    kind_meta_cnt;          /* number of struct btf_kind_meta */
> > > >> +    __u32   flags;
> > > >> +    __u32   description_off;        /* optional description string */
> > > >> +    __u32   crc;                    /* crc32 of BTF */
> > > >> +    __u32   base_crc;               /* crc32 of base BTF */
> > > >> +    struct btf_kind_meta kind_meta[];
> > > >> +};
> > > >> +
> > > >> +struct btf_meta_header {
> > > >> +    __u32   meta_off;       /* offset of metadata section */
> > > >> +    __u32   meta_len;       /* length of metadata section */
> > > >> +};
> > > >> +
> > > >>  struct btf_header {
> > > >>      __u16   magic;
> > > >>      __u8    version;
> > > >> @@ -19,6 +47,7 @@ struct btf_header {
> > > >>      __u32   type_len;       /* length of type section       */
> > > >>      __u32   str_off;        /* offset of string section     */
> > > >>      __u32   str_len;        /* length of string section     */
> > > >> +    struct btf_meta_header meta_header;
>
> looking through all this, it seems like it would be better to have a
> separate "metadata" section, just like we have types and strings? And
> then the contents are described by btf_metadata struct?

Ok, wait, I'm dumb. You are effectively already doing this, but for
some reason added unnecessary struct around meta_off, meta_len fields.
Why? Let's keep it consistent with {type,str}_{off,len}?

>
>
> > > >>  };
> > > >>
> > > >>  /* Max # of type identifier */
> > > >> diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
> > > >> index ec1798b6d3ff..94c1f4518249 100644
> > > >> --- a/tools/include/uapi/linux/btf.h
> > > >> +++ b/tools/include/uapi/linux/btf.h
> > > >> @@ -8,6 +8,34 @@
> > > >>  #define BTF_MAGIC   0xeB9F
> > > >>  #define BTF_VERSION 1
> > > >>
> > > >> +/* is this information required? If so it cannot be sanitized safely. */
> > > >> +#define BTF_KIND_META_OPTIONAL              (1 << 0)
> > > >> +
> > > >> +struct btf_kind_meta {
> > > >> +    __u32 name_off;         /* kind name string offset */
> > > >> +    __u16 flags;            /* see BTF_KIND_META_* values above */
> > > >> +    __u8 info_sz;           /* size of singular element after btf_type */
> > > >> +    __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
> > > >> +};
> > > >> +
> > > >> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
> > > >> +#define BTF_META_CRC_SET            (1 << 0)
> > > >> +#define BTF_META_BASE_CRC_SET               (1 << 1)
> > > >> +
> > > >> +struct btf_metadata {
> > > >> +    __u8    kind_meta_cnt;          /* number of struct btf_kind_meta */
> > > >
> > > > Overall, looks great.
> > > > Few small nits:
> > > > I'd make kind_meta_cnt u32, since padding we won't be able to reuse anyway
> > > > and would bump the BTF_VERSION to 2 to make it a 'milestone'.
>
> Bumping BTF_VERSION to 2 automatically makes BTF incompatible with all
> existing kernels (and potentially many tools that parse BTF). Given we
> can actually extend BTF in backwards compatible way by just adding an
> optional two fields to btf_header + extra bytes for metadata sections,
> why making our lives harder by bumping this version?
>
> > > > v2 -> self described.
> > >
> > > sure, sounds good. One other change perhaps worth making; currently
> > > we assume that the kind metadata is at the end of the struct
> > > btf_metadata, but if we ever wanted to add metadata fields in the
> > > future, we'd want so support both the current metadata structure and
> > > any future structure which had additional fields.
>
> see above, another reason to make metadata a separate section, in
> addition to types and strings
>
> > >
> > > With that in mind, it might make sense to go with something like
> > >
> > > struct btf_metadata {
> > >         __u32   kind_meta_cnt;
> > >         __u32   kind_meta_offset;       /* kind_meta_cnt instances of struct
> > > btf_kind_meta start here */
> > >         __u32   flags;
> > >         __u32   description_off;        /* optional description string*/
> > >         __u32   crc;                    /* crc32 of BTF */
> > >         __u32   base_crc;               /* crc32 of base BTF */
> > > };
> > >
> > > For the original version, kind_meta_offset would just be
> > > at meta_off + sizeof(struct btf_metadata), but if we had multiple
> > > versions of the btf_metadata header to handle, they could all rely on
> > > the kind_meta_offset being where kind metadata is stored.
> > > For validation we'd have to make sure kind_meta_offset was within
> > > the the metadata header range.
> >
> > kind_meta_offset is an ok idea, but I don't quite see why we'd have
> > multiple 'struct btf_metadata' pointing to the same set of 'struct
> > btf_kind_meta'.
> >
> > Also why do we need description_off ? Shouldn't string go into
> > btf_header->str_off ?
> >
> > > >
> > > >> +    __u32   flags;
> > > >> +    __u32   description_off;        /* optional description string */
> > > >> +    __u32   crc;                    /* crc32 of BTF */
> > > >> +    __u32   base_crc;               /* crc32 of base BTF */
> > > >
> > > > Hard coded CRC also gives me a pause.
> > > > Should it be an optional KIND like btf tags?
> > >
> > > The goal of the CRC is really just to provide a unique identifier that
> > > we can use for things like checking if there's a mismatch between
> > > base and module BTF. If we want to ever do CRC validation (not sure
> > > if there's a case for that) we probably need to think about cases like
> > > BTF sanitization of BPF program BTF; this would likely only be an
> > > issue if metadata support is added to BPF compilers.
> > >
> > > The problem with adding it via a kind is that if we first compute
> > > the CRC over the entire BTF object and then add the kind, the addition
> > > of the kind breaks the CRC; as a result I _think_ we're stuck with
> > > having to have it in the header.
> >
> > Hmm. libbpf can add BTF_KIND_CRC with zero-ed u32 crc field
> > and later fill it in.
> > It's really not different than u32 crc field inside 'struct btf_metadata'.
> >
> > > That said I don't think CRC is necessarily the only identifier
> > > we could use, and we don't even need to identify it as a
> > > CRC in the UAPI, just as a "unique identifier"; that would deal
> > > with issues about breaking the CRC during sanitization. All
> > > depends on whether we ever see a need to verify BTF via CRC
> > > really.
> >
> > Right. It could be sha or anything else, but user space and kernel
> > need to agree on the math to compute it, so something got to indicate
> > that this 32-bit is a crc.
> > Hence KIND_CRC, KIND_SHA fit better.
>
> what if instead of crc and base_src fields, we have
>
> __u32 id_str_off;
> __u32 base_id_str_off;
>
> and they are offsets into a string section. We can then define that
> those strings have to be something like "crc:<crc-value>" or
> "sha:<sha-value". This will be a generic ID, and extensible (and more
> easily extensible, probably), but won't require new KIND.
>
> This also has a good property that 0 means "no ID", which helps with
> the base BTF case. Current "__u32 crc;" doesn't have this property and
> requires a flag.

  reply	other threads:[~2023-06-02 16:34 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-31 20:19 [RFC bpf-next 0/8] bpf: support BTF kind metadata to separate Alan Maguire
2023-05-31 20:19 ` [RFC bpf-next 1/8] btf: add kind metadata encoding to UAPI Alan Maguire
2023-06-01  3:53   ` Alexei Starovoitov
2023-06-01 10:36     ` Alan Maguire
2023-06-01 16:53       ` Alexei Starovoitov
2023-06-02 16:32         ` Andrii Nakryiko
2023-06-02 16:34           ` Andrii Nakryiko [this message]
2023-06-02 18:11           ` Alexei Starovoitov
2023-06-02 20:33             ` Andrii Nakryiko
2023-06-05 16:14               ` Alexei Starovoitov
2023-06-05 22:38                 ` Andrii Nakryiko
2023-06-06  2:46                   ` Alexei Starovoitov
2023-06-06 11:30                     ` Toke Høiland-Jørgensen
2023-06-07 11:55                       ` Eduard Zingerman
2023-06-07 15:29                         ` Yonghong Song
2023-06-07 16:14                           ` Eduard Zingerman
2023-06-07 21:47                             ` Andrii Nakryiko
2023-06-07 22:05                               ` Eduard Zingerman
2023-06-07 22:34                                 ` Andrii Nakryiko
2023-06-06 16:50                     ` Andrii Nakryiko
2023-06-07  1:16                       ` Alexei Starovoitov
2023-06-07 21:43                         ` Andrii Nakryiko
2023-05-31 20:19 ` [RFC bpf-next 2/8] libbpf: support handling of metadata section in BTF Alan Maguire
2023-06-05 11:01   ` Jiri Olsa
2023-06-05 21:40     ` Andrii Nakryiko
2023-05-31 20:19 ` [RFC bpf-next 3/8] libbpf: use metadata to compute an unknown kind size Alan Maguire
2023-05-31 20:19 ` [RFC bpf-next 4/8] btf: support kernel parsing of BTF with metadata, use it to parse BTF with unknown kinds Alan Maguire
2023-06-07 19:51   ` Eduard Zingerman
2023-05-31 20:19 ` [RFC bpf-next 5/8] libbpf: add metadata encoding support Alan Maguire
2023-05-31 20:19 ` [RFC bpf-next 6/8] btf: generate metadata for vmlinux/module BTF Alan Maguire
2023-05-31 20:19 ` [RFC bpf-next 7/8] bpftool: add BTF dump "format meta" to dump header/metadata Alan Maguire
2023-06-01 16:33   ` Quentin Monnet
2023-06-02 16:57   ` Andrii Nakryiko
2023-05-31 20:19 ` [RFC bpf-next 8/8] selftests/bpf: test kind encoding/decoding Alan Maguire
2023-05-31 20:19 ` [RFC dwarves] dwarves: encode BTF metadata if --btf_gen_meta is set Alan Maguire

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEf4BzYSdq4J=TJBOmSVeY86SvJVWS+4OLV112qR9dLi=qm0QA@mail.gmail.com' \
    --to=andrii.nakryiko@gmail.com \
    --cc=acme@kernel.org \
    --cc=alan.maguire@oracle.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=mykolal@fb.com \
    --cc=quentin@isovalent.com \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).