bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alan Maguire <alan.maguire@oracle.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: acme@kernel.org, ast@kernel.org, andrii@kernel.org,
	daniel@iogearbox.net, quentin@isovalent.com, jolsa@kernel.org,
	martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com,
	haoluo@google.com, mykolal@fb.com, bpf@vger.kernel.org
Subject: Re: [PATCH v2 bpf-next 1/9] btf: add kind layout encoding, crcs to UAPI
Date: Thu, 20 Jul 2023 21:18:18 +0100	[thread overview]
Message-ID: <e75a1937-0a92-8563-2242-95e875169880@oracle.com> (raw)
In-Reply-To: <CAEf4Bzazke2aLWfEZChN2BCcf83b9_EufQVAP0Z19LY5z=+yZQ@mail.gmail.com>

On 06/07/2023 00:48, Andrii Nakryiko wrote:
> On Mon, Jul 3, 2023 at 6:42 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> On 22/06/2023 23:02, Andrii Nakryiko wrote:
>>> On Fri, Jun 16, 2023 at 10:17 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>
>>>> BTF kind layouts provide information to parse BTF kinds.
>>>> By separating parsing BTF from using all the information
>>>> it provides, we allow BTF to encode new features even if
>>>> they cannot be used.  This is helpful in particular for
>>>> cases where newer tools for BTF generation run on an
>>>> older kernel; BTF kinds may be present that the kernel
>>>> cannot yet use, but at least it can parse the BTF
>>>> provided.  Meanwhile userspace tools with newer libbpf
>>>> may be able to use the newer information.
>>>>
>>>> The intent is to support encoding of kind layouts
>>>> optionally so that tools like pahole can add this
>>>> information.  So for each kind we record
>>>>
>>>> - kind-related flags
>>>> - length of singular element following struct btf_type
>>>> - length of each of the btf_vlen() elements following
>>>>
>>>> In addition we make space in the BTF header for
>>>> CRC32s computed over the BTF along with a CRC for
>>>> the base BTF; this allows split BTF to identify
>>>> a mismatch explicitly.
>>>>
>>>> The ideas here were discussed at [1], [2]; hence
>>>>
>>>> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>>>
>>>> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/
>>>> [2] https://lore.kernel.org/bpf/20230531201936.1992188-1-alan.maguire@oracle.com/
>>>> ---
>>>>  include/uapi/linux/btf.h       | 24 ++++++++++++++++++++++++
>>>>  tools/include/uapi/linux/btf.h | 24 ++++++++++++++++++++++++
>>>>  2 files changed, 48 insertions(+)
>>>>
>>>> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
>>>> index ec1798b6d3ff..cea9125ed953 100644
>>>> --- a/include/uapi/linux/btf.h
>>>> +++ b/include/uapi/linux/btf.h
>>>> @@ -8,6 +8,22 @@
>>>>  #define BTF_MAGIC      0xeB9F
>>>>  #define BTF_VERSION    1
>>>>
>>>> +/* is this information required? If so it cannot be sanitized safely. */
>>>> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
>>>
>>> hm.. I thought we agreed to not have OPTIONAL flag last time, no? From
>>> kernel's perspective nothing is optional. From libbpf perspective
>>> everything should be optional, unless we get type_id reference to
>>> something that we don't recognize. So why the flag and extra code to
>>> handle it?
>>>
>>> We can always add it later, if necessary.
>>>
>>
>> I totally agree we need to reject any BTF that contains references
>> to unknown objects if these references are made via known ones;
>> so for example an enum64 in a struct (in the case we didn't know
>> about an enum64). However, it's possible a BTF kind could point
>> _at_ other BTF kinds but not be pointed _to_ by any known kinds;
>> in such a case wouldn't optional make sense even for the kernel
>> to say "ignore any kinds that we don't know about that aren't
>> participating in any known BTF relationships"? Default assumption
>> without the optional flag would be to reject such BTF.
> 
> I think it's simpler (and would follow what we've been doing with
> kernel-side strict validation of everything) to reject everything
> unknown. "Being pointed to" isn't always contained within BTF itself.
> E.g., for line and func info, type_id comes during BPF_PROG_LOAD. So
> at the point of BTF validation you don't know that some FUNCs will be
> pointed to (as an example). So I'd avoid making unnecessarily
> dangerous assumptions that some pieces of information can be ignored.
> 
> And in general, kernel doesn't trust user-space data without
> validation, so we'd have to double-check all this OPTIONAL flagsole
> somehow anyways.
> 

makes sense. I think I'd been hoping the BTF kind layout would help
address the "newer pahole runs on older kernel" problem, but it
doesn't really. More on that below...

>>
>>>> +
>>>> +/* kind layout section consists of a struct btf_kind_layout for each known
>>>> + * kind at BTF encoding time.
>>>> + */
>>>> +struct btf_kind_layout {
>>>> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
>>>> +       __u8 info_sz;           /* size of singular element after btf_type */
>>>> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
>>>> +};
>>>> +
>>>> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
>>>> +#define BTF_FLAG_CRC_SET               (1 << 0)
>>>> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
>>>> +
>>>>  struct btf_header {
>>>>         __u16   magic;
>>>>         __u8    version;
>>>> @@ -19,8 +35,16 @@ struct btf_header {
>>>>         __u32   type_len;       /* length of type section       */
>>>>         __u32   str_off;        /* offset of string section     */
>>>>         __u32   str_len;        /* length of string section     */
>>>> +       __u32   kind_layout_off;/* offset of kind layout section */
>>>> +       __u32   kind_layout_len;/* length of kind layout section */
>>>> +
>>>> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
>>>> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
>>>>  };
>>>>
>>>> +/* required minimum BTF header length */
>>>> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
>>>
>>> offsetof(struct btf_header, kind_layout_off) ?
>>>
>>> but actually why this needs to be a part of UAPI?
>>>
>>
>> no not really. I was trying to come up with a more elegant
>> way of differentiating between the old and new header formats
>> on the basis of size and eventually just gave up and added
>> a #define. It can absolutely be removed.
> 
> right, so that's why just checking if field is present based on
> btf_header.len and field offset is a good approach? Let's drop
> unnecessary constants from UAPI header
> 
>>
>>>> +
>>>>  /* Max # of type identifier */
>>>>  #define BTF_MAX_TYPE   0x000fffff
>>>>  /* Max offset into the string section */
>>>> diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
>>>> index ec1798b6d3ff..cea9125ed953 100644
>>>> --- a/tools/include/uapi/linux/btf.h
>>>> +++ b/tools/include/uapi/linux/btf.h
>>>> @@ -8,6 +8,22 @@
>>>>  #define BTF_MAGIC      0xeB9F
>>>>  #define BTF_VERSION    1
>>>>
>>>> +/* is this information required? If so it cannot be sanitized safely. */
>>>> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
>>>> +
>>>> +/* kind layout section consists of a struct btf_kind_layout for each known
>>>> + * kind at BTF encoding time.
>>>> + */
>>>> +struct btf_kind_layout {
>>>> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
>>>> +       __u8 info_sz;           /* size of singular element after btf_type */
>>>> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
>>>> +};
>>>> +
>>>> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
>>>> +#define BTF_FLAG_CRC_SET               (1 << 0)
>>>> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
>>>> +
>>>>  struct btf_header {
>>>>         __u16   magic;
>>>>         __u8    version;
>>>> @@ -19,8 +35,16 @@ struct btf_header {
>>>>         __u32   type_len;       /* length of type section       */
>>>>         __u32   str_off;        /* offset of string section     */
>>>>         __u32   str_len;        /* length of string section     */
>>>> +       __u32   kind_layout_off;/* offset of kind layout section */
>>>> +       __u32   kind_layout_len;/* length of kind layout section */
>>>> +
>>>> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
>>>
>>> why are we making crc optional? shouldn't we just say that crc is
>>> always filled out?
>>>
>>
>> The approach I took was to have libbpf/pahole be flexible about
>> specification of crcs and kind layout; neither, one of these or both
>> are supported. When neither are specified we'll still generate
>> a larger header, but it will be zeros for the new fields so parseable
>> by older libbpf/kernel. I think we probably need to make it optional
>> for a while to support new pahole on older kernels.
> 
> I'm not sure how this "optional for a while" will turn to
> "non-optional", tbh, and who and when will decide that. I think the
> "new pahole on old kernel" problem is solvable easily by making all
> this new stuff opt-in. New kernel Makefiles will request pahole to
> emit them, if pahole is new enough. Old kernels won't know about this
> feature and even new pahole won't emit it.
>

Another approach would be if we could auto-detect the kinds supported
by an older kernel, and not emit anything > BTF_KIND_MAX for that
kernel. I've put together a proof-of-concept for that, see [1].

> I don't feel too strongly about it, just generally feeling like
> minimizing all the different supportable variations.
> 

Yeah, hopefully some of these options can go away eventually.
The auto-detection scheme in [1], or something like it, might
make things easier in future. Thanks!

Alan

[1]
https://lore.kernel.org/bpf/20230720201443.224040-1-alan.maguire@oracle.com/
>>
>>
>>>> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
>>>
>>> here it would be nice if we could just rely on zero meaning not set,
>>> but I suspect not everyone will be happy about this, as technically
>>> crc 0 is a valid crc :(
>>>
>>
>> Right, I think we're stuck with the flags unfortunately.
> 
> yep. one extra reason why I like the idea of this being string offset,
> but whatever, small thing
> 
> 
>> Thanks for the review (and apologies for the belated response!)
>>
>> Alan
>>
>>>
>>>>  };
>>>>
>>>> +/* required minimum BTF header length */
>>>> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
>>>> +
>>>>  /* Max # of type identifier */
>>>>  #define BTF_MAX_TYPE   0x000fffff
>>>>  /* Max offset into the string section */
>>>> --
>>>> 2.39.3
>>>>
> 

  reply	other threads:[~2023-07-20 20:18 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-16 17:17 [PATCH v2 bpf-next 0/9] bpf: support BTF kind layout info, CRCs Alan Maguire
2023-06-16 17:17 ` [PATCH v2 bpf-next 1/9] btf: add kind layout encoding, crcs to UAPI Alan Maguire
2023-06-17  0:39   ` Alexei Starovoitov
2023-06-22 22:02   ` Andrii Nakryiko
2023-07-03 13:42     ` Alan Maguire
2023-07-05 23:48       ` Andrii Nakryiko
2023-07-20 20:18         ` Alan Maguire [this message]
2023-06-16 17:17 ` [PATCH v2 bpf-next 2/9] libbpf: support handling of kind layout section in BTF Alan Maguire
2023-06-22 22:02   ` Andrii Nakryiko
2023-06-16 17:17 ` [PATCH v2 bpf-next 3/9] libbpf: use kind layout to compute an unknown kind size Alan Maguire
2023-06-22 22:03   ` Andrii Nakryiko
2023-06-16 17:17 ` [PATCH v2 bpf-next 4/9] btf: support kernel parsing of BTF with kind layout Alan Maguire
2023-06-18 13:08   ` Jiri Olsa
2023-06-20  8:40     ` Alan Maguire
2023-06-22 22:03   ` Andrii Nakryiko
2023-06-16 17:17 ` [PATCH v2 bpf-next 5/9] libbpf: add kind layout encoding, crc support Alan Maguire
2023-06-19 23:24   ` Yonghong Song
2023-06-20  9:09     ` Alan Maguire
2023-06-20 14:41       ` Yonghong Song
2023-06-22 22:04   ` Andrii Nakryiko
2023-06-16 17:17 ` [PATCH v2 bpf-next 6/9] btf: generate BTF kind layout for vmlinux/module BTF Alan Maguire
2023-06-16 18:53   ` kernel test robot
2023-06-18 13:07   ` Jiri Olsa
2023-06-20  8:46     ` Alan Maguire
2023-06-16 17:17 ` [PATCH v2 bpf-next 7/9] bpftool: add BTF dump "format meta" to dump header/metadata Alan Maguire
2023-06-22 22:04   ` Andrii Nakryiko
2023-06-29 14:16   ` Quentin Monnet
2023-06-16 17:17 ` [PATCH v2 bpf-next 8/9] bpftool: update doc to describe bpftool btf dump .. format meta Alan Maguire
2023-06-29 14:16   ` Quentin Monnet
2023-06-16 17:17 ` [PATCH v2 bpf-next 9/9] selftests/bpf: test kind encoding/decoding Alan Maguire
2023-06-22 23:48   ` Andrii Nakryiko
2023-06-16 17:17 ` [PATCH dwarves] dwarves: encode BTF kind layout, crcs Alan Maguire
2023-06-17  0:39 ` [PATCH v2 bpf-next 0/9] bpf: support BTF kind layout info, CRCs Alexei Starovoitov
2023-06-20  8:41   ` Alan Maguire

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e75a1937-0a92-8563-2242-95e875169880@oracle.com \
    --to=alan.maguire@oracle.com \
    --cc=acme@kernel.org \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=mykolal@fb.com \
    --cc=quentin@isovalent.com \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).