Re: [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support
       [not found] ` <20201110011932.3201430-2-andrii@kernel.org>
@ 2020-11-10 17:49   ` Song Liu
  2020-11-10 18:31     ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Song Liu @ 2020-11-10 17:49 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, daniel, Kernel Team,
	linux-kernel, rafael, jeyu, Arnaldo Carvalho de Melo,
	Greg Kroah-Hartman



> On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> Adjust in-kernel BTF implementation to support a split BTF mode of operation.
> Changes are mostly mirroring libbpf split BTF changes, with the exception of
> start_id being 0 for in-kernel implementation due to simpler read-only mode.
> 
> Otherwise, for split BTF logic, most of the logic of jumping to base BTF,
> where necessary, is encapsulated in few helper functions. Type numbering and
> string offset in a split BTF are logically continuing where base BTF ends, so
> most of the high-level logic is kept without changes.
> 
> Type verification and size resolution is only doing an added resolution of new
> split BTF types and relies on already cached size and type resolution results
> in the base BTF.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> kernel/bpf/btf.c | 171 +++++++++++++++++++++++++++++++++--------------
> 1 file changed, 119 insertions(+), 52 deletions(-)
> 
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 6324de8c59f7..727c1c27053f 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -203,12 +203,17 @@ struct btf {
> 	const char *strings;
> 	void *nohdr_data;
> 	struct btf_header hdr;
> -	u32 nr_types;
> +	u32 nr_types; /* includes VOID for base BTF */
> 	u32 types_size;
> 	u32 data_size;
> 	refcount_t refcnt;
> 	u32 id;
> 	struct rcu_head rcu;
> +
> +	/* split BTF support */
> +	struct btf *base_btf;
> +	u32 start_id; /* first type ID in this BTF (0 for base BTF) */
> +	u32 start_str_off; /* first string offset (0 for base BTF) */
> };
> 
> enum verifier_phase {
> @@ -449,14 +454,27 @@ static bool btf_type_is_datasec(const struct btf_type *t)
> 	return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
> }
> 
> +static u32 btf_nr_types_total(const struct btf *btf)
> +{
> +	u32 total = 0;
> +
> +	while (btf) {
> +		total += btf->nr_types;
> +		btf = btf->base_btf;
> +	}
> +
> +	return total;
> +}
> +
> s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> {
> 	const struct btf_type *t;
> 	const char *tname;
> -	u32 i;
> +	u32 i, total;
> 
> -	for (i = 1; i <= btf->nr_types; i++) {
> -		t = btf->types[i];
> +	total = btf_nr_types_total(btf);
> +	for (i = 1; i < total; i++) {
> +		t = btf_type_by_id(btf, i);
> 		if (BTF_INFO_KIND(t->info) != kind)
> 			continue;
> 
> @@ -599,8 +617,14 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
> 
> static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
> {
> -	return BTF_STR_OFFSET_VALID(offset) &&
> -		offset < btf->hdr.str_len;
> +	if (!BTF_STR_OFFSET_VALID(offset))
> +		return false;
> +
> +	while (offset < btf->start_str_off)
> +		btf = btf->base_btf;

Do we need "if (!btf) return false;" in the while loop? (and some other loops below)

[...]


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support
  2020-11-10 17:49   ` [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support Song Liu
@ 2020-11-10 18:31     ` Andrii Nakryiko
  2020-11-10 20:14       ` Song Liu
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2020-11-10 18:31 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, daniel,
	Kernel Team, linux-kernel, rafael, jeyu,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman

On Tue, Nov 10, 2020 at 9:50 AM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Adjust in-kernel BTF implementation to support a split BTF mode of operation.
> > Changes are mostly mirroring libbpf split BTF changes, with the exception of
> > start_id being 0 for in-kernel implementation due to simpler read-only mode.
> >
> > Otherwise, for split BTF logic, most of the logic of jumping to base BTF,
> > where necessary, is encapsulated in few helper functions. Type numbering and
> > string offset in a split BTF are logically continuing where base BTF ends, so
> > most of the high-level logic is kept without changes.
> >
> > Type verification and size resolution is only doing an added resolution of new
> > split BTF types and relies on already cached size and type resolution results
> > in the base BTF.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> > kernel/bpf/btf.c | 171 +++++++++++++++++++++++++++++++++--------------
> > 1 file changed, 119 insertions(+), 52 deletions(-)
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 6324de8c59f7..727c1c27053f 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -203,12 +203,17 @@ struct btf {
> >       const char *strings;
> >       void *nohdr_data;
> >       struct btf_header hdr;
> > -     u32 nr_types;
> > +     u32 nr_types; /* includes VOID for base BTF */
> >       u32 types_size;
> >       u32 data_size;
> >       refcount_t refcnt;
> >       u32 id;
> >       struct rcu_head rcu;
> > +
> > +     /* split BTF support */
> > +     struct btf *base_btf;
> > +     u32 start_id; /* first type ID in this BTF (0 for base BTF) */
> > +     u32 start_str_off; /* first string offset (0 for base BTF) */
> > };
> >
> > enum verifier_phase {
> > @@ -449,14 +454,27 @@ static bool btf_type_is_datasec(const struct btf_type *t)
> >       return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
> > }
> >
> > +static u32 btf_nr_types_total(const struct btf *btf)
> > +{
> > +     u32 total = 0;
> > +
> > +     while (btf) {
> > +             total += btf->nr_types;
> > +             btf = btf->base_btf;
> > +     }
> > +
> > +     return total;
> > +}
> > +
> > s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> > {
> >       const struct btf_type *t;
> >       const char *tname;
> > -     u32 i;
> > +     u32 i, total;
> >
> > -     for (i = 1; i <= btf->nr_types; i++) {
> > -             t = btf->types[i];
> > +     total = btf_nr_types_total(btf);
> > +     for (i = 1; i < total; i++) {
> > +             t = btf_type_by_id(btf, i);
> >               if (BTF_INFO_KIND(t->info) != kind)
> >                       continue;
> >
> > @@ -599,8 +617,14 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
> >
> > static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
> > {
> > -     return BTF_STR_OFFSET_VALID(offset) &&
> > -             offset < btf->hdr.str_len;
> > +     if (!BTF_STR_OFFSET_VALID(offset))
> > +             return false;
> > +
> > +     while (offset < btf->start_str_off)
> > +             btf = btf->base_btf;
>
> Do we need "if (!btf) return false;" in the while loop? (and some other loops below)

No, because for base btf start_str_off and start_type_id are always
zero, so loop condition is always false.

>
> [...]
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support
  2020-11-10 18:31     ` Andrii Nakryiko
@ 2020-11-10 20:14       ` Song Liu
  0 siblings, 0 replies; 20+ messages in thread
From: Song Liu @ 2020-11-10 20:14 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, daniel,
	Kernel Team, linux-kernel, rafael, jeyu,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman



> On Nov 10, 2020, at 10:31 AM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Tue, Nov 10, 2020 at 9:50 AM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
>>> 
>>> Adjust in-kernel BTF implementation to support a split BTF mode of operation.
>>> Changes are mostly mirroring libbpf split BTF changes, with the exception of
>>> start_id being 0 for in-kernel implementation due to simpler read-only mode.
>>> 
>>> Otherwise, for split BTF logic, most of the logic of jumping to base BTF,
>>> where necessary, is encapsulated in few helper functions. Type numbering and
>>> string offset in a split BTF are logically continuing where base BTF ends, so
>>> most of the high-level logic is kept without changes.
>>> 
>>> Type verification and size resolution is only doing an added resolution of new
>>> split BTF types and relies on already cached size and type resolution results
>>> in the base BTF.
>>> 
>>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>>> ---
>>> kernel/bpf/btf.c | 171 +++++++++++++++++++++++++++++++++--------------
>>> 1 file changed, 119 insertions(+), 52 deletions(-)
>>> 
>>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>>> index 6324de8c59f7..727c1c27053f 100644
>>> --- a/kernel/bpf/btf.c
>>> +++ b/kernel/bpf/btf.c
>>> @@ -203,12 +203,17 @@ struct btf {
>>>      const char *strings;
>>>      void *nohdr_data;
>>>      struct btf_header hdr;
>>> -     u32 nr_types;
>>> +     u32 nr_types; /* includes VOID for base BTF */
>>>      u32 types_size;
>>>      u32 data_size;
>>>      refcount_t refcnt;
>>>      u32 id;
>>>      struct rcu_head rcu;
>>> +
>>> +     /* split BTF support */
>>> +     struct btf *base_btf;
>>> +     u32 start_id; /* first type ID in this BTF (0 for base BTF) */
>>> +     u32 start_str_off; /* first string offset (0 for base BTF) */
>>> };
>>> 
>>> enum verifier_phase {
>>> @@ -449,14 +454,27 @@ static bool btf_type_is_datasec(const struct btf_type *t)
>>>      return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
>>> }
>>> 
>>> +static u32 btf_nr_types_total(const struct btf *btf)
>>> +{
>>> +     u32 total = 0;
>>> +
>>> +     while (btf) {
>>> +             total += btf->nr_types;
>>> +             btf = btf->base_btf;
>>> +     }
>>> +
>>> +     return total;
>>> +}
>>> +
>>> s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
>>> {
>>>      const struct btf_type *t;
>>>      const char *tname;
>>> -     u32 i;
>>> +     u32 i, total;
>>> 
>>> -     for (i = 1; i <= btf->nr_types; i++) {
>>> -             t = btf->types[i];
>>> +     total = btf_nr_types_total(btf);
>>> +     for (i = 1; i < total; i++) {
>>> +             t = btf_type_by_id(btf, i);
>>>              if (BTF_INFO_KIND(t->info) != kind)
>>>                      continue;
>>> 
>>> @@ -599,8 +617,14 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
>>> 
>>> static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
>>> {
>>> -     return BTF_STR_OFFSET_VALID(offset) &&
>>> -             offset < btf->hdr.str_len;
>>> +     if (!BTF_STR_OFFSET_VALID(offset))
>>> +             return false;
>>> +
>>> +     while (offset < btf->start_str_off)
>>> +             btf = btf->base_btf;
>> 
>> Do we need "if (!btf) return false;" in the while loop? (and some other loops below)
> 
> No, because for base btf start_str_off and start_type_id are always
> zero, so loop condition is always false.

Ah, I misread the code. Thanks for the explanation. 

Acked-by: Song Liu <songliubraving@fb.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 2/5] bpf: assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO
       [not found] ` <20201110011932.3201430-3-andrii@kernel.org>
@ 2020-11-10 20:24   ` Song Liu
  0 siblings, 0 replies; 20+ messages in thread
From: Song Liu @ 2020-11-10 20:24 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, daniel, Kernel Team,
	linux-kernel, rafael, jeyu, Arnaldo Carvalho de Melo,
	Greg Kroah-Hartman



> On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> Allocate ID for vmlinux BTF. This makes it visible when iterating over all BTF
> objects in the system. To allow distinguishing vmlinux BTF (and later kernel
> module BTF) from user-provided BTFs, expose extra kernel_btf flag, as well as
> BTF name ("vmlinux" for vmlinux BTF, will equal to module's name for module
> BTF).  We might want to later allow specifying BTF name for user-provided BTFs
> as well, if that makes sense. But currently this is reserved only for
> in-kernel BTFs.
> 
> Having in-kernel BTFs exposed IDs will allow to extend BPF APIs that require
> in-kernel BTF type with ability to specify BTF types from kernel modules, not
> just vmlinux BTF. This will be implemented in a follow up patch set for
> fentry/fexit/fmod_ret/lsm/etc.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it
       [not found] ` <20201110011932.3201430-4-andrii@kernel.org>
@ 2020-11-11  1:04   ` Song Liu
  2020-11-16 19:55     ` Allan, Bruce W
  2020-11-17  3:44   ` David Ahern
  1 sibling, 1 reply; 20+ messages in thread
From: Song Liu @ 2020-11-11  1:04 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, open list, rafael, jeyu, Arnaldo Carvalho de Melo,
	Greg Kroah-Hartman, Masahiro Yamada



> On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:

[...]

> SPLIT BTF
> =========
> 
> $ for f in $(find . -name '*.ko'); do size -A -d $f | grep BTF | awk '{print $2}'; done | awk '{ s += $1 } END { print s }'
> 5194047
> 
> $ for f in $(find . -name '*.ko'); do printf "%s %d\n" $f $(size -A -d $f | grep BTF | awk '{print $2}'); done | sort -nr -k2 | head -n10
> ./drivers/gpu/drm/i915/i915.ko 293206
> ./drivers/gpu/drm/radeon/radeon.ko 282103
> ./fs/xfs/xfs.ko 222150
> ./drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko 198503
> ./drivers/infiniband/hw/mlx5/mlx5_ib.ko 198356
> ./drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko 113444
> ./fs/cifs/cifs.ko 109379
> ./arch/x86/kvm/kvm.ko 100225
> ./drivers/gpu/drm/drm.ko 94827
> ./drivers/infiniband/core/ib_core.ko 91188
> 
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 5/5] tools/bpftool: add support for in-kernel and named BTF in `btf show`
       [not found] ` <20201110011932.3201430-6-andrii@kernel.org>
@ 2020-11-11  1:15   ` Song Liu
  2020-11-11  4:19     ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Song Liu @ 2020-11-11  1:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, daniel, Kernel Team,
	linux-kernel, rafael, jeyu, Arnaldo Carvalho de Melo,
	Greg Kroah-Hartman, Alan Maguire



> On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:

[...]

> ...
> 
> Tested-by: Alan Maguire <alan.maguire@oracle.com>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

With one nit:

> ---
> tools/bpf/bpftool/btf.c | 28 +++++++++++++++++++++++++++-
> 1 file changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
> index c96b56e8e3a4..ed5e97157241 100644
> --- a/tools/bpf/bpftool/btf.c
> +++ b/tools/bpf/bpftool/btf.c
> @@ -742,9 +742,14 @@ show_btf_plain(struct bpf_btf_info *info, int fd,
> 	       struct btf_attach_table *btf_map_table)
> {
> 	struct btf_attach_point *obj;
> +	const char *name = u64_to_ptr(info->name);
> 	int n;
> 
> 	printf("%u: ", info->id);
> +	if (info->kernel_btf)
> +		printf("name [%s]  ", name);
> +	else if (name && name[0])
> +		printf("name %s  ", name);

Maybe explicitly say "name <anonymous>" for btf without a name? I think 
it will benefit plain output.  

> 	printf("size %uB", info->btf_size);
> 
> 	n = 0;

[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 5/5] tools/bpftool: add support for in-kernel and named BTF in `btf show`
  2020-11-11  1:15   ` [PATCH v4 bpf-next 5/5] tools/bpftool: add support for in-kernel and named BTF in `btf show` Song Liu
@ 2020-11-11  4:19     ` Andrii Nakryiko
  2020-11-11  7:44       ` Song Liu
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2020-11-11  4:19 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, daniel,
	Kernel Team, linux-kernel, rafael, jeyu,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman, Alan Maguire

On Tue, Nov 10, 2020 at 5:15 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
>
> [...]
>
> > ...
> >
> > Tested-by: Alan Maguire <alan.maguire@oracle.com>
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> With one nit:
>
> > ---
> > tools/bpf/bpftool/btf.c | 28 +++++++++++++++++++++++++++-
> > 1 file changed, 27 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
> > index c96b56e8e3a4..ed5e97157241 100644
> > --- a/tools/bpf/bpftool/btf.c
> > +++ b/tools/bpf/bpftool/btf.c
> > @@ -742,9 +742,14 @@ show_btf_plain(struct bpf_btf_info *info, int fd,
> >              struct btf_attach_table *btf_map_table)
> > {
> >       struct btf_attach_point *obj;
> > +     const char *name = u64_to_ptr(info->name);
> >       int n;
> >
> >       printf("%u: ", info->id);
> > +     if (info->kernel_btf)
> > +             printf("name [%s]  ", name);
> > +     else if (name && name[0])
> > +             printf("name %s  ", name);
>
> Maybe explicitly say "name <anonymous>" for btf without a name? I think
> it will benefit plain output.

This patch set is already landed. But I can do a follow-up patch to add this.

>
> >       printf("size %uB", info->btf_size);
> >
> >       n = 0;
>
> [...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs
       [not found] ` <20201110011932.3201430-5-andrii@kernel.org>
@ 2020-11-11  7:11   ` kernel test robot
  2020-11-11 10:13   ` Jessica Yu
  1 sibling, 0 replies; 20+ messages in thread
From: kernel test robot @ 2020-11-11  7:11 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast, daniel
  Cc: kbuild-all, clang-built-linux, kernel-team, linux-kernel, rafael,
	jeyu, Andrii Nakryiko, Arnaldo Carvalho de Melo

[-- Attachment #1: Type: text/plain, Size: 4563 bytes --]

Hi Andrii,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Andrii-Nakryiko/Integrate-kernel-module-BTF-support/20201110-095309
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: mips-randconfig-r002-20201110 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 4d81c8adb6ed9840257f6cb6b93f60856d422a15)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install mips cross compiling tool for clang build
        # apt-get install binutils-mips-linux-gnu
        # https://github.com/0day-ci/linux/commit/dcd763b7808fdc01ebf70bbe07ba92388df4d20d
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Andrii-Nakryiko/Integrate-kernel-module-BTF-support/20201110-095309
        git checkout dcd763b7808fdc01ebf70bbe07ba92388df4d20d
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=mips 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> kernel/bpf/btf.c:4481:20: warning: unused function 'btf_parse_module'
   static struct btf char const void unsigned int data_size)
   ^
   fatal error: error in backend: Nested variants found in inline asm string: ' .set push
   .set mips64r2
   .if ( 0x00 ) != -1)) 0x00 ) != -1)) : ($( static struct ftrace_branch_data __attribute__((__aligned__(4))) __attribute__((__section__("_ftrace_branch"))) __if_trace = $( .func = __func__, .file = "arch/mips/include/asm/atomic.h", .line = 154, $); 0x00 ) != -1)) : $))) ) && ( 0 ); .set push; .set mips64r2; .rept 1; sync 0x00; .endr; .set pop; .else; ; .endif
   1: ll $1, $2 # atomic_fetch_sub
   subu $0, $1, $3
   sc $0, $2
   beqz $0, 1b
   .set pop
   move $0, $1
   '
   clang-12: error: clang frontend command failed with exit code 70 (use -v to see invocation)
   clang version 12.0.0 (git://gitmirror/llvm_project 874b0a0b9db93f5d3350ffe6b5efda2d908415d0)
   Target: mipsel-unknown-linux-gnu
   Thread model: posix
   InstalledDir: /opt/cross/clang-874b0a0b9d/bin
   clang-12: note: diagnostic msg:
   Makefile arch drivers include kernel mm scripts source usr

vim +/btf_parse_module +4481 kernel/bpf/btf.c

  4480	
> 4481	static struct btf *btf_parse_module(const char *module_name, const void *data, unsigned int data_size)
  4482	{
  4483		struct btf_verifier_env *env = NULL;
  4484		struct bpf_verifier_log *log;
  4485		struct btf *btf = NULL, *base_btf;
  4486		int err;
  4487	
  4488		base_btf = bpf_get_btf_vmlinux();
  4489		if (IS_ERR(base_btf))
  4490			return base_btf;
  4491		if (!base_btf)
  4492			return ERR_PTR(-EINVAL);
  4493	
  4494		env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN);
  4495		if (!env)
  4496			return ERR_PTR(-ENOMEM);
  4497	
  4498		log = &env->log;
  4499		log->level = BPF_LOG_KERNEL;
  4500	
  4501		btf = kzalloc(sizeof(*btf), GFP_KERNEL | __GFP_NOWARN);
  4502		if (!btf) {
  4503			err = -ENOMEM;
  4504			goto errout;
  4505		}
  4506		env->btf = btf;
  4507	
  4508		btf->base_btf = base_btf;
  4509		btf->start_id = base_btf->nr_types;
  4510		btf->start_str_off = base_btf->hdr.str_len;
  4511		btf->kernel_btf = true;
  4512		snprintf(btf->name, sizeof(btf->name), "%s", module_name);
  4513	
  4514		btf->data = kvmalloc(data_size, GFP_KERNEL | __GFP_NOWARN);
  4515		if (!btf->data) {
  4516			err = -ENOMEM;
  4517			goto errout;
  4518		}
  4519		memcpy(btf->data, data, data_size);
  4520		btf->data_size = data_size;
  4521	
  4522		err = btf_parse_hdr(env);
  4523		if (err)
  4524			goto errout;
  4525	
  4526		btf->nohdr_data = btf->data + btf->hdr.hdr_len;
  4527	
  4528		err = btf_parse_str_sec(env);
  4529		if (err)
  4530			goto errout;
  4531	
  4532		err = btf_check_all_metas(env);
  4533		if (err)
  4534			goto errout;
  4535	
  4536		btf_verifier_env_free(env);
  4537		refcount_set(&btf->refcnt, 1);
  4538		return btf;
  4539	
  4540	errout:
  4541		btf_verifier_env_free(env);
  4542		if (btf) {
  4543			kvfree(btf->data);
  4544			kvfree(btf->types);
  4545			kfree(btf);
  4546		}
  4547		return ERR_PTR(err);
  4548	}
  4549	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30898 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 5/5] tools/bpftool: add support for in-kernel and named BTF in `btf show`
  2020-11-11  4:19     ` Andrii Nakryiko
@ 2020-11-11  7:44       ` Song Liu
  0 siblings, 0 replies; 20+ messages in thread
From: Song Liu @ 2020-11-11  7:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, daniel,
	Kernel Team, linux-kernel, rafael, jeyu,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman, Alan Maguire



> On Nov 10, 2020, at 8:19 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Tue, Nov 10, 2020 at 5:15 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
>> 
>> [...]
>> 
>>> ...
>>> 
>>> Tested-by: Alan Maguire <alan.maguire@oracle.com>
>>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>> 
>> Acked-by: Song Liu <songliubraving@fb.com>
>> 
>> With one nit:
>> 
>>> ---
>>> tools/bpf/bpftool/btf.c | 28 +++++++++++++++++++++++++++-
>>> 1 file changed, 27 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
>>> index c96b56e8e3a4..ed5e97157241 100644
>>> --- a/tools/bpf/bpftool/btf.c
>>> +++ b/tools/bpf/bpftool/btf.c
>>> @@ -742,9 +742,14 @@ show_btf_plain(struct bpf_btf_info *info, int fd,
>>>             struct btf_attach_table *btf_map_table)
>>> {
>>>      struct btf_attach_point *obj;
>>> +     const char *name = u64_to_ptr(info->name);
>>>      int n;
>>> 
>>>      printf("%u: ", info->id);
>>> +     if (info->kernel_btf)
>>> +             printf("name [%s]  ", name);
>>> +     else if (name && name[0])
>>> +             printf("name %s  ", name);
>> 
>> Maybe explicitly say "name <anonymous>" for btf without a name? I think
>> it will benefit plain output.
> 
> This patch set is already landed. But I can do a follow-up patch to add this.

I realized this was applied soon after sending this. Yeah, a follow-up 
patch would be great. 

Thanks,
Song 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs
       [not found] ` <20201110011932.3201430-5-andrii@kernel.org>
  2020-11-11  7:11   ` [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs kernel test robot
@ 2020-11-11 10:13   ` Jessica Yu
  2020-11-11 20:11     ` Andrii Nakryiko
  1 sibling, 1 reply; 20+ messages in thread
From: Jessica Yu @ 2020-11-11 10:13 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, netdev, ast, daniel, kernel-team, linux-kernel, rafael,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman

+++ Andrii Nakryiko [09/11/20 17:19 -0800]:
[snipped]
>diff --git a/kernel/module.c b/kernel/module.c
>index a4fa44a652a7..f2996b02ab2e 100644
>--- a/kernel/module.c
>+++ b/kernel/module.c
>@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info,
> 	return (void *)info->sechdrs[sec].sh_addr;
> }
>
>+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */
>+static unsigned int find_any_sec(const struct load_info *info, const char *name)
>+{
>+	unsigned int i;
>+
>+	for (i = 1; i < info->hdr->e_shnum; i++) {
>+		Elf_Shdr *shdr = &info->sechdrs[i];
>+		if (strcmp(info->secstrings + shdr->sh_name, name) == 0)
>+			return i;
>+	}
>+	return 0;
>+}
>+
>+/*
>+ * Find a module section, or NULL. Fill in number of "objects" in section.
>+ * Ignores SHF_ALLOC flag.
>+ */
>+static __maybe_unused void *any_section_objs(const struct load_info *info,
>+					     const char *name,
>+					     size_t object_size,
>+					     unsigned int *num)
>+{
>+	unsigned int sec = find_any_sec(info, name);
>+
>+	/* Section 0 has sh_addr 0 and sh_size 0. */
>+	*num = info->sechdrs[sec].sh_size / object_size;
>+	return (void *)info->sechdrs[sec].sh_addr;
>+}
>+

Hm, I see this patchset has already been applied to bpf-next, but I
guess that doesn't preclude any follow-up patches :-)

I am not a huge fan of the code duplication here, and also the fact
that they're only called in one place. any_section_objs() and
find_any_sec() are pretty much identical to section_objs() and
find_sec(), other than the fact the former drops the SHF_ALLOC check.

Moreover, since it appears that the ".BTF" section is not marked
SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer
after the module is done loading and the module's load_info has been
deallocated, since SHF_ALLOC sections are not allocated nor copied to
the module's final location in memory.

Why not simply mark the ".BTF" section in the module SHF_ALLOC? We
already do some sh_flags rewriting in rewrite_section_headers(). Then
the module loader knows to keep the section in memory and you can use
section_objs(). And since the .BTF section stays in module memory,
that might save you the memcpy() to btf->data in btf_parse_module()
(unless that is still needed for some reason).

Thanks,

Jessica

> /* Provided by the linker */
> extern const struct kernel_symbol __start___ksymtab[];
> extern const struct kernel_symbol __stop___ksymtab[];
>@@ -3250,6 +3279,9 @@ static int find_module_sections(struct module *mod, struct load_info *info)
> 					   sizeof(*mod->bpf_raw_events),
> 					   &mod->num_bpf_raw_events);
> #endif
>+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
>+	mod->btf_data = any_section_objs(info, ".BTF", 1, &mod->btf_data_size);
>+#endif
> #ifdef CONFIG_JUMP_LABEL
> 	mod->jump_entries = section_objs(info, "__jump_table",
> 					sizeof(*mod->jump_entries),
>-- 
>2.24.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs
  2020-11-11 10:13   ` Jessica Yu
@ 2020-11-11 20:11     ` Andrii Nakryiko
  2020-11-13 10:31       ` Jessica Yu
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2020-11-11 20:11 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list, Rafael J. Wysocki,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman

On Wed, Nov 11, 2020 at 2:13 AM Jessica Yu <jeyu@kernel.org> wrote:
>
> +++ Andrii Nakryiko [09/11/20 17:19 -0800]:
> [snipped]
> >diff --git a/kernel/module.c b/kernel/module.c
> >index a4fa44a652a7..f2996b02ab2e 100644
> >--- a/kernel/module.c
> >+++ b/kernel/module.c
> >@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info,
> >       return (void *)info->sechdrs[sec].sh_addr;
> > }
> >
> >+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */
> >+static unsigned int find_any_sec(const struct load_info *info, const char *name)
> >+{
> >+      unsigned int i;
> >+
> >+      for (i = 1; i < info->hdr->e_shnum; i++) {
> >+              Elf_Shdr *shdr = &info->sechdrs[i];
> >+              if (strcmp(info->secstrings + shdr->sh_name, name) == 0)
> >+                      return i;
> >+      }
> >+      return 0;
> >+}
> >+
> >+/*
> >+ * Find a module section, or NULL. Fill in number of "objects" in section.
> >+ * Ignores SHF_ALLOC flag.
> >+ */
> >+static __maybe_unused void *any_section_objs(const struct load_info *info,
> >+                                           const char *name,
> >+                                           size_t object_size,
> >+                                           unsigned int *num)
> >+{
> >+      unsigned int sec = find_any_sec(info, name);
> >+
> >+      /* Section 0 has sh_addr 0 and sh_size 0. */
> >+      *num = info->sechdrs[sec].sh_size / object_size;
> >+      return (void *)info->sechdrs[sec].sh_addr;
> >+}
> >+
>
> Hm, I see this patchset has already been applied to bpf-next, but I
> guess that doesn't preclude any follow-up patches :-)

Of course!

>
> I am not a huge fan of the code duplication here, and also the fact
> that they're only called in one place. any_section_objs() and
> find_any_sec() are pretty much identical to section_objs() and
> find_sec(), other than the fact the former drops the SHF_ALLOC check.

Right, but the alternative was to add a new flag to existing
section_objs() and find_sec() functions, which would cause much more
code churn for no good reason (besides saving some trivial code
duplication). And those true/false flags are harder to read in code
anyways.

>
> Moreover, since it appears that the ".BTF" section is not marked
> SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer
> after the module is done loading and the module's load_info has been
> deallocated, since SHF_ALLOC sections are not allocated nor copied to
> the module's final location in memory.

I can make sure that we also reset the btf_data pointer back to NULL,
if that's a big concern.

>
> Why not simply mark the ".BTF" section in the module SHF_ALLOC? We
> already do some sh_flags rewriting in rewrite_section_headers(). Then
> the module loader knows to keep the section in memory and you can use
> section_objs(). And since the .BTF section stays in module memory,
> that might save you the memcpy() to btf->data in btf_parse_module()
> (unless that is still needed for some reason).

Wasn't aware about rewrite_section_headers() manipulations. Are you
suggesting to just add SHF_ALLOC there for the .BTF section from the
kernel side? I guess that would work, but won't avoid memory copy (so
actually would waste kernel memory, if I understand correctly). The
reason being that the module's BTF is registered as an independently
ref-counted BTF object, which could be held past the kernel module
being unloaded. So I can't directly reference module's .BTF data
anyways.

Also, marking .BTF with SHF_ALLOC with pahole or objcopy tool actually
might generate warnings because SHF_ALLOC sections need to be
allocated to data segments, which neither of those tools know how to
do, it requires a linker support. We do that for vmlinux with extra
linker script logic, but for kernel modules we don't have and probably
don't want to do that.

So in the end, the cleanest approach still seems like not doing
SHF_ALLOC but allowing "capturing" .BTF data with an extra helper.

>
> Thanks,
>
> Jessica
>
> > /* Provided by the linker */
> > extern const struct kernel_symbol __start___ksymtab[];
> > extern const struct kernel_symbol __stop___ksymtab[];
> >@@ -3250,6 +3279,9 @@ static int find_module_sections(struct module *mod, struct load_info *info)
> >                                          sizeof(*mod->bpf_raw_events),
> >                                          &mod->num_bpf_raw_events);
> > #endif
> >+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
> >+      mod->btf_data = any_section_objs(info, ".BTF", 1, &mod->btf_data_size);
> >+#endif
> > #ifdef CONFIG_JUMP_LABEL
> >       mod->jump_entries = section_objs(info, "__jump_table",
> >                                       sizeof(*mod->jump_entries),
> >--
> >2.24.1
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs
  2020-11-11 20:11     ` Andrii Nakryiko
@ 2020-11-13 10:31       ` Jessica Yu
  2020-11-13 18:54         ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Jessica Yu @ 2020-11-13 10:31 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list, Rafael J. Wysocki,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman

+++ Andrii Nakryiko [11/11/20 12:11 -0800]:
>On Wed, Nov 11, 2020 at 2:13 AM Jessica Yu <jeyu@kernel.org> wrote:
>>
>> +++ Andrii Nakryiko [09/11/20 17:19 -0800]:
>> [snipped]
>> >diff --git a/kernel/module.c b/kernel/module.c
>> >index a4fa44a652a7..f2996b02ab2e 100644
>> >--- a/kernel/module.c
>> >+++ b/kernel/module.c
>> >@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info,
>> >       return (void *)info->sechdrs[sec].sh_addr;
>> > }
>> >
>> >+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */
>> >+static unsigned int find_any_sec(const struct load_info *info, const char *name)
>> >+{
>> >+      unsigned int i;
>> >+
>> >+      for (i = 1; i < info->hdr->e_shnum; i++) {
>> >+              Elf_Shdr *shdr = &info->sechdrs[i];
>> >+              if (strcmp(info->secstrings + shdr->sh_name, name) == 0)
>> >+                      return i;
>> >+      }
>> >+      return 0;
>> >+}
>> >+
>> >+/*
>> >+ * Find a module section, or NULL. Fill in number of "objects" in section.
>> >+ * Ignores SHF_ALLOC flag.
>> >+ */
>> >+static __maybe_unused void *any_section_objs(const struct load_info *info,
>> >+                                           const char *name,
>> >+                                           size_t object_size,
>> >+                                           unsigned int *num)
>> >+{
>> >+      unsigned int sec = find_any_sec(info, name);
>> >+
>> >+      /* Section 0 has sh_addr 0 and sh_size 0. */
>> >+      *num = info->sechdrs[sec].sh_size / object_size;
>> >+      return (void *)info->sechdrs[sec].sh_addr;
>> >+}
>> >+
>>
>> Hm, I see this patchset has already been applied to bpf-next, but I
>> guess that doesn't preclude any follow-up patches :-)
>
>Of course!
>
>>
>> I am not a huge fan of the code duplication here, and also the fact
>> that they're only called in one place. any_section_objs() and
>> find_any_sec() are pretty much identical to section_objs() and
>> find_sec(), other than the fact the former drops the SHF_ALLOC check.
>
>Right, but the alternative was to add a new flag to existing
>section_objs() and find_sec() functions, which would cause much more
>code churn for no good reason (besides saving some trivial code
>duplication). And those true/false flags are harder to read in code
>anyways.

That's true, all fair points. I thought there was the possibility to
avoid the code duplication if .BTF were also set to SHF_ALLOC, but I
see for reasons you explained below it is more trouble than it's worth.

>>
>> Moreover, since it appears that the ".BTF" section is not marked
>> SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer
>> after the module is done loading and the module's load_info has been
>> deallocated, since SHF_ALLOC sections are not allocated nor copied to
>> the module's final location in memory.
>
>I can make sure that we also reset the btf_data pointer back to NULL,
>if that's a big concern.

It's not a terribly huge concern, since mod->btf_data is only accessed
in the btf coming notifier at the moment, but it's probably best to at
least not advertise it as a valid pointer anymore after the module is
done loading. We do some pointer and section size cleanup at the end
of do_init_module() for sections that are deallocated at the end of
module load (starting where init_layout.base is reset to NULL),
we could just tack on mod->btf_data = NULL there as well.

>>
>> Why not simply mark the ".BTF" section in the module SHF_ALLOC? We
>> already do some sh_flags rewriting in rewrite_section_headers(). Then
>> the module loader knows to keep the section in memory and you can use
>> section_objs(). And since the .BTF section stays in module memory,
>> that might save you the memcpy() to btf->data in btf_parse_module()
>> (unless that is still needed for some reason).
>
>Wasn't aware about rewrite_section_headers() manipulations. Are you
>suggesting to just add SHF_ALLOC there for the .BTF section from the
>kernel side? I guess that would work, but won't avoid memory copy (so
>actually would waste kernel memory, if I understand correctly). The
>reason being that the module's BTF is registered as an independently
>ref-counted BTF object, which could be held past the kernel module
>being unloaded. So I can't directly reference module's .BTF data
>anyways.

Ah OK, I was not aware that the section could be held past the module
being unloaded. Then yeah, it would be a memory waste to keep them in
memory if they are being memcpy'd anyway. Thanks for clarifying!

Jessica

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs
  2020-11-13 10:31       ` Jessica Yu
@ 2020-11-13 18:54         ` Andrii Nakryiko
  0 siblings, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2020-11-13 18:54 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, open list, Rafael J. Wysocki,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman

On Fri, Nov 13, 2020 at 2:32 AM Jessica Yu <jeyu@kernel.org> wrote:
>
> +++ Andrii Nakryiko [11/11/20 12:11 -0800]:
> >On Wed, Nov 11, 2020 at 2:13 AM Jessica Yu <jeyu@kernel.org> wrote:
> >>
> >> +++ Andrii Nakryiko [09/11/20 17:19 -0800]:
> >> [snipped]
> >> >diff --git a/kernel/module.c b/kernel/module.c
> >> >index a4fa44a652a7..f2996b02ab2e 100644
> >> >--- a/kernel/module.c
> >> >+++ b/kernel/module.c
> >> >@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info,
> >> >       return (void *)info->sechdrs[sec].sh_addr;
> >> > }
> >> >
> >> >+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */
> >> >+static unsigned int find_any_sec(const struct load_info *info, const char *name)
> >> >+{
> >> >+      unsigned int i;
> >> >+
> >> >+      for (i = 1; i < info->hdr->e_shnum; i++) {
> >> >+              Elf_Shdr *shdr = &info->sechdrs[i];
> >> >+              if (strcmp(info->secstrings + shdr->sh_name, name) == 0)
> >> >+                      return i;
> >> >+      }
> >> >+      return 0;
> >> >+}
> >> >+
> >> >+/*
> >> >+ * Find a module section, or NULL. Fill in number of "objects" in section.
> >> >+ * Ignores SHF_ALLOC flag.
> >> >+ */
> >> >+static __maybe_unused void *any_section_objs(const struct load_info *info,
> >> >+                                           const char *name,
> >> >+                                           size_t object_size,
> >> >+                                           unsigned int *num)
> >> >+{
> >> >+      unsigned int sec = find_any_sec(info, name);
> >> >+
> >> >+      /* Section 0 has sh_addr 0 and sh_size 0. */
> >> >+      *num = info->sechdrs[sec].sh_size / object_size;
> >> >+      return (void *)info->sechdrs[sec].sh_addr;
> >> >+}
> >> >+
> >>
> >> Hm, I see this patchset has already been applied to bpf-next, but I
> >> guess that doesn't preclude any follow-up patches :-)
> >
> >Of course!
> >
> >>
> >> I am not a huge fan of the code duplication here, and also the fact
> >> that they're only called in one place. any_section_objs() and
> >> find_any_sec() are pretty much identical to section_objs() and
> >> find_sec(), other than the fact the former drops the SHF_ALLOC check.
> >
> >Right, but the alternative was to add a new flag to existing
> >section_objs() and find_sec() functions, which would cause much more
> >code churn for no good reason (besides saving some trivial code
> >duplication). And those true/false flags are harder to read in code
> >anyways.
>
> That's true, all fair points. I thought there was the possibility to
> avoid the code duplication if .BTF were also set to SHF_ALLOC, but I
> see for reasons you explained below it is more trouble than it's worth.
>
> >>
> >> Moreover, since it appears that the ".BTF" section is not marked
> >> SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer
> >> after the module is done loading and the module's load_info has been
> >> deallocated, since SHF_ALLOC sections are not allocated nor copied to
> >> the module's final location in memory.
> >
> >I can make sure that we also reset the btf_data pointer back to NULL,
> >if that's a big concern.
>
> It's not a terribly huge concern, since mod->btf_data is only accessed
> in the btf coming notifier at the moment, but it's probably best to at
> least not advertise it as a valid pointer anymore after the module is
> done loading. We do some pointer and section size cleanup at the end
> of do_init_module() for sections that are deallocated at the end of
> module load (starting where init_layout.base is reset to NULL),
> we could just tack on mod->btf_data = NULL there as well.

Sounds good, I'll send a follow up patch. Thanks!

>
> >>
> >> Why not simply mark the ".BTF" section in the module SHF_ALLOC? We
> >> already do some sh_flags rewriting in rewrite_section_headers(). Then
> >> the module loader knows to keep the section in memory and you can use
> >> section_objs(). And since the .BTF section stays in module memory,
> >> that might save you the memcpy() to btf->data in btf_parse_module()
> >> (unless that is still needed for some reason).
> >
> >Wasn't aware about rewrite_section_headers() manipulations. Are you
> >suggesting to just add SHF_ALLOC there for the .BTF section from the
> >kernel side? I guess that would work, but won't avoid memory copy (so
> >actually would waste kernel memory, if I understand correctly). The
> >reason being that the module's BTF is registered as an independently
> >ref-counted BTF object, which could be held past the kernel module
> >being unloaded. So I can't directly reference module's .BTF data
> >anyways.
>
> Ah OK, I was not aware that the section could be held past the module
> being unloaded. Then yeah, it would be a memory waste to keep them in
> memory if they are being memcpy'd anyway. Thanks for clarifying!
>
> Jessica

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it
  2020-11-11  1:04   ` [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it Song Liu
@ 2020-11-16 19:55     ` Allan, Bruce W
  2020-11-16 20:34       ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Allan, Bruce W @ 2020-11-16 19:55 UTC (permalink / raw)
  To: Song Liu, Andrii Nakryiko
  Cc: bpf, Networking, Starovoitov, Alexei, Daniel Borkmann,
	Kernel Team, open list, rafael, jeyu, Arnaldo Carvalho de Melo,
	Greg Kroah-Hartman, Masahiro Yamada

> -----Original Message-----
> From: Song Liu <songliubraving@fb.com>
> Sent: Tuesday, November 10, 2020 5:05 PM
> To: Andrii Nakryiko <andrii@kernel.org>
> Cc: bpf <bpf@vger.kernel.org>; Networking <netdev@vger.kernel.org>;
> Starovoitov, Alexei <ast@fb.com>; Daniel Borkmann <daniel@iogearbox.net>;
> Kernel Team <Kernel-team@fb.com>; open list <linux-
> kernel@vger.kernel.org>; rafael@kernel.org; jeyu@kernel.org; Arnaldo
> Carvalho de Melo <acme@redhat.com>; Greg Kroah-Hartman
> <gregkh@linuxfoundation.org>; Masahiro Yamada
> <yamada.masahiro@socionext.com>
> Subject: Re: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is
> enabled and pahole supports it
> 
> 
> 
> > On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> [...]
> 
> > SPLIT BTF
> > =========
> >
> > $ for f in $(find . -name '*.ko'); do size -A -d $f | grep BTF | awk '{print $2}';
> done | awk '{ s += $1 } END { print s }'
> > 5194047
> >
> > $ for f in $(find . -name '*.ko'); do printf "%s %d\n" $f $(size -A -d $f | grep
> BTF | awk '{print $2}'); done | sort -nr -k2 | head -n10
> > ./drivers/gpu/drm/i915/i915.ko 293206
> > ./drivers/gpu/drm/radeon/radeon.ko 282103
> > ./fs/xfs/xfs.ko 222150
> > ./drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko 198503
> > ./drivers/infiniband/hw/mlx5/mlx5_ib.ko 198356
> > ./drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko 113444
> > ./fs/cifs/cifs.ko 109379
> > ./arch/x86/kvm/kvm.ko 100225
> > ./drivers/gpu/drm/drm.ko 94827
> > ./drivers/infiniband/core/ib_core.ko 91188
> >
> > Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> 
> Acked-by: Song Liu <songliubraving@fb.com>

This change, commit 5f9ae91f7c0d ("kbuild: Build kernel module BTFs if BTF is enabled and pahole
supports it") currently in net-next, linux-next, etc. breaks the use-case of compiling only a specific
kernel module (both in-tree and out-of-tree, e.g. 'make M=drivers/net/ethernet/intel/ice') after
first doing a 'make modules_prepare'.  Previously, that use-case would result in a warning noting
"Symbol info of vmlinux is missing. Unresolved symbol check will be entirely skipped" but now it
errors out after noting "No rule to make target 'vmlinux', needed by '<...>.ko'.  Stop."

Is that intentional?

Thanks,
Bruce.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it
  2020-11-16 19:55     ` Allan, Bruce W
@ 2020-11-16 20:34       ` Andrii Nakryiko
  2020-11-16 21:24         ` Jakub Kicinski
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2020-11-16 20:34 UTC (permalink / raw)
  To: Allan, Bruce W
  Cc: Song Liu, Andrii Nakryiko, bpf, Networking, Starovoitov, Alexei,
	Daniel Borkmann, Kernel Team, open list, rafael, jeyu,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman, Masahiro Yamada

On Mon, Nov 16, 2020 at 11:55 AM Allan, Bruce W <bruce.w.allan@intel.com> wrote:
>
> > -----Original Message-----
> > From: Song Liu <songliubraving@fb.com>
> > Sent: Tuesday, November 10, 2020 5:05 PM
> > To: Andrii Nakryiko <andrii@kernel.org>
> > Cc: bpf <bpf@vger.kernel.org>; Networking <netdev@vger.kernel.org>;
> > Starovoitov, Alexei <ast@fb.com>; Daniel Borkmann <daniel@iogearbox.net>;
> > Kernel Team <Kernel-team@fb.com>; open list <linux-
> > kernel@vger.kernel.org>; rafael@kernel.org; jeyu@kernel.org; Arnaldo
> > Carvalho de Melo <acme@redhat.com>; Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org>; Masahiro Yamada
> > <yamada.masahiro@socionext.com>
> > Subject: Re: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is
> > enabled and pahole supports it
> >
> >
> >
> > > On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > [...]
> >
> > > SPLIT BTF
> > > =========
> > >
> > > $ for f in $(find . -name '*.ko'); do size -A -d $f | grep BTF | awk '{print $2}';
> > done | awk '{ s += $1 } END { print s }'
> > > 5194047
> > >
> > > $ for f in $(find . -name '*.ko'); do printf "%s %d\n" $f $(size -A -d $f | grep
> > BTF | awk '{print $2}'); done | sort -nr -k2 | head -n10
> > > ./drivers/gpu/drm/i915/i915.ko 293206
> > > ./drivers/gpu/drm/radeon/radeon.ko 282103
> > > ./fs/xfs/xfs.ko 222150
> > > ./drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko 198503
> > > ./drivers/infiniband/hw/mlx5/mlx5_ib.ko 198356
> > > ./drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko 113444
> > > ./fs/cifs/cifs.ko 109379
> > > ./arch/x86/kvm/kvm.ko 100225
> > > ./drivers/gpu/drm/drm.ko 94827
> > > ./drivers/infiniband/core/ib_core.ko 91188
> > >
> > > Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> >
> > Acked-by: Song Liu <songliubraving@fb.com>
>
> This change, commit 5f9ae91f7c0d ("kbuild: Build kernel module BTFs if BTF is enabled and pahole
> supports it") currently in net-next, linux-next, etc. breaks the use-case of compiling only a specific
> kernel module (both in-tree and out-of-tree, e.g. 'make M=drivers/net/ethernet/intel/ice') after
> first doing a 'make modules_prepare'.  Previously, that use-case would result in a warning noting
> "Symbol info of vmlinux is missing. Unresolved symbol check will be entirely skipped" but now it
> errors out after noting "No rule to make target 'vmlinux', needed by '<...>.ko'.  Stop."
>
> Is that intentional?

I wasn't aware of such a use pattern, so definitely not intentional.
But vmlinux is absolutely necessary to generate the module BTF. So I'm
wondering what's the proper fix here? Leave it as is (that error
message is actually surprisingly descriptive, btw)? Force vmlinux
build? Or skip BTF generation for that module?

>
> Thanks,
> Bruce.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it
  2020-11-16 20:34       ` Andrii Nakryiko
@ 2020-11-16 21:24         ` Jakub Kicinski
  2020-11-16 22:35           ` Andrii Nakryiko
  2021-11-26 15:16           ` Fabian Grünbichler
  0 siblings, 2 replies; 20+ messages in thread
From: Jakub Kicinski @ 2020-11-16 21:24 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Allan, Bruce W, Song Liu, Andrii Nakryiko, bpf, Networking,
	Starovoitov, Alexei, Daniel Borkmann, Kernel Team, open list,
	rafael, jeyu, Arnaldo Carvalho de Melo, Greg Kroah-Hartman,
	Masahiro Yamada

On Mon, 16 Nov 2020 12:34:17 -0800 Andrii Nakryiko wrote:
> > This change, commit 5f9ae91f7c0d ("kbuild: Build kernel module BTFs if BTF is enabled and pahole
> > supports it") currently in net-next, linux-next, etc. breaks the use-case of compiling only a specific
> > kernel module (both in-tree and out-of-tree, e.g. 'make M=drivers/net/ethernet/intel/ice') after
> > first doing a 'make modules_prepare'.  Previously, that use-case would result in a warning noting
> > "Symbol info of vmlinux is missing. Unresolved symbol check will be entirely skipped" but now it
> > errors out after noting "No rule to make target 'vmlinux', needed by '<...>.ko'.  Stop."
> >
> > Is that intentional?  
> 
> I wasn't aware of such a use pattern, so definitely not intentional.
> But vmlinux is absolutely necessary to generate the module BTF. So I'm
> wondering what's the proper fix here? Leave it as is (that error
> message is actually surprisingly descriptive, btw)? Force vmlinux
> build? Or skip BTF generation for that module?

For an external out-of-tree module there is no guarantee that vmlinux
will even be on the system, no? So only the last option can work in
that case.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it
  2020-11-16 21:24         ` Jakub Kicinski
@ 2020-11-16 22:35           ` Andrii Nakryiko
  2021-11-26 15:16           ` Fabian Grünbichler
  1 sibling, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2020-11-16 22:35 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Allan, Bruce W, Song Liu, Andrii Nakryiko, bpf, Networking,
	Starovoitov, Alexei, Daniel Borkmann, Kernel Team, open list,
	rafael, jeyu, Arnaldo Carvalho de Melo, Greg Kroah-Hartman,
	Masahiro Yamada

On Mon, Nov 16, 2020 at 1:24 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 16 Nov 2020 12:34:17 -0800 Andrii Nakryiko wrote:
> > > This change, commit 5f9ae91f7c0d ("kbuild: Build kernel module BTFs if BTF is enabled and pahole
> > > supports it") currently in net-next, linux-next, etc. breaks the use-case of compiling only a specific
> > > kernel module (both in-tree and out-of-tree, e.g. 'make M=drivers/net/ethernet/intel/ice') after
> > > first doing a 'make modules_prepare'.  Previously, that use-case would result in a warning noting
> > > "Symbol info of vmlinux is missing. Unresolved symbol check will be entirely skipped" but now it
> > > errors out after noting "No rule to make target 'vmlinux', needed by '<...>.ko'.  Stop."
> > >
> > > Is that intentional?
> >
> > I wasn't aware of such a use pattern, so definitely not intentional.
> > But vmlinux is absolutely necessary to generate the module BTF. So I'm
> > wondering what's the proper fix here? Leave it as is (that error
> > message is actually surprisingly descriptive, btw)? Force vmlinux
> > build? Or skip BTF generation for that module?
>
> For an external out-of-tree module there is no guarantee that vmlinux
> will even be on the system, no? So only the last option can work in
> that case.


Ok, how about something like the patch below. With that I seem to be
getting the desired behavior:

$ make clean
$ touch drivers/acpi/button.c
$ make M=drivers/acpi
make[1]: Entering directory `/data/users/andriin/linux-build/default-x86_64'
  CC [M]  drivers/acpi/button.o
  MODPOST drivers/acpi/Module.symvers
  LD [M]  drivers/acpi/button.ko
  BTF [M] drivers/acpi/button.ko
Skipping BTF generation for drivers/acpi/button.ko due to
unavailability of vmlinux
make[1]: Leaving directory `/data/users/andriin/linux-build/default-x86_64'
$ readelf -S ~/linux-build/default/drivers/acpi/button.ko | grep BTF -A1
... empty ...

Now with normal build:

$ make all
...
LD [M]  drivers/acpi/button.ko
BTF [M] drivers/acpi/button.ko
...
$ readelf -S ~/linux-build/default/drivers/acpi/button.ko | grep BTF -A1
  [60] .BTF              PROGBITS         0000000000000000  00029310
       000000000000ab3f  0000000000000000           0     0     1



diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 02b892421f7a..d49ec001825d 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -38,7 +38,12 @@ quiet_cmd_ld_ko_o = LD [M]  $@
     $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)

 quiet_cmd_btf_ko = BTF [M] $@
-      cmd_btf_ko = LLVM_OBJCOPY=$(OBJCOPY) $(PAHOLE) -J --btf_base vmlinux $@
+      cmd_btf_ko =                             \
+    if [ -f vmlinux ]; then                        \
+        LLVM_OBJCOPY=$(OBJCOPY) $(PAHOLE) -J --btf_base vmlinux $@; \
+    else                                \
+        printf "Skipping BTF generation for %s due to unavailability
of vmlinux\n" $@ 1>&2; \
+    fi;

 # Same as newer-prereqs, but allows to exclude specified extra dependencies
 newer_prereqs_except = $(filter-out $(PHONY) $(1),$?)
@@ -49,7 +54,7 @@ if_changed_except = $(if $(call
newer_prereqs_except,$(2))$(cmd-check),      \
     printf '%s\n' 'cmd_$@ := $(make-cmd)' > $(dot-target).cmd, @:)

 # Re-generate module BTFs if either module's .ko or vmlinux changed
-$(modules): %.ko: %.o %.mod.o scripts/module.lds vmlinux FORCE
+$(modules): %.ko: %.o %.mod.o scripts/module.lds $(if
$(KBUILD_BUILTIN),vmlinux) FORCE
     +$(call if_changed_except,ld_ko_o,vmlinux)
 ifdef CONFIG_DEBUG_INFO_BTF_MODULES
     +$(if $(newer-prereqs),$(call cmd,btf_ko))

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it
       [not found] ` <20201110011932.3201430-4-andrii@kernel.org>
  2020-11-11  1:04   ` [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it Song Liu
@ 2020-11-17  3:44   ` David Ahern
  1 sibling, 0 replies; 20+ messages in thread
From: David Ahern @ 2020-11-17  3:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: kernel-team, linux-kernel, rafael, jeyu,
	Arnaldo Carvalho de Melo, Greg Kroah-Hartman, Masahiro Yamada,
	bpf, netdev, ast, daniel

On 11/9/20 6:19 PM, Andrii Nakryiko wrote:
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index d7a7bc3b6098..1e78faaf20a5 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -274,6 +274,15 @@ config DEBUG_INFO_BTF
>  	  Turning this on expects presence of pahole tool, which will convert
>  	  DWARF type info into equivalent deduplicated BTF type info.
>  
> +config PAHOLE_HAS_SPLIT_BTF
> +	def_bool $(success, test `$(PAHOLE) --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/'` -ge "119")
> +
> +config DEBUG_INFO_BTF_MODULES
> +	def_bool y
> +	depends on DEBUG_INFO_BTF && MODULES && PAHOLE_HAS_SPLIT_BTF
> +	help
> +	  Generate compact split BTF type information for kernel modules.
> +
>  config GDB_SCRIPTS
>  	bool "Provide GDB scripts for kernel debugging"
>  	help

Thank you for adding a config option for this feature vs bumping the
required pahole version in scripts/link-vmlinux.sh. This is a much more
friendly way of handling kernel features that require support from build
tools.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it
  2020-11-16 21:24         ` Jakub Kicinski
  2020-11-16 22:35           ` Andrii Nakryiko
@ 2021-11-26 15:16           ` Fabian Grünbichler
  2021-12-08  0:08             ` Andrii Nakryiko
  1 sibling, 1 reply; 20+ messages in thread
From: Fabian Grünbichler @ 2021-11-26 15:16 UTC (permalink / raw)
  To: Andrii Nakryiko, Jakub Kicinski
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Starovoitov, Alexei,
	bpf, Allan, Bruce W, Daniel Borkmann, Greg Kroah-Hartman, jeyu,
	Kernel Team, open list, Networking, rafael, Song Liu,
	Masahiro Yamada

On November 16, 2020 10:24 pm, Jakub Kicinski wrote:
> On Mon, 16 Nov 2020 12:34:17 -0800 Andrii Nakryiko wrote:
>> > This change, commit 5f9ae91f7c0d ("kbuild: Build kernel module BTFs if BTF is enabled and pahole
>> > supports it") currently in net-next, linux-next, etc. breaks the use-case of compiling only a specific
>> > kernel module (both in-tree and out-of-tree, e.g. 'make M=drivers/net/ethernet/intel/ice') after
>> > first doing a 'make modules_prepare'.  Previously, that use-case would result in a warning noting
>> > "Symbol info of vmlinux is missing. Unresolved symbol check will be entirely skipped" but now it
>> > errors out after noting "No rule to make target 'vmlinux', needed by '<...>.ko'.  Stop."
>> >
>> > Is that intentional?  
>> 
>> I wasn't aware of such a use pattern, so definitely not intentional.
>> But vmlinux is absolutely necessary to generate the module BTF. So I'm
>> wondering what's the proper fix here? Leave it as is (that error
>> message is actually surprisingly descriptive, btw)? Force vmlinux
>> build? Or skip BTF generation for that module?
> 
> For an external out-of-tree module there is no guarantee that vmlinux
> will even be on the system, no? So only the last option can work in
> that case.

a year late to the party, but it seems to me that this patch 
series/features also missed another, not yet fixed scenario. I have to 
admit I am not very well-versed in BTF/BPF matters though, so please 
take the analysis below with a grain of salt or two ;)

(am subscribed to LKML/netdev, but not the bpf list, so please keep me 
CCed if discussion moves there! apologies if too many people are CCed 
here, feel free to trim down to relevant people/lists)

many distros do their own tracking of kernel <-> module ABI (usually 
these checks use Module.symvers and some combination of lists/symbols/.. 
to skip/ignore[0]).

depending on detected changes, a kernel update can either
- bump ABI, resulting in a new kernel/modules package that is installed 
  next to the current one
- keep ABI, resulting in an updated kernel/modules package that is 
  installed over/instead of the current one

the former case is obviously not an issue, since the modules and vmlinux 
image shipped in that (set of) package(s) match. but in the later case 
of updated, compatible kernel image + modules with unchanged ABI
(which is important, as it allows shipping fixed modules that are 
loadable for a compatible, older, booted kernel image), the following is 
possible:
- install kernel+modules with ABI 1
- boot kernel with ABI 1
- install ABI-compatible upgrade (e.g. a security fix)
- load module
- BTF validation fails, because the base_btf (loaded at boot time) and 
  the offsets in the module's .BTF section (loaded at module load time) 
  aren't matching

of course the validation might also not fail but the parsed BTF info 
might be bogus, or the base_btf might be similar enough that validation 
passes and the parsed BTF data is correct.

in our case the symptoms look like this (exact details vary with kernel 
builds/modules, but likely not relevant):

Nov 24 11:39:11 host kernel: BPF:         type_id=7 bits_offset=0
Nov 24 11:39:11 host kernel: BPF:
Nov 24 11:39:11 host kernel: BPF:Invalid name
Nov 24 11:39:11 host kernel: BPF:
Nov 24 11:39:11 host kernel: failed to validate module [overlay] BTF: -22

where the booted kernel and the (attempted to get) loaded module are not 
from the same build, but the Module.symvers is matching and loading 
should thus work. adding some more debug logging reveals that the root 
cause is the module's BTF start_str_off being, well, off, since it's derived 
from vmlinux' BTF/base_btf. if it is too big, the name/type lookups will 
wrongly look in the base_btf, if it's too small, the name/type lookups 
will be offset within the module or wrongly look inside the module when 
they should look inside base_btf/vmlinux. in any case, random garbage is 
the result, usually tripping up some validation check (e.g. the first 
byte not being 0 when checking a name). but even if it's correct (old 
and new vmlinux image have the same nr_types/hdr.str_len), there is no 
guarantuee that the offsets into base_btf are pointing at the right 
stuff.

example with debug logging patched in, note the garbled names, and 
offset slightly below the (wrong) start_str_off:

----8<----

BPF:magic: 0xeb9f

BPF:version: 1

BPF:flags: 0x0

BPF:hdr_len: 24

BPF:type_off: 0

BPF:type_len: 9264

BPF:str_off: 9264

BPF:str_len: 5511

BPF:btf_total_size: 14799

BPF:[106314] STRUCT rimary_device
BPF:size=56 vlen=14
BPF:

BPF:offset at call: 1915394
BPF:offset too small, choosing base_btf: 1915397

BPF:offset after base_btf: 1915394

BPF:     ce type_id=49 bits_offset=0
BPF:

BPF:offset at call: 1915403
BPF:offset after base_btf: 6

BPF:     nfig type_id=49 bits_offset=64
BPF:

BPF:offset at call: 1915412
BPF:offset after base_btf: 15

BPF:     rdir type_id=49 bits_offset=128
BPF:

BPF:offset at call: 768428
BPF:offset too small, choosing base_btf: 1915397

BPF:offset after base_btf: 768428

BPF:     _dio type_id=56 bits_offset=192
BPF:

BPF:offset at call: 1915420
BPF:offset after base_btf: 23

BPF:     erdir type_id=56 bits_offset=200
BPF:

BPF:offset at call: 1915433
BPF:offset after base_btf: 36

BPF:first char wrong - 0

BPF:      type_id=56 bits_offset=208
BPF:
BPF:Invalid name STRUCT MEMBER (name offset 1915433)
BPF:

failed to validate module [overlay] BTF: -22

---->8----

also note how it's only after a few botched entries that a check 
actually trips up - not sure what the impliciations for crafted BTF info 
are, but might be worthy a closer look by someone more knowledgable as 
well..

it seems to me this can be solved on the distro/user side by tracking 
vmlinux BTF infos as part of the ABI tracking (how stable are those 
compared to the existing interfaces making up the kernel <-> module 
ABI/Module.symvers? does this effectively mean bumping ABI for any 
change anyway?) or by disabling CONFIG_DEBUG_INFO_BTF_MODULES.

on the kernel/libbpf side it could maybe be solved by storing a hash of 
the base_btf data used to generate the split BTF-sections inside the 
modules, and skip BTF loading/validating if another base_btf is 
currently loaded (so BTF is best-effort, if the booted kernel and the 
module are matching it works, if not module loading works but no BTF 
support). this might be a good safe-guard for split-BTF in general?

I'd appreciate input on how to proceed (we were recently hit by this in 
a downstream Debian derivative, and will disable BTF info for modules as 
an interim measure).

thanks!

0: e.g., Debian's: https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/bin/abiupdate.py

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it
  2021-11-26 15:16           ` Fabian Grünbichler
@ 2021-12-08  0:08             ` Andrii Nakryiko
  0 siblings, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2021-12-08  0:08 UTC (permalink / raw)
  To: Fabian Grünbichler
  Cc: Jakub Kicinski, Arnaldo Carvalho de Melo, Andrii Nakryiko,
	Starovoitov, Alexei, bpf, Allan, Bruce W, Daniel Borkmann,
	Greg Kroah-Hartman, jeyu, Kernel Team, open list, Networking,
	rafael, Song Liu, Masahiro Yamada

On Fri, Nov 26, 2021 at 7:16 AM Fabian Grünbichler
<f.gruenbichler@proxmox.com> wrote:
>
> On November 16, 2020 10:24 pm, Jakub Kicinski wrote:
> > On Mon, 16 Nov 2020 12:34:17 -0800 Andrii Nakryiko wrote:
> >> > This change, commit 5f9ae91f7c0d ("kbuild: Build kernel module BTFs if BTF is enabled and pahole
> >> > supports it") currently in net-next, linux-next, etc. breaks the use-case of compiling only a specific
> >> > kernel module (both in-tree and out-of-tree, e.g. 'make M=drivers/net/ethernet/intel/ice') after
> >> > first doing a 'make modules_prepare'.  Previously, that use-case would result in a warning noting
> >> > "Symbol info of vmlinux is missing. Unresolved symbol check will be entirely skipped" but now it
> >> > errors out after noting "No rule to make target 'vmlinux', needed by '<...>.ko'.  Stop."
> >> >
> >> > Is that intentional?
> >>
> >> I wasn't aware of such a use pattern, so definitely not intentional.
> >> But vmlinux is absolutely necessary to generate the module BTF. So I'm
> >> wondering what's the proper fix here? Leave it as is (that error
> >> message is actually surprisingly descriptive, btw)? Force vmlinux
> >> build? Or skip BTF generation for that module?
> >
> > For an external out-of-tree module there is no guarantee that vmlinux
> > will even be on the system, no? So only the last option can work in
> > that case.
>
> a year late to the party, but it seems to me that this patch

hi, sorry, you email slipped through the cracks initially, just
getting back to it...

> series/features also missed another, not yet fixed scenario. I have to
> admit I am not very well-versed in BTF/BPF matters though, so please
> take the analysis below with a grain of salt or two ;)
>
> (am subscribed to LKML/netdev, but not the bpf list, so please keep me
> CCed if discussion moves there! apologies if too many people are CCed
> here, feel free to trim down to relevant people/lists)
>
> many distros do their own tracking of kernel <-> module ABI (usually
> these checks use Module.symvers and some combination of lists/symbols/..
> to skip/ignore[0]).
>
> depending on detected changes, a kernel update can either
> - bump ABI, resulting in a new kernel/modules package that is installed
>   next to the current one
> - keep ABI, resulting in an updated kernel/modules package that is
>   installed over/instead of the current one
>
> the former case is obviously not an issue, since the modules and vmlinux
> image shipped in that (set of) package(s) match. but in the later case
> of updated, compatible kernel image + modules with unchanged ABI
> (which is important, as it allows shipping fixed modules that are
> loadable for a compatible, older, booted kernel image), the following is
> possible:
> - install kernel+modules with ABI 1
> - boot kernel with ABI 1
> - install ABI-compatible upgrade (e.g. a security fix)
> - load module
> - BTF validation fails, because the base_btf (loaded at boot time) and
>   the offsets in the module's .BTF section (loaded at module load time)
>   aren't matching
>
> of course the validation might also not fail but the parsed BTF info
> might be bogus, or the base_btf might be similar enough that validation
> passes and the parsed BTF data is correct.
>
> in our case the symptoms look like this (exact details vary with kernel
> builds/modules, but likely not relevant):
>
> Nov 24 11:39:11 host kernel: BPF:         type_id=7 bits_offset=0
> Nov 24 11:39:11 host kernel: BPF:
> Nov 24 11:39:11 host kernel: BPF:Invalid name
> Nov 24 11:39:11 host kernel: BPF:
> Nov 24 11:39:11 host kernel: failed to validate module [overlay] BTF: -22
>
> where the booted kernel and the (attempted to get) loaded module are not
> from the same build, but the Module.symvers is matching and loading
> should thus work. adding some more debug logging reveals that the root
> cause is the module's BTF start_str_off being, well, off, since it's derived
> from vmlinux' BTF/base_btf. if it is too big, the name/type lookups will
> wrongly look in the base_btf, if it's too small, the name/type lookups
> will be offset within the module or wrongly look inside the module when
> they should look inside base_btf/vmlinux. in any case, random garbage is
> the result, usually tripping up some validation check (e.g. the first
> byte not being 0 when checking a name). but even if it's correct (old
> and new vmlinux image have the same nr_types/hdr.str_len), there is no
> guarantuee that the offsets into base_btf are pointing at the right
> stuff.
>
> example with debug logging patched in, note the garbled names, and
> offset slightly below the (wrong) start_str_off:
>
> ----8<----
>
> BPF:magic: 0xeb9f
>
> BPF:version: 1
>
> BPF:flags: 0x0
>
> BPF:hdr_len: 24
>
> BPF:type_off: 0
>
> BPF:type_len: 9264
>
> BPF:str_off: 9264
>
> BPF:str_len: 5511
>
> BPF:btf_total_size: 14799
>
> BPF:[106314] STRUCT rimary_device
> BPF:size=56 vlen=14
> BPF:
>
> BPF:offset at call: 1915394
> BPF:offset too small, choosing base_btf: 1915397
>
> BPF:offset after base_btf: 1915394
>
> BPF:     ce type_id=49 bits_offset=0
> BPF:
>
> BPF:offset at call: 1915403
> BPF:offset after base_btf: 6
>
> BPF:     nfig type_id=49 bits_offset=64
> BPF:
>
> BPF:offset at call: 1915412
> BPF:offset after base_btf: 15
>
> BPF:     rdir type_id=49 bits_offset=128
> BPF:
>
> BPF:offset at call: 768428
> BPF:offset too small, choosing base_btf: 1915397
>
> BPF:offset after base_btf: 768428
>
> BPF:     _dio type_id=56 bits_offset=192
> BPF:
>
> BPF:offset at call: 1915420
> BPF:offset after base_btf: 23
>
> BPF:     erdir type_id=56 bits_offset=200
> BPF:
>
> BPF:offset at call: 1915433
> BPF:offset after base_btf: 36
>
> BPF:first char wrong - 0
>
> BPF:      type_id=56 bits_offset=208
> BPF:
> BPF:Invalid name STRUCT MEMBER (name offset 1915433)
> BPF:
>
> failed to validate module [overlay] BTF: -22
>
> ---->8----
>
> also note how it's only after a few botched entries that a check
> actually trips up - not sure what the impliciations for crafted BTF info
> are, but might be worthy a closer look by someone more knowledgable as
> well..
>
> it seems to me this can be solved on the distro/user side by tracking
> vmlinux BTF infos as part of the ABI tracking (how stable are those
> compared to the existing interfaces making up the kernel <-> module
> ABI/Module.symvers? does this effectively mean bumping ABI for any

Yes, I think so, unfortunately. Any change in order of types emitted
by DWARF or any extra string might cause a big change in string
offsets or type IDs.


> change anyway?) or by disabling CONFIG_DEBUG_INFO_BTF_MODULES.
>
> on the kernel/libbpf side it could maybe be solved by storing a hash of
> the base_btf data used to generate the split BTF-sections inside the
> modules, and skip BTF loading/validating if another base_btf is
> currently loaded (so BTF is best-effort, if the booted kernel and the
> module are matching it works, if not module loading works but no BTF
> support). this might be a good safe-guard for split-BTF in general?
>
> I'd appreciate input on how to proceed (we were recently hit by this in
> a downstream Debian derivative, and will disable BTF info for modules as
> an interim measure).
>

Yeah, I think "BTF is best-effort" is a necessity for the complicated
setups you described above. It probably is good to keep strict
behavior (BTF fails to load -> fail loading the module), but allow you
to tune it with the Kconfig option. Most modules probably would be
just fine without its BTF loaded successfully.

As for the check sums, this would add another layer of protection, so
I think storing base vmlinux BTF checksum in some special ELF section
in module' ELF (e.g. ".BTF.base_checksum" or something along those
lines) is also a good idea and would be easy to implement. Insmod and
related tooling might even provide a validation check that vmlinux BTF
matches module's expectation, if necessary.

> thanks!
>
> 0: e.g., Debian's: https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/bin/abiupdate.py
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-12-08  0:08 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20201110011932.3201430-1-andrii@kernel.org>
     [not found] ` <20201110011932.3201430-2-andrii@kernel.org>
2020-11-10 17:49   ` [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support Song Liu
2020-11-10 18:31     ` Andrii Nakryiko
2020-11-10 20:14       ` Song Liu
     [not found] ` <20201110011932.3201430-3-andrii@kernel.org>
2020-11-10 20:24   ` [PATCH v4 bpf-next 2/5] bpf: assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO Song Liu
     [not found] ` <20201110011932.3201430-6-andrii@kernel.org>
2020-11-11  1:15   ` [PATCH v4 bpf-next 5/5] tools/bpftool: add support for in-kernel and named BTF in `btf show` Song Liu
2020-11-11  4:19     ` Andrii Nakryiko
2020-11-11  7:44       ` Song Liu
     [not found] ` <20201110011932.3201430-5-andrii@kernel.org>
2020-11-11  7:11   ` [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs kernel test robot
2020-11-11 10:13   ` Jessica Yu
2020-11-11 20:11     ` Andrii Nakryiko
2020-11-13 10:31       ` Jessica Yu
2020-11-13 18:54         ` Andrii Nakryiko
     [not found] ` <20201110011932.3201430-4-andrii@kernel.org>
2020-11-11  1:04   ` [PATCH v4 bpf-next 3/5] kbuild: build kernel module BTFs if BTF is enabled and pahole supports it Song Liu
2020-11-16 19:55     ` Allan, Bruce W
2020-11-16 20:34       ` Andrii Nakryiko
2020-11-16 21:24         ` Jakub Kicinski
2020-11-16 22:35           ` Andrii Nakryiko
2021-11-26 15:16           ` Fabian Grünbichler
2021-12-08  0:08             ` Andrii Nakryiko
2020-11-17  3:44   ` David Ahern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).