bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* duplicate BTF_IDs leading to symbol redefinition errors?
@ 2023-09-07 19:01 Nick Desaulniers
  2023-09-07 20:33 ` Jiri Olsa
  0 siblings, 1 reply; 16+ messages in thread
From: Nick Desaulniers @ 2023-09-07 19:01 UTC (permalink / raw)
  To: bpf
  Cc: clang-built-linux, Stanislav Fomichev, Nathan Chancellor, Yonghong Song

So we've got a curious report recently:
https://github.com/ClangBuiltLinux/linux/issues/1913

ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
'__BTF_ID__struct__cgroup__624' is already defined
__BTF_ID__struct__cgroup__624:
^

It's been hard to pin down a SHA and .config to reproduce this, but
looking at the definition of BTF_ID's usage of __ID's usage of
__COUNTER__, and the two statements:

kernel/bpf/helpers.c:2460:BTF_ID(struct, cgroup)
kernel/bpf/verifier.c:5075:BTF_ID(struct, cgroup)

Is it possible that __COUNTER__ could evaluate to the same value
across 2 different translation units, leading to a name collision like
the above?

looking at another usage of BTF_ID other than struct
cgroup;kernel/bpf/helpers.c:2461:BTF_ID(func, bpf_cgroup_release)
is only defined in one translation unit

Should one of those two `BTF_ID(struct, cgroup)` be removed? Is there
some other way we can avoid these collisions in the future?

Was this a previously observed/fixed issue?

-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-07 19:01 duplicate BTF_IDs leading to symbol redefinition errors? Nick Desaulniers
@ 2023-09-07 20:33 ` Jiri Olsa
  2023-09-08 11:47   ` Jiri Olsa
  0 siblings, 1 reply; 16+ messages in thread
From: Jiri Olsa @ 2023-09-07 20:33 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: bpf, clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song

On Thu, Sep 07, 2023 at 12:01:18PM -0700, Nick Desaulniers wrote:
> So we've got a curious report recently:
> https://github.com/ClangBuiltLinux/linux/issues/1913
> 
> ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
> '__BTF_ID__struct__cgroup__624' is already defined
> __BTF_ID__struct__cgroup__624:
> ^
> 
> It's been hard to pin down a SHA and .config to reproduce this, but
> looking at the definition of BTF_ID's usage of __ID's usage of
> __COUNTER__, and the two statements:
> 
> kernel/bpf/helpers.c:2460:BTF_ID(struct, cgroup)
> kernel/bpf/verifier.c:5075:BTF_ID(struct, cgroup)
> 
> Is it possible that __COUNTER__ could evaluate to the same value
> across 2 different translation units, leading to a name collision like
> the above?

hum, that probably the case, I see same counter values at different
__BTF_ID_ symbols:

ffffffff833fe540 r __BTF_ID__struct__bpf_bloom_filter__380
ffffffff833fe548 r __BTF_ID__struct__bpf_queue_stack__380
ffffffff833fe578 r __BTF_ID__struct__cgroup__380

perhaps we were just lucky not to hit that :-\

> 
> looking at another usage of BTF_ID other than struct
> cgroup;kernel/bpf/helpers.c:2461:BTF_ID(func, bpf_cgroup_release)
> is only defined in one translation unit
> 
> Should one of those two `BTF_ID(struct, cgroup)` be removed? Is there
> some other way we can avoid these collisions in the future?

need to find some way to make the symbol unique, will check

> 
> Was this a previously observed/fixed issue?

first time I see that

thanks,
jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-07 20:33 ` Jiri Olsa
@ 2023-09-08 11:47   ` Jiri Olsa
  2023-09-08 17:14     ` Nick Desaulniers
  0 siblings, 1 reply; 16+ messages in thread
From: Jiri Olsa @ 2023-09-08 11:47 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Nick Desaulniers, bpf, clang-built-linux, Stanislav Fomichev,
	Nathan Chancellor, Yonghong Song, Song Liu

On Thu, Sep 07, 2023 at 10:33:00PM +0200, Jiri Olsa wrote:
> On Thu, Sep 07, 2023 at 12:01:18PM -0700, Nick Desaulniers wrote:
> > So we've got a curious report recently:
> > https://github.com/ClangBuiltLinux/linux/issues/1913
> > 
> > ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
> > '__BTF_ID__struct__cgroup__624' is already defined
> > __BTF_ID__struct__cgroup__624:
> > ^
> > 
> > It's been hard to pin down a SHA and .config to reproduce this, but
> > looking at the definition of BTF_ID's usage of __ID's usage of
> > __COUNTER__, and the two statements:
> > 
> > kernel/bpf/helpers.c:2460:BTF_ID(struct, cgroup)
> > kernel/bpf/verifier.c:5075:BTF_ID(struct, cgroup)
> > 
> > Is it possible that __COUNTER__ could evaluate to the same value
> > across 2 different translation units, leading to a name collision like
> > the above?
> 
> hum, that probably the case, I see same counter values at different
> __BTF_ID_ symbols:
> 
> ffffffff833fe540 r __BTF_ID__struct__bpf_bloom_filter__380
> ffffffff833fe548 r __BTF_ID__struct__bpf_queue_stack__380
> ffffffff833fe578 r __BTF_ID__struct__cgroup__380
> 
> perhaps we were just lucky not to hit that :-\
> 
> > 
> > looking at another usage of BTF_ID other than struct
> > cgroup;kernel/bpf/helpers.c:2461:BTF_ID(func, bpf_cgroup_release)
> > is only defined in one translation unit
> > 
> > Should one of those two `BTF_ID(struct, cgroup)` be removed? Is there
> > some other way we can avoid these collisions in the future?
> 
> need to find some way to make the symbol unique, will check

the change below uses object's path as the __BTF_ID_.. symbol suffix to make
it unique

I'm still looking, but can't think of a better way so far, perhaps somebody
will have better idea

jirka


---
diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index a3462a9b8e18..564953f9cbc7 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -49,7 +49,7 @@ word							\
 	____BTF_ID(symbol, word)
 
 #define __ID(prefix) \
-	__PASTE(prefix, __COUNTER__)
+	__PASTE(__PASTE(prefix, __COUNTER__), BTF_ID_BASE)
 
 /*
  * The BTF_ID defines unique symbol for each ID pointing
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 68d0134bdbf9..2ef8b2798be0 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -200,6 +200,10 @@ _c_flags += $(if $(patsubst n%,, \
 	-D__KCSAN_INSTRUMENT_BARRIERS__)
 endif
 
+ifeq ($(CONFIG_DEBUG_INFO_BTF),y)
+_c_flags += -DBTF_ID_BASE=$(subst =,,$(shell echo -n $(modfile) | base32 -w0))
+endif
+
 # $(srctree)/$(src) for including checkin headers from generated source files
 # $(objtree)/$(obj) for including generated headers from checkin source files
 ifeq ($(KBUILD_EXTMOD),)

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-08 11:47   ` Jiri Olsa
@ 2023-09-08 17:14     ` Nick Desaulniers
  2023-09-08 20:15       ` Jiri Olsa
  0 siblings, 1 reply; 16+ messages in thread
From: Nick Desaulniers @ 2023-09-08 17:14 UTC (permalink / raw)
  To: Jiri Olsa, Marcus Seyfarth, Masahiro Yamada
  Cc: bpf, clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

Thanks for the patch!

+ Marcus

Marcus can you please test the below patch and provide your tested-by
and reported-by tags?

On Fri, Sep 8, 2023 at 4:47 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Thu, Sep 07, 2023 at 10:33:00PM +0200, Jiri Olsa wrote:
> > On Thu, Sep 07, 2023 at 12:01:18PM -0700, Nick Desaulniers wrote:
> > > So we've got a curious report recently:
> > > https://github.com/ClangBuiltLinux/linux/issues/1913
> > >
> > > ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
> > > '__BTF_ID__struct__cgroup__624' is already defined
> > > __BTF_ID__struct__cgroup__624:
> > > ^
> > >
> > > It's been hard to pin down a SHA and .config to reproduce this, but
> > > looking at the definition of BTF_ID's usage of __ID's usage of
> > > __COUNTER__, and the two statements:
> > >
> > > kernel/bpf/helpers.c:2460:BTF_ID(struct, cgroup)
> > > kernel/bpf/verifier.c:5075:BTF_ID(struct, cgroup)
> > >
> > > Is it possible that __COUNTER__ could evaluate to the same value
> > > across 2 different translation units, leading to a name collision like
> > > the above?
> >
> > hum, that probably the case, I see same counter values at different
> > __BTF_ID_ symbols:
> >
> > ffffffff833fe540 r __BTF_ID__struct__bpf_bloom_filter__380
> > ffffffff833fe548 r __BTF_ID__struct__bpf_queue_stack__380
> > ffffffff833fe578 r __BTF_ID__struct__cgroup__380
> >
> > perhaps we were just lucky not to hit that :-\
> >
> > >
> > > looking at another usage of BTF_ID other than struct
> > > cgroup;kernel/bpf/helpers.c:2461:BTF_ID(func, bpf_cgroup_release)
> > > is only defined in one translation unit
> > >
> > > Should one of those two `BTF_ID(struct, cgroup)` be removed? Is there
> > > some other way we can avoid these collisions in the future?
> >
> > need to find some way to make the symbol unique, will check
>
> the change below uses object's path as the __BTF_ID_.. symbol suffix to make
> it unique
>
> I'm still looking, but can't think of a better way so far, perhaps somebody
> will have better idea

Another good approach; I had simply added __LINE__ into the paste.
https://github.com/ClangBuiltLinux/linux/issues/1913#issuecomment-1710794319
Which just makes the probability of this occurring again smaller, but
still non-zero.

+ Masahiro for thoughts on the invocation of echo and base32.  Looks
like base32 is part of coreutils. Kind of strange that coreutils isn't
listed in Documentation/process/changes.rst.  Would adding the usage
of base32 add a new dependency on coreutils?

>
> jirka
>
>
> ---
> diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
> index a3462a9b8e18..564953f9cbc7 100644
> --- a/include/linux/btf_ids.h
> +++ b/include/linux/btf_ids.h
> @@ -49,7 +49,7 @@ word                                                  \
>         ____BTF_ID(symbol, word)
>
>  #define __ID(prefix) \
> -       __PASTE(prefix, __COUNTER__)
> +       __PASTE(__PASTE(prefix, __COUNTER__), BTF_ID_BASE)

Do we still need __COUNTER__ if we're now using BTF_ID_BASE?

>
>  /*
>   * The BTF_ID defines unique symbol for each ID pointing
> diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
> index 68d0134bdbf9..2ef8b2798be0 100644
> --- a/scripts/Makefile.lib
> +++ b/scripts/Makefile.lib
> @@ -200,6 +200,10 @@ _c_flags += $(if $(patsubst n%,, \
>         -D__KCSAN_INSTRUMENT_BARRIERS__)
>  endif
>
> +ifeq ($(CONFIG_DEBUG_INFO_BTF),y)
> +_c_flags += -DBTF_ID_BASE=$(subst =,,$(shell echo -n $(modfile) | base32 -w0))

`man 1 base32` shows it can just read a file. Could the above be:

_c_flags += -DBTF_ID_BASE=$(subst =,,$(shell base32 -w0 $(modfile)))

? (untested)

Also, the output of

$ base32 -w0 Documentation/process/changes.rst

is 24456 characters.  This is going to blow up symbol tables. I
suppose ELF probably has some length limit on symbol names, too.  I
was nervous about my approaching appending __LINE__.

Perhaps pipe the output to `head -c <n bytes>`?

> +endif
> +
>  # $(srctree)/$(src) for including checkin headers from generated source files
>  # $(objtree)/$(obj) for including generated headers from checkin source files
>  ifeq ($(KBUILD_EXTMOD),)



-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-08 17:14     ` Nick Desaulniers
@ 2023-09-08 20:15       ` Jiri Olsa
  2023-09-11 16:21         ` Nick Desaulniers
                           ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Jiri Olsa @ 2023-09-08 20:15 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Jiri Olsa, Marcus Seyfarth, Masahiro Yamada, bpf,
	clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

On Fri, Sep 08, 2023 at 10:14:56AM -0700, Nick Desaulniers wrote:
> Thanks for the patch!
> 
> + Marcus
> 
> Marcus can you please test the below patch and provide your tested-by
> and reported-by tags?
> 
> On Fri, Sep 8, 2023 at 4:47 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Thu, Sep 07, 2023 at 10:33:00PM +0200, Jiri Olsa wrote:
> > > On Thu, Sep 07, 2023 at 12:01:18PM -0700, Nick Desaulniers wrote:
> > > > So we've got a curious report recently:
> > > > https://github.com/ClangBuiltLinux/linux/issues/1913
> > > >
> > > > ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
> > > > '__BTF_ID__struct__cgroup__624' is already defined
> > > > __BTF_ID__struct__cgroup__624:
> > > > ^
> > > >
> > > > It's been hard to pin down a SHA and .config to reproduce this, but
> > > > looking at the definition of BTF_ID's usage of __ID's usage of
> > > > __COUNTER__, and the two statements:
> > > >
> > > > kernel/bpf/helpers.c:2460:BTF_ID(struct, cgroup)
> > > > kernel/bpf/verifier.c:5075:BTF_ID(struct, cgroup)
> > > >
> > > > Is it possible that __COUNTER__ could evaluate to the same value
> > > > across 2 different translation units, leading to a name collision like
> > > > the above?
> > >
> > > hum, that probably the case, I see same counter values at different
> > > __BTF_ID_ symbols:
> > >
> > > ffffffff833fe540 r __BTF_ID__struct__bpf_bloom_filter__380
> > > ffffffff833fe548 r __BTF_ID__struct__bpf_queue_stack__380
> > > ffffffff833fe578 r __BTF_ID__struct__cgroup__380
> > >
> > > perhaps we were just lucky not to hit that :-\
> > >
> > > >
> > > > looking at another usage of BTF_ID other than struct
> > > > cgroup;kernel/bpf/helpers.c:2461:BTF_ID(func, bpf_cgroup_release)
> > > > is only defined in one translation unit
> > > >
> > > > Should one of those two `BTF_ID(struct, cgroup)` be removed? Is there
> > > > some other way we can avoid these collisions in the future?
> > >
> > > need to find some way to make the symbol unique, will check
> >
> > the change below uses object's path as the __BTF_ID_.. symbol suffix to make
> > it unique
> >
> > I'm still looking, but can't think of a better way so far, perhaps somebody
> > will have better idea
> 
> Another good approach; I had simply added __LINE__ into the paste.
> https://github.com/ClangBuiltLinux/linux/issues/1913#issuecomment-1710794319
> Which just makes the probability of this occurring again smaller, but
> still non-zero.

yes, there's still possibility of the match

> 
> + Masahiro for thoughts on the invocation of echo and base32.  Looks
> like base32 is part of coreutils. Kind of strange that coreutils isn't
> listed in Documentation/process/changes.rst.  Would adding the usage
> of base32 add a new dependency on coreutils?
> 
> >
> > jirka
> >
> >
> > ---
> > diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
> > index a3462a9b8e18..564953f9cbc7 100644
> > --- a/include/linux/btf_ids.h
> > +++ b/include/linux/btf_ids.h
> > @@ -49,7 +49,7 @@ word                                                  \
> >         ____BTF_ID(symbol, word)
> >
> >  #define __ID(prefix) \
> > -       __PASTE(prefix, __COUNTER__)
> > +       __PASTE(__PASTE(prefix, __COUNTER__), BTF_ID_BASE)
> 
> Do we still need __COUNTER__ if we're now using BTF_ID_BASE?

yes we still need that because we could have same __BTF_ID__...
symbol used multiple times within same object, and that's where
__COUNTER__ makes the difference

> 
> >
> >  /*
> >   * The BTF_ID defines unique symbol for each ID pointing
> > diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
> > index 68d0134bdbf9..2ef8b2798be0 100644
> > --- a/scripts/Makefile.lib
> > +++ b/scripts/Makefile.lib
> > @@ -200,6 +200,10 @@ _c_flags += $(if $(patsubst n%,, \
> >         -D__KCSAN_INSTRUMENT_BARRIERS__)
> >  endif
> >
> > +ifeq ($(CONFIG_DEBUG_INFO_BTF),y)
> > +_c_flags += -DBTF_ID_BASE=$(subst =,,$(shell echo -n $(modfile) | base32 -w0))
> 
> `man 1 base32` shows it can just read a file. Could the above be:
> 
> _c_flags += -DBTF_ID_BASE=$(subst =,,$(shell base32 -w0 $(modfile)))
> 
> ? (untested)
> 
> Also, the output of
> 
> $ base32 -w0 Documentation/process/changes.rst
> 
> is 24456 characters.  This is going to blow up symbol tables. I
> suppose ELF probably has some length limit on symbol names, too.  I
> was nervous about my approaching appending __LINE__.
> 
> Perhaps pipe the output to `head -c <n bytes>`?

so the change is about adding unique id that's basically path of
the object stored in base32 so it could be used as symbol, so we
don't really need to read the actual file

the problem is when BTF_ID definition like:

BTF_ID(struct, cgroup)

translates in 2 separate objects into same symbol name because of
the matching __COUNTER__ macro values (like 380 below)

  __BTF_ID__struct__cgroup__380

this change just adds unique id of the path name at the end of the
symbol with:

  echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG

so the symbol looks like:

  __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG

and is unique over the sources

but I still hope we could come up with some better solution ;-)

jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-08 20:15       ` Jiri Olsa
@ 2023-09-11 16:21         ` Nick Desaulniers
       [not found]         ` <CA+FbhJNz4i4pU+8nT7JBvQKSa0VCkzcNzaJ=dRdRn+JCSTdgKQ@mail.gmail.com>
  2023-09-14  8:17         ` Jiri Olsa
  2 siblings, 0 replies; 16+ messages in thread
From: Nick Desaulniers @ 2023-09-11 16:21 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Marcus Seyfarth, Masahiro Yamada, bpf, clang-built-linux,
	Stanislav Fomichev, Nathan Chancellor, Yonghong Song, Song Liu

On Fri, Sep 8, 2023 at 1:15 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> but I still hope we could come up with some better solution ;-)

I have no preference, and other fires to attend to.  Thanks for the patch.
Acked-by: Nick Desaulniers <ndesaulniers@google.com>

I see Marcus replied but in text/html; I've encouraged Marcus to
re-reply for the public record.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
       [not found]         ` <CA+FbhJNz4i4pU+8nT7JBvQKSa0VCkzcNzaJ=dRdRn+JCSTdgKQ@mail.gmail.com>
@ 2023-09-11 18:17           ` Marcus Seyfarth
  0 siblings, 0 replies; 16+ messages in thread
From: Marcus Seyfarth @ 2023-09-11 18:17 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Nick Desaulniers, Masahiro Yamada, bpf, clang-built-linux,
	Stanislav Fomichev, Nathan Chancellor, Yonghong Song, Song Liu

I hope this one will get through the list as I am not subscribed yet
to the LKML.

From Fr., 8. Sept. 2023 um 23:22 Uhr:
> I've tested the patch from Jiri with a fresh snapshot of clang 18.0.0 (254847fb149fdc03ef601badf69ee08150345a5c) on my custom 6.4.15 Kernel with FullLTO and it also solves the reported issue (the Kernel also boots successfully).
>
> Reported-by: Marcus Seyfarth <m.seyfarth@gmail.com>
> Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-08 20:15       ` Jiri Olsa
  2023-09-11 16:21         ` Nick Desaulniers
       [not found]         ` <CA+FbhJNz4i4pU+8nT7JBvQKSa0VCkzcNzaJ=dRdRn+JCSTdgKQ@mail.gmail.com>
@ 2023-09-14  8:17         ` Jiri Olsa
  2023-09-14  8:30           ` Masahiro Yamada
  2 siblings, 1 reply; 16+ messages in thread
From: Jiri Olsa @ 2023-09-14  8:17 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Nick Desaulniers, Marcus Seyfarth, Masahiro Yamada, bpf,
	clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

On Fri, Sep 08, 2023 at 10:15:35PM +0200, Jiri Olsa wrote:
> On Fri, Sep 08, 2023 at 10:14:56AM -0700, Nick Desaulniers wrote:
> > Thanks for the patch!
> > 
> > + Marcus
> > 
> > Marcus can you please test the below patch and provide your tested-by
> > and reported-by tags?
> > 
> > On Fri, Sep 8, 2023 at 4:47 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Thu, Sep 07, 2023 at 10:33:00PM +0200, Jiri Olsa wrote:
> > > > On Thu, Sep 07, 2023 at 12:01:18PM -0700, Nick Desaulniers wrote:
> > > > > So we've got a curious report recently:
> > > > > https://github.com/ClangBuiltLinux/linux/issues/1913
> > > > >
> > > > > ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
> > > > > '__BTF_ID__struct__cgroup__624' is already defined
> > > > > __BTF_ID__struct__cgroup__624:
> > > > > ^
> > > > >
> > > > > It's been hard to pin down a SHA and .config to reproduce this, but
> > > > > looking at the definition of BTF_ID's usage of __ID's usage of
> > > > > __COUNTER__, and the two statements:
> > > > >
> > > > > kernel/bpf/helpers.c:2460:BTF_ID(struct, cgroup)
> > > > > kernel/bpf/verifier.c:5075:BTF_ID(struct, cgroup)
> > > > >
> > > > > Is it possible that __COUNTER__ could evaluate to the same value
> > > > > across 2 different translation units, leading to a name collision like
> > > > > the above?
> > > >
> > > > hum, that probably the case, I see same counter values at different
> > > > __BTF_ID_ symbols:
> > > >
> > > > ffffffff833fe540 r __BTF_ID__struct__bpf_bloom_filter__380
> > > > ffffffff833fe548 r __BTF_ID__struct__bpf_queue_stack__380
> > > > ffffffff833fe578 r __BTF_ID__struct__cgroup__380
> > > >
> > > > perhaps we were just lucky not to hit that :-\
> > > >
> > > > >
> > > > > looking at another usage of BTF_ID other than struct
> > > > > cgroup;kernel/bpf/helpers.c:2461:BTF_ID(func, bpf_cgroup_release)
> > > > > is only defined in one translation unit
> > > > >
> > > > > Should one of those two `BTF_ID(struct, cgroup)` be removed? Is there
> > > > > some other way we can avoid these collisions in the future?
> > > >
> > > > need to find some way to make the symbol unique, will check
> > >
> > > the change below uses object's path as the __BTF_ID_.. symbol suffix to make
> > > it unique
> > >
> > > I'm still looking, but can't think of a better way so far, perhaps somebody
> > > will have better idea
> > 
> > Another good approach; I had simply added __LINE__ into the paste.
> > https://github.com/ClangBuiltLinux/linux/issues/1913#issuecomment-1710794319
> > Which just makes the probability of this occurring again smaller, but
> > still non-zero.
> 
> yes, there's still possibility of the match
> 
> > 
> > + Masahiro for thoughts on the invocation of echo and base32.  Looks
> > like base32 is part of coreutils. Kind of strange that coreutils isn't
> > listed in Documentation/process/changes.rst.  Would adding the usage
> > of base32 add a new dependency on coreutils?
> > 
> > >
> > > jirka
> > >
> > >
> > > ---
> > > diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
> > > index a3462a9b8e18..564953f9cbc7 100644
> > > --- a/include/linux/btf_ids.h
> > > +++ b/include/linux/btf_ids.h
> > > @@ -49,7 +49,7 @@ word                                                  \
> > >         ____BTF_ID(symbol, word)
> > >
> > >  #define __ID(prefix) \
> > > -       __PASTE(prefix, __COUNTER__)
> > > +       __PASTE(__PASTE(prefix, __COUNTER__), BTF_ID_BASE)
> > 
> > Do we still need __COUNTER__ if we're now using BTF_ID_BASE?
> 
> yes we still need that because we could have same __BTF_ID__...
> symbol used multiple times within same object, and that's where
> __COUNTER__ makes the difference
> 
> > 
> > >
> > >  /*
> > >   * The BTF_ID defines unique symbol for each ID pointing
> > > diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
> > > index 68d0134bdbf9..2ef8b2798be0 100644
> > > --- a/scripts/Makefile.lib
> > > +++ b/scripts/Makefile.lib
> > > @@ -200,6 +200,10 @@ _c_flags += $(if $(patsubst n%,, \
> > >         -D__KCSAN_INSTRUMENT_BARRIERS__)
> > >  endif
> > >
> > > +ifeq ($(CONFIG_DEBUG_INFO_BTF),y)
> > > +_c_flags += -DBTF_ID_BASE=$(subst =,,$(shell echo -n $(modfile) | base32 -w0))
> > 
> > `man 1 base32` shows it can just read a file. Could the above be:
> > 
> > _c_flags += -DBTF_ID_BASE=$(subst =,,$(shell base32 -w0 $(modfile)))
> > 
> > ? (untested)
> > 
> > Also, the output of
> > 
> > $ base32 -w0 Documentation/process/changes.rst
> > 
> > is 24456 characters.  This is going to blow up symbol tables. I
> > suppose ELF probably has some length limit on symbol names, too.  I
> > was nervous about my approaching appending __LINE__.
> > 
> > Perhaps pipe the output to `head -c <n bytes>`?
> 
> so the change is about adding unique id that's basically path of
> the object stored in base32 so it could be used as symbol, so we
> don't really need to read the actual file
> 
> the problem is when BTF_ID definition like:
> 
> BTF_ID(struct, cgroup)
> 
> translates in 2 separate objects into same symbol name because of
> the matching __COUNTER__ macro values (like 380 below)
> 
>   __BTF_ID__struct__cgroup__380
> 
> this change just adds unique id of the path name at the end of the
> symbol with:
> 
>   echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> 
> so the symbol looks like:
> 
>   __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> 
> and is unique over the sources
> 
> but I still hope we could come up with some better solution ;-)

so far the only better solution I could come up with is to use
cksum (also from coreutils) instead of base32, which makes the
BTF_ID_BASE value compact

I'll run test to find out how much it hurts the build time

jirka


---
diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index a3462a9b8e18..564953f9cbc7 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -49,7 +49,7 @@ word							\
 	____BTF_ID(symbol, word)
 
 #define __ID(prefix) \
-	__PASTE(prefix, __COUNTER__)
+	__PASTE(__PASTE(prefix, __COUNTER__), BTF_ID_BASE)
 
 /*
  * The BTF_ID defines unique symbol for each ID pointing
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 68d0134bdbf9..01b14e6a7df3 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -200,6 +200,10 @@ _c_flags += $(if $(patsubst n%,, \
 	-D__KCSAN_INSTRUMENT_BARRIERS__)
 endif
 
+ifeq ($(CONFIG_DEBUG_INFO_BTF),y)
+_c_flags += -DBTF_ID_BASE=$(firstword $(shell echo -n $(modfile) | cksum))
+endif
+
 # $(srctree)/$(src) for including checkin headers from generated source files
 # $(objtree)/$(obj) for including generated headers from checkin source files
 ifeq ($(KBUILD_EXTMOD),)

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-14  8:17         ` Jiri Olsa
@ 2023-09-14  8:30           ` Masahiro Yamada
  2023-09-14  9:52             ` Jiri Olsa
  0 siblings, 1 reply; 16+ messages in thread
From: Masahiro Yamada @ 2023-09-14  8:30 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Nick Desaulniers, Marcus Seyfarth, bpf, clang-built-linux,
	Stanislav Fomichev, Nathan Chancellor, Yonghong Song, Song Liu

On Thu, Sep 14, 2023 at 5:17 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Fri, Sep 08, 2023 at 10:15:35PM +0200, Jiri Olsa wrote:
> > On Fri, Sep 08, 2023 at 10:14:56AM -0700, Nick Desaulniers wrote:
> > > Thanks for the patch!
> > >
> > > + Marcus
> > >
> > > Marcus can you please test the below patch and provide your tested-by
> > > and reported-by tags?
> > >
> > > On Fri, Sep 8, 2023 at 4:47 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 07, 2023 at 10:33:00PM +0200, Jiri Olsa wrote:
> > > > > On Thu, Sep 07, 2023 at 12:01:18PM -0700, Nick Desaulniers wrote:
> > > > > > So we've got a curious report recently:
> > > > > > https://github.com/ClangBuiltLinux/linux/issues/1913
> > > > > >
> > > > > > ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
> > > > > > '__BTF_ID__struct__cgroup__624' is already defined
> > > > > > __BTF_ID__struct__cgroup__624:
> > > > > > ^
> > > > > >
> > > > > > It's been hard to pin down a SHA and .config to reproduce this, but
> > > > > > looking at the definition of BTF_ID's usage of __ID's usage of
> > > > > > __COUNTER__, and the two statements:
> > > > > >
> > > > > > kernel/bpf/helpers.c:2460:BTF_ID(struct, cgroup)
> > > > > > kernel/bpf/verifier.c:5075:BTF_ID(struct, cgroup)
> > > > > >
> > > > > > Is it possible that __COUNTER__ could evaluate to the same value
> > > > > > across 2 different translation units, leading to a name collision like
> > > > > > the above?
> > > > >
> > > > > hum, that probably the case, I see same counter values at different
> > > > > __BTF_ID_ symbols:
> > > > >
> > > > > ffffffff833fe540 r __BTF_ID__struct__bpf_bloom_filter__380
> > > > > ffffffff833fe548 r __BTF_ID__struct__bpf_queue_stack__380
> > > > > ffffffff833fe578 r __BTF_ID__struct__cgroup__380
> > > > >
> > > > > perhaps we were just lucky not to hit that :-\
> > > > >
> > > > > >
> > > > > > looking at another usage of BTF_ID other than struct
> > > > > > cgroup;kernel/bpf/helpers.c:2461:BTF_ID(func, bpf_cgroup_release)
> > > > > > is only defined in one translation unit
> > > > > >
> > > > > > Should one of those two `BTF_ID(struct, cgroup)` be removed? Is there
> > > > > > some other way we can avoid these collisions in the future?
> > > > >
> > > > > need to find some way to make the symbol unique, will check
> > > >
> > > > the change below uses object's path as the __BTF_ID_.. symbol suffix to make
> > > > it unique
> > > >
> > > > I'm still looking, but can't think of a better way so far, perhaps somebody
> > > > will have better idea
> > >
> > > Another good approach; I had simply added __LINE__ into the paste.
> > > https://github.com/ClangBuiltLinux/linux/issues/1913#issuecomment-1710794319
> > > Which just makes the probability of this occurring again smaller, but
> > > still non-zero.
> >
> > yes, there's still possibility of the match
> >
> > >
> > > + Masahiro for thoughts on the invocation of echo and base32.  Looks
> > > like base32 is part of coreutils. Kind of strange that coreutils isn't
> > > listed in Documentation/process/changes.rst.  Would adding the usage
> > > of base32 add a new dependency on coreutils?
> > >
> > > >
> > > > jirka
> > > >
> > > >
> > > > ---
> > > > diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
> > > > index a3462a9b8e18..564953f9cbc7 100644
> > > > --- a/include/linux/btf_ids.h
> > > > +++ b/include/linux/btf_ids.h
> > > > @@ -49,7 +49,7 @@ word                                                  \
> > > >         ____BTF_ID(symbol, word)
> > > >
> > > >  #define __ID(prefix) \
> > > > -       __PASTE(prefix, __COUNTER__)
> > > > +       __PASTE(__PASTE(prefix, __COUNTER__), BTF_ID_BASE)
> > >
> > > Do we still need __COUNTER__ if we're now using BTF_ID_BASE?
> >
> > yes we still need that because we could have same __BTF_ID__...
> > symbol used multiple times within same object, and that's where
> > __COUNTER__ makes the difference
> >
> > >
> > > >
> > > >  /*
> > > >   * The BTF_ID defines unique symbol for each ID pointing
> > > > diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
> > > > index 68d0134bdbf9..2ef8b2798be0 100644
> > > > --- a/scripts/Makefile.lib
> > > > +++ b/scripts/Makefile.lib
> > > > @@ -200,6 +200,10 @@ _c_flags += $(if $(patsubst n%,, \
> > > >         -D__KCSAN_INSTRUMENT_BARRIERS__)
> > > >  endif
> > > >
> > > > +ifeq ($(CONFIG_DEBUG_INFO_BTF),y)
> > > > +_c_flags += -DBTF_ID_BASE=$(subst =,,$(shell echo -n $(modfile) | base32 -w0))
> > >
> > > `man 1 base32` shows it can just read a file. Could the above be:
> > >
> > > _c_flags += -DBTF_ID_BASE=$(subst =,,$(shell base32 -w0 $(modfile)))
> > >
> > > ? (untested)
> > >
> > > Also, the output of
> > >
> > > $ base32 -w0 Documentation/process/changes.rst
> > >
> > > is 24456 characters.  This is going to blow up symbol tables. I
> > > suppose ELF probably has some length limit on symbol names, too.  I
> > > was nervous about my approaching appending __LINE__.
> > >
> > > Perhaps pipe the output to `head -c <n bytes>`?
> >
> > so the change is about adding unique id that's basically path of
> > the object stored in base32 so it could be used as symbol, so we
> > don't really need to read the actual file
> >
> > the problem is when BTF_ID definition like:
> >
> > BTF_ID(struct, cgroup)
> >
> > translates in 2 separate objects into same symbol name because of
> > the matching __COUNTER__ macro values (like 380 below)
> >
> >   __BTF_ID__struct__cgroup__380
> >
> > this change just adds unique id of the path name at the end of the
> > symbol with:
> >
> >   echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> >
> > so the symbol looks like:
> >
> >   __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> >
> > and is unique over the sources
> >
> > but I still hope we could come up with some better solution ;-)
>
> so far the only better solution I could come up with is to use
> cksum (also from coreutils) instead of base32, which makes the
> BTF_ID_BASE value compact
>
> I'll run test to find out how much it hurts the build time
>
> jirka



Seems a bad idea to me.

It would fork a new shell and chsum for all files,
while only a few of them need it.

Better to consult BTF forks.










-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-14  8:30           ` Masahiro Yamada
@ 2023-09-14  9:52             ` Jiri Olsa
  2023-09-14 18:14               ` Andrii Nakryiko
  0 siblings, 1 reply; 16+ messages in thread
From: Jiri Olsa @ 2023-09-14  9:52 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Jiri Olsa, Nick Desaulniers, Marcus Seyfarth, bpf,
	clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

On Thu, Sep 14, 2023 at 05:30:51PM +0900, Masahiro Yamada wrote:

SNIP

> > > so the change is about adding unique id that's basically path of
> > > the object stored in base32 so it could be used as symbol, so we
> > > don't really need to read the actual file
> > >
> > > the problem is when BTF_ID definition like:
> > >
> > > BTF_ID(struct, cgroup)
> > >
> > > translates in 2 separate objects into same symbol name because of
> > > the matching __COUNTER__ macro values (like 380 below)
> > >
> > >   __BTF_ID__struct__cgroup__380
> > >
> > > this change just adds unique id of the path name at the end of the
> > > symbol with:
> > >
> > >   echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > >
> > > so the symbol looks like:
> > >
> > >   __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > >
> > > and is unique over the sources
> > >
> > > but I still hope we could come up with some better solution ;-)
> >
> > so far the only better solution I could come up with is to use
> > cksum (also from coreutils) instead of base32, which makes the
> > BTF_ID_BASE value compact
> >
> > I'll run test to find out how much it hurts the build time
> >
> > jirka
> 
> 
> 
> Seems a bad idea to me.
> 
> It would fork a new shell and chsum for all files,
> while only a few of them need it.

right, I have a change to limit this on kernel and net directories,
but it's still bad

> 
> Better to consult BTF forks.

perhaps there's better way within kbuild to get unique id/value
for each object file?

thanks,
jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-14  9:52             ` Jiri Olsa
@ 2023-09-14 18:14               ` Andrii Nakryiko
  2023-09-15  8:28                 ` Jiri Olsa
  0 siblings, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2023-09-14 18:14 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Masahiro Yamada, Nick Desaulniers, Marcus Seyfarth, bpf,
	clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

On Thu, Sep 14, 2023 at 2:52 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Thu, Sep 14, 2023 at 05:30:51PM +0900, Masahiro Yamada wrote:
>
> SNIP
>
> > > > so the change is about adding unique id that's basically path of
> > > > the object stored in base32 so it could be used as symbol, so we
> > > > don't really need to read the actual file
> > > >
> > > > the problem is when BTF_ID definition like:
> > > >
> > > > BTF_ID(struct, cgroup)
> > > >
> > > > translates in 2 separate objects into same symbol name because of
> > > > the matching __COUNTER__ macro values (like 380 below)
> > > >
> > > >   __BTF_ID__struct__cgroup__380
> > > >
> > > > this change just adds unique id of the path name at the end of the
> > > > symbol with:
> > > >
> > > >   echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > >
> > > > so the symbol looks like:
> > > >
> > > >   __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > >
> > > > and is unique over the sources
> > > >
> > > > but I still hope we could come up with some better solution ;-)
> > >
> > > so far the only better solution I could come up with is to use
> > > cksum (also from coreutils) instead of base32, which makes the
> > > BTF_ID_BASE value compact
> > >
> > > I'll run test to find out how much it hurts the build time
> > >
> > > jirka
> >
> >
> >
> > Seems a bad idea to me.
> >
> > It would fork a new shell and chsum for all files,
> > while only a few of them need it.
>
> right, I have a change to limit this on kernel and net directories,
> but it's still bad
>
> >
> > Better to consult BTF forks.
>
> perhaps there's better way within kbuild to get unique id/value
> for each object file?

let's just use __LINE__ + __COUNTER__ for now and teach resolve_btfids
to fail and complain loudly about duplicate symbols?


This will give us time and opportunity to implement a better approach
to .BTF_ids overall. Encoding the desired type name in the symbol name
always felt off. Maybe it's better to encode type + name as data,
which is discarded at the latest stage during vmlinux linking? Either
way, this baseid hack seems worse and unnecessary.

>
> thanks,
> jirka
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-14 18:14               ` Andrii Nakryiko
@ 2023-09-15  8:28                 ` Jiri Olsa
  2023-09-15 16:47                   ` Nick Desaulniers
  2023-09-15 20:41                   ` Andrii Nakryiko
  0 siblings, 2 replies; 16+ messages in thread
From: Jiri Olsa @ 2023-09-15  8:28 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Masahiro Yamada, Nick Desaulniers, Marcus Seyfarth,
	bpf, clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

On Thu, Sep 14, 2023 at 11:14:19AM -0700, Andrii Nakryiko wrote:
> On Thu, Sep 14, 2023 at 2:52 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Thu, Sep 14, 2023 at 05:30:51PM +0900, Masahiro Yamada wrote:
> >
> > SNIP
> >
> > > > > so the change is about adding unique id that's basically path of
> > > > > the object stored in base32 so it could be used as symbol, so we
> > > > > don't really need to read the actual file
> > > > >
> > > > > the problem is when BTF_ID definition like:
> > > > >
> > > > > BTF_ID(struct, cgroup)
> > > > >
> > > > > translates in 2 separate objects into same symbol name because of
> > > > > the matching __COUNTER__ macro values (like 380 below)
> > > > >
> > > > >   __BTF_ID__struct__cgroup__380
> > > > >
> > > > > this change just adds unique id of the path name at the end of the
> > > > > symbol with:
> > > > >
> > > > >   echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > >
> > > > > so the symbol looks like:
> > > > >
> > > > >   __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > >
> > > > > and is unique over the sources
> > > > >
> > > > > but I still hope we could come up with some better solution ;-)
> > > >
> > > > so far the only better solution I could come up with is to use
> > > > cksum (also from coreutils) instead of base32, which makes the
> > > > BTF_ID_BASE value compact
> > > >
> > > > I'll run test to find out how much it hurts the build time
> > > >
> > > > jirka
> > >
> > >
> > >
> > > Seems a bad idea to me.
> > >
> > > It would fork a new shell and chsum for all files,
> > > while only a few of them need it.
> >
> > right, I have a change to limit this on kernel and net directories,
> > but it's still bad
> >
> > >
> > > Better to consult BTF forks.
> >
> > perhaps there's better way within kbuild to get unique id/value
> > for each object file?
> 
> let's just use __LINE__ + __COUNTER__ for now and teach resolve_btfids
> to fail and complain loudly about duplicate symbols?

ok, will send that.. but it fails during link before resolve_btfids
takes place

> 
> 
> This will give us time and opportunity to implement a better approach
> to .BTF_ids overall. Encoding the desired type name in the symbol name
> always felt off. Maybe it's better to encode type + name as data,
> which is discarded at the latest stage during vmlinux linking? Either

hum, so maybe having a special section (.BTF_ids_desc below)
that would have record for each ID placed in .BTF_ids section:

asm(                                           \
".pushsection .BTF_ids,\"a\";        \n"       \
"1:                                  \n"       \
".zero 4                             \n"       \
".popsection;                        \n"       \
".pushsection .BTF_ids_desc,\"a\";   \n"       \
".long 1b                            \n"       \

and somehow get prefix and name pointers in here:

".long prefix
".long name

".popsection;                        \n");

so resolve_btfids would iterate .BTF_ids_desc records and fix
up .BTF_ids data..

we might need to do one extra link phase to get rid of the
.BTF_ids_desc secion 

> way, this baseid hack seems worse and unnecessary.

yes, it's bad

jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-15  8:28                 ` Jiri Olsa
@ 2023-09-15 16:47                   ` Nick Desaulniers
  2023-09-15 20:41                   ` Andrii Nakryiko
  1 sibling, 0 replies; 16+ messages in thread
From: Nick Desaulniers @ 2023-09-15 16:47 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andrii Nakryiko, Masahiro Yamada, Marcus Seyfarth, bpf,
	clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

On Fri, Sep 15, 2023 at 1:28 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Thu, Sep 14, 2023 at 11:14:19AM -0700, Andrii Nakryiko wrote:
> > On Thu, Sep 14, 2023 at 2:52 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Thu, Sep 14, 2023 at 05:30:51PM +0900, Masahiro Yamada wrote:
> > >
> > > SNIP
> > >
> > > > > > so the change is about adding unique id that's basically path of
> > > > > > the object stored in base32 so it could be used as symbol, so we
> > > > > > don't really need to read the actual file
> > > > > >
> > > > > > the problem is when BTF_ID definition like:
> > > > > >
> > > > > > BTF_ID(struct, cgroup)
> > > > > >
> > > > > > translates in 2 separate objects into same symbol name because of
> > > > > > the matching __COUNTER__ macro values (like 380 below)
> > > > > >
> > > > > >   __BTF_ID__struct__cgroup__380
> > > > > >
> > > > > > this change just adds unique id of the path name at the end of the
> > > > > > symbol with:
> > > > > >
> > > > > >   echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > > >
> > > > > > so the symbol looks like:
> > > > > >
> > > > > >   __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > > >
> > > > > > and is unique over the sources
> > > > > >
> > > > > > but I still hope we could come up with some better solution ;-)
> > > > >
> > > > > so far the only better solution I could come up with is to use
> > > > > cksum (also from coreutils) instead of base32, which makes the
> > > > > BTF_ID_BASE value compact
> > > > >
> > > > > I'll run test to find out how much it hurts the build time
> > > > >
> > > > > jirka
> > > >
> > > >
> > > >
> > > > Seems a bad idea to me.
> > > >
> > > > It would fork a new shell and chsum for all files,
> > > > while only a few of them need it.
> > >
> > > right, I have a change to limit this on kernel and net directories,
> > > but it's still bad
> > >
> > > >
> > > > Better to consult BTF forks.
> > >
> > > perhaps there's better way within kbuild to get unique id/value
> > > for each object file?
> >
> > let's just use __LINE__ + __COUNTER__ for now and teach resolve_btfids
> > to fail and complain loudly about duplicate symbols?
>
> ok, will send that.. but it fails during link before resolve_btfids
> takes place
>
> >
> >
> > This will give us time and opportunity to implement a better approach
> > to .BTF_ids overall. Encoding the desired type name in the symbol name
> > always felt off. Maybe it's better to encode type + name as data,
> > which is discarded at the latest stage during vmlinux linking? Either
>
> hum, so maybe having a special section (.BTF_ids_desc below)
> that would have record for each ID placed in .BTF_ids section:
>
> asm(                                           \
> ".pushsection .BTF_ids,\"a\";        \n"       \
> "1:                                  \n"       \
> ".zero 4                             \n"       \
> ".popsection;                        \n"       \
> ".pushsection .BTF_ids_desc,\"a\";   \n"       \
> ".long 1b                            \n"       \
>
> and somehow get prefix and name pointers in here:
>
> ".long prefix
> ".long name
>
> ".popsection;                        \n");
>
> so resolve_btfids would iterate .BTF_ids_desc records and fix
> up .BTF_ids data..
>
> we might need to do one extra link phase to get rid of the
> .BTF_ids_desc secion

It should be ok to keep it in vmlinux as we do for DWARF debug info.

We'd want to make sure it's not retained in the compressed image
though. Pretty sure `strip` is used to drop DWARF from the compressed
image, but `strip` isn't going to recognize the semantics of some new
ad-hoc ELF section.  Pretty sure `objcopy` can be used to drop ELF
sections; we'd need to probably invoke `objcopy` explicitly to drop
that section (or add any new section to any pre-existing list of
sections to drop).

>
> > way, this baseid hack seems worse and unnecessary.
>
> yes, it's bad
>
> jirka



-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-15  8:28                 ` Jiri Olsa
  2023-09-15 16:47                   ` Nick Desaulniers
@ 2023-09-15 20:41                   ` Andrii Nakryiko
  2023-09-17 14:09                     ` Jiri Olsa
  1 sibling, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2023-09-15 20:41 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Masahiro Yamada, Nick Desaulniers, Marcus Seyfarth, bpf,
	clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

On Fri, Sep 15, 2023 at 1:28 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Thu, Sep 14, 2023 at 11:14:19AM -0700, Andrii Nakryiko wrote:
> > On Thu, Sep 14, 2023 at 2:52 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Thu, Sep 14, 2023 at 05:30:51PM +0900, Masahiro Yamada wrote:
> > >
> > > SNIP
> > >
> > > > > > so the change is about adding unique id that's basically path of
> > > > > > the object stored in base32 so it could be used as symbol, so we
> > > > > > don't really need to read the actual file
> > > > > >
> > > > > > the problem is when BTF_ID definition like:
> > > > > >
> > > > > > BTF_ID(struct, cgroup)
> > > > > >
> > > > > > translates in 2 separate objects into same symbol name because of
> > > > > > the matching __COUNTER__ macro values (like 380 below)
> > > > > >
> > > > > >   __BTF_ID__struct__cgroup__380
> > > > > >
> > > > > > this change just adds unique id of the path name at the end of the
> > > > > > symbol with:
> > > > > >
> > > > > >   echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > > >
> > > > > > so the symbol looks like:
> > > > > >
> > > > > >   __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > > >
> > > > > > and is unique over the sources
> > > > > >
> > > > > > but I still hope we could come up with some better solution ;-)
> > > > >
> > > > > so far the only better solution I could come up with is to use
> > > > > cksum (also from coreutils) instead of base32, which makes the
> > > > > BTF_ID_BASE value compact
> > > > >
> > > > > I'll run test to find out how much it hurts the build time
> > > > >
> > > > > jirka
> > > >
> > > >
> > > >
> > > > Seems a bad idea to me.
> > > >
> > > > It would fork a new shell and chsum for all files,
> > > > while only a few of them need it.
> > >
> > > right, I have a change to limit this on kernel and net directories,
> > > but it's still bad
> > >
> > > >
> > > > Better to consult BTF forks.
> > >
> > > perhaps there's better way within kbuild to get unique id/value
> > > for each object file?
> >
> > let's just use __LINE__ + __COUNTER__ for now and teach resolve_btfids
> > to fail and complain loudly about duplicate symbols?
>
> ok, will send that.. but it fails during link before resolve_btfids
> takes place
>
> >
> >
> > This will give us time and opportunity to implement a better approach
> > to .BTF_ids overall. Encoding the desired type name in the symbol name
> > always felt off. Maybe it's better to encode type + name as data,
> > which is discarded at the latest stage during vmlinux linking? Either
>
> hum, so maybe having a special section (.BTF_ids_desc below)
> that would have record for each ID placed in .BTF_ids section:
>
> asm(                                           \
> ".pushsection .BTF_ids,\"a\";        \n"       \
> "1:                                  \n"       \
> ".zero 4                             \n"       \
> ".popsection;                        \n"       \
> ".pushsection .BTF_ids_desc,\"a\";   \n"       \
> ".long 1b                            \n"       \
>
> and somehow get prefix and name pointers in here:
>
> ".long prefix
> ".long name
>
> ".popsection;                        \n");
>
> so resolve_btfids would iterate .BTF_ids_desc records and fix
> up .BTF_ids data..
>

Something like that. I don't think it's even a regression in terms of
vmlinux space usage, because right now we spend as much space on
storing symbol names. So just adding string pointers would be already
a win due to repeating "struct", "func", etc strings.


> we might need to do one extra link phase to get rid of the
> .BTF_ids_desc secion

Hopefully we can find a way to avoid this, we already do like 3 link
phases at least (for kallsyms), so doing all that on the first one and
then stripping it out using link script at subsequent one would be
best.

>
> > way, this baseid hack seems worse and unnecessary.
>
> yes, it's bad
>
> jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-15 20:41                   ` Andrii Nakryiko
@ 2023-09-17 14:09                     ` Jiri Olsa
  2023-09-24 13:27                       ` Jiri Olsa
  0 siblings, 1 reply; 16+ messages in thread
From: Jiri Olsa @ 2023-09-17 14:09 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Masahiro Yamada, Nick Desaulniers, Marcus Seyfarth,
	bpf, clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

On Fri, Sep 15, 2023 at 01:41:25PM -0700, Andrii Nakryiko wrote:
> On Fri, Sep 15, 2023 at 1:28 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Thu, Sep 14, 2023 at 11:14:19AM -0700, Andrii Nakryiko wrote:
> > > On Thu, Sep 14, 2023 at 2:52 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 14, 2023 at 05:30:51PM +0900, Masahiro Yamada wrote:
> > > >
> > > > SNIP
> > > >
> > > > > > > so the change is about adding unique id that's basically path of
> > > > > > > the object stored in base32 so it could be used as symbol, so we
> > > > > > > don't really need to read the actual file
> > > > > > >
> > > > > > > the problem is when BTF_ID definition like:
> > > > > > >
> > > > > > > BTF_ID(struct, cgroup)
> > > > > > >
> > > > > > > translates in 2 separate objects into same symbol name because of
> > > > > > > the matching __COUNTER__ macro values (like 380 below)
> > > > > > >
> > > > > > >   __BTF_ID__struct__cgroup__380
> > > > > > >
> > > > > > > this change just adds unique id of the path name at the end of the
> > > > > > > symbol with:
> > > > > > >
> > > > > > >   echo -n 'kernel/bpf/helpers' | base32 -w0 --> NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > > > >
> > > > > > > so the symbol looks like:
> > > > > > >
> > > > > > >   __BTF_ID__struct__cgroup__380NNSXE3TFNQXWE4DGF5UGK3DQMVZHG
> > > > > > >
> > > > > > > and is unique over the sources
> > > > > > >
> > > > > > > but I still hope we could come up with some better solution ;-)
> > > > > >
> > > > > > so far the only better solution I could come up with is to use
> > > > > > cksum (also from coreutils) instead of base32, which makes the
> > > > > > BTF_ID_BASE value compact
> > > > > >
> > > > > > I'll run test to find out how much it hurts the build time
> > > > > >
> > > > > > jirka
> > > > >
> > > > >
> > > > >
> > > > > Seems a bad idea to me.
> > > > >
> > > > > It would fork a new shell and chsum for all files,
> > > > > while only a few of them need it.
> > > >
> > > > right, I have a change to limit this on kernel and net directories,
> > > > but it's still bad
> > > >
> > > > >
> > > > > Better to consult BTF forks.
> > > >
> > > > perhaps there's better way within kbuild to get unique id/value
> > > > for each object file?
> > >
> > > let's just use __LINE__ + __COUNTER__ for now and teach resolve_btfids
> > > to fail and complain loudly about duplicate symbols?
> >
> > ok, will send that.. but it fails during link before resolve_btfids
> > takes place
> >
> > >
> > >
> > > This will give us time and opportunity to implement a better approach
> > > to .BTF_ids overall. Encoding the desired type name in the symbol name
> > > always felt off. Maybe it's better to encode type + name as data,
> > > which is discarded at the latest stage during vmlinux linking? Either
> >
> > hum, so maybe having a special section (.BTF_ids_desc below)
> > that would have record for each ID placed in .BTF_ids section:
> >
> > asm(                                           \
> > ".pushsection .BTF_ids,\"a\";        \n"       \
> > "1:                                  \n"       \
> > ".zero 4                             \n"       \
> > ".popsection;                        \n"       \
> > ".pushsection .BTF_ids_desc,\"a\";   \n"       \
> > ".long 1b                            \n"       \
> >
> > and somehow get prefix and name pointers in here:
> >
> > ".long prefix
> > ".long name
> >
> > ".popsection;                        \n");
> >
> > so resolve_btfids would iterate .BTF_ids_desc records and fix
> > up .BTF_ids data..
> >
> 
> Something like that. I don't think it's even a regression in terms of
> vmlinux space usage, because right now we spend as much space on
> storing symbol names. So just adding string pointers would be already
> a win due to repeating "struct", "func", etc strings.
> 
> 
> > we might need to do one extra link phase to get rid of the
> > .BTF_ids_desc secion
> 
> Hopefully we can find a way to avoid this, we already do like 3 link
> phases at least (for kallsyms), so doing all that on the first one and
> then stripping it out using link script at subsequent one would be
> best.

perhaps we could move that section under .init.data and
get rid of it on startup

jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: duplicate BTF_IDs leading to symbol redefinition errors?
  2023-09-17 14:09                     ` Jiri Olsa
@ 2023-09-24 13:27                       ` Jiri Olsa
  0 siblings, 0 replies; 16+ messages in thread
From: Jiri Olsa @ 2023-09-24 13:27 UTC (permalink / raw)
  To: Jiri Olsa, Andrii Nakryiko
  Cc: Masahiro Yamada, Nick Desaulniers, Marcus Seyfarth, bpf,
	clang-built-linux, Stanislav Fomichev, Nathan Chancellor,
	Yonghong Song, Song Liu

On Sun, Sep 17, 2023 at 04:09:31PM +0200, Jiri Olsa wrote:

SNIP

> > > > This will give us time and opportunity to implement a better approach
> > > > to .BTF_ids overall. Encoding the desired type name in the symbol name
> > > > always felt off. Maybe it's better to encode type + name as data,
> > > > which is discarded at the latest stage during vmlinux linking? Either
> > >
> > > hum, so maybe having a special section (.BTF_ids_desc below)
> > > that would have record for each ID placed in .BTF_ids section:
> > >
> > > asm(                                           \
> > > ".pushsection .BTF_ids,\"a\";        \n"       \
> > > "1:                                  \n"       \
> > > ".zero 4                             \n"       \
> > > ".popsection;                        \n"       \
> > > ".pushsection .BTF_ids_desc,\"a\";   \n"       \
> > > ".long 1b                            \n"       \
> > >
> > > and somehow get prefix and name pointers in here:
> > >
> > > ".long prefix
> > > ".long name
> > >
> > > ".popsection;                        \n");
> > >
> > > so resolve_btfids would iterate .BTF_ids_desc records and fix
> > > up .BTF_ids data..
> > >
> > 
> > Something like that. I don't think it's even a regression in terms of
> > vmlinux space usage, because right now we spend as much space on
> > storing symbol names. So just adding string pointers would be already
> > a win due to repeating "struct", "func", etc strings.
> > 
> > 
> > > we might need to do one extra link phase to get rid of the
> > > .BTF_ids_desc secion
> > 
> > Hopefully we can find a way to avoid this, we already do like 3 link
> > phases at least (for kallsyms), so doing all that on the first one and
> > then stripping it out using link script at subsequent one would be
> > best.
> 
> perhaps we could move that section under .init.data and
> get rid of it on startup

hi,
I made first version on having extra sections that contain the
auxiliary data for .BTF_ids section,

new sections are:

  .BTF_ids_data that holds type and name strings
  .BTF_ids_desc that holds array of

     struct {
       u64 addr_btf_ids;      // address to .BTF_ids section
       u64 addr_type;         // address of type string
       u64 addr_name;         // address of name string
     }

it seems to work ok for vmlinux, but there' problem with kernel modules,
because all the .BTF_ids_desc datas need to be relocated first.. and we
don't get relocated data when reading elf object.. I'll check if we can
relocate that ourselfs, but might be tricky to support this on other archs

It's all in here together with resolve_btfids change:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  btfid_fix

jirka


---
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 9c59409104f6..2f7518b15301 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -658,6 +658,14 @@
 	. = ALIGN(4);							\
 	.BTF_ids : AT(ADDR(.BTF_ids) - LOAD_OFFSET) {			\
 		*(.BTF_ids)						\
+	}								\
+	. = ALIGN(4);							\
+	.BTF_ids_data : AT(ADDR(.BTF_ids_data) - LOAD_OFFSET) {		\
+		*(.BTF_ids_data)					\
+	}								\
+	. = ALIGN(4);							\
+	.BTF_ids_desc : AT(ADDR(.BTF_ids_desc) - LOAD_OFFSET) {		\
+		*(.BTF_ids_desc)					\
 	}
 #else
 #define BTF
diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index a9cb10b0e2e9..63a4ebc7f331 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -34,6 +34,7 @@ struct btf_id_set8 {
 
 #define BTF_IDS_SECTION ".BTF_ids"
 
+#ifdef MODULE
 #define ____BTF_ID(symbol, word)			\
 asm(							\
 ".pushsection " BTF_IDS_SECTION ",\"a\";       \n"	\
@@ -65,6 +66,38 @@ word							\
 #define BTF_ID_FLAGS(prefix, name, ...) \
 	__BTF_ID_FLAGS(prefix, name, ##__VA_ARGS__, 0)
 
+#else
+#define __BTF_ID(type, name, word)             \
+asm(                                           \
+".pushsection .BTF_ids,\"a\";        \n"       \
+"1:                                  \n"       \
+".zero 4                             \n"       \
+word                                           \
+".popsection;                        \n"       \
+".pushsection .BTF_ids_data,\"a\";   \n"       \
+"2:                                  \n"       \
+".string \"" #type "\"               \n"       \
+"3:                                  \n"       \
+".string \"" #name "\"               \n"       \
+".popsection;                        \n"       \
+".pushsection .BTF_ids_desc,\"a\";   \n"       \
+".quad 1b                            \n"       \
+".quad 2b                            \n"       \
+".quad 3b                            \n"       \
+".popsection;                        \n");
+
+#define BTF_ID(type, name) \
+	__BTF_ID(type, name, "")
+
+#define ____BTF_ID_FLAGS(type, name, flags, ...) \
+	__BTF_ID(type, name, ".long " #flags "\n")
+#define __BTF_ID_FLAGS(type, name, flags, ...) \
+	____BTF_ID_FLAGS(type, name, flags)
+#define BTF_ID_FLAGS(type, name, ...) \
+	__BTF_ID_FLAGS(type, name, ##__VA_ARGS__, 0)
+
+#endif /* MODULE */
+
 /*
  * The BTF_ID_LIST macro defines pure (unsorted) list
  * of BTF IDs, with following layout:

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-09-24 13:27 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-07 19:01 duplicate BTF_IDs leading to symbol redefinition errors? Nick Desaulniers
2023-09-07 20:33 ` Jiri Olsa
2023-09-08 11:47   ` Jiri Olsa
2023-09-08 17:14     ` Nick Desaulniers
2023-09-08 20:15       ` Jiri Olsa
2023-09-11 16:21         ` Nick Desaulniers
     [not found]         ` <CA+FbhJNz4i4pU+8nT7JBvQKSa0VCkzcNzaJ=dRdRn+JCSTdgKQ@mail.gmail.com>
2023-09-11 18:17           ` Marcus Seyfarth
2023-09-14  8:17         ` Jiri Olsa
2023-09-14  8:30           ` Masahiro Yamada
2023-09-14  9:52             ` Jiri Olsa
2023-09-14 18:14               ` Andrii Nakryiko
2023-09-15  8:28                 ` Jiri Olsa
2023-09-15 16:47                   ` Nick Desaulniers
2023-09-15 20:41                   ` Andrii Nakryiko
2023-09-17 14:09                     ` Jiri Olsa
2023-09-24 13:27                       ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).