All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFT] Testing 1.22
@ 2021-05-27 15:20 Arnaldo Carvalho de Melo
  2021-05-27 16:54 ` Andrii Nakryiko
                   ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-05-27 15:20 UTC (permalink / raw)
  To: Andrii Nakryiko, Jiri Olsa
  Cc: Michal Suchánek, dwarves, bpf, kernel-team, Michael Petlan

Hi guys,

	Its important to have 1.22 out of the door ASAP, so please clone
what is in tmp.master and report your results.

	To make it super easy:

[acme@quaco pahole]$ cd /tmp
[acme@quaco tmp]$ git clone git://git.kernel.org/pub/scm/devel/pahole/pahole.git
Cloning into 'pahole'...
remote: Enumerating objects: 6510, done.
remote: Total 6510 (delta 0), reused 0 (delta 0), pack-reused 6510
Receiving objects: 100% (6510/6510), 1.63 MiB | 296.00 KiB/s, done.
Resolving deltas: 100% (4550/4550), done.
[acme@quaco tmp]$ cd pahole/
[acme@quaco pahole]$ git checkout origin/tmp.master
Note: switching to 'origin/tmp.master'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 0d17503db0580a66 btf_encoder: fix and complete filtering out zero-sized per-CPU variables
[acme@quaco pahole]$ git log --oneline -5
0d17503db0580a66 (HEAD, origin/tmp.master) btf_encoder: fix and complete filtering out zero-sized per-CPU variables
fb418f9d8384d3a9 dwarves: Make handling of NULL by destructos consistent
f049fe9ebf7aa9c2 dutil: Make handling of NULL by destructos consistent
1512ab8ab6fe76a9 pahole: Make handling of NULL by destructos consistent
1105b7dad2d0978b elf_symtab: Use zfree() where applicable
[acme@quaco pahole]$ mkdir build
[acme@quaco pahole]$ cd build
[acme@quaco build]$ cmake ..
<SNIP>
-- Build files have been written to: /tmp/pahole/build
[acme@quaco build]$ cd ..
[acme@quaco pahole]$ make -j8 -C build
make: Entering directory '/tmp/pahole/build'
<SNIP>
[100%] Built target pahole
make[1]: Leaving directory '/tmp/pahole/build'
make: Leaving directory '/tmp/pahole/build'
[acme@quaco pahole]$

Then make sure build/pahole is in your path and try your workloads.

Jiri, Michael, if you could run your tests with this, that would be awesome,

Thanks in advance!

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 15:20 [RFT] Testing 1.22 Arnaldo Carvalho de Melo
@ 2021-05-27 16:54 ` Andrii Nakryiko
  2021-05-27 19:04   ` Arnaldo
  2021-06-02 10:21 ` Michael Petlan
  2021-07-15 21:31 ` Michal Suchánek
  2 siblings, 1 reply; 44+ messages in thread
From: Andrii Nakryiko @ 2021-05-27 16:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Jiri Olsa, Michal Suchánek, dwarves, bpf,
	Kernel Team, Michael Petlan

On Thu, May 27, 2021 at 8:20 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Hi guys,
>
>         Its important to have 1.22 out of the door ASAP, so please clone
> what is in tmp.master and report your results.
>

Hey Arnaldo,

If we are going to make pahole 1.22 a new mandatory minimal version of
pahole, I think we should take a little bit of time and fix another
problematic issue and clean up Kbuild significantly.

We discussed this before, it would be great to have an ability to dump
generated BTF into a separate file instead of modifying vmlinux image
in place. I'd say let's try to push for [0] to land as a temporary
work around to buy us a bit of time to implement this feature. Then,
when pahole 1.22 is released and packaged into major distros, we can
follow up in kernel with Kbuild clean ups and making pahole 1.22
mandatory.

What do you think? If anyone agrees, please consider chiming in on the
above thread ([0]).

  [0] https://lore.kernel.org/bpf/20210526080741.GW30378@techsingularity.net/

>         To make it super easy:
>
> [acme@quaco pahole]$ cd /tmp
> [acme@quaco tmp]$ git clone git://git.kernel.org/pub/scm/devel/pahole/pahole.git
> Cloning into 'pahole'...
> remote: Enumerating objects: 6510, done.
> remote: Total 6510 (delta 0), reused 0 (delta 0), pack-reused 6510
> Receiving objects: 100% (6510/6510), 1.63 MiB | 296.00 KiB/s, done.
> Resolving deltas: 100% (4550/4550), done.
> [acme@quaco tmp]$ cd pahole/
> [acme@quaco pahole]$ git checkout origin/tmp.master
> Note: switching to 'origin/tmp.master'.
>
> You are in 'detached HEAD' state. You can look around, make experimental
> changes and commit them, and you can discard any commits you make in this
> state without impacting any branches by switching back to a branch.
>
> If you want to create a new branch to retain commits you create, you may
> do so (now or later) by using -c with the switch command. Example:
>
>   git switch -c <new-branch-name>
>
> Or undo this operation with:
>
>   git switch -
>
> Turn off this advice by setting config variable advice.detachedHead to false
>
> HEAD is now at 0d17503db0580a66 btf_encoder: fix and complete filtering out zero-sized per-CPU variables
> [acme@quaco pahole]$ git log --oneline -5
> 0d17503db0580a66 (HEAD, origin/tmp.master) btf_encoder: fix and complete filtering out zero-sized per-CPU variables
> fb418f9d8384d3a9 dwarves: Make handling of NULL by destructos consistent
> f049fe9ebf7aa9c2 dutil: Make handling of NULL by destructos consistent
> 1512ab8ab6fe76a9 pahole: Make handling of NULL by destructos consistent
> 1105b7dad2d0978b elf_symtab: Use zfree() where applicable
> [acme@quaco pahole]$ mkdir build
> [acme@quaco pahole]$ cd build
> [acme@quaco build]$ cmake ..
> <SNIP>
> -- Build files have been written to: /tmp/pahole/build
> [acme@quaco build]$ cd ..
> [acme@quaco pahole]$ make -j8 -C build
> make: Entering directory '/tmp/pahole/build'
> <SNIP>
> [100%] Built target pahole
> make[1]: Leaving directory '/tmp/pahole/build'
> make: Leaving directory '/tmp/pahole/build'
> [acme@quaco pahole]$
>
> Then make sure build/pahole is in your path and try your workloads.
>
> Jiri, Michael, if you could run your tests with this, that would be awesome,
>
> Thanks in advance!
>
> - Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 16:54 ` Andrii Nakryiko
@ 2021-05-27 19:04   ` Arnaldo
  2021-05-27 19:14     ` Andrii Nakryiko
  0 siblings, 1 reply; 44+ messages in thread
From: Arnaldo @ 2021-05-27 19:04 UTC (permalink / raw)
  To: Andrii Nakryiko, Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Jiri Olsa, Michal Suchánek, dwarves, bpf,
	Kernel Team, Michael Petlan



On May 27, 2021 1:54:40 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>On Thu, May 27, 2021 at 8:20 AM Arnaldo Carvalho de Melo
><acme@kernel.org> wrote:
>>
>> Hi guys,
>>
>>         Its important to have 1.22 out of the door ASAP, so please
>clone
>> what is in tmp.master and report your results.
>>
>
>Hey Arnaldo,
>
>If we are going to make pahole 1.22 a new mandatory minimal version of
>pahole, I think we should take a little bit of time and fix another
>problematic issue and clean up Kbuild significantly.
>
>We discussed this before, it would be great to have an ability to dump
>generated BTF into a separate file instead of modifying vmlinux image
>in place. I'd say let's try to push for [0] to land as a temporary
>work around to buy us a bit of time to implement this feature. Then,
>when pahole 1.22 is released and packaged into major distros, we can
>follow up in kernel with Kbuild clean ups and making pahole 1.22
>mandatory.
>
>What do you think? If anyone agrees, please consider chiming in on the
>above thread ([0]).

There's multiple fixes that affects lots of stakeholders, so I'm more inclined to release 1.22 sooner rather than later.

If anyone has cycles right now to work on that detached BTF feature, releasing 1.23 as soon as that feature is complete and tested shouldn't be a problem.

Then 1.23 the mandatory minimal version.

Wdyt?

- Arnaldo


-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 19:04   ` Arnaldo
@ 2021-05-27 19:14     ` Andrii Nakryiko
  2021-05-27 19:55       ` Arnaldo
  0 siblings, 1 reply; 44+ messages in thread
From: Andrii Nakryiko @ 2021-05-27 19:14 UTC (permalink / raw)
  To: Arnaldo
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa,
	Michal Suchánek, dwarves, bpf, Kernel Team, Michael Petlan

On Thu, May 27, 2021 at 12:06 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
>
>
>
> On May 27, 2021 1:54:40 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >On Thu, May 27, 2021 at 8:20 AM Arnaldo Carvalho de Melo
> ><acme@kernel.org> wrote:
> >>
> >> Hi guys,
> >>
> >>         Its important to have 1.22 out of the door ASAP, so please
> >clone
> >> what is in tmp.master and report your results.
> >>
> >
> >Hey Arnaldo,
> >
> >If we are going to make pahole 1.22 a new mandatory minimal version of
> >pahole, I think we should take a little bit of time and fix another
> >problematic issue and clean up Kbuild significantly.
> >
> >We discussed this before, it would be great to have an ability to dump
> >generated BTF into a separate file instead of modifying vmlinux image
> >in place. I'd say let's try to push for [0] to land as a temporary
> >work around to buy us a bit of time to implement this feature. Then,
> >when pahole 1.22 is released and packaged into major distros, we can
> >follow up in kernel with Kbuild clean ups and making pahole 1.22
> >mandatory.
> >
> >What do you think? If anyone agrees, please consider chiming in on the
> >above thread ([0]).
>
> There's multiple fixes that affects lots of stakeholders, so I'm more inclined to release 1.22 sooner rather than later.
>
> If anyone has cycles right now to work on that detached BTF feature, releasing 1.23 as soon as that feature is complete and tested shouldn't be a problem.
>
> Then 1.23 the mandatory minimal version.
>
> Wdyt?

If we make 1.22 mandatory there will be no good reason to make 1.23
mandatory again. So I will have absolutely no inclination to work on
this, for example. So we are just wasting a chance to clean up the
Kbuild story w.r.t. pahole. And we are talking about just a few days
at most, while we do have a reasonable work around on the kernel side.

>
> - Arnaldo
>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 19:14     ` Andrii Nakryiko
@ 2021-05-27 19:55       ` Arnaldo
  2021-05-27 20:41         ` Andrii Nakryiko
  0 siblings, 1 reply; 44+ messages in thread
From: Arnaldo @ 2021-05-27 19:55 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa,
	Michal Suchánek, dwarves, bpf, Kernel Team, Michael Petlan



On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>On Thu, May 27, 2021 at 12:06 PM Arnaldo <arnaldo.melo@gmail.com>
>wrote:
>>
>>
>>
>> On May 27, 2021 1:54:40 PM GMT-03:00, Andrii Nakryiko
><andrii.nakryiko@gmail.com> wrote:
>> >On Thu, May 27, 2021 at 8:20 AM Arnaldo Carvalho de Melo
>> ><acme@kernel.org> wrote:
>> >>
>> >> Hi guys,
>> >>
>> >>         Its important to have 1.22 out of the door ASAP, so please
>> >clone
>> >> what is in tmp.master and report your results.
>> >>
>> >
>> >Hey Arnaldo,
>> >
>> >If we are going to make pahole 1.22 a new mandatory minimal version
>of
>> >pahole, I think we should take a little bit of time and fix another
>> >problematic issue and clean up Kbuild significantly.
>> >
>> >We discussed this before, it would be great to have an ability to
>dump
>> >generated BTF into a separate file instead of modifying vmlinux
>image
>> >in place. I'd say let's try to push for [0] to land as a temporary
>> >work around to buy us a bit of time to implement this feature. Then,
>> >when pahole 1.22 is released and packaged into major distros, we can
>> >follow up in kernel with Kbuild clean ups and making pahole 1.22
>> >mandatory.
>> >
>> >What do you think? If anyone agrees, please consider chiming in on
>the
>> >above thread ([0]).
>>
>> There's multiple fixes that affects lots of stakeholders, so I'm more
>inclined to release 1.22 sooner rather than later.
>>
>> If anyone has cycles right now to work on that detached BTF feature,
>releasing 1.23 as soon as that feature is complete and tested shouldn't
>be a problem.
>>
>> Then 1.23 the mandatory minimal version.
>>
>> Wdyt?
>
>If we make 1.22 mandatory there will be no good reason to make 1.23
>mandatory again. So I will have absolutely no inclination to work on
>this, for example. So we are just wasting a chance to clean up the
>Kbuild story w.r.t. pahole. And we are talking about just a few days
>at most, while we do have a reasonable work around on the kernel side.

So there were patches for stop using objcopy, which we thought could uncover some can of worms, were there patches for the detached BTF  file?

- Arnaldo

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 19:55       ` Arnaldo
@ 2021-05-27 20:41         ` Andrii Nakryiko
  2021-05-27 21:08           ` Jiri Olsa
  2021-05-28 19:45           ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 44+ messages in thread
From: Andrii Nakryiko @ 2021-05-27 20:41 UTC (permalink / raw)
  To: Arnaldo
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa,
	Michal Suchánek, dwarves, bpf, Kernel Team, Michael Petlan

On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
>
>
>
> On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >On Thu, May 27, 2021 at 12:06 PM Arnaldo <arnaldo.melo@gmail.com>
> >wrote:
> >>
> >>
> >>
> >> On May 27, 2021 1:54:40 PM GMT-03:00, Andrii Nakryiko
> ><andrii.nakryiko@gmail.com> wrote:
> >> >On Thu, May 27, 2021 at 8:20 AM Arnaldo Carvalho de Melo
> >> ><acme@kernel.org> wrote:
> >> >>
> >> >> Hi guys,
> >> >>
> >> >>         Its important to have 1.22 out of the door ASAP, so please
> >> >clone
> >> >> what is in tmp.master and report your results.
> >> >>
> >> >
> >> >Hey Arnaldo,
> >> >
> >> >If we are going to make pahole 1.22 a new mandatory minimal version
> >of
> >> >pahole, I think we should take a little bit of time and fix another
> >> >problematic issue and clean up Kbuild significantly.
> >> >
> >> >We discussed this before, it would be great to have an ability to
> >dump
> >> >generated BTF into a separate file instead of modifying vmlinux
> >image
> >> >in place. I'd say let's try to push for [0] to land as a temporary
> >> >work around to buy us a bit of time to implement this feature. Then,
> >> >when pahole 1.22 is released and packaged into major distros, we can
> >> >follow up in kernel with Kbuild clean ups and making pahole 1.22
> >> >mandatory.
> >> >
> >> >What do you think? If anyone agrees, please consider chiming in on
> >the
> >> >above thread ([0]).
> >>
> >> There's multiple fixes that affects lots of stakeholders, so I'm more
> >inclined to release 1.22 sooner rather than later.
> >>
> >> If anyone has cycles right now to work on that detached BTF feature,
> >releasing 1.23 as soon as that feature is complete and tested shouldn't
> >be a problem.
> >>
> >> Then 1.23 the mandatory minimal version.
> >>
> >> Wdyt?
> >
> >If we make 1.22 mandatory there will be no good reason to make 1.23
> >mandatory again. So I will have absolutely no inclination to work on
> >this, for example. So we are just wasting a chance to clean up the
> >Kbuild story w.r.t. pahole. And we are talking about just a few days
> >at most, while we do have a reasonable work around on the kernel side.
>
> So there were patches for stop using objcopy, which we thought could uncover some can of worms, were there patches for the detached BTF  file?

No, there weren't, if I remember correctly. What's the concern,
though? That detached BTF file isn't even an ELF, so it's
btf__get_raw_data() and write it to the file. Done.

>
> - Arnaldo
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 20:41         ` Andrii Nakryiko
@ 2021-05-27 21:08           ` Jiri Olsa
  2021-05-27 21:57             ` Arnaldo
  2021-05-28 19:45           ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 44+ messages in thread
From: Jiri Olsa @ 2021-05-27 21:08 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Arnaldo, Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa,
	Michal Suchánek, dwarves, bpf, Kernel Team, Michael Petlan

On Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko wrote:
> On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
> >
> >
> >
> > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >On Thu, May 27, 2021 at 12:06 PM Arnaldo <arnaldo.melo@gmail.com>
> > >wrote:
> > >>
> > >>
> > >>
> > >> On May 27, 2021 1:54:40 PM GMT-03:00, Andrii Nakryiko
> > ><andrii.nakryiko@gmail.com> wrote:
> > >> >On Thu, May 27, 2021 at 8:20 AM Arnaldo Carvalho de Melo
> > >> ><acme@kernel.org> wrote:
> > >> >>
> > >> >> Hi guys,
> > >> >>
> > >> >>         Its important to have 1.22 out of the door ASAP, so please
> > >> >clone
> > >> >> what is in tmp.master and report your results.
> > >> >>
> > >> >
> > >> >Hey Arnaldo,
> > >> >
> > >> >If we are going to make pahole 1.22 a new mandatory minimal version
> > >of
> > >> >pahole, I think we should take a little bit of time and fix another
> > >> >problematic issue and clean up Kbuild significantly.
> > >> >
> > >> >We discussed this before, it would be great to have an ability to
> > >dump
> > >> >generated BTF into a separate file instead of modifying vmlinux
> > >image
> > >> >in place. I'd say let's try to push for [0] to land as a temporary
> > >> >work around to buy us a bit of time to implement this feature. Then,
> > >> >when pahole 1.22 is released and packaged into major distros, we can
> > >> >follow up in kernel with Kbuild clean ups and making pahole 1.22
> > >> >mandatory.
> > >> >
> > >> >What do you think? If anyone agrees, please consider chiming in on
> > >the
> > >> >above thread ([0]).
> > >>
> > >> There's multiple fixes that affects lots of stakeholders, so I'm more
> > >inclined to release 1.22 sooner rather than later.
> > >>
> > >> If anyone has cycles right now to work on that detached BTF feature,
> > >releasing 1.23 as soon as that feature is complete and tested shouldn't
> > >be a problem.
> > >>
> > >> Then 1.23 the mandatory minimal version.
> > >>
> > >> Wdyt?
> > >
> > >If we make 1.22 mandatory there will be no good reason to make 1.23
> > >mandatory again. So I will have absolutely no inclination to work on
> > >this, for example. So we are just wasting a chance to clean up the
> > >Kbuild story w.r.t. pahole. And we are talking about just a few days
> > >at most, while we do have a reasonable work around on the kernel side.
> >
> > So there were patches for stop using objcopy, which we thought could uncover some can of worms, were there patches for the detached BTF  file?
> 
> No, there weren't, if I remember correctly. What's the concern,
> though? That detached BTF file isn't even an ELF, so it's
> btf__get_raw_data() and write it to the file. Done.

heya,
I probably overlooked this, but are there more details about that
detached BTF file feature somewhere? 

thanks,
jirka


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 21:08           ` Jiri Olsa
@ 2021-05-27 21:57             ` Arnaldo
  0 siblings, 0 replies; 44+ messages in thread
From: Arnaldo @ 2021-05-27 21:57 UTC (permalink / raw)
  To: Jiri Olsa, Andrii Nakryiko
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa,
	Michal Suchánek, dwarves, bpf, Kernel Team, Michael Petlan



On May 27, 2021 6:08:33 PM GMT-03:00, Jiri Olsa <jolsa@redhat.com> wrote:
>On Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko wrote:
>> On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com>
>wrote:
>> >
>> >
>> >
>> > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko
><andrii.nakryiko@gmail.com> wrote:
>> > >On Thu, May 27, 2021 at 12:06 PM Arnaldo <arnaldo.melo@gmail.com>
>> > >wrote:
>> > >>
>> > >>
>> > >>
>> > >> On May 27, 2021 1:54:40 PM GMT-03:00, Andrii Nakryiko
>> > ><andrii.nakryiko@gmail.com> wrote:
>> > >> >On Thu, May 27, 2021 at 8:20 AM Arnaldo Carvalho de Melo
>> > >> ><acme@kernel.org> wrote:
>> > >> >>
>> > >> >> Hi guys,
>> > >> >>
>> > >> >>         Its important to have 1.22 out of the door ASAP, so
>please
>> > >> >clone
>> > >> >> what is in tmp.master and report your results.
>> > >> >>
>> > >> >
>> > >> >Hey Arnaldo,
>> > >> >
>> > >> >If we are going to make pahole 1.22 a new mandatory minimal
>version
>> > >of
>> > >> >pahole, I think we should take a little bit of time and fix
>another
>> > >> >problematic issue and clean up Kbuild significantly.
>> > >> >
>> > >> >We discussed this before, it would be great to have an ability
>to
>> > >dump
>> > >> >generated BTF into a separate file instead of modifying vmlinux
>> > >image
>> > >> >in place. I'd say let's try to push for [0] to land as a
>temporary
>> > >> >work around to buy us a bit of time to implement this feature.
>Then,
>> > >> >when pahole 1.22 is released and packaged into major distros,
>we can
>> > >> >follow up in kernel with Kbuild clean ups and making pahole
>1.22
>> > >> >mandatory.
>> > >> >
>> > >> >What do you think? If anyone agrees, please consider chiming in
>on
>> > >the
>> > >> >above thread ([0]).
>> > >>
>> > >> There's multiple fixes that affects lots of stakeholders, so I'm
>more
>> > >inclined to release 1.22 sooner rather than later.
>> > >>
>> > >> If anyone has cycles right now to work on that detached BTF
>feature,
>> > >releasing 1.23 as soon as that feature is complete and tested
>shouldn't
>> > >be a problem.
>> > >>
>> > >> Then 1.23 the mandatory minimal version.
>> > >>
>> > >> Wdyt?
>> > >
>> > >If we make 1.22 mandatory there will be no good reason to make
>1.23
>> > >mandatory again. So I will have absolutely no inclination to work
>on
>> > >this, for example. So we are just wasting a chance to clean up the
>> > >Kbuild story w.r.t. pahole. And we are talking about just a few
>days
>> > >at most, while we do have a reasonable work around on the kernel
>side.
>> >
>> > So there were patches for stop using objcopy, which we thought
>could uncover some can of worms, were there patches for the detached
>BTF  file?
>> 
>> No, there weren't, if I remember correctly. What's the concern,
>> though? That detached BTF file isn't even an ELF, so it's
>> btf__get_raw_data() and write it to the file. Done.
>
>heya,
>I probably overlooked this, but are there more details about that
>detached BTF file feature somewhere? 


Look in the dwarves mailing list archives at lore, but it's just a new option to ask for the BTF data to be written to a file instead of to an ELF section, that will simplify the series of steps in the kernel building process.

I'll cook a patch early tomorrow.

- Arnaldo

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 20:41         ` Andrii Nakryiko
  2021-05-27 21:08           ` Jiri Olsa
@ 2021-05-28 19:45           ` Arnaldo Carvalho de Melo
  2021-05-29  2:36             ` Andrii Nakryiko
                               ` (2 more replies)
  1 sibling, 3 replies; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-05-28 19:45 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Arnaldo, Andrii Nakryiko, Jiri Olsa, Michal Suchánek,
	dwarves, bpf, Kernel Team, Michael Petlan

Em Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko escreveu:
> On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
> > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >If we make 1.22 mandatory there will be no good reason to make 1.23
> > >mandatory again. So I will have absolutely no inclination to work on
> > >this, for example. So we are just wasting a chance to clean up the
> > >Kbuild story w.r.t. pahole. And we are talking about just a few days
> > >at most, while we do have a reasonable work around on the kernel side.

> > So there were patches for stop using objcopy, which we thought could
> > uncover some can of worms, were there patches for the detached BTF
> > file?
 
> No, there weren't, if I remember correctly. What's the concern,
> though? That detached BTF file isn't even an ELF, so it's
> btf__get_raw_data() and write it to the file. Done.

See patch below, lightly tested, now working on making pahole accept raw
BTF files out of /sys/kernel/btf/

Please test, and if works as expected, try to bolt this into the kbuild
process, as intended.

- Arnaldo

commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Fri May 28 16:41:30 2021 -0300

    pahole: Allow encoding BTF into a detached file
    
    Previously the newly encoded BTF info was stored into a ELF section in
    the file where the DWARF info was obtained, but it is useful to just
    dump it into a separate file, do it.
    
    Requested-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/btf_encoder.c b/btf_encoder.c
index 033c927b537dad1e..bc3ac72968cea826 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -21,6 +21,14 @@
 #include <stdlib.h> /* for qsort() and bsearch() */
 #include <inttypes.h>
 
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+
+#include <unistd.h>
+
+#include <errno.h>
+
 /*
  * This corresponds to the same macro defined in
  * include/linux/kallsyms.h
@@ -267,14 +275,62 @@ static struct btf_elf *btfe;
 static uint32_t array_index_id;
 static bool has_index_type;
 
-int btf_encoder__encode()
+static int btf_encoder__dump(struct btf *btf, const char *filename)
+{
+	uint32_t raw_btf_size;
+	const void *raw_btf_data;
+	int fd, err;
+
+	/* Empty file, nothing to do, so... done! */
+	if (btf__get_nr_types(btf) == 0)
+		return 0;
+
+	if (btf__dedup(btf, NULL, NULL)) {
+		fprintf(stderr, "%s: btf__dedup failed!\n", __func__);
+		return -1;
+	}
+
+	raw_btf_data = btf__get_raw_data(btf, &raw_btf_size);
+	if (raw_btf_data == NULL) {
+                fprintf(stderr, "%s: btf__get_raw_data failed!\n", __func__);
+		return -1;
+	}
+
+	fd = open(filename, O_WRONLY | O_CREAT);
+	if (fd < 0) {
+                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
+		return -1;
+	}
+	err = write(fd, raw_btf_data, raw_btf_size);
+	if (err < 0)
+                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
+
+	close(fd);
+
+	if (err != raw_btf_size) {
+                fprintf(stderr, "%s: Could only write %d bytes to %s of raw BTF info out of %d, aborting\n", __func__, err, filename, raw_btf_size);
+		unlink(filename);
+		err = -1;
+	} else {
+		/* go from bytes written == raw_btf_size to an indication that all went fine */
+		err = 0;
+	}
+
+	return err;
+}
+
+int btf_encoder__encode(const char *filename)
 {
 	int err;
 
 	if (gobuffer__size(&btfe->percpu_secinfo) != 0)
 		btf_elf__add_datasec_type(btfe, PERCPU_SECTION, &btfe->percpu_secinfo);
 
-	err = btf_elf__encode(btfe, 0);
+	if (filename == NULL)
+		err = btf_elf__encode(btfe, 0);
+	else
+		err = btf_encoder__dump(btfe->btf, filename);
+
 	delete_functions();
 	btf_elf__delete(btfe);
 	btfe = NULL;
@@ -412,7 +468,7 @@ static bool has_arg_names(struct cu *cu, struct ftype *ftype)
 }
 
 int cu__encode_btf(struct cu *cu, int verbose, bool force,
-		   bool skip_encoding_vars)
+		   bool skip_encoding_vars, const char *detached_btf_filename)
 {
 	uint32_t type_id_off;
 	uint32_t core_id;
@@ -425,7 +481,7 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
 	btf_elf__force = force;
 
 	if (btfe && strcmp(btfe->filename, cu->filename)) {
-		err = btf_encoder__encode();
+		err = btf_encoder__encode(detached_btf_filename);
 		if (err)
 			goto out;
 
diff --git a/btf_encoder.h b/btf_encoder.h
index 46fb2312fc0eea9b..bfc6085092028adc 100644
--- a/btf_encoder.h
+++ b/btf_encoder.h
@@ -11,9 +11,9 @@
 
 struct cu;
 
-int btf_encoder__encode();
+int btf_encoder__encode(const char *filename);
 
 int cu__encode_btf(struct cu *cu, int verbose, bool force,
-		   bool skip_encoding_vars);
+		   bool skip_encoding_vars, const char *detached_btf_filename);
 
 #endif /* _BTF_ENCODER_H_ */
diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
index 2659fe6f231405d8..6e7ded59595f5ea7 100644
--- a/man-pages/pahole.1
+++ b/man-pages/pahole.1
@@ -189,6 +189,10 @@ features such as BPF CO-RE (Compile Once - Run Everywhere).
 
 See \fIhttps://nakryiko.com/posts/bpf-portability-and-co-re/\fR.
 
+.TP
+.B \-j, \-\-btf_encode_detached=FILENAME
+Same thing as -J/--btf_encode, but storing the raw BTF info into a separate file.
+
 .TP
 .B \-\-btf_encode_force
 Ignore those symbols found invalid when encoding BTF.
diff --git a/pahole.c b/pahole.c
index 24659e2969c8fb85..1cbff6175b60af51 100644
--- a/pahole.c
+++ b/pahole.c
@@ -26,6 +26,7 @@
 #include "lib/bpf/src/libbpf.h"
 #include "pahole_strings.h"
 
+static char *detached_btf_filename;
 static bool btf_encode;
 static bool ctf_encode;
 static bool first_obj_only;
@@ -1152,6 +1153,12 @@ static const struct argp_option pahole__options[] = {
 		.key  = 'J',
 		.doc  = "Encode as BTF",
 	},
+	{
+		.name = "btf_encode_detached",
+		.key  = 'j',
+		.arg  = "FILENAME",
+		.doc  = "Encode as BTF in a detached file",
+	},
 	{
 		.name = "skip_encoding_btf_vars",
 		.key  = ARGP_skip_encoding_btf_vars,
@@ -1223,6 +1230,7 @@ static error_t pahole__options_parser(int key, char *arg,
 		  conf_load.extra_dbg_info = 1;		break;
 	case 'i': find_containers = 1;
 		  class_name = arg;			break;
+	case 'j': detached_btf_filename = arg; // fallthru
 	case 'J': btf_encode = 1;
 		  conf_load.get_addr_info = true;
 		  no_bitfield_type_recode = true;	break;
@@ -2458,7 +2466,7 @@ static enum load_steal_kind pahole_stealer(struct cu *cu,
 
 	if (btf_encode) {
 		if (cu__encode_btf(cu, global_verbose, btf_encode_force,
-				   skip_encoding_btf_vars)) {
+				   skip_encoding_btf_vars, detached_btf_filename)) {
 			fprintf(stderr, "Encountered error while encoding BTF.\n");
 			exit(1);
 		}
@@ -2872,7 +2880,7 @@ try_sole_arg_as_class_names:
 	header = NULL;
 
 	if (btf_encode) {
-		err = btf_encoder__encode();
+		err = btf_encoder__encode(detached_btf_filename);
 		if (err) {
 			fputs("Failed to encode BTF\n", stderr);
 			goto out_cus_delete;

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-28 19:45           ` Arnaldo Carvalho de Melo
@ 2021-05-29  2:36             ` Andrii Nakryiko
  2021-06-01 18:31               ` Arnaldo Carvalho de Melo
  2021-05-30  0:40             ` Andrii Nakryiko
  2021-05-30 21:45             ` Jiri Olsa
  2 siblings, 1 reply; 44+ messages in thread
From: Andrii Nakryiko @ 2021-05-29  2:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Jiri Olsa, Michal Suchánek, dwarves, bpf,
	Kernel Team, Michael Petlan

On Fri, May 28, 2021 at 12:45 PM Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> Em Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko escreveu:
> > On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
> > > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > >If we make 1.22 mandatory there will be no good reason to make 1.23
> > > >mandatory again. So I will have absolutely no inclination to work on
> > > >this, for example. So we are just wasting a chance to clean up the
> > > >Kbuild story w.r.t. pahole. And we are talking about just a few days
> > > >at most, while we do have a reasonable work around on the kernel side.
>
> > > So there were patches for stop using objcopy, which we thought could
> > > uncover some can of worms, were there patches for the detached BTF
> > > file?
>
> > No, there weren't, if I remember correctly. What's the concern,
> > though? That detached BTF file isn't even an ELF, so it's
> > btf__get_raw_data() and write it to the file. Done.
>
> See patch below, lightly tested, now working on making pahole accept raw
> BTF files out of /sys/kernel/btf/

btf__parse() handles both ELF and raw BTF transparently. Then there is
btf__parse_elf() and btf__parse_raw() for specifically ELF or raw.

See all APIs at:
https://github.com/libbpf/libbpf/blob/master/src/btf.h#L40

>
> Please test, and if works as expected, try to bolt this into the kbuild
> process, as intended.

I'm on PTO today, will try to get to this soon-ish.

>
> - Arnaldo
>
> commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
> Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date:   Fri May 28 16:41:30 2021 -0300
>
>     pahole: Allow encoding BTF into a detached file
>
>     Previously the newly encoded BTF info was stored into a ELF section in
>     the file where the DWARF info was obtained, but it is useful to just
>     dump it into a separate file, do it.
>
>     Requested-by: Andrii Nakryiko <andrii@kernel.org>
>     Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 033c927b537dad1e..bc3ac72968cea826 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -21,6 +21,14 @@
>  #include <stdlib.h> /* for qsort() and bsearch() */
>  #include <inttypes.h>
>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +
> +#include <unistd.h>
> +
> +#include <errno.h>
> +
>  /*
>   * This corresponds to the same macro defined in
>   * include/linux/kallsyms.h
> @@ -267,14 +275,62 @@ static struct btf_elf *btfe;
>  static uint32_t array_index_id;
>  static bool has_index_type;
>
> -int btf_encoder__encode()
> +static int btf_encoder__dump(struct btf *btf, const char *filename)
> +{
> +       uint32_t raw_btf_size;
> +       const void *raw_btf_data;
> +       int fd, err;
> +
> +       /* Empty file, nothing to do, so... done! */
> +       if (btf__get_nr_types(btf) == 0)
> +               return 0;
> +
> +       if (btf__dedup(btf, NULL, NULL)) {
> +               fprintf(stderr, "%s: btf__dedup failed!\n", __func__);
> +               return -1;
> +       }
> +
> +       raw_btf_data = btf__get_raw_data(btf, &raw_btf_size);
> +       if (raw_btf_data == NULL) {
> +                fprintf(stderr, "%s: btf__get_raw_data failed!\n", __func__);
> +               return -1;
> +       }
> +
> +       fd = open(filename, O_WRONLY | O_CREAT);
> +       if (fd < 0) {
> +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
> +               return -1;
> +       }
> +       err = write(fd, raw_btf_data, raw_btf_size);
> +       if (err < 0)
> +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
> +
> +       close(fd);
> +
> +       if (err != raw_btf_size) {
> +                fprintf(stderr, "%s: Could only write %d bytes to %s of raw BTF info out of %d, aborting\n", __func__, err, filename, raw_btf_size);
> +               unlink(filename);
> +               err = -1;
> +       } else {
> +               /* go from bytes written == raw_btf_size to an indication that all went fine */
> +               err = 0;
> +       }
> +
> +       return err;
> +}
> +
> +int btf_encoder__encode(const char *filename)
>  {
>         int err;
>
>         if (gobuffer__size(&btfe->percpu_secinfo) != 0)
>                 btf_elf__add_datasec_type(btfe, PERCPU_SECTION, &btfe->percpu_secinfo);
>
> -       err = btf_elf__encode(btfe, 0);
> +       if (filename == NULL)
> +               err = btf_elf__encode(btfe, 0);
> +       else
> +               err = btf_encoder__dump(btfe->btf, filename);
> +
>         delete_functions();
>         btf_elf__delete(btfe);
>         btfe = NULL;
> @@ -412,7 +468,7 @@ static bool has_arg_names(struct cu *cu, struct ftype *ftype)
>  }
>
>  int cu__encode_btf(struct cu *cu, int verbose, bool force,
> -                  bool skip_encoding_vars)
> +                  bool skip_encoding_vars, const char *detached_btf_filename)
>  {
>         uint32_t type_id_off;
>         uint32_t core_id;
> @@ -425,7 +481,7 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
>         btf_elf__force = force;
>
>         if (btfe && strcmp(btfe->filename, cu->filename)) {
> -               err = btf_encoder__encode();
> +               err = btf_encoder__encode(detached_btf_filename);
>                 if (err)
>                         goto out;
>
> diff --git a/btf_encoder.h b/btf_encoder.h
> index 46fb2312fc0eea9b..bfc6085092028adc 100644
> --- a/btf_encoder.h
> +++ b/btf_encoder.h
> @@ -11,9 +11,9 @@
>
>  struct cu;
>
> -int btf_encoder__encode();
> +int btf_encoder__encode(const char *filename);
>
>  int cu__encode_btf(struct cu *cu, int verbose, bool force,
> -                  bool skip_encoding_vars);
> +                  bool skip_encoding_vars, const char *detached_btf_filename);
>
>  #endif /* _BTF_ENCODER_H_ */
> diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
> index 2659fe6f231405d8..6e7ded59595f5ea7 100644
> --- a/man-pages/pahole.1
> +++ b/man-pages/pahole.1
> @@ -189,6 +189,10 @@ features such as BPF CO-RE (Compile Once - Run Everywhere).
>
>  See \fIhttps://nakryiko.com/posts/bpf-portability-and-co-re/\fR.
>
> +.TP
> +.B \-j, \-\-btf_encode_detached=FILENAME
> +Same thing as -J/--btf_encode, but storing the raw BTF info into a separate file.
> +
>  .TP
>  .B \-\-btf_encode_force
>  Ignore those symbols found invalid when encoding BTF.
> diff --git a/pahole.c b/pahole.c
> index 24659e2969c8fb85..1cbff6175b60af51 100644
> --- a/pahole.c
> +++ b/pahole.c
> @@ -26,6 +26,7 @@
>  #include "lib/bpf/src/libbpf.h"
>  #include "pahole_strings.h"
>
> +static char *detached_btf_filename;
>  static bool btf_encode;
>  static bool ctf_encode;
>  static bool first_obj_only;
> @@ -1152,6 +1153,12 @@ static const struct argp_option pahole__options[] = {
>                 .key  = 'J',
>                 .doc  = "Encode as BTF",
>         },
> +       {
> +               .name = "btf_encode_detached",
> +               .key  = 'j',
> +               .arg  = "FILENAME",
> +               .doc  = "Encode as BTF in a detached file",
> +       },
>         {
>                 .name = "skip_encoding_btf_vars",
>                 .key  = ARGP_skip_encoding_btf_vars,
> @@ -1223,6 +1230,7 @@ static error_t pahole__options_parser(int key, char *arg,
>                   conf_load.extra_dbg_info = 1;         break;
>         case 'i': find_containers = 1;
>                   class_name = arg;                     break;
> +       case 'j': detached_btf_filename = arg; // fallthru
>         case 'J': btf_encode = 1;
>                   conf_load.get_addr_info = true;
>                   no_bitfield_type_recode = true;       break;
> @@ -2458,7 +2466,7 @@ static enum load_steal_kind pahole_stealer(struct cu *cu,
>
>         if (btf_encode) {
>                 if (cu__encode_btf(cu, global_verbose, btf_encode_force,
> -                                  skip_encoding_btf_vars)) {
> +                                  skip_encoding_btf_vars, detached_btf_filename)) {
>                         fprintf(stderr, "Encountered error while encoding BTF.\n");
>                         exit(1);
>                 }
> @@ -2872,7 +2880,7 @@ try_sole_arg_as_class_names:
>         header = NULL;
>
>         if (btf_encode) {
> -               err = btf_encoder__encode();
> +               err = btf_encoder__encode(detached_btf_filename);
>                 if (err) {
>                         fputs("Failed to encode BTF\n", stderr);
>                         goto out_cus_delete;

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-28 19:45           ` Arnaldo Carvalho de Melo
  2021-05-29  2:36             ` Andrii Nakryiko
@ 2021-05-30  0:40             ` Andrii Nakryiko
  2021-06-01 18:24               ` Arnaldo Carvalho de Melo
  2021-06-03 14:57               ` Arnaldo Carvalho de Melo
  2021-05-30 21:45             ` Jiri Olsa
  2 siblings, 2 replies; 44+ messages in thread
From: Andrii Nakryiko @ 2021-05-30  0:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Jiri Olsa, Michal Suchánek, dwarves, bpf,
	Kernel Team, Michael Petlan

On Fri, May 28, 2021 at 12:45 PM Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> Em Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko escreveu:
> > On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
> > > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > >If we make 1.22 mandatory there will be no good reason to make 1.23
> > > >mandatory again. So I will have absolutely no inclination to work on
> > > >this, for example. So we are just wasting a chance to clean up the
> > > >Kbuild story w.r.t. pahole. And we are talking about just a few days
> > > >at most, while we do have a reasonable work around on the kernel side.
>
> > > So there were patches for stop using objcopy, which we thought could
> > > uncover some can of worms, were there patches for the detached BTF
> > > file?
>
> > No, there weren't, if I remember correctly. What's the concern,
> > though? That detached BTF file isn't even an ELF, so it's
> > btf__get_raw_data() and write it to the file. Done.
>
> See patch below, lightly tested, now working on making pahole accept raw
> BTF files out of /sys/kernel/btf/
>
> Please test, and if works as expected, try to bolt this into the kbuild
> process, as intended.

So while looking through this I found --skip_encoding_btf_vars and I
just sent a fix to disable per-CPU var BTF generation for versions
1.18 through 1.21. I think that's a better solution than all the
previously proposed ones. But it also means we have no good reason to
force 1.22+ as minimal version.

But in either case, this is good feature and will definitely be useful
going forward. See my comments below.

>
> - Arnaldo
>
> commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
> Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date:   Fri May 28 16:41:30 2021 -0300
>
>     pahole: Allow encoding BTF into a detached file
>
>     Previously the newly encoded BTF info was stored into a ELF section in
>     the file where the DWARF info was obtained, but it is useful to just
>     dump it into a separate file, do it.
>
>     Requested-by: Andrii Nakryiko <andrii@kernel.org>
>     Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>

Looks good, see few minor comments below. At some point it probably
would make sense to formalize "btf_encoder" as a struct with its own
state instead of passing in multiple variables. It would probably also
allow to parallelize BTF generation, where each CU would proceed in
parallel generating local BTF, and then the final pass would merge and
dedup BTFs. Currently reading and processing DWARF is the slowest part
of the DWARF-to-BTF conversion, parallelization and maybe some other
optimization seems like the only way to speed the process up.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> diff --git a/btf_encoder.c b/btf_encoder.c
> index 033c927b537dad1e..bc3ac72968cea826 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -21,6 +21,14 @@
>  #include <stdlib.h> /* for qsort() and bsearch() */
>  #include <inttypes.h>
>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +
> +#include <unistd.h>
> +
> +#include <errno.h>
> +
>  /*
>   * This corresponds to the same macro defined in
>   * include/linux/kallsyms.h
> @@ -267,14 +275,62 @@ static struct btf_elf *btfe;
>  static uint32_t array_index_id;
>  static bool has_index_type;
>
> -int btf_encoder__encode()
> +static int btf_encoder__dump(struct btf *btf, const char *filename)
> +{
> +       uint32_t raw_btf_size;
> +       const void *raw_btf_data;
> +       int fd, err;
> +
> +       /* Empty file, nothing to do, so... done! */
> +       if (btf__get_nr_types(btf) == 0)
> +               return 0;
> +
> +       if (btf__dedup(btf, NULL, NULL)) {
> +               fprintf(stderr, "%s: btf__dedup failed!\n", __func__);
> +               return -1;
> +       }
> +
> +       raw_btf_data = btf__get_raw_data(btf, &raw_btf_size);
> +       if (raw_btf_data == NULL) {
> +                fprintf(stderr, "%s: btf__get_raw_data failed!\n", __func__);

indentation seems off here and in few places below

> +               return -1;
> +       }
> +
> +       fd = open(filename, O_WRONLY | O_CREAT);
> +       if (fd < 0) {
> +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
> +               return -1;
> +       }
> +       err = write(fd, raw_btf_data, raw_btf_size);
> +       if (err < 0)
> +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));

nit: copy-pasted error message is wrong

> +
> +       close(fd);
> +
> +       if (err != raw_btf_size) {
> +                fprintf(stderr, "%s: Could only write %d bytes to %s of raw BTF info out of %d, aborting\n", __func__, err, filename, raw_btf_size);
> +               unlink(filename);
> +               err = -1;
> +       } else {
> +               /* go from bytes written == raw_btf_size to an indication that all went fine */
> +               err = 0;
> +       }
> +
> +       return err;
> +}
> +

[...]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-28 19:45           ` Arnaldo Carvalho de Melo
  2021-05-29  2:36             ` Andrii Nakryiko
  2021-05-30  0:40             ` Andrii Nakryiko
@ 2021-05-30 21:45             ` Jiri Olsa
  2021-06-01 19:07               ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 44+ messages in thread
From: Jiri Olsa @ 2021-05-30 21:45 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Andrii Nakryiko, Jiri Olsa,
	Michal Suchánek, dwarves, bpf, Kernel Team, Michael Petlan

On Fri, May 28, 2021 at 04:45:31PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko escreveu:
> > On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
> > > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > >If we make 1.22 mandatory there will be no good reason to make 1.23
> > > >mandatory again. So I will have absolutely no inclination to work on
> > > >this, for example. So we are just wasting a chance to clean up the
> > > >Kbuild story w.r.t. pahole. And we are talking about just a few days
> > > >at most, while we do have a reasonable work around on the kernel side.
> 
> > > So there were patches for stop using objcopy, which we thought could
> > > uncover some can of worms, were there patches for the detached BTF
> > > file?
>  
> > No, there weren't, if I remember correctly. What's the concern,
> > though? That detached BTF file isn't even an ELF, so it's
> > btf__get_raw_data() and write it to the file. Done.
> 
> See patch below, lightly tested, now working on making pahole accept raw
> BTF files out of /sys/kernel/btf/
> 
> Please test, and if works as expected, try to bolt this into the kbuild
> process, as intended.
> 
> - Arnaldo
> 
> commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
> Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date:   Fri May 28 16:41:30 2021 -0300
> 
>     pahole: Allow encoding BTF into a detached file
>     
>     Previously the newly encoded BTF info was stored into a ELF section in
>     the file where the DWARF info was obtained, but it is useful to just
>     dump it into a separate file, do it.
>     
>     Requested-by: Andrii Nakryiko <andrii@kernel.org>
>     Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 033c927b537dad1e..bc3ac72968cea826 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -21,6 +21,14 @@
>  #include <stdlib.h> /* for qsort() and bsearch() */
>  #include <inttypes.h>
>  
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +
> +#include <unistd.h>
> +
> +#include <errno.h>
> +
>  /*
>   * This corresponds to the same macro defined in
>   * include/linux/kallsyms.h
> @@ -267,14 +275,62 @@ static struct btf_elf *btfe;
>  static uint32_t array_index_id;
>  static bool has_index_type;
>  
> -int btf_encoder__encode()
> +static int btf_encoder__dump(struct btf *btf, const char *filename)
> +{
> +	uint32_t raw_btf_size;
> +	const void *raw_btf_data;
> +	int fd, err;
> +
> +	/* Empty file, nothing to do, so... done! */
> +	if (btf__get_nr_types(btf) == 0)
> +		return 0;
> +
> +	if (btf__dedup(btf, NULL, NULL)) {
> +		fprintf(stderr, "%s: btf__dedup failed!\n", __func__);
> +		return -1;
> +	}
> +
> +	raw_btf_data = btf__get_raw_data(btf, &raw_btf_size);
> +	if (raw_btf_data == NULL) {
> +                fprintf(stderr, "%s: btf__get_raw_data failed!\n", __func__);
> +		return -1;
> +	}
> +
> +	fd = open(filename, O_WRONLY | O_CREAT);

I think you need to specify file mode as 3rd param:

	$ ./pahole -j krava vmlinux 
	$ ~/linux/tools/bpf/bpftool/bpftool btf dump file ./krava 
	Error: failed to load BTF from ./krava: Permission denied
	$ ll krava 
	--w-r-Sr-T. 1 jolsa jolsa 4525734 May 30 23:38 krava

jirka


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-30  0:40             ` Andrii Nakryiko
@ 2021-06-01 18:24               ` Arnaldo Carvalho de Melo
  2021-06-03 14:57               ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-01 18:24 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa,
	Michal Suchánek, dwarves, bpf, Kernel Team, Michael Petlan

Em Sat, May 29, 2021 at 05:40:17PM -0700, Andrii Nakryiko escreveu:
> On Fri, May 28, 2021 at 12:45 PM Arnaldo Carvalho de Melo
> <arnaldo.melo@gmail.com> wrote:
> >
> > Em Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko escreveu:
> > > On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
> > > > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > > >If we make 1.22 mandatory there will be no good reason to make 1.23
> > > > >mandatory again. So I will have absolutely no inclination to work on
> > > > >this, for example. So we are just wasting a chance to clean up the
> > > > >Kbuild story w.r.t. pahole. And we are talking about just a few days
> > > > >at most, while we do have a reasonable work around on the kernel side.
> >
> > > > So there were patches for stop using objcopy, which we thought could
> > > > uncover some can of worms, were there patches for the detached BTF
> > > > file?
> >
> > > No, there weren't, if I remember correctly. What's the concern,
> > > though? That detached BTF file isn't even an ELF, so it's
> > > btf__get_raw_data() and write it to the file. Done.
> >
> > See patch below, lightly tested, now working on making pahole accept raw
> > BTF files out of /sys/kernel/btf/
> >
> > Please test, and if works as expected, try to bolt this into the kbuild
> > process, as intended.
> 
> So while looking through this I found --skip_encoding_btf_vars and I
> just sent a fix to disable per-CPU var BTF generation for versions
> 1.18 through 1.21. I think that's a better solution than all the

That is already in akpm's tree, cool.

I also forgot that I asked Han to have a way to disable this new feature
as it had gone thru several back and forths so could still have some
problem.

> previously proposed ones. But it also means we have no good reason to
> force 1.22+ as minimal version.

> But in either case, this is good feature and will definitely be useful
> going forward. See my comments below.

Sure
 
> > commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
> > Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Date:   Fri May 28 16:41:30 2021 -0300
> >
> >     pahole: Allow encoding BTF into a detached file
> >
> >     Previously the newly encoded BTF info was stored into a ELF section in
> >     the file where the DWARF info was obtained, but it is useful to just
> >     dump it into a separate file, do it.
> >
> >     Requested-by: Andrii Nakryiko <andrii@kernel.org>
> >     Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> >
> 
> Looks good, see few minor comments below. At some point it probably
> would make sense to formalize "btf_encoder" as a struct with its own
> state instead of passing in multiple variables. It would probably also

Yeah, this all was made in haste, to have features out of the door ASAP,
etc. I hate global variables and this code is full of it.

> allow to parallelize BTF generation, where each CU would proceed in
> parallel generating local BTF, and then the final pass would merge and
> dedup BTFs. Currently reading and processing DWARF is the slowest part

yeah, would be wonderful to have someone working on this.

> of the DWARF-to-BTF conversion, parallelization and maybe some other
> optimization seems like the only way to speed the process up.

agreed
 
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> 
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 033c927b537dad1e..bc3ac72968cea826 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -21,6 +21,14 @@
> >  #include <stdlib.h> /* for qsort() and bsearch() */
> >  #include <inttypes.h>
> >
> > +#include <sys/types.h>
> > +#include <sys/stat.h>
> > +#include <fcntl.h>
> > +
> > +#include <unistd.h>
> > +
> > +#include <errno.h>
> > +
> >  /*
> >   * This corresponds to the same macro defined in
> >   * include/linux/kallsyms.h
> > @@ -267,14 +275,62 @@ static struct btf_elf *btfe;
> >  static uint32_t array_index_id;
> >  static bool has_index_type;
> >
> > -int btf_encoder__encode()
> > +static int btf_encoder__dump(struct btf *btf, const char *filename)
> > +{
> > +       uint32_t raw_btf_size;
> > +       const void *raw_btf_data;
> > +       int fd, err;
> > +
> > +       /* Empty file, nothing to do, so... done! */
> > +       if (btf__get_nr_types(btf) == 0)
> > +               return 0;
> > +
> > +       if (btf__dedup(btf, NULL, NULL)) {
> > +               fprintf(stderr, "%s: btf__dedup failed!\n", __func__);
> > +               return -1;
> > +       }
> > +
> > +       raw_btf_data = btf__get_raw_data(btf, &raw_btf_size);
> > +       if (raw_btf_data == NULL) {
> > +                fprintf(stderr, "%s: btf__get_raw_data failed!\n", __func__);
> 
> indentation seems off here and in few places below

Yeah, I fixed those now
 
> > +               return -1;
> > +       }
> > +
> > +       fd = open(filename, O_WRONLY | O_CREAT);
> > +       if (fd < 0) {
> > +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
> > +               return -1;
> > +       }
> > +       err = write(fd, raw_btf_data, raw_btf_size);
> > +       if (err < 0)
> > +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
> 
> nit: copy-pasted error message is wrong

fixed
 
> > +
> > +       close(fd);
> > +
> > +       if (err != raw_btf_size) {
> > +                fprintf(stderr, "%s: Could only write %d bytes to %s of raw BTF info out of %d, aborting\n", __func__, err, filename, raw_btf_size);
> > +               unlink(filename);
> > +               err = -1;
> > +       } else {
> > +               /* go from bytes written == raw_btf_size to an indication that all went fine */
> > +               err = 0;
> > +       }
> > +
> > +       return err;
> > +}
> > +
> 
> [...]

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-29  2:36             ` Andrii Nakryiko
@ 2021-06-01 18:31               ` Arnaldo Carvalho de Melo
  2021-06-01 18:32                 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-01 18:31 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa,
	Michal Suchánek, dwarves, bpf, Kernel Team, Michael Petlan

Em Fri, May 28, 2021 at 07:36:25PM -0700, Andrii Nakryiko escreveu:
> On Fri, May 28, 2021 at 12:45 PM Arnaldo Carvalho de Melo
> <arnaldo.melo@gmail.com> wrote:
> >
> > Em Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko escreveu:
> > > On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
> > > > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > > >If we make 1.22 mandatory there will be no good reason to make 1.23
> > > > >mandatory again. So I will have absolutely no inclination to work on
> > > > >this, for example. So we are just wasting a chance to clean up the
> > > > >Kbuild story w.r.t. pahole. And we are talking about just a few days
> > > > >at most, while we do have a reasonable work around on the kernel side.
> >
> > > > So there were patches for stop using objcopy, which we thought could
> > > > uncover some can of worms, were there patches for the detached BTF
> > > > file?
> >
> > > No, there weren't, if I remember correctly. What's the concern,
> > > though? That detached BTF file isn't even an ELF, so it's
> > > btf__get_raw_data() and write it to the file. Done.
> >
> > See patch below, lightly tested, now working on making pahole accept raw
> > BTF files out of /sys/kernel/btf/
> 
> btf__parse() handles both ELF and raw BTF transparently. Then there is
> btf__parse_elf() and btf__parse_raw() for specifically ELF or raw.
> 
> See all APIs at:
> https://github.com/libbpf/libbpf/blob/master/src/btf.h#L40

Ok, I was using:

int btf_elf__load(struct btf_elf *btfe)
{
        int err;

        libbpf_set_print(libbpf_log);

        /* free initial empty BTF */
        btf__free(btfe->btf);
        if (btfe->raw_btf)
                btfe->btf = btf__parse_raw_split(btfe->filename, btfe->base_btf);
        else
                btfe->btf = btf__parse_elf_split(btfe->filename, btfe->base_btf);

        err = libbpf_get_error(btfe->btf);
        if (err)
                return err;

        return 0;
}


but btf__parse(path, btfext_ptr) doesn't allow me to apss the base btf,
lemme see if there is a btf__parse_split...

- Arnaldo
 
> >
> > Please test, and if works as expected, try to bolt this into the kbuild
> > process, as intended.
> 
> I'm on PTO today, will try to get to this soon-ish.
> 
> >
> > - Arnaldo
> >
> > commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
> > Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Date:   Fri May 28 16:41:30 2021 -0300
> >
> >     pahole: Allow encoding BTF into a detached file
> >
> >     Previously the newly encoded BTF info was stored into a ELF section in
> >     the file where the DWARF info was obtained, but it is useful to just
> >     dump it into a separate file, do it.
> >
> >     Requested-by: Andrii Nakryiko <andrii@kernel.org>
> >     Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> >
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 033c927b537dad1e..bc3ac72968cea826 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -21,6 +21,14 @@
> >  #include <stdlib.h> /* for qsort() and bsearch() */
> >  #include <inttypes.h>
> >
> > +#include <sys/types.h>
> > +#include <sys/stat.h>
> > +#include <fcntl.h>
> > +
> > +#include <unistd.h>
> > +
> > +#include <errno.h>
> > +
> >  /*
> >   * This corresponds to the same macro defined in
> >   * include/linux/kallsyms.h
> > @@ -267,14 +275,62 @@ static struct btf_elf *btfe;
> >  static uint32_t array_index_id;
> >  static bool has_index_type;
> >
> > -int btf_encoder__encode()
> > +static int btf_encoder__dump(struct btf *btf, const char *filename)
> > +{
> > +       uint32_t raw_btf_size;
> > +       const void *raw_btf_data;
> > +       int fd, err;
> > +
> > +       /* Empty file, nothing to do, so... done! */
> > +       if (btf__get_nr_types(btf) == 0)
> > +               return 0;
> > +
> > +       if (btf__dedup(btf, NULL, NULL)) {
> > +               fprintf(stderr, "%s: btf__dedup failed!\n", __func__);
> > +               return -1;
> > +       }
> > +
> > +       raw_btf_data = btf__get_raw_data(btf, &raw_btf_size);
> > +       if (raw_btf_data == NULL) {
> > +                fprintf(stderr, "%s: btf__get_raw_data failed!\n", __func__);
> > +               return -1;
> > +       }
> > +
> > +       fd = open(filename, O_WRONLY | O_CREAT);
> > +       if (fd < 0) {
> > +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
> > +               return -1;
> > +       }
> > +       err = write(fd, raw_btf_data, raw_btf_size);
> > +       if (err < 0)
> > +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
> > +
> > +       close(fd);
> > +
> > +       if (err != raw_btf_size) {
> > +                fprintf(stderr, "%s: Could only write %d bytes to %s of raw BTF info out of %d, aborting\n", __func__, err, filename, raw_btf_size);
> > +               unlink(filename);
> > +               err = -1;
> > +       } else {
> > +               /* go from bytes written == raw_btf_size to an indication that all went fine */
> > +               err = 0;
> > +       }
> > +
> > +       return err;
> > +}
> > +
> > +int btf_encoder__encode(const char *filename)
> >  {
> >         int err;
> >
> >         if (gobuffer__size(&btfe->percpu_secinfo) != 0)
> >                 btf_elf__add_datasec_type(btfe, PERCPU_SECTION, &btfe->percpu_secinfo);
> >
> > -       err = btf_elf__encode(btfe, 0);
> > +       if (filename == NULL)
> > +               err = btf_elf__encode(btfe, 0);
> > +       else
> > +               err = btf_encoder__dump(btfe->btf, filename);
> > +
> >         delete_functions();
> >         btf_elf__delete(btfe);
> >         btfe = NULL;
> > @@ -412,7 +468,7 @@ static bool has_arg_names(struct cu *cu, struct ftype *ftype)
> >  }
> >
> >  int cu__encode_btf(struct cu *cu, int verbose, bool force,
> > -                  bool skip_encoding_vars)
> > +                  bool skip_encoding_vars, const char *detached_btf_filename)
> >  {
> >         uint32_t type_id_off;
> >         uint32_t core_id;
> > @@ -425,7 +481,7 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
> >         btf_elf__force = force;
> >
> >         if (btfe && strcmp(btfe->filename, cu->filename)) {
> > -               err = btf_encoder__encode();
> > +               err = btf_encoder__encode(detached_btf_filename);
> >                 if (err)
> >                         goto out;
> >
> > diff --git a/btf_encoder.h b/btf_encoder.h
> > index 46fb2312fc0eea9b..bfc6085092028adc 100644
> > --- a/btf_encoder.h
> > +++ b/btf_encoder.h
> > @@ -11,9 +11,9 @@
> >
> >  struct cu;
> >
> > -int btf_encoder__encode();
> > +int btf_encoder__encode(const char *filename);
> >
> >  int cu__encode_btf(struct cu *cu, int verbose, bool force,
> > -                  bool skip_encoding_vars);
> > +                  bool skip_encoding_vars, const char *detached_btf_filename);
> >
> >  #endif /* _BTF_ENCODER_H_ */
> > diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
> > index 2659fe6f231405d8..6e7ded59595f5ea7 100644
> > --- a/man-pages/pahole.1
> > +++ b/man-pages/pahole.1
> > @@ -189,6 +189,10 @@ features such as BPF CO-RE (Compile Once - Run Everywhere).
> >
> >  See \fIhttps://nakryiko.com/posts/bpf-portability-and-co-re/\fR.
> >
> > +.TP
> > +.B \-j, \-\-btf_encode_detached=FILENAME
> > +Same thing as -J/--btf_encode, but storing the raw BTF info into a separate file.
> > +
> >  .TP
> >  .B \-\-btf_encode_force
> >  Ignore those symbols found invalid when encoding BTF.
> > diff --git a/pahole.c b/pahole.c
> > index 24659e2969c8fb85..1cbff6175b60af51 100644
> > --- a/pahole.c
> > +++ b/pahole.c
> > @@ -26,6 +26,7 @@
> >  #include "lib/bpf/src/libbpf.h"
> >  #include "pahole_strings.h"
> >
> > +static char *detached_btf_filename;
> >  static bool btf_encode;
> >  static bool ctf_encode;
> >  static bool first_obj_only;
> > @@ -1152,6 +1153,12 @@ static const struct argp_option pahole__options[] = {
> >                 .key  = 'J',
> >                 .doc  = "Encode as BTF",
> >         },
> > +       {
> > +               .name = "btf_encode_detached",
> > +               .key  = 'j',
> > +               .arg  = "FILENAME",
> > +               .doc  = "Encode as BTF in a detached file",
> > +       },
> >         {
> >                 .name = "skip_encoding_btf_vars",
> >                 .key  = ARGP_skip_encoding_btf_vars,
> > @@ -1223,6 +1230,7 @@ static error_t pahole__options_parser(int key, char *arg,
> >                   conf_load.extra_dbg_info = 1;         break;
> >         case 'i': find_containers = 1;
> >                   class_name = arg;                     break;
> > +       case 'j': detached_btf_filename = arg; // fallthru
> >         case 'J': btf_encode = 1;
> >                   conf_load.get_addr_info = true;
> >                   no_bitfield_type_recode = true;       break;
> > @@ -2458,7 +2466,7 @@ static enum load_steal_kind pahole_stealer(struct cu *cu,
> >
> >         if (btf_encode) {
> >                 if (cu__encode_btf(cu, global_verbose, btf_encode_force,
> > -                                  skip_encoding_btf_vars)) {
> > +                                  skip_encoding_btf_vars, detached_btf_filename)) {
> >                         fprintf(stderr, "Encountered error while encoding BTF.\n");
> >                         exit(1);
> >                 }
> > @@ -2872,7 +2880,7 @@ try_sole_arg_as_class_names:
> >         header = NULL;
> >
> >         if (btf_encode) {
> > -               err = btf_encoder__encode();
> > +               err = btf_encoder__encode(detached_btf_filename);
> >                 if (err) {
> >                         fputs("Failed to encode BTF\n", stderr);
> >                         goto out_cus_delete;

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-06-01 18:31               ` Arnaldo Carvalho de Melo
@ 2021-06-01 18:32                 ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-01 18:32 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa,
	Michal Suchánek, dwarves, bpf, Kernel Team, Michael Petlan

Em Tue, Jun 01, 2021 at 03:31:58PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, May 28, 2021 at 07:36:25PM -0700, Andrii Nakryiko escreveu:
> > On Fri, May 28, 2021 at 12:45 PM Arnaldo Carvalho de Melo
> > <arnaldo.melo@gmail.com> wrote:
> > >
> > > Em Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko escreveu:
> > > > On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
> > > > > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > > > >If we make 1.22 mandatory there will be no good reason to make 1.23
> > > > > >mandatory again. So I will have absolutely no inclination to work on
> > > > > >this, for example. So we are just wasting a chance to clean up the
> > > > > >Kbuild story w.r.t. pahole. And we are talking about just a few days
> > > > > >at most, while we do have a reasonable work around on the kernel side.
> > >
> > > > > So there were patches for stop using objcopy, which we thought could
> > > > > uncover some can of worms, were there patches for the detached BTF
> > > > > file?
> > >
> > > > No, there weren't, if I remember correctly. What's the concern,
> > > > though? That detached BTF file isn't even an ELF, so it's
> > > > btf__get_raw_data() and write it to the file. Done.
> > >
> > > See patch below, lightly tested, now working on making pahole accept raw
> > > BTF files out of /sys/kernel/btf/
> > 
> > btf__parse() handles both ELF and raw BTF transparently. Then there is
> > btf__parse_elf() and btf__parse_raw() for specifically ELF or raw.
> > 
> > See all APIs at:
> > https://github.com/libbpf/libbpf/blob/master/src/btf.h#L40
> 
> Ok, I was using:
> 
> int btf_elf__load(struct btf_elf *btfe)
> {
>         int err;
> 
>         libbpf_set_print(libbpf_log);
> 
>         /* free initial empty BTF */
>         btf__free(btfe->btf);
>         if (btfe->raw_btf)
>                 btfe->btf = btf__parse_raw_split(btfe->filename, btfe->base_btf);
>         else
>                 btfe->btf = btf__parse_elf_split(btfe->filename, btfe->base_btf);
> 
>         err = libbpf_get_error(btfe->btf);
>         if (err)
>                 return err;
> 
>         return 0;
> }
> 
> 
> but btf__parse(path, btfext_ptr) doesn't allow me to apss the base btf,
> lemme see if there is a btf__parse_split...

yeap, using that then...
 
> - Arnaldo
>  
> > >
> > > Please test, and if works as expected, try to bolt this into the kbuild
> > > process, as intended.
> > 
> > I'm on PTO today, will try to get to this soon-ish.
> > 
> > >
> > > - Arnaldo
> > >
> > > commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
> > > Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > Date:   Fri May 28 16:41:30 2021 -0300
> > >
> > >     pahole: Allow encoding BTF into a detached file
> > >
> > >     Previously the newly encoded BTF info was stored into a ELF section in
> > >     the file where the DWARF info was obtained, but it is useful to just
> > >     dump it into a separate file, do it.
> > >
> > >     Requested-by: Andrii Nakryiko <andrii@kernel.org>
> > >     Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > >
> > > diff --git a/btf_encoder.c b/btf_encoder.c
> > > index 033c927b537dad1e..bc3ac72968cea826 100644
> > > --- a/btf_encoder.c
> > > +++ b/btf_encoder.c
> > > @@ -21,6 +21,14 @@
> > >  #include <stdlib.h> /* for qsort() and bsearch() */
> > >  #include <inttypes.h>
> > >
> > > +#include <sys/types.h>
> > > +#include <sys/stat.h>
> > > +#include <fcntl.h>
> > > +
> > > +#include <unistd.h>
> > > +
> > > +#include <errno.h>
> > > +
> > >  /*
> > >   * This corresponds to the same macro defined in
> > >   * include/linux/kallsyms.h
> > > @@ -267,14 +275,62 @@ static struct btf_elf *btfe;
> > >  static uint32_t array_index_id;
> > >  static bool has_index_type;
> > >
> > > -int btf_encoder__encode()
> > > +static int btf_encoder__dump(struct btf *btf, const char *filename)
> > > +{
> > > +       uint32_t raw_btf_size;
> > > +       const void *raw_btf_data;
> > > +       int fd, err;
> > > +
> > > +       /* Empty file, nothing to do, so... done! */
> > > +       if (btf__get_nr_types(btf) == 0)
> > > +               return 0;
> > > +
> > > +       if (btf__dedup(btf, NULL, NULL)) {
> > > +               fprintf(stderr, "%s: btf__dedup failed!\n", __func__);
> > > +               return -1;
> > > +       }
> > > +
> > > +       raw_btf_data = btf__get_raw_data(btf, &raw_btf_size);
> > > +       if (raw_btf_data == NULL) {
> > > +                fprintf(stderr, "%s: btf__get_raw_data failed!\n", __func__);
> > > +               return -1;
> > > +       }
> > > +
> > > +       fd = open(filename, O_WRONLY | O_CREAT);
> > > +       if (fd < 0) {
> > > +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
> > > +               return -1;
> > > +       }
> > > +       err = write(fd, raw_btf_data, raw_btf_size);
> > > +       if (err < 0)
> > > +                fprintf(stderr, "%s: Couldn't open %s for writing the raw BTF info: %s\n", __func__, filename, strerror(errno));
> > > +
> > > +       close(fd);
> > > +
> > > +       if (err != raw_btf_size) {
> > > +                fprintf(stderr, "%s: Could only write %d bytes to %s of raw BTF info out of %d, aborting\n", __func__, err, filename, raw_btf_size);
> > > +               unlink(filename);
> > > +               err = -1;
> > > +       } else {
> > > +               /* go from bytes written == raw_btf_size to an indication that all went fine */
> > > +               err = 0;
> > > +       }
> > > +
> > > +       return err;
> > > +}
> > > +
> > > +int btf_encoder__encode(const char *filename)
> > >  {
> > >         int err;
> > >
> > >         if (gobuffer__size(&btfe->percpu_secinfo) != 0)
> > >                 btf_elf__add_datasec_type(btfe, PERCPU_SECTION, &btfe->percpu_secinfo);
> > >
> > > -       err = btf_elf__encode(btfe, 0);
> > > +       if (filename == NULL)
> > > +               err = btf_elf__encode(btfe, 0);
> > > +       else
> > > +               err = btf_encoder__dump(btfe->btf, filename);
> > > +
> > >         delete_functions();
> > >         btf_elf__delete(btfe);
> > >         btfe = NULL;
> > > @@ -412,7 +468,7 @@ static bool has_arg_names(struct cu *cu, struct ftype *ftype)
> > >  }
> > >
> > >  int cu__encode_btf(struct cu *cu, int verbose, bool force,
> > > -                  bool skip_encoding_vars)
> > > +                  bool skip_encoding_vars, const char *detached_btf_filename)
> > >  {
> > >         uint32_t type_id_off;
> > >         uint32_t core_id;
> > > @@ -425,7 +481,7 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
> > >         btf_elf__force = force;
> > >
> > >         if (btfe && strcmp(btfe->filename, cu->filename)) {
> > > -               err = btf_encoder__encode();
> > > +               err = btf_encoder__encode(detached_btf_filename);
> > >                 if (err)
> > >                         goto out;
> > >
> > > diff --git a/btf_encoder.h b/btf_encoder.h
> > > index 46fb2312fc0eea9b..bfc6085092028adc 100644
> > > --- a/btf_encoder.h
> > > +++ b/btf_encoder.h
> > > @@ -11,9 +11,9 @@
> > >
> > >  struct cu;
> > >
> > > -int btf_encoder__encode();
> > > +int btf_encoder__encode(const char *filename);
> > >
> > >  int cu__encode_btf(struct cu *cu, int verbose, bool force,
> > > -                  bool skip_encoding_vars);
> > > +                  bool skip_encoding_vars, const char *detached_btf_filename);
> > >
> > >  #endif /* _BTF_ENCODER_H_ */
> > > diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
> > > index 2659fe6f231405d8..6e7ded59595f5ea7 100644
> > > --- a/man-pages/pahole.1
> > > +++ b/man-pages/pahole.1
> > > @@ -189,6 +189,10 @@ features such as BPF CO-RE (Compile Once - Run Everywhere).
> > >
> > >  See \fIhttps://nakryiko.com/posts/bpf-portability-and-co-re/\fR.
> > >
> > > +.TP
> > > +.B \-j, \-\-btf_encode_detached=FILENAME
> > > +Same thing as -J/--btf_encode, but storing the raw BTF info into a separate file.
> > > +
> > >  .TP
> > >  .B \-\-btf_encode_force
> > >  Ignore those symbols found invalid when encoding BTF.
> > > diff --git a/pahole.c b/pahole.c
> > > index 24659e2969c8fb85..1cbff6175b60af51 100644
> > > --- a/pahole.c
> > > +++ b/pahole.c
> > > @@ -26,6 +26,7 @@
> > >  #include "lib/bpf/src/libbpf.h"
> > >  #include "pahole_strings.h"
> > >
> > > +static char *detached_btf_filename;
> > >  static bool btf_encode;
> > >  static bool ctf_encode;
> > >  static bool first_obj_only;
> > > @@ -1152,6 +1153,12 @@ static const struct argp_option pahole__options[] = {
> > >                 .key  = 'J',
> > >                 .doc  = "Encode as BTF",
> > >         },
> > > +       {
> > > +               .name = "btf_encode_detached",
> > > +               .key  = 'j',
> > > +               .arg  = "FILENAME",
> > > +               .doc  = "Encode as BTF in a detached file",
> > > +       },
> > >         {
> > >                 .name = "skip_encoding_btf_vars",
> > >                 .key  = ARGP_skip_encoding_btf_vars,
> > > @@ -1223,6 +1230,7 @@ static error_t pahole__options_parser(int key, char *arg,
> > >                   conf_load.extra_dbg_info = 1;         break;
> > >         case 'i': find_containers = 1;
> > >                   class_name = arg;                     break;
> > > +       case 'j': detached_btf_filename = arg; // fallthru
> > >         case 'J': btf_encode = 1;
> > >                   conf_load.get_addr_info = true;
> > >                   no_bitfield_type_recode = true;       break;
> > > @@ -2458,7 +2466,7 @@ static enum load_steal_kind pahole_stealer(struct cu *cu,
> > >
> > >         if (btf_encode) {
> > >                 if (cu__encode_btf(cu, global_verbose, btf_encode_force,
> > > -                                  skip_encoding_btf_vars)) {
> > > +                                  skip_encoding_btf_vars, detached_btf_filename)) {
> > >                         fprintf(stderr, "Encountered error while encoding BTF.\n");
> > >                         exit(1);
> > >                 }
> > > @@ -2872,7 +2880,7 @@ try_sole_arg_as_class_names:
> > >         header = NULL;
> > >
> > >         if (btf_encode) {
> > > -               err = btf_encoder__encode();
> > > +               err = btf_encoder__encode(detached_btf_filename);
> > >                 if (err) {
> > >                         fputs("Failed to encode BTF\n", stderr);
> > >                         goto out_cus_delete;
> 
> -- 
> 
> - Arnaldo

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-30 21:45             ` Jiri Olsa
@ 2021-06-01 19:07               ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-01 19:07 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Andrii Nakryiko,
	Jiri Olsa, Michal Suchánek, dwarves, bpf, Kernel Team,
	Michael Petlan

Em Sun, May 30, 2021 at 11:45:15PM +0200, Jiri Olsa escreveu:
> On Fri, May 28, 2021 at 04:45:31PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, May 27, 2021 at 01:41:13PM -0700, Andrii Nakryiko escreveu:
> > > On Thu, May 27, 2021 at 12:57 PM Arnaldo <arnaldo.melo@gmail.com> wrote:
> > > > On May 27, 2021 4:14:17 PM GMT-03:00, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > > >If we make 1.22 mandatory there will be no good reason to make 1.23
> > > > >mandatory again. So I will have absolutely no inclination to work on
> > > > >this, for example. So we are just wasting a chance to clean up the
> > > > >Kbuild story w.r.t. pahole. And we are talking about just a few days
> > > > >at most, while we do have a reasonable work around on the kernel side.
> > 
> > > > So there were patches for stop using objcopy, which we thought could
> > > > uncover some can of worms, were there patches for the detached BTF
> > > > file?
> >  
> > > No, there weren't, if I remember correctly. What's the concern,
> > > though? That detached BTF file isn't even an ELF, so it's
> > > btf__get_raw_data() and write it to the file. Done.
> > 
> > See patch below, lightly tested, now working on making pahole accept raw
> > BTF files out of /sys/kernel/btf/
> > 
> > Please test, and if works as expected, try to bolt this into the kbuild
> > process, as intended.
> > 
> > - Arnaldo
> > 
> > commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
> > Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Date:   Fri May 28 16:41:30 2021 -0300
> > 
> >     pahole: Allow encoding BTF into a detached file
> >     
> >     Previously the newly encoded BTF info was stored into a ELF section in
> >     the file where the DWARF info was obtained, but it is useful to just
> >     dump it into a separate file, do it.
> >     
> >     Requested-by: Andrii Nakryiko <andrii@kernel.org>
> >     Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > 
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 033c927b537dad1e..bc3ac72968cea826 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -21,6 +21,14 @@
> >  #include <stdlib.h> /* for qsort() and bsearch() */
> >  #include <inttypes.h>
> >  
> > +#include <sys/types.h>
> > +#include <sys/stat.h>
> > +#include <fcntl.h>
> > +
> > +#include <unistd.h>
> > +
> > +#include <errno.h>
> > +
> >  /*
> >   * This corresponds to the same macro defined in
> >   * include/linux/kallsyms.h
> > @@ -267,14 +275,62 @@ static struct btf_elf *btfe;
> >  static uint32_t array_index_id;
> >  static bool has_index_type;
> >  
> > -int btf_encoder__encode()
> > +static int btf_encoder__dump(struct btf *btf, const char *filename)
> > +{
> > +	uint32_t raw_btf_size;
> > +	const void *raw_btf_data;
> > +	int fd, err;
> > +
> > +	/* Empty file, nothing to do, so... done! */
> > +	if (btf__get_nr_types(btf) == 0)
> > +		return 0;
> > +
> > +	if (btf__dedup(btf, NULL, NULL)) {
> > +		fprintf(stderr, "%s: btf__dedup failed!\n", __func__);
> > +		return -1;
> > +	}
> > +
> > +	raw_btf_data = btf__get_raw_data(btf, &raw_btf_size);
> > +	if (raw_btf_data == NULL) {
> > +                fprintf(stderr, "%s: btf__get_raw_data failed!\n", __func__);
> > +		return -1;
> > +	}
> > +
> > +	fd = open(filename, O_WRONLY | O_CREAT);
> 
> I think you need to specify file mode as 3rd param:

Right you are! I've read this comment from you and, left it for later
and then when I used btf__parse_split("vmlinux.btf", NULL) as Andrii
suggested, I got a EPERM, erm? tried
btf__parse-split("/sys/kernel/btf/vmlinux", NULL), it worked, humm,
yeah, Jiri mentioned something about weird modes... Fixed it using
'chmod 644' and it worked.

Now fixing it by adding the 3rd param, thanks!

- Arnaldo
 
> 	$ ./pahole -j krava vmlinux 
> 	$ ~/linux/tools/bpf/bpftool/bpftool btf dump file ./krava 
> 	Error: failed to load BTF from ./krava: Permission denied
> 	$ ll krava 
> 	--w-r-Sr-T. 1 jolsa jolsa 4525734 May 30 23:38 krava
> 
> jirka
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 15:20 [RFT] Testing 1.22 Arnaldo Carvalho de Melo
  2021-05-27 16:54 ` Andrii Nakryiko
@ 2021-06-02 10:21 ` Michael Petlan
  2021-07-15 21:31 ` Michal Suchánek
  2 siblings, 0 replies; 44+ messages in thread
From: Michael Petlan @ 2021-06-02 10:21 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Jiri Olsa, Michal Suchánek, dwarves, bpf,
	kernel-team

On Thu, 27 May 2021, Arnaldo Carvalho de Melo wrote:
> Hi guys,
> 
> 	Its important to have 1.22 out of the door ASAP, so please clone
> what is in tmp.master and report your results.
> 
[...]
> 
> Then make sure build/pahole is in your path and try your workloads.
> 
> Jiri, Michael, if you could run your tests with this, that would be awesome,

Hi Arnaldo!

I am sorry, but I don't think I have any tests covering this.

Michael
> 
> Thanks in advance!
> 
> - Arnaldo
> 
> 


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-30  0:40             ` Andrii Nakryiko
  2021-06-01 18:24               ` Arnaldo Carvalho de Melo
@ 2021-06-03 14:57               ` Arnaldo Carvalho de Melo
  2021-06-05  2:55                 ` Andrii Nakryiko
  1 sibling, 1 reply; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-03 14:57 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: Andrii Nakryiko, Jiri Olsa, dwarves, bpf, Kernel Team

Em Sat, May 29, 2021 at 05:40:17PM -0700, Andrii Nakryiko escreveu:
> On Fri, May 28, 2021 at 12:45 PM Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> wrote:
> > commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
> > Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Date:   Fri May 28 16:41:30 2021 -0300
> >
> >     pahole: Allow encoding BTF into a detached file
> >
> >     Previously the newly encoded BTF info was stored into a ELF section in
> >     the file where the DWARF info was obtained, but it is useful to just
> >     dump it into a separate file, do it.
> >
> >     Requested-by: Andrii Nakryiko <andrii@kernel.org>
> >     Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> >
> 
> Looks good, see few minor comments below. At some point it probably
> would make sense to formalize "btf_encoder" as a struct with its own
> state instead of passing in multiple variables. It would probably also

Take a look at the tmp.master branch at:

https://git.kernel.org/pub/scm/devel/pahole/pahole.git/log/?h=tmp.master

that btf_elf class isn't used anymore by btf_loader, that uses only
libbpf's APIs, and now we have a btf_encoder class with all the globals,
etc, more baby steps are needed to finally ditch btf_elf altogether and
move on to the parallelization.

I'm doing 'pahole -J vmlinux && btfdiff' after each cset and doing it
very piecemeal as I'm doing will help bisecting any subtle bug this may
introduce.

> allow to parallelize BTF generation, where each CU would proceed in
> parallel generating local BTF, and then the final pass would merge and
> dedup BTFs. Currently reading and processing DWARF is the slowest part
> of the DWARF-to-BTF conversion, parallelization and maybe some other
> optimization seems like the only way to speed the process up.
> 
> Acked-by: Andrii Nakryiko <andrii@kernel.org>

Thanks!

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-06-03 14:57               ` Arnaldo Carvalho de Melo
@ 2021-06-05  2:55                 ` Andrii Nakryiko
  2021-06-07 13:20                   ` Parallelizing vmlinux BTF encoding. was " Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 44+ messages in thread
From: Andrii Nakryiko @ 2021-06-05  2:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Jiri Olsa, dwarves, bpf, Kernel Team

On Thu, Jun 3, 2021 at 7:57 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> Em Sat, May 29, 2021 at 05:40:17PM -0700, Andrii Nakryiko escreveu:
> > On Fri, May 28, 2021 at 12:45 PM Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> wrote:
> > > commit b579a18a1ea0ee84b90b5302f597dda2edf2f61b
> > > Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > Date:   Fri May 28 16:41:30 2021 -0300
> > >
> > >     pahole: Allow encoding BTF into a detached file
> > >
> > >     Previously the newly encoded BTF info was stored into a ELF section in
> > >     the file where the DWARF info was obtained, but it is useful to just
> > >     dump it into a separate file, do it.
> > >
> > >     Requested-by: Andrii Nakryiko <andrii@kernel.org>
> > >     Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > >
> >
> > Looks good, see few minor comments below. At some point it probably
> > would make sense to formalize "btf_encoder" as a struct with its own
> > state instead of passing in multiple variables. It would probably also
>
> Take a look at the tmp.master branch at:
>
> https://git.kernel.org/pub/scm/devel/pahole/pahole.git/log/?h=tmp.master

Oh wow, that's a lot of commits! :) Great that you decided to do this
refactoring, thanks!

>
> that btf_elf class isn't used anymore by btf_loader, that uses only
> libbpf's APIs, and now we have a btf_encoder class with all the globals,
> etc, more baby steps are needed to finally ditch btf_elf altogether and
> move on to the parallelization.

So do you plan to try to parallelize as a next step? I'm pretty
confident about BTF encoding part: dump each CU into its own BTF, use
btf__add_type() to merge multiple BTFs together. Just need to re-map
IDs (libbpf internally has API to visit each field that contains
type_id, it's well-defined enough to expose that as a public API, if
necessary). Then final btf_dedup().

But the DWARF loading and parsing part is almost a black box to me, so
I'm not sure how much work it would involve.

>
> I'm doing 'pahole -J vmlinux && btfdiff' after each cset and doing it
> very piecemeal as I'm doing will help bisecting any subtle bug this may
> introduce.
>
> > allow to parallelize BTF generation, where each CU would proceed in
> > parallel generating local BTF, and then the final pass would merge and
> > dedup BTFs. Currently reading and processing DWARF is the slowest part
> > of the DWARF-to-BTF conversion, parallelization and maybe some other
> > optimization seems like the only way to speed the process up.
> >
> > Acked-by: Andrii Nakryiko <andrii@kernel.org>
>
> Thanks!
>
> - Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-05  2:55                 ` Andrii Nakryiko
@ 2021-06-07 13:20                   ` Arnaldo Carvalho de Melo
  2021-06-07 15:42                     ` Yonghong Song
  2021-06-08  0:53                     ` Andrii Nakryiko
  0 siblings, 2 replies; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-07 13:20 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, Jiri Olsa, dwarves, bpf, Kernel Team,
	Linux Kernel Mailing List

Em Fri, Jun 04, 2021 at 07:55:17PM -0700, Andrii Nakryiko escreveu:
> On Thu, Jun 3, 2021 at 7:57 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > Em Sat, May 29, 2021 at 05:40:17PM -0700, Andrii Nakryiko escreveu:

> > > At some point it probably would make sense to formalize
> > > "btf_encoder" as a struct with its own state instead of passing in
> > > multiple variables. It would probably also

> > Take a look at the tmp.master branch at:

> > https://git.kernel.org/pub/scm/devel/pahole/pahole.git/log/?h=tmp.master
 
> Oh wow, that's a lot of commits! :) Great that you decided to do this
> refactoring, thanks!
 
> > that btf_elf class isn't used anymore by btf_loader, that uses only
> > libbpf's APIs, and now we have a btf_encoder class with all the globals,
> > etc, more baby steps are needed to finally ditch btf_elf altogether and
> > move on to the parallelization.
 
> So do you plan to try to parallelize as a next step? I'm pretty

So, I haven't looked at details but what I thought would be interesting
to investigate is to see if we can piggyback DWARF generation with BTF
one, i.e. when we generate a .o file with -g we encode the DWARF info,
so, right after this, we could call pahole as-is and encode BTF, then,
when vmlinux is linked, we would do the dedup.

I.e. when generating ../build/v5.13.0-rc4+/kernel/fork.o, that comes
with:

⬢[acme@toolbox perf]$ readelf -SW ../build/v5.13.0-rc4+/kernel/fork.o | grep debug
  [78] .debug_info       PROGBITS        0000000000000000 00daec 032968 00      0   0  1
  [79] .rela.debug_info  RELA            0000000000000000 040458 053b68 18   I 95  78  8
  [80] .debug_abbrev     PROGBITS        0000000000000000 093fc0 0012e9 00      0   0  1
  [81] .debug_loclists   PROGBITS        0000000000000000 0952a9 00aa43 00      0   0  1
  [82] .rela.debug_loclists RELA         0000000000000000 09fcf0 009d98 18   I 95  81  8
  [83] .debug_aranges    PROGBITS        0000000000000000 0a9a88 000080 00      0   0  1
  [84] .rela.debug_aranges RELA          0000000000000000 0a9b08 0000a8 18   I 95  83  8
  [85] .debug_rnglists   PROGBITS        0000000000000000 0a9bb0 001509 00      0   0  1
  [86] .rela.debug_rnglists RELA         0000000000000000 0ab0c0 001bc0 18   I 95  85  8
  [87] .debug_line       PROGBITS        0000000000000000 0acc80 0086b7 00      0   0  1
  [88] .rela.debug_line  RELA            0000000000000000 0b5338 002550 18   I 95  87  8
  [89] .debug_str        PROGBITS        0000000000000000 0b7888 0177ad 01  MS  0   0  1
  [90] .debug_line_str   PROGBITS        0000000000000000 0cf035 001308 01  MS  0   0  1
  [93] .debug_frame      PROGBITS        0000000000000000 0d0370 000e38 00      0   0  8
  [94] .rela.debug_frame RELA            0000000000000000 0d11a8 000e70 18   I 95  93  8
⬢[acme@toolbox perf]$

We would do:

⬢[acme@toolbox perf]$ pahole -J ../build/v5.13.0-rc4+/kernel/fork.o
⬢[acme@toolbox perf]$

Which would get us to have:

⬢[acme@toolbox perf]$ readelf -SW ../build/v5.13.0-rc4+/kernel/fork.o | grep BTF
  [103] .BTF              PROGBITS        0000000000000000 0db658 030550 00      0   0  1
⬢[acme@toolbox perf]

⬢[acme@toolbox perf]$ pahole -F btf -C hlist_node ../build/v5.13.0-rc4+/kernel/fork.o 
struct hlist_node {
	struct hlist_node *        next;                 /*     0     8 */
	struct hlist_node * *      pprev;                /*     8     8 */

	/* size: 16, cachelines: 1, members: 2 */
	/* last cacheline: 16 bytes */
};
⬢[acme@toolbox perf]$ 

So, a 'pahole --dedup_btf vmlinux' would just go on looking at:

⬢[acme@toolbox perf]$ readelf -wi ../build/v5.13.0-rc4+/vmlinux | grep -A10 DW_TAG_compile_unit | grep -w DW_AT_name | grep fork
    <f220eb>   DW_AT_name        : (indirect line string, offset: 0x62e7): /var/home/acme/git/linux/kernel/fork.c

To go there and go on extracting those ELF sections to combine and
dedup.

This combine thing could be done even by the linker, I think, when all
the DWARF data in the .o file are combined into vmlinux, we could do it
for the .BTF sections as well, that way would be even more elegant, I
think. Then, the combined vmlinux .BTF section would be read and fed in
one go to libbtf's dedup arg.

This way the encoding of BTF would be as paralellized as the kernel build
process, following the same logic (-j NR_PROCESSORS).

wdyt?

If this isn't the case, we can process vmlinux as is today and go on
creating N threads and feeding each with a DW_TAG_compile_unit
"container", i.e. each thread would consume all the tags below each
DW_TAG_compile_unit and produce a foo.BTF file that in the end would be
combined and deduped by libbpf.

Doing it as my first sketch above would take advantage of locality of
reference, i.e. the DWARF data would be freshly produced and in the
cache hierarchy when we first encode BTF, later, when doing the
combine+dedup we wouldn't be touching the more voluminous DWARF data.

- Arnaldo

> confident about BTF encoding part: dump each CU into its own BTF, use
> btf__add_type() to merge multiple BTFs together. Just need to re-map
> IDs (libbpf internally has API to visit each field that contains
> type_id, it's well-defined enough to expose that as a public API, if
> necessary). Then final btf_dedup().
 
> But the DWARF loading and parsing part is almost a black box to me, so
> I'm not sure how much work it would involve.

> > I'm doing 'pahole -J vmlinux && btfdiff' after each cset and doing it
> > very piecemeal as I'm doing will help bisecting any subtle bug this may
> > introduce.

> > > allow to parallelize BTF generation, where each CU would proceed in
> > > parallel generating local BTF, and then the final pass would merge and
> > > dedup BTFs. Currently reading and processing DWARF is the slowest part
> > > of the DWARF-to-BTF conversion, parallelization and maybe some other
> > > optimization seems like the only way to speed the process up.

> > > Acked-by: Andrii Nakryiko <andrii@kernel.org>

> > Thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-07 13:20                   ` Parallelizing vmlinux BTF encoding. was " Arnaldo Carvalho de Melo
@ 2021-06-07 15:42                     ` Yonghong Song
  2021-06-08  0:53                     ` Andrii Nakryiko
  1 sibling, 0 replies; 44+ messages in thread
From: Yonghong Song @ 2021-06-07 15:42 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Andrii Nakryiko
  Cc: Andrii Nakryiko, Jiri Olsa, dwarves, bpf, Kernel Team,
	Linux Kernel Mailing List



On 6/7/21 6:20 AM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Jun 04, 2021 at 07:55:17PM -0700, Andrii Nakryiko escreveu:
>> On Thu, Jun 3, 2021 at 7:57 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>> Em Sat, May 29, 2021 at 05:40:17PM -0700, Andrii Nakryiko escreveu:
> 
>>>> At some point it probably would make sense to formalize
>>>> "btf_encoder" as a struct with its own state instead of passing in
>>>> multiple variables. It would probably also
> 
>>> Take a look at the tmp.master branch at:
> 
>>> https://git.kernel.org/pub/scm/devel/pahole/pahole.git/log/?h=tmp.master
>   
>> Oh wow, that's a lot of commits! :) Great that you decided to do this
>> refactoring, thanks!
>   
>>> that btf_elf class isn't used anymore by btf_loader, that uses only
>>> libbpf's APIs, and now we have a btf_encoder class with all the globals,
>>> etc, more baby steps are needed to finally ditch btf_elf altogether and
>>> move on to the parallelization.
>   
>> So do you plan to try to parallelize as a next step? I'm pretty
> 
> So, I haven't looked at details but what I thought would be interesting
> to investigate is to see if we can piggyback DWARF generation with BTF
> one, i.e. when we generate a .o file with -g we encode the DWARF info,
> so, right after this, we could call pahole as-is and encode BTF, then,
> when vmlinux is linked, we would do the dedup.
> 
> I.e. when generating ../build/v5.13.0-rc4+/kernel/fork.o, that comes
> with:
> 
> ⬢[acme@toolbox perf]$ readelf -SW ../build/v5.13.0-rc4+/kernel/fork.o | grep debug
>    [78] .debug_info       PROGBITS        0000000000000000 00daec 032968 00      0   0  1
>    [79] .rela.debug_info  RELA            0000000000000000 040458 053b68 18   I 95  78  8
>    [80] .debug_abbrev     PROGBITS        0000000000000000 093fc0 0012e9 00      0   0  1
>    [81] .debug_loclists   PROGBITS        0000000000000000 0952a9 00aa43 00      0   0  1
>    [82] .rela.debug_loclists RELA         0000000000000000 09fcf0 009d98 18   I 95  81  8
>    [83] .debug_aranges    PROGBITS        0000000000000000 0a9a88 000080 00      0   0  1
>    [84] .rela.debug_aranges RELA          0000000000000000 0a9b08 0000a8 18   I 95  83  8
>    [85] .debug_rnglists   PROGBITS        0000000000000000 0a9bb0 001509 00      0   0  1
>    [86] .rela.debug_rnglists RELA         0000000000000000 0ab0c0 001bc0 18   I 95  85  8
>    [87] .debug_line       PROGBITS        0000000000000000 0acc80 0086b7 00      0   0  1
>    [88] .rela.debug_line  RELA            0000000000000000 0b5338 002550 18   I 95  87  8
>    [89] .debug_str        PROGBITS        0000000000000000 0b7888 0177ad 01  MS  0   0  1
>    [90] .debug_line_str   PROGBITS        0000000000000000 0cf035 001308 01  MS  0   0  1
>    [93] .debug_frame      PROGBITS        0000000000000000 0d0370 000e38 00      0   0  8
>    [94] .rela.debug_frame RELA            0000000000000000 0d11a8 000e70 18   I 95  93  8
> ⬢[acme@toolbox perf]$
> 
> We would do:
> 
> ⬢[acme@toolbox perf]$ pahole -J ../build/v5.13.0-rc4+/kernel/fork.o
> ⬢[acme@toolbox perf]$
> 
> Which would get us to have:
> 
> ⬢[acme@toolbox perf]$ readelf -SW ../build/v5.13.0-rc4+/kernel/fork.o | grep BTF
>    [103] .BTF              PROGBITS        0000000000000000 0db658 030550 00      0   0  1
> ⬢[acme@toolbox perf]
> 
> ⬢[acme@toolbox perf]$ pahole -F btf -C hlist_node ../build/v5.13.0-rc4+/kernel/fork.o
> struct hlist_node {
> 	struct hlist_node *        next;                 /*     0     8 */
> 	struct hlist_node * *      pprev;                /*     8     8 */
> 
> 	/* size: 16, cachelines: 1, members: 2 */
> 	/* last cacheline: 16 bytes */
> };
> ⬢[acme@toolbox perf]$
> 
> So, a 'pahole --dedup_btf vmlinux' would just go on looking at:
> 
> ⬢[acme@toolbox perf]$ readelf -wi ../build/v5.13.0-rc4+/vmlinux | grep -A10 DW_TAG_compile_unit | grep -w DW_AT_name | grep fork
>      <f220eb>   DW_AT_name        : (indirect line string, offset: 0x62e7): /var/home/acme/git/linux/kernel/fork.c
> 
> To go there and go on extracting those ELF sections to combine and
> dedup.
> 
> This combine thing could be done even by the linker, I think, when all
> the DWARF data in the .o file are combined into vmlinux, we could do it
> for the .BTF sections as well, that way would be even more elegant, I

The linker will do the combine. It should just concatenate
all .BTF sections together like
    .BTF section
       .BTF data from file 1
       .BTF data from file 2
       ...

> think. Then, the combined vmlinux .BTF section would be read and fed in
> one go to libbtf's dedup arg.

I think this should work based on today's implementation but we do have
a caveat here.

The issue is related to DATASEC's. In DATASEC, we tried to encode
section offset for variables. These section offset should be
relocated during linking stage. But currently pahole does not
generate reloations for such variables so linker will ignore
them.

This shouldn't be an issue for global variables as we can find its
name in VAR and look up final symbol table for its section offset.

But this might be an issue for static variables with the same
name and just matching names in VAR is not enough as their
may be multiple ones in the symbol table. We could have a
workaround though, e.g., rename all static variables with a unique name
like <file_name>.[<func_name>.]<var_name> and went to dwarf
to find this static variable offset. dwarf should have
static variable section offset properly relocated.

Another solution is for pahole to generate .rel.BTF which
encodes relocations.

Currently, we don't emit static variables in vmlinux BTF (only
percpu globals), but not sure whether in the future we still
won't.

> 
> This way the encoding of BTF would be as paralellized as the kernel build
> process, following the same logic (-j NR_PROCESSORS).
> 
> wdyt?
> 
> If this isn't the case, we can process vmlinux as is today and go on
> creating N threads and feeding each with a DW_TAG_compile_unit
> "container", i.e. each thread would consume all the tags below each
> DW_TAG_compile_unit and produce a foo.BTF file that in the end would be
> combined and deduped by libbpf.
> 
> Doing it as my first sketch above would take advantage of locality of
> reference, i.e. the DWARF data would be freshly produced and in the
> cache hierarchy when we first encode BTF, later, when doing the
> combine+dedup we wouldn't be touching the more voluminous DWARF data.
> 
> - Arnaldo
> 
>> confident about BTF encoding part: dump each CU into its own BTF, use
>> btf__add_type() to merge multiple BTFs together. Just need to re-map
>> IDs (libbpf internally has API to visit each field that contains
>> type_id, it's well-defined enough to expose that as a public API, if
>> necessary). Then final btf_dedup().
>   
>> But the DWARF loading and parsing part is almost a black box to me, so
>> I'm not sure how much work it would involve.
> 
>>> I'm doing 'pahole -J vmlinux && btfdiff' after each cset and doing it
>>> very piecemeal as I'm doing will help bisecting any subtle bug this may
>>> introduce.
> 
>>>> allow to parallelize BTF generation, where each CU would proceed in
>>>> parallel generating local BTF, and then the final pass would merge and
>>>> dedup BTFs. Currently reading and processing DWARF is the slowest part
>>>> of the DWARF-to-BTF conversion, parallelization and maybe some other
>>>> optimization seems like the only way to speed the process up.
> 
>>>> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> 
>>> Thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-07 13:20                   ` Parallelizing vmlinux BTF encoding. was " Arnaldo Carvalho de Melo
  2021-06-07 15:42                     ` Yonghong Song
@ 2021-06-08  0:53                     ` Andrii Nakryiko
  2021-06-08 12:59                       ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 44+ messages in thread
From: Andrii Nakryiko @ 2021-06-08  0:53 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Jiri Olsa, dwarves, bpf, Kernel Team,
	Linux Kernel Mailing List

On Mon, Jun 7, 2021 at 6:20 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> Em Fri, Jun 04, 2021 at 07:55:17PM -0700, Andrii Nakryiko escreveu:
> > On Thu, Jun 3, 2021 at 7:57 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > Em Sat, May 29, 2021 at 05:40:17PM -0700, Andrii Nakryiko escreveu:
>
> > > > At some point it probably would make sense to formalize
> > > > "btf_encoder" as a struct with its own state instead of passing in
> > > > multiple variables. It would probably also
>
> > > Take a look at the tmp.master branch at:
>
> > > https://git.kernel.org/pub/scm/devel/pahole/pahole.git/log/?h=tmp.master
>
> > Oh wow, that's a lot of commits! :) Great that you decided to do this
> > refactoring, thanks!
>
> > > that btf_elf class isn't used anymore by btf_loader, that uses only
> > > libbpf's APIs, and now we have a btf_encoder class with all the globals,
> > > etc, more baby steps are needed to finally ditch btf_elf altogether and
> > > move on to the parallelization.
>
> > So do you plan to try to parallelize as a next step? I'm pretty
>
> So, I haven't looked at details but what I thought would be interesting
> to investigate is to see if we can piggyback DWARF generation with BTF
> one, i.e. when we generate a .o file with -g we encode the DWARF info,
> so, right after this, we could call pahole as-is and encode BTF, then,
> when vmlinux is linked, we would do the dedup.
>
> I.e. when generating ../build/v5.13.0-rc4+/kernel/fork.o, that comes
> with:
>
> ⬢[acme@toolbox perf]$ readelf -SW ../build/v5.13.0-rc4+/kernel/fork.o | grep debug
>   [78] .debug_info       PROGBITS        0000000000000000 00daec 032968 00      0   0  1
>   [79] .rela.debug_info  RELA            0000000000000000 040458 053b68 18   I 95  78  8
>   [80] .debug_abbrev     PROGBITS        0000000000000000 093fc0 0012e9 00      0   0  1
>   [81] .debug_loclists   PROGBITS        0000000000000000 0952a9 00aa43 00      0   0  1
>   [82] .rela.debug_loclists RELA         0000000000000000 09fcf0 009d98 18   I 95  81  8
>   [83] .debug_aranges    PROGBITS        0000000000000000 0a9a88 000080 00      0   0  1
>   [84] .rela.debug_aranges RELA          0000000000000000 0a9b08 0000a8 18   I 95  83  8
>   [85] .debug_rnglists   PROGBITS        0000000000000000 0a9bb0 001509 00      0   0  1
>   [86] .rela.debug_rnglists RELA         0000000000000000 0ab0c0 001bc0 18   I 95  85  8
>   [87] .debug_line       PROGBITS        0000000000000000 0acc80 0086b7 00      0   0  1
>   [88] .rela.debug_line  RELA            0000000000000000 0b5338 002550 18   I 95  87  8
>   [89] .debug_str        PROGBITS        0000000000000000 0b7888 0177ad 01  MS  0   0  1
>   [90] .debug_line_str   PROGBITS        0000000000000000 0cf035 001308 01  MS  0   0  1
>   [93] .debug_frame      PROGBITS        0000000000000000 0d0370 000e38 00      0   0  8
>   [94] .rela.debug_frame RELA            0000000000000000 0d11a8 000e70 18   I 95  93  8
> ⬢[acme@toolbox perf]$
>
> We would do:
>
> ⬢[acme@toolbox perf]$ pahole -J ../build/v5.13.0-rc4+/kernel/fork.o
> ⬢[acme@toolbox perf]$
>
> Which would get us to have:
>
> ⬢[acme@toolbox perf]$ readelf -SW ../build/v5.13.0-rc4+/kernel/fork.o | grep BTF
>   [103] .BTF              PROGBITS        0000000000000000 0db658 030550 00      0   0  1
> ⬢[acme@toolbox perf]
>
> ⬢[acme@toolbox perf]$ pahole -F btf -C hlist_node ../build/v5.13.0-rc4+/kernel/fork.o
> struct hlist_node {
>         struct hlist_node *        next;                 /*     0     8 */
>         struct hlist_node * *      pprev;                /*     8     8 */
>
>         /* size: 16, cachelines: 1, members: 2 */
>         /* last cacheline: 16 bytes */
> };
> ⬢[acme@toolbox perf]$
>
> So, a 'pahole --dedup_btf vmlinux' would just go on looking at:
>
> ⬢[acme@toolbox perf]$ readelf -wi ../build/v5.13.0-rc4+/vmlinux | grep -A10 DW_TAG_compile_unit | grep -w DW_AT_name | grep fork
>     <f220eb>   DW_AT_name        : (indirect line string, offset: 0x62e7): /var/home/acme/git/linux/kernel/fork.c
>
> To go there and go on extracting those ELF sections to combine and
> dedup.
>
> This combine thing could be done even by the linker, I think, when all
> the DWARF data in the .o file are combined into vmlinux, we could do it
> for the .BTF sections as well, that way would be even more elegant, I
> think. Then, the combined vmlinux .BTF section would be read and fed in
> one go to libbtf's dedup arg.
>
> This way the encoding of BTF would be as paralellized as the kernel build
> process, following the same logic (-j NR_PROCESSORS).
>
> wdyt?

I think it's very fragile and it will be easy to get
broken/invalid/incomplete BTF. Yonghong already brought up the case
for static variables. There might be some other issues that exist
today, or we might run into when we further extend BTF. Like some
custom linker script that will do something to vmlinux.o that we won't
know about.

And also this will be purely vmlinux-specific approach relying on
extra and custom Kbuild integration.

While if you parallelize DWARF loading and BTF generation, that will
be more reliably correct (modulo any bugs of course) and will work for
any DWARF-to-BTF cases that might come up in the future.

So I wouldn't even bother with individual .o's, tbh.

>
> If this isn't the case, we can process vmlinux as is today and go on
> creating N threads and feeding each with a DW_TAG_compile_unit
> "container", i.e. each thread would consume all the tags below each
> DW_TAG_compile_unit and produce a foo.BTF file that in the end would be
> combined and deduped by libbpf.
>
> Doing it as my first sketch above would take advantage of locality of
> reference, i.e. the DWARF data would be freshly produced and in the
> cache hierarchy when we first encode BTF, later, when doing the
> combine+dedup we wouldn't be touching the more voluminous DWARF data.

Yep, that's what I'd do.

>
> - Arnaldo
>
> > confident about BTF encoding part: dump each CU into its own BTF, use
> > btf__add_type() to merge multiple BTFs together. Just need to re-map
> > IDs (libbpf internally has API to visit each field that contains
> > type_id, it's well-defined enough to expose that as a public API, if
> > necessary). Then final btf_dedup().
>
> > But the DWARF loading and parsing part is almost a black box to me, so
> > I'm not sure how much work it would involve.
>
> > > I'm doing 'pahole -J vmlinux && btfdiff' after each cset and doing it
> > > very piecemeal as I'm doing will help bisecting any subtle bug this may
> > > introduce.
>
> > > > allow to parallelize BTF generation, where each CU would proceed in
> > > > parallel generating local BTF, and then the final pass would merge and
> > > > dedup BTFs. Currently reading and processing DWARF is the slowest part
> > > > of the DWARF-to-BTF conversion, parallelization and maybe some other
> > > > optimization seems like the only way to speed the process up.
>
> > > > Acked-by: Andrii Nakryiko <andrii@kernel.org>
>
> > > Thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-08  0:53                     ` Andrii Nakryiko
@ 2021-06-08 12:59                       ` Arnaldo Carvalho de Melo
  2021-06-15 19:01                         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-08 12:59 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, Jiri Olsa, dwarves, bpf, Kernel Team,
	Linux Kernel Mailing List

Em Mon, Jun 07, 2021 at 05:53:59PM -0700, Andrii Nakryiko escreveu:
> On Mon, Jun 7, 2021 at 6:20 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >
> > Em Fri, Jun 04, 2021 at 07:55:17PM -0700, Andrii Nakryiko escreveu:
> > > On Thu, Jun 3, 2021 at 7:57 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > Em Sat, May 29, 2021 at 05:40:17PM -0700, Andrii Nakryiko escreveu:
> >
> > > > > At some point it probably would make sense to formalize
> > > > > "btf_encoder" as a struct with its own state instead of passing in
> > > > > multiple variables. It would probably also
> >
> > > > Take a look at the tmp.master branch at:
> >
> > > > https://git.kernel.org/pub/scm/devel/pahole/pahole.git/log/?h=tmp.master
> >
> > > Oh wow, that's a lot of commits! :) Great that you decided to do this
> > > refactoring, thanks!
> >
> > > > that btf_elf class isn't used anymore by btf_loader, that uses only
> > > > libbpf's APIs, and now we have a btf_encoder class with all the globals,
> > > > etc, more baby steps are needed to finally ditch btf_elf altogether and
> > > > move on to the parallelization.
> >
> > > So do you plan to try to parallelize as a next step? I'm pretty
> >
> > So, I haven't looked at details but what I thought would be interesting
> > to investigate is to see if we can piggyback DWARF generation with BTF
> > one, i.e. when we generate a .o file with -g we encode the DWARF info,
> > so, right after this, we could call pahole as-is and encode BTF, then,
> > when vmlinux is linked, we would do the dedup.
> >
> > I.e. when generating ../build/v5.13.0-rc4+/kernel/fork.o, that comes
> > with:
> >
> > ⬢[acme@toolbox perf]$ readelf -SW ../build/v5.13.0-rc4+/kernel/fork.o | grep debug
> >   [78] .debug_info       PROGBITS        0000000000000000 00daec 032968 00      0   0  1
> >   [79] .rela.debug_info  RELA            0000000000000000 040458 053b68 18   I 95  78  8
> >   [80] .debug_abbrev     PROGBITS        0000000000000000 093fc0 0012e9 00      0   0  1
> >   [81] .debug_loclists   PROGBITS        0000000000000000 0952a9 00aa43 00      0   0  1
> >   [82] .rela.debug_loclists RELA         0000000000000000 09fcf0 009d98 18   I 95  81  8
> >   [83] .debug_aranges    PROGBITS        0000000000000000 0a9a88 000080 00      0   0  1
> >   [84] .rela.debug_aranges RELA          0000000000000000 0a9b08 0000a8 18   I 95  83  8
> >   [85] .debug_rnglists   PROGBITS        0000000000000000 0a9bb0 001509 00      0   0  1
> >   [86] .rela.debug_rnglists RELA         0000000000000000 0ab0c0 001bc0 18   I 95  85  8
> >   [87] .debug_line       PROGBITS        0000000000000000 0acc80 0086b7 00      0   0  1
> >   [88] .rela.debug_line  RELA            0000000000000000 0b5338 002550 18   I 95  87  8
> >   [89] .debug_str        PROGBITS        0000000000000000 0b7888 0177ad 01  MS  0   0  1
> >   [90] .debug_line_str   PROGBITS        0000000000000000 0cf035 001308 01  MS  0   0  1
> >   [93] .debug_frame      PROGBITS        0000000000000000 0d0370 000e38 00      0   0  8
> >   [94] .rela.debug_frame RELA            0000000000000000 0d11a8 000e70 18   I 95  93  8
> > ⬢[acme@toolbox perf]$
> >
> > We would do:
> >
> > ⬢[acme@toolbox perf]$ pahole -J ../build/v5.13.0-rc4+/kernel/fork.o
> > ⬢[acme@toolbox perf]$
> >
> > Which would get us to have:
> >
> > ⬢[acme@toolbox perf]$ readelf -SW ../build/v5.13.0-rc4+/kernel/fork.o | grep BTF
> >   [103] .BTF              PROGBITS        0000000000000000 0db658 030550 00      0   0  1
> > ⬢[acme@toolbox perf]
> >
> > ⬢[acme@toolbox perf]$ pahole -F btf -C hlist_node ../build/v5.13.0-rc4+/kernel/fork.o
> > struct hlist_node {
> >         struct hlist_node *        next;                 /*     0     8 */
> >         struct hlist_node * *      pprev;                /*     8     8 */
> >
> >         /* size: 16, cachelines: 1, members: 2 */
> >         /* last cacheline: 16 bytes */
> > };
> > ⬢[acme@toolbox perf]$
> >
> > So, a 'pahole --dedup_btf vmlinux' would just go on looking at:
> >
> > ⬢[acme@toolbox perf]$ readelf -wi ../build/v5.13.0-rc4+/vmlinux | grep -A10 DW_TAG_compile_unit | grep -w DW_AT_name | grep fork
> >     <f220eb>   DW_AT_name        : (indirect line string, offset: 0x62e7): /var/home/acme/git/linux/kernel/fork.c
> >
> > To go there and go on extracting those ELF sections to combine and
> > dedup.
> >
> > This combine thing could be done even by the linker, I think, when all
> > the DWARF data in the .o file are combined into vmlinux, we could do it
> > for the .BTF sections as well, that way would be even more elegant, I
> > think. Then, the combined vmlinux .BTF section would be read and fed in
> > one go to libbtf's dedup arg.
> >
> > This way the encoding of BTF would be as paralellized as the kernel build
> > process, following the same logic (-j NR_PROCESSORS).
> >
> > wdyt?
> 
> I think it's very fragile and it will be easy to get
> broken/invalid/incomplete BTF. Yonghong already brought up the case

I thought about that as it would be almost like the compiler generating
BTF, but you are right, the vmlinux prep process is a complex beast and
probably it is best to go with the second approach I outlined and you
agreed to be less fragile, so I'll go with that, thanks for your
comments.

- Arnaldo

> for static variables. There might be some other issues that exist
> today, or we might run into when we further extend BTF. Like some
> custom linker script that will do something to vmlinux.o that we won't
> know about.
> 
> And also this will be purely vmlinux-specific approach relying on
> extra and custom Kbuild integration.
> 
> While if you parallelize DWARF loading and BTF generation, that will
> be more reliably correct (modulo any bugs of course) and will work for
> any DWARF-to-BTF cases that might come up in the future.
> 
> So I wouldn't even bother with individual .o's, tbh.
> 
> >
> > If this isn't the case, we can process vmlinux as is today and go on
> > creating N threads and feeding each with a DW_TAG_compile_unit
> > "container", i.e. each thread would consume all the tags below each
> > DW_TAG_compile_unit and produce a foo.BTF file that in the end would be
> > combined and deduped by libbpf.
> >
> > Doing it as my first sketch above would take advantage of locality of
> > reference, i.e. the DWARF data would be freshly produced and in the
> > cache hierarchy when we first encode BTF, later, when doing the
> > combine+dedup we wouldn't be touching the more voluminous DWARF data.
> 
> Yep, that's what I'd do.
> 
> >
> > - Arnaldo
> >
> > > confident about BTF encoding part: dump each CU into its own BTF, use
> > > btf__add_type() to merge multiple BTFs together. Just need to re-map
> > > IDs (libbpf internally has API to visit each field that contains
> > > type_id, it's well-defined enough to expose that as a public API, if
> > > necessary). Then final btf_dedup().
> >
> > > But the DWARF loading and parsing part is almost a black box to me, so
> > > I'm not sure how much work it would involve.
> >
> > > > I'm doing 'pahole -J vmlinux && btfdiff' after each cset and doing it
> > > > very piecemeal as I'm doing will help bisecting any subtle bug this may
> > > > introduce.
> >
> > > > > allow to parallelize BTF generation, where each CU would proceed in
> > > > > parallel generating local BTF, and then the final pass would merge and
> > > > > dedup BTFs. Currently reading and processing DWARF is the slowest part
> > > > > of the DWARF-to-BTF conversion, parallelization and maybe some other
> > > > > optimization seems like the only way to speed the process up.
> >
> > > > > Acked-by: Andrii Nakryiko <andrii@kernel.org>
> >
> > > > Thanks!

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-08 12:59                       ` Arnaldo Carvalho de Melo
@ 2021-06-15 19:01                         ` Arnaldo Carvalho de Melo
  2021-06-15 19:13                           ` Andrii Nakryiko
  0 siblings, 1 reply; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-15 19:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Yonghong Song, Andrii Nakryiko, Jiri Olsa, dwarves, bpf,
	Kernel Team, Linux Kernel Mailing List

Em Tue, Jun 08, 2021 at 09:59:48AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jun 07, 2021 at 05:53:59PM -0700, Andrii Nakryiko escreveu:
> > I think it's very fragile and it will be easy to get
> > broken/invalid/incomplete BTF. Yonghong already brought up the case
 
> I thought about that as it would be almost like the compiler generating
> BTF, but you are right, the vmlinux prep process is a complex beast and
> probably it is best to go with the second approach I outlined and you
> agreed to be less fragile, so I'll go with that, thanks for your
> comments.

So, just to write some notes here from what I saw so far:

1. In the LTO cases there are inter-CU references, so the current code
combines all CUs into one and we end up not being able to parallelize
much. LTO is expensive, so... I'll leave it for later, but yeah, I don't
think the current algorithm is ideal, can be improved.

2. The case where there's no inter CU refs, which so far is the most
common, seems easier, we create N threads, all sharing the dwarf_loader
state and the btf_encoder, as-is now. we can process one CU per thread,
and as soon as we finish it, just grab a lock and call
btf_encoder__encode_cu() with the just produced CU data structures
(tags, types, functions, variables, etc), consume them and delete the
CU.

So each thread will consume one CU, push it to the 'struct btf' class
as-is now and then ask for the next CU, using the dwarf_loader state,
still under that lock, then go back to processing dwarf tags, then
lock, btf add types, rinse, repeat.

The ordering will be different than what we have now, as some smaller
CUs (object files with debug) will be processed faster so will get its
btf encoding slot faster, but that, at btf__dedup() time shouldn't make
a difference, right?

I think I'm done with refactoring the btf_encoder code, which should be
by now a thin layer on top of the excellent libbpf BTF API, just getting
what the previous loader (DWARF) produced and feeding libbpf.

I thought about fancy thread pools, etc, researching some pre-existing
thing or doing some kthread + workqueue lifting from the kernel but will
instead start with the most spartan code, we can improve later.

There it is, dumped my thoughts on this, time to go do some coding
before I get preempted...

- Arnaldo
 
> - Arnaldo
> 
> > for static variables. There might be some other issues that exist
> > today, or we might run into when we further extend BTF. Like some
> > custom linker script that will do something to vmlinux.o that we won't
> > know about.
> > 
> > And also this will be purely vmlinux-specific approach relying on
> > extra and custom Kbuild integration.
> > 
> > While if you parallelize DWARF loading and BTF generation, that will
> > be more reliably correct (modulo any bugs of course) and will work for
> > any DWARF-to-BTF cases that might come up in the future.
> > 
> > So I wouldn't even bother with individual .o's, tbh.
> > 
> > >
> > > If this isn't the case, we can process vmlinux as is today and go on
> > > creating N threads and feeding each with a DW_TAG_compile_unit
> > > "container", i.e. each thread would consume all the tags below each
> > > DW_TAG_compile_unit and produce a foo.BTF file that in the end would be
> > > combined and deduped by libbpf.
> > >
> > > Doing it as my first sketch above would take advantage of locality of
> > > reference, i.e. the DWARF data would be freshly produced and in the
> > > cache hierarchy when we first encode BTF, later, when doing the
> > > combine+dedup we wouldn't be touching the more voluminous DWARF data.
> > 
> > Yep, that's what I'd do.
> > 
> > >
> > > - Arnaldo
> > >
> > > > confident about BTF encoding part: dump each CU into its own BTF, use
> > > > btf__add_type() to merge multiple BTFs together. Just need to re-map
> > > > IDs (libbpf internally has API to visit each field that contains
> > > > type_id, it's well-defined enough to expose that as a public API, if
> > > > necessary). Then final btf_dedup().
> > >
> > > > But the DWARF loading and parsing part is almost a black box to me, so
> > > > I'm not sure how much work it would involve.
> > >
> > > > > I'm doing 'pahole -J vmlinux && btfdiff' after each cset and doing it
> > > > > very piecemeal as I'm doing will help bisecting any subtle bug this may
> > > > > introduce.
> > >
> > > > > > allow to parallelize BTF generation, where each CU would proceed in
> > > > > > parallel generating local BTF, and then the final pass would merge and
> > > > > > dedup BTFs. Currently reading and processing DWARF is the slowest part
> > > > > > of the DWARF-to-BTF conversion, parallelization and maybe some other
> > > > > > optimization seems like the only way to speed the process up.
> > >
> > > > > > Acked-by: Andrii Nakryiko <andrii@kernel.org>
> > >
> > > > > Thanks!
> 
> -- 
> 
> - Arnaldo

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-15 19:01                         ` Arnaldo Carvalho de Melo
@ 2021-06-15 19:13                           ` Andrii Nakryiko
  2021-06-15 19:38                             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 44+ messages in thread
From: Andrii Nakryiko @ 2021-06-15 19:13 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Yonghong Song, Andrii Nakryiko, Jiri Olsa, dwarves, bpf,
	Kernel Team, Linux Kernel Mailing List

On Tue, Jun 15, 2021 at 12:01 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Tue, Jun 08, 2021 at 09:59:48AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Mon, Jun 07, 2021 at 05:53:59PM -0700, Andrii Nakryiko escreveu:
> > > I think it's very fragile and it will be easy to get
> > > broken/invalid/incomplete BTF. Yonghong already brought up the case
>
> > I thought about that as it would be almost like the compiler generating
> > BTF, but you are right, the vmlinux prep process is a complex beast and
> > probably it is best to go with the second approach I outlined and you
> > agreed to be less fragile, so I'll go with that, thanks for your
> > comments.
>
> So, just to write some notes here from what I saw so far:
>
> 1. In the LTO cases there are inter-CU references, so the current code
> combines all CUs into one and we end up not being able to parallelize
> much. LTO is expensive, so... I'll leave it for later, but yeah, I don't
> think the current algorithm is ideal, can be improved.
>

Yeah, let's worry about LTO later.

> 2. The case where there's no inter CU refs, which so far is the most
> common, seems easier, we create N threads, all sharing the dwarf_loader
> state and the btf_encoder, as-is now. we can process one CU per thread,
> and as soon as we finish it, just grab a lock and call
> btf_encoder__encode_cu() with the just produced CU data structures
> (tags, types, functions, variables, etc), consume them and delete the
> CU.
>
> So each thread will consume one CU, push it to the 'struct btf' class
> as-is now and then ask for the next CU, using the dwarf_loader state,
> still under that lock, then go back to processing dwarf tags, then
> lock, btf add types, rinse, repeat.

Hmm... wouldn't keeping a "local" per-thread struct btf and just keep
appending to it for each processed CU until we run out of CUs be
simpler? So each thread does as much as possible locally without any
locks. And only at the very end we merge everything together and then
dedup. Or we can even dedup inside each worker before merging final
btf, that probably would give quite a lot of speed up and some memory
saving. Would be interesting to experiment with that.

So I like the idea of a fixed pool of threads (can be customized, and
I'd default to num_workers == num_cpus), but I think we can and should
keep them independent for as long as possible.

Another disadvantage of generating small struct btf and then lock +
merge is that we don't get as efficient string re-use, we'll churn
more on string memory allocation. Keeping bigger local struct btfs
allow for more efficient memory re-use (and probably a tiny bit of CPU
savings).

So please consider that, it also seems simpler overall.


>
> The ordering will be different than what we have now, as some smaller
> CUs (object files with debug) will be processed faster so will get its
> btf encoding slot faster, but that, at btf__dedup() time shouldn't make
> a difference, right?

Right, order doesn't matter.

>
> I think I'm done with refactoring the btf_encoder code, which should be
> by now a thin layer on top of the excellent libbpf BTF API, just getting
> what the previous loader (DWARF) produced and feeding libbpf.

Great.

>
> I thought about fancy thread pools, etc, researching some pre-existing
> thing or doing some kthread + workqueue lifting from the kernel but will
> instead start with the most spartan code, we can improve later.

Agree, simple is good. Really curious how much faster we can get. I
think anything fancy will give a relatively small improvement. The
biggest one will come from any parallelization.

>
> There it is, dumped my thoughts on this, time to go do some coding
> before I get preempted...
>
> - Arnaldo
>
> > - Arnaldo
> >
> > > for static variables. There might be some other issues that exist
> > > today, or we might run into when we further extend BTF. Like some
> > > custom linker script that will do something to vmlinux.o that we won't
> > > know about.
> > >
> > > And also this will be purely vmlinux-specific approach relying on
> > > extra and custom Kbuild integration.
> > >
> > > While if you parallelize DWARF loading and BTF generation, that will
> > > be more reliably correct (modulo any bugs of course) and will work for
> > > any DWARF-to-BTF cases that might come up in the future.
> > >
> > > So I wouldn't even bother with individual .o's, tbh.
> > >
> > > >
> > > > If this isn't the case, we can process vmlinux as is today and go on
> > > > creating N threads and feeding each with a DW_TAG_compile_unit
> > > > "container", i.e. each thread would consume all the tags below each
> > > > DW_TAG_compile_unit and produce a foo.BTF file that in the end would be
> > > > combined and deduped by libbpf.
> > > >
> > > > Doing it as my first sketch above would take advantage of locality of
> > > > reference, i.e. the DWARF data would be freshly produced and in the
> > > > cache hierarchy when we first encode BTF, later, when doing the
> > > > combine+dedup we wouldn't be touching the more voluminous DWARF data.
> > >
> > > Yep, that's what I'd do.
> > >
> > > >
> > > > - Arnaldo
> > > >
> > > > > confident about BTF encoding part: dump each CU into its own BTF, use
> > > > > btf__add_type() to merge multiple BTFs together. Just need to re-map
> > > > > IDs (libbpf internally has API to visit each field that contains
> > > > > type_id, it's well-defined enough to expose that as a public API, if
> > > > > necessary). Then final btf_dedup().
> > > >
> > > > > But the DWARF loading and parsing part is almost a black box to me, so
> > > > > I'm not sure how much work it would involve.
> > > >
> > > > > > I'm doing 'pahole -J vmlinux && btfdiff' after each cset and doing it
> > > > > > very piecemeal as I'm doing will help bisecting any subtle bug this may
> > > > > > introduce.
> > > >
> > > > > > > allow to parallelize BTF generation, where each CU would proceed in
> > > > > > > parallel generating local BTF, and then the final pass would merge and
> > > > > > > dedup BTFs. Currently reading and processing DWARF is the slowest part
> > > > > > > of the DWARF-to-BTF conversion, parallelization and maybe some other
> > > > > > > optimization seems like the only way to speed the process up.
> > > >
> > > > > > > Acked-by: Andrii Nakryiko <andrii@kernel.org>
> > > >
> > > > > > Thanks!
> >
> > --
> >
> > - Arnaldo
>
> --
>
> - Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-15 19:13                           ` Andrii Nakryiko
@ 2021-06-15 19:38                             ` Arnaldo Carvalho de Melo
  2021-06-15 20:05                               ` Andrii Nakryiko
  0 siblings, 1 reply; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-15 19:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Yonghong Song, Andrii Nakryiko, Jiri Olsa, dwarves, bpf,
	Kernel Team, Linux Kernel Mailing List

Em Tue, Jun 15, 2021 at 12:13:55PM -0700, Andrii Nakryiko escreveu:
> On Tue, Jun 15, 2021 at 12:01 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> > Em Tue, Jun 08, 2021 at 09:59:48AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Mon, Jun 07, 2021 at 05:53:59PM -0700, Andrii Nakryiko escreveu:
> > > > I think it's very fragile and it will be easy to get
> > > > broken/invalid/incomplete BTF. Yonghong already brought up the case

> > > I thought about that as it would be almost like the compiler generating
> > > BTF, but you are right, the vmlinux prep process is a complex beast and
> > > probably it is best to go with the second approach I outlined and you
> > > agreed to be less fragile, so I'll go with that, thanks for your
> > > comments.

> > So, just to write some notes here from what I saw so far:

> > 1. In the LTO cases there are inter-CU references, so the current code
> > combines all CUs into one and we end up not being able to parallelize
> > much. LTO is expensive, so... I'll leave it for later, but yeah, I don't
> > think the current algorithm is ideal, can be improved.
 
> Yeah, let's worry about LTO later.
 
> > 2. The case where there's no inter CU refs, which so far is the most
> > common, seems easier, we create N threads, all sharing the dwarf_loader
> > state and the btf_encoder, as-is now. we can process one CU per thread,
> > and as soon as we finish it, just grab a lock and call
> > btf_encoder__encode_cu() with the just produced CU data structures
> > (tags, types, functions, variables, etc), consume them and delete the
> > CU.
> >
> > So each thread will consume one CU, push it to the 'struct btf' class
> > as-is now and then ask for the next CU, using the dwarf_loader state,
> > still under that lock, then go back to processing dwarf tags, then
> > lock, btf add types, rinse, repeat.
> 
> Hmm... wouldn't keeping a "local" per-thread struct btf and just keep
> appending to it for each processed CU until we run out of CUs be
> simpler?

I thought about this as a logical next step, I would love to have a
'btf__merge_argv(struct btf *btf[]), is there one?

But from what I've read after this first paragraph of yours, lemme try
to rephrase:

1. pahole calls btf_encoder__new(...)

   Creates a single struct btf.

2. dwarf_loader will create N threads, each will call a
dwarf_get_next_cu() that is locked and will return a CU to process, when
it finishes this CU, calls btf_encoder__encode_cu() under an all-threads
lock. Rinse repeat.

Until all the threads have consumed all CUs.

then btf_encoder__encode(), which should be probably renamed to
btf_econder__finish() will call btf__dedup(encoder->btf) and write ELF
or raw file.

My first reaction to your first paragraph was:

Yeah, we can have multiple 'struct btf' instances, one per thread, that
will each contain a subset of DWARF CU's encoded as BTF, and then I have
to merge the per-thread BTF and then dedup. O think my rephrase above is
better, no?

> So each thread does as much as possible locally without any
> locks. And only at the very end we merge everything together and then
> dedup. Or we can even dedup inside each worker before merging final
> btf, that probably would give quite a lot of speed up and some memory
> saving. Would be interesting to experiment with that.
> 
> So I like the idea of a fixed pool of threads (can be customized, and
> I'd default to num_workers == num_cpus), but I think we can and should
> keep them independent for as long as possible.

Sure, this should map the whatever the user passes to -j in the kernel
make command line, if nothing is passed as an argument, then default to
getconf(_NPROCESSORS_ONLN).

There is a nice coincidence here where we probably don't care about -J
anymore and want to deal only with -j (detached btf) that is the same as
what 'make' expects to state how many "jobs" (thread pool size) the user
wants 8-)
 
> Another disadvantage of generating small struct btf and then lock +
> merge is that we don't get as efficient string re-use, we'll churn
> more on string memory allocation. Keeping bigger local struct btfs
> allow for more efficient memory re-use (and probably a tiny bit of CPU
> savings).

I think we're in the same page, the contention for adding the CU to a
single 'struct btf' (amongst all DWARF loading threads) after we just
produced it should be minimal, so we grab all the advantages: locality
of reference, minimal contention as DWARF reading/creating the pahole
internal, neutral, data structures should be higher than adding
types/functions/variables via the libbpf BTF API.

I.e. we can leave paralellizing the BTF _encoding_ for later, what we're
trying to do now is to paralellize the DWARF _loading_, right?
 
> So please consider that, it also seems simpler overall.
 
> > The ordering will be different than what we have now, as some smaller
> > CUs (object files with debug) will be processed faster so will get its
> > btf encoding slot faster, but that, at btf__dedup() time shouldn't make
> > a difference, right?
 
> Right, order doesn't matter.
 
> > I think I'm done with refactoring the btf_encoder code, which should be
> > by now a thin layer on top of the excellent libbpf BTF API, just getting
> > what the previous loader (DWARF) produced and feeding libbpf.
 
> Great.
 
> > I thought about fancy thread pools, etc, researching some pre-existing
> > thing or doing some kthread + workqueue lifting from the kernel but will
> > instead start with the most spartan code, we can improve later.
 
> Agree, simple is good. Really curious how much faster we can get. I
> think anything fancy will give a relatively small improvement. The
> biggest one will come from any parallelization.

And I think that is possible, modulo elfutils libraries saying no, I
hope that will not be the case.

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-15 19:38                             ` Arnaldo Carvalho de Melo
@ 2021-06-15 20:05                               ` Andrii Nakryiko
  2021-06-15 20:25                                 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 44+ messages in thread
From: Andrii Nakryiko @ 2021-06-15 20:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Yonghong Song, Andrii Nakryiko, Jiri Olsa, dwarves, bpf,
	Kernel Team, Linux Kernel Mailing List

On Tue, Jun 15, 2021 at 12:38 PM Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> Em Tue, Jun 15, 2021 at 12:13:55PM -0700, Andrii Nakryiko escreveu:
> > On Tue, Jun 15, 2021 at 12:01 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> > > Em Tue, Jun 08, 2021 at 09:59:48AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Mon, Jun 07, 2021 at 05:53:59PM -0700, Andrii Nakryiko escreveu:
> > > > > I think it's very fragile and it will be easy to get
> > > > > broken/invalid/incomplete BTF. Yonghong already brought up the case
>
> > > > I thought about that as it would be almost like the compiler generating
> > > > BTF, but you are right, the vmlinux prep process is a complex beast and
> > > > probably it is best to go with the second approach I outlined and you
> > > > agreed to be less fragile, so I'll go with that, thanks for your
> > > > comments.
>
> > > So, just to write some notes here from what I saw so far:
>
> > > 1. In the LTO cases there are inter-CU references, so the current code
> > > combines all CUs into one and we end up not being able to parallelize
> > > much. LTO is expensive, so... I'll leave it for later, but yeah, I don't
> > > think the current algorithm is ideal, can be improved.
>
> > Yeah, let's worry about LTO later.
>
> > > 2. The case where there's no inter CU refs, which so far is the most
> > > common, seems easier, we create N threads, all sharing the dwarf_loader
> > > state and the btf_encoder, as-is now. we can process one CU per thread,
> > > and as soon as we finish it, just grab a lock and call
> > > btf_encoder__encode_cu() with the just produced CU data structures
> > > (tags, types, functions, variables, etc), consume them and delete the
> > > CU.
> > >
> > > So each thread will consume one CU, push it to the 'struct btf' class
> > > as-is now and then ask for the next CU, using the dwarf_loader state,
> > > still under that lock, then go back to processing dwarf tags, then
> > > lock, btf add types, rinse, repeat.
> >
> > Hmm... wouldn't keeping a "local" per-thread struct btf and just keep
> > appending to it for each processed CU until we run out of CUs be
> > simpler?
>
> I thought about this as a logical next step, I would love to have a
> 'btf__merge_argv(struct btf *btf[]), is there one?
>
> But from what I've read after this first paragraph of yours, lemme try
> to rephrase:
>
> 1. pahole calls btf_encoder__new(...)
>
>    Creates a single struct btf.
>
> 2. dwarf_loader will create N threads, each will call a
> dwarf_get_next_cu() that is locked and will return a CU to process, when
> it finishes this CU, calls btf_encoder__encode_cu() under an all-threads
> lock. Rinse repeat.
>
> Until all the threads have consumed all CUs.
>
> then btf_encoder__encode(), which should be probably renamed to
> btf_econder__finish() will call btf__dedup(encoder->btf) and write ELF
> or raw file.
>
> My first reaction to your first paragraph was:
>
> Yeah, we can have multiple 'struct btf' instances, one per thread, that
> will each contain a subset of DWARF CU's encoded as BTF, and then I have
> to merge the per-thread BTF and then dedup. O think my rephrase above is
> better, no?

I think I understood what you want to do from the previous email, so
you didn't have to re-phrase it, it's pretty clear already. I just
don't feel like having per-thread struct btf adds any complexity at
all and gives more flexibility and more parallelism. The next most
expensive thing after loading DWARF is string deduplication
(btf__add_str()), so it would be good to do that at per-thread level
as well as much as possible.

>
> > So each thread does as much as possible locally without any
> > locks. And only at the very end we merge everything together and then
> > dedup. Or we can even dedup inside each worker before merging final
> > btf, that probably would give quite a lot of speed up and some memory
> > saving. Would be interesting to experiment with that.
> >
> > So I like the idea of a fixed pool of threads (can be customized, and
> > I'd default to num_workers == num_cpus), but I think we can and should
> > keep them independent for as long as possible.
>
> Sure, this should map the whatever the user passes to -j in the kernel
> make command line, if nothing is passed as an argument, then default to
> getconf(_NPROCESSORS_ONLN).
>

Yep, cool. I've been told that `make -j` puts no upper limit on number
of jobs, so we shouldn't follow make model completely :-P

> There is a nice coincidence here where we probably don't care about -J
> anymore and want to deal only with -j (detached btf) that is the same as
> what 'make' expects to state how many "jobs" (thread pool size) the user
> wants 8-)
>
> > Another disadvantage of generating small struct btf and then lock +
> > merge is that we don't get as efficient string re-use, we'll churn
> > more on string memory allocation. Keeping bigger local struct btfs
> > allow for more efficient memory re-use (and probably a tiny bit of CPU
> > savings).
>
> I think we're in the same page, the contention for adding the CU to a
> single 'struct btf' (amongst all DWARF loading threads) after we just
> produced it should be minimal, so we grab all the advantages: locality
> of reference, minimal contention as DWARF reading/creating the pahole
> internal, neutral, data structures should be higher than adding
> types/functions/variables via the libbpf BTF API.

I disagree, I think contention might be noticeable because merging
BTFs is still a relatively expensive/slow operation. But feel free to
start with that, I just thought that doing per-thread struct btf
wouldn't add any complexity, which is why I mentioned that.

>
> I.e. we can leave paralellizing the BTF _encoding_ for later, what we're
> trying to do now is to paralellize the DWARF _loading_, right?

We are trying to speed up DWARF-to-BTF generation in general, not
specifically DWARF loading. DWARF loading is an obvious most expensive
part, string deduplication is the next one, if you look at profiling
data. The third one will be btf__dedup, which is why I mentioned that
it might be faster still to do pre-dedup in each thread at the very
end, right before we do final dedup. Each individual dedup will
probably significantly reduce total size of data/strings, so I have a
feeling that it will result in a very nice speed-ups in the end.

So just my 2 cents.

>
> > So please consider that, it also seems simpler overall.
>
> > > The ordering will be different than what we have now, as some smaller
> > > CUs (object files with debug) will be processed faster so will get its
> > > btf encoding slot faster, but that, at btf__dedup() time shouldn't make
> > > a difference, right?
>
> > Right, order doesn't matter.
>
> > > I think I'm done with refactoring the btf_encoder code, which should be
> > > by now a thin layer on top of the excellent libbpf BTF API, just getting
> > > what the previous loader (DWARF) produced and feeding libbpf.
>
> > Great.
>
> > > I thought about fancy thread pools, etc, researching some pre-existing
> > > thing or doing some kthread + workqueue lifting from the kernel but will
> > > instead start with the most spartan code, we can improve later.
>
> > Agree, simple is good. Really curious how much faster we can get. I
> > think anything fancy will give a relatively small improvement. The
> > biggest one will come from any parallelization.
>
> And I think that is possible, modulo elfutils libraries saying no, I
> hope that will not be the case.

You can't imagine how eagerly I'm awaiting this bright future of
faster BTF generation step in the kernel build process. :)

>
> - Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-15 20:05                               ` Andrii Nakryiko
@ 2021-06-15 20:25                                 ` Arnaldo Carvalho de Melo
  2021-06-15 21:26                                   ` Andrii Nakryiko
  0 siblings, 1 reply; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-15 20:25 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Arnaldo Carvalho de Melo, Yonghong Song, Andrii Nakryiko,
	Jiri Olsa, dwarves, bpf, Kernel Team, Linux Kernel Mailing List

Em Tue, Jun 15, 2021 at 01:05:30PM -0700, Andrii Nakryiko escreveu:
> On Tue, Jun 15, 2021 at 12:38 PM Arnaldo Carvalho de Melo
> <arnaldo.melo@gmail.com> wrote:
> >
> > Em Tue, Jun 15, 2021 at 12:13:55PM -0700, Andrii Nakryiko escreveu:
> > > On Tue, Jun 15, 2021 at 12:01 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >
> > > > Em Tue, Jun 08, 2021 at 09:59:48AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > Em Mon, Jun 07, 2021 at 05:53:59PM -0700, Andrii Nakryiko escreveu:
> > > > > > I think it's very fragile and it will be easy to get
> > > > > > broken/invalid/incomplete BTF. Yonghong already brought up the case
> >
> > > > > I thought about that as it would be almost like the compiler generating
> > > > > BTF, but you are right, the vmlinux prep process is a complex beast and
> > > > > probably it is best to go with the second approach I outlined and you
> > > > > agreed to be less fragile, so I'll go with that, thanks for your
> > > > > comments.
> >
> > > > So, just to write some notes here from what I saw so far:
> >
> > > > 1. In the LTO cases there are inter-CU references, so the current code
> > > > combines all CUs into one and we end up not being able to parallelize
> > > > much. LTO is expensive, so... I'll leave it for later, but yeah, I don't
> > > > think the current algorithm is ideal, can be improved.
> >
> > > Yeah, let's worry about LTO later.
> >
> > > > 2. The case where there's no inter CU refs, which so far is the most
> > > > common, seems easier, we create N threads, all sharing the dwarf_loader
> > > > state and the btf_encoder, as-is now. we can process one CU per thread,
> > > > and as soon as we finish it, just grab a lock and call
> > > > btf_encoder__encode_cu() with the just produced CU data structures
> > > > (tags, types, functions, variables, etc), consume them and delete the
> > > > CU.
> > > >
> > > > So each thread will consume one CU, push it to the 'struct btf' class
> > > > as-is now and then ask for the next CU, using the dwarf_loader state,
> > > > still under that lock, then go back to processing dwarf tags, then
> > > > lock, btf add types, rinse, repeat.
> > >
> > > Hmm... wouldn't keeping a "local" per-thread struct btf and just keep
> > > appending to it for each processed CU until we run out of CUs be
> > > simpler?
> >
> > I thought about this as a logical next step, I would love to have a
> > 'btf__merge_argv(struct btf *btf[]), is there one?
> >
> > But from what I've read after this first paragraph of yours, lemme try
> > to rephrase:
> >
> > 1. pahole calls btf_encoder__new(...)
> >
> >    Creates a single struct btf.
> >
> > 2. dwarf_loader will create N threads, each will call a
> > dwarf_get_next_cu() that is locked and will return a CU to process, when
> > it finishes this CU, calls btf_encoder__encode_cu() under an all-threads
> > lock. Rinse repeat.
> >
> > Until all the threads have consumed all CUs.
> >
> > then btf_encoder__encode(), which should be probably renamed to
> > btf_econder__finish() will call btf__dedup(encoder->btf) and write ELF
> > or raw file.
> >
> > My first reaction to your first paragraph was:
> >
> > Yeah, we can have multiple 'struct btf' instances, one per thread, that
> > will each contain a subset of DWARF CU's encoded as BTF, and then I have
> > to merge the per-thread BTF and then dedup. O think my rephrase above is
> > better, no?
> 
> I think I understood what you want to do from the previous email, so
> you didn't have to re-phrase it, it's pretty clear already. I just
> don't feel like having per-thread struct btf adds any complexity at
> all and gives more flexibility and more parallelism. The next most
> expensive thing after loading DWARF is string deduplication
> (btf__add_str()), so it would be good to do that at per-thread level
> as well as much as possible.

So you think a per-thread dedup at the end of each thread is good, ok,
no locking, good.

But what about that question I made:

> > I thought about this as a logical next step, I would love to have a
> > 'btf__merge_argv(struct btf *btf[]), is there one?

I haven't checked, is there alredy an libbpf BTF API that can merge an
array or pre-deduped BTF, deduping it one more time?

Anyway, so you suggest I start by having each dwarf_loader thread tied
to a separate btf_encoder (a shim layer on top of a 'struct btf' and
then at the end dedup each one, then combine the N 'struct btf' into
one, then dump it into an ELF or raw file?

- Arnaldo
 
> > > So each thread does as much as possible locally without any
> > > locks. And only at the very end we merge everything together and then
> > > dedup. Or we can even dedup inside each worker before merging final
> > > btf, that probably would give quite a lot of speed up and some memory
> > > saving. Would be interesting to experiment with that.
> > >
> > > So I like the idea of a fixed pool of threads (can be customized, and
> > > I'd default to num_workers == num_cpus), but I think we can and should
> > > keep them independent for as long as possible.
> >
> > Sure, this should map the whatever the user passes to -j in the kernel
> > make command line, if nothing is passed as an argument, then default to
> > getconf(_NPROCESSORS_ONLN).
> >
> 
> Yep, cool. I've been told that `make -j` puts no upper limit on number
> of jobs, so we shouldn't follow make model completely :-P
> 
> > There is a nice coincidence here where we probably don't care about -J
> > anymore and want to deal only with -j (detached btf) that is the same as
> > what 'make' expects to state how many "jobs" (thread pool size) the user
> > wants 8-)
> >
> > > Another disadvantage of generating small struct btf and then lock +
> > > merge is that we don't get as efficient string re-use, we'll churn
> > > more on string memory allocation. Keeping bigger local struct btfs
> > > allow for more efficient memory re-use (and probably a tiny bit of CPU
> > > savings).
> >
> > I think we're in the same page, the contention for adding the CU to a
> > single 'struct btf' (amongst all DWARF loading threads) after we just
> > produced it should be minimal, so we grab all the advantages: locality
> > of reference, minimal contention as DWARF reading/creating the pahole
> > internal, neutral, data structures should be higher than adding
> > types/functions/variables via the libbpf BTF API.
> 
> I disagree, I think contention might be noticeable because merging
> BTFs is still a relatively expensive/slow operation. But feel free to
> start with that, I just thought that doing per-thread struct btf
> wouldn't add any complexity, which is why I mentioned that.
> 
> >
> > I.e. we can leave paralellizing the BTF _encoding_ for later, what we're
> > trying to do now is to paralellize the DWARF _loading_, right?
> 
> We are trying to speed up DWARF-to-BTF generation in general, not
> specifically DWARF loading. DWARF loading is an obvious most expensive
> part, string deduplication is the next one, if you look at profiling
> data. The third one will be btf__dedup, which is why I mentioned that
> it might be faster still to do pre-dedup in each thread at the very
> end, right before we do final dedup. Each individual dedup will
> probably significantly reduce total size of data/strings, so I have a
> feeling that it will result in a very nice speed-ups in the end.
> 
> So just my 2 cents.
> 
> >
> > > So please consider that, it also seems simpler overall.
> >
> > > > The ordering will be different than what we have now, as some smaller
> > > > CUs (object files with debug) will be processed faster so will get its
> > > > btf encoding slot faster, but that, at btf__dedup() time shouldn't make
> > > > a difference, right?
> >
> > > Right, order doesn't matter.
> >
> > > > I think I'm done with refactoring the btf_encoder code, which should be
> > > > by now a thin layer on top of the excellent libbpf BTF API, just getting
> > > > what the previous loader (DWARF) produced and feeding libbpf.
> >
> > > Great.
> >
> > > > I thought about fancy thread pools, etc, researching some pre-existing
> > > > thing or doing some kthread + workqueue lifting from the kernel but will
> > > > instead start with the most spartan code, we can improve later.
> >
> > > Agree, simple is good. Really curious how much faster we can get. I
> > > think anything fancy will give a relatively small improvement. The
> > > biggest one will come from any parallelization.
> >
> > And I think that is possible, modulo elfutils libraries saying no, I
> > hope that will not be the case.
> 
> You can't imagine how eagerly I'm awaiting this bright future of
> faster BTF generation step in the kernel build process. :)
> 
> >
> > - Arnaldo

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Parallelizing vmlinux BTF encoding. was Re: [RFT] Testing 1.22
  2021-06-15 20:25                                 ` Arnaldo Carvalho de Melo
@ 2021-06-15 21:26                                   ` Andrii Nakryiko
  0 siblings, 0 replies; 44+ messages in thread
From: Andrii Nakryiko @ 2021-06-15 21:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Arnaldo Carvalho de Melo, Yonghong Song, Andrii Nakryiko,
	Jiri Olsa, dwarves, bpf, Kernel Team, Linux Kernel Mailing List

On Tue, Jun 15, 2021 at 1:26 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Tue, Jun 15, 2021 at 01:05:30PM -0700, Andrii Nakryiko escreveu:
> > On Tue, Jun 15, 2021 at 12:38 PM Arnaldo Carvalho de Melo
> > <arnaldo.melo@gmail.com> wrote:
> > >
> > > Em Tue, Jun 15, 2021 at 12:13:55PM -0700, Andrii Nakryiko escreveu:
> > > > On Tue, Jun 15, 2021 at 12:01 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > >
> > > > > Em Tue, Jun 08, 2021 at 09:59:48AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > > Em Mon, Jun 07, 2021 at 05:53:59PM -0700, Andrii Nakryiko escreveu:
> > > > > > > I think it's very fragile and it will be easy to get
> > > > > > > broken/invalid/incomplete BTF. Yonghong already brought up the case
> > >
> > > > > > I thought about that as it would be almost like the compiler generating
> > > > > > BTF, but you are right, the vmlinux prep process is a complex beast and
> > > > > > probably it is best to go with the second approach I outlined and you
> > > > > > agreed to be less fragile, so I'll go with that, thanks for your
> > > > > > comments.
> > >
> > > > > So, just to write some notes here from what I saw so far:
> > >
> > > > > 1. In the LTO cases there are inter-CU references, so the current code
> > > > > combines all CUs into one and we end up not being able to parallelize
> > > > > much. LTO is expensive, so... I'll leave it for later, but yeah, I don't
> > > > > think the current algorithm is ideal, can be improved.
> > >
> > > > Yeah, let's worry about LTO later.
> > >
> > > > > 2. The case where there's no inter CU refs, which so far is the most
> > > > > common, seems easier, we create N threads, all sharing the dwarf_loader
> > > > > state and the btf_encoder, as-is now. we can process one CU per thread,
> > > > > and as soon as we finish it, just grab a lock and call
> > > > > btf_encoder__encode_cu() with the just produced CU data structures
> > > > > (tags, types, functions, variables, etc), consume them and delete the
> > > > > CU.
> > > > >
> > > > > So each thread will consume one CU, push it to the 'struct btf' class
> > > > > as-is now and then ask for the next CU, using the dwarf_loader state,
> > > > > still under that lock, then go back to processing dwarf tags, then
> > > > > lock, btf add types, rinse, repeat.
> > > >
> > > > Hmm... wouldn't keeping a "local" per-thread struct btf and just keep
> > > > appending to it for each processed CU until we run out of CUs be
> > > > simpler?
> > >
> > > I thought about this as a logical next step, I would love to have a
> > > 'btf__merge_argv(struct btf *btf[]), is there one?
> > >
> > > But from what I've read after this first paragraph of yours, lemme try
> > > to rephrase:
> > >
> > > 1. pahole calls btf_encoder__new(...)
> > >
> > >    Creates a single struct btf.
> > >
> > > 2. dwarf_loader will create N threads, each will call a
> > > dwarf_get_next_cu() that is locked and will return a CU to process, when
> > > it finishes this CU, calls btf_encoder__encode_cu() under an all-threads
> > > lock. Rinse repeat.
> > >
> > > Until all the threads have consumed all CUs.
> > >
> > > then btf_encoder__encode(), which should be probably renamed to
> > > btf_econder__finish() will call btf__dedup(encoder->btf) and write ELF
> > > or raw file.
> > >
> > > My first reaction to your first paragraph was:
> > >
> > > Yeah, we can have multiple 'struct btf' instances, one per thread, that
> > > will each contain a subset of DWARF CU's encoded as BTF, and then I have
> > > to merge the per-thread BTF and then dedup. O think my rephrase above is
> > > better, no?
> >
> > I think I understood what you want to do from the previous email, so
> > you didn't have to re-phrase it, it's pretty clear already. I just
> > don't feel like having per-thread struct btf adds any complexity at
> > all and gives more flexibility and more parallelism. The next most
> > expensive thing after loading DWARF is string deduplication
> > (btf__add_str()), so it would be good to do that at per-thread level
> > as well as much as possible.
>
> So you think a per-thread dedup at the end of each thread is good, ok,
> no locking, good.
>
> But what about that question I made:
>
> > > I thought about this as a logical next step, I would love to have a
> > > 'btf__merge_argv(struct btf *btf[]), is there one?
>

Right, sorry, got too excited about parallelisation, forgot to reply to this.

I thought about this a bit in the context of BPF static linker work.
This is a bit problematic as a general API, because merging two BTFs
is not just appending types one after the other and calling it a day.
For DATASEC, for instance, you need to take few DATASEC with the same
name and combine them into a single DATASEC, otherwise resulting BTF
is non-sensical. While you are doing that, you need to re-adjust
variable offsets, take into account original data section alignment
requirements, etc. This operation can't be done safely in BTF only,
you need to know original ELF information (e.g., that ELF section
alignment). This is all done by static linker explicitly because only
static linker has enough information to do that. It goes even further,
extern VARs and FUNCs have to be resolved and de-duplicated (e.g.,
extern can be replaced with globals), etc. There is too much attached
semantics to some of BTF data.

So as a general API I don't see how it can be done nicely. Unless we
say that we'll error out if any VAR or DATASEC is found, or any extern
FUNC. Which sounds like a not-so-great idea right now.

But pahole has a bit simpler case, because BTF vars and DATASEC(s) are
generated at the very end, so it shouldn't be so complicated for
pahole. libbpf provides a generic btf__add_type() API that copies over
any type and associated strings (field names, func/struct names, etc).
That's quite a reduction in amount of code written. The only thing is
that after that IDs have to be updated and adjusted, because libbpf
doesn't have enough info to do this. So take a look at btf__add_type()
as a starting point.

Next, libbpf internally has btf_type_visit_type_ids() which will
provide a callback for each place in any btf_type that contains an ID.
This is the best way to adjust those IDs. We can probably expose those
APIs as public API because they are well-defined and have
straightforward semantics. So let me know.


> I haven't checked, is there alredy an libbpf BTF API that can merge an
> array or pre-deduped BTF, deduping it one more time?

btf__dedup() can be called multiple times on any struct btf, so once
you merge (see above), you can just dedup the merged btf to make it
small again.

>
> Anyway, so you suggest I start by having each dwarf_loader thread tied
> to a separate btf_encoder (a shim layer on top of a 'struct btf' and
> then at the end dedup each one, then combine the N 'struct btf' into
> one, then dump it into an ELF or raw file?

Yes, exactly.

>
> - Arnaldo
>
> > > > So each thread does as much as possible locally without any
> > > > locks. And only at the very end we merge everything together and then
> > > > dedup. Or we can even dedup inside each worker before merging final
> > > > btf, that probably would give quite a lot of speed up and some memory
> > > > saving. Would be interesting to experiment with that.
> > > >
> > > > So I like the idea of a fixed pool of threads (can be customized, and
> > > > I'd default to num_workers == num_cpus), but I think we can and should
> > > > keep them independent for as long as possible.
> > >
> > > Sure, this should map the whatever the user passes to -j in the kernel
> > > make command line, if nothing is passed as an argument, then default to
> > > getconf(_NPROCESSORS_ONLN).
> > >
> >
> > Yep, cool. I've been told that `make -j` puts no upper limit on number
> > of jobs, so we shouldn't follow make model completely :-P
> >
> > > There is a nice coincidence here where we probably don't care about -J
> > > anymore and want to deal only with -j (detached btf) that is the same as
> > > what 'make' expects to state how many "jobs" (thread pool size) the user
> > > wants 8-)
> > >
> > > > Another disadvantage of generating small struct btf and then lock +
> > > > merge is that we don't get as efficient string re-use, we'll churn
> > > > more on string memory allocation. Keeping bigger local struct btfs
> > > > allow for more efficient memory re-use (and probably a tiny bit of CPU
> > > > savings).
> > >
> > > I think we're in the same page, the contention for adding the CU to a
> > > single 'struct btf' (amongst all DWARF loading threads) after we just
> > > produced it should be minimal, so we grab all the advantages: locality
> > > of reference, minimal contention as DWARF reading/creating the pahole
> > > internal, neutral, data structures should be higher than adding
> > > types/functions/variables via the libbpf BTF API.
> >
> > I disagree, I think contention might be noticeable because merging
> > BTFs is still a relatively expensive/slow operation. But feel free to
> > start with that, I just thought that doing per-thread struct btf
> > wouldn't add any complexity, which is why I mentioned that.
> >
> > >
> > > I.e. we can leave paralellizing the BTF _encoding_ for later, what we're
> > > trying to do now is to paralellize the DWARF _loading_, right?
> >
> > We are trying to speed up DWARF-to-BTF generation in general, not
> > specifically DWARF loading. DWARF loading is an obvious most expensive
> > part, string deduplication is the next one, if you look at profiling
> > data. The third one will be btf__dedup, which is why I mentioned that
> > it might be faster still to do pre-dedup in each thread at the very
> > end, right before we do final dedup. Each individual dedup will
> > probably significantly reduce total size of data/strings, so I have a
> > feeling that it will result in a very nice speed-ups in the end.
> >
> > So just my 2 cents.
> >
> > >
> > > > So please consider that, it also seems simpler overall.
> > >
> > > > > The ordering will be different than what we have now, as some smaller
> > > > > CUs (object files with debug) will be processed faster so will get its
> > > > > btf encoding slot faster, but that, at btf__dedup() time shouldn't make
> > > > > a difference, right?
> > >
> > > > Right, order doesn't matter.
> > >
> > > > > I think I'm done with refactoring the btf_encoder code, which should be
> > > > > by now a thin layer on top of the excellent libbpf BTF API, just getting
> > > > > what the previous loader (DWARF) produced and feeding libbpf.
> > >
> > > > Great.
> > >
> > > > > I thought about fancy thread pools, etc, researching some pre-existing
> > > > > thing or doing some kthread + workqueue lifting from the kernel but will
> > > > > instead start with the most spartan code, we can improve later.
> > >
> > > > Agree, simple is good. Really curious how much faster we can get. I
> > > > think anything fancy will give a relatively small improvement. The
> > > > biggest one will come from any parallelization.
> > >
> > > And I think that is possible, modulo elfutils libraries saying no, I
> > > hope that will not be the case.
> >
> > You can't imagine how eagerly I'm awaiting this bright future of
> > faster BTF generation step in the kernel build process. :)
> >
> > >
> > > - Arnaldo
>
> --
>
> - Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-05-27 15:20 [RFT] Testing 1.22 Arnaldo Carvalho de Melo
  2021-05-27 16:54 ` Andrii Nakryiko
  2021-06-02 10:21 ` Michael Petlan
@ 2021-07-15 21:31 ` Michal Suchánek
  2021-07-16 13:25   ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 44+ messages in thread
From: Michal Suchánek @ 2021-07-15 21:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Jiri Olsa, dwarves, bpf, kernel-team, Michael Petlan

Hello,

when building with system libbpf I get:

[   40s] make[1]: Nothing to be done for 'preinstall'.
[   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
[   40s] Install the project...
[   40s] /usr/bin/cmake -P cmake_install.cmake
[   40s] -- Install configuration: "RelWithDebInfo"
[   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
[   40s] CMake Error at cmake_install.cmake:63 (file):
[   40s]   file RPATH_CHANGE could not write new RPATH:
[   40s] 
[   40s]     
[   40s] 
[   40s]   to the file:
[   40s] 
[   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
[   40s] 
[   40s]   The current RUNPATH is:
[   40s] 
[   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
[   40s] 
[   40s]   which does not contain:
[   40s] 
[   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
[   40s] 
[   40s]   as was expected.
[   40s] 
[   40s] 
[   40s] make: *** [Makefile:74: install] Error 1
[   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
[   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)

This is not a problem with embedded libbpf.

Using system libbpf seems to be new in 1.22

Thanks

Michal

On Thu, May 27, 2021 at 12:20:53PM -0300, Arnaldo Carvalho de Melo wrote:
> Hi guys,
> 
> 	Its important to have 1.22 out of the door ASAP, so please clone
> what is in tmp.master and report your results.
> 
> 	To make it super easy:
> 
> [acme@quaco pahole]$ cd /tmp
> [acme@quaco tmp]$ git clone git://git.kernel.org/pub/scm/devel/pahole/pahole.git
> Cloning into 'pahole'...
> remote: Enumerating objects: 6510, done.
> remote: Total 6510 (delta 0), reused 0 (delta 0), pack-reused 6510
> Receiving objects: 100% (6510/6510), 1.63 MiB | 296.00 KiB/s, done.
> Resolving deltas: 100% (4550/4550), done.
> [acme@quaco tmp]$ cd pahole/
> [acme@quaco pahole]$ git checkout origin/tmp.master
> Note: switching to 'origin/tmp.master'.
> 
> You are in 'detached HEAD' state. You can look around, make experimental
> changes and commit them, and you can discard any commits you make in this
> state without impacting any branches by switching back to a branch.
> 
> If you want to create a new branch to retain commits you create, you may
> do so (now or later) by using -c with the switch command. Example:
> 
>   git switch -c <new-branch-name>
> 
> Or undo this operation with:
> 
>   git switch -
> 
> Turn off this advice by setting config variable advice.detachedHead to false
> 
> HEAD is now at 0d17503db0580a66 btf_encoder: fix and complete filtering out zero-sized per-CPU variables
> [acme@quaco pahole]$ git log --oneline -5
> 0d17503db0580a66 (HEAD, origin/tmp.master) btf_encoder: fix and complete filtering out zero-sized per-CPU variables
> fb418f9d8384d3a9 dwarves: Make handling of NULL by destructos consistent
> f049fe9ebf7aa9c2 dutil: Make handling of NULL by destructos consistent
> 1512ab8ab6fe76a9 pahole: Make handling of NULL by destructos consistent
> 1105b7dad2d0978b elf_symtab: Use zfree() where applicable
> [acme@quaco pahole]$ mkdir build
> [acme@quaco pahole]$ cd build
> [acme@quaco build]$ cmake ..
> <SNIP>
> -- Build files have been written to: /tmp/pahole/build
> [acme@quaco build]$ cd ..
> [acme@quaco pahole]$ make -j8 -C build
> make: Entering directory '/tmp/pahole/build'
> <SNIP>
> [100%] Built target pahole
> make[1]: Leaving directory '/tmp/pahole/build'
> make: Leaving directory '/tmp/pahole/build'
> [acme@quaco pahole]$
> 
> Then make sure build/pahole is in your path and try your workloads.
> 
> Jiri, Michael, if you could run your tests with this, that would be awesome,
> 
> Thanks in advance!
> 
> - Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-15 21:31 ` Michal Suchánek
@ 2021-07-16 13:25   ` Arnaldo Carvalho de Melo
  2021-07-16 13:35     ` Luca Boccassi
  2021-07-19 12:10     ` Luca Boccassi
  0 siblings, 2 replies; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-07-16 13:25 UTC (permalink / raw)
  To: Luca Boccassi
  Cc: Michal Suchánek, Andrii Nakryiko, Jiri Olsa, dwarves, bpf,
	kernel-team, Michael Petlan

Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> Hello,
> 
> when building with system libbpf I get:
> 
> [   40s] make[1]: Nothing to be done for 'preinstall'.
> [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> [   40s] Install the project...
> [   40s] /usr/bin/cmake -P cmake_install.cmake
> [   40s] -- Install configuration: "RelWithDebInfo"
> [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> [   40s] CMake Error at cmake_install.cmake:63 (file):
> [   40s]   file RPATH_CHANGE could not write new RPATH:
> [   40s] 
> [   40s]     
> [   40s] 
> [   40s]   to the file:
> [   40s] 
> [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> [   40s] 
> [   40s]   The current RUNPATH is:
> [   40s] 
> [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> [   40s] 
> [   40s]   which does not contain:
> [   40s] 
> [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> [   40s] 
> [   40s]   as was expected.
> [   40s] 
> [   40s] 
> [   40s] make: *** [Makefile:74: install] Error 1
> [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> 
> This is not a problem with embedded libbpf.
> 
> Using system libbpf seems to be new in 1.22

Lucca, can you take a look at this?

Thanks,

- Arnaldo
 
> On Thu, May 27, 2021 at 12:20:53PM -0300, Arnaldo Carvalho de Melo wrote:
> > Hi guys,
> > 
> > 	Its important to have 1.22 out of the door ASAP, so please clone
> > what is in tmp.master and report your results.
> > 
> > 	To make it super easy:
> > 
> > [acme@quaco pahole]$ cd /tmp
> > [acme@quaco tmp]$ git clone git://git.kernel.org/pub/scm/devel/pahole/pahole.git
> > Cloning into 'pahole'...
> > remote: Enumerating objects: 6510, done.
> > remote: Total 6510 (delta 0), reused 0 (delta 0), pack-reused 6510
> > Receiving objects: 100% (6510/6510), 1.63 MiB | 296.00 KiB/s, done.
> > Resolving deltas: 100% (4550/4550), done.
> > [acme@quaco tmp]$ cd pahole/
> > [acme@quaco pahole]$ git checkout origin/tmp.master
> > Note: switching to 'origin/tmp.master'.
> > 
> > You are in 'detached HEAD' state. You can look around, make experimental
> > changes and commit them, and you can discard any commits you make in this
> > state without impacting any branches by switching back to a branch.
> > 
> > If you want to create a new branch to retain commits you create, you may
> > do so (now or later) by using -c with the switch command. Example:
> > 
> >   git switch -c <new-branch-name>
> > 
> > Or undo this operation with:
> > 
> >   git switch -
> > 
> > Turn off this advice by setting config variable advice.detachedHead to false
> > 
> > HEAD is now at 0d17503db0580a66 btf_encoder: fix and complete filtering out zero-sized per-CPU variables
> > [acme@quaco pahole]$ git log --oneline -5
> > 0d17503db0580a66 (HEAD, origin/tmp.master) btf_encoder: fix and complete filtering out zero-sized per-CPU variables
> > fb418f9d8384d3a9 dwarves: Make handling of NULL by destructos consistent
> > f049fe9ebf7aa9c2 dutil: Make handling of NULL by destructos consistent
> > 1512ab8ab6fe76a9 pahole: Make handling of NULL by destructos consistent
> > 1105b7dad2d0978b elf_symtab: Use zfree() where applicable
> > [acme@quaco pahole]$ mkdir build
> > [acme@quaco pahole]$ cd build
> > [acme@quaco build]$ cmake ..
> > <SNIP>
> > -- Build files have been written to: /tmp/pahole/build
> > [acme@quaco build]$ cd ..
> > [acme@quaco pahole]$ make -j8 -C build
> > make: Entering directory '/tmp/pahole/build'
> > <SNIP>
> > [100%] Built target pahole
> > make[1]: Leaving directory '/tmp/pahole/build'
> > make: Leaving directory '/tmp/pahole/build'
> > [acme@quaco pahole]$
> > 
> > Then make sure build/pahole is in your path and try your workloads.
> > 
> > Jiri, Michael, if you could run your tests with this, that would be awesome,
> > 
> > Thanks in advance!
> > 
> > - Arnaldo

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-16 13:25   ` Arnaldo Carvalho de Melo
@ 2021-07-16 13:35     ` Luca Boccassi
  2021-07-16 19:08       ` Luca Boccassi
  2021-07-19 12:10     ` Luca Boccassi
  1 sibling, 1 reply; 44+ messages in thread
From: Luca Boccassi @ 2021-07-16 13:35 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Michal Suchánek, Andrii Nakryiko, Jiri Olsa, dwarves, bpf,
	kernel-team, Michael Petlan

[-- Attachment #1: Type: text/plain, Size: 1888 bytes --]

On Fri, 2021-07-16 at 10:25 -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> > Hello,
> > 
> > when building with system libbpf I get:
> > 
> > [   40s] make[1]: Nothing to be done for 'preinstall'.
> > [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > [   40s] Install the project...
> > [   40s] /usr/bin/cmake -P cmake_install.cmake
> > [   40s] -- Install configuration: "RelWithDebInfo"
> > [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > [   40s] CMake Error at cmake_install.cmake:63 (file):
> > [   40s]   file RPATH_CHANGE could not write new RPATH:
> > [   40s] 
> > [   40s]     
> > [   40s] 
> > [   40s]   to the file:
> > [   40s] 
> > [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > [   40s] 
> > [   40s]   The current RUNPATH is:
> > [   40s] 
> > [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > [   40s] 
> > [   40s]   which does not contain:
> > [   40s] 
> > [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > [   40s] 
> > [   40s]   as was expected.
> > [   40s] 
> > [   40s] 
> > [   40s] make: *** [Makefile:74: install] Error 1
> > [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> > 
> > This is not a problem with embedded libbpf.
> > 
> > Using system libbpf seems to be new in 1.22
> 
> Lucca, can you take a look at this?
> 
> Thanks,
> 
> - Arnaldo

Hi,

Michal, what is the CMake configuration command line you are using?

-- 
Kind regards,
Luca Boccassi

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-16 13:35     ` Luca Boccassi
@ 2021-07-16 19:08       ` Luca Boccassi
       [not found]         ` <20210716201248.GL24916@kitsune.suse.cz>
  0 siblings, 1 reply; 44+ messages in thread
From: Luca Boccassi @ 2021-07-16 19:08 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa, dwarves,
	bpf, kernel-team, Michael Petlan

[-- Attachment #1: Type: text/plain, Size: 2204 bytes --]

On Fri, 2021-07-16 at 14:35 +0100, Luca Boccassi wrote:
> On Fri, 2021-07-16 at 10:25 -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> > > Hello,
> > > 
> > > when building with system libbpf I get:
> > > 
> > > [   40s] make[1]: Nothing to be done for 'preinstall'.
> > > [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > [   40s] Install the project...
> > > [   40s] /usr/bin/cmake -P cmake_install.cmake
> > > [   40s] -- Install configuration: "RelWithDebInfo"
> > > [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > [   40s] CMake Error at cmake_install.cmake:63 (file):
> > > [   40s]   file RPATH_CHANGE could not write new RPATH:
> > > [   40s] 
> > > [   40s]     
> > > [   40s] 
> > > [   40s]   to the file:
> > > [   40s] 
> > > [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > [   40s] 
> > > [   40s]   The current RUNPATH is:
> > > [   40s] 
> > > [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > [   40s] 
> > > [   40s]   which does not contain:
> > > [   40s] 
> > > [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > [   40s] 
> > > [   40s]   as was expected.
> > > [   40s] 
> > > [   40s] 
> > > [   40s] make: *** [Makefile:74: install] Error 1
> > > [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> > > 
> > > This is not a problem with embedded libbpf.
> > > 
> > > Using system libbpf seems to be new in 1.22
> > 
> > Lucca, can you take a look at this?
> > 
> > Thanks,
> > 
> > - Arnaldo
> 
> Hi,
> 
> Michal, what is the CMake configuration command line you are using?

Latest tmp.master builds fine here with libbpf 0.4.0. To reproduce it
would be good to know build env and command line used, otherwise I
can't really tell.

-- 
Kind regards,
Luca Boccassi

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
       [not found]         ` <20210716201248.GL24916@kitsune.suse.cz>
@ 2021-07-17 14:35           ` Luca Boccassi
  2021-07-17 15:10             ` Michal Suchánek
  0 siblings, 1 reply; 44+ messages in thread
From: Luca Boccassi @ 2021-07-17 14:35 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa, dwarves,
	bpf, kernel-team, Michael Petlan

[-- Attachment #1: Type: text/plain, Size: 7003 bytes --]

On Fri, 2021-07-16 at 22:12 +0200, Michal Suchánek wrote:
> On Fri, Jul 16, 2021 at 08:08:27PM +0100, Luca Boccassi wrote:
> > On Fri, 2021-07-16 at 14:35 +0100, Luca Boccassi wrote:
> > > On Fri, 2021-07-16 at 10:25 -0300, Arnaldo Carvalho de Melo wrote:
> > > > Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> > > > > Hello,
> > > > > 
> > > > > when building with system libbpf I get:
> > > > > 
> > > > > [   40s] make[1]: Nothing to be done for 'preinstall'.
> > > > > [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > [   40s] Install the project...
> > > > > [   40s] /usr/bin/cmake -P cmake_install.cmake
> > > > > [   40s] -- Install configuration: "RelWithDebInfo"
> > > > > [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > [   40s] CMake Error at cmake_install.cmake:63 (file):
> > > > > [   40s]   file RPATH_CHANGE could not write new RPATH:
> > > > > [   40s] 
> > > > > [   40s]     
> > > > > [   40s] 
> > > > > [   40s]   to the file:
> > > > > [   40s] 
> > > > > [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > [   40s] 
> > > > > [   40s]   The current RUNPATH is:
> > > > > [   40s] 
> > > > > [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > [   40s] 
> > > > > [   40s]   which does not contain:
> > > > > [   40s] 
> > > > > [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > [   40s] 
> > > > > [   40s]   as was expected.
> > > > > [   40s] 
> > > > > [   40s] 
> > > > > [   40s] make: *** [Makefile:74: install] Error 1
> > > > > [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> > > > > 
> > > > > This is not a problem with embedded libbpf.
> > > > > 
> > > > > Using system libbpf seems to be new in 1.22
> > > > 
> > > > Lucca, can you take a look at this?
> > > > 
> > > > Thanks,
> > > > 
> > > > - Arnaldo
> > > 
> > > Hi,
> > > 
> > > Michal, what is the CMake configuration command line you are using?
> > 
> > Latest tmp.master builds fine here with libbpf 0.4.0. To reproduce it
> > would be good to know build env and command line used, otherwise I
> > can't really tell.
> 
> Indeed, there is non-trivial rpm macro expanded when invoking cmake.
> 
> Attaching log.
> 
> Thanks
> 
> Michal

So, this took some spelunking. TL;DR: openSUSE's libbpf.pc from libbpf-
devel is broken, it lists -L/usr/local/lib64 in Libs even though it
doesn't install anything in that prefix. Fix it to list the right path
and the problem goes away.

Longer version: CMake is complaining that the rpath in the binary is
not what it expected it to be when stripping it. Of course the first
question is, why does that matter since it's removing it? Just remove
it, no? Another CMake weirdness to add the the list, I guess.

[   20s]   file RPATH_CHANGE could not write new RPATH:
[   20s] 
[   20s]     
[   20s] 
[   20s]   to the file:
[   20s] 
[   20s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-
1.21+git175.1ef87b2-21.1.x86_64/usr/bin/codiff
[   20s] 
[   20s]   The current RUNPATH is:
[   20s] 
[   20s]     /home/abuild/rpmbuild/BUILD/dwarves-
1.21+git175.1ef87b2/build:
[   20s] 
[   20s]   which does not contain:
[   20s] 
[   20s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
1.21+git175.1ef87b2/build:
[   20s] 
[   20s]   as was expected.

This is the linking step where the rpath is set:

[   19s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
protection -Werror=return-type -flto=auto -g -DNDEBUG
-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
-Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
CMakeFiles/codiff.dir/codiff.c.o -o codiff   -L/usr/local/lib64  -Wl,-
rpath,/usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
1.21+git175.1ef87b2/build: libdwarves.so.1.0.0 -ldw -lelf -lz -lbpf 

On a build without -DLIBBPF_EMBEDDED=off, this is the linking step for
the same binary:

[   64s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
protection -Werror=return-type -flto=auto -g -DNDEBUG
-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
-Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
CMakeFiles/codiff.dir/codiff.c.o -o codiff  -Wl,-
rpath,/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
libdwarves.so.1.0.0 -ldw -lelf -lz

/usr/local/lib64 is not in the rpath. Why? The hint is that
-L/usr/local/lib64 is not passed either, too much of a coincidence to
be unrelated.

Where is it coming from? Well, when using the system's libbpf we are
using the pkgconfig file from the package. It is common to list linker
flags in there, although this one shouldn't be in it. Sure enough,
downloading libbpf-devel from 
https://build.opensuse.org/package/binaries/openSUSE:Factory/libbpf/standard
and extracting the pc file:

$ cat /tmp/libbpf.pc 
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

prefix=/usr/local
libdir=/usr/local/lib64
includedir=${prefix}/include

Name: libbpf
Description: BPF library
Version: 0.3.0
Libs: -L${libdir} -lbpf
Requires.private: libelf zlib
Cflags: -I${includedir}

There it is. Nothing is installed in that path, so it shouldn't be
there in the first place.

$ rpm -qlp /tmp/libbpf0-5.12.4-2.7.x86_64.rpm 
warning: /tmp/libbpf0-5.12.4-2.7.x86_64.rpm: Header V3 RSA/SHA256
Signature, key ID 3dbdc284: NOKEY
/usr/lib64/libbpf.so.0
/usr/lib64/libbpf.so.0.3.0
$ rpm -qlp /tmp/libbpf-devel-5.12.4-2.7.x86_64.rpm 
warning: /tmp/libbpf-devel-5.12.4-2.7.x86_64.rpm: Header V3 RSA/SHA256
Signature, key ID 3dbdc284: NOKEY
/usr/include/bpf
/usr/include/bpf/bpf.h
/usr/include/bpf/bpf_core_read.h
/usr/include/bpf/bpf_endian.h
/usr/include/bpf/bpf_helper_defs.h
/usr/include/bpf/bpf_helpers.h
/usr/include/bpf/bpf_tracing.h
/usr/include/bpf/btf.h
/usr/include/bpf/libbpf.h
/usr/include/bpf/libbpf_common.h
/usr/include/bpf/libbpf_util.h
/usr/include/bpf/xsk.h
/usr/lib64/libbpf.so
/usr/lib64/pkgconfig/libbpf.pc

After hacking it out in the libbpf build:

https://build.opensuse.org/package/show/home:bluca:branches:home:michals/libbpf

Dwarves then builds just fine with system libbpf:

https://build.opensuse.org/package/show/home:bluca:branches:home:michals/dwarves

Here's a PR for the libbpf package that applies a quick hack to fix it:

https://build.opensuse.org/request/show/906834

I'll send my invoice to SUSE Inc. ;-)

-- 
Kind regards,
Luca Boccassi

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-17 14:35           ` Luca Boccassi
@ 2021-07-17 15:10             ` Michal Suchánek
  2021-07-17 15:14               ` Luca Boccassi
  0 siblings, 1 reply; 44+ messages in thread
From: Michal Suchánek @ 2021-07-17 15:10 UTC (permalink / raw)
  To: Luca Boccassi
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa, dwarves,
	bpf, kernel-team, Michael Petlan

Hello,

On Sat, Jul 17, 2021 at 03:35:03PM +0100, Luca Boccassi wrote:
> On Fri, 2021-07-16 at 22:12 +0200, Michal Suchánek wrote:
> > On Fri, Jul 16, 2021 at 08:08:27PM +0100, Luca Boccassi wrote:
> > > On Fri, 2021-07-16 at 14:35 +0100, Luca Boccassi wrote:
> > > > On Fri, 2021-07-16 at 10:25 -0300, Arnaldo Carvalho de Melo wrote:
> > > > > Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> > > > > > Hello,
> > > > > > 
> > > > > > when building with system libbpf I get:
> > > > > > 
> > > > > > [   40s] make[1]: Nothing to be done for 'preinstall'.
> > > > > > [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > > [   40s] Install the project...
> > > > > > [   40s] /usr/bin/cmake -P cmake_install.cmake
> > > > > > [   40s] -- Install configuration: "RelWithDebInfo"
> > > > > > [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > > [   40s] CMake Error at cmake_install.cmake:63 (file):
> > > > > > [   40s]   file RPATH_CHANGE could not write new RPATH:
> > > > > > [   40s] 
> > > > > > [   40s]     
> > > > > > [   40s] 
> > > > > > [   40s]   to the file:
> > > > > > [   40s] 
> > > > > > [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > > [   40s] 
> > > > > > [   40s]   The current RUNPATH is:
> > > > > > [   40s] 
> > > > > > [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > > [   40s] 
> > > > > > [   40s]   which does not contain:
> > > > > > [   40s] 
> > > > > > [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > > [   40s] 
> > > > > > [   40s]   as was expected.
> > > > > > [   40s] 
> > > > > > [   40s] 
> > > > > > [   40s] make: *** [Makefile:74: install] Error 1
> > > > > > [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > > [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> > > > > > 
> > > > > > This is not a problem with embedded libbpf.
> > > > > > 
> > > > > > Using system libbpf seems to be new in 1.22
> > > > > 
> > > > > Lucca, can you take a look at this?
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > - Arnaldo
> > > > 
> > > > Hi,
> > > > 
> > > > Michal, what is the CMake configuration command line you are using?
> > > 
> > > Latest tmp.master builds fine here with libbpf 0.4.0. To reproduce it
> > > would be good to know build env and command line used, otherwise I
> > > can't really tell.
> > 
> > Indeed, there is non-trivial rpm macro expanded when invoking cmake.
> > 
> > Attaching log.
> > 
> > Thanks
> > 
> > Michal
> 
> So, this took some spelunking. TL;DR: openSUSE's libbpf.pc from libbpf-
> devel is broken, it lists -L/usr/local/lib64 in Libs even though it
> doesn't install anything in that prefix. Fix it to list the right path
> and the problem goes away.
> 
> Longer version: CMake is complaining that the rpath in the binary is
> not what it expected it to be when stripping it. Of course the first
> question is, why does that matter since it's removing it? Just remove
> it, no? Another CMake weirdness to add the the list, I guess.
> 
> [   20s]   file RPATH_CHANGE could not write new RPATH:
> [   20s] 
> [   20s]     
> [   20s] 
> [   20s]   to the file:
> [   20s] 
> [   20s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-
> 1.21+git175.1ef87b2-21.1.x86_64/usr/bin/codiff
> [   20s] 
> [   20s]   The current RUNPATH is:
> [   20s] 
> [   20s]     /home/abuild/rpmbuild/BUILD/dwarves-
> 1.21+git175.1ef87b2/build:
> [   20s] 
> [   20s]   which does not contain:
> [   20s] 
> [   20s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
> 1.21+git175.1ef87b2/build:
> [   20s] 
> [   20s]   as was expected.
> 
> This is the linking step where the rpath is set:
> 
> [   19s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
> strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
> protection -Werror=return-type -flto=auto -g -DNDEBUG
> -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
> -Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
> CMakeFiles/codiff.dir/codiff.c.o -o codiff   -L/usr/local/lib64  -Wl,-
> rpath,/usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
> 1.21+git175.1ef87b2/build: libdwarves.so.1.0.0 -ldw -lelf -lz -lbpf 
> 
> On a build without -DLIBBPF_EMBEDDED=off, this is the linking step for
> the same binary:
> 
> [   64s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
> strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
> protection -Werror=return-type -flto=auto -g -DNDEBUG
> -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
> -Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
> CMakeFiles/codiff.dir/codiff.c.o -o codiff  -Wl,-
> rpath,/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> libdwarves.so.1.0.0 -ldw -lelf -lz
> 
> /usr/local/lib64 is not in the rpath. Why? The hint is that
> -L/usr/local/lib64 is not passed either, too much of a coincidence to
> be unrelated.
> 
> Where is it coming from? Well, when using the system's libbpf we are
> using the pkgconfig file from the package. It is common to list linker
> flags in there, although this one shouldn't be in it. Sure enough,
> downloading libbpf-devel from 
> https://build.opensuse.org/package/binaries/openSUSE:Factory/libbpf/standard
> and extracting the pc file:
> 
> $ cat /tmp/libbpf.pc 
> # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> 
> prefix=/usr/local
> libdir=/usr/local/lib64
> includedir=${prefix}/include
> 
> Name: libbpf
> Description: BPF library
> Version: 0.3.0
> Libs: -L${libdir} -lbpf
> Requires.private: libelf zlib
> Cflags: -I${includedir}
> 
> There it is. Nothing is installed in that path, so it shouldn't be
> there in the first place.
> 
> $ rpm -qlp /tmp/libbpf0-5.12.4-2.7.x86_64.rpm 
> warning: /tmp/libbpf0-5.12.4-2.7.x86_64.rpm: Header V3 RSA/SHA256
> Signature, key ID 3dbdc284: NOKEY
> /usr/lib64/libbpf.so.0
> /usr/lib64/libbpf.so.0.3.0

Thanks for the investigation

So this libbpf comes from the kernel, and there is a separate github
repository for libbpf.

Should the kernel ship its own copy of the library?

Seeing that the one in the kernel is 0.3.0 and the required one for
dwarves is 0.4.0 maybe the one in the kernel needs updating if it needs
to be shipped there?

I wil file a bug to build the libbpf from the git repo instead of the
kernel to make the openSUSE libbpf less baroque.

Thanks

Michal

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-17 15:10             ` Michal Suchánek
@ 2021-07-17 15:14               ` Luca Boccassi
  2021-07-17 16:36                 ` Michal Suchánek
  2021-07-19 10:30                 ` Michal Suchánek
  0 siblings, 2 replies; 44+ messages in thread
From: Luca Boccassi @ 2021-07-17 15:14 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa, dwarves,
	bpf, kernel-team, Michael Petlan

[-- Attachment #1: Type: text/plain, Size: 7691 bytes --]

On Sat, 2021-07-17 at 17:10 +0200, Michal Suchánek wrote:
> Hello,
> 
> On Sat, Jul 17, 2021 at 03:35:03PM +0100, Luca Boccassi wrote:
> > On Fri, 2021-07-16 at 22:12 +0200, Michal Suchánek wrote:
> > > On Fri, Jul 16, 2021 at 08:08:27PM +0100, Luca Boccassi wrote:
> > > > On Fri, 2021-07-16 at 14:35 +0100, Luca Boccassi wrote:
> > > > > On Fri, 2021-07-16 at 10:25 -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> > > > > > > Hello,
> > > > > > > 
> > > > > > > when building with system libbpf I get:
> > > > > > > 
> > > > > > > [   40s] make[1]: Nothing to be done for 'preinstall'.
> > > > > > > [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > > > [   40s] Install the project...
> > > > > > > [   40s] /usr/bin/cmake -P cmake_install.cmake
> > > > > > > [   40s] -- Install configuration: "RelWithDebInfo"
> > > > > > > [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > > > [   40s] CMake Error at cmake_install.cmake:63 (file):
> > > > > > > [   40s]   file RPATH_CHANGE could not write new RPATH:
> > > > > > > [   40s] 
> > > > > > > [   40s]     
> > > > > > > [   40s] 
> > > > > > > [   40s]   to the file:
> > > > > > > [   40s] 
> > > > > > > [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > > > [   40s] 
> > > > > > > [   40s]   The current RUNPATH is:
> > > > > > > [   40s] 
> > > > > > > [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > > > [   40s] 
> > > > > > > [   40s]   which does not contain:
> > > > > > > [   40s] 
> > > > > > > [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > > > [   40s] 
> > > > > > > [   40s]   as was expected.
> > > > > > > [   40s] 
> > > > > > > [   40s] 
> > > > > > > [   40s] make: *** [Makefile:74: install] Error 1
> > > > > > > [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > > > [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> > > > > > > 
> > > > > > > This is not a problem with embedded libbpf.
> > > > > > > 
> > > > > > > Using system libbpf seems to be new in 1.22
> > > > > > 
> > > > > > Lucca, can you take a look at this?
> > > > > > 
> > > > > > Thanks,
> > > > > > 
> > > > > > - Arnaldo
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > Michal, what is the CMake configuration command line you are using?
> > > > 
> > > > Latest tmp.master builds fine here with libbpf 0.4.0. To reproduce it
> > > > would be good to know build env and command line used, otherwise I
> > > > can't really tell.
> > > 
> > > Indeed, there is non-trivial rpm macro expanded when invoking cmake.
> > > 
> > > Attaching log.
> > > 
> > > Thanks
> > > 
> > > Michal
> > 
> > So, this took some spelunking. TL;DR: openSUSE's libbpf.pc from libbpf-
> > devel is broken, it lists -L/usr/local/lib64 in Libs even though it
> > doesn't install anything in that prefix. Fix it to list the right path
> > and the problem goes away.
> > 
> > Longer version: CMake is complaining that the rpath in the binary is
> > not what it expected it to be when stripping it. Of course the first
> > question is, why does that matter since it's removing it? Just remove
> > it, no? Another CMake weirdness to add the the list, I guess.
> > 
> > [   20s]   file RPATH_CHANGE could not write new RPATH:
> > [   20s] 
> > [   20s]     
> > [   20s] 
> > [   20s]   to the file:
> > [   20s] 
> > [   20s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-
> > 1.21+git175.1ef87b2-21.1.x86_64/usr/bin/codiff
> > [   20s] 
> > [   20s]   The current RUNPATH is:
> > [   20s] 
> > [   20s]     /home/abuild/rpmbuild/BUILD/dwarves-
> > 1.21+git175.1ef87b2/build:
> > [   20s] 
> > [   20s]   which does not contain:
> > [   20s] 
> > [   20s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
> > 1.21+git175.1ef87b2/build:
> > [   20s] 
> > [   20s]   as was expected.
> > 
> > This is the linking step where the rpath is set:
> > 
> > [   19s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
> > strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
> > protection -Werror=return-type -flto=auto -g -DNDEBUG
> > -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
> > -Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
> > CMakeFiles/codiff.dir/codiff.c.o -o codiff   -L/usr/local/lib64  -Wl,-
> > rpath,/usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
> > 1.21+git175.1ef87b2/build: libdwarves.so.1.0.0 -ldw -lelf -lz -lbpf 
> > 
> > On a build without -DLIBBPF_EMBEDDED=off, this is the linking step for
> > the same binary:
> > 
> > [   64s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
> > strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
> > protection -Werror=return-type -flto=auto -g -DNDEBUG
> > -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
> > -Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
> > CMakeFiles/codiff.dir/codiff.c.o -o codiff  -Wl,-
> > rpath,/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > libdwarves.so.1.0.0 -ldw -lelf -lz
> > 
> > /usr/local/lib64 is not in the rpath. Why? The hint is that
> > -L/usr/local/lib64 is not passed either, too much of a coincidence to
> > be unrelated.
> > 
> > Where is it coming from? Well, when using the system's libbpf we are
> > using the pkgconfig file from the package. It is common to list linker
> > flags in there, although this one shouldn't be in it. Sure enough,
> > downloading libbpf-devel from 
> > https://build.opensuse.org/package/binaries/openSUSE:Factory/libbpf/standard
> > and extracting the pc file:
> > 
> > $ cat /tmp/libbpf.pc 
> > # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> > 
> > prefix=/usr/local
> > libdir=/usr/local/lib64
> > includedir=${prefix}/include
> > 
> > Name: libbpf
> > Description: BPF library
> > Version: 0.3.0
> > Libs: -L${libdir} -lbpf
> > Requires.private: libelf zlib
> > Cflags: -I${includedir}
> > 
> > There it is. Nothing is installed in that path, so it shouldn't be
> > there in the first place.
> > 
> > $ rpm -qlp /tmp/libbpf0-5.12.4-2.7.x86_64.rpm 
> > warning: /tmp/libbpf0-5.12.4-2.7.x86_64.rpm: Header V3 RSA/SHA256
> > Signature, key ID 3dbdc284: NOKEY
> > /usr/lib64/libbpf.so.0
> > /usr/lib64/libbpf.so.0.3.0
> 
> Thanks for the investigation
> 
> So this libbpf comes from the kernel, and there is a separate github
> repository for libbpf.
> 
> Should the kernel ship its own copy of the library?
> 
> Seeing that the one in the kernel is 0.3.0 and the required one for
> dwarves is 0.4.0 maybe the one in the kernel needs updating if it needs
> to be shipped there?
> 
> I wil file a bug to build the libbpf from the git repo instead of the
> kernel to make the openSUSE libbpf less baroque.
> 
> Thanks
> 
> Michal

They provide the same ABI, so there should be only one in the same
distro, the kernel package shouldn't ship its own copy if there's a
standalone package built from the standalone sources.
If you are asking why the sources are still present in the upstream
kernel, I don't know - maybe historical reasons, since that's where it
came from? But AFAIK the majority of distros don't use that anymore.

-- 
Kind regards,
Luca Boccassi

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-17 15:14               ` Luca Boccassi
@ 2021-07-17 16:36                 ` Michal Suchánek
  2021-07-17 16:39                   ` Luca Boccassi
  2021-07-19 10:30                 ` Michal Suchánek
  1 sibling, 1 reply; 44+ messages in thread
From: Michal Suchánek @ 2021-07-17 16:36 UTC (permalink / raw)
  To: Luca Boccassi
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa, dwarves,
	bpf, kernel-team, Michael Petlan

On Sat, Jul 17, 2021 at 04:14:54PM +0100, Luca Boccassi wrote:
> On Sat, 2021-07-17 at 17:10 +0200, Michal Suchánek wrote:
> > Hello,
...
> > 
> > So this libbpf comes from the kernel, and there is a separate github
> > repository for libbpf.
> > 
> > Should the kernel ship its own copy of the library?
> > 
> > Seeing that the one in the kernel is 0.3.0 and the required one for
> > dwarves is 0.4.0 maybe the one in the kernel needs updating if it needs
> > to be shipped there?
> > 
> > I wil file a bug to build the libbpf from the git repo instead of the
> > kernel to make the openSUSE libbpf less baroque.
> 
> They provide the same ABI, so there should be only one in the same
> distro, the kernel package shouldn't ship its own copy if there's a
> standalone package built from the standalone sources.
> If you are asking why the sources are still present in the upstream
> kernel, I don't know - maybe historical reasons, since that's where it
> came from? But AFAIK the majority of distros don't use that anymore.

FWIW the libbpf from github does not work for me with LTO (GCC 11).

Also there is a problem that LIBDIR is /usr/lib64 on arm64 ppc64(le) and
s390x but the library gets installed into /usr/lib by default. For some
reason x86_64 is the only 64bit arch that works out of the box.

Thanks

Michal

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-17 16:36                 ` Michal Suchánek
@ 2021-07-17 16:39                   ` Luca Boccassi
  0 siblings, 0 replies; 44+ messages in thread
From: Luca Boccassi @ 2021-07-17 16:39 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa, dwarves,
	bpf, kernel-team, Michael Petlan

[-- Attachment #1: Type: text/plain, Size: 1670 bytes --]

On Sat, 2021-07-17 at 18:36 +0200, Michal Suchánek wrote:
> On Sat, Jul 17, 2021 at 04:14:54PM +0100, Luca Boccassi wrote:
> > On Sat, 2021-07-17 at 17:10 +0200, Michal Suchánek wrote:
> > > Hello,
> ...
> > > So this libbpf comes from the kernel, and there is a separate github
> > > repository for libbpf.
> > > 
> > > Should the kernel ship its own copy of the library?
> > > 
> > > Seeing that the one in the kernel is 0.3.0 and the required one for
> > > dwarves is 0.4.0 maybe the one in the kernel needs updating if it needs
> > > to be shipped there?
> > > 
> > > I wil file a bug to build the libbpf from the git repo instead of the
> > > kernel to make the openSUSE libbpf less baroque.
> > 
> > They provide the same ABI, so there should be only one in the same
> > distro, the kernel package shouldn't ship its own copy if there's a
> > standalone package built from the standalone sources.
> > If you are asking why the sources are still present in the upstream
> > kernel, I don't know - maybe historical reasons, since that's where it
> > came from? But AFAIK the majority of distros don't use that anymore.
> 
> FWIW the libbpf from github does not work for me with LTO (GCC 11).
> 
> Also there is a problem that LIBDIR is /usr/lib64 on arm64 ppc64(le) and
> s390x but the library gets installed into /usr/lib by default. For some
> reason x86_64 is the only 64bit arch that works out of the box.
> 
> Thanks
> 
> Michal

I don't know about LTO - but for LIBDIR, you can use LIBSUBDIR= like we
do in Debian: 
https://tracker.debian.org/media/packages/libb/libbpf/rules-0.4.0-1

-- 
Kind regards,
Luca Boccassi

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-17 15:14               ` Luca Boccassi
  2021-07-17 16:36                 ` Michal Suchánek
@ 2021-07-19 10:30                 ` Michal Suchánek
  2021-07-19 10:34                   ` Luca Boccassi
  1 sibling, 1 reply; 44+ messages in thread
From: Michal Suchánek @ 2021-07-19 10:30 UTC (permalink / raw)
  To: Luca Boccassi
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa, dwarves,
	bpf, kernel-team, Michael Petlan

On Sat, Jul 17, 2021 at 04:14:54PM +0100, Luca Boccassi wrote:
> On Sat, 2021-07-17 at 17:10 +0200, Michal Suchánek wrote:
> > Hello,
> > 
> > On Sat, Jul 17, 2021 at 03:35:03PM +0100, Luca Boccassi wrote:
> > > On Fri, 2021-07-16 at 22:12 +0200, Michal Suchánek wrote:
> > > > On Fri, Jul 16, 2021 at 08:08:27PM +0100, Luca Boccassi wrote:
> > > > > On Fri, 2021-07-16 at 14:35 +0100, Luca Boccassi wrote:
> > > > > > On Fri, 2021-07-16 at 10:25 -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > > Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> > > > > > > > Hello,
> > > > > > > > 
> > > > > > > > when building with system libbpf I get:
> > > > > > > > 
> > > > > > > > [   40s] make[1]: Nothing to be done for 'preinstall'.
> > > > > > > > [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > > > > [   40s] Install the project...
> > > > > > > > [   40s] /usr/bin/cmake -P cmake_install.cmake
> > > > > > > > [   40s] -- Install configuration: "RelWithDebInfo"
> > > > > > > > [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > > > > [   40s] CMake Error at cmake_install.cmake:63 (file):
> > > > > > > > [   40s]   file RPATH_CHANGE could not write new RPATH:
> > > > > > > > [   40s] 
> > > > > > > > [   40s]     
> > > > > > > > [   40s] 
> > > > > > > > [   40s]   to the file:
> > > > > > > > [   40s] 
> > > > > > > > [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > > > > [   40s] 
> > > > > > > > [   40s]   The current RUNPATH is:
> > > > > > > > [   40s] 
> > > > > > > > [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > > > > [   40s] 
> > > > > > > > [   40s]   which does not contain:
> > > > > > > > [   40s] 
> > > > > > > > [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > > > > [   40s] 
> > > > > > > > [   40s]   as was expected.
> > > > > > > > [   40s] 
> > > > > > > > [   40s] 
> > > > > > > > [   40s] make: *** [Makefile:74: install] Error 1
> > > > > > > > [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > > > > [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> > > > > > > > 
> > > > > > > > This is not a problem with embedded libbpf.
> > > > > > > > 
> > > > > > > > Using system libbpf seems to be new in 1.22
> > > > > > > 
> > > > > > > Lucca, can you take a look at this?
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > 
> > > > > > > - Arnaldo
> > > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > Michal, what is the CMake configuration command line you are using?
> > > > > 
> > > > > Latest tmp.master builds fine here with libbpf 0.4.0. To reproduce it
> > > > > would be good to know build env and command line used, otherwise I
> > > > > can't really tell.
> > > > 
> > > > Indeed, there is non-trivial rpm macro expanded when invoking cmake.
> > > > 
> > > > Attaching log.
> > > > 
> > > > Thanks
> > > > 
> > > > Michal
> > > 
> > > So, this took some spelunking. TL;DR: openSUSE's libbpf.pc from libbpf-
> > > devel is broken, it lists -L/usr/local/lib64 in Libs even though it
> > > doesn't install anything in that prefix. Fix it to list the right path
> > > and the problem goes away.
> > > 
> > > Longer version: CMake is complaining that the rpath in the binary is
> > > not what it expected it to be when stripping it. Of course the first
> > > question is, why does that matter since it's removing it? Just remove
> > > it, no? Another CMake weirdness to add the the list, I guess.
> > > 
> > > [   20s]   file RPATH_CHANGE could not write new RPATH:
> > > [   20s] 
> > > [   20s]     
> > > [   20s] 
> > > [   20s]   to the file:
> > > [   20s] 
> > > [   20s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-
> > > 1.21+git175.1ef87b2-21.1.x86_64/usr/bin/codiff
> > > [   20s] 
> > > [   20s]   The current RUNPATH is:
> > > [   20s] 
> > > [   20s]     /home/abuild/rpmbuild/BUILD/dwarves-
> > > 1.21+git175.1ef87b2/build:
> > > [   20s] 
> > > [   20s]   which does not contain:
> > > [   20s] 
> > > [   20s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
> > > 1.21+git175.1ef87b2/build:
> > > [   20s] 
> > > [   20s]   as was expected.
> > > 
> > > This is the linking step where the rpath is set:
> > > 
> > > [   19s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
> > > strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
> > > protection -Werror=return-type -flto=auto -g -DNDEBUG
> > > -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
> > > -Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
> > > CMakeFiles/codiff.dir/codiff.c.o -o codiff   -L/usr/local/lib64  -Wl,-
> > > rpath,/usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
> > > 1.21+git175.1ef87b2/build: libdwarves.so.1.0.0 -ldw -lelf -lz -lbpf 
> > > 
> > > On a build without -DLIBBPF_EMBEDDED=off, this is the linking step for
> > > the same binary:
> > > 
> > > [   64s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
> > > strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
> > > protection -Werror=return-type -flto=auto -g -DNDEBUG
> > > -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
> > > -Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
> > > CMakeFiles/codiff.dir/codiff.c.o -o codiff  -Wl,-
> > > rpath,/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > libdwarves.so.1.0.0 -ldw -lelf -lz
> > > 
> > > /usr/local/lib64 is not in the rpath. Why? The hint is that
> > > -L/usr/local/lib64 is not passed either, too much of a coincidence to
> > > be unrelated.
> > > 
> > > Where is it coming from? Well, when using the system's libbpf we are
> > > using the pkgconfig file from the package. It is common to list linker
> > > flags in there, although this one shouldn't be in it. Sure enough,
> > > downloading libbpf-devel from 
> > > https://build.opensuse.org/package/binaries/openSUSE:Factory/libbpf/standard
> > > and extracting the pc file:
> > > 
> > > $ cat /tmp/libbpf.pc 
> > > # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> > > 
> > > prefix=/usr/local
> > > libdir=/usr/local/lib64
> > > includedir=${prefix}/include
> > > 
> > > Name: libbpf
> > > Description: BPF library
> > > Version: 0.3.0
> > > Libs: -L${libdir} -lbpf
> > > Requires.private: libelf zlib
> > > Cflags: -I${includedir}
> > > 
> > > There it is. Nothing is installed in that path, so it shouldn't be
> > > there in the first place.
> > > 
> > > $ rpm -qlp /tmp/libbpf0-5.12.4-2.7.x86_64.rpm 
> > > warning: /tmp/libbpf0-5.12.4-2.7.x86_64.rpm: Header V3 RSA/SHA256
> > > Signature, key ID 3dbdc284: NOKEY
> > > /usr/lib64/libbpf.so.0
> > > /usr/lib64/libbpf.so.0.3.0
> > 
> > Thanks for the investigation
> > 
> > So this libbpf comes from the kernel, and there is a separate github
> > repository for libbpf.
> > 
> > Should the kernel ship its own copy of the library?
> > 
> > Seeing that the one in the kernel is 0.3.0 and the required one for
> > dwarves is 0.4.0 maybe the one in the kernel needs updating if it needs
> > to be shipped there?
> > 
> > I wil file a bug to build the libbpf from the git repo instead of the
> > kernel to make the openSUSE libbpf less baroque.
> > 
> > Thanks
> > 
> > Michal
> 
> They provide the same ABI, so there should be only one in the same
> distro, the kernel package shouldn't ship its own copy if there's a
> standalone package built from the standalone sources.
> If you are asking why the sources are still present in the upstream
> kernel, I don't know - maybe historical reasons, since that's where it
> came from? But AFAIK the majority of distros don't use that anymore.
Apparently the libbpf github repository is only mirror of the kernel
libbpf source with some modifications to build it as standalone.

Thanks

Michal

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-19 10:30                 ` Michal Suchánek
@ 2021-07-19 10:34                   ` Luca Boccassi
  0 siblings, 0 replies; 44+ messages in thread
From: Luca Boccassi @ 2021-07-19 10:34 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Arnaldo Carvalho de Melo, Andrii Nakryiko, Jiri Olsa, dwarves,
	bpf, kernel-team, Michael Petlan

[-- Attachment #1: Type: text/plain, Size: 8851 bytes --]

On Mon, 2021-07-19 at 12:30 +0200, Michal Suchánek wrote:
> On Sat, Jul 17, 2021 at 04:14:54PM +0100, Luca Boccassi wrote:
> > On Sat, 2021-07-17 at 17:10 +0200, Michal Suchánek wrote:
> > > Hello,
> > > 
> > > On Sat, Jul 17, 2021 at 03:35:03PM +0100, Luca Boccassi wrote:
> > > > On Fri, 2021-07-16 at 22:12 +0200, Michal Suchánek wrote:
> > > > > On Fri, Jul 16, 2021 at 08:08:27PM +0100, Luca Boccassi wrote:
> > > > > > On Fri, 2021-07-16 at 14:35 +0100, Luca Boccassi wrote:
> > > > > > > On Fri, 2021-07-16 at 10:25 -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > > > Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> > > > > > > > > Hello,
> > > > > > > > > 
> > > > > > > > > when building with system libbpf I get:
> > > > > > > > > 
> > > > > > > > > [   40s] make[1]: Nothing to be done for 'preinstall'.
> > > > > > > > > [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > > > > > [   40s] Install the project...
> > > > > > > > > [   40s] /usr/bin/cmake -P cmake_install.cmake
> > > > > > > > > [   40s] -- Install configuration: "RelWithDebInfo"
> > > > > > > > > [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > > > > > [   40s] CMake Error at cmake_install.cmake:63 (file):
> > > > > > > > > [   40s]   file RPATH_CHANGE could not write new RPATH:
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s]     
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s]   to the file:
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s]   The current RUNPATH is:
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s]   which does not contain:
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s]   as was expected.
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s] 
> > > > > > > > > [   40s] make: *** [Makefile:74: install] Error 1
> > > > > > > > > [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > > > > > > > [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> > > > > > > > > 
> > > > > > > > > This is not a problem with embedded libbpf.
> > > > > > > > > 
> > > > > > > > > Using system libbpf seems to be new in 1.22
> > > > > > > > 
> > > > > > > > Lucca, can you take a look at this?
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > 
> > > > > > > > - Arnaldo
> > > > > > > 
> > > > > > > Hi,
> > > > > > > 
> > > > > > > Michal, what is the CMake configuration command line you are using?
> > > > > > 
> > > > > > Latest tmp.master builds fine here with libbpf 0.4.0. To reproduce it
> > > > > > would be good to know build env and command line used, otherwise I
> > > > > > can't really tell.
> > > > > 
> > > > > Indeed, there is non-trivial rpm macro expanded when invoking cmake.
> > > > > 
> > > > > Attaching log.
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > Michal
> > > > 
> > > > So, this took some spelunking. TL;DR: openSUSE's libbpf.pc from libbpf-
> > > > devel is broken, it lists -L/usr/local/lib64 in Libs even though it
> > > > doesn't install anything in that prefix. Fix it to list the right path
> > > > and the problem goes away.
> > > > 
> > > > Longer version: CMake is complaining that the rpath in the binary is
> > > > not what it expected it to be when stripping it. Of course the first
> > > > question is, why does that matter since it's removing it? Just remove
> > > > it, no? Another CMake weirdness to add the the list, I guess.
> > > > 
> > > > [   20s]   file RPATH_CHANGE could not write new RPATH:
> > > > [   20s] 
> > > > [   20s]     
> > > > [   20s] 
> > > > [   20s]   to the file:
> > > > [   20s] 
> > > > [   20s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-
> > > > 1.21+git175.1ef87b2-21.1.x86_64/usr/bin/codiff
> > > > [   20s] 
> > > > [   20s]   The current RUNPATH is:
> > > > [   20s] 
> > > > [   20s]     /home/abuild/rpmbuild/BUILD/dwarves-
> > > > 1.21+git175.1ef87b2/build:
> > > > [   20s] 
> > > > [   20s]   which does not contain:
> > > > [   20s] 
> > > > [   20s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
> > > > 1.21+git175.1ef87b2/build:
> > > > [   20s] 
> > > > [   20s]   as was expected.
> > > > 
> > > > This is the linking step where the rpath is set:
> > > > 
> > > > [   19s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
> > > > strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
> > > > protection -Werror=return-type -flto=auto -g -DNDEBUG
> > > > -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
> > > > -Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
> > > > CMakeFiles/codiff.dir/codiff.c.o -o codiff   -L/usr/local/lib64  -Wl,-
> > > > rpath,/usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-
> > > > 1.21+git175.1ef87b2/build: libdwarves.so.1.0.0 -ldw -lelf -lz -lbpf 
> > > > 
> > > > On a build without -DLIBBPF_EMBEDDED=off, this is the linking step for
> > > > the same binary:
> > > > 
> > > > [   64s] /usr/bin/cc -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-
> > > > strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-
> > > > protection -Werror=return-type -flto=auto -g -DNDEBUG
> > > > -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -g -DNDEBUG -flto=auto
> > > > -Wl,--as-needed -Wl,--no-undefined -Wl,-z,now -rdynamic
> > > > CMakeFiles/codiff.dir/codiff.c.o -o codiff  -Wl,-
> > > > rpath,/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > > libdwarves.so.1.0.0 -ldw -lelf -lz
> > > > 
> > > > /usr/local/lib64 is not in the rpath. Why? The hint is that
> > > > -L/usr/local/lib64 is not passed either, too much of a coincidence to
> > > > be unrelated.
> > > > 
> > > > Where is it coming from? Well, when using the system's libbpf we are
> > > > using the pkgconfig file from the package. It is common to list linker
> > > > flags in there, although this one shouldn't be in it. Sure enough,
> > > > downloading libbpf-devel from 
> > > > https://build.opensuse.org/package/binaries/openSUSE:Factory/libbpf/standard
> > > > and extracting the pc file:
> > > > 
> > > > $ cat /tmp/libbpf.pc 
> > > > # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> > > > 
> > > > prefix=/usr/local
> > > > libdir=/usr/local/lib64
> > > > includedir=${prefix}/include
> > > > 
> > > > Name: libbpf
> > > > Description: BPF library
> > > > Version: 0.3.0
> > > > Libs: -L${libdir} -lbpf
> > > > Requires.private: libelf zlib
> > > > Cflags: -I${includedir}
> > > > 
> > > > There it is. Nothing is installed in that path, so it shouldn't be
> > > > there in the first place.
> > > > 
> > > > $ rpm -qlp /tmp/libbpf0-5.12.4-2.7.x86_64.rpm 
> > > > warning: /tmp/libbpf0-5.12.4-2.7.x86_64.rpm: Header V3 RSA/SHA256
> > > > Signature, key ID 3dbdc284: NOKEY
> > > > /usr/lib64/libbpf.so.0
> > > > /usr/lib64/libbpf.so.0.3.0
> > > 
> > > Thanks for the investigation
> > > 
> > > So this libbpf comes from the kernel, and there is a separate github
> > > repository for libbpf.
> > > 
> > > Should the kernel ship its own copy of the library?
> > > 
> > > Seeing that the one in the kernel is 0.3.0 and the required one for
> > > dwarves is 0.4.0 maybe the one in the kernel needs updating if it needs
> > > to be shipped there?
> > > 
> > > I wil file a bug to build the libbpf from the git repo instead of the
> > > kernel to make the openSUSE libbpf less baroque.
> > > 
> > > Thanks
> > > 
> > > Michal
> > 
> > They provide the same ABI, so there should be only one in the same
> > distro, the kernel package shouldn't ship its own copy if there's a
> > standalone package built from the standalone sources.
> > If you are asking why the sources are still present in the upstream
> > kernel, I don't know - maybe historical reasons, since that's where it
> > came from? But AFAIK the majority of distros don't use that anymore.
> Apparently the libbpf github repository is only mirror of the kernel
> libbpf source with some modifications to build it as standalone.
> 
> Thanks
> 
> Michal

Yes the source code is mirrored, IIRC the main difference is in the
makefiles which are more standalone-build-friendly than the kernel's.

-- 
Kind regards,
Luca Boccassi

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-16 13:25   ` Arnaldo Carvalho de Melo
  2021-07-16 13:35     ` Luca Boccassi
@ 2021-07-19 12:10     ` Luca Boccassi
  2021-07-19 21:08       ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 44+ messages in thread
From: Luca Boccassi @ 2021-07-19 12:10 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Michal Suchánek, Andrii Nakryiko, Jiri Olsa, dwarves, bpf,
	kernel-team, Michael Petlan

[-- Attachment #1: Type: text/plain, Size: 2118 bytes --]

On Fri, 2021-07-16 at 10:25 -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> > Hello,
> > 
> > when building with system libbpf I get:
> > 
> > [   40s] make[1]: Nothing to be done for 'preinstall'.
> > [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > [   40s] Install the project...
> > [   40s] /usr/bin/cmake -P cmake_install.cmake
> > [   40s] -- Install configuration: "RelWithDebInfo"
> > [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > [   40s] CMake Error at cmake_install.cmake:63 (file):
> > [   40s]   file RPATH_CHANGE could not write new RPATH:
> > [   40s] 
> > [   40s]     
> > [   40s] 
> > [   40s]   to the file:
> > [   40s] 
> > [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > [   40s] 
> > [   40s]   The current RUNPATH is:
> > [   40s] 
> > [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > [   40s] 
> > [   40s]   which does not contain:
> > [   40s] 
> > [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > [   40s] 
> > [   40s]   as was expected.
> > [   40s] 
> > [   40s] 
> > [   40s] make: *** [Makefile:74: install] Error 1
> > [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> > 
> > This is not a problem with embedded libbpf.
> > 
> > Using system libbpf seems to be new in 1.22
> 
> Lucca, can you take a look at this?
> 
> Thanks,
> 
> - Arnaldo

Arnaldo, just to give you a quick summary and close the loop in case
you haven't followed the (very verbose) thread: the root cause was a
packaging issue of libbpf in SUSE, which is now solved, and the
reported build problem is now gone:

https://build.opensuse.org/package/show/home:michals/dwarves

-- 
Kind regards,
Luca Boccassi

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFT] Testing 1.22
  2021-07-19 12:10     ` Luca Boccassi
@ 2021-07-19 21:08       ` Arnaldo Carvalho de Melo
  2021-07-28 10:53         ` Expected release date of v1.22 Deepak Kumar Mishra
  0 siblings, 1 reply; 44+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-07-19 21:08 UTC (permalink / raw)
  To: Luca Boccassi
  Cc: Michal Suchánek, Andrii Nakryiko, Jiri Olsa, dwarves, bpf,
	kernel-team, Michael Petlan

Em Mon, Jul 19, 2021 at 01:10:52PM +0100, Luca Boccassi escreveu:
> On Fri, 2021-07-16 at 10:25 -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Jul 15, 2021 at 11:31:20PM +0200, Michal Suchánek escreveu:
> > > Hello,
> > > 
> > > when building with system libbpf I get:
> > > 
> > > [   40s] make[1]: Nothing to be done for 'preinstall'.
> > > [   40s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > [   40s] Install the project...
> > > [   40s] /usr/bin/cmake -P cmake_install.cmake
> > > [   40s] -- Install configuration: "RelWithDebInfo"
> > > [   40s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > [   40s] CMake Error at cmake_install.cmake:63 (file):
> > > [   40s]   file RPATH_CHANGE could not write new RPATH:
> > > [   40s] 
> > > [   40s]     
> > > [   40s] 
> > > [   40s]   to the file:
> > > [   40s] 
> > > [   40s]     /home/abuild/rpmbuild/BUILDROOT/dwarves-1.21+git175.1ef87b2-15.1.ppc64le/usr/bin/codiff
> > > [   40s] 
> > > [   40s]   The current RUNPATH is:
> > > [   40s] 
> > > [   40s]     /home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > [   40s] 
> > > [   40s]   which does not contain:
> > > [   40s] 
> > > [   40s]     /usr/local/lib64:/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build:
> > > [   40s] 
> > > [   40s]   as was expected.
> > > [   40s] 
> > > [   40s] 
> > > [   40s] make: *** [Makefile:74: install] Error 1
> > > [   40s] make: Leaving directory '/home/abuild/rpmbuild/BUILD/dwarves-1.21+git175.1ef87b2/build'
> > > [   40s] error: Bad exit status from /var/tmp/rpm-tmp.OaGNX4 (%install)
> > > 
> > > This is not a problem with embedded libbpf.
> > > 
> > > Using system libbpf seems to be new in 1.22
> > 
> > Lucca, can you take a look at this?
 
> Arnaldo, just to give you a quick summary and close the loop in case
> you haven't followed the (very verbose) thread: the root cause was a
> packaging issue of libbpf in SUSE, which is now solved, and the
> reported build problem is now gone:
> 
> https://build.opensuse.org/package/show/home:michals/dwarves

Thanks a lot for taking care of this, appreciated.

- Arnaldo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Expected release date of v1.22
  2021-07-19 21:08       ` Arnaldo Carvalho de Melo
@ 2021-07-28 10:53         ` Deepak Kumar Mishra
  2021-07-28 11:21           ` Greg KH
  0 siblings, 1 reply; 44+ messages in thread
From: Deepak Kumar Mishra @ 2021-07-28 10:53 UTC (permalink / raw)
  To: dwarves, acme

Hello All,
I want to pick the latest released version of pahole for our internal integration.
I will be glad if you kindly share the expected time line for release of v1.22 so that I can plan accordingly.
Hope the v1.22 is released soon.

Best Regards
Deepak
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: Expected release date of v1.22
  2021-07-28 10:53         ` Expected release date of v1.22 Deepak Kumar Mishra
@ 2021-07-28 11:21           ` Greg KH
  0 siblings, 0 replies; 44+ messages in thread
From: Greg KH @ 2021-07-28 11:21 UTC (permalink / raw)
  To: Deepak Kumar Mishra; +Cc: dwarves, acme

On Wed, Jul 28, 2021 at 10:53:08AM +0000, Deepak Kumar Mishra wrote:
> IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Now deleted.

Also, not a good idea to send to public email lists...

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2021-07-28 11:21 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-27 15:20 [RFT] Testing 1.22 Arnaldo Carvalho de Melo
2021-05-27 16:54 ` Andrii Nakryiko
2021-05-27 19:04   ` Arnaldo
2021-05-27 19:14     ` Andrii Nakryiko
2021-05-27 19:55       ` Arnaldo
2021-05-27 20:41         ` Andrii Nakryiko
2021-05-27 21:08           ` Jiri Olsa
2021-05-27 21:57             ` Arnaldo
2021-05-28 19:45           ` Arnaldo Carvalho de Melo
2021-05-29  2:36             ` Andrii Nakryiko
2021-06-01 18:31               ` Arnaldo Carvalho de Melo
2021-06-01 18:32                 ` Arnaldo Carvalho de Melo
2021-05-30  0:40             ` Andrii Nakryiko
2021-06-01 18:24               ` Arnaldo Carvalho de Melo
2021-06-03 14:57               ` Arnaldo Carvalho de Melo
2021-06-05  2:55                 ` Andrii Nakryiko
2021-06-07 13:20                   ` Parallelizing vmlinux BTF encoding. was " Arnaldo Carvalho de Melo
2021-06-07 15:42                     ` Yonghong Song
2021-06-08  0:53                     ` Andrii Nakryiko
2021-06-08 12:59                       ` Arnaldo Carvalho de Melo
2021-06-15 19:01                         ` Arnaldo Carvalho de Melo
2021-06-15 19:13                           ` Andrii Nakryiko
2021-06-15 19:38                             ` Arnaldo Carvalho de Melo
2021-06-15 20:05                               ` Andrii Nakryiko
2021-06-15 20:25                                 ` Arnaldo Carvalho de Melo
2021-06-15 21:26                                   ` Andrii Nakryiko
2021-05-30 21:45             ` Jiri Olsa
2021-06-01 19:07               ` Arnaldo Carvalho de Melo
2021-06-02 10:21 ` Michael Petlan
2021-07-15 21:31 ` Michal Suchánek
2021-07-16 13:25   ` Arnaldo Carvalho de Melo
2021-07-16 13:35     ` Luca Boccassi
2021-07-16 19:08       ` Luca Boccassi
     [not found]         ` <20210716201248.GL24916@kitsune.suse.cz>
2021-07-17 14:35           ` Luca Boccassi
2021-07-17 15:10             ` Michal Suchánek
2021-07-17 15:14               ` Luca Boccassi
2021-07-17 16:36                 ` Michal Suchánek
2021-07-17 16:39                   ` Luca Boccassi
2021-07-19 10:30                 ` Michal Suchánek
2021-07-19 10:34                   ` Luca Boccassi
2021-07-19 12:10     ` Luca Boccassi
2021-07-19 21:08       ` Arnaldo Carvalho de Melo
2021-07-28 10:53         ` Expected release date of v1.22 Deepak Kumar Mishra
2021-07-28 11:21           ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.