* [Question] How to reliably get BuildIDs from bpf prog
@ 2022-01-24 20:59 Hao Luo
  2022-01-25  7:07 ` Song Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Hao Luo @ 2022-01-24 20:59 UTC
  To: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann
  Cc: Song Liu, Yonghong Song, Martin KaFai Lau, KP Singh, bpf,
	open list, Jiri Olsa, Blake Jones, Alexey Alexandrov,
	Namhyung Kim, Ian Rogers, pasha.tatashin

Dear BPF experts,

I'm working on collecting some kernel performance data using a BPF
tracing program. Our performance profiling team wants to associate the
data with user stack information. One of the requirements is to
reliably get BuildIDs from bpf_get_stackid() and other similar helpers
[1].

As part of an early investigation, we found a couple of issues that
make bpf_get_stackid() much less reliable than we'd like for our use:

1. The first page of many binaries (which contains the ELF headers and
thus the BuildID that we need) is often not in memory. The failure rate
of find_get_page() (called from build_id_parse()) is higher than we
would want.

2. When anonymous huge pages are used to hold some regions of process
text, build_id_parse() also fails to get a BuildID because
vma->vm_file is NULL.

These two issues are critical blockers for us to use BPF in
production. Can we do better? What do other users do to reliably get
build ids?

Thanks very much,
Hao

[1] https://man7.org/linux/man-pages/man7/bpf-helpers.7.html


* Re: [Question] How to reliably get BuildIDs from bpf prog
  2022-01-24 20:59 [Question] How to reliably get BuildIDs from bpf prog Hao Luo
@ 2022-01-25  7:07 ` Song Liu
  2022-01-25 23:54   ` Hao Luo
  0 siblings, 1 reply; 7+ messages in thread
From: Song Liu @ 2022-01-25  7:07 UTC
  To: Hao Luo
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Song Liu,
	Yonghong Song, Martin KaFai Lau, KP Singh, bpf, open list,
	Jiri Olsa, Blake Jones, Alexey Alexandrov, Namhyung Kim,
	Ian Rogers, pasha.tatashin

On Mon, Jan 24, 2022 at 2:43 PM Hao Luo <haoluo@google.com> wrote:
>
> Dear BPF experts,
>
> I'm working on collecting some kernel performance data using BPF
> tracing prog. Our performance profiling team wants to associate the
> data with user stack information. One of the requirements is to
> reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> [1].
>
> As part of an early investigation, we found that there are a couple
> issues that make bpf_get_stackid() much less reliable than we'd like
> for our use:
>
> 1. The first page of many binaries (which contains the ELF headers and
> thus the BuildID that we need) is often not in memory. The failure of
> find_get_page() (called from build_id_parse()) is higher than we would
> want.

Our top use case calls bpf_get_stack() from NMI context, so there isn't
much we can do. Maybe it is possible to improve it by changing the
layout of the binary and the libraries? Specifically, if the text is
also in the first page, it is likely to stay in memory?

> 2. When anonymous huge pages are used to hold some regions of process
> text, build_id_parse() also fails to get a BuildID because
> vma->vm_file is NULL.

How did the text get into anonymous memory? I guess it is NOT from JIT?
We had a hack to use transparent huge pages for application text. The
hack looks like:

"At run time, the application creates an 8MB temporary buffer and the
hot section of the executable memory is copied to it. The 8MB region in
the executable memory is then converted to a huge page (by way of an
mmap() to anonymous pages and an madvise() to create a huge page), the
data is copied back to it, and it is made executable again using
mprotect()."

If your case is the same (or similar), it can probably be fixed with
CONFIG_READ_ONLY_THP_FOR_FS, and modified user space.
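
One way to check whether a process is in this situation is to look for
executable mappings that have no backing file. A minimal Python sketch,
assuming the usual /proc/<pid>/maps line layout (address range, perms,
offset, dev, inode, optional path); the function name is hypothetical.
Mappings without a file show inode 0 and no path, and bracketed
pseudo-names such as [vdso] are skipped:

```python
def anon_exec_mappings(maps_text):
    """Return (start, end) for executable mappings with no backing file."""
    result = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a mapping line
        addr, perms, _offset, _dev, inode = fields[:5]
        path = fields[5].strip() if len(fields) > 5 else ""
        if "x" not in perms or inode != "0":
            continue  # not executable, or backed by a real inode
        if path.startswith("["):
            continue  # kernel pseudo-mappings such as [vdso]
        start, end = (int(part, 16) for part in addr.split("-"))
        result.append((start, end))
    return result
```

For the current process this could be called as
anon_exec_mappings(open("/proc/self/maps").read()). Any range it reports
is text for which no file, and therefore no build ID, can be looked up.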

Thanks,
Song


* Re: [Question] How to reliably get BuildIDs from bpf prog
  2022-01-25  7:07 ` Song Liu
@ 2022-01-25 23:54   ` Hao Luo
  2022-01-26  0:16     ` Song Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Hao Luo @ 2022-01-25 23:54 UTC
  To: Song Liu
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Song Liu,
	Yonghong Song, Martin KaFai Lau, KP Singh, bpf, open list,
	Jiri Olsa, Blake Jones, Alexey Alexandrov, Namhyung Kim,
	Ian Rogers, pasha.tatashin

Thanks Song for your suggestion.

On Mon, Jan 24, 2022 at 11:08 PM Song Liu <song@kernel.org> wrote:
>
> On Mon, Jan 24, 2022 at 2:43 PM Hao Luo <haoluo@google.com> wrote:
> >
> > Dear BPF experts,
> >
> > I'm working on collecting some kernel performance data using BPF
> > tracing prog. Our performance profiling team wants to associate the
> > data with user stack information. One of the requirements is to
> > reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> > [1].
> >
> > As part of an early investigation, we found that there are a couple
> > issues that make bpf_get_stackid() much less reliable than we'd like
> > for our use:
> >
> > 1. The first page of many binaries (which contains the ELF headers and
> > thus the BuildID that we need) is often not in memory. The failure of
> > find_get_page() (called from build_id_parse()) is higher than we would
> > want.
>
> Our top use case of bpf_get_stack() is called from NMI, so there isn't
> much we can do. Maybe it is possible to improve it by changing the
> layout of the binary and the libraries? Specifically, if the text is
> also in the first page, it is likely to stay in memory?
>

We are seeing 30-40% of stack frames fail to get build IDs because of
this, so this is a place where we could improve build ID reliability.

A few proposals came up when we found this issue. One of them is to
have userspace mlock the first page. This would be the easiest fix, if
it works. Another proposal from Ian Rogers (cc'ed) is to embed the
build ID in the vma. This is an idea similar to [1], but it's unclear
(at least to me) where to store the string. I'm also wondering whether
we can introduce a sleepable version of bpf_get_stack(), if it helps.
When a page is not present, a sleepable bpf_get_stack() can bring in
the page.
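
A rough sketch of the mlock idea from userspace, in Python via ctypes
(a C program would make the same two calls). It maps the first page of
a file read-only and shared, then locks it. Assumptions: Linux, a 64-bit
off_t, and an RLIMIT_MEMLOCK that allows one page; pin_first_page() is a
hypothetical name. Whether this is enough to make the lookup succeed is
exactly the "if it works" above.

```python
import ctypes
import mmap
import os


def pin_first_page(path):
    """mmap() the first page of `path` and mlock() it.

    Returns the mapped address. The mapping is deliberately never
    unmapped, because the lock only lasts while the mapping exists.
    Raises OSError if mmap() or mlock() fails.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mmap.restype = ctypes.c_void_p
    libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                          ctypes.c_int, ctypes.c_int,
                          ctypes.c_long]  # off_t, assumed 64-bit
    libc.mlock.restype = ctypes.c_int
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

    fd = os.open(path, os.O_RDONLY)
    try:
        addr = libc.mmap(None, mmap.PAGESIZE, mmap.PROT_READ,
                         mmap.MAP_SHARED, fd, 0)
        map_errno = ctypes.get_errno()
    finally:
        os.close(fd)  # the mapping stays valid after the fd is closed

    if addr is None or addr == ctypes.c_void_p(-1).value:  # MAP_FAILED
        raise OSError(map_errno, os.strerror(map_errno))
    if libc.mlock(addr, mmap.PAGESIZE) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return addr
```

A long-lived helper process could call this once per binary or library
of interest and keep running so that the locks stay in place.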

[1] https://lwn.net/Articles/867818/

> > 2. When anonymous huge pages are used to hold some regions of process
> > text, build_id_parse() also fails to get a BuildID because
> > vma->vm_file is NULL.
>
> How did the text get in anonymous memory? I guess it is NOT from JIT?
> We had a hack to use transparent huge page for application text. The
> hack looks like:
>
> "At run time, the application creates an 8MB temporary buffer and the
> hot section of the executable memory is copied to it. The 8MB region in
> the executable memory is then converted to a huge page (by way of an
> mmap() to anonymous pages and an madvise() to create a huge page), the
> data is copied back to it, and it is made executable again using
> mprotect()."
>
> If your case is the same (or similar), it can probably be fixed with
> CONFIG_READ_ONLY_THP_FOR_FS, and modified user space.
>

In our use cases, we have text mapped to huge pages that are not
backed by files. vma->vm_file could be NULL or point to some fake file.
This makes it challenging for us to get build IDs for this code text.

> Thanks,
> Song


* Re: [Question] How to reliably get BuildIDs from bpf prog
  2022-01-25 23:54   ` Hao Luo
@ 2022-01-26  0:16     ` Song Liu
  2022-02-04 19:28       ` Hao Luo
  0 siblings, 1 reply; 7+ messages in thread
From: Song Liu @ 2022-01-26  0:16 UTC
  To: Hao Luo
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Song Liu,
	Yonghong Song, Martin KaFai Lau, KP Singh, bpf, open list,
	Jiri Olsa, Blake Jones, Alexey Alexandrov, Namhyung Kim,
	Ian Rogers, pasha.tatashin

On Tue, Jan 25, 2022 at 3:54 PM Hao Luo <haoluo@google.com> wrote:
>
> Thanks Song for your suggestion.
>
> On Mon, Jan 24, 2022 at 11:08 PM Song Liu <song@kernel.org> wrote:
> >
> > On Mon, Jan 24, 2022 at 2:43 PM Hao Luo <haoluo@google.com> wrote:
> > >
> > > Dear BPF experts,
> > >
> > > I'm working on collecting some kernel performance data using BPF
> > > tracing prog. Our performance profiling team wants to associate the
> > > data with user stack information. One of the requirements is to
> > > reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> > > [1].
> > >
> > > As part of an early investigation, we found that there are a couple
> > > issues that make bpf_get_stackid() much less reliable than we'd like
> > > for our use:
> > >
> > > 1. The first page of many binaries (which contains the ELF headers and
> > > thus the BuildID that we need) is often not in memory. The failure of
> > > find_get_page() (called from build_id_parse()) is higher than we would
> > > want.
> >
> > Our top use case of bpf_get_stack() is called from NMI, so there isn't
> > much we can do. Maybe it is possible to improve it by changing the
> > layout of the binary and the libraries? Specifically, if the text is
> > also in the first page, it is likely to stay in memory?
> >
>
> We are seeing 30-40% of stack frames not able to get build ids due to
> this. This is a place where we could improve the reliability of build
> id.
>
> There were a few proposals coming up when we found this issue. One of
> them is to have userspace mlock the first page. This would be the
> easiest fix, if it works. Another proposal from Ian Rogers (cc'ed) is
> to embed build id in vma. This is an idea similar to [1], but it's
> unclear (at least to me) where to store the string. I'm wondering if
> we can introduce a sleepable version of bpf_get_stack() if it helps.
> When a page is not present, sleepable bpf_get_stack() can bring in the
> page.

I guess it is possible to have different flavors of bpf_get_stack().
However, I am not sure whether the actual use case could use sleepable
BPF programs. Our user of bpf_get_stack() is a profiler. The BPF program
is triggered by a perf_event from NMI context, where we really cannot sleep.

If we have a target use case that could sleep, a sleepable bpf_get_stack()
sounds reasonable to me.

>
> [1] https://lwn.net/Articles/867818/
>
> > > 2. When anonymous huge pages are used to hold some regions of process
> > > text, build_id_parse() also fails to get a BuildID because
> > > vma->vm_file is NULL.
> >
> > How did the text get in anonymous memory? I guess it is NOT from JIT?
> > We had a hack to use transparent huge page for application text. The
> > hack looks like:
> >
> > "At run time, the application creates an 8MB temporary buffer and the
> > hot section of the executable memory is copied to it. The 8MB region in
> > the executable memory is then converted to a huge page (by way of an
> > mmap() to anonymous pages and an madvise() to create a huge page), the
> > data is copied back to it, and it is made executable again using
> > mprotect()."
> >
> > If your case is the same (or similar), it can probably be fixed with
> > CONFIG_READ_ONLY_THP_FOR_FS, and modified user space.
> >
>
> In our use cases, we have text mapped to huge pages that are not
> backed by files. vma->vm_file could be null or points some fake file.
> This causes challenges for us on getting build id for these code text.

So, what is the ideal output in these cases? If there isn't a backing file,
we don't really have a good build ID for it, right?

Thanks,
Song


* Re: [Question] How to reliably get BuildIDs from bpf prog
  2022-01-26  0:16     ` Song Liu
@ 2022-02-04 19:28       ` Hao Luo
  2022-02-04 19:37         ` Song Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Hao Luo @ 2022-02-04 19:28 UTC
  To: Song Liu
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Song Liu,
	Yonghong Song, Martin KaFai Lau, KP Singh, bpf, open list,
	Jiri Olsa, Blake Jones, Alexey Alexandrov, Namhyung Kim,
	Ian Rogers, pasha.tatashin

On Tue, Jan 25, 2022 at 4:16 PM Song Liu <song@kernel.org> wrote:
>
> On Tue, Jan 25, 2022 at 3:54 PM Hao Luo <haoluo@google.com> wrote:
> >
> > Thanks Song for your suggestion.
> >
> > On Mon, Jan 24, 2022 at 11:08 PM Song Liu <song@kernel.org> wrote:
> > >
> > > On Mon, Jan 24, 2022 at 2:43 PM Hao Luo <haoluo@google.com> wrote:
> > > >
> > > > Dear BPF experts,
> > > >
> > > > I'm working on collecting some kernel performance data using BPF
> > > > tracing prog. Our performance profiling team wants to associate the
> > > > data with user stack information. One of the requirements is to
> > > > reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> > > > [1].
> > > >
> > > > As part of an early investigation, we found that there are a couple
> > > > issues that make bpf_get_stackid() much less reliable than we'd like
> > > > for our use:
> > > >
> > > > 1. The first page of many binaries (which contains the ELF headers and
> > > > thus the BuildID that we need) is often not in memory. The failure of
> > > > find_get_page() (called from build_id_parse()) is higher than we would
> > > > want.
> > >
> > > Our top use case of bpf_get_stack() is called from NMI, so there isn't
> > > much we can do. Maybe it is possible to improve it by changing the
> > > layout of the binary and the libraries? Specifically, if the text is
> > > also in the first page, it is likely to stay in memory?
> > >
> >
> > We are seeing 30-40% of stack frames not able to get build ids due to
> > this. This is a place where we could improve the reliability of build
> > id.
> >
> > There were a few proposals coming up when we found this issue. One of
> > them is to have userspace mlock the first page. This would be the
> > easiest fix, if it works. Another proposal from Ian Rogers (cc'ed) is
> > to embed build id in vma. This is an idea similar to [1], but it's
> > unclear (at least to me) where to store the string. I'm wondering if
> > we can introduce a sleepable version of bpf_get_stack() if it helps.
> > When a page is not present, sleepable bpf_get_stack() can bring in the
> > page.
>
> I guess it is possible to have different flavors of bpf_get_stack().
> However, I am not sure whether the actual use case could use sleepable
> BPF programs. Our user of bpf_get_stack() is a profiler. The BPF program
> which triggers a perf_event from NMI, where we really cannot sleep.
>
> If we have target use case that could sleep, sleepable bpf_get_stack() sounds
> reasonable to me.
>
> >
> > [1] https://lwn.net/Articles/867818/
> >
> > > > 2. When anonymous huge pages are used to hold some regions of process
> > > > text, build_id_parse() also fails to get a BuildID because
> > > > vma->vm_file is NULL.
> > >
> > > How did the text get in anonymous memory? I guess it is NOT from JIT?
> > > We had a hack to use transparent huge page for application text. The
> > > hack looks like:
> > >
> > > "At run time, the application creates an 8MB temporary buffer and the
> > > hot section of the executable memory is copied to it. The 8MB region in
> > > the executable memory is then converted to a huge page (by way of an
> > > mmap() to anonymous pages and an madvise() to create a huge page), the
> > > data is copied back to it, and it is made executable again using
> > > mprotect()."
> > >
> > > If your case is the same (or similar), it can probably be fixed with
> > > CONFIG_READ_ONLY_THP_FOR_FS, and modified user space.
> > >
> >
> > In our use cases, we have text mapped to huge pages that are not
> > backed by files. vma->vm_file could be null or points some fake file.
> > This causes challenges for us on getting build id for these code text.
>
> So, what is the ideal output in these cases? If there isn't a back file, we
> don't really have good build-id for it, right?
>

Right, I don't have a solution for this case unfortunately. Probably
will just discard the failed frames. :(

But in the case where the problem is the page not being in memory, Song,
do you also see a similarly high rate of build ID parsing failures in
your use case (30-40% of frames)? If not, we may have done something
wrong on our side. If so, is that a problem for your use case?
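
To compare numbers on the same footing, the rate can be computed from
the raw records that bpf_get_stack() returns with BPF_F_USER_BUILD_ID
(or that a stack map created with BPF_F_STACK_BUILD_ID stores). A
minimal Python sketch, assuming a little-endian machine and the
struct bpf_stack_build_id layout from the UAPI header (status, 20-byte
build_id, then offset or ip); the helper name is made up. Frames where
the lookup failed carry status BPF_STACK_BUILD_ID_IP:

```python
import struct

BPF_STACK_BUILD_ID_EMPTY = 0   # unused slot
BPF_STACK_BUILD_ID_VALID = 1   # build_id + file offset are filled in
BPF_STACK_BUILD_ID_IP = 2      # lookup failed, raw ip stored instead

# __s32 status; unsigned char build_id[20]; __u64 offset_or_ip
RECORD = struct.Struct("<i20sQ")


def build_id_failure_rate(buf):
    """Fraction of non-empty frames whose build ID lookup fell back to ip."""
    valid = failed = 0
    usable = len(buf) - len(buf) % RECORD.size  # ignore a trailing partial record
    for status, _build_id, _value in RECORD.iter_unpack(buf[:usable]):
        if status == BPF_STACK_BUILD_ID_VALID:
            valid += 1
        elif status == BPF_STACK_BUILD_ID_IP:
            failed += 1
    total = valid + failed
    return failed / total if total else 0.0
```

Counting per frame rather than per stack matters here, since a single
stack can mix resolved and unresolved frames.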

> Thanks,
> Song


* Re: [Question] How to reliably get BuildIDs from bpf prog
  2022-02-04 19:28       ` Hao Luo
@ 2022-02-04 19:37         ` Song Liu
  2022-02-04 19:42           ` Hao Luo
  0 siblings, 1 reply; 7+ messages in thread
From: Song Liu @ 2022-02-04 19:37 UTC
  To: Hao Luo
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Song Liu,
	Yonghong Song, Martin KaFai Lau, KP Singh, bpf, open list,
	Jiri Olsa, Blake Jones, Alexey Alexandrov, Namhyung Kim,
	Ian Rogers, pasha.tatashin

On Fri, Feb 4, 2022 at 11:29 AM Hao Luo <haoluo@google.com> wrote:
>
> On Tue, Jan 25, 2022 at 4:16 PM Song Liu <song@kernel.org> wrote:
> >
> > On Tue, Jan 25, 2022 at 3:54 PM Hao Luo <haoluo@google.com> wrote:
> > >
> > > Thanks Song for your suggestion.
> > >
> > > On Mon, Jan 24, 2022 at 11:08 PM Song Liu <song@kernel.org> wrote:
> > > >
> > > > On Mon, Jan 24, 2022 at 2:43 PM Hao Luo <haoluo@google.com> wrote:
> > > > >
> > > > > Dear BPF experts,
> > > > >
> > > > > I'm working on collecting some kernel performance data using BPF
> > > > > tracing prog. Our performance profiling team wants to associate the
> > > > > data with user stack information. One of the requirements is to
> > > > > reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> > > > > [1].
> > > > >
> > > > > As part of an early investigation, we found that there are a couple
> > > > > issues that make bpf_get_stackid() much less reliable than we'd like
> > > > > for our use:
> > > > >
> > > > > 1. The first page of many binaries (which contains the ELF headers and
> > > > > thus the BuildID that we need) is often not in memory. The failure of
> > > > > find_get_page() (called from build_id_parse()) is higher than we would
> > > > > want.
> > > >
> > > > Our top use case of bpf_get_stack() is called from NMI, so there isn't
> > > > much we can do. Maybe it is possible to improve it by changing the
> > > > layout of the binary and the libraries? Specifically, if the text is
> > > > also in the first page, it is likely to stay in memory?
> > > >
> > >
> > > We are seeing 30-40% of stack frames not able to get build ids due to
> > > this. This is a place where we could improve the reliability of build
> > > id.
> > >
> > > There were a few proposals coming up when we found this issue. One of
> > > them is to have userspace mlock the first page. This would be the
> > > easiest fix, if it works. Another proposal from Ian Rogers (cc'ed) is
> > > to embed build id in vma. This is an idea similar to [1], but it's
> > > unclear (at least to me) where to store the string. I'm wondering if
> > > we can introduce a sleepable version of bpf_get_stack() if it helps.
> > > When a page is not present, sleepable bpf_get_stack() can bring in the
> > > page.
> >
> > I guess it is possible to have different flavors of bpf_get_stack().
> > However, I am not sure whether the actual use case could use sleepable
> > BPF programs. Our user of bpf_get_stack() is a profiler. The BPF program
> > which triggers a perf_event from NMI, where we really cannot sleep.
> >
> > If we have target use case that could sleep, sleepable bpf_get_stack() sounds
> > reasonable to me.
> >
> > >
> > > [1] https://lwn.net/Articles/867818/
> > >
> > > > > 2. When anonymous huge pages are used to hold some regions of process
> > > > > text, build_id_parse() also fails to get a BuildID because
> > > > > vma->vm_file is NULL.
> > > >
> > > > How did the text get in anonymous memory? I guess it is NOT from JIT?
> > > > We had a hack to use transparent huge page for application text. The
> > > > hack looks like:
> > > >
> > > > "At run time, the application creates an 8MB temporary buffer and the
> > > > hot section of the executable memory is copied to it. The 8MB region in
> > > > the executable memory is then converted to a huge page (by way of an
> > > > mmap() to anonymous pages and an madvise() to create a huge page), the
> > > > data is copied back to it, and it is made executable again using
> > > > mprotect()."
> > > >
> > > > If your case is the same (or similar), it can probably be fixed with
> > > > CONFIG_READ_ONLY_THP_FOR_FS, and modified user space.
> > > >
> > >
> > > In our use cases, we have text mapped to huge pages that are not
> > > backed by files. vma->vm_file could be null or points some fake file.
> > > This causes challenges for us on getting build id for these code text.
> >
> > So, what is the ideal output in these cases? If there isn't a back file, we
> > don't really have good build-id for it, right?
> >
>
> Right, I don't have a solution for this case unfortunately. Probably
> will just discard the failed frames. :(
>
> But in the case where the problem is the page not in mem, Song, do you
> also see a similar high rate of build id parsing failure in your use
> case (30 ~ 40% of frames)? If no, we may have done something wrong on
> our side. If yes, is that a problem for your use case?

The latest data I found (which is not too recent) shows about 3% missing
symbols. I think there must be something different here.

Thanks,
Song


* Re: [Question] How to reliably get BuildIDs from bpf prog
  2022-02-04 19:37         ` Song Liu
@ 2022-02-04 19:42           ` Hao Luo
  0 siblings, 0 replies; 7+ messages in thread
From: Hao Luo @ 2022-02-04 19:42 UTC
  To: Song Liu
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Song Liu,
	Yonghong Song, Martin KaFai Lau, KP Singh, bpf, open list,
	Jiri Olsa, Blake Jones, Alexey Alexandrov, Namhyung Kim,
	Ian Rogers, pasha.tatashin

On Fri, Feb 4, 2022 at 11:37 AM Song Liu <song@kernel.org> wrote:
>
> On Fri, Feb 4, 2022 at 11:29 AM Hao Luo <haoluo@google.com> wrote:
> >
> > On Tue, Jan 25, 2022 at 4:16 PM Song Liu <song@kernel.org> wrote:
> > >
> > > On Tue, Jan 25, 2022 at 3:54 PM Hao Luo <haoluo@google.com> wrote:
> > > >
> > > > Thanks Song for your suggestion.
> > > >
> > > > On Mon, Jan 24, 2022 at 11:08 PM Song Liu <song@kernel.org> wrote:
> > > > >
> > > > > On Mon, Jan 24, 2022 at 2:43 PM Hao Luo <haoluo@google.com> wrote:
> > > > > >
> > > > > > Dear BPF experts,
> > > > > >
> > > > > > I'm working on collecting some kernel performance data using BPF
> > > > > > tracing prog. Our performance profiling team wants to associate the
> > > > > > data with user stack information. One of the requirements is to
> > > > > > reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> > > > > > [1].
> > > > > >
> > > > > > As part of an early investigation, we found that there are a couple
> > > > > > issues that make bpf_get_stackid() much less reliable than we'd like
> > > > > > for our use:
> > > > > >
> > > > > > 1. The first page of many binaries (which contains the ELF headers and
> > > > > > thus the BuildID that we need) is often not in memory. The failure of
> > > > > > find_get_page() (called from build_id_parse()) is higher than we would
> > > > > > want.
> > > > >
> > > > > Our top use case of bpf_get_stack() is called from NMI, so there isn't
> > > > > much we can do. Maybe it is possible to improve it by changing the
> > > > > layout of the binary and the libraries? Specifically, if the text is
> > > > > also in the first page, it is likely to stay in memory?
> > > > >
> > > >
> > > > We are seeing 30-40% of stack frames not able to get build ids due to
> > > > this. This is a place where we could improve the reliability of build
> > > > id.
> > > >
> > > > There were a few proposals coming up when we found this issue. One of
> > > > them is to have userspace mlock the first page. This would be the
> > > > easiest fix, if it works. Another proposal from Ian Rogers (cc'ed) is
> > > > to embed build id in vma. This is an idea similar to [1], but it's
> > > > unclear (at least to me) where to store the string. I'm wondering if
> > > > we can introduce a sleepable version of bpf_get_stack() if it helps.
> > > > When a page is not present, sleepable bpf_get_stack() can bring in the
> > > > page.
> > >
> > > I guess it is possible to have different flavors of bpf_get_stack().
> > > However, I am not sure whether the actual use case could use sleepable
> > > BPF programs. Our user of bpf_get_stack() is a profiler. The BPF program
> > > which triggers a perf_event from NMI, where we really cannot sleep.
> > >
> > > If we have target use case that could sleep, sleepable bpf_get_stack() sounds
> > > reasonable to me.
> > >
> > > >
> > > > [1] https://lwn.net/Articles/867818/
> > > >
> > > > > > 2. When anonymous huge pages are used to hold some regions of process
> > > > > > text, build_id_parse() also fails to get a BuildID because
> > > > > > vma->vm_file is NULL.
> > > > >
> > > > > How did the text get in anonymous memory? I guess it is NOT from JIT?
> > > > > We had a hack to use transparent huge page for application text. The
> > > > > hack looks like:
> > > > >
> > > > > "At run time, the application creates an 8MB temporary buffer and the
> > > > > hot section of the executable memory is copied to it. The 8MB region in
> > > > > the executable memory is then converted to a huge page (by way of an
> > > > > mmap() to anonymous pages and an madvise() to create a huge page), the
> > > > > data is copied back to it, and it is made executable again using
> > > > > mprotect()."
> > > > >
> > > > > If your case is the same (or similar), it can probably be fixed with
> > > > > CONFIG_READ_ONLY_THP_FOR_FS, and modified user space.
> > > > >
> > > >
> > > > In our use cases, we have text mapped to huge pages that are not
> > > > backed by files. vma->vm_file could be null or points some fake file.
> > > > This causes challenges for us on getting build id for these code text.
> > >
> > > So, what is the ideal output in these cases? If there isn't a back file, we
> > > don't really have good build-id for it, right?
> > >
> >
> > Right, I don't have a solution for this case unfortunately. Probably
> > will just discard the failed frames. :(
> >
> > But in the case where the problem is the page not in mem, Song, do you
> > also see a similar high rate of build id parsing failure in your use
> > case (30 ~ 40% of frames)? If no, we may have done something wrong on
> > our side. If yes, is that a problem for your use case?
>
> The latest data I found (which is not too recent) is about 3 % missing symbols.
> I think there must be something different here.
>

Thanks Song! This is interesting. I'll go look at our use cases.

> Thanks,
> Song


Thread overview: 7+ messages
2022-01-24 20:59 [Question] How to reliably get BuildIDs from bpf prog Hao Luo
2022-01-25  7:07 ` Song Liu
2022-01-25 23:54   ` Hao Luo
2022-01-26  0:16     ` Song Liu
2022-02-04 19:28       ` Hao Luo
2022-02-04 19:37         ` Song Liu
2022-02-04 19:42           ` Hao Luo
