bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] store function address in BTF
@ 2021-10-06  8:41 Jiri Olsa
  2021-10-06  8:44 ` Jiri Olsa
  2021-10-06 16:17 ` Andrii Nakryiko
  0 siblings, 2 replies; 10+ messages in thread
From: Jiri Olsa @ 2021-10-06  8:41 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

hi,
I'm hitting performance issue and soft lock ups with the new version
of the patchset and the reason seems to be kallsyms lookup that we
need to do for each btf id we want to attach

I tried to change kallsyms_lookup_name linear search into rbtree search,
but it has its own pitfalls like duplicate function names and it still
seems not to be fast enough when you want to attach like 30k functions

so I wonder we could 'fix this' by storing function address in BTF,
which would cut kallsyms lookup completely, because it'd be done in
compile time

my first thought was to add extra BTF section for that, after discussion
with Arnaldo perhaps we could be able to store extra 8 bytes after
BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
indicate that? or new BTF_KIND_FUNC2 type?

thoughts?

thanks,
jirka


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] store function address in BTF
  2021-10-06  8:41 [RFC] store function address in BTF Jiri Olsa
@ 2021-10-06  8:44 ` Jiri Olsa
  2021-10-06 14:53   ` Alexei Starovoitov
  2021-10-06 16:17 ` Andrii Nakryiko
  1 sibling, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2021-10-06  8:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

On Wed, Oct 06, 2021 at 10:41:41AM +0200, Jiri Olsa wrote:
> hi,
> I'm hitting performance issue and soft lock ups with the new version
> of the patchset and the reason seems to be kallsyms lookup that we
> need to do for each btf id we want to attach

ugh, I meant to sent this as reply to the patchset mentioned above,
nevermind, here's the patchset:
  https://lore.kernel.org/bpf/20210605111034.1810858-1-jolsa@kernel.org/

jirka

> 
> I tried to change kallsyms_lookup_name linear search into rbtree search,
> but it has its own pitfalls like duplicate function names and it still
> seems not to be fast enough when you want to attach like 30k functions
> 
> so I wonder we could 'fix this' by storing function address in BTF,
> which would cut kallsyms lookup completely, because it'd be done in
> compile time
> 
> my first thought was to add extra BTF section for that, after discussion
> with Arnaldo perhaps we could be able to store extra 8 bytes after
> BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
> indicate that? or new BTF_KIND_FUNC2 type?
> 
> thoughts?
> 
> thanks,
> jirka


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] store function address in BTF
  2021-10-06  8:44 ` Jiri Olsa
@ 2021-10-06 14:53   ` Alexei Starovoitov
  2021-10-06 20:07     ` Jiri Olsa
  0 siblings, 1 reply; 10+ messages in thread
From: Alexei Starovoitov @ 2021-10-06 14:53 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

On Wed, Oct 6, 2021 at 1:44 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, Oct 06, 2021 at 10:41:41AM +0200, Jiri Olsa wrote:
> > hi,
> > I'm hitting performance issue and soft lock ups with the new version
> > of the patchset and the reason seems to be kallsyms lookup that we
> > need to do for each btf id we want to attach
>
> ugh, I meant to sent this as reply to the patchset mentioned above,
> nevermind, here's the patchset:
>   https://lore.kernel.org/bpf/20210605111034.1810858-1-jolsa@kernel.org/
>
> jirka
>
> >
> > I tried to change kallsyms_lookup_name linear search into rbtree search,
> > but it has its own pitfalls like duplicate function names and it still
> > seems not to be fast enough when you want to attach like 30k functions
> >
> > so I wonder we could 'fix this' by storing function address in BTF,
> > which would cut kallsyms lookup completely, because it'd be done in
> > compile time
> >
> > my first thought was to add extra BTF section for that, after discussion
> > with Arnaldo perhaps we could be able to store extra 8 bytes after
> > BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
> > indicate that? or new BTF_KIND_FUNC2 type?
> >
> > thoughts?

That would be on top of your next patch set?
Please post it first.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] store function address in BTF
  2021-10-06  8:41 [RFC] store function address in BTF Jiri Olsa
  2021-10-06  8:44 ` Jiri Olsa
@ 2021-10-06 16:17 ` Andrii Nakryiko
  2021-10-06 20:06   ` Jiri Olsa
  1 sibling, 1 reply; 10+ messages in thread
From: Andrii Nakryiko @ 2021-10-06 16:17 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

On Wed, Oct 6, 2021 at 1:42 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> hi,
> I'm hitting performance issue and soft lock ups with the new version
> of the patchset and the reason seems to be kallsyms lookup that we
> need to do for each btf id we want to attach
>
> I tried to change kallsyms_lookup_name linear search into rbtree search,
> but it has its own pitfalls like duplicate function names and it still
> seems not to be fast enough when you want to attach like 30k functions

How not fast enough is it exactly? How long does it take?

>
> so I wonder we could 'fix this' by storing function address in BTF,
> which would cut kallsyms lookup completely, because it'd be done in
> compile time
>
> my first thought was to add extra BTF section for that, after discussion
> with Arnaldo perhaps we could be able to store extra 8 bytes after
> BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
> indicate that? or new BTF_KIND_FUNC2 type?
>
> thoughts?

I'm strongly against this, because (besides the BTF bloat reason) we
need similar mass attachment functionality for kprobe/kretprobe and
that one won't be relying on BTF FUNCs, so I think it's better to
stick to the same mechanism for figuring out the address of the
function.

If RB tree is not feasible, we can do a linear search over unsorted
kallsyms and do binary search over sorted function names (derived from
BTF IDs). That would be O(Nlog(M)), where N is number of ksyms, M is
number of BTF IDs/functions-to-be-attached-to. If we did have an RB
tree for kallsyms (is it hard to support duplicates? why?) it could be
even faster O(Mlog(N)).


>
> thanks,
> jirka
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] store function address in BTF
  2021-10-06 16:17 ` Andrii Nakryiko
@ 2021-10-06 20:06   ` Jiri Olsa
  2021-10-06 21:22     ` Andrii Nakryiko
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2021-10-06 20:06 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

On Wed, Oct 06, 2021 at 09:17:39AM -0700, Andrii Nakryiko wrote:
> On Wed, Oct 6, 2021 at 1:42 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > hi,
> > I'm hitting performance issue and soft lock ups with the new version
> > of the patchset and the reason seems to be kallsyms lookup that we
> > need to do for each btf id we want to attach
> >
> > I tried to change kallsyms_lookup_name linear search into rbtree search,
> > but it has its own pitfalls like duplicate function names and it still
> > seems not to be fast enough when you want to attach like 30k functions
> 
> How not fast enough is it exactly? How long does it take?

30k functions takes 75 seconds for me, it's loop calling bpf_check_attach_target

getting soft lock up messages:

krava33 login: [  168.896671] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [bpftrace:1087]


> 
> >
> > so I wonder we could 'fix this' by storing function address in BTF,
> > which would cut kallsyms lookup completely, because it'd be done in
> > compile time
> >
> > my first thought was to add extra BTF section for that, after discussion
> > with Arnaldo perhaps we could be able to store extra 8 bytes after
> > BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
> > indicate that? or new BTF_KIND_FUNC2 type?
> >
> > thoughts?
> 
> I'm strongly against this, because (besides the BTF bloat reason) we
> need similar mass attachment functionality for kprobe/kretprobe and
> that one won't be relying on BTF FUNCs, so I think it's better to
> stick to the same mechanism for figuring out the address of the
> function.

ok

> 
> If RB tree is not feasible, we can do a linear search over unsorted
> kallsyms and do binary search over sorted function names (derived from
> BTF IDs). That would be O(Nlog(M)), where N is number of ksyms, M is
> number of BTF IDs/functions-to-be-attached-to. If we did have an RB
> tree for kallsyms (is it hard to support duplicates? why?) it could be
> even faster O(Mlog(N)).

I had issues with generic kallsyms rbtree in the post some time ago,
I'll revisit it to check on details.. but having the tree with just
btf id functions might clear that.. I'll check

thanks,
jirka

> 
> 
> >
> > thanks,
> > jirka
> >
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] store function address in BTF
  2021-10-06 14:53   ` Alexei Starovoitov
@ 2021-10-06 20:07     ` Jiri Olsa
  0 siblings, 0 replies; 10+ messages in thread
From: Jiri Olsa @ 2021-10-06 20:07 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

On Wed, Oct 06, 2021 at 07:53:31AM -0700, Alexei Starovoitov wrote:
> On Wed, Oct 6, 2021 at 1:44 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Wed, Oct 06, 2021 at 10:41:41AM +0200, Jiri Olsa wrote:
> > > hi,
> > > I'm hitting performance issue and soft lock ups with the new version
> > > of the patchset and the reason seems to be kallsyms lookup that we
> > > need to do for each btf id we want to attach
> >
> > ugh, I meant to sent this as reply to the patchset mentioned above,
> > nevermind, here's the patchset:
> >   https://lore.kernel.org/bpf/20210605111034.1810858-1-jolsa@kernel.org/
> >
> > jirka
> >
> > >
> > > I tried to change kallsyms_lookup_name linear search into rbtree search,
> > > but it has its own pitfalls like duplicate function names and it still
> > > seems not to be fast enough when you want to attach like 30k functions
> > >
> > > so I wonder we could 'fix this' by storing function address in BTF,
> > > which would cut kallsyms lookup completely, because it'd be done in
> > > compile time
> > >
> > > my first thought was to add extra BTF section for that, after discussion
> > > with Arnaldo perhaps we could be able to store extra 8 bytes after
> > > BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
> > > indicate that? or new BTF_KIND_FUNC2 type?
> > >
> > > thoughts?
> 
> That would be on top of your next patch set?
> Please post it first.

ok, will do

jirka


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] store function address in BTF
  2021-10-06 20:06   ` Jiri Olsa
@ 2021-10-06 21:22     ` Andrii Nakryiko
  2021-10-06 22:05       ` Jiri Olsa
  0 siblings, 1 reply; 10+ messages in thread
From: Andrii Nakryiko @ 2021-10-06 21:22 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

On Wed, Oct 6, 2021 at 1:06 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, Oct 06, 2021 at 09:17:39AM -0700, Andrii Nakryiko wrote:
> > On Wed, Oct 6, 2021 at 1:42 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > hi,
> > > I'm hitting performance issue and soft lock ups with the new version
> > > of the patchset and the reason seems to be kallsyms lookup that we
> > > need to do for each btf id we want to attach
> > >
> > > I tried to change kallsyms_lookup_name linear search into rbtree search,
> > > but it has its own pitfalls like duplicate function names and it still
> > > seems not to be fast enough when you want to attach like 30k functions
> >
> > How not fast enough is it exactly? How long does it take?
>
> 30k functions takes 75 seconds for me, it's loop calling bpf_check_attach_target
>
> getting soft lock up messages:
>
> krava33 login: [  168.896671] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [bpftrace:1087]
>

That's without RB tree right? I was curious about the case of you
converting kallsyms to RB tree and it still being slow. Can't imagine
30k queries against RB tree with ~160k kallsyms taking 75 seconds.

But as I said, why not map BTF IDs into function names, sort function
names, and then pass over kallsyms once, doing binary search into a
sorted array of requested function names and then recording addr for
each. Then check that you found addresses for all functions (it also
leaves a question of what to do when we have multiple matching
functions, but it's a problem with any approach). If everything checks
out, you have a nice btf id -> func name -> func addr mapping. It's
O(N log(M)), which sounds like it shouldn't be slow. Definitely not
multiple seconds slow.


>
> >
> > >
> > > so I wonder we could 'fix this' by storing function address in BTF,
> > > which would cut kallsyms lookup completely, because it'd be done in
> > > compile time
> > >
> > > my first thought was to add extra BTF section for that, after discussion
> > > with Arnaldo perhaps we could be able to store extra 8 bytes after
> > > BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
> > > indicate that? or new BTF_KIND_FUNC2 type?
> > >
> > > thoughts?
> >
> > I'm strongly against this, because (besides the BTF bloat reason) we
> > need similar mass attachment functionality for kprobe/kretprobe and
> > that one won't be relying on BTF FUNCs, so I think it's better to
> > stick to the same mechanism for figuring out the address of the
> > function.
>
> ok
>
> >
> > If RB tree is not feasible, we can do a linear search over unsorted
> > kallsyms and do binary search over sorted function names (derived from
> > BTF IDs). That would be O(Nlog(M)), where N is number of ksyms, M is
> > number of BTF IDs/functions-to-be-attached-to. If we did have an RB
> > tree for kallsyms (is it hard to support duplicates? why?) it could be
> > even faster O(Mlog(N)).
>
> I had issues with generic kallsyms rbtree in the post some time ago,
> I'll revisit it to check on details.. but having the tree with just
> btf id functions might clear that.. I'll check

That's not what I'm proposing. See above. Please let me know if
something is not clear before going all in for RB tree implementation
:)


But while we are on topic, do you think (with ftrace changes you are
doing) it would be hard to support multi-attach for
kprobes/kretprobes? We now have bpf_link interface for attaching
kprobes, so API can be pretty aligned with fentry/fexit, except
instead of btf IDs we'd need to pass array of pointers of C strings, I
suppose.

>
> thanks,
> jirka
>
> >
> >
> > >
> > > thanks,
> > > jirka
> > >
> >
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] store function address in BTF
  2021-10-06 21:22     ` Andrii Nakryiko
@ 2021-10-06 22:05       ` Jiri Olsa
  2021-10-06 22:37         ` Andrii Nakryiko
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2021-10-06 22:05 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

On Wed, Oct 06, 2021 at 02:22:28PM -0700, Andrii Nakryiko wrote:
> On Wed, Oct 6, 2021 at 1:06 PM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Wed, Oct 06, 2021 at 09:17:39AM -0700, Andrii Nakryiko wrote:
> > > On Wed, Oct 6, 2021 at 1:42 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > hi,
> > > > I'm hitting performance issue and soft lock ups with the new version
> > > > of the patchset and the reason seems to be kallsyms lookup that we
> > > > need to do for each btf id we want to attach
> > > >
> > > > I tried to change kallsyms_lookup_name linear search into rbtree search,
> > > > but it has its own pitfalls like duplicate function names and it still
> > > > seems not to be fast enough when you want to attach like 30k functions
> > >
> > > How not fast enough is it exactly? How long does it take?
> >
> > 30k functions takes 75 seconds for me, it's loop calling bpf_check_attach_target
> >
> > getting soft lock up messages:
> >
> > krava33 login: [  168.896671] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [bpftrace:1087]
> >
> 
> That's without RB tree right? I was curious about the case of you
> converting kallsyms to RB tree and it still being slow. Can't imagine
> 30k queries against RB tree with ~160k kallsyms taking 75 seconds.

yep, that's the standard kallsyms lookup api

I need to make some adjustment for rbtree kalsyms code, I think I found
a bug in there, so the numbers are probably better as you suggest

> 
> But as I said, why not map BTF IDs into function names, sort function
> names, and then pass over kallsyms once, doing binary search into a
> sorted array of requested function names and then recording addr for
> each. Then check that you found addresses for all functions (it also
> leaves a question of what to do when we have multiple matching
> functions, but it's a problem with any approach). If everything checks
> out, you have a nice btf id -> func name -> func addr mapping. It's
> O(N log(M)), which sounds like it shouldn't be slow. Definitely not
> multiple seconds slow.

ok, now that's clear to me, thanks for these details

> 
> 
> >
> > >
> > > >
> > > > so I wonder we could 'fix this' by storing function address in BTF,
> > > > which would cut kallsyms lookup completely, because it'd be done in
> > > > compile time
> > > >
> > > > my first thought was to add extra BTF section for that, after discussion
> > > > with Arnaldo perhaps we could be able to store extra 8 bytes after
> > > > BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
> > > > indicate that? or new BTF_KIND_FUNC2 type?
> > > >
> > > > thoughts?
> > >
> > > I'm strongly against this, because (besides the BTF bloat reason) we
> > > need similar mass attachment functionality for kprobe/kretprobe and
> > > that one won't be relying on BTF FUNCs, so I think it's better to
> > > stick to the same mechanism for figuring out the address of the
> > > function.
> >
> > ok
> >
> > >
> > > If RB tree is not feasible, we can do a linear search over unsorted
> > > kallsyms and do binary search over sorted function names (derived from
> > > BTF IDs). That would be O(Nlog(M)), where N is number of ksyms, M is
> > > number of BTF IDs/functions-to-be-attached-to. If we did have an RB
> > > tree for kallsyms (is it hard to support duplicates? why?) it could be
> > > even faster O(Mlog(N)).
> >
> > I had issues with generic kallsyms rbtree in the post some time ago,
> > I'll revisit it to check on details.. but having the tree with just
> > btf id functions might clear that.. I'll check
> 
> That's not what I'm proposing. See above. Please let me know if
> something is not clear before going all in for RB tree implementation
> :)
> 
> 
> But while we are on topic, do you think (with ftrace changes you are
> doing) it would be hard to support multi-attach for
> kprobes/kretprobes? We now have bpf_link interface for attaching
> kprobes, so API can be pretty aligned with fentry/fexit, except
> instead of btf IDs we'd need to pass array of pointers of C strings, I
> suppose.

hum, I think kprobe/kretprobe is made of perf event (kprobe/kretprobe
pmus), then you pass event fd and program fd to bpf link syscall,
and it attaches bpf program to that perf event

so perhaps the user interface would be array of perf events fds and prog fd

also I think you can have just one probe for function, so we will not need
to share kprobes for multiple users like we need for trampolines, so the
attach logic will be simple

jirka

> 
> >
> > thanks,
> > jirka
> >
> > >
> > >
> > > >
> > > > thanks,
> > > > jirka
> > > >
> > >
> >
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] store function address in BTF
  2021-10-06 22:05       ` Jiri Olsa
@ 2021-10-06 22:37         ` Andrii Nakryiko
  2021-10-07  6:55           ` Jiri Olsa
  0 siblings, 1 reply; 10+ messages in thread
From: Andrii Nakryiko @ 2021-10-06 22:37 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

On Wed, Oct 6, 2021 at 3:05 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, Oct 06, 2021 at 02:22:28PM -0700, Andrii Nakryiko wrote:
> > On Wed, Oct 6, 2021 at 1:06 PM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Wed, Oct 06, 2021 at 09:17:39AM -0700, Andrii Nakryiko wrote:
> > > > On Wed, Oct 6, 2021 at 1:42 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > >
> > > > > hi,
> > > > > I'm hitting performance issue and soft lock ups with the new version
> > > > > of the patchset and the reason seems to be kallsyms lookup that we
> > > > > need to do for each btf id we want to attach
> > > > >
> > > > > I tried to change kallsyms_lookup_name linear search into rbtree search,
> > > > > but it has its own pitfalls like duplicate function names and it still
> > > > > seems not to be fast enough when you want to attach like 30k functions
> > > >
> > > > How not fast enough is it exactly? How long does it take?
> > >
> > > 30k functions takes 75 seconds for me, it's loop calling bpf_check_attach_target
> > >
> > > getting soft lock up messages:
> > >
> > > krava33 login: [  168.896671] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [bpftrace:1087]
> > >
> >
> > That's without RB tree right? I was curious about the case of you
> > converting kallsyms to RB tree and it still being slow. Can't imagine
> > 30k queries against RB tree with ~160k kallsyms taking 75 seconds.
>
> yep, that's the standard kallsyms lookup api
>
> I need to make some adjustment for rbtree kalsyms code, I think I found
> a bug in there, so the numbers are probably better as you suggest

ok, cool, let's see what are the new numbers then

>
> >
> > But as I said, why not map BTF IDs into function names, sort function
> > names, and then pass over kallsyms once, doing binary search into a
> > sorted array of requested function names and then recording addr for
> > each. Then check that you found addresses for all functions (it also
> > leaves a question of what to do when we have multiple matching
> > functions, but it's a problem with any approach). If everything checks
> > out, you have a nice btf id -> func name -> func addr mapping. It's
> > O(N log(M)), which sounds like it shouldn't be slow. Definitely not
> > multiple seconds slow.
>
> ok, now that's clear to me, thanks for these details

great

>
> >
> >
> > >
> > > >
> > > > >
> > > > > so I wonder we could 'fix this' by storing function address in BTF,
> > > > > which would cut kallsyms lookup completely, because it'd be done in
> > > > > compile time
> > > > >
> > > > > my first thought was to add extra BTF section for that, after discussion
> > > > > with Arnaldo perhaps we could be able to store extra 8 bytes after
> > > > > BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
> > > > > indicate that? or new BTF_KIND_FUNC2 type?
> > > > >
> > > > > thoughts?
> > > >
> > > > I'm strongly against this, because (besides the BTF bloat reason) we
> > > > need similar mass attachment functionality for kprobe/kretprobe and
> > > > that one won't be relying on BTF FUNCs, so I think it's better to
> > > > stick to the same mechanism for figuring out the address of the
> > > > function.
> > >
> > > ok
> > >
> > > >
> > > > If RB tree is not feasible, we can do a linear search over unsorted
> > > > kallsyms and do binary search over sorted function names (derived from
> > > > BTF IDs). That would be O(Nlog(M)), where N is number of ksyms, M is
> > > > number of BTF IDs/functions-to-be-attached-to. If we did have an RB
> > > > tree for kallsyms (is it hard to support duplicates? why?) it could be
> > > > even faster O(Mlog(N)).
> > >
> > > I had issues with generic kallsyms rbtree in the post some time ago,
> > > I'll revisit it to check on details.. but having the tree with just
> > > btf id functions might clear that.. I'll check
> >
> > That's not what I'm proposing. See above. Please let me know if
> > something is not clear before going all in for RB tree implementation
> > :)
> >
> >
> > But while we are on topic, do you think (with ftrace changes you are
> > doing) it would be hard to support multi-attach for
> > kprobes/kretprobes? We now have bpf_link interface for attaching
> > kprobes, so API can be pretty aligned with fentry/fexit, except
> > instead of btf IDs we'd need to pass array of pointers of C strings, I
> > suppose.
>
> hum, I think kprobe/kretprobe is made of perf event (kprobe/kretprobe
> pmus), then you pass event fd and program fd to bpf link syscall,
> and it attaches bpf program to that perf event
>
> so perhaps the user interface would be array of perf events fds and prog fd
>
> also I think you can have just one probe for function, so we will not need
> to share kprobes for multiple users like we need for trampolines, so the
> attach logic will be simple

creating thousands of perf_event FDs seems expensive (and you'll be
running into the limit of open files pretty soon in a lot of systems).
So I think for multi-attach we'll have to have a separate way where
you'd specify kernel function name (and maybe also offset). I'm just
saying that having BPF_LINK_CREATE command, it's easier (probably) to
extend this for kprobe multi-attach, than trying to retrofit this into
perf_event_open.

>
> jirka
>
> >
> > >
> > > thanks,
> > > jirka
> > >
> > > >
> > > >
> > > > >
> > > > > thanks,
> > > > > jirka
> > > > >
> > > >
> > >
> >
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] store function address in BTF
  2021-10-06 22:37         ` Andrii Nakryiko
@ 2021-10-07  6:55           ` Jiri Olsa
  0 siblings, 0 replies; 10+ messages in thread
From: Jiri Olsa @ 2021-10-07  6:55 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik,
	Arnaldo Carvalho de Melo

On Wed, Oct 06, 2021 at 03:37:16PM -0700, Andrii Nakryiko wrote:
> On Wed, Oct 6, 2021 at 3:05 PM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Wed, Oct 06, 2021 at 02:22:28PM -0700, Andrii Nakryiko wrote:
> > > On Wed, Oct 6, 2021 at 1:06 PM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > On Wed, Oct 06, 2021 at 09:17:39AM -0700, Andrii Nakryiko wrote:
> > > > > On Wed, Oct 6, 2021 at 1:42 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > >
> > > > > > hi,
> > > > > > I'm hitting performance issue and soft lock ups with the new version
> > > > > > of the patchset and the reason seems to be kallsyms lookup that we
> > > > > > need to do for each btf id we want to attach
> > > > > >
> > > > > > I tried to change kallsyms_lookup_name linear search into rbtree search,
> > > > > > but it has its own pitfalls like duplicate function names and it still
> > > > > > seems not to be fast enough when you want to attach like 30k functions
> > > > >
> > > > > How not fast enough is it exactly? How long does it take?
> > > >
> > > > 30k functions takes 75 seconds for me, it's loop calling bpf_check_attach_target
> > > >
> > > > getting soft lock up messages:
> > > >
> > > > krava33 login: [  168.896671] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [bpftrace:1087]
> > > >
> > >
> > > That's without RB tree right? I was curious about the case of you
> > > converting kallsyms to RB tree and it still being slow. Can't imagine
> > > 30k queries against RB tree with ~160k kallsyms taking 75 seconds.
> >
> > yep, that's the standard kallsyms lookup api
> >
> > I need to make some adjustment for rbtree kalsyms code, I think I found
> > a bug in there, so the numbers are probably better as you suggest
> 
> ok, cool, let's see what are the new numbers then
> 
> >
> > >
> > > But as I said, why not map BTF IDs into function names, sort function
> > > names, and then pass over kallsyms once, doing binary search into a
> > > sorted array of requested function names and then recording addr for
> > > each. Then check that you found addresses for all functions (it also
> > > leaves a question of what to do when we have multiple matching
> > > functions, but it's a problem with any approach). If everything checks
> > > out, you have a nice btf id -> func name -> func addr mapping. It's
> > > O(N log(M)), which sounds like it shouldn't be slow. Definitely not
> > > multiple seconds slow.
> >
> > ok, now that's clear to me, thanks for these details
> 
> great
> 
> >
> > >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > so I wonder we could 'fix this' by storing function address in BTF,
> > > > > > which would cut kallsyms lookup completely, because it'd be done in
> > > > > > compile time
> > > > > >
> > > > > > my first thought was to add extra BTF section for that, after discussion
> > > > > > with Arnaldo perhaps we could be able to store extra 8 bytes after
> > > > > > BTF_KIND_FUNC record, using one of the 'unused' bits in btf_type to
> > > > > > indicate that? or new BTF_KIND_FUNC2 type?
> > > > > >
> > > > > > thoughts?
> > > > >
> > > > > I'm strongly against this, because (besides the BTF bloat reason) we
> > > > > need similar mass attachment functionality for kprobe/kretprobe and
> > > > > that one won't be relying on BTF FUNCs, so I think it's better to
> > > > > stick to the same mechanism for figuring out the address of the
> > > > > function.
> > > >
> > > > ok
> > > >
> > > > >
> > > > > If RB tree is not feasible, we can do a linear search over unsorted
> > > > > kallsyms and do binary search over sorted function names (derived from
> > > > > BTF IDs). That would be O(Nlog(M)), where N is number of ksyms, M is
> > > > > number of BTF IDs/functions-to-be-attached-to. If we did have an RB
> > > > > tree for kallsyms (is it hard to support duplicates? why?) it could be
> > > > > even faster O(Mlog(N)).
> > > >
> > > > I had issues with generic kallsyms rbtree in the post some time ago,
> > > > I'll revisit it to check on details.. but having the tree with just
> > > > btf id functions might clear that.. I'll check
> > >
> > > That's not what I'm proposing. See above. Please let me know if
> > > something is not clear before going all in for RB tree implementation
> > > :)
> > >
> > >
> > > But while we are on topic, do you think (with ftrace changes you are
> > > doing) it would be hard to support multi-attach for
> > > kprobes/kretprobes? We now have bpf_link interface for attaching
> > > kprobes, so API can be pretty aligned with fentry/fexit, except
> > > instead of btf IDs we'd need to pass array of pointers of C strings, I
> > > suppose.
> >
> > hum, I think kprobe/kretprobe is made of perf event (kprobe/kretprobe
> > pmus), then you pass event fd and program fd to bpf link syscall,
> > and it attaches bpf program to that perf event
> >
> > so perhaps the user interface would be array of perf events fds and prog fd
> >
> > also I think you can have just one probe for function, so we will not need
> > to share kprobes for multiple users like we need for trampolines, so the
> > attach logic will be simple
> 
> creating thousands of perf_event FDs seems expensive (and you'll be
> running into the limit of open files pretty soon in a lot of systems).
> So I think for multi-attach we'll have to have a separate way where
> you'd specify kernel function name (and maybe also offset). I'm just
> saying that having BPF_LINK_CREATE command, it's easier (probably) to
> extend this for kprobe multi-attach, than trying to retrofit this into
> perf_event_open.

ah true.. I wonder we could bypass perf by using directly kernel
kprobe interface

jirka


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-10-07  6:55 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-06  8:41 [RFC] store function address in BTF Jiri Olsa
2021-10-06  8:44 ` Jiri Olsa
2021-10-06 14:53   ` Alexei Starovoitov
2021-10-06 20:07     ` Jiri Olsa
2021-10-06 16:17 ` Andrii Nakryiko
2021-10-06 20:06   ` Jiri Olsa
2021-10-06 21:22     ` Andrii Nakryiko
2021-10-06 22:05       ` Jiri Olsa
2021-10-06 22:37         ` Andrii Nakryiko
2021-10-07  6:55           ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).