From: Jiri Olsa <jolsa@kernel.org>
To: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andriin@fb.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
"Martin KaFai Lau" <kafai@fb.com>,
"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"KP Singh" <kpsingh@chromium.org>, "Daniel Xu" <dxu@dxuuu.xyz>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Jesper Brouer" <jbrouer@redhat.com>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
"Viktor Malik" <vmalik@redhat.com>
Subject: [RFC bpf-next 00/16] bpf: Speed up trampoline attach
Date: Thu, 22 Oct 2020 10:21:22 +0200
Message-ID: <20201022082138.2322434-1-jolsa@kernel.org>

hi,
this patchset tries to speed up the attach time for trampolines
and make bpftrace faster for wildcard use cases like:

  # bpftrace -ve 'kfunc:__x64_sys_s* { printf("test\n"); }'

Profiles show that most of the time is spent in the ftrace backend,
because we add trampoline functions one by one and registering an
ftrace direct function is quite expensive. Thus the main change in
this patchset is to allow batch attach and use just a single ftrace
call to attach or detach multiple ips/trampolines.
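
To make the batching argument concrete, here is a toy user-space model, not the ftrace code from this series. It assumes, purely for illustration, that every registration call pays one fixed "update pass" no matter how many ips it carries; struct toy_backend and all function names are hypothetical.

```c
#include <stddef.h>

/* Toy backend (hypothetical): counts how often the expensive step runs. */
struct toy_backend {
	unsigned long nr_ips;     /* addresses registered so far */
	unsigned long nr_updates; /* expensive update passes paid so far */
};

/*
 * Assumption of this model: each call costs one update pass,
 * regardless of how many ips are handed over in that call.
 */
static void toy_register_ips(struct toy_backend *b,
			     const unsigned long *ips, size_t cnt)
{
	(void)ips; /* addresses are not inspected in this model */
	b->nr_ips += cnt;
	b->nr_updates++;
}

/* One call per ip: cnt update passes. */
static void attach_one_by_one(struct toy_backend *b,
			      const unsigned long *ips, size_t cnt)
{
	for (size_t i = 0; i < cnt; i++)
		toy_register_ips(b, &ips[i], 1);
}

/* One call for all ips: a single update pass. */
static void attach_batch(struct toy_backend *b,
			 const unsigned long *ips, size_t cnt)
{
	toy_register_ips(b, ips, cnt);
}
```

With 100 ips, attach_one_by_one() ends with nr_updates == 100 while attach_batch() ends with nr_updates == 1; both register the same 100 addresses. The real cost structure in ftrace is more involved than a single counter, so treat this only as the shape of the argument.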

This patchset also contains other speedup changes that showed
up in profiles:

  - delayed link free
    to bypass detach cycles completely

  - kallsyms rbtree search
    change linear search to rb tree search
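
As a rough user-space illustration of the kallsyms item above (this is not the kernel code from patch 07), the sketch below contrasts a linear name scan with an ordered lookup. It uses bsearch() over a sorted array as a stand-in for the rb tree, since both give a logarithmic number of compares; struct sym, the sample table, its addresses and the function names are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol entry; the kernel's kallsyms layout is different. */
struct sym {
	const char *name;
	unsigned long addr;
};

/* Sample table with made-up addresses, kept sorted by name. */
static const struct sym sample_syms[] = {
	{ "__x64_sys_read",   0x1000 },
	{ "__x64_sys_select", 0x2000 },
	{ "__x64_sys_sendto", 0x3000 },
	{ "__x64_sys_write",  0x4000 },
};

/* Linear search: up to n string compares per lookup. */
static unsigned long lookup_linear(const struct sym *tab, size_t n,
				   const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(tab[i].name, name))
			return tab[i].addr;
	return 0; /* not found */
}

static int sym_cmp(const void *a, const void *b)
{
	return strcmp(((const struct sym *)a)->name,
		      ((const struct sym *)b)->name);
}

/* Ordered search: about log2(n) compares; tab must be sorted by name. */
static unsigned long lookup_sorted(const struct sym *tab, size_t n,
				   const char *name)
{
	struct sym key = { name, 0 };
	const struct sym *s = bsearch(&key, tab, n, sizeof(*tab), sym_cmp);

	return s ? s->addr : 0; /* 0 means not found */
}
```

Both functions return the same address for a given name; the difference only shows up in the number of compares once the table grows to the size of a real symbol table and the lookup is repeated for every attached function.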

To get a clean attach workload I also added a new attach selftest,
which is not meant to be merged but is used to show profile
results.

The following numbers show the speedup after applying each change
on top of the previous ones (each result includes all previous
changes), profiled with: 'perf stat -r 5 -e cycles:k,cycles:u ...'

For bpftrace:

  # bpftrace -ve 'kfunc:__x64_sys_s* { printf("test\n"); } i:ms:10 { printf("exit\n"); exit();}'

  - base

      3,290,457,628      cycles:k        ( +-  0.27% )
        933,581,973      cycles:u        ( +-  0.20% )

        50.25 +- 4.79 seconds time elapsed  ( +-  9.53% )

  + delayed link free

      2,535,458,767      cycles:k        ( +-  0.55% )
        940,046,382      cycles:u        ( +-  0.27% )

        33.60 +- 3.27 seconds time elapsed  ( +-  9.73% )

  + kallsyms rbtree search

      2,199,433,771      cycles:k        ( +-  0.55% )
        936,105,469      cycles:u        ( +-  0.37% )

        26.48 +- 3.57 seconds time elapsed  ( +- 13.49% )

  + batch support

      1,456,854,867      cycles:k        ( +-  0.57% )
        937,737,431      cycles:u        ( +-  0.13% )

        12.44 +- 2.98 seconds time elapsed  ( +- 23.95% )

  + rcu fix

      1,427,959,119      cycles:k        ( +-  0.87% )
        930,833,507      cycles:u        ( +-  0.23% )

        14.53 +- 3.51 seconds time elapsed  ( +- 24.14% )

For attach_test the numbers do not show a direct time speedup when
using the batch support, but they do show a big decrease in kernel
cycles. It seems the time is spent waiting in rcu, which I tried to
address with the (most likely wrong) rcu fix:

  # ./test_progs -t attach_test

  - base

      1,350,136,760      cycles:k        ( +-  0.07% )
         70,591,712      cycles:u        ( +-  0.26% )

        24.26 +- 2.82 seconds time elapsed  ( +- 11.62% )

  + delayed link free

        996,152,309      cycles:k        ( +-  0.37% )
         69,263,150      cycles:u        ( +-  0.50% )

        15.63 +- 1.80 seconds time elapsed  ( +- 11.51% )

  + kallsyms rbtree search

        390,217,706      cycles:k        ( +-  0.66% )
         68,999,019      cycles:u        ( +-  0.46% )

        14.11 +- 2.11 seconds time elapsed  ( +- 14.98% )

  + batch support

         37,410,887      cycles:k        ( +-  0.98% )
         70,062,158      cycles:u        ( +-  0.39% )

        26.80 +- 4.10 seconds time elapsed  ( +- 15.31% )

  + rcu fix

         36,812,432      cycles:k        ( +-  2.52% )
         69,907,191      cycles:u        ( +-  0.38% )

        15.04 +- 2.94 seconds time elapsed  ( +- 19.54% )
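
To put the cycles:k numbers in relative terms, the small helper below divides a baseline by a later measurement. The constants are copied from the base and "+ batch support" rows of the two result sets in this mail; the helper and the constant names are only illustrative.

```c
/* Ratio of a baseline measurement to a later one; above 1.0 is a speedup. */
static double speedup(double base, double after)
{
	return base / after;
}

/* cycles:k values copied from the perf stat results in this mail. */
static const double bpftrace_base_k     = 3290457628.0; /* bpftrace, base */
static const double bpftrace_batch_k    = 1456854867.0; /* bpftrace, + batch support */
static const double attach_test_base_k  = 1350136760.0; /* attach_test, base */
static const double attach_test_batch_k =   37410887.0; /* attach_test, + batch support */
```

speedup(bpftrace_base_k, bpftrace_batch_k) comes out at roughly 2.26 and speedup(attach_test_base_k, attach_test_batch_k) at roughly 36.1. The elapsed-time figures carry a spread of about 10% to 24%, so ratios computed from them are much less reliable than the cycle counts.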

I still need to go through the changes and double check them;
the ftrace changes are also most likely wrong and I most likely
broke a few tests (hence it's RFC), but I wonder whether you guys
would like this batch solution and if there are any thoughts on that.

Also available in
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  bpf/batch

thanks,
jirka
---
Jiri Olsa (16):
      ftrace: Add check_direct_entry function
      ftrace: Add adjust_direct_size function
      ftrace: Add get/put_direct_func function
      ftrace: Add ftrace_set_filter_ips function
      ftrace: Add register_ftrace_direct_ips function
      ftrace: Add unregister_ftrace_direct_ips function
      kallsyms: Use rb tree for kallsyms name search
      bpf: Use delayed link free in bpf_link_put
      bpf: Add BPF_TRAMPOLINE_BATCH_ATTACH support
      bpf: Add BPF_TRAMPOLINE_BATCH_DETACH support
      bpf: Sync uapi bpf.h to tools
      bpf: Move synchronize_rcu_mult for batch processing (NOT TO BE MERGED)
      libbpf: Add trampoline batch attach support
      libbpf: Add trampoline batch detach support
      selftests/bpf: Add trampoline batch test
      selftests/bpf: Add attach batch test (NOT TO BE MERGED)

 include/linux/bpf.h                                       |  18 +++++-
 include/linux/ftrace.h                                    |   7 +++
 include/uapi/linux/bpf.h                                  |   8 +++
 kernel/bpf/syscall.c                                      | 125 ++++++++++++++++++++++++++++++++++----
 kernel/bpf/trampoline.c                                   |  95 +++++++++++++++++++++++------
 kernel/kallsyms.c                                         |  95 ++++++++++++++++++++++++++---
 kernel/trace/ftrace.c                                     | 304 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------
 net/bpf/test_run.c                                        |  55 +++++++++++++++++
 tools/include/uapi/linux/bpf.h                            |   8 +++
 tools/lib/bpf/bpf.c                                       |  24 ++++++++
 tools/lib/bpf/bpf.h                                       |   2 +
 tools/lib/bpf/libbpf.c                                    | 126 ++++++++++++++++++++++++++++++++++++++-
 tools/lib/bpf/libbpf.h                                    |   5 +-
 tools/lib/bpf/libbpf.map                                  |   2 +
 tools/testing/selftests/bpf/prog_tests/attach_test.c      |  27 +++++++++
 tools/testing/selftests/bpf/prog_tests/trampoline_batch.c |  45 ++++++++++++++
 tools/testing/selftests/bpf/progs/attach_test.c           |  62 +++++++++++++++++++
 tools/testing/selftests/bpf/progs/trampoline_batch_test.c |  75 +++++++++++++++++++++++
 18 files changed, 995 insertions(+), 88 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/attach_test.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/trampoline_batch.c
 create mode 100644 tools/testing/selftests/bpf/progs/attach_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/trampoline_batch_test.c

Thread overview: 47+ messages
2020-10-22 8:21 Jiri Olsa [this message]
2020-10-22 8:21 ` [RFC bpf-next 01/16] ftrace: Add check_direct_entry function Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 02/16] ftrace: Add adjust_direct_size function Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 03/16] ftrace: Add get/put_direct_func function Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 04/16] ftrace: Add ftrace_set_filter_ips function Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 05/16] ftrace: Add register_ftrace_direct_ips function Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 06/16] ftrace: Add unregister_ftrace_direct_ips function Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 07/16] kallsyms: Use rb tree for kallsyms name search Jiri Olsa
2020-10-28 18:25 ` Jiri Olsa
2020-10-28 21:15 ` Alexei Starovoitov
2020-10-29 9:29 ` Jiri Olsa
2020-10-29 22:45 ` Andrii Nakryiko
2020-10-28 22:40 ` Andrii Nakryiko
2020-10-29 9:33 ` Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 08/16] bpf: Use delayed link free in bpf_link_put Jiri Olsa
2020-10-23 19:46 ` Andrii Nakryiko
2020-10-25 19:02 ` Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 09/16] bpf: Add BPF_TRAMPOLINE_BATCH_ATTACH support Jiri Olsa
2020-10-23 20:03 ` Andrii Nakryiko
2020-10-23 20:31 ` Steven Rostedt
2020-10-23 22:23 ` Andrii Nakryiko
2020-10-25 19:41 ` Jiri Olsa
2020-10-26 23:19 ` Andrii Nakryiko
2020-10-22 8:21 ` [RFC bpf-next 10/16] bpf: Add BPF_TRAMPOLINE_BATCH_DETACH support Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 11/16] bpf: Sync uapi bpf.h to tools Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 12/16] bpf: Move synchronize_rcu_mult for batch processing (NOT TO BE MERGED) Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 13/16] libbpf: Add trampoline batch attach support Jiri Olsa
2020-10-23 20:09 ` Andrii Nakryiko
2020-10-25 19:11 ` Jiri Olsa
2020-10-26 23:15 ` Andrii Nakryiko
2020-10-27 19:03 ` Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 14/16] libbpf: Add trampoline batch detach support Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 15/16] selftests/bpf: Add trampoline batch test Jiri Olsa
2020-10-22 8:21 ` [RFC bpf-next 16/16] selftests/bpf: Add attach batch test (NOT TO BE MERGED) Jiri Olsa
2020-10-22 13:35 ` [RFC bpf-next 00/16] bpf: Speed up trampoline attach Steven Rostedt
2020-10-22 14:11 ` Jiri Olsa
2020-10-22 14:42 ` Steven Rostedt
2020-10-22 16:21 ` Steven Rostedt
2020-10-22 20:52 ` Steven Rostedt
2020-10-23 6:09 ` Jiri Olsa
2020-10-23 13:50 ` Steven Rostedt
2020-10-25 19:01 ` Jiri Olsa
2020-10-27 4:30 ` Alexei Starovoitov
2020-10-27 13:14 ` Steven Rostedt
2020-10-27 14:28 ` Jiri Olsa
2020-10-28 21:13 ` Alexei Starovoitov
2020-10-29 11:09 ` Jiri Olsa