From: Xu Kuohai <xukuohai@huawei.com>
To: Steven Rostedt <rostedt@goodmis.org>,
Florent Revest <revest@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Daniel Borkmann <daniel@iogearbox.net>,
<linux-arm-kernel@lists.infradead.org>,
<linux-kernel@vger.kernel.org>, <bpf@vger.kernel.org>,
Will Deacon <will@kernel.org>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
Ingo Molnar <mingo@redhat.com>, Oleg Nesterov <oleg@redhat.com>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Zi Shen Lim <zlim.lnx@gmail.com>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Ard Biesheuvel <ardb@kernel.org>, Marc Zyngier <maz@kernel.org>,
Guo Ren <guoren@kernel.org>,
Masami Hiramatsu <mhiramat@kernel.org>
Subject: Re: [PATCH bpf-next v2 0/4] Add ftrace direct call for arm64
Date: Thu, 6 Oct 2022 18:09:44 +0800
Message-ID: <fb3973b6-c65e-fb98-7cdf-46c8a4cf0c4d@huawei.com>
In-Reply-To: <20221005113019.18aeda76@gandalf.local.home>
On 10/5/2022 11:30 PM, Steven Rostedt wrote:
> On Wed, 5 Oct 2022 17:10:33 +0200
> Florent Revest <revest@chromium.org> wrote:
>
>> On Wed, Oct 5, 2022 at 5:07 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>>>
>>> On Wed, 5 Oct 2022 22:54:15 +0800
>>> Xu Kuohai <xukuohai@huawei.com> wrote:
>>>
>>>> 1.3 attach bpf prog with direct call, bpftrace -e 'kfunc:vfs_write {}'
>>>>
>>>> # dd if=/dev/zero of=/dev/null count=1000000
>>>> 1000000+0 records in
>>>> 1000000+0 records out
>>>> 512000000 bytes (512 MB, 488 MiB) copied, 1.72973 s, 296 MB/s
>>>>
>>>>
>>>> 1.4 attach bpf prog with indirect call, bpftrace -e 'kfunc:vfs_write {}'
>>>>
>>>> # dd if=/dev/zero of=/dev/null count=1000000
>>>> 1000000+0 records in
>>>> 1000000+0 records out
>>>> 512000000 bytes (512 MB, 488 MiB) copied, 1.99179 s, 257 MB/s
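
To put the two dd runs above in per-call terms, here is a small sketch of
the arithmetic. It assumes each of the 1000000 dd blocks results in exactly
one vfs_write() call and that nothing else differs between the runs; the
struct and function names are made up for illustration. With the timings
quoted above it comes out at roughly 262 ns extra per call, or about 15%
more wall time for the indirect case.

```c
/* Result of comparing two timed runs of the same workload. */
struct overhead {
	double extra_ns_per_call;	/* added latency per traced call, in ns */
	double slowdown_pct;		/* extra wall time relative to the direct run */
};

/*
 * direct_s / indirect_s: wall time of each run in seconds.
 * calls: number of traced calls per run (assumed: one vfs_write() per
 * dd block, so 1000000 for the runs quoted above).
 */
static struct overhead compare_runs(double direct_s, double indirect_s,
				    double calls)
{
	struct overhead o;

	o.extra_ns_per_call = (indirect_s - direct_s) / calls * 1e9;
	o.slowdown_pct = (indirect_s / direct_s - 1.0) * 100.0;
	return o;
}
```

Calling compare_runs(1.72973, 1.99179, 1e6) reproduces the numbers for
cases 1.3 and 1.4.
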
>>
>> Thanks for the measurements Xu!
>>
>>> Can you show the implementation of the indirect call you used?
>>
>> Xu used my development branch here
>> https://github.com/FlorentRevest/linux/commits/fprobe-min-args
>
> That looks like it could be optimized quite a bit too.
>
> Specifically this part:
>
> static bool bpf_fprobe_entry(struct fprobe *fp, unsigned long ip,
> 			     struct ftrace_regs *regs, void *private)
> {
> 	struct bpf_fprobe_call_context *call_ctx = private;
> 	struct bpf_fprobe_context *fprobe_ctx = fp->ops.private;
> 	struct bpf_tramp_links *links = fprobe_ctx->links;
> 	struct bpf_tramp_links *fentry = &links[BPF_TRAMP_FENTRY];
> 	struct bpf_tramp_links *fmod_ret = &links[BPF_TRAMP_MODIFY_RETURN];
> 	struct bpf_tramp_links *fexit = &links[BPF_TRAMP_FEXIT];
> 	int i, ret;
>
> 	memset(&call_ctx->ctx, 0, sizeof(call_ctx->ctx));
> 	call_ctx->ip = ip;
> 	for (i = 0; i < fprobe_ctx->nr_args; i++)
> 		call_ctx->args[i] = ftrace_regs_get_argument(regs, i);
>
> 	for (i = 0; i < fentry->nr_links; i++)
> 		call_bpf_prog(fentry->links[i], &call_ctx->ctx, call_ctx->args);
>
> 	call_ctx->args[fprobe_ctx->nr_args] = 0;
> 	for (i = 0; i < fmod_ret->nr_links; i++) {
> 		ret = call_bpf_prog(fmod_ret->links[i], &call_ctx->ctx,
> 				    call_ctx->args);
>
> 		if (ret) {
> 			ftrace_regs_set_return_value(regs, ret);
> 			ftrace_override_function_with_return(regs);
>
> 			bpf_fprobe_exit(fp, ip, regs, private);
> 			return false;
> 		}
> 	}
>
> 	return fexit->nr_links;
> }
>
> There's a lot of low hanging fruit to speed up there. I wouldn't be too
> fast to throw out this solution if it hasn't had the care that direct calls
> have had to speed that up.
>
> For example, trampolines currently only allow to attach to functions with 6
> parameters or less (3 on x86_32). You could make 7 specific callbacks, with
> zero to 6 parameters, and unroll the argument loop.
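
A rough userspace sketch of the "7 specific callbacks" idea above, not
taken from any posted patch. Everything here is hypothetical: struct
mock_regs and mock_get_argument() stand in for struct ftrace_regs and
ftrace_regs_get_argument(), and the copy_args_N/select_copy_args names are
made up. Each generated function has a compile-time argument count, so the
compiler is free to unroll the copy loop. The function would be picked
once, at attach time, instead of looping on nr_args for every call.

```c
#include <stddef.h>

#define MAX_ARGS 6

/* Hypothetical stand-in for struct ftrace_regs: argument registers only. */
struct mock_regs {
	unsigned long arg[MAX_ARGS];
};

/* Hypothetical stand-in for ftrace_regs_get_argument(). */
static inline unsigned long mock_get_argument(const struct mock_regs *regs,
					      int n)
{
	return regs->arg[n];
}

typedef void (*copy_args_fn)(const struct mock_regs *regs,
			     unsigned long *args);

/*
 * Generate one copy function per argument count. The loop bound is a
 * constant in each instance, so it can be fully unrolled.
 */
#define DEFINE_COPY_ARGS(n)						\
static void copy_args_##n(const struct mock_regs *regs,		\
			  unsigned long *args)				\
{									\
	for (int i = 0; i < (n); i++)					\
		args[i] = mock_get_argument(regs, i);			\
}

DEFINE_COPY_ARGS(0)
DEFINE_COPY_ARGS(1)
DEFINE_COPY_ARGS(2)
DEFINE_COPY_ARGS(3)
DEFINE_COPY_ARGS(4)
DEFINE_COPY_ARGS(5)
DEFINE_COPY_ARGS(6)

/* Indexed by argument count, 0 through MAX_ARGS. */
static const copy_args_fn copy_args_table[MAX_ARGS + 1] = {
	copy_args_0, copy_args_1, copy_args_2, copy_args_3,
	copy_args_4, copy_args_5, copy_args_6,
};

/* Pick the specialised function once; NULL if nr_args is out of range. */
static copy_args_fn select_copy_args(unsigned int nr_args)
{
	return nr_args <= MAX_ARGS ? copy_args_table[nr_args] : NULL;
}
```

Whether this actually helps compared with the plain loop would need to be
measured; the sketch only shows the shape of the specialisation.
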
>
> Would also be interesting to run perf to see where the overhead is. There
> may be other locations to work on to make it almost as fast as direct
> callers without the other baggage.
>

Something is wrong with perf on my pi4. I'll send the perf report once
I've fixed it.

> -- Steve
>
>>
>> As it stands, the performance impact of the fprobe based
>> implementation would be too high for us. I wonder how much Mark's idea
>> here https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/ftrace/per-callsite-ops
>> would help but it doesn't work right now.
>
>
> .