From: Jiri Olsa <jolsa@redhat.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@chromium.org>
Subject: Re: [PATCH bpf-next 06/29] bpf: Add bpf_arg/bpf_ret_value helpers for tracing programs
Date: Sun, 28 Nov 2021 19:06:10 +0100 [thread overview]
Message-ID: <YaPFEpAqIREeUMU7@krava> (raw)
In-Reply-To: <CAEf4Bza0UZv6EFdELpg30o=67-Zzs6ggZext4u40+if9a5oQDg@mail.gmail.com>
On Wed, Nov 24, 2021 at 01:43:22PM -0800, Andrii Nakryiko wrote:
> On Thu, Nov 18, 2021 at 3:25 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > Adding bpf_arg/bpf_ret_value helpers for tracing programs
> > that returns traced function arguments.
> >
> > Get n-th argument of the traced function:
> > long bpf_arg(void *ctx, int n)
> >
> > Get return value of the traced function:
> > long bpf_ret_value(void *ctx)
> >
> > The trampoline now stores number of arguments on ctx-8
> > address, so it's easy to verify argument index and find
> > return value argument.
> >
> > Moving function ip address on the trampoline stack behind
> > the number of functions arguments, so it's now stored
> > on ctx-16 address.
> >
> > Both helpers are inlined by verifier.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
>
> It would be great to land these changes separate from your huge patch
> set. There are some upcoming BPF trampoline related changes that will
> touch this (to add BPF cookie support for fentry/fexit progs), so
> would be nice to minimize the interdependencies. So maybe post this
> patch separately (probably after holidays ;) ).
ok
>
> > arch/x86/net/bpf_jit_comp.c | 18 +++++++++++---
> > include/uapi/linux/bpf.h | 14 +++++++++++
> > kernel/bpf/verifier.c | 45 ++++++++++++++++++++++++++++++++--
> > kernel/trace/bpf_trace.c | 38 +++++++++++++++++++++++++++-
> > tools/include/uapi/linux/bpf.h | 14 +++++++++++
> > 5 files changed, 122 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 631847907786..67e8ac9aaf0d 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1941,7 +1941,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > void *orig_call)
> > {
> > int ret, i, nr_args = m->nr_args;
> > - int stack_size = nr_args * 8;
> > + int stack_size = nr_args * 8 + 8 /* nr_args */;
>
> this /* nr_args */ next to 8 is super confusing, would be better to
> expand the comment; might be a good idea to have some sort of a
> description of possible stack layouts (e.g., fexit has some extra
> stuff on the stack, I think, but it's impossible to remember and need
> to recover that knowledge from the assembly code, basically).
>
> > struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
> > struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
> > struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
> > @@ -1987,12 +1987,22 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE);
> > emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size);
> >
> > - /* Continue with stack_size for regs storage, stack will
> > - * be correctly restored with 'leave' instruction.
> > - */
> > + /* Continue with stack_size for 'nr_args' storage */
>
> same, I don't think this comment really helps, just confuses some more
ok, I'll put some more comments with list of possible the stack layouts
>
> > stack_size -= 8;
> > }
> >
> > + /* Store number of arguments of the traced function:
> > + * mov rax, nr_args
> > + * mov QWORD PTR [rbp - stack_size], rax
> > + */
> > + emit_mov_imm64(&prog, BPF_REG_0, 0, (u32) nr_args);
> > + emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size);
> > +
> > + /* Continue with stack_size for regs storage, stack will
> > + * be correctly restored with 'leave' instruction.
> > + */
> > + stack_size -= 8;
>
> I think "stack_size" as a name outlived itself and it just makes
> everything harder to understand. It's used more like a stack offset
> (relative to rsp or rbp) for different things. Would it make code
> worse if we had few offset variables instead (or rather in addition,
> we still need to calculate a full stack_size; it's just it's constant
> re-adjustment is what's hard to keep track of), like regs_off,
> ret_ip_off, arg_cnt_off, etc?
let's see, I'll try that
>
> > +
> > save_regs(m, &prog, nr_args, stack_size);
> >
> > if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index a69e4b04ffeb..fc8b344eecba 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4957,6 +4957,18 @@ union bpf_attr {
> > * **-ENOENT** if *task->mm* is NULL, or no vma contains *addr*.
> > * **-EBUSY** if failed to try lock mmap_lock.
> > * **-EINVAL** for invalid **flags**.
> > + *
> > + * long bpf_arg(void *ctx, int n)
>
> __u32 n ?
ok
>
> > + * Description
> > + * Get n-th argument of the traced function (for tracing programs).
> > + * Return
> > + * Value of the argument.
>
> What about errors? those need to be documented.
ok
>
> > + *
> > + * long bpf_ret_value(void *ctx)
> > + * Description
> > + * Get return value of the traced function (for tracing programs).
> > + * Return
> > + * Return value of the traced function.
>
> Same, errors not documented. Also would be good to document what
> happens when ret_value is requested in the context where there is no
> ret value (e.g., fentry)
ugh, that's not handled at the moment.. should we fail when
we see bpf_ret_value helper call in fentry program?
>
> > */
> > #define __BPF_FUNC_MAPPER(FN) \
> > FN(unspec), \
> > @@ -5140,6 +5152,8 @@ union bpf_attr {
> > FN(skc_to_unix_sock), \
> > FN(kallsyms_lookup_name), \
> > FN(find_vma), \
> > + FN(arg), \
> > + FN(ret_value), \
>
> We already have bpf_get_func_ip, so why not continue a tradition and
> call these bpf_get_func_arg() and bpf_get_func_ret(). Nice, short,
> clean, consistent.
ok
>
> BTW, a wild thought. Wouldn't it be cool to have these functions work
> with kprobe/kretprobe as well? Do you think it's possible?
right, bpf_get_func_ip already works for kprobes
struct kprobe could have the btf_func_model of the traced function,
so in case we trace function directly on the entry point we could
read arguments registers based on the btf_func_model
I'll check with Massami
>
> > /* */
> >
> > /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index fac0c3518add..d4249ef6ca7e 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -13246,11 +13246,52 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > continue;
> > }
> >
> > + /* Implement bpf_arg inline. */
> > + if (prog_type == BPF_PROG_TYPE_TRACING &&
> > + insn->imm == BPF_FUNC_arg) {
> > + /* Load nr_args from ctx - 8 */
> > + insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
> > + insn_buf[1] = BPF_JMP32_REG(BPF_JGE, BPF_REG_2, BPF_REG_0, 4);
> > + insn_buf[2] = BPF_ALU64_IMM(BPF_MUL, BPF_REG_2, 8);
> > + insn_buf[3] = BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_1);
> > + insn_buf[4] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_2, 0);
> > + insn_buf[5] = BPF_JMP_A(1);
> > + insn_buf[6] = BPF_MOV64_IMM(BPF_REG_0, 0);
> > +
> > + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 7);
> > + if (!new_prog)
> > + return -ENOMEM;
> > +
> > + delta += 6;
> > + env->prog = prog = new_prog;
> > + insn = new_prog->insnsi + i + delta;
> > + continue;
>
> nit: this whole sequence of steps and calculations seems like
> something that might be abstracted and hidden behind a macro or helper
> func? Not related to your change, though. But wouldn't it be easier to
> understand if it was just written as:
>
> PATCH_INSNS(
> BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
> BPF_JMP32_REG(BPF_JGE, BPF_REG_2, BPF_REG_0, 4);
> BPF_ALU64_IMM(BPF_MUL, BPF_REG_2, 8);
> BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_1);
> BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_2, 0);
> BPF_JMP_A(1);
> BPF_MOV64_IMM(BPF_REG_0, 0));
> continue;
yep, looks better ;-) I'll check
>
> ?
>
>
> > + }
> > +
> > + /* Implement bpf_ret_value inline. */
> > + if (prog_type == BPF_PROG_TYPE_TRACING &&
> > + insn->imm == BPF_FUNC_ret_value) {
> > + /* Load nr_args from ctx - 8 */
> > + insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, -8);
> > + insn_buf[1] = BPF_ALU64_IMM(BPF_MUL, BPF_REG_2, 8);
> > + insn_buf[2] = BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_1);
> > + insn_buf[3] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_2, 0);
> > +
> > + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 4);
> > + if (!new_prog)
> > + return -ENOMEM;
> > +
> > + delta += 3;
> > + env->prog = prog = new_prog;
> > + insn = new_prog->insnsi + i + delta;
> > + continue;
> > + }
> > +
> > /* Implement bpf_get_func_ip inline. */
> > if (prog_type == BPF_PROG_TYPE_TRACING &&
> > insn->imm == BPF_FUNC_get_func_ip) {
> > - /* Load IP address from ctx - 8 */
> > - insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
> > + /* Load IP address from ctx - 16 */
> > + insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -16);
> >
> > new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1);
> > if (!new_prog)
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 25ea521fb8f1..3844cfb45490 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1012,7 +1012,7 @@ const struct bpf_func_proto bpf_snprintf_btf_proto = {
> > BPF_CALL_1(bpf_get_func_ip_tracing, void *, ctx)
> > {
> > /* This helper call is inlined by verifier. */
> > - return ((u64 *)ctx)[-1];
> > + return ((u64 *)ctx)[-2];
> > }
> >
> > static const struct bpf_func_proto bpf_get_func_ip_proto_tracing = {
> > @@ -1091,6 +1091,38 @@ static const struct bpf_func_proto bpf_get_branch_snapshot_proto = {
> > .arg2_type = ARG_CONST_SIZE_OR_ZERO,
> > };
> >
> > +BPF_CALL_2(bpf_arg, void *, ctx, int, n)
> > +{
> > + /* This helper call is inlined by verifier. */
> > + u64 nr_args = ((u64 *)ctx)[-1];
> > +
> > + if ((u64) n >= nr_args)
> > + return 0;
>
> We'll need bpf_get_func_arg_cnt() helper as well to be able to know
> the actual number of arguments traced function has. It's impossible to
> know whether the argument is zero or there is no argument, otherwise.
my idea was that the program will call those helpers based
on get_func_ip with proper argument indexes
but with bpf_get_func_arg_cnt we could make a simple program
that would just print function with all its arguments easily, ok ;-)
>
> > + return ((u64 *)ctx)[n];
> > +}
> > +
> > +static const struct bpf_func_proto bpf_arg_proto = {
> > + .func = bpf_arg,
> > + .gpl_only = true,
> > + .ret_type = RET_INTEGER,
> > + .arg1_type = ARG_PTR_TO_CTX,
> > + .arg1_type = ARG_ANYTHING,
> > +};
> > +
> > +BPF_CALL_1(bpf_ret_value, void *, ctx)
> > +{
> > + /* This helper call is inlined by verifier. */
> > + u64 nr_args = ((u64 *)ctx)[-1];
> > +
> > + return ((u64 *)ctx)[nr_args];
>
> we should return 0 for fentry or disable this helper for anything but
> fexit? It's going to return garbage otherwise.
disabling seems like right choice to me
thanks,
jirka
next prev parent reply other threads:[~2021-11-28 18:08 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-18 11:24 [RFC bpf-next v5 00/29] bpf: Add batch support for attaching trampolines Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 01/29] ftrace: Use direct_ops hash in unregister_ftrace_direct Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 02/29] ftrace: Add cleanup to unregister_ftrace_direct_multi Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 03/29] ftrace: Add ftrace_set_filter_ips function Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 04/29] bpf: Factor bpf_check_attach_target function Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 05/29] bpf: Add bpf_check_attach_model function Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 06/29] bpf: Add bpf_arg/bpf_ret_value helpers for tracing programs Jiri Olsa
2021-11-24 21:43 ` Andrii Nakryiko
2021-11-25 16:14 ` Alexei Starovoitov
2021-11-28 18:07 ` Jiri Olsa
2021-11-28 18:06 ` Jiri Olsa [this message]
2021-12-01 7:13 ` Andrii Nakryiko
2021-12-01 17:37 ` Alexei Starovoitov
2021-12-01 17:59 ` Andrii Nakryiko
2021-12-01 20:36 ` Alexei Starovoitov
2021-12-01 21:16 ` Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 07/29] bpf, x64: Allow to use caller address from stack Jiri Olsa
2021-11-19 4:14 ` Alexei Starovoitov
2021-11-19 21:46 ` Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 08/29] bpf: Keep active attached trampoline in bpf_prog Jiri Olsa
2021-11-24 21:48 ` Andrii Nakryiko
2021-11-28 17:24 ` Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 09/29] bpf: Add support to load multi func tracing program Jiri Olsa
2021-11-19 4:11 ` Alexei Starovoitov
2021-11-22 20:15 ` Jiri Olsa
2021-11-24 21:51 ` Andrii Nakryiko
2021-11-28 17:41 ` Jiri Olsa
2021-12-01 7:17 ` Andrii Nakryiko
2021-12-01 21:20 ` Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 10/29] bpf: Add bpf_trampoline_id object Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 11/29] bpf: Add addr to " Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 12/29] bpf: Add struct bpf_tramp_node layer Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 13/29] bpf: Add bpf_tramp_attach layer for trampoline attachment Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 14/29] bpf: Add support to store multiple ids in bpf_tramp_id object Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 15/29] bpf: Add support to store multiple addrs " Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 16/29] bpf: Add bpf_tramp_id_single function Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 17/29] bpf: Resolve id in bpf_tramp_id_single Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 18/29] bpf: Add refcount_t to struct bpf_tramp_id Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 19/29] bpf: Add support to attach trampolines with multiple IDs Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 20/29] bpf: Add support for tracing multi link Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 21/29] libbpf: Add btf__find_by_glob_kind function Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 22/29] libbpf: Add support to link multi func tracing program Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 23/29] selftests/bpf: Add bpf_arg/bpf_ret_value test Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 24/29] selftests/bpf: Add fentry multi func test Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 25/29] selftests/bpf: Add fexit " Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 26/29] selftests/bpf: Add fentry/fexit " Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 27/29] selftests/bpf: Add mixed " Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 28/29] selftests/bpf: Add ret_mod " Jiri Olsa
2021-11-18 11:24 ` [PATCH bpf-next 29/29] selftests/bpf: Add attach " Jiri Olsa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YaPFEpAqIREeUMU7@krava \
--to=jolsa@redhat.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=kpsingh@chromium.org \
--cc=netdev@vger.kernel.org \
--cc=songliubraving@fb.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).