From: Namhyung Kim <namhyung@kernel.org>
To: Wang Nan <wangnan0@huawei.com>
Cc: acme@kernel.org, ast@plumgrid.com, brendan.d.gregg@gmail.com,
	a.p.zijlstra@chello.nl, daniel@iogearbox.net, dsahern@gmail.com,
	hekuang@huawei.com, jolsa@kernel.org, lizefan@huawei.com,
	masami.hiramatsu.pt@hitachi.com, paulus@samba.org,
	linux-kernel@vger.kernel.org, pi3orama@163.com,
	xiakaixu@huawei.com, Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: Re: [PATCH 16/31] perf tools: Add prologue for BPF programs for fetching arguments
Date: Thu, 15 Oct 2015 14:26:23 +0900
Message-ID: <20151015052623.GA26747@sejong>
In-Reply-To: <1444826502-49291-17-git-send-email-wangnan0@huawei.com>

On Wed, Oct 14, 2015 at 12:41:27PM +0000, Wang Nan wrote:
> From: He Kuang <hekuang@huawei.com>
> 
> This patch generates a prologue for a BPF program which fetches arguments
> for it. With this patch, the program can have arguments as follows:
> 
>  SEC("lock_page=__lock_page page->flags")
>  int lock_page(struct pt_regs *ctx, int err, unsigned long flags)
>  {
> 	 return 1;
>  }
> 
> This patch passes at most 3 arguments in r3, r4 and r5. r1 is still
> the ctx pointer. r2 indicates whether dereferencing succeeded
> (0 on success, 1 on failure).
> 
> This patch uses r6 to hold ctx (struct pt_regs) and r7 to hold the stack
> pointer for the result. The result of each argument is first stored on
> the stack:
> 
>  low address
>  BPF_REG_FP - 24  ARG3
>  BPF_REG_FP - 16  ARG2
>  BPF_REG_FP - 8   ARG1
>  BPF_REG_FP
>  high address
> 
> The results are then loaded into r3, r4 and r5.
> 
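
For reference, the slot layout quoted above reduces to simple arithmetic. A minimal sketch, where arg_slot_offset() is a hypothetical helper name (not taken from the patch) and idx is the 0-based argument index:

```c
/*
 * Offset from BPF_REG_FP of the 8-byte slot for argument idx (0-based),
 * following the quoted layout: ARG1 at FP - 8, ARG2 at FP - 16, ...
 * arg_slot_offset() is a hypothetical name, not part of the patch.
 */
static int arg_slot_offset(int idx)
{
	return -8 * (idx + 1);
}
```
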
> The output prologue for offn(...off2(off1(reg))) should be:
> 
>      r6 <- r1			// save ctx into a callee saved register
>      r7 <- fp
>      r7 <- r7 - stack_offset	// pointer to result slot
>      /* load r3 with the offset in pt_regs of 'reg' */
>      (r7) <- r3			// make slot valid
>      r3 <- r3 + off1		// prepare to read unsafe pointer
>      r2 <- 8
>      r1 <- r7			// result put onto stack
>      call probe_read		// read unsafe pointer
>      jnei r0, 0, err		// error checking
>      r3 <- (r7)			// read result
>      r3 <- r3 + off2		// prepare to read unsafe pointer
>      r2 <- 8
>      r1 <- r7
>      call probe_read
>      jnei r0, 0, err
>      ...
>      /* load r3, r4, r5 from stack */
>      goto success
> err:
>      r2 <- 1
>      /* load r3, r4, r5 with 0 */
>      goto usercode
> success:
>      r2 <- 0
> usercode:
>      r1 <- r6	// restore ctx
>      // original user code
> 
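
To check my reading of the slow path, here is a userspace toy model of the dereference chain. It is only a sketch under assumptions: sketch_probe_read() and sketch_fetch_arg() are hypothetical names, and a bounds-checked byte array stands in for kernel memory, so a failed read returns an error instead of faulting.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy model of probe_read(): copy 8 bytes out of a fake address space,
 * failing instead of faulting when the address is out of range.
 */
static int sketch_probe_read(uint64_t *dst, const uint8_t *mem,
			     size_t mem_size, uint64_t addr)
{
	if (mem_size < sizeof(*dst) || addr > mem_size - sizeof(*dst))
		return -1;
	memcpy(dst, mem + addr, sizeof(*dst));
	return 0;
}

/*
 * Evaluate offn(...off2(off1(reg))): start from the register value and
 * dereference once per offset. Returns 0 on success (r2 == 0 in the
 * prologue) or 1 on the first failed read (r2 == 1), zeroing *result.
 */
static int sketch_fetch_arg(const uint8_t *mem, size_t mem_size,
			    uint64_t reg, const long *offs, int noffs,
			    uint64_t *result)
{
	uint64_t val = reg;

	for (int i = 0; i < noffs; i++) {
		uint64_t addr = val + (uint64_t)offs[i];

		if (sketch_probe_read(&val, mem, mem_size, addr)) {
			*result = 0;
			return 1;
		}
	}
	*result = val;
	return 0;
}
```

The return value plays the role of r2 and *result the role of one argument register. The real prologue zeroes all of r3-r5 on error, which this single-argument model does not show.
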
> If all of the arguments reside in registers (no dereferencing is
> required), gen_prologue_fastpath() is used to create a
> fast prologue:
> 
>      r3 <- (r1 + offset of reg1)
>      r4 <- (r1 + offset of reg2)
>      r5 <- (r1 + offset of reg3)
>      r2 <- 0
> 
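
Put differently, the fast path applies only when no argument carries a dereference chain. A rough sketch of that test, with sketch_ref and sketch_arg as cut-down, hypothetical stand-ins for perf's probe_trace_arg_ref and probe_trace_arg (only the fields used here):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the perf structures. */
struct sketch_ref {
	long offset;
	struct sketch_ref *next;
};

struct sketch_arg {
	const char *value;	/* register name, e.g. "%di" */
	struct sketch_ref *ref;	/* NULL when no dereference is needed */
};

/* True when every argument is a plain register value. */
static bool can_use_fastpath(const struct sketch_arg *args, int nargs)
{
	for (int i = 0; i < nargs; i++)
		if (args[i].ref)
			return false;
	return true;
}
```
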
> P.S.
> 
> eBPF calling convention is defined as:
> 
> * r0		- return value from in-kernel function, and exit value
>                   for eBPF program
> * r1 - r5	- arguments from eBPF program to in-kernel function
> * r6 - r9	- callee saved registers that in-kernel function will
>                   preserve
> * r10		- read-only frame pointer to access stack
> 
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/n/ebpf-6yw9eg0ej3l4jnqhinngkw86@git.kernel.org
> ---

[SNIP]
> +int bpf__gen_prologue(struct probe_trace_arg *args, int nargs,
> +		      struct bpf_insn *new_prog, size_t *new_cnt,
> +		      size_t cnt_space)
> +{
> +	struct bpf_insn *success_code = NULL;
> +	struct bpf_insn *error_code = NULL;
> +	struct bpf_insn *user_code = NULL;
> +	struct bpf_insn_pos pos;
> +	bool fastpath = true;
> +	int i;
> +
> +	if (!new_prog || !new_cnt)
> +		return -EINVAL;
> +
> +	pos.begin = new_prog;
> +	pos.end = new_prog + cnt_space;
> +	pos.pos = new_prog;
> +
> +	if (!nargs) {
> +		ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0),
> +		    &pos);
> +
> +		if (check_pos(&pos))
> +			goto errout;
> +
> +		*new_cnt = pos_get_cnt(&pos);
> +		return 0;
> +	}
> +
> +	if (nargs > BPF_PROLOGUE_MAX_ARGS)
> +		nargs = BPF_PROLOGUE_MAX_ARGS;

Wouldn't it be better to inform the user when some arguments are ignored?
As written, anything beyond BPF_PROLOGUE_MAX_ARGS is dropped silently.
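
Something along these lines, perhaps. This is only a sketch: clamp_nargs() and SKETCH_MAX_ARGS are hypothetical names, and fprintf() stands in for whichever warning helper perf would use here.

```c
#include <stdio.h>

/* Hypothetical stand-in for BPF_PROLOGUE_MAX_ARGS from the patch. */
#define SKETCH_MAX_ARGS 3

/*
 * Clamp nargs to the number of arguments the prologue can pass and
 * tell the user when some were dropped. Returns the clamped count.
 */
static int clamp_nargs(int nargs)
{
	if (nargs > SKETCH_MAX_ARGS) {
		fprintf(stderr,
			"bpf: prologue: %d arguments requested, only the first %d are fetched\n",
			nargs, SKETCH_MAX_ARGS);
		nargs = SKETCH_MAX_ARGS;
	}
	return nargs;
}
```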

Thanks,
Namhyung


> +	if (cnt_space > BPF_MAXINSNS)
> +		cnt_space = BPF_MAXINSNS;
> +
> +	/* First pass: validation */
> +	for (i = 0; i < nargs; i++) {
> +		struct probe_trace_arg_ref *ref = args[i].ref;
> +
> +		if (args[i].value[0] == '@') {
> +			/* TODO: fetch global variable */
> +			pr_err("bpf: prologue: global %s%+ld not support\n",
> +				args[i].value, ref ? ref->offset : 0);
> +			return -ENOTSUP;
> +		}
> +
> +		while (ref) {
> +			/* fastpath is true if all args have ref == NULL */
> +			fastpath = false;
> +
> +			/*
> +			 * Instruction encodes immediate value using
> +			 * s32, ref->offset is long. On systems which
> +			 * can't fill long in s32, refuse to process if
> +			 * ref->offset too large (or small).
> +			 */
> +#ifdef __LP64__
> +#define OFFSET_MAX	((1LL << 31) - 1)
> +#define OFFSET_MIN	((1LL << 31) * -1)
> +			if (ref->offset > OFFSET_MAX ||
> +					ref->offset < OFFSET_MIN) {
> +				pr_err("bpf: prologue: offset out of bound: %ld\n",
> +				       ref->offset);
> +				return -E2BIG;
> +			}
> +#endif
> +			ref = ref->next;
> +		}
> +	}
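
As an aside, the range check could arguably be written without the #ifdef by comparing against the fixed-width limits from <stdint.h>. A sketch only; offset_fits_imm() is a hypothetical helper, not something in the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A BPF instruction carries its immediate as a signed 32-bit value,
 * so an offset is usable only if it fits in that range.
 * On targets where long is 32 bits wide the test is trivially true.
 */
static bool offset_fits_imm(long offset)
{
	return offset >= INT32_MIN && offset <= INT32_MAX;
}
```
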
> +	pr_debug("prologue: pass validation\n");
> +
> +	if (fastpath) {
> +		/* If all variables are registers... */
> +		pr_debug("prologue: fast path\n");
> +		if (gen_prologue_fastpath(&pos, args, nargs))
> +			goto errout;
> +	} else {
> +		pr_debug("prologue: slow path\n");
> +
> +		/* Initialization: move ctx to a callee saved register. */
> +		ins(BPF_MOV64_REG(BPF_REG_CTX, BPF_REG_ARG1), &pos);
> +
> +		if (gen_prologue_slowpath(&pos, args, nargs))
> +			goto errout;
> +		/*
> +		 * start of ERROR_CODE (only slow path needs error code)
> +		 *   mov r2 <- 1
> +		 *   goto usercode
> +		 */
> +		error_code = pos.pos;
> +		ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 1),
> +		    &pos);
> +
> +		for (i = 0; i < nargs; i++)
> +			ins(BPF_ALU64_IMM(BPF_MOV,
> +					  BPF_PROLOGUE_START_ARG_REG + i,
> +					  0),
> +			    &pos);
> +		ins(BPF_JMP_IMM(BPF_JA, BPF_REG_0, 0, JMP_TO_USER_CODE),
> +				&pos);
> +	}
> +
> +	/*
> +	 * start of SUCCESS_CODE:
> +	 *   mov r2 <- 0
> +	 *   goto usercode  // skip
> +	 */
> +	success_code = pos.pos;
> +	ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0), &pos);
> +
> +	/*
> +	 * start of USER_CODE:
> +	 *   Restore ctx to r1
> +	 */
> +	user_code = pos.pos;
> +	if (!fastpath) {
> +		/*
> +		 * Only slow path needs restoring of ctx. In fast path,
> +		 * registers are loaded directly from r1.
> +		 */
> +		ins(BPF_MOV64_REG(BPF_REG_ARG1, BPF_REG_CTX), &pos);
> +		if (prologue_relocate(&pos, error_code, success_code,
> +				      user_code))
> +			goto errout;
> +	}
> +
> +	if (check_pos(&pos))
> +		goto errout;
> +
> +	*new_cnt = pos_get_cnt(&pos);
> +	return 0;
> +errout:
> +	return -ERANGE;
> +}
> diff --git a/tools/perf/util/bpf-prologue.h b/tools/perf/util/bpf-prologue.h
> new file mode 100644
> index 0000000..f1e4c5d
> --- /dev/null
> +++ b/tools/perf/util/bpf-prologue.h
> @@ -0,0 +1,34 @@
> +/*
> + * Copyright (C) 2015, He Kuang <hekuang@huawei.com>
> + * Copyright (C) 2015, Huawei Inc.
> + */
> +#ifndef __BPF_PROLOGUE_H
> +#define __BPF_PROLOGUE_H
> +
> +#include <linux/compiler.h>
> +#include <linux/filter.h>
> +#include "probe-event.h"
> +
> +#define BPF_PROLOGUE_MAX_ARGS 3
> +#define BPF_PROLOGUE_START_ARG_REG BPF_REG_3
> +#define BPF_PROLOGUE_FETCH_RESULT_REG BPF_REG_2
> +
> +#ifdef HAVE_BPF_PROLOGUE
> +int bpf__gen_prologue(struct probe_trace_arg *args, int nargs,
> +		      struct bpf_insn *new_prog, size_t *new_cnt,
> +		      size_t cnt_space);
> +#else
> +static inline int
> +bpf__gen_prologue(struct probe_trace_arg *args __maybe_unused,
> +		  int nargs __maybe_unused,
> +		  struct bpf_insn *new_prog __maybe_unused,
> +		  size_t *new_cnt,
> +		  size_t cnt_space __maybe_unused)
> +{
> +	if (!new_cnt)
> +		return -EINVAL;
> +	*new_cnt = 0;
> +	return 0;
> +}
> +#endif
> +#endif /* __BPF_PROLOGUE_H */
> -- 
> 1.8.3.4
> 
