From: Namhyung Kim <namhyung@kernel.org>
To: Wang Nan <wangnan0@huawei.com>
Cc: acme@kernel.org, ast@plumgrid.com, brendan.d.gregg@gmail.com,
	a.p.zijlstra@chello.nl, daniel@iogearbox.net, dsahern@gmail.com,
	hekuang@huawei.com, jolsa@kernel.org, lizefan@huawei.com,
	masami.hiramatsu.pt@hitachi.com, paulus@samba.org,
	linux-kernel@vger.kernel.org, pi3orama@163.com,
	xiakaixu@huawei.com, Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: Re: [PATCH 16/31] perf tools: Add prologue for BPF programs for fetching arguments
Date: Thu, 15 Oct 2015 14:26:23 +0900
Message-ID: <20151015052623.GA26747@sejong>
In-Reply-To: <1444826502-49291-17-git-send-email-wangnan0@huawei.com>

On Wed, Oct 14, 2015 at 12:41:27PM +0000, Wang Nan wrote:
> From: He Kuang <hekuang@huawei.com>
> 
> This patch generates a prologue for a BPF program which fetches arguments
> for it. With this patch, the program can have arguments as follows:
> 
>  SEC("lock_page=__lock_page page->flags")
>  int lock_page(struct pt_regs *ctx, int err, unsigned long flags)
>  {
> 	 return 1;
>  }
> 
> This patch passes at most 3 arguments in r3, r4 and r5. r1 is still
> the ctx pointer. r2 indicates whether dereferencing succeeded
> (0 on success, 1 on failure).
> 
> This patch uses r6 to hold ctx (struct pt_regs) and r7 to hold the stack
> pointer for the result. The result of each argument is first stored on
> the stack:
> 
>  low address
>  BPF_REG_FP - 24  ARG3
>  BPF_REG_FP - 16  ARG2
>  BPF_REG_FP - 8   ARG1
>  BPF_REG_FP
>  high address
> 
> Then loaded into r3, r4 and r5.
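
For readers following the layout above, the slot arithmetic can be sketched in plain C. This is only an illustration, not code from the patch: arg_slot_offset() and ARG_SLOT_SIZE are hypothetical names, and the 8-byte slot size is taken from the FP - 8 / - 16 / - 24 layout shown in the quoted text.

```c
#include <assert.h>

#define BPF_PROLOGUE_MAX_ARGS 3	/* same value as in bpf-prologue.h */
#define ARG_SLOT_SIZE 8		/* hypothetical name: one 8-byte slot per argument */

/*
 * Hypothetical helper (not part of the patch): offset of the stack slot
 * for argument 'idx' (0-based), relative to the frame pointer.
 * ARG1 sits at FP - 8, ARG2 at FP - 16, ARG3 at FP - 24.
 */
int arg_slot_offset(int idx)
{
	assert(idx >= 0 && idx < BPF_PROLOGUE_MAX_ARGS);
	return -ARG_SLOT_SIZE * (idx + 1);
}
```

The offsets are negative because the BPF stack grows downwards from the frame pointer.
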
> 
> The output prologue for offn(...off2(off1(reg))) should be:
> 
>      r6 <- r1			// save ctx into a callee saved register
>      r7 <- fp
>      r7 <- r7 - stack_offset	// pointer to result slot
>      /* load r3 with the offset in pt_regs of 'reg' */
>      (r7) <- r3			// make slot valid
>      r3 <- r3 + off1		// prepare to read unsafe pointer
>      r2 <- 8
>      r1 <- r7			// result put onto stack
>      call probe_read		// read unsafe pointer
>      jnei r0, 0, err		// error checking
>      r3 <- (r7)			// read result
>      r3 <- r3 + off2		// prepare to read unsafe pointer
>      r2 <- 8
>      r1 <- r7
>      call probe_read
>      jnei r0, 0, err
>      ...
>      /* load r3, r4, r5 from stack */
>      goto success
> err:
>      r2 <- 1
>      /* load r3, r4, r5 with 0 */
>      goto usercode
> success:
>      r2 <- 0
> usercode:
>      r1 <- r6	// restore ctx
>      // original user code
> 
> If all of the arguments reside in registers (no dereferencing is
> required), gen_prologue_fastpath() is used to create a
> fast prologue:
> 
>      r3 <- (r1 + offset of reg1)
>      r4 <- (r1 + offset of reg2)
>      r5 <- (r1 + offset of reg3)
>      r2 <- 0
> 
> P.S.
> 
> eBPF calling convention is defined as:
> 
> * r0		- return value from in-kernel function, and exit value
>                   for eBPF program
> * r1 - r5	- arguments from eBPF program to in-kernel function
> * r6 - r9	- callee saved registers that in-kernel function will
>                   preserve
> * r10		- read-only frame pointer to access stack
> 
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/n/ebpf-6yw9eg0ej3l4jnqhinngkw86@git.kernel.org
> ---

[SNIP]
> +int bpf__gen_prologue(struct probe_trace_arg *args, int nargs,
> +		      struct bpf_insn *new_prog, size_t *new_cnt,
> +		      size_t cnt_space)
> +{
> +	struct bpf_insn *success_code = NULL;
> +	struct bpf_insn *error_code = NULL;
> +	struct bpf_insn *user_code = NULL;
> +	struct bpf_insn_pos pos;
> +	bool fastpath = true;
> +	int i;
> +
> +	if (!new_prog || !new_cnt)
> +		return -EINVAL;
> +
> +	pos.begin = new_prog;
> +	pos.end = new_prog + cnt_space;
> +	pos.pos = new_prog;
> +
> +	if (!nargs) {
> +		ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0),
> +		    &pos);
> +
> +		if (check_pos(&pos))
> +			goto errout;
> +
> +		*new_cnt = pos_get_cnt(&pos);
> +		return 0;
> +	}
> +
> +	if (nargs > BPF_PROLOGUE_MAX_ARGS)
> +		nargs = BPF_PROLOGUE_MAX_ARGS;

Wouldn't it be better to inform the user when some arguments are ignored?
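
Something along these lines, as a rough sketch only. clamp_prologue_nargs() is a hypothetical helper that is not in the patch, and fprintf() stands in for whatever perf logging helper would be appropriate here:

```c
#include <assert.h>
#include <stdio.h>

#define BPF_PROLOGUE_MAX_ARGS 3	/* same value as in bpf-prologue.h */

/*
 * Hypothetical helper (not part of the patch): clamp nargs to the
 * supported maximum and report how many trailing arguments were dropped.
 */
int clamp_prologue_nargs(int nargs, int *ignored)
{
	assert(nargs >= 0 && ignored);

	*ignored = 0;
	if (nargs > BPF_PROLOGUE_MAX_ARGS) {
		*ignored = nargs - BPF_PROLOGUE_MAX_ARGS;
		/* fprintf() is a stand-in for perf's own logging helpers */
		fprintf(stderr,
			"bpf: prologue: %d arguments requested, only %d supported, last %d ignored\n",
			nargs, BPF_PROLOGUE_MAX_ARGS, *ignored);
		nargs = BPF_PROLOGUE_MAX_ARGS;
	}
	return nargs;
}
```

That way the truncation stays non-fatal, but the user can see why the extra arguments never reach the program.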

Thanks,
Namhyung


> +	if (cnt_space > BPF_MAXINSNS)
> +		cnt_space = BPF_MAXINSNS;
> +
> +	/* First pass: validation */
> +	for (i = 0; i < nargs; i++) {
> +		struct probe_trace_arg_ref *ref = args[i].ref;
> +
> +		if (args[i].value[0] == '@') {
> +			/* TODO: fetch global variable */
> +			pr_err("bpf: prologue: global %s%+ld not support\n",
> +				args[i].value, ref ? ref->offset : 0);
> +			return -ENOTSUP;
> +		}
> +
> +		while (ref) {
> +			/* fastpath stays true only if all args have ref == NULL */
> +			fastpath = false;
> +
> +			/*
> +			 * Instruction encodes immediate value using
> +			 * s32, ref->offset is long. On systems which
> +			 * can't fill long in s32, refuse to process if
> +			 * ref->offset too large (or small).
> +			 */
> +#ifdef __LP64__
> +#define OFFSET_MAX	((1LL << 31) - 1)
> +#define OFFSET_MIN	((1LL << 31) * -1)
> +			if (ref->offset > OFFSET_MAX ||
> +					ref->offset < OFFSET_MIN) {
> +				pr_err("bpf: prologue: offset out of bound: %ld\n",
> +				       ref->offset);
> +				return -E2BIG;
> +			}
> +#endif
> +			ref = ref->next;
> +		}
> +	}
> +	pr_debug("prologue: pass validation\n");
> +
> +	if (fastpath) {
> +		/* If all variables are registers... */
> +		pr_debug("prologue: fast path\n");
> +		if (gen_prologue_fastpath(&pos, args, nargs))
> +			goto errout;
> +	} else {
> +		pr_debug("prologue: slow path\n");
> +
> +		/* Initialization: move ctx to a callee saved register. */
> +		ins(BPF_MOV64_REG(BPF_REG_CTX, BPF_REG_ARG1), &pos);
> +
> +		if (gen_prologue_slowpath(&pos, args, nargs))
> +			goto errout;
> +		/*
> +		 * start of ERROR_CODE (only slow path needs error code)
> +		 *   mov r2 <- 1
> +		 *   goto usercode
> +		 */
> +		error_code = pos.pos;
> +		ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 1),
> +		    &pos);
> +
> +		for (i = 0; i < nargs; i++)
> +			ins(BPF_ALU64_IMM(BPF_MOV,
> +					  BPF_PROLOGUE_START_ARG_REG + i,
> +					  0),
> +			    &pos);
> +		ins(BPF_JMP_IMM(BPF_JA, BPF_REG_0, 0, JMP_TO_USER_CODE),
> +				&pos);
> +	}
> +
> +	/*
> +	 * start of SUCCESS_CODE:
> +	 *   mov r2 <- 0
> +	 *   goto usercode  // skip
> +	 */
> +	success_code = pos.pos;
> +	ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0), &pos);
> +
> +	/*
> +	 * start of USER_CODE:
> +	 *   Restore ctx to r1
> +	 */
> +	user_code = pos.pos;
> +	if (!fastpath) {
> +		/*
> +		 * Only the slow path needs to restore ctx. In the fast path,
> +		 * registers are loaded directly from r1.
> +		 */
> +		ins(BPF_MOV64_REG(BPF_REG_ARG1, BPF_REG_CTX), &pos);
> +		if (prologue_relocate(&pos, error_code, success_code,
> +				      user_code))
> +			goto errout;
> +	}
> +
> +	if (check_pos(&pos))
> +		goto errout;
> +
> +	*new_cnt = pos_get_cnt(&pos);
> +	return 0;
> +errout:
> +	return -ERANGE;
> +}
> diff --git a/tools/perf/util/bpf-prologue.h b/tools/perf/util/bpf-prologue.h
> new file mode 100644
> index 0000000..f1e4c5d
> --- /dev/null
> +++ b/tools/perf/util/bpf-prologue.h
> @@ -0,0 +1,34 @@
> +/*
> + * Copyright (C) 2015, He Kuang <hekuang@huawei.com>
> + * Copyright (C) 2015, Huawei Inc.
> + */
> +#ifndef __BPF_PROLOGUE_H
> +#define __BPF_PROLOGUE_H
> +
> +#include <linux/compiler.h>
> +#include <linux/filter.h>
> +#include "probe-event.h"
> +
> +#define BPF_PROLOGUE_MAX_ARGS 3
> +#define BPF_PROLOGUE_START_ARG_REG BPF_REG_3
> +#define BPF_PROLOGUE_FETCH_RESULT_REG BPF_REG_2
> +
> +#ifdef HAVE_BPF_PROLOGUE
> +int bpf__gen_prologue(struct probe_trace_arg *args, int nargs,
> +		      struct bpf_insn *new_prog, size_t *new_cnt,
> +		      size_t cnt_space);
> +#else
> +static inline int
> +bpf__gen_prologue(struct probe_trace_arg *args __maybe_unused,
> +		  int nargs __maybe_unused,
> +		  struct bpf_insn *new_prog __maybe_unused,
> +		  size_t *new_cnt,
> +		  size_t cnt_space __maybe_unused)
> +{
> +	if (!new_cnt)
> +		return -EINVAL;
> +	*new_cnt = 0;
> +	return 0;
> +}
> +#endif
> +#endif /* __BPF_PROLOGUE_H */
> -- 
> 1.8.3.4
> 

