* [GIT PULL 00/31] perf tools: filtering events using eBPF programs
@ 2015-08-29  4:21 Wang Nan
  2015-08-29  4:21 ` [PATCH 01/31] bpf tools: New API to get name from a BPF object Wang Nan
                   ` (30 more replies)
  0 siblings, 31 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC
  To: acme, mingo, ast; +Cc: linux-kernel, lizefan, pi3orama, Wang Nan

Hi Arnaldo and Ingo,

Several small problems are fixed relative to yesterday's pull request;
please see below. Since the patch order has changed (the original 20/32
and 32/32 are dropped), I decided to send all of them again. Sorry for
the noise.

In addition, I have folded a cross-compiling fix I posted yesterday into
this cset (the last one).

The following changes since commit 2c07144dfce366e21465cc7b0ada9f0b6dc7b7ed:

  perf evlist: Add backpointer for perf_env to evlist (2015-08-28 14:54:14 -0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pi3orama/linux tags/perf-ebpf-for-acme-20150829

for you to fetch changes up to d4a337392b3724899a084170d9ea36a8e2392097:

  tools lib traceevent: Support function __get_dynamic_array_len (2015-08-29 02:57:40 +0000)

----------------------------------------------------------------
perf BPF related improvements and bugfix:

 - Rebase to Arnaldo's newest perf/core.

 - Fix a missing include in builtin-trace.c.

 - Drop patch 'perf tools: Fix probe-event.h include' since
   the problem has been fixed by commit 5a023b57.

 - Fix a cross compiling error (introduced by Intel PT).

 - Drop patch 'bpf: Introduce function for outputing data to
   perf event' because we want to do better.

Signed-off-by: Wang Nan <wangnan0@huawei.com>

----------------------------------------------------------------
He Kuang (4):
      perf tools: Move linux/filter.h to tools/include
      perf tools: Introduce arch_get_reg_info() for x86
      perf record: Support custom vmlinux path
      tools lib traceevent: Support function __get_dynamic_array_len

Wang Nan (27):
      bpf tools: New API to get name from a BPF object
      perf tools: Don't set cmdline_group_boundary if no evsel is collected
      perf tools: Introduce dummy evsel
      perf tools: Make perf depend on libbpf
      perf ebpf: Add the libbpf glue
      perf tools: Enable passing bpf object file to --event
      perf probe: Attach trace_probe_event with perf_probe_event
      perf record, bpf: Parse and probe eBPF programs probe points
      perf bpf: Collect 'struct perf_probe_event' for bpf_program
      perf record: Load all eBPF object into kernel
      perf tools: Add bpf_fd field to evsel and config it
      perf tools: Allow filter option to be applied to bof object
      perf tools: Attach eBPF program to perf event
      perf tools: Suppress probing messages when probing by BPF loading
      perf record: Add clang options for compiling BPF scripts
      perf tools: Infrastructure for compiling scriptlets when passing '.c' to --event
      perf tests: Enforce LLVM test for BPF test
      perf test: Add 'perf test BPF'
      bpf tools: Load a program with different instances using preprocessor
      perf probe: Reset args and nargs for probe_trace_event when failure
      perf tools: Add BPF_PROLOGUE config options for further patches
      perf tools: Add prologue for BPF programs for fetching arguments
      perf tools: Generate prologue for BPF programs
      perf tools: Use same BPF program if arguments are identical
      perf probe: Init symbol as kprobe
      perf tools: Support attach BPF program on uprobe events
      perf tools: Fix cross compiling error

 tools/build/Makefile.feature                       |   6 +-
 tools/include/linux/filter.h                       | 237 +++++++
 tools/lib/bpf/libbpf.c                             | 168 ++++-
 tools/lib/bpf/libbpf.h                             |  26 +-
 tools/lib/traceevent/event-parse.c                 |  56 +-
 tools/lib/traceevent/event-parse.h                 |   1 +
 tools/perf/MANIFEST                                |   4 +
 tools/perf/Makefile.perf                           |  19 +-
 tools/perf/arch/x86/Makefile                       |   1 +
 tools/perf/arch/x86/util/Build                     |   2 +
 tools/perf/arch/x86/util/dwarf-regs.c              | 104 ++-
 tools/perf/builtin-probe.c                         |   4 +-
 tools/perf/builtin-record.c                        |  64 +-
 tools/perf/builtin-stat.c                          |   9 +-
 tools/perf/builtin-top.c                           |  11 +-
 tools/perf/builtin-trace.c                         |   7 +-
 tools/perf/config/Makefile                         |  31 +-
 tools/perf/tests/Build                             |  10 +-
 tools/perf/tests/bpf-script-example.c              |  44 ++
 tools/perf/tests/bpf.c                             | 170 +++++
 tools/perf/tests/builtin-test.c                    |  12 +
 tools/perf/tests/llvm.c                            | 125 +++-
 tools/perf/tests/llvm.h                            |  15 +
 tools/perf/tests/make                              |   4 +-
 tools/perf/tests/tests.h                           |   3 +
 tools/perf/util/Build                              |   4 +-
 tools/perf/util/bpf-loader.c                       | 730 +++++++++++++++++++++
 tools/perf/util/bpf-loader.h                       |  95 +++
 tools/perf/util/bpf-prologue.c                     | 442 +++++++++++++
 tools/perf/util/bpf-prologue.h                     |  34 +
 tools/perf/util/evlist.c                           | 107 +++
 tools/perf/util/evlist.h                           |   2 +
 tools/perf/util/evsel.c                            |  49 ++
 tools/perf/util/evsel.h                            |   7 +
 tools/perf/util/include/dwarf-regs.h               |   7 +
 tools/perf/util/parse-events.c                     |  73 ++-
 tools/perf/util/parse-events.h                     |   4 +
 tools/perf/util/parse-events.l                     |   6 +
 tools/perf/util/parse-events.y                     |  29 +-
 tools/perf/util/probe-event.c                      |  79 ++-
 tools/perf/util/probe-event.h                      |   7 +-
 tools/perf/util/probe-file.c                       |   5 +-
 tools/perf/util/probe-finder.c                     |   4 +
 .../perf/util/scripting-engines/trace-event-perl.c |   1 +
 .../util/scripting-engines/trace-event-python.c    |   1 +
 45 files changed, 2698 insertions(+), 121 deletions(-)
 create mode 100644 tools/include/linux/filter.h
 create mode 100644 tools/perf/tests/bpf-script-example.c
 create mode 100644 tools/perf/tests/bpf.c
 create mode 100644 tools/perf/tests/llvm.h
 create mode 100644 tools/perf/util/bpf-loader.c
 create mode 100644 tools/perf/util/bpf-loader.h
 create mode 100644 tools/perf/util/bpf-prologue.c
 create mode 100644 tools/perf/util/bpf-prologue.h

-- 
2.1.0



* [PATCH 01/31] bpf tools: New API to get name from a BPF object
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected Wang Nan
                   ` (29 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

Before this patch there is no way to connect a loaded BPF object to its
source file. The missing connection makes it harder to apply perf's
'--filter' to a BPF object, because perf loads all programs together
while each '--filter' setting applies to a single object.

The API of bpf_object__open_buffer() is changed to accept a name.
Fortunately, it currently has only one user (perf test LLVM), which is
updated in the same patch.
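
As a rough illustration of the naming rule described above, here is a
minimal standalone sketch. It does not call libbpf; struct demo_object
and demo_open_buffer() are hypothetical stand-ins, and only the
"%lx-%lx" fallback format is taken from the diff below.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a BPF object; only the name is modelled. */
struct demo_object {
	char name[64];
};

/*
 * Mirrors the fallback in the patch: when the caller passes no name,
 * synthesize one from the buffer address and size so that every object
 * still has something a later lookup can report.
 */
static const char *demo_open_buffer(struct demo_object *obj,
				    const void *buf, size_t buf_sz,
				    const char *name)
{
	char tmp_name[64];

	/* param validation, as in the real function */
	if (!obj || !buf || buf_sz == 0)
		return NULL;

	if (!name) {
		snprintf(tmp_name, sizeof(tmp_name), "%lx-%lx",
			 (unsigned long)buf, (unsigned long)buf_sz);
		name = tmp_name;
	}

	/* Copy the name into the object; snprintf always NUL-terminates. */
	snprintf(obj->name, sizeof(obj->name), "%s", name);
	return obj->name;
}
```

A caller that knows the source file passes it as the name; a caller
such as a test that only has a buffer passes NULL and gets the
generated name back.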

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1440742821-44548-2-git-send-email-wangnan0@huawei.com
---
 tools/lib/bpf/libbpf.c  | 25 ++++++++++++++++++++++---
 tools/lib/bpf/libbpf.h  |  4 +++-
 tools/perf/tests/llvm.c |  2 +-
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 4fa4bc4..4252fc2 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -880,15 +880,26 @@ struct bpf_object *bpf_object__open(const char *path)
 }
 
 struct bpf_object *bpf_object__open_buffer(void *obj_buf,
-					   size_t obj_buf_sz)
+					   size_t obj_buf_sz,
+					   const char *name)
 {
+	char tmp_name[64];
+
 	/* param validation */
 	if (!obj_buf || obj_buf_sz <= 0)
 		return NULL;
 
-	pr_debug("loading object from buffer\n");
+	if (!name) {
+		snprintf(tmp_name, sizeof(tmp_name), "%lx-%lx",
+			 (unsigned long)obj_buf,
+			 (unsigned long)obj_buf_sz);
+		tmp_name[sizeof(tmp_name) - 1] = '\0';
+		name = tmp_name;
+	}
+	pr_debug("loading object '%s' from buffer\n",
+		 name);
 
-	return __bpf_object__open("[buffer]", obj_buf, obj_buf_sz);
+	return __bpf_object__open(name, obj_buf, obj_buf_sz);
 }
 
 int bpf_object__unload(struct bpf_object *obj)
@@ -975,6 +986,14 @@ bpf_object__next(struct bpf_object *prev)
 	return next;
 }
 
+const char *
+bpf_object__get_name(struct bpf_object *obj)
+{
+	if (!obj)
+		return NULL;
+	return obj->path;
+}
+
 struct bpf_program *
 bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
 {
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index ea8adc2..f16170c 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -28,12 +28,14 @@ struct bpf_object;
 
 struct bpf_object *bpf_object__open(const char *path);
 struct bpf_object *bpf_object__open_buffer(void *obj_buf,
-					   size_t obj_buf_sz);
+					   size_t obj_buf_sz,
+					   const char *name);
 void bpf_object__close(struct bpf_object *object);
 
 /* Load/unload object into/from kernel */
 int bpf_object__load(struct bpf_object *obj);
 int bpf_object__unload(struct bpf_object *obj);
+const char *bpf_object__get_name(struct bpf_object *obj);
 
 struct bpf_object *bpf_object__next(struct bpf_object *prev);
 #define bpf_object__for_each_safe(pos, tmp)			\
diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
index a337356..52d5597 100644
--- a/tools/perf/tests/llvm.c
+++ b/tools/perf/tests/llvm.c
@@ -26,7 +26,7 @@ static int test__bpf_parsing(void *obj_buf, size_t obj_buf_sz)
 {
 	struct bpf_object *obj;
 
-	obj = bpf_object__open_buffer(obj_buf, obj_buf_sz);
+	obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, NULL);
 	if (!obj)
 		return -1;
 	bpf_object__close(obj);
-- 
2.1.0



* [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
  2015-08-29  4:21 ` [PATCH 01/31] bpf tools: New API to get name from a BPF object Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-31 19:20   ` Arnaldo Carvalho de Melo
  2015-09-02  2:53   ` [PATCH] perf tools: Don't set leader if parser doesn't collect an evsel Wang Nan
  2015-08-29  4:21 ` [PATCH 03/31] perf tools: Introduce dummy evsel Wang Nan
                   ` (28 subsequent siblings)
  30 siblings, 2 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Masami Hiramatsu,
	Namhyung Kim

If parse_events__scanner() collects no entry, perf_evlist__last(evlist)
is invalid, and setting cmdline_group_boundary then touches invalid
memory.

This can happen in the current BPF implementation; see [1]. Although
that case can be fixed, it is safer to introduce this check.

Rather than checking the number of entries, check data.list, so that a
dummy evsel can be added here.
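
The check can be sketched in isolation as follows. This is a minimal
standalone model, not perf code: struct list_node and the helper names
are hypothetical reimplementations of the kernel-style circular list,
and struct demo_evsel carries only the one flag discussed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

struct demo_evsel {
	struct list_node node;
	bool cmdline_group_boundary;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Mark the last collected evsel as a group boundary. Returns NULL and
 * touches nothing when the parser collected no entry, which is the
 * case the patch guards against.
 */
static struct demo_evsel *mark_last_boundary(struct list_node *head)
{
	struct demo_evsel *last;

	if (list_is_empty(head))
		return NULL;

	/* Equivalent of list_entry(head->prev, struct demo_evsel, node). */
	last = (struct demo_evsel *)((char *)head->prev -
				     offsetof(struct demo_evsel, node));
	last->cmdline_group_boundary = true;
	return last;
}
```

Testing the list itself, rather than a separate entry counter, keeps
the check valid even for entries that are linked without being counted.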

[1]: http://lkml.kernel.org/n/1436445342-1402-19-git-send-email-wangnan0@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1440742821-44548-3-git-send-email-wangnan0@huawei.com
---
 tools/perf/util/parse-events.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index d826e6f..14cd7e3 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1143,10 +1143,14 @@ int parse_events(struct perf_evlist *evlist, const char *str,
 		int entries = data.idx - evlist->nr_entries;
 		struct perf_evsel *last;
 
+		if (!list_empty(&data.list)) {
+			last = list_entry(data.list.prev,
+					  struct perf_evsel, node);
+			last->cmdline_group_boundary = true;
+		}
+
 		perf_evlist__splice_list_tail(evlist, &data.list, entries);
 		evlist->nr_groups += data.nr_groups;
-		last = perf_evlist__last(evlist);
-		last->cmdline_group_boundary = true;
 
 		return 0;
 	}
-- 
2.1.0



* [PATCH 03/31] perf tools: Introduce dummy evsel
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
  2015-08-29  4:21 ` [PATCH 01/31] bpf tools: New API to get name from a BPF object Wang Nan
  2015-08-29  4:21 ` [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-31 19:38   ` Arnaldo Carvalho de Melo
                     ` (2 more replies)
  2015-08-29  4:21 ` [PATCH 04/31] perf tools: Make perf depend on libbpf Wang Nan
                   ` (27 subsequent siblings)
  30 siblings, 3 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

This patch allows linking a dummy evsel onto the evlist as a
placeholder. It prepares for a following patch that allows passing a BPF
object using '--event object.o'.

Unlike other event selectors, passing a BPF object file to '--event'
links nothing onto the evlist. Instead, the events described in the BPF
object file are probed and linked later, because we want to do all
probing work together. As a result, the evsels for events in a BPF
object are linked at the end of the evlist. This causes a small
problem: if a '--filter' setting is passed after the object file, the
filter is not correctly applied to those events.

This patch links a dummy evsel onto the evlist so that a following
'--filter' can be collected by it. For this reason dummy evsels are set
to PERF_TYPE_TRACEPOINT.

Because dummy evsels may exist, perf_evlist__purge_dummy() must be
called right after parse_options(). This patch adds the call to the
record, top, trace and stat builtin commands. A later patch moves it
down, after the real BPF events have been processed.
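
The placeholder-then-purge idea can be sketched on its own like this.
It is a simplified model under stated assumptions: a singly linked list
replaces perf's list_head, and every demo_* name is hypothetical. What
it keeps from the patch is that dummies are linked without being
counted, so purging them leaves nr_entries untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_evsel {
	struct demo_evsel *next;
	bool is_dummy;
};

struct demo_evlist {
	struct demo_evsel *head;
	int nr_entries;		/* counts real events only */
};

static struct demo_evsel *demo_evsel_new(bool is_dummy)
{
	struct demo_evsel *evsel = calloc(1, sizeof(*evsel));

	if (evsel)
		evsel->is_dummy = is_dummy;
	return evsel;
}

/* Link at the tail; a dummy is linked but never counted. */
static void demo_evlist_add(struct demo_evlist *evlist,
			    struct demo_evsel *evsel)
{
	struct demo_evsel **pos = &evlist->head;

	if (!evsel)
		return;
	while (*pos)
		pos = &(*pos)->next;
	*pos = evsel;
	if (!evsel->is_dummy)
		evlist->nr_entries++;
}

/* Remove every dummy; returns how many were purged. */
static int demo_evlist_purge_dummy(struct demo_evlist *evlist)
{
	struct demo_evsel **pos = &evlist->head;
	int purged = 0;

	while (*pos) {
		struct demo_evsel *cur = *pos;

		if (cur->is_dummy) {
			*pos = cur->next;	/* unlink before freeing */
			free(cur);
			purged++;
		} else {
			pos = &cur->next;
		}
	}
	return purged;
}
```

Walking through a pointer to the link, rather than the node, plays the
role of the "safe" iterator in the real code: the current node can be
freed without losing the position in the list.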

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440742821-44548-4-git-send-email-wangnan0@huawei.com
---
 tools/perf/builtin-record.c    |  2 ++
 tools/perf/builtin-stat.c      |  1 +
 tools/perf/builtin-top.c       |  1 +
 tools/perf/builtin-trace.c     |  1 +
 tools/perf/util/evlist.c       | 19 +++++++++++++++++++
 tools/perf/util/evlist.h       |  1 +
 tools/perf/util/evsel.c        | 32 ++++++++++++++++++++++++++++++++
 tools/perf/util/evsel.h        |  6 ++++++
 tools/perf/util/parse-events.c | 25 +++++++++++++++++++++----
 9 files changed, 84 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a660022..81829de 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1112,6 +1112,8 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	argc = parse_options(argc, argv, record_options, record_usage,
 			    PARSE_OPT_STOP_AT_NON_OPTION);
+	perf_evlist__purge_dummy(rec->evlist);
+
 	if (!argc && target__none(&rec->opts.target))
 		usage_with_options(record_usage, record_options);
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7aa039b..99b62f1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1208,6 +1208,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	argc = parse_options(argc, argv, options, stat_usage,
 		PARSE_OPT_STOP_AT_NON_OPTION);
+	perf_evlist__purge_dummy(evsel_list);
 
 	interval = stat_config.interval;
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 8c465c8..246203b 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1198,6 +1198,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	perf_config(perf_top_config, &top);
 
 	argc = parse_options(argc, argv, options, top_usage, 0);
+	perf_evlist__purge_dummy(top.evlist);
 	if (argc)
 		usage_with_options(top_usage, options);
 
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 4e3abba..57712b9 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3099,6 +3099,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	argc = parse_options_subcommand(argc, argv, trace_options, trace_subcommands,
 				 trace_usage, PARSE_OPT_STOP_AT_NON_OPTION);
+	perf_evlist__purge_dummy(trace.evlist);
 
 	if (trace.trace_pgfaults) {
 		trace.opts.sample_address = true;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 8d00039..8a4e64d 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1696,3 +1696,22 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 
 	tracking_evsel->tracking = true;
 }
+
+void perf_evlist__purge_dummy(struct perf_evlist *evlist)
+{
+	struct perf_evsel *pos, *n;
+
+	/*
+	 * Remove all dummy events.
+	 * During linking, we don't touch anything except link
+	 * it into evlist. As a result, we don't
+	 * need to adjust evlist->nr_entries during removal.
+	 */
+
+	evlist__for_each_safe(evlist, n, pos) {
+		if (perf_evsel__is_dummy(pos)) {
+			list_del_init(&pos->node);
+			perf_evsel__delete(pos);
+		}
+	}
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b39a619..7f15727 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -181,6 +181,7 @@ bool perf_evlist__valid_read_format(struct perf_evlist *evlist);
 void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
 				   struct list_head *list,
 				   int nr_entries);
+void perf_evlist__purge_dummy(struct perf_evlist *evlist);
 
 static inline struct perf_evsel *perf_evlist__first(struct perf_evlist *evlist)
 {
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index bac25f4..01267f4 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -213,6 +213,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
 	evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
 	perf_evsel__calc_id_pos(evsel);
 	evsel->cmdline_group_boundary = false;
+	evsel->is_dummy = false;
 }
 
 struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
@@ -225,6 +226,37 @@ struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
 	return evsel;
 }
 
+struct perf_evsel *perf_evsel__new_dummy(const char *name)
+{
+	struct perf_evsel *evsel = zalloc(perf_evsel__object.size);
+
+	if (!evsel)
+		return NULL;
+
+	/*
+	 * No need to call perf_evsel__init() for a dummy evsel.
+	 * Keep it simple.
+	 */
+	evsel->name = strdup(name);
+	if (!evsel->name)
+		goto out_free;
+
+	INIT_LIST_HEAD(&evsel->node);
+	INIT_LIST_HEAD(&evsel->config_terms);
+
+	evsel->cmdline_group_boundary = false;
+	/*
+	 * Set dummy evsel as TRACEPOINT event so it can collect filter
+	 * options.
+	 */
+	evsel->attr.type = PERF_TYPE_TRACEPOINT;
+	evsel->is_dummy = true;
+	return evsel;
+out_free:
+	free(evsel);
+	return NULL;
+}
+
 struct perf_evsel *perf_evsel__newtp_idx(const char *sys, const char *name, int idx)
 {
 	struct perf_evsel *evsel = zalloc(perf_evsel__object.size);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 298e6bb..0b8e47d 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -118,6 +118,7 @@ struct perf_evsel {
 	struct perf_evsel	*leader;
 	char			*group_name;
 	bool			cmdline_group_boundary;
+	bool			is_dummy;
 	struct list_head	config_terms;
 };
 
@@ -153,6 +154,11 @@ int perf_evsel__object_config(size_t object_size,
 			      void (*fini)(struct perf_evsel *evsel));
 
 struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx);
+struct perf_evsel *perf_evsel__new_dummy(const char *name);
+static inline bool perf_evsel__is_dummy(struct perf_evsel *evsel)
+{
+	return evsel->is_dummy;
+}
 
 static inline struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr)
 {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 14cd7e3..71d91fb 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1141,7 +1141,7 @@ int parse_events(struct perf_evlist *evlist, const char *str,
 	perf_pmu__parse_cleanup();
 	if (!ret) {
 		int entries = data.idx - evlist->nr_entries;
-		struct perf_evsel *last;
+		struct perf_evsel *last = NULL;
 
 		if (!list_empty(&data.list)) {
 			last = list_entry(data.list.prev,
@@ -1149,8 +1149,25 @@ int parse_events(struct perf_evlist *evlist, const char *str,
 			last->cmdline_group_boundary = true;
 		}
 
-		perf_evlist__splice_list_tail(evlist, &data.list, entries);
-		evlist->nr_groups += data.nr_groups;
+		if (last && perf_evsel__is_dummy(last)) {
+			if (!list_is_singular(&data.list)) {
+				parse_events_evlist_error(&data, 0,
+					"Dummy evsel error: not on a singular list");
+				return -1;
+			}
+			/*
+			 * We are introducing a dummy event. Don't touch
+			 * anything, just link it.
+			 *
+			 * Don't use perf_evlist__splice_list_tail() since
+			 * it alters evlist->nr_entries, which affects the
+			 * header of the resulting perf.data.
+			 */
+			list_splice_tail(&data.list, &evlist->entries);
+		} else {
+			perf_evlist__splice_list_tail(evlist, &data.list, entries);
+			evlist->nr_groups += data.nr_groups;
+		}
 
 		return 0;
 	}
@@ -1256,7 +1273,7 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
 	struct perf_evsel *last = NULL;
 	int err;
 
-	if (evlist->nr_entries > 0)
+	if (!list_empty(&evlist->entries))
 		last = perf_evlist__last(evlist);
 
 	do {
-- 
2.1.0



* [PATCH 04/31] perf tools: Make perf depend on libbpf
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (2 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 03/31] perf tools: Introduce dummy evsel Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 05/31] perf ebpf: Add the libbpf glue Wang Nan
                   ` (26 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

By adding libbpf to perf's Makefile, this patch lets perf build libbpf
as part of its own build when libelf is found and neither NO_LIBELF nor
NO_LIBBPF is set. The newly introduced code is similar to how libapi and
libtraceevent are built in Makefile.perf.

MANIFEST is also updated for 'make perf-*-src-pkg'.

Append make_no_libbpf to tools/perf/tests/make.

A 'bpf' feature check is appended to the default FEATURE_TESTS and
FEATURE_DISPLAY, so perf checks the BPF API version in
/path/to/kernel/include/uapi/linux/bpf.h. This check should not fail
except when this code is ported to an old kernel.

Error messages are also updated to tell users that BPF support in
'perf record' is disabled if libelf is missing or the BPF API check
fails.
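
On the C side, the effect of these build switches can be sketched as
below. Only the HAVE_LIBBPF_SUPPORT macro comes from this patch (it is
added to CFLAGS when the feature check passes); demo_have_bpf() and
demo_bpf_event_status() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * HAVE_LIBBPF_SUPPORT is defined by the build only when libelf is
 * present, NO_LIBBPF is unset and the 'bpf' feature check succeeds.
 */
#ifdef HAVE_LIBBPF_SUPPORT
static bool demo_have_bpf(void) { return true; }
#else
static bool demo_have_bpf(void) { return false; }
#endif

/* Hypothetical helper: describe what an '--event foo.o' request gets. */
static const char *demo_bpf_event_status(void)
{
	return demo_have_bpf() ? "BPF object events enabled"
			       : "BPF support disabled at build time";
}
```

Building the same file with and without -DHAVE_LIBBPF_SUPPORT selects
one of the two definitions, so callers need no #ifdef of their own.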

tools/lib/bpf is added to TAG_FOLDERS so that libbpf files can be
navigated with tools/perf/tags when working on perf.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1435716878-189507-24-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/build/Makefile.feature |  6 ++++--
 tools/perf/MANIFEST          |  3 +++
 tools/perf/Makefile.perf     | 19 +++++++++++++++++--
 tools/perf/config/Makefile   | 19 ++++++++++++++++++-
 tools/perf/tests/make        |  4 +++-
 5 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 2975632..5ec6b37 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -51,7 +51,8 @@ FEATURE_TESTS ?=			\
 	timerfd				\
 	libdw-dwarf-unwind		\
 	zlib				\
-	lzma
+	lzma				\
+	bpf
 
 FEATURE_DISPLAY ?=			\
 	dwarf				\
@@ -67,7 +68,8 @@ FEATURE_DISPLAY ?=			\
 	libunwind			\
 	libdw-dwarf-unwind		\
 	zlib				\
-	lzma
+	lzma				\
+	bpf
 
 # Set FEATURE_CHECK_(C|LD)FLAGS-all for all FEATURE_TESTS features.
 # If in the future we need per-feature checks/flags for features not
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index af009bd..56fe0c9 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -17,6 +17,7 @@ tools/build
 tools/arch/x86/include/asm/atomic.h
 tools/arch/x86/include/asm/rmwcc.h
 tools/lib/traceevent
+tools/lib/bpf
 tools/lib/api
 tools/lib/bpf
 tools/lib/hweight.c
@@ -67,6 +68,8 @@ arch/*/lib/memset*.S
 include/linux/poison.h
 include/linux/hw_breakpoint.h
 include/uapi/linux/perf_event.h
+include/uapi/linux/bpf.h
+include/uapi/linux/bpf_common.h
 include/uapi/linux/const.h
 include/uapi/linux/swab.h
 include/uapi/linux/hw_breakpoint.h
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index d9863cb..a6a789e 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -145,6 +145,7 @@ AWK     = awk
 
 LIB_DIR          = $(srctree)/tools/lib/api/
 TRACE_EVENT_DIR = $(srctree)/tools/lib/traceevent/
+BPF_DIR = $(srctree)/tools/lib/bpf/
 
 # include config/Makefile by default and rule out
 # non-config cases
@@ -180,6 +181,7 @@ strip-libs = $(filter-out -l%,$(1))
 
 ifneq ($(OUTPUT),)
   TE_PATH=$(OUTPUT)
+  BPF_PATH=$(OUTPUT)
 ifneq ($(subdir),)
   LIB_PATH=$(OUTPUT)/../lib/api/
 else
@@ -188,6 +190,7 @@ endif
 else
   TE_PATH=$(TRACE_EVENT_DIR)
   LIB_PATH=$(LIB_DIR)
+  BPF_PATH=$(BPF_DIR)
 endif
 
 LIBTRACEEVENT = $(TE_PATH)libtraceevent.a
@@ -199,6 +202,8 @@ LIBTRACEEVENT_DYNAMIC_LIST_LDFLAGS = -Xlinker --dynamic-list=$(LIBTRACEEVENT_DYN
 LIBAPI = $(LIB_PATH)libapi.a
 export LIBAPI
 
+LIBBPF = $(BPF_PATH)libbpf.a
+
 # python extension build directories
 PYTHON_EXTBUILD     := $(OUTPUT)python_ext_build/
 PYTHON_EXTBUILD_LIB := $(PYTHON_EXTBUILD)lib/
@@ -251,6 +256,9 @@ export PERL_PATH
 LIB_FILE=$(OUTPUT)libperf.a
 
 PERFLIBS = $(LIB_FILE) $(LIBAPI) $(LIBTRACEEVENT)
+ifndef NO_LIBBPF
+  PERFLIBS += $(LIBBPF)
+endif
 
 # We choose to avoid "if .. else if .. else .. endif endif"
 # because maintaining the nesting to match is a pain.  If
@@ -420,6 +428,13 @@ $(LIBAPI)-clean:
 	$(call QUIET_CLEAN, libapi)
 	$(Q)$(MAKE) -C $(LIB_DIR) O=$(OUTPUT) clean >/dev/null
 
+$(LIBBPF): FORCE
+	$(Q)$(MAKE) -C $(BPF_DIR) O=$(OUTPUT) $(OUTPUT)libbpf.a
+
+$(LIBBPF)-clean:
+	$(call QUIET_CLEAN, libbpf)
+	$(Q)$(MAKE) -C $(BPF_DIR) O=$(OUTPUT) clean >/dev/null
+
 help:
 	@echo 'Perf make targets:'
 	@echo '  doc		- make *all* documentation (see below)'
@@ -459,7 +474,7 @@ INSTALL_DOC_TARGETS += quick-install-doc quick-install-man quick-install-html
 $(DOC_TARGETS):
 	$(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) $(@:doc=all)
 
-TAG_FOLDERS= . ../lib/traceevent ../lib/api ../lib/symbol
+TAG_FOLDERS= . ../lib/traceevent ../lib/api ../lib/symbol ../lib/bpf
 TAG_FILES= ../../include/uapi/linux/perf_event.h
 
 TAGS:
@@ -567,7 +582,7 @@ config-clean:
 	$(call QUIET_CLEAN, config)
 	$(Q)$(MAKE) -C $(srctree)/tools/build/feature/ clean >/dev/null
 
-clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean config-clean
+clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean config-clean
 	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIB_FILE) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
 	$(Q)find . -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
 	$(Q)$(RM) $(OUTPUT).config-detected
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 827557f..38a4144 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -106,6 +106,7 @@ ifdef LIBBABELTRACE
   FEATURE_CHECK_LDFLAGS-libbabeltrace := $(LIBBABELTRACE_LDFLAGS) -lbabeltrace-ctf
 endif
 
+FEATURE_CHECK_CFLAGS-bpf = -I. -I$(srctree)/tools/include -I$(srctree)/arch/$(ARCH)/include/uapi -I$(srctree)/include/uapi
 # include ARCH specific config
 -include $(src-perf)/arch/$(ARCH)/Makefile
 
@@ -233,6 +234,7 @@ ifdef NO_LIBELF
   NO_DEMANGLE := 1
   NO_LIBUNWIND := 1
   NO_LIBDW_DWARF_UNWIND := 1
+  NO_LIBBPF := 1
 else
   ifeq ($(feature-libelf), 0)
     ifeq ($(feature-glibc), 1)
@@ -242,13 +244,14 @@ else
       LIBC_SUPPORT := 1
     endif
     ifeq ($(LIBC_SUPPORT),1)
-      msg := $(warning No libelf found, disables 'probe' tool, please install elfutils-libelf-devel/libelf-dev);
+      msg := $(warning No libelf found, disables 'probe' tool and BPF support in 'perf record', please install elfutils-libelf-devel/libelf-dev);
 
       NO_LIBELF := 1
       NO_DWARF := 1
       NO_DEMANGLE := 1
       NO_LIBUNWIND := 1
       NO_LIBDW_DWARF_UNWIND := 1
+      NO_LIBBPF := 1
     else
       ifneq ($(filter s% -static%,$(LDFLAGS),),)
         msg := $(error No static glibc found, please install glibc-static);
@@ -305,6 +308,13 @@ ifndef NO_LIBELF
       $(call detected,CONFIG_DWARF)
     endif # PERF_HAVE_DWARF_REGS
   endif # NO_DWARF
+
+  ifndef NO_LIBBPF
+    ifeq ($(feature-bpf), 1)
+      CFLAGS += -DHAVE_LIBBPF_SUPPORT
+      $(call detected,CONFIG_LIBBPF)
+    endif
+  endif # NO_LIBBPF
 endif # NO_LIBELF
 
 ifeq ($(ARCH),powerpc)
@@ -320,6 +330,13 @@ ifndef NO_LIBUNWIND
   endif
 endif
 
+ifndef NO_LIBBPF
+  ifneq ($(feature-bpf), 1)
+    msg := $(warning BPF API too old. Please install recent kernel headers. BPF support in 'perf record' is disabled.)
+    NO_LIBBPF := 1
+  endif
+endif
+
 dwarf-post-unwind := 1
 dwarf-post-unwind-text := BUG
 
diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index ba31c4b..2cbd0c6 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -44,6 +44,7 @@ make_no_libnuma     := NO_LIBNUMA=1
 make_no_libaudit    := NO_LIBAUDIT=1
 make_no_libbionic   := NO_LIBBIONIC=1
 make_no_auxtrace    := NO_AUXTRACE=1
+make_no_libbpf	    := NO_LIBBPF=1
 make_tags           := tags
 make_cscope         := cscope
 make_help           := help
@@ -66,7 +67,7 @@ make_static         := LDFLAGS=-static
 make_minimal        := NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1
 make_minimal        += NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1
 make_minimal        += NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1
-make_minimal        += NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1
+make_minimal        += NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1
 
 # $(run) contains all available tests
 run := make_pure
@@ -94,6 +95,7 @@ run += make_no_libnuma
 run += make_no_libaudit
 run += make_no_libbionic
 run += make_no_auxtrace
+run += make_no_libbpf
 run += make_help
 run += make_doc
 run += make_perf_o
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 05/31] perf ebpf: Add the libbpf glue
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (3 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 04/31] perf tools: Make perf depend on libbpf Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 06/31] perf tools: Enable passing bpf object file to --event Wang Nan
                   ` (25 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

This patch introduces the 'bpf-loader.[ch]' files, which will be the
interface between perf and libbpf. bpf__prepare_load() resides in
bpf-loader.c. Dummy functions are provided in bpf-loader.h because
bpf-loader.c is built only when CONFIG_LIBBPF is on.

Functions in bpf-loader.c should not report errors explicitly. Instead,
strerror style error reporting should be used.
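
For readers unfamiliar with the convention, here is a minimal sketch of
strerror style reporting. The names demo_prepare_load() and
demo_strerror_prepare_load() are hypothetical and exist only for this
illustration; they are not the functions added by the patch. The worker
returns a negative errno and prints nothing, and a companion function
formats the message into a caller-supplied buffer.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical worker: returns 0 or a negative errno, prints nothing. */
static int demo_prepare_load(const char *filename)
{
	if (!filename || !*filename)
		return -EINVAL;
	return 0;
}

/* Companion formatter: turns the code into text in the caller's buffer. */
static int demo_strerror_prepare_load(const char *filename, int err,
				      char *buf, size_t size)
{
	if (!size)
		return 0;
	if (err < 0)
		err = -err;

	switch (err) {
	case EINVAL:
		snprintf(buf, size, "BPF object file '%s' is invalid",
			 filename ? filename : "(null)");
		break;
	default:
		/* Fall back to the generic errno description. */
		snprintf(buf, size, "%s", strerror(err));
		break;
	}
	return 0;
}
```

The caller decides where and whether the message is shown, which keeps
the loader usable from both command-line parsing and later stages.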

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1436445342-1402-19-git-send-email-wangnan0@huawei.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/bpf-loader.c | 92 ++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h | 47 ++++++++++++++++++++++
 2 files changed, 139 insertions(+)
 create mode 100644 tools/perf/util/bpf-loader.c
 create mode 100644 tools/perf/util/bpf-loader.h

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
new file mode 100644
index 0000000..88531ea
--- /dev/null
+++ b/tools/perf/util/bpf-loader.c
@@ -0,0 +1,92 @@
+/*
+ * bpf-loader.c
+ *
+ * Copyright (C) 2015 Wang Nan <wangnan0@huawei.com>
+ * Copyright (C) 2015 Huawei Inc.
+ */
+
+#include <bpf/libbpf.h>
+#include "perf.h"
+#include "debug.h"
+#include "bpf-loader.h"
+
+#define DEFINE_PRINT_FN(name, level) \
+static int libbpf_##name(const char *fmt, ...)	\
+{						\
+	va_list args;				\
+	int ret;				\
+						\
+	va_start(args, fmt);			\
+	ret = veprintf(level, verbose, pr_fmt(fmt), args);\
+	va_end(args);				\
+	return ret;				\
+}
+
+DEFINE_PRINT_FN(warning, 0)
+DEFINE_PRINT_FN(info, 0)
+DEFINE_PRINT_FN(debug, 1)
+
+static bool libbpf_initialized;
+
+int bpf__prepare_load(const char *filename)
+{
+	struct bpf_object *obj;
+
+	if (!libbpf_initialized)
+		libbpf_set_print(libbpf_warning,
+				 libbpf_info,
+				 libbpf_debug);
+
+	obj = bpf_object__open(filename);
+	if (!obj) {
+		pr_debug("bpf: failed to load %s\n", filename);
+		return -EINVAL;
+	}
+
+	/*
+	 * Throw object pointer away: it will be retrieved using
+	 * bpf_objects iterator.
+	 */
+
+	return 0;
+}
+
+void bpf__clear(void)
+{
+	struct bpf_object *obj, *tmp;
+
+	bpf_object__for_each_safe(obj, tmp)
+		bpf_object__close(obj);
+}
+
+#define bpf__strerror_head(err, buf, size) \
+	char sbuf[STRERR_BUFSIZE], *emsg;\
+	if (!size)\
+		return 0;\
+	if (err < 0)\
+		err = -err;\
+	emsg = strerror_r(err, sbuf, sizeof(sbuf));\
+	switch (err) {\
+	default:\
+		scnprintf(buf, size, "%s", emsg);\
+		break;
+
+#define bpf__strerror_entry(val, fmt...)\
+	case val: {\
+		scnprintf(buf, size, fmt);\
+		break;\
+	}
+
+#define bpf__strerror_end(buf, size)\
+	}\
+	buf[size - 1] = '\0';
+
+int bpf__strerror_prepare_load(const char *filename, int err,
+			       char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(EINVAL, "%s: BPF object file '%s' is invalid",
+			    emsg, filename)
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
new file mode 100644
index 0000000..12be630
--- /dev/null
+++ b/tools/perf/util/bpf-loader.h
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2015, Wang Nan <wangnan0@huawei.com>
+ * Copyright (C) 2015, Huawei Inc.
+ */
+#ifndef __BPF_LOADER_H
+#define __BPF_LOADER_H
+
+#include <linux/compiler.h>
+#include <string.h>
+#include "debug.h"
+
+#ifdef HAVE_LIBBPF_SUPPORT
+int bpf__prepare_load(const char *filename);
+int bpf__strerror_prepare_load(const char *filename, int err,
+			       char *buf, size_t size);
+
+void bpf__clear(void);
+#else
+static inline int bpf__prepare_load(const char *filename __maybe_unused)
+{
+	pr_debug("ERROR: eBPF object loading is disabled during compiling.\n");
+	return -1;
+}
+
+static inline void bpf__clear(void) { }
+
+static inline int
+__bpf_strerror(char *buf, size_t size)
+{
+	if (!size)
+		return 0;
+	strncpy(buf,
+		"ERROR: eBPF object loading is disabled during compiling.\n",
+		size);
+	buf[size - 1] = '\0';
+	return 0;
+}
+
+static inline int
+bpf__strerror_prepare_load(const char *filename __maybe_unused,
+			   int err __maybe_unused,
+			   char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
+#endif
+#endif
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 06/31] perf tools: Enable passing bpf object file to --event
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (4 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 05/31] perf ebpf: Add the libbpf glue Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 07/31] perf probe: Attach trace_probe_event with perf_probe_event Wang Nan
                   ` (24 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

By introducing new rules in tools/perf/util/parse-events.[ly], this
patch enables 'perf record --event bpf_file.o' to select events by an
eBPF object file. It calls parse_events_load_bpf() to load that file,
which uses bpf__prepare_load() and finally calls bpf_object__open() for
the object files.
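
As a rough illustration of what the new lexer rule accepts, the sketch
below approximates the pattern '.*\.(o|bpf)' for a complete event
string. looks_like_bpf_object() is a hypothetical helper written for
this note, not something perf provides, and it ignores the lexer's
start conditions and longest-match behaviour.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if 'str' ends with 'suffix'. */
static bool ends_with(const char *str, const char *suffix)
{
	size_t len = strlen(str), slen = strlen(suffix);

	return len >= slen && strcmp(str + len - slen, suffix) == 0;
}

/*
 * Hypothetical helper, not part of perf: approximates the lexer
 * pattern '.*\.(o|bpf)' for a complete event string.
 */
static bool looks_like_bpf_object(const char *event)
{
	return ends_with(event, ".o") || ends_with(event, ".bpf");
}
```

Under this approximation 'foo.o' and 'prog.bpf' are treated as object
files, while an ordinary event name such as 'cycles' is not.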

Instead of adding evsels to the evlist during parsing, events selected
by eBPF object files are appended separately. The reasons are:

 1. During parsing, the probing points have not been initialized.

 2. Currently we are unable to call add_perf_probe_events() twice,
    therefore we have to wait until all such events are collected,
    then probe all points by one call.

The actual probing and selecting reside in the following patches.

To collect '--filter' events, add a dummy evsel during parsing.

Since bpf__prepare_load() may be called during cmdline parsing, all
builtin commands which may call parse_events_option() should release
bpf resources during cleanup. Add bpf__clear() to the stat, record, top
and trace commands, although currently only 'perf record' is going to
be supported.
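
The cleanup requirement is why early 'return' statements in the
builtins become jumps to a shared label. A minimal sketch of that
pattern follows; demo_cmd(), demo_parse_events() and demo_clear() are
hypothetical stand-ins, not perf functions.

```c
#include <errno.h>

/* Stand-in for objects opened while parsing the command line. */
static int demo_open_objects;

static void demo_parse_events(void) { demo_open_objects++; }
static void demo_clear(void) { demo_open_objects = 0; }

/*
 * Hypothetical command body: 'fail_early' simulates an error that
 * happens after option parsing but before the main work.
 */
static int demo_cmd(int fail_early)
{
	int err = 0;

	demo_parse_events();	/* may open BPF objects */

	if (fail_early) {
		err = -EINVAL;
		goto out_clear;	/* not a bare 'return err' */
	}

	/* ... main work would go here ... */

out_clear:
	demo_clear();
	return err;
}
```

Every exit path taken after parsing reaches the label, so nothing
opened during parsing is left behind on error.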

Committer note:

Testing if the event parsing changes indeed call the BPF loading
routines:

  [root@felicio ~]# ls -la foo.o
  ls: cannot access foo.o: No such file or directory
  [root@felicio ~]# perf record --event foo.o sleep
  libbpf: failed to open foo.o: No such file or directory
  bpf: failed to load foo.o
  invalid or unsupported event: 'foo.o'
  Run 'perf list' for a list of valid events

   usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -e, --event <event>   event selector. use 'perf list' to list available events
  [root@felicio ~]#

Yes, it does this time around.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1436445342-1402-19-git-send-email-wangnan0@huawei.com
[ The veprintf() and bpf loader parts were split from this one;
  Add bpf__clear() into stat, record, top and trace commands.
  Add dummy evsel when parsing.
]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c    |  7 +++++--
 tools/perf/builtin-stat.c      |  8 ++++++--
 tools/perf/builtin-top.c       | 10 +++++++---
 tools/perf/builtin-trace.c     |  6 +++++-
 tools/perf/util/Build          |  1 +
 tools/perf/util/parse-events.c | 40 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/parse-events.h |  3 +++
 tools/perf/util/parse-events.l |  3 +++
 tools/perf/util/parse-events.y | 18 +++++++++++++++++-
 9 files changed, 87 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 81829de..31934b1 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -29,6 +29,7 @@
 #include "util/data.h"
 #include "util/auxtrace.h"
 #include "util/parse-branch-options.h"
+#include "util/bpf-loader.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -1131,13 +1132,13 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (!rec->itr) {
 		rec->itr = auxtrace_record__init(rec->evlist, &err);
 		if (err)
-			return err;
+			goto out_bpf_clear;
 	}
 
 	err = auxtrace_parse_snapshot_options(rec->itr, &rec->opts,
 					      rec->opts.auxtrace_snapshot_opts);
 	if (err)
-		return err;
+		goto out_bpf_clear;
 
 	err = -ENOMEM;
 
@@ -1200,6 +1201,8 @@ out_symbol_exit:
 	perf_evlist__delete(rec->evlist);
 	symbol__exit();
 	auxtrace_record__free(rec->itr);
+out_bpf_clear:
+	bpf__clear();
 	return err;
 }
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 99b62f1..d50a19a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -59,6 +59,7 @@
 #include "util/thread.h"
 #include "util/thread_map.h"
 #include "util/counts.h"
+#include "util/bpf-loader.h"
 
 #include <stdlib.h>
 #include <sys/prctl.h>
@@ -1235,7 +1236,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 		output = fopen(output_name, mode);
 		if (!output) {
 			perror("failed to create output file");
-			return -1;
+			status = -1;
+			goto out;
 		}
 		clock_gettime(CLOCK_REALTIME, &tm);
 		fprintf(output, "# started on %s\n", ctime(&tm.tv_sec));
@@ -1244,7 +1246,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 		output = fdopen(output_fd, mode);
 		if (!output) {
 			perror("Failed opening logfd");
-			return -errno;
+			status = -errno;
+			goto out;
 		}
 	}
 
@@ -1377,5 +1380,6 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 	perf_evlist__free_stats(evsel_list);
 out:
 	perf_evlist__delete(evsel_list);
+	bpf__clear();
 	return status;
 }
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 246203b..ee946dc 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -41,6 +41,7 @@
 #include "util/sort.h"
 #include "util/intlist.h"
 #include "util/parse-branch-options.h"
+#include "util/bpf-loader.h"
 #include "arch/common.h"
 
 #include "util/debug.h"
@@ -1271,8 +1272,10 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	symbol_conf.priv_size = sizeof(struct annotation);
 
 	symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
-	if (symbol__init(NULL) < 0)
-		return -1;
+	if (symbol__init(NULL) < 0) {
+		status = -1;
+		goto out_bpf_clear;
+	}
 
 	sort__setup_elide(stdout);
 
@@ -1290,6 +1293,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 
 out_delete_evlist:
 	perf_evlist__delete(top.evlist);
-
+out_bpf_clear:
+	bpf__clear();
 	return status;
 }
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 57712b9..edb882b 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -30,6 +30,7 @@
 #include "util/intlist.h"
 #include "util/thread_map.h"
 #include "util/stat.h"
+#include "util/bpf-loader.h"
 #include "trace-event.h"
 #include "util/parse-events.h"
 
@@ -3109,6 +3110,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (trace.evlist->nr_entries > 0)
 		evlist__set_evsel_handler(trace.evlist, trace__event_handler);
 
+	/* trace__record calls cmd_record, which calls bpf__clear() */
 	if ((argc >= 1) && (strcmp(argv[0], "record") == 0))
 		return trace__record(&trace, argc-1, &argv[1]);
 
@@ -3119,7 +3121,8 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (!trace.trace_syscalls && !trace.trace_pgfaults &&
 	    trace.evlist->nr_entries == 0 /* Was --events used? */) {
 		pr_err("Please specify something to trace.\n");
-		return -1;
+		err = -1;
+		goto out;
 	}
 
 	if (output_name != NULL) {
@@ -3178,5 +3181,6 @@ out_close:
 	if (output_name != NULL)
 		fclose(trace.output);
 out:
+	bpf__clear();
 	return err;
 }
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index e912856..c0ca4a1 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -83,6 +83,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-pt.o
 libperf-$(CONFIG_AUXTRACE) += intel-bts.o
 libperf-y += parse-branch-options.o
 
+libperf-$(CONFIG_LIBBPF) += bpf-loader.o
 libperf-$(CONFIG_LIBELF) += symbol-elf.o
 libperf-$(CONFIG_LIBELF) += probe-file.o
 libperf-$(CONFIG_LIBELF) += probe-event.o
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 71d91fb..4343433 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -19,6 +19,7 @@
 #include "thread_map.h"
 #include "cpumap.h"
 #include "asm/bug.h"
+#include "bpf-loader.h"
 
 #define MAX_NAME_LEN 100
 
@@ -481,6 +482,45 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
 		return add_tracepoint_event(list, idx, sys, event);
 }
 
+int parse_events_load_bpf(struct parse_events_evlist *data,
+			  struct list_head *list,
+			  char *bpf_file_name)
+{
+	int err;
+	char errbuf[BUFSIZ];
+	struct perf_evsel *evsel;
+
+	/*
+	 * Currently no useful event is linked to the list. BPF object
+	 * files are saved to a separate list and processed together.
+	 *
+	 * A dummy event is added here to collect the '--filter' option.
+	 *
+	 * This could change once the perf probe reentrancy problem is
+	 * solved. After that, probing events file by file is possible.
+	 * However, the probing cost still needs to be considered.
+	 */
+	err = bpf__prepare_load(bpf_file_name);
+	if (err) {
+		bpf__strerror_prepare_load(bpf_file_name, err,
+					   errbuf, sizeof(errbuf));
+		data->error->str = strdup(errbuf);
+		data->error->help = strdup("(add -v to see detail)");
+		return err;
+	}
+
+	/*
+	 * No need to call perf_evsel__init() for the dummy evsel.
+	 * Also, don't increase data->idx.
+	 * data->idx affects other evsel's tracking field.
+	 */
+	evsel = perf_evsel__new_dummy(bpf_file_name);
+	if (!evsel)
+		return -ENOMEM;
+	list_add_tail(&evsel->node, list);
+	return 0;
+}
+
 static int
 parse_breakpoint_type(const char *type, struct perf_event_attr *attr)
 {
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index a09b0e2..3652387 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -119,6 +119,9 @@ int parse_events__modifier_group(struct list_head *list, char *event_mod);
 int parse_events_name(struct list_head *list, char *name);
 int parse_events_add_tracepoint(struct list_head *list, int *idx,
 				char *sys, char *event);
+int parse_events_load_bpf(struct parse_events_evlist *data,
+			  struct list_head *list,
+			  char *bpf_file_name);
 int parse_events_add_numeric(struct parse_events_evlist *data,
 			     struct list_head *list,
 			     u32 type, u64 config,
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 936d566..22e8f93 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -115,6 +115,7 @@ do {							\
 group		[^,{}/]*[{][^}]*[}][^,{}/]*
 event_pmu	[^,{}/]+[/][^/]*[/][^,{}/]*
 event		[^,{}/]+
+bpf_object	.*\.(o|bpf)
 
 num_dec		[0-9]+
 num_hex		0x[a-fA-F0-9]+
@@ -159,6 +160,7 @@ modifier_bp	[rwx]{1,3}
 		}
 
 {event_pmu}	|
+{bpf_object}	|
 {event}		{
 			BEGIN(INITIAL);
 			REWIND(1);
@@ -264,6 +266,7 @@ r{num_raw_hex}		{ return raw(yyscanner); }
 {num_hex}		{ return value(yyscanner, 16); }
 
 {modifier_event}	{ return str(yyscanner, PE_MODIFIER_EVENT); }
+{bpf_object}		{ return str(yyscanner, PE_BPF_OBJECT); }
 {name}			{ return pmu_str_check(yyscanner); }
 "/"			{ BEGIN(config); return '/'; }
 -			{ return '-'; }
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 591905a..3ee3a32 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -42,6 +42,7 @@ static inc_group_count(struct list_head *list,
 %token PE_VALUE PE_VALUE_SYM_HW PE_VALUE_SYM_SW PE_RAW PE_TERM
 %token PE_EVENT_NAME
 %token PE_NAME
+%token PE_BPF_OBJECT
 %token PE_MODIFIER_EVENT PE_MODIFIER_BP
 %token PE_NAME_CACHE_TYPE PE_NAME_CACHE_OP_RESULT
 %token PE_PREFIX_MEM PE_PREFIX_RAW PE_PREFIX_GROUP
@@ -53,6 +54,7 @@ static inc_group_count(struct list_head *list,
 %type <num> PE_RAW
 %type <num> PE_TERM
 %type <str> PE_NAME
+%type <str> PE_BPF_OBJECT
 %type <str> PE_NAME_CACHE_TYPE
 %type <str> PE_NAME_CACHE_OP_RESULT
 %type <str> PE_MODIFIER_EVENT
@@ -69,6 +71,7 @@ static inc_group_count(struct list_head *list,
 %type <head> event_legacy_tracepoint
 %type <head> event_legacy_numeric
 %type <head> event_legacy_raw
+%type <head> event_bpf_file
 %type <head> event_def
 %type <head> event_mod
 %type <head> event_name
@@ -198,7 +201,8 @@ event_def: event_pmu |
 	   event_legacy_mem |
 	   event_legacy_tracepoint sep_dc |
 	   event_legacy_numeric sep_dc |
-	   event_legacy_raw sep_dc
+	   event_legacy_raw sep_dc |
+	   event_bpf_file
 
 event_pmu:
 PE_NAME '/' event_config '/'
@@ -420,6 +424,18 @@ PE_RAW
 	$$ = list;
 }
 
+event_bpf_file:
+PE_BPF_OBJECT
+{
+	struct parse_events_evlist *data = _data;
+	struct list_head *list;
+
+	ALLOC_LIST(list);
+	ABORT_ON(parse_events_load_bpf(data, list, $1));
+	$$ = list;
+}
+
+
 start_terms: event_config
 {
 	struct parse_events_terms *data = _data;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 07/31] perf probe: Attach trace_probe_event with perf_probe_event
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (5 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 06/31] perf tools: Enable passing bpf object file to --event Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-09-02  4:32   ` Namhyung Kim
  2015-08-29  4:21 ` [PATCH 08/31] perf record, bpf: Parse and probe eBPF programs probe points Wang Nan
                   ` (23 subsequent siblings)
  30 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

This patch drops 'struct __event_package'. Instead, it adds the
probe_trace_event array and its count into 'struct perf_probe_event'.

The probe_trace_event information gives later patches a chance to access
the actual probe points and actual arguments. Using them, bpf_loader will
be able to attach one bpf program to the different probing points of an
inline function (which has multiple probing points) or of functions
matched by a glob. Moreover, by reading argument information, bpf code
for reading those arguments can be generated.
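
To make the ownership change concrete, here is a small sketch of a
request structure that carries its own conversion results and can be
cleaned up more than once. struct demo_pev, struct demo_tev and both
functions are hypothetical and only mirror the shape of the change;
they are not the perf structures.

```c
#include <stdlib.h>

struct demo_tev {
	char *point;	/* resolved probe point (illustrative) */
};

struct demo_pev {
	struct demo_tev *tevs;	/* results of the conversion step */
	int ntevs;
};

/* Pretend conversion: one request expands to 'n' concrete points. */
static int demo_convert(struct demo_pev *pev, int n)
{
	pev->tevs = calloc(n, sizeof(*pev->tevs));
	if (!pev->tevs)
		return -1;
	pev->ntevs = n;
	return n;
}

/* Safe to call repeatedly; resets the fields after freeing. */
static void demo_cleanup(struct demo_pev *pev)
{
	int i;

	if (!pev || !pev->ntevs)
		return;
	for (i = 0; i < pev->ntevs; i++)
		free(pev->tevs[i].point);
	free(pev->tevs);
	pev->tevs = NULL;
	pev->ntevs = 0;
}
```

Because the results live in the request itself, a caller that skips
the cleanup can still inspect them after the add step has finished.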

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-22-git-send-email-wangnan0@huawei.com
---
 tools/perf/builtin-probe.c    |  4 ++-
 tools/perf/util/probe-event.c | 60 +++++++++++++++++++++----------------------
 tools/perf/util/probe-event.h |  6 ++++-
 3 files changed, 38 insertions(+), 32 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index b81cec3..826d452 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -496,7 +496,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
 			usage_with_options(probe_usage, options);
 		}
 
-		ret = add_perf_probe_events(params.events, params.nevents);
+		ret = add_perf_probe_events(params.events,
+					    params.nevents,
+					    true);
 		if (ret < 0) {
 			pr_err_with_code("  Error: Failed to add events.", ret);
 			return ret;
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index eb5f18b..57a7bae 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1985,6 +1985,9 @@ void clear_perf_probe_event(struct perf_probe_event *pev)
 	struct perf_probe_arg_field *field, *next;
 	int i;
 
+	if (pev->ntevs)
+		cleanup_perf_probe_event(pev);
+
 	free(pev->event);
 	free(pev->group);
 	free(pev->target);
@@ -2759,61 +2762,58 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
 	return find_probe_trace_events_from_map(pev, tevs);
 }
 
-struct __event_package {
-	struct perf_probe_event		*pev;
-	struct probe_trace_event	*tevs;
-	int				ntevs;
-};
-
-int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
+int cleanup_perf_probe_event(struct perf_probe_event *pev)
 {
-	int i, j, ret;
-	struct __event_package *pkgs;
+	int i;
 
-	ret = 0;
-	pkgs = zalloc(sizeof(struct __event_package) * npevs);
+	if (!pev || !pev->ntevs)
+		return 0;
 
-	if (pkgs == NULL)
-		return -ENOMEM;
+	for (i = 0; i < pev->ntevs; i++)
+		clear_probe_trace_event(&pev->tevs[i]);
+
+	zfree(&pev->tevs);
+	pev->ntevs = 0;
+	return 0;
+}
+
+int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
+			  bool cleanup)
+{
+	int i, ret;
 
 	ret = init_symbol_maps(pevs->uprobes);
-	if (ret < 0) {
-		free(pkgs);
+	if (ret < 0)
 		return ret;
-	}
 
 	/* Loop 1: convert all events */
 	for (i = 0; i < npevs; i++) {
-		pkgs[i].pev = &pevs[i];
 		/* Init kprobe blacklist if needed */
-		if (!pkgs[i].pev->uprobes)
+		if (!pevs[i].uprobes)
 			kprobe_blacklist__init();
 		/* Convert with or without debuginfo */
-		ret  = convert_to_probe_trace_events(pkgs[i].pev,
-						     &pkgs[i].tevs);
-		if (ret < 0)
+		ret  = convert_to_probe_trace_events(&pevs[i], &pevs[i].tevs);
+		if (ret < 0) {
+			cleanup = true;
 			goto end;
-		pkgs[i].ntevs = ret;
+		}
+		pevs[i].ntevs = ret;
 	}
 	/* This just release blacklist only if allocated */
 	kprobe_blacklist__release();
 
 	/* Loop 2: add all events */
 	for (i = 0; i < npevs; i++) {
-		ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs,
-					       pkgs[i].ntevs,
+		ret = __add_probe_trace_events(&pevs[i], pevs[i].tevs,
+					       pevs[i].ntevs,
 					       probe_conf.force_add);
 		if (ret < 0)
 			break;
 	}
 end:
 	/* Loop 3: cleanup and free trace events  */
-	for (i = 0; i < npevs; i++) {
-		for (j = 0; j < pkgs[i].ntevs; j++)
-			clear_probe_trace_event(&pkgs[i].tevs[j]);
-		zfree(&pkgs[i].tevs);
-	}
-	free(pkgs);
+	for (i = 0; cleanup && (i < npevs); i++)
+		cleanup_perf_probe_event(&pevs[i]);
 	exit_symbol_maps();
 
 	return ret;
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 6e7ec68..915f0d8 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -87,6 +87,8 @@ struct perf_probe_event {
 	bool			uprobes;	/* Uprobe event flag */
 	char			*target;	/* Target binary */
 	struct perf_probe_arg	*args;	/* Arguments */
+	struct probe_trace_event *tevs;
+	int			ntevs;
 };
 
 /* Line range */
@@ -137,8 +139,10 @@ extern void line_range__clear(struct line_range *lr);
 /* Initialize line range */
 extern int line_range__init(struct line_range *lr);
 
-extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs);
+extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
+				 bool cleanup);
 extern int del_perf_probe_events(struct strfilter *filter);
+extern int cleanup_perf_probe_event(struct perf_probe_event *pev);
 extern int show_perf_probe_events(struct strfilter *filter);
 extern int show_line_range(struct line_range *lr, const char *module,
 			   bool user);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 08/31] perf record, bpf: Parse and probe eBPF programs probe points
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (6 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 07/31] perf probe: Attach trace_probe_event with perf_probe_event Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 09/31] perf bpf: Collect 'struct perf_probe_event' for bpf_program Wang Nan
                   ` (22 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

This patch introduces bpf__{un,}probe() functions to enable callers to
create kprobe points based on section names of BPF programs. It parses
the section names of each eBPF program and creates corresponding 'struct
perf_probe_event' structures. The parse_perf_probe_command() function is
used to do the main parsing work.

The parsing result is stored in an array to satisfy
add_perf_probe_events(), which accepts an array of 'struct
perf_probe_event' and does all the work in one call.
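
The collect-then-submit flow can be sketched as below. All names
(DEMO_MAX_PROBES, struct demo_probe_req, demo_add_all(), demo_probe())
are hypothetical stand-ins. This version checks the bound before
writing a slot, which is one possible arrangement and not necessarily
the one used in the patch.

```c
#include <errno.h>

#define DEMO_MAX_PROBES 4	/* illustrative; perf uses MAX_PROBES */

struct demo_probe_req {
	const char *spec;
};

static int demo_batches;	/* how many times the batch step ran */

/* Stand-in for a step that can only be invoked once per run. */
static int demo_add_all(struct demo_probe_req *reqs, int n)
{
	(void)reqs;
	demo_batches++;
	return n;
}

/* Gather every spec first, check the bound, then make a single call. */
static int demo_probe(const char **specs, int nspecs)
{
	struct demo_probe_req reqs[DEMO_MAX_PROBES];
	int i, n = 0;

	for (i = 0; i < nspecs; i++) {
		if (n >= DEMO_MAX_PROBES)
			return -ERANGE;	/* check before writing a slot */
		reqs[n++].spec = specs[i];
	}
	return demo_add_all(reqs, n);
}
```

If the limit is exceeded, the batch step is never reached, so no
partial set of probes is created.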

Define PERF_BPF_PROBE_GROUP as "perf_bpf_probe", which will be used as
the group name of all eBPF probing points.

probe_conf.max_probes is set to MAX_PROBES to support glob matching.

Before bpf__probe() returns, the data in each 'struct perf_probe_event'
is cleaned. This will be changed by the following patches because they
need the 'struct probe_trace_event' in them.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1436445342-1402-21-git-send-email-wangnan0@huawei.com
Link: http://lkml.kernel.org/n/1436445342-1402-23-git-send-email-wangnan0@huawei.com
[Merged from two patches]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c  |  19 ++++++-
 tools/perf/util/bpf-loader.c | 133 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h |  13 +++++
 3 files changed, 164 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 31934b1..8833186 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1140,7 +1140,23 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (err)
 		goto out_bpf_clear;
 
-	err = -ENOMEM;
+	/*
+	 * bpf__probe must be called before symbol__init() because we
+	 * need init_symbol_maps. If called after symbol__init,
+	 * symbol_conf.sort_by_name won't take effect.
+	 *
+	 * bpf__unprobe() is safe even if bpf__probe() failed, and it
+	 * also calls symbol__init. Therefore, goto out_symbol_exit
+	 * is safe when probe failed.
+	 */
+	err = bpf__probe();
+	if (err) {
+		bpf__strerror_probe(err, errbuf, sizeof(errbuf));
+
+		pr_err("Probing at events in BPF object failed.\n");
+		pr_err("\t%s\n", errbuf);
+		goto out_symbol_exit;
+	}
 
 	symbol__init(NULL);
 
@@ -1201,6 +1217,7 @@ out_symbol_exit:
 	perf_evlist__delete(rec->evlist);
 	symbol__exit();
 	auxtrace_record__free(rec->itr);
+	bpf__unprobe();
 out_bpf_clear:
 	bpf__clear();
 	return err;
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 88531ea..435f52e 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -9,6 +9,8 @@
 #include "perf.h"
 #include "debug.h"
 #include "bpf-loader.h"
+#include "probe-event.h"
+#include "probe-finder.h"
 
 #define DEFINE_PRINT_FN(name, level) \
 static int libbpf_##name(const char *fmt, ...)	\
@@ -28,6 +30,58 @@ DEFINE_PRINT_FN(debug, 1)
 
 static bool libbpf_initialized;
 
+static int
+config_bpf_program(struct bpf_program *prog, struct perf_probe_event *pev)
+{
+	const char *config_str;
+	int err;
+
+	config_str = bpf_program__title(prog, false);
+	if (!config_str) {
+		pr_debug("bpf: unable to get title for program\n");
+		return -EINVAL;
+	}
+
+	pr_debug("bpf: config program '%s'\n", config_str);
+	err = parse_perf_probe_command(config_str, pev);
+	if (err < 0) {
+		pr_debug("bpf: '%s' is not a valid config string\n",
+			 config_str);
+		/* Parse failed, no need to clear pev. */
+		return -EINVAL;
+	}
+
+	if (pev->group && strcmp(pev->group, PERF_BPF_PROBE_GROUP)) {
+		pr_debug("bpf: '%s': group for event is set and not '%s'.\n",
+			 config_str, PERF_BPF_PROBE_GROUP);
+		err = -EINVAL;
+		goto errout;
+	} else if (!pev->group)
+		pev->group = strdup(PERF_BPF_PROBE_GROUP);
+
+	if (!pev->group) {
+		pr_debug("bpf: strdup failed\n");
+		err = -ENOMEM;
+		goto errout;
+	}
+
+	if (!pev->event) {
+		pr_debug("bpf: '%s': event name is missing\n",
+			 config_str);
+		err = -EINVAL;
+		goto errout;
+	}
+
+	pr_debug("bpf: config '%s' is ok\n", config_str);
+
+	return 0;
+
+errout:
+	if (pev)
+		clear_perf_probe_event(pev);
+	return err;
+}
+
 int bpf__prepare_load(const char *filename)
 {
 	struct bpf_object *obj;
@@ -59,6 +113,74 @@ void bpf__clear(void)
 		bpf_object__close(obj);
 }
 
+static bool is_probed;
+
+int bpf__unprobe(void)
+{
+	struct strfilter *delfilter;
+	int ret;
+
+	if (!is_probed)
+		return 0;
+
+	delfilter = strfilter__new(PERF_BPF_PROBE_GROUP ":*", NULL);
+	if (!delfilter) {
+		pr_debug("Failed to create delfilter when unprobing\n");
+		return -ENOMEM;
+	}
+
+	ret = del_perf_probe_events(delfilter);
+	strfilter__delete(delfilter);
+	if (ret < 0 && is_probed)
+		pr_debug("Error: failed to delete events: %s\n",
+			 strerror(-ret));
+	else
+		is_probed = false;
+	return ret < 0 ? ret : 0;
+}
+
+int bpf__probe(void)
+{
+	int err, nr_events = 0;
+	struct bpf_object *obj, *tmp;
+	struct bpf_program *prog;
+	struct perf_probe_event *pevs;
+
+	pevs = calloc(MAX_PROBES, sizeof(pevs[0]));
+	if (!pevs)
+		return -ENOMEM;
+
+	bpf_object__for_each_safe(obj, tmp) {
+		bpf_object__for_each_program(prog, obj) {
+			err = config_bpf_program(prog, &pevs[nr_events++]);
+			if (err < 0)
+				goto out;
+
+			if (nr_events >= MAX_PROBES) {
+				pr_debug("Too many (more than %d) events\n",
+					 MAX_PROBES);
+				err = -ERANGE;
+				goto out;
+			};
+		}
+	}
+
+	probe_conf.max_probes = MAX_PROBES;
+	/* Let add_perf_probe_events generates probe_trace_event (tevs) */
+	err = add_perf_probe_events(pevs, nr_events, false);
+
+	/* add_perf_probe_events return negative when fail */
+	if (err < 0) {
+		pr_debug("bpf probe: failed to probe events\n");
+	} else
+		is_probed = true;
+out:
+	while (nr_events > 0)
+		clear_perf_probe_event(&pevs[--nr_events]);
+	free(pevs);
+	return err < 0 ? err : 0;
+}
+
 #define bpf__strerror_head(err, buf, size) \
 	char sbuf[STRERR_BUFSIZE], *emsg;\
 	if (!size)\
@@ -90,3 +212,14 @@ int bpf__strerror_prepare_load(const char *filename, int err,
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_probe(int err, char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(ERANGE, "Too many (more than %d) events",
+			    MAX_PROBES);
+	bpf__strerror_entry(ENOENT, "Selected kprobe point doesn't exist.");
+	bpf__strerror_entry(EEXIST, "Selected kprobe point already exist, try perf probe -d '*'.");
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 12be630..6b09a85 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -9,10 +9,15 @@
 #include <string.h>
 #include "debug.h"
 
+#define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
+
 #ifdef HAVE_LIBBPF_SUPPORT
 int bpf__prepare_load(const char *filename);
 int bpf__strerror_prepare_load(const char *filename, int err,
 			       char *buf, size_t size);
+int bpf__probe(void);
+int bpf__unprobe(void);
+int bpf__strerror_probe(int err, char *buf, size_t size);
 
 void bpf__clear(void);
 #else
@@ -22,6 +27,8 @@ static inline int bpf__prepare_load(const char *filename __maybe_unused)
 	return -1;
 }
 
+static inline int bpf__probe(void) { return 0; }
+static inline int bpf__unprobe(void) { return 0; }
 static inline void bpf__clear(void) { }
 
 static inline int
@@ -43,5 +50,11 @@ bpf__strerror_prepare_load(const char *filename __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int bpf__strerror_probe(int err __maybe_unused,
+				      char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 09/31] perf bpf: Collect 'struct perf_probe_event' for bpf_program
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (7 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 08/31] perf record, bpf: Parse and probe eBPF programs probe points Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 10/31] perf record: Load all eBPF object into kernel Wang Nan
                   ` (21 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

This patch uses bpf_program__set_private() to bind a perf_probe_event
to each BPF program through the program's private field.

This information is saved so that 'perf record' knows which kprobe
point a program should be attached to.

Since the data in 'struct perf_probe_event' is built in two stages,
pev_ready is used to mark whether the information (especially tevs) in
'struct perf_probe_event' is valid. It is false at first and is set to
true by sync_bpf_program_pev(), which copies all pointers in the
original pev into a program-specific memory region.
sync_bpf_program_pev() is called after add_perf_probe_events() to make
sure the data is ready.

Remove the code which cleans 'struct perf_probe_event' after
bpf__probe(): because the pointers in pevs are copied to the program's
private field, calling clear_perf_probe_event() becomes unsafe.
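
The two-stage handover described above can be sketched in isolation as
follows. This is only an illustration, not the code in the patch: every
demo_* name is hypothetical, and 'struct demo_pev' is a reduced stand-in
for 'struct perf_probe_event'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for 'struct perf_probe_event'. */
struct demo_pev {
	char *event;	/* heap-allocated name */
	int ntevs;	/* filled in by the second stage */
};

struct demo_priv {
	bool pev_ready;
	union {
		struct demo_pev *ppev;	/* stage 1: borrowed pointer */
		struct demo_pev pev;	/* stage 2: owned shallow copy */
	};
};

/* Stage 1: only remember where the caller's temporary entry lives. */
static struct demo_priv *demo_priv__new(struct demo_pev *tmp)
{
	struct demo_priv *priv = calloc(1, sizeof(*priv));

	if (priv)
		priv->ppev = tmp;
	return priv;
}

/* Stage 2: shallow-copy the entry; its heap pointers change owner. */
static int demo_priv__sync(struct demo_priv *priv)
{
	struct demo_pev *ppev;

	if (!priv || priv->pev_ready)
		return -1;

	ppev = priv->ppev;	/* read before the union is overwritten */
	memcpy(&priv->pev, ppev, sizeof(*ppev));
	priv->pev_ready = true;
	return 0;
}

/* Release the copy only if stage 2 actually happened. */
static void demo_priv__delete(struct demo_priv *priv)
{
	if (priv && priv->pev_ready)
		free(priv->pev.event);
	free(priv);
}
```

After demo_priv__sync() the temporary entry may be freed, but its
members must not be cleared, because the private copy now owns them.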

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1436445342-1402-21-git-send-email-wangnan0@huawei.com
[Split from a larger patch]
---
 tools/perf/util/bpf-loader.c | 90 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 88 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 435f52e..ae23f6f 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -30,9 +30,35 @@ DEFINE_PRINT_FN(debug, 1)
 
 static bool libbpf_initialized;
 
+struct bpf_prog_priv {
+	/*
+	 * If pev_ready is false, ppev points to local memory which
+	 * is only valid inside bpf__probe().
+	 * pev is valid only when pev_ready.
+	 */
+	bool pev_ready;
+	union {
+		struct perf_probe_event *ppev;
+		struct perf_probe_event pev;
+	};
+};
+
+static void
+bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
+			  void *_priv)
+{
+	struct bpf_prog_priv *priv = _priv;
+
+	/* check if pev is initialized */
+	if (priv && priv->pev_ready)
+		clear_perf_probe_event(&priv->pev);
+	free(priv);
+}
+
 static int
 config_bpf_program(struct bpf_program *prog, struct perf_probe_event *pev)
 {
+	struct bpf_prog_priv *priv = NULL;
 	const char *config_str;
 	int err;
 
@@ -74,14 +100,58 @@ config_bpf_program(struct bpf_program *prog, struct perf_probe_event *pev)
 
 	pr_debug("bpf: config '%s' is ok\n", config_str);
 
+	priv = calloc(1, sizeof(*priv));
+	if (!priv) {
+		pr_debug("bpf: failed to alloc memory\n");
+		err = -ENOMEM;
+		goto errout;
+	}
+
+	/*
+	 * At this very early stage, tevs inside pev are not ready.
+	 * They become usable after add_perf_probe_events() is called.
+	 * Set pev_ready to false so further accesses read priv->ppev
+	 * only.
+	 */
+	priv->pev_ready = false;
+	priv->ppev = pev;
+
+	err = bpf_program__set_private(prog, priv,
+				       bpf_prog_priv__clear);
+	if (err) {
+		pr_debug("bpf: set program private failed\n");
+		err = -ENOMEM;
+		goto errout;
+	}
 	return 0;
 
 errout:
 	if (pev)
 		clear_perf_probe_event(pev);
+	if (priv)
+		free(priv);
 	return err;
 }
 
+static int
+sync_bpf_program_pev(struct bpf_program *prog)
+{
+	int err;
+	struct bpf_prog_priv *priv;
+	struct perf_probe_event *ppev;
+
+	err = bpf_program__get_private(prog, (void **)&priv);
+	if (err || !priv || priv->pev_ready) {
+		pr_debug("Internal error: sync_bpf_program_pev\n");
+		return -EINVAL;
+	}
+
+	ppev = priv->ppev;
+	memcpy(&priv->pev, ppev, sizeof(*ppev));
+	priv->pev_ready = true;
+	return 0;
+}
+
 int bpf__prepare_load(const char *filename)
 {
 	struct bpf_object *obj;
@@ -172,11 +242,27 @@ int bpf__probe(void)
 	/* add_perf_probe_events return negative when fail */
 	if (err < 0) {
 		pr_debug("bpf probe: failed to probe events\n");
+		goto out;
 	} else
 		is_probed = true;
+
+	/*
+	 * After add_perf_probe_events, 'struct perf_probe_event' is ready.
+	 * Only now is it safe to copy it into each program's priv->pev
+	 * field and to free the big array allocated before.
+	 */
+	bpf_object__for_each_safe(obj, tmp) {
+		bpf_object__for_each_program(prog, obj) {
+			err = sync_bpf_program_pev(prog);
+			if (err)
+				goto out;
+		}
+	}
 out:
-	while (nr_events > 0)
-		clear_perf_probe_event(&pevs[--nr_events]);
+	/*
+	 * Don't call clear_perf_probe_event() for entries of pevs:
+	 * they are used by prog's private field.
+	 */
 	free(pevs);
 	return err < 0 ? err : 0;
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 10/31] perf record: Load all eBPF object into kernel
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (8 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 09/31] perf bpf: Collect 'struct perf_probe_event' for bpf_program Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 11/31] perf tools: Add bpf_fd field to evsel and config it Wang Nan
                   ` (20 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

This patch uses bpf_object__load(), provided by libbpf, to load all
objects into the kernel.
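
The "load everything, unload everything on the first failure" shape of
bpf__load() can be sketched without libbpf as below. The demo_* types
and functions are hypothetical stand-ins for 'struct bpf_object',
bpf_object__load() and bpf_object__unload(); an array replaces the
object list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for 'struct bpf_object'. */
struct demo_obj {
	int loaded;
	int should_fail;	/* test knob: make the load fail */
};

static int demo_obj__load(struct demo_obj *obj)
{
	if (obj->should_fail)
		return -EINVAL;
	obj->loaded = 1;
	return 0;
}

static void demo_obj__unload(struct demo_obj *obj)
{
	obj->loaded = 0;
}

/* Load every object; on the first failure unload all of them. */
static int demo_load_all(struct demo_obj *objs, size_t nr)
{
	size_t i;
	int err;

	for (i = 0; i < nr; i++) {
		err = demo_obj__load(&objs[i]);
		if (err)
			goto errout;
	}
	return 0;
errout:
	/* Unloading an object that was never loaded is harmless here. */
	for (i = 0; i < nr; i++)
		demo_obj__unload(&objs[i]);
	return err;
}
```

The rollback walks the whole set rather than only the objects loaded so
far, which assumes that unloading a not-yet-loaded object is a no-op.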

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-24-git-send-email-wangnan0@huawei.com
---
 tools/perf/builtin-record.c  | 15 +++++++++++++++
 tools/perf/util/bpf-loader.c | 28 ++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h | 10 ++++++++++
 3 files changed, 53 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8833186..c335ac5 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1158,6 +1158,21 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		goto out_symbol_exit;
 	}
 
+	/*
+	 * bpf__probe() also calls symbol__init() if there are probe
+	 * events in bpf objects, so calling symbol__exit() on failure
+	 * is safe. If there is no probe event, bpf__load() always
+	 * succeeds.
+	 */
+	err = bpf__load();
+	if (err) {
+		pr_err("Loading BPF programs failed:\n");
+
+		bpf__strerror_load(err, errbuf, sizeof(errbuf));
+		pr_err("\t%s\n", errbuf);
+		goto out_symbol_exit;
+	}
+
 	symbol__init(NULL);
 
 	if (symbol_conf.kptr_restrict)
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index ae23f6f..d63a594 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -267,6 +267,25 @@ out:
 	return err < 0 ? err : 0;
 }
 
+int bpf__load(void)
+{
+	struct bpf_object *obj, *tmp;
+	int err = 0;
+
+	bpf_object__for_each_safe(obj, tmp) {
+		err = bpf_object__load(obj);
+		if (err) {
+			pr_debug("bpf: load objects failed\n");
+			goto errout;
+		}
+	}
+	return 0;
+errout:
+	bpf_object__for_each_safe(obj, tmp)
+		bpf_object__unload(obj);
+	return err;
+}
+
 #define bpf__strerror_head(err, buf, size) \
 	char sbuf[STRERR_BUFSIZE], *emsg;\
 	if (!size)\
@@ -309,3 +328,12 @@ int bpf__strerror_probe(int err, char *buf, size_t size)
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_load(int err, char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(EINVAL, "%s: add -v to see detail. Run a CONFIG_BPF_SYSCALL kernel?",
+			    emsg)
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 6b09a85..4d7552e 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -19,6 +19,9 @@ int bpf__probe(void);
 int bpf__unprobe(void);
 int bpf__strerror_probe(int err, char *buf, size_t size);
 
+int bpf__load(void);
+int bpf__strerror_load(int err, char *buf, size_t size);
+
 void bpf__clear(void);
 #else
 static inline int bpf__prepare_load(const char *filename __maybe_unused)
@@ -29,6 +32,7 @@ static inline int bpf__prepare_load(const char *filename __maybe_unused)
 
 static inline int bpf__probe(void) { return 0; }
 static inline int bpf__unprobe(void) { return 0; }
+static inline int bpf__load(void) { return 0; }
 static inline void bpf__clear(void) { }
 
 static inline int
@@ -56,5 +60,11 @@ static inline int bpf__strerror_probe(int err __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int bpf__strerror_load(int err __maybe_unused,
+				     char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 11/31] perf tools: Add bpf_fd field to evsel and config it
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (9 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 10/31] perf record: Load all eBPF object into kernel Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 12/31] perf tools: Allow filter option to be applied to BPF object Wang Nan
                   ` (19 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

This patch adds a bpf_fd field to 'struct perf_evsel' and introduces a
method to configure it. In bpf-loader, a bpf__foreach_tev() function is
added, which calls the callback function for each 'struct
probe_trace_event' of each BPF program, passing the program's file
descriptor. In evlist.c, perf_evlist__add_bpf() is introduced to add
all BPF events to the evlist. The event names are taken from the
probe_trace_event structure. 'perf record' calls
perf_evlist__add_bpf().

Since bpf-loader.c will not be built if libbpf is turned off, an empty
bpf__foreach_tev() is defined in bpf-loader.h to avoid a compile
error.

This patch iterates over 'struct probe_trace_event' instead of
'struct bpf_program' during the loop in preparation for further
patches, which will generate multiple instances from one BPF program
and install them onto different 'struct probe_trace_event'.
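
The iteration contract (one callback invocation per trace event, with
the owning program's fd, stopping at the first non-zero return) can be
sketched as follows. The demo_* names are hypothetical; arrays stand in
for the libbpf object and program lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for 'struct probe_trace_event'. */
struct demo_tev {
	const char *group;
	const char *event;
};

/* Hypothetical stand-in for a loaded program and its trace events. */
struct demo_prog {
	int fd;
	int ntevs;
	const struct demo_tev *tevs;
};

typedef int (*demo_iter_cb_t)(const struct demo_tev *tev, int fd, void *arg);

/* Call func once per trace event; stop at the first error. */
static int demo_foreach_tev(const struct demo_prog *progs, size_t nr_progs,
			    demo_iter_cb_t func, void *arg)
{
	size_t p;
	int i, err;

	for (p = 0; p < nr_progs; p++) {
		if (progs[p].fd < 0)
			return progs[p].fd;	/* program is not loaded */

		for (i = 0; i < progs[p].ntevs; i++) {
			err = func(&progs[p].tevs[i], progs[p].fd, arg);
			if (err)
				return err;
		}
	}
	return 0;
}

/* Example callback: count the events it is shown. */
static int demo_count_cb(const struct demo_tev *tev, int fd, void *arg)
{
	int *count = arg;

	(void)tev;
	(void)fd;
	(*count)++;
	return 0;
}
```

Keying the callback on trace events rather than on programs means a
program whose probe expands to several trace events is visited once per
event, each time with the same fd.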

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-25-git-send-email-wangnan0@huawei.com
---
 tools/perf/builtin-record.c  |  6 ++++++
 tools/perf/util/bpf-loader.c | 41 +++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h | 13 +++++++++++++
 tools/perf/util/evlist.c     | 41 +++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h     |  1 +
 tools/perf/util/evsel.c      |  1 +
 tools/perf/util/evsel.h      |  1 +
 7 files changed, 104 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index c335ac5..5051d3b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1173,6 +1173,12 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		goto out_symbol_exit;
 	}
 
+	err = perf_evlist__add_bpf(rec->evlist);
+	if (err < 0) {
+		pr_err("Failed to add events from BPF object(s)\n");
+		goto out_symbol_exit;
+	}
+
 	symbol__init(NULL);
 
 	if (symbol_conf.kptr_restrict)
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index d63a594..126aa71 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -286,6 +286,47 @@ errout:
 	return err;
 }
 
+int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg)
+{
+	struct bpf_object *obj, *tmp;
+	struct bpf_program *prog;
+	int err;
+
+	bpf_object__for_each_safe(obj, tmp) {
+		bpf_object__for_each_program(prog, obj) {
+			struct probe_trace_event *tev;
+			struct perf_probe_event *pev;
+			struct bpf_prog_priv *priv;
+			int i, fd;
+
+			err = bpf_program__get_private(prog,
+						       (void **)&priv);
+			if (err || !priv) {
+				pr_debug("bpf: failed to get private field\n");
+				return -EINVAL;
+			}
+
+			pev = &priv->pev;
+			for (i = 0; i < pev->ntevs; i++) {
+				tev = &pev->tevs[i];
+
+				fd = bpf_program__fd(prog);
+				if (fd < 0) {
+					pr_debug("bpf: failed to get file descriptor\n");
+					return fd;
+				}
+
+				err = func(tev, fd, arg);
+				if (err) {
+					pr_debug("bpf: call back failed, stop iterate\n");
+					return err;
+				}
+			}
+		}
+	}
+	return 0;
+}
+
 #define bpf__strerror_head(err, buf, size) \
 	char sbuf[STRERR_BUFSIZE], *emsg;\
 	if (!size)\
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 4d7552e..34656f8 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -7,10 +7,14 @@
 
 #include <linux/compiler.h>
 #include <string.h>
+#include "probe-event.h"
 #include "debug.h"
 
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
+typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
+					int fd, void *arg);
+
 #ifdef HAVE_LIBBPF_SUPPORT
 int bpf__prepare_load(const char *filename);
 int bpf__strerror_prepare_load(const char *filename, int err,
@@ -23,6 +27,8 @@ int bpf__load(void);
 int bpf__strerror_load(int err, char *buf, size_t size);
 
 void bpf__clear(void);
+
+int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg);
 #else
 static inline int bpf__prepare_load(const char *filename __maybe_unused)
 {
@@ -36,6 +42,13 @@ static inline int bpf__load(void) { return 0; }
 static inline void bpf__clear(void) { }
 
 static inline int
+bpf__foreach_tev(bpf_prog_iter_callback_t func __maybe_unused,
+		 void *arg __maybe_unused)
+{
+	return 0;
+}
+
+static inline int
 __bpf_strerror(char *buf, size_t size)
 {
 	if (!size)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 8a4e64d..f79bbf8 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -14,6 +14,7 @@
 #include "target.h"
 #include "evlist.h"
 #include "evsel.h"
+#include "bpf-loader.h"
 #include "debug.h"
 #include <unistd.h>
 
@@ -196,6 +197,46 @@ error:
 	return -ENOMEM;
 }
 
+static int add_bpf_event(struct probe_trace_event *tev, int fd,
+			 void *arg)
+{
+	struct perf_evlist *evlist = arg;
+	struct perf_evsel *pos;
+	struct list_head list;
+	int err, idx, entries;
+
+	pr_debug("add bpf event %s:%s and attach bpf program %d\n",
+			tev->group, tev->event, fd);
+	INIT_LIST_HEAD(&list);
+	idx = evlist->nr_entries;
+
+	pr_debug("adding %s:%s\n", tev->group, tev->event);
+	err = parse_events_add_tracepoint(&list, &idx, tev->group,
+					  tev->event);
+	if (err) {
+		struct perf_evsel *evsel, *tmp;
+
+		pr_err("Failed to add BPF event %s:%s\n",
+				tev->group, tev->event);
+		list_for_each_entry_safe(evsel, tmp, &list, node) {
+			list_del(&evsel->node);
+			perf_evsel__delete(evsel);
+		}
+		return -EINVAL;
+	}
+
+	list_for_each_entry(pos, &list, node)
+		pos->bpf_fd = fd;
+	entries = idx - evlist->nr_entries;
+	perf_evlist__splice_list_tail(evlist, &list, entries);
+	return 0;
+}
+
+int perf_evlist__add_bpf(struct perf_evlist *evlist)
+{
+	return bpf__foreach_tev(add_bpf_event, evlist);
+}
+
 static int perf_evlist__add_attrs(struct perf_evlist *evlist,
 				  struct perf_event_attr *attrs, size_t nr_attrs)
 {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7f15727..f7159c5 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -73,6 +73,7 @@ void perf_evlist__delete(struct perf_evlist *evlist);
 
 void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry);
 int perf_evlist__add_default(struct perf_evlist *evlist);
+int perf_evlist__add_bpf(struct perf_evlist *evlist);
 int __perf_evlist__add_default_attrs(struct perf_evlist *evlist,
 				     struct perf_event_attr *attrs, size_t nr_attrs);
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 01267f4..6fff961 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -207,6 +207,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
 	evsel->unit	   = "";
 	evsel->scale	   = 1.0;
 	evsel->evlist	   = NULL;
+	evsel->bpf_fd	   = -1;
 	INIT_LIST_HEAD(&evsel->node);
 	INIT_LIST_HEAD(&evsel->config_terms);
 	perf_evsel__object.init(evsel);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 0b8e47d..699bb14 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -120,6 +120,7 @@ struct perf_evsel {
 	bool			cmdline_group_boundary;
 	bool			is_dummy;
 	struct list_head	config_terms;
+	int			bpf_fd;
 };
 
 union u64_swap {
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 12/31] perf tools: Allow filter option to be applied to BPF object
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (10 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 11/31] perf tools: Add bpf_fd field to evsel and config it Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 13/31] perf tools: Attach eBPF program to perf event Wang Nan
                   ` (18 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Before this patch, --filter options can't be applied to BPF object
'events'. For example, the following command:

 # perf record -e cycles -e test_bpf.o --exclude-perf -a sleep 1

doesn't apply '--exclude-perf' to the events in test_bpf.o. Instead, the
filter is applied to the 'cycles' event. This is caused by the delayed
manner in which real BPF events are added. Because all BPF probe points
are probed by one call, we can't add real events until all BPF objects
are collected. In a previous patch (perf tools: Enable passing bpf object
file to --event), nothing is appended to the evlist.

This patch fixes this by utilizing the dummy event linked during
parse_events(). Filter settings go to the dummy event and are synced
with the real events in add_bpf_event().
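
The dummy-to-real synchronization can be sketched as below: find the
placeholder whose name equals the object name and copy its filter to
each new event. All demo_* names are hypothetical; arrays replace the
evsel lists, and the filter string is shared rather than duplicated to
keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for 'struct perf_evsel'. */
struct demo_evsel {
	const char *name;
	const char *filter;	/* NULL when no filter was given */
	bool is_dummy;
};

/* Find the placeholder that parsing left behind for obj_name. */
static const struct demo_evsel *
demo_find_dummy(const struct demo_evsel *evlist, size_t nr,
		const char *obj_name)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (evlist[i].is_dummy && !strcmp(evlist[i].name, obj_name))
			return &evlist[i];
	}
	return NULL;
}

/* Copy the placeholder's filter to the new events; return how many got it. */
static int demo_sync_with_dummy(const struct demo_evsel *evlist, size_t nr,
				const char *obj_name,
				struct demo_evsel *new_events, size_t nr_new)
{
	const struct demo_evsel *dummy;
	size_t i;

	dummy = demo_find_dummy(evlist, nr, obj_name);
	if (!dummy || !dummy->filter)
		return 0;

	for (i = 0; i < nr_new; i++)
		new_events[i].filter = dummy->filter;
	return (int)nr_new;
}
```

Because the lookup is by object name, a filter given after
'-e test_bpf.o' reaches only the events generated from that object and
leaves other events, such as 'cycles', untouched.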

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1440742821-44548-5-git-send-email-wangnan0@huawei.com
---
 tools/perf/builtin-record.c  |  6 ++++-
 tools/perf/util/bpf-loader.c |  8 ++++++-
 tools/perf/util/bpf-loader.h |  2 ++
 tools/perf/util/evlist.c     | 53 +++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5051d3b..fd56a5b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1113,7 +1113,6 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	argc = parse_options(argc, argv, record_options, record_usage,
 			    PARSE_OPT_STOP_AT_NON_OPTION);
-	perf_evlist__purge_dummy(rec->evlist);
 
 	if (!argc && target__none(&rec->opts.target))
 		usage_with_options(record_usage, record_options);
@@ -1178,6 +1177,11 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		pr_err("Failed to add events from BPF object(s)\n");
 		goto out_symbol_exit;
 	}
+	/*
+	 * Only now can the dummy events be purged: filter options should
+	 * have been attached to real events by perf_evlist__add_bpf().
+	 */
+	perf_evlist__purge_dummy(rec->evlist);
 
 	symbol__init(NULL);
 
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 126aa71..c3bc0a8 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -293,6 +293,12 @@ int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg)
 	int err;
 
 	bpf_object__for_each_safe(obj, tmp) {
+		const char *obj_name;
+
+		obj_name = bpf_object__get_name(obj);
+		if (!obj_name)
+			obj_name = "[unknown]";
+
 		bpf_object__for_each_program(prog, obj) {
 			struct probe_trace_event *tev;
 			struct perf_probe_event *pev;
@@ -316,7 +322,7 @@ int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg)
 					return fd;
 				}
 
-				err = func(tev, fd, arg);
+				err = func(tev, obj_name, fd, arg);
 				if (err) {
 					pr_debug("bpf: call back failed, stop iterate\n");
 					return err;
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 34656f8..323e664 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -6,6 +6,7 @@
 #define __BPF_LOADER_H
 
 #include <linux/compiler.h>
+#include <linux/perf_event.h>
 #include <string.h>
 #include "probe-event.h"
 #include "debug.h"
@@ -13,6 +14,7 @@
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
 typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
+					const char *obj_name,
 					int fd, void *arg);
 
 #ifdef HAVE_LIBBPF_SUPPORT
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f79bbf8..c00e939 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -197,7 +197,45 @@ error:
 	return -ENOMEM;
 }
 
-static int add_bpf_event(struct probe_trace_event *tev, int fd,
+static void
+sync_with_dummy(struct perf_evlist *evlist, const char *obj_name,
+		struct list_head *list)
+{
+	struct perf_evsel *dummy_evsel, *pos;
+	const char *filter;
+	bool found = false;
+	int err;
+
+	evlist__for_each(evlist, dummy_evsel) {
+		if (!perf_evsel__is_dummy(dummy_evsel))
+			continue;
+
+		if (strcmp(dummy_evsel->name, obj_name) == 0) {
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		pr_debug("Failed to find dummy event of '%s'\n",
+			 obj_name);
+		return;
+	}
+
+	filter = dummy_evsel->filter;
+	if (!filter)
+		return;
+
+	list_for_each_entry(pos, list, node) {
+		err = perf_evsel__set_filter(pos, filter);
+		if (err)
+			pr_debug("Failed to set filter '%s' to evsel %s\n",
+				 filter, pos->name);
+	}
+}
+
+static int add_bpf_event(struct probe_trace_event *tev,
+			 const char *obj_name, int fd,
 			 void *arg)
 {
 	struct perf_evlist *evlist = arg;
@@ -205,8 +243,8 @@ static int add_bpf_event(struct probe_trace_event *tev, int fd,
 	struct list_head list;
 	int err, idx, entries;
 
-	pr_debug("add bpf event %s:%s and attach bpf program %d\n",
-			tev->group, tev->event, fd);
+	pr_debug("add bpf event %s:%s and attach bpf program %d (from %s)\n",
+			tev->group, tev->event, fd, obj_name);
 	INIT_LIST_HEAD(&list);
 	idx = evlist->nr_entries;
 
@@ -228,6 +266,15 @@ static int add_bpf_event(struct probe_trace_event *tev, int fd,
 	list_for_each_entry(pos, &list, node)
 		pos->bpf_fd = fd;
 	entries = idx - evlist->nr_entries;
+
+	sync_with_dummy(evlist, obj_name, &list);
+
+	/*
+	 * Currently we don't need to link those new events at the
+	 * same place where the dummy node resides because the order
+	 * of events in cmdline won't be used after
+	 * 'perf_evlist__add_bpf'.
+	 */
 	perf_evlist__splice_list_tail(evlist, &list, entries);
 	return 0;
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 13/31] perf tools: Attach eBPF program to perf event
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (11 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 12/31] perf tools: Allow filter option to be applied to BPF object Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 14/31] perf tools: Suppress probing messages when probing by BPF loading Wang Nan
                   ` (17 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

This is the final patch which makes the basic BPF filter work. After
applying this patch, users are allowed to use a BPF filter like:

 # perf record --event ./hello_world.c ls

In this patch the PERF_EVENT_IOC_SET_BPF ioctl is used to attach an eBPF
program to a newly created perf event. The file descriptor of the
eBPF program is passed to perf record by the previous patches and
stored in evsel->bpf_fd.

It is possible that different perf events are created for one kprobe
event on different CPUs. In this case, when trying to call the
ioctl, EEXIST will be returned. This patch doesn't treat it as an error.
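
The attach step and its error policy can be sketched as a small helper.
demo_attach_bpf() is a hypothetical name; the sketch assumes Linux
headers that define PERF_EVENT_IOC_SET_BPF and returns -errno instead
of the -EINVAL used in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/*
 * Attach the BPF program bpf_fd to the perf event evt_fd.
 * A negative bpf_fd means "no program", which is not an error.
 * EEXIST is tolerated: a program is already attached to this
 * event, for example through the fd opened for another CPU.
 */
static int demo_attach_bpf(int evt_fd, int bpf_fd)
{
	if (bpf_fd < 0)
		return 0;

	if (ioctl(evt_fd, PERF_EVENT_IOC_SET_BPF, bpf_fd) == 0)
		return 0;

	if (errno == EEXIST)
		return 0;

	return -errno;
}
```

A caller would invoke the helper once per (cpu, thread) file descriptor
right after the event is opened, and close the descriptors opened so
far when it returns a negative value.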

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-26-git-send-email-wangnan0@huawei.com
---
 tools/perf/util/evsel.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6fff961..5f59841 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1365,6 +1365,22 @@ retry_open:
 					  err);
 				goto try_fallback;
 			}
+
+			if (evsel->bpf_fd >= 0) {
+				int evt_fd = FD(evsel, cpu, thread);
+				int bpf_fd = evsel->bpf_fd;
+
+				err = ioctl(evt_fd,
+					    PERF_EVENT_IOC_SET_BPF,
+					    bpf_fd);
+				if (err && errno != EEXIST) {
+					pr_err("failed to attach bpf fd %d: %s\n",
+					       bpf_fd, strerror(errno));
+					err = -EINVAL;
+					goto out_close;
+				}
+			}
+
 			set_rlimit = NO_CHANGE;
 
 			/*
-- 
2.1.0



* [PATCH 14/31] perf tools: Suppress probing messages when probing by BPF loading
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (12 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 13/31] perf tools: Attach eBPF program to perf event Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-09-03  0:20   ` Namhyung Kim
  2015-08-29  4:21 ` [PATCH 15/31] perf record: Add clang options for compiling BPF scripts Wang Nan
                   ` (16 subsequent siblings)
  30 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

This patch suppresses the messages printed by add_perf_probe_events()
and del_perf_probe_events() when they are triggered by BPF loading.
Before this patch, using 'perf record' with a BPF object/source as the
event selector printed the following messages:

     Added new event:
       perf_bpf_probe:lock_page_ret (on __lock_page%return)
     You can now use it in all perf tools, such as:
       perf record -e perf_bpf_probe:lock_page_ret -aR sleep 1
     ...
     Removed event: perf_bpf_probe:lock_page_ret

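This is misleading, especially the 'use it in all perf tools' hint,
because the events are removed when 'perf record' exits.

This patch adds a 'silent' field to probe_conf to control the output.
bpf__{,un}probe() set it to true around their calls to
{add,del}_perf_probe_events() and restore the previous value
afterwards. While it is set, the messages go to pr_debug() instead of
pr_info(), and the usage hint is not printed.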
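
The save/set/restore pattern used around the probe calls can be sketched
as follows. This is a standalone illustration: probe_conf_sketch,
conf_sketch, call_silently() and observe_silent() are hypothetical
stand-ins, not perf code.

```c
#include <stdbool.h>

/* Stand-in for perf's global probe_conf; only the field used here. */
struct probe_conf_sketch {
	bool silent;
};

static struct probe_conf_sketch conf_sketch;

/*
 * Run fn() with output silenced, then put the flag back to whatever
 * the caller had, so later users of the global are not affected.
 */
static int call_silently(int (*fn)(void))
{
	bool old_silent = conf_sketch.silent;
	int ret;

	conf_sketch.silent = true;
	ret = fn();
	conf_sketch.silent = old_silent;
	return ret;
}

/* Example callee: reports whether it ran in silent mode. */
static int observe_silent(void)
{
	return conf_sketch.silent ? 1 : 0;
}
```

Restoring the saved value, rather than unconditionally writing false,
keeps the helper correct even if a caller had already enabled silent
mode.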

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1440151770-129878-12-git-send-email-wangnan0@huawei.com
---
 tools/perf/util/bpf-loader.c  |  6 ++++++
 tools/perf/util/probe-event.c | 17 ++++++++++++-----
 tools/perf/util/probe-event.h |  1 +
 tools/perf/util/probe-file.c  |  5 ++++-
 4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index c3bc0a8..77eeb99 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -188,6 +188,7 @@ static bool is_probed;
 int bpf__unprobe(void)
 {
 	struct strfilter *delfilter;
+	bool old_silent = probe_conf.silent;
 	int ret;
 
 	if (!is_probed)
@@ -199,7 +200,9 @@ int bpf__unprobe(void)
 		return -ENOMEM;
 	}
 
+	probe_conf.silent = true;
 	ret = del_perf_probe_events(delfilter);
+	probe_conf.silent = old_silent;
 	strfilter__delete(delfilter);
 	if (ret < 0 && is_probed)
 		pr_debug("Error: failed to delete events: %s\n",
@@ -215,6 +218,7 @@ int bpf__probe(void)
 	struct bpf_object *obj, *tmp;
 	struct bpf_program *prog;
 	struct perf_probe_event *pevs;
+	bool old_silent = probe_conf.silent;
 
 	pevs = calloc(MAX_PROBES, sizeof(pevs[0]));
 	if (!pevs)
@@ -235,9 +239,11 @@ int bpf__probe(void)
 		}
 	}
 
+	probe_conf.silent = true;
 	probe_conf.max_probes = MAX_PROBES;
 	/* Let add_perf_probe_events generates probe_trace_event (tevs) */
 	err = add_perf_probe_events(pevs, nr_events, false);
+	probe_conf.silent = old_silent;
 
 	/* add_perf_probe_events return negative when fail */
 	if (err < 0) {
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 57a7bae..e720913 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -52,7 +52,9 @@
 #define PERFPROBE_GROUP "probe"
 
 bool probe_event_dry_run;	/* Dry run flag */
-struct probe_conf probe_conf;
+struct probe_conf probe_conf = {
+	.silent = false,
+};
 
 #define semantic_error(msg ...) pr_err("Semantic error :" msg)
 
@@ -2192,10 +2194,12 @@ static int show_perf_probe_event(const char *group, const char *event,
 
 	ret = perf_probe_event__sprintf(group, event, pev, module, &buf);
 	if (ret >= 0) {
-		if (use_stdout)
+		if (use_stdout && !probe_conf.silent)
 			printf("%s\n", buf.buf);
-		else
+		else if (!probe_conf.silent)
 			pr_info("%s\n", buf.buf);
+		else
+			pr_debug("%s\n", buf.buf);
 	}
 	strbuf_release(&buf);
 
@@ -2418,7 +2422,10 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
 	}
 
 	ret = 0;
-	pr_info("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
+	if (!probe_conf.silent)
+		pr_info("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
+	else
+		pr_debug("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
 	for (i = 0; i < ntevs; i++) {
 		tev = &tevs[i];
 		/* Skip if the symbol is out of .text or blacklisted */
@@ -2454,7 +2461,7 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
 		warn_uprobe_event_compat(tev);
 
 	/* Note that it is possible to skip all events because of blacklist */
-	if (ret >= 0 && event) {
+	if (ret >= 0 && event && !probe_conf.silent) {
 		/* Show how to use the event. */
 		pr_info("\nYou can now use it in all perf tools, such as:\n\n");
 		pr_info("\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 915f0d8..3ab9c3e 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -13,6 +13,7 @@ struct probe_conf {
 	bool	force_add;
 	bool	no_inlines;
 	int	max_probes;
+	bool	silent;
 };
 extern struct probe_conf probe_conf;
 extern bool probe_event_dry_run;
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index bbb2437..db7bd4c 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -267,7 +267,10 @@ static int __del_trace_probe_event(int fd, struct str_node *ent)
 		goto error;
 	}
 
-	pr_info("Removed event: %s\n", ent->s);
+	if (!probe_conf.silent)
+		pr_info("Removed event: %s\n", ent->s);
+	else
+		pr_debug("Removed event: %s\n", ent->s);
 	return 0;
 error:
 	pr_warning("Failed to delete event: %s\n",
-- 
2.1.0



* [PATCH 15/31] perf record: Add clang options for compiling BPF scripts
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (13 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 14/31] perf tools: Suppress probing messages when probing by BPF loading Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 16/31] perf tools: Infrastructure for compiling scriptlets when passing '.c' to --event Wang Nan
                   ` (15 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Although the previous patch allows setting BPF compiler related options
in perfconfig, some ad-hoc situations still require passing options on
the command line. This patch introduces two options to 'perf record'
for this purpose: --clang-path and --clang-opt.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-28-git-send-email-wangnan0@huawei.com
---
 tools/perf/builtin-record.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index fd56a5b..212718c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -30,6 +30,7 @@
 #include "util/auxtrace.h"
 #include "util/parse-branch-options.h"
 #include "util/bpf-loader.h"
+#include "util/llvm-utils.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -1094,6 +1095,12 @@ struct option __record_options[] = {
 			"per thread proc mmap processing timeout in ms"),
 	OPT_BOOLEAN(0, "switch-events", &record.opts.record_switch_events,
 		    "Record context switch events"),
+#ifdef HAVE_LIBBPF_SUPPORT
+	OPT_STRING(0, "clang-path", &llvm_param.clang_path, "clang path",
+		   "clang binary to use for compiling BPF scriptlets"),
+	OPT_STRING(0, "clang-opt", &llvm_param.clang_opt, "clang options",
+		   "options passed to clang when compiling BPF scriptlets"),
+#endif
 	OPT_END()
 };
 
-- 
2.1.0



* [PATCH 16/31] perf tools: Infrastructure for compiling scriptlets when passing '.c' to --event
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (14 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 15/31] perf record: Add clang options for compiling BPF scripts Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 17/31] perf tests: Enforce LLVM test for BPF test Wang Nan
                   ` (14 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

This patch provides infrastructure for passing source files to --event
directly using:

 # perf record --event bpf-file.c command

This patch does the following:

 1) Allow passing a '.c' file to '--event'. parse_events_load_bpf() is
    extended so the caller can tell it whether the passed file is a
    source file or an object.

 2) Call llvm__compile_bpf() to compile the '.c' file and keep the
    result in memory, then use bpf_object__open_buffer() to load the
    in-memory object.

It also introduces bpf-script-example.c so this can be tested manually:

 # perf record --clang-opt "-DLINUX_VERSION_CODE=0x40200" --event ./bpf-script-example.c sleep 1

Note that '--clang-opt' must be placed before '--event'.

Further patches will merge it into a testcase so it can be tested
automatically.
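
The object-versus-source decision from step 1 can be sketched in plain
C. This only mirrors the two suffix patterns added to the lexer
('.o'/'.bpf' for objects, '.c' for sources); the real lexer has further
rules and start conditions, and the names below (bpf_event_kind,
has_suffix(), classify_bpf_event()) are hypothetical.

```c
#include <string.h>

enum bpf_event_kind {
	BPF_EVENT_NONE,		/* not a BPF file name */
	BPF_EVENT_OBJECT,	/* ends in ".o" or ".bpf" */
	BPF_EVENT_SOURCE,	/* ends in ".c" */
};

/* Return non-zero when s ends with suffix. */
static int has_suffix(const char *s, const char *suffix)
{
	size_t slen = strlen(s), xlen = strlen(suffix);

	return slen >= xlen && strcmp(s + slen - xlen, suffix) == 0;
}

/* Classify an --event argument by its file name suffix. */
static enum bpf_event_kind classify_bpf_event(const char *name)
{
	if (has_suffix(name, ".o") || has_suffix(name, ".bpf"))
		return BPF_EVENT_OBJECT;
	if (has_suffix(name, ".c"))
		return BPF_EVENT_SOURCE;
	return BPF_EVENT_NONE;
}
```

In the patch, the equivalent of BPF_EVENT_SOURCE ends up as 'source =
true' in the call to parse_events_load_bpf(), which in turn selects the
compile-then-open-buffer path.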

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1436445342-1402-20-git-send-email-wangnan0@huawei.com
[ wangnan: Pass name of source file to bpf_object__open_buffer(). ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/bpf-script-example.c | 44 +++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.c          | 25 +++++++++++++++-----
 tools/perf/util/bpf-loader.h          | 10 ++++----
 tools/perf/util/parse-events.c        |  8 +++----
 tools/perf/util/parse-events.h        |  3 ++-
 tools/perf/util/parse-events.l        |  3 +++
 tools/perf/util/parse-events.y        | 15 ++++++++++--
 7 files changed, 91 insertions(+), 17 deletions(-)
 create mode 100644 tools/perf/tests/bpf-script-example.c

diff --git a/tools/perf/tests/bpf-script-example.c b/tools/perf/tests/bpf-script-example.c
new file mode 100644
index 0000000..410a70b
--- /dev/null
+++ b/tools/perf/tests/bpf-script-example.c
@@ -0,0 +1,44 @@
+#ifndef LINUX_VERSION_CODE
+# error Need LINUX_VERSION_CODE
+# error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
+#endif
+#define BPF_ANY 0
+#define BPF_MAP_TYPE_ARRAY 2
+#define BPF_FUNC_map_lookup_elem 1
+#define BPF_FUNC_map_update_elem 2
+
+static void *(*bpf_map_lookup_elem)(void *map, void *key) =
+	(void *) BPF_FUNC_map_lookup_elem;
+static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) =
+	(void *) BPF_FUNC_map_update_elem;
+
+struct bpf_map_def {
+	unsigned int type;
+	unsigned int key_size;
+	unsigned int value_size;
+	unsigned int max_entries;
+};
+
+#define SEC(NAME) __attribute__((section(NAME), used))
+struct bpf_map_def SEC("maps") flip_table = {
+	.type = BPF_MAP_TYPE_ARRAY,
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.max_entries = 1,
+};
+
+SEC("func=sys_epoll_pwait")
+int bpf_func__sys_epoll_pwait(void *ctx)
+{
+	int ind =0;
+	int *flag = bpf_map_lookup_elem(&flip_table, &ind);
+	int new_flag;
+	if (!flag)
+		return 0;
+	/* flip flag and store back */
+	new_flag = !*flag;
+	bpf_map_update_elem(&flip_table, &ind, &new_flag, BPF_ANY);
+	return new_flag;
+}
+char _license[] SEC("license") = "GPL";
+int _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 77eeb99..c2aafe2 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -11,6 +11,7 @@
 #include "bpf-loader.h"
 #include "probe-event.h"
 #include "probe-finder.h"
+#include "llvm-utils.h"
 
 #define DEFINE_PRINT_FN(name, level) \
 static int libbpf_##name(const char *fmt, ...)	\
@@ -152,16 +153,28 @@ sync_bpf_program_pev(struct bpf_program *prog)
 	return 0;
 }
 
-int bpf__prepare_load(const char *filename)
+int bpf__prepare_load(const char *filename, bool source)
 {
 	struct bpf_object *obj;
+	int err;
 
 	if (!libbpf_initialized)
 		libbpf_set_print(libbpf_warning,
 				 libbpf_info,
 				 libbpf_debug);
 
-	obj = bpf_object__open(filename);
+	if (source) {
+		void *obj_buf;
+		size_t obj_buf_sz;
+
+		err = llvm__compile_bpf(filename, &obj_buf, &obj_buf_sz);
+		if (err)
+			return err;
+		obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, filename);
+		free(obj_buf);
+	} else
+		obj = bpf_object__open(filename);
+
 	if (!obj) {
 		pr_debug("bpf: failed to load %s\n", filename);
 		return -EINVAL;
@@ -361,12 +374,12 @@ int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg)
 	}\
 	buf[size - 1] = '\0';
 
-int bpf__strerror_prepare_load(const char *filename, int err,
-			       char *buf, size_t size)
+int bpf__strerror_prepare_load(const char *filename, bool source,
+			       int err, char *buf, size_t size)
 {
 	bpf__strerror_head(err, buf, size);
-	bpf__strerror_entry(EINVAL, "%s: BPF object file '%s' is invalid",
-			    emsg, filename)
+	bpf__strerror_entry(EINVAL, "%s: BPF %s file '%s' is invalid",
+			    emsg, source ? "source" : "object", filename);
 	bpf__strerror_end(buf, size);
 	return 0;
 }
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 323e664..97aed65 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -18,9 +18,9 @@ typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
 					int fd, void *arg);
 
 #ifdef HAVE_LIBBPF_SUPPORT
-int bpf__prepare_load(const char *filename);
-int bpf__strerror_prepare_load(const char *filename, int err,
-			       char *buf, size_t size);
+int bpf__prepare_load(const char *filename, bool source);
+int bpf__strerror_prepare_load(const char *filename, bool source,
+			       int err, char *buf, size_t size);
 int bpf__probe(void);
 int bpf__unprobe(void);
 int bpf__strerror_probe(int err, char *buf, size_t size);
@@ -32,7 +32,8 @@ void bpf__clear(void);
 
 int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg);
 #else
-static inline int bpf__prepare_load(const char *filename __maybe_unused)
+static inline int bpf__prepare_load(const char *filename __maybe_unused,
+				    bool source __maybe_unused)
 {
 	pr_debug("ERROR: eBPF object loading is disabled during compiling.\n");
 	return -1;
@@ -64,6 +65,7 @@ __bpf_strerror(char *buf, size_t size)
 
 static inline int
 bpf__strerror_prepare_load(const char *filename __maybe_unused,
+			   bool source __maybe_unused,
 			   int err __maybe_unused,
 			   char *buf, size_t size)
 {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4343433..08b277b 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -483,8 +483,8 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
 }
 
 int parse_events_load_bpf(struct parse_events_evlist *data,
-			  struct list_head *list,
-			  char *bpf_file_name)
+			  struct list_head *list __maybe_unused,
+			  char *bpf_file_name, bool source)
 {
 	int err;
 	char errbuf[BUFSIZ];
@@ -500,9 +500,9 @@ int parse_events_load_bpf(struct parse_events_evlist *data,
 	 * problem. After that probe events file by file is possible.
 	 * However, probing cost is still need to be considered.
 	 */
-	err = bpf__prepare_load(bpf_file_name);
+	err = bpf__prepare_load(bpf_file_name, source);
 	if (err) {
-		bpf__strerror_prepare_load(bpf_file_name, err,
+		bpf__strerror_prepare_load(bpf_file_name, source, err,
 					   errbuf, sizeof(errbuf));
 		data->error->str = strdup(errbuf);
 		data->error->help = strdup("(add -v to see detail)");
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 3652387..728a424 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -121,7 +121,8 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
 				char *sys, char *event);
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
-			  char *bpf_file_name);
+			  char *bpf_file_name,
+			  bool source);
 int parse_events_add_numeric(struct parse_events_evlist *data,
 			     struct list_head *list,
 			     u32 type, u64 config,
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 22e8f93..8033890 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -116,6 +116,7 @@ group		[^,{}/]*[{][^}]*[}][^,{}/]*
 event_pmu	[^,{}/]+[/][^/]*[/][^,{}/]*
 event		[^,{}/]+
 bpf_object	.*\.(o|bpf)
+bpf_source	.*\.c
 
 num_dec		[0-9]+
 num_hex		0x[a-fA-F0-9]+
@@ -161,6 +162,7 @@ modifier_bp	[rwx]{1,3}
 
 {event_pmu}	|
 {bpf_object}	|
+{bpf_source}	|
 {event}		{
 			BEGIN(INITIAL);
 			REWIND(1);
@@ -267,6 +269,7 @@ r{num_raw_hex}		{ return raw(yyscanner); }
 
 {modifier_event}	{ return str(yyscanner, PE_MODIFIER_EVENT); }
 {bpf_object}		{ return str(yyscanner, PE_BPF_OBJECT); }
+{bpf_source}		{ return str(yyscanner, PE_BPF_SOURCE); }
 {name}			{ return pmu_str_check(yyscanner); }
 "/"			{ BEGIN(config); return '/'; }
 -			{ return '-'; }
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 3ee3a32..90d2458 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -42,7 +42,7 @@ static inc_group_count(struct list_head *list,
 %token PE_VALUE PE_VALUE_SYM_HW PE_VALUE_SYM_SW PE_RAW PE_TERM
 %token PE_EVENT_NAME
 %token PE_NAME
-%token PE_BPF_OBJECT
+%token PE_BPF_OBJECT PE_BPF_SOURCE
 %token PE_MODIFIER_EVENT PE_MODIFIER_BP
 %token PE_NAME_CACHE_TYPE PE_NAME_CACHE_OP_RESULT
 %token PE_PREFIX_MEM PE_PREFIX_RAW PE_PREFIX_GROUP
@@ -55,6 +55,7 @@ static inc_group_count(struct list_head *list,
 %type <num> PE_TERM
 %type <str> PE_NAME
 %type <str> PE_BPF_OBJECT
+%type <str> PE_BPF_SOURCE
 %type <str> PE_NAME_CACHE_TYPE
 %type <str> PE_NAME_CACHE_OP_RESULT
 %type <str> PE_MODIFIER_EVENT
@@ -431,7 +432,17 @@ PE_BPF_OBJECT
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_load_bpf(data, list, $1));
+	ABORT_ON(parse_events_load_bpf(data, list, $1, false));
+	$$ = list;
+}
+|
+PE_BPF_SOURCE
+{
+	struct parse_events_evlist *data = _data;
+	struct list_head *list;
+
+	ALLOC_LIST(list);
+	ABORT_ON(parse_events_load_bpf(data, list, $1, true));
 	$$ = list;
 }
 
-- 
2.1.0



* [PATCH 17/31] perf tests: Enforce LLVM test for BPF test
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (15 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 16/31] perf tools: Infrastructure for compiling scriptlets when passing '.c' to --event Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-09-01  5:59   ` Wangnan (F)
  2015-08-29  4:21 ` [PATCH 18/31] perf test: Add 'perf test BPF' Wang Nan
                   ` (13 subsequent siblings)
  30 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

This patch replaces the original toy BPF program with the previously
introduced bpf-script-example.c, which is embedded into 'llvm-src.c' at
build time.

The new BPF program attaches to 'sys_epoll_pwait()' and collects half
of the samples from it. perf itself never uses that syscall, so later
tests can verify their results against it.

Since the BPF program requires the LINUX_VERSION_CODE of the running
kernel, this patch computes that code from uname.
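
A small sketch of that computation, separated from uname() so it can be
exercised with fixed strings. version_code_from_release() is a
hypothetical name; the packing is the (version << 16) + (patchlevel << 8)
+ sublevel formula used in the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper: turn a release string such as "4.2.0-rc1" into
 * the packed value (version << 16) + (patchlevel << 8) + sublevel.
 * Returns -1 when the string does not start with three numbers.
 */
static long version_code_from_release(const char *release)
{
	int version, patchlevel, sublevel;

	if (sscanf(release, "%d.%d.%d", &version, &patchlevel, &sublevel) != 3)
		return -1;
	return ((long)version << 16) + ((long)patchlevel << 8) + sublevel;
}
```

In real use the argument would be the 'release' field filled in by
uname(2); anything after the third number, such as '-rc1' or a distro
suffix, is ignored by the sscanf() format.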

Since the resulting BPF object is useful for further testcases, this
patch introduces 'prepare' and 'cleanup' methods for tests, and uses
them to set up a MAP_SHARED memory buffer that test__llvm() fills with
the resulting object.
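
The following sketch shows why a MAP_SHARED anonymous mapping is useful
here, under the assumption (not stated in this patch) that the test body
runs in a forked child: data written by the child stays visible to the
parent. shared_result, shared_result_alloc() and store_from_child() are
hypothetical names. Note that mmap() reports failure with MAP_FAILED,
not NULL, which the sketch checks for.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MAP_ANONYMOUS under strict -std= */
#endif
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared_result {
	size_t size;
	char object[];
};

/* Map 'total' bytes shared between this process and future children. */
static struct shared_result *shared_result_alloc(size_t total)
{
	void *p = mmap(NULL, total, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	/* Anonymous mappings are zero-filled, so size starts at 0. */
	return p == MAP_FAILED ? NULL : p;
}

/* Fork a child that copies payload into the shared buffer. */
static int store_from_child(struct shared_result *res, size_t total,
			    const void *payload, size_t len)
{
	pid_t pid;
	int status;

	/* The header takes part of the mapping; account for it. */
	if (total < sizeof(*res) || len > total - sizeof(*res))
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		memcpy(res->object, payload, len);
		res->size = len;
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

With a private mapping the child's writes would be lost when it exits,
and the parent would still see size == 0.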

Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1440151770-129878-15-git-send-email-wangnan0@huawei.com
---
 tools/perf/tests/Build          |   9 +++-
 tools/perf/tests/builtin-test.c |   8 ++++
 tools/perf/tests/llvm.c         | 104 +++++++++++++++++++++++++++++++++++-----
 tools/perf/tests/llvm.h         |  14 ++++++
 tools/perf/tests/tests.h        |   2 +
 5 files changed, 123 insertions(+), 14 deletions(-)
 create mode 100644 tools/perf/tests/llvm.h

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index c1518bd..8c98409 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -32,7 +32,14 @@ perf-y += sample-parsing.o
 perf-y += parse-no-sample-id-all.o
 perf-y += kmod-path.o
 perf-y += thread-map.o
-perf-y += llvm.o
+perf-y += llvm.o llvm-src.o
+
+$(OUTPUT)tests/llvm-src.c: tests/bpf-script-example.c
+	$(Q)echo '#include <tests/llvm.h>' > $@
+	$(Q)echo 'const char test_llvm__bpf_prog[] =' >> $@
+	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
+	$(Q)echo ';' >> $@
+
 
 perf-$(CONFIG_X86) += perf-time-to-tsc.o
 
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 136cd93..1a349e8 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -17,6 +17,8 @@
 static struct test {
 	const char *desc;
 	int (*func)(void);
+	void (*prepare)(void);
+	void (*cleanup)(void);
 } tests[] = {
 	{
 		.desc = "vmlinux symtab matches kallsyms",
@@ -177,6 +179,8 @@ static struct test {
 	{
 		.desc = "Test LLVM searching and compiling",
 		.func = test__llvm,
+		.prepare = test__llvm_prepare,
+		.cleanup = test__llvm_cleanup,
 	},
 	{
 		.func = NULL,
@@ -265,7 +269,11 @@ static int __cmd_test(int argc, const char *argv[], struct intlist *skiplist)
 		}
 
 		pr_debug("\n--- start ---\n");
+		if (tests[curr].prepare)
+			tests[curr].prepare();
 		err = run_test(&tests[curr]);
+		if (tests[curr].cleanup)
+			tests[curr].cleanup();
 		pr_debug("---- end ----\n%s:", tests[curr].desc);
 
 		switch (err) {
diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
index 52d5597..236bf39 100644
--- a/tools/perf/tests/llvm.c
+++ b/tools/perf/tests/llvm.c
@@ -1,9 +1,13 @@
 #include <stdio.h>
+#include <sys/utsname.h>
 #include <bpf/libbpf.h>
 #include <util/llvm-utils.h>
 #include <util/cache.h>
+#include <util/util.h>
+#include <sys/mman.h>
 #include "tests.h"
 #include "debug.h"
+#include "llvm.h"
 
 static int perf_config_cb(const char *var, const char *val,
 			  void *arg __maybe_unused)
@@ -11,16 +15,6 @@ static int perf_config_cb(const char *var, const char *val,
 	return perf_default_config(var, val, arg);
 }
 
-/*
- * Randomly give it a "version" section since we don't really load it
- * into kernel
- */
-static const char test_bpf_prog[] =
-	"__attribute__((section(\"do_fork\"), used)) "
-	"int fork(void *ctx) {return 0;} "
-	"char _license[] __attribute__((section(\"license\"), used)) = \"GPL\";"
-	"int _version __attribute__((section(\"version\"), used)) = 0x40100;";
-
 #ifdef HAVE_LIBBPF_SUPPORT
 static int test__bpf_parsing(void *obj_buf, size_t obj_buf_sz)
 {
@@ -41,12 +35,44 @@ static int test__bpf_parsing(void *obj_buf __maybe_unused,
 }
 #endif
 
+static char *
+compose_source(void)
+{
+	struct utsname utsname;
+	int version, patchlevel, sublevel, err;
+	unsigned long version_code;
+	char *code;
+
+	if (uname(&utsname))
+		return NULL;
+
+	err = sscanf(utsname.release, "%d.%d.%d",
+		     &version, &patchlevel, &sublevel);
+	if (err != 3) {
+		fprintf(stderr, " (Can't get kernel version from uname '%s')",
+			utsname.release);
+		return NULL;
+	}
+
+	version_code = (version << 16) + (patchlevel << 8) + sublevel;
+	err = asprintf(&code, "#define LINUX_VERSION_CODE 0x%08lx;\n%s",
+		       version_code, test_llvm__bpf_prog);
+	if (err < 0)
+		return NULL;
+
+	return code;
+}
+
+#define SHARED_BUF_INIT_SIZE	(1 << 20)
+struct test_llvm__bpf_result *p_test_llvm__bpf_result;
+
 int test__llvm(void)
 {
 	char *tmpl_new, *clang_opt_new;
 	void *obj_buf;
 	size_t obj_buf_sz;
 	int err, old_verbose;
+	char *source;
 
 	perf_config(perf_config_cb, NULL);
 
@@ -73,10 +99,22 @@ int test__llvm(void)
 	if (!llvm_param.clang_opt)
 		llvm_param.clang_opt = strdup("");
 
-	err = asprintf(&tmpl_new, "echo '%s' | %s", test_bpf_prog,
-		       llvm_param.clang_bpf_cmd_template);
-	if (err < 0)
+	source = compose_source();
+	if (!source) {
+		pr_err("Failed to compose source code\n");
+		return -1;
+	}
+
+	/* Quote __EOF__ so strings in source won't be expanded by shell */
+	err = asprintf(&tmpl_new, "cat << '__EOF__' | %s\n%s\n__EOF__\n",
+		       llvm_param.clang_bpf_cmd_template, source);
+	free(source);
+	source = NULL;
+	if (err < 0) {
+		pr_err("Failed to alloc new template\n");
 		return -1;
+	}
+
 	err = asprintf(&clang_opt_new, "-xc %s", llvm_param.clang_opt);
 	if (err < 0)
 		return -1;
@@ -93,6 +131,46 @@ int test__llvm(void)
 	}
 
 	err = test__bpf_parsing(obj_buf, obj_buf_sz);
+	if (!err && p_test_llvm__bpf_result) {
+		if (obj_buf_sz > SHARED_BUF_INIT_SIZE) {
+			pr_err("Resulting object too large\n");
+		} else {
+			p_test_llvm__bpf_result->size = obj_buf_sz;
+			memcpy(p_test_llvm__bpf_result->object,
+			       obj_buf, obj_buf_sz);
+		}
+	}
 	free(obj_buf);
 	return err;
 }
+
+void test__llvm_prepare(void)
+{
+	p_test_llvm__bpf_result = mmap(NULL, SHARED_BUF_INIT_SIZE,
+				       PROT_READ | PROT_WRITE,
+				       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+	if (!p_test_llvm__bpf_result)
+		return;
+	memset((void *)p_test_llvm__bpf_result, '\0', SHARED_BUF_INIT_SIZE);
+}
+
+void test__llvm_cleanup(void)
+{
+	unsigned long boundary, buf_end;
+
+	if (!p_test_llvm__bpf_result)
+		return;
+	if (p_test_llvm__bpf_result->size == 0) {
+		munmap((void *)p_test_llvm__bpf_result, SHARED_BUF_INIT_SIZE);
+		p_test_llvm__bpf_result = NULL;
+		return;
+	}
+
+	buf_end = (unsigned long)p_test_llvm__bpf_result + SHARED_BUF_INIT_SIZE;
+
+	boundary = (unsigned long)(p_test_llvm__bpf_result);
+	boundary += p_test_llvm__bpf_result->size;
+	boundary = (boundary + (page_size - 1)) &
+			(~((unsigned long)page_size - 1));
+	munmap((void *)boundary, buf_end - boundary);
+}
diff --git a/tools/perf/tests/llvm.h b/tools/perf/tests/llvm.h
new file mode 100644
index 0000000..1e89e46
--- /dev/null
+++ b/tools/perf/tests/llvm.h
@@ -0,0 +1,14 @@
+#ifndef PERF_TEST_LLVM_H
+#define PERF_TEST_LLVM_H
+
+#include <stddef.h> /* for size_t */
+
+struct test_llvm__bpf_result {
+	size_t size;
+	char object[];
+};
+
+extern struct test_llvm__bpf_result *p_test_llvm__bpf_result;
+extern const char test_llvm__bpf_prog[];
+
+#endif
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index bf113a2..0d79f04 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -63,6 +63,8 @@ int test__fdarray__add(void);
 int test__kmod_path__parse(void);
 int test__thread_map(void);
 int test__llvm(void);
+void test__llvm_prepare(void);
+void test__llvm_cleanup(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-- 
2.1.0



* [PATCH 18/31] perf test: Add 'perf test BPF'
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (16 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 17/31] perf tests: Enforce LLVM test for BPF test Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-09-02 12:45   ` Namhyung Kim
  2015-08-29  4:21 ` [PATCH 19/31] bpf tools: Load a program with different instances using preprocessor Wang Nan
                   ` (12 subsequent siblings)
  30 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

This patch adds a BPF testcase for testing BPF event filtering.

It reuses the result of 'perf test LLVM', the compiled eBPF sample
program, and tests that it works. The BPF script in 'perf test LLVM'
collects half of the executions of epoll_pwait(). This test calls it
111 times, so the result should contain 56 samples.
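
The expected count of 56 can be checked with a small model of the
flip-flop filter. This is a sketch under two assumptions: the map slot
starts at zero, and a non-zero return from the BPF program means "keep
this sample". count_kept_samples() is a hypothetical name.

```c
/*
 * Model of the flip-flop filter in bpf-script-example.c: one map slot
 * starting at 0, toggled on every call; the toggled value is returned,
 * and a non-zero return is taken to mean "keep this sample".
 */
static int count_kept_samples(int nr_calls)
{
	int flag = 0;	/* assumed initial value of the map entry */
	int kept = 0;
	int i;

	for (i = 0; i < nr_calls; i++) {
		flag = !flag;
		if (flag)
			kept++;
	}
	return kept;
}
```

Calls 1, 3, 5, ..., 111 are kept, which is (111 + 1) / 2 = 56 samples.
An odd iteration count makes the expected value distinguishable from a
filter that starts in the opposite state, which would keep 55.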

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1440151770-129878-16-git-send-email-wangnan0@huawei.com
---
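A note for readers checking the expected count: the arithmetic behind
"111 calls, 56 samples" can be sketched in a few lines of stand-alone C.
This is only an illustration under an assumption, namely that the sample
filter behaves like a toggle which accepts the first call and then every
other one. filter_accepts() and count_samples() are hypothetical names
and are not part of perf.

```c
#include <assert.h>

#define NR_ITERS 111

/*
 * Hypothetical stand-in for the BPF filter: accept the 1st, 3rd, 5th...
 * call. The real filter lives in the BPF sample script; this toggle is
 * only an assumption about its observable behavior.
 */
int filter_accepts(int call_idx)
{
	assert(call_idx >= 0);
	return (call_idx % 2) == 0;
}

/* Count how many of nr_calls calls would produce a sample. */
int count_samples(int nr_calls)
{
	int i, count = 0;

	for (i = 0; i < nr_calls; i++)
		if (filter_accepts(i))
			count++;
	return count;
}
```

With an odd number of calls the accepted half is rounded up, which is
why the test compares against (NR_ITERS + 1) / 2 rather than
NR_ITERS / 2.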
 tools/perf/tests/Build          |   1 +
 tools/perf/tests/bpf.c          | 170 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/builtin-test.c |   4 +
 tools/perf/tests/llvm.c         |  19 +++++
 tools/perf/tests/llvm.h         |   1 +
 tools/perf/tests/tests.h        |   1 +
 tools/perf/util/bpf-loader.c    |  14 ++++
 tools/perf/util/bpf-loader.h    |   8 ++
 8 files changed, 218 insertions(+)
 create mode 100644 tools/perf/tests/bpf.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 8c98409..7ceb448 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -33,6 +33,7 @@ perf-y += parse-no-sample-id-all.o
 perf-y += kmod-path.o
 perf-y += thread-map.o
 perf-y += llvm.o llvm-src.o
+perf-y += bpf.o
 
 $(OUTPUT)tests/llvm-src.c: tests/bpf-script-example.c
 	$(Q)echo '#include <tests/llvm.h>' > $@
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
new file mode 100644
index 0000000..6c238ca
--- /dev/null
+++ b/tools/perf/tests/bpf.c
@@ -0,0 +1,170 @@
+#include <stdio.h>
+#include <sys/epoll.h>
+#include <util/bpf-loader.h>
+#include <util/evlist.h>
+#include "tests.h"
+#include "llvm.h"
+#include "debug.h"
+#define NR_ITERS       111
+
+#ifdef HAVE_LIBBPF_SUPPORT
+
+static int epoll_pwait_loop(void)
+{
+	int i;
+
+	/* Should fail NR_ITERS times */
+	for (i = 0; i < NR_ITERS; i++)
+		epoll_pwait(-(i + 1), NULL, 0, 0, NULL);
+	return 0;
+}
+
+static int prepare_bpf(void *obj_buf, size_t obj_buf_sz)
+{
+	int err;
+	char errbuf[BUFSIZ];
+
+	err = bpf__prepare_load_buffer(obj_buf, obj_buf_sz, NULL);
+	if (err) {
+		bpf__strerror_prepare_load("[buffer]", false, err, errbuf,
+					   sizeof(errbuf));
+		fprintf(stderr, " (%s)", errbuf);
+		return TEST_FAIL;
+	}
+
+	err = bpf__probe();
+	if (err) {
+		bpf__strerror_load(err, errbuf, sizeof(errbuf));
+		fprintf(stderr, " (%s)", errbuf);
+		if (getuid() != 0)
+			fprintf(stderr, " (try run as root)");
+		return TEST_FAIL;
+	}
+
+	err = bpf__load();
+	if (err) {
+		bpf__strerror_load(err, errbuf, sizeof(errbuf));
+		fprintf(stderr, " (%s)", errbuf);
+		return TEST_FAIL;
+	}
+
+	return 0;
+}
+
+static int do_test(void)
+{
+	struct record_opts opts = {
+		.target = {
+			.uid = UINT_MAX,
+			.uses_mmap = true,
+		},
+		.freq	      = 0,
+		.mmap_pages   = 256,
+		.default_interval = 1,
+	};
+
+	int err, i, count = 0;
+	char pid[16];
+	char sbuf[STRERR_BUFSIZE];
+	struct perf_evlist *evlist;
+
+	snprintf(pid, sizeof(pid), "%d", getpid());
+	pid[sizeof(pid) - 1] = '\0';
+	opts.target.tid = opts.target.pid = pid;
+
+	/* Instead of perf_evlist__new_default, don't add default events */
+	evlist = perf_evlist__new();
+	if (!evlist) {
+		pr_debug("Not enough memory to create evlist\n");
+		return -ENOMEM;
+	}
+
+	err = perf_evlist__create_maps(evlist, &opts.target);
+	if (err < 0) {
+		pr_debug("Not enough memory to create thread/cpu maps\n");
+		goto out_delete_evlist;
+	}
+
+	err = perf_evlist__add_bpf(evlist);
+	if (err) {
+		fprintf(stderr, " (Failed to add events selected by BPF)");
+		goto out_delete_evlist;
+	}
+
+	perf_evlist__config(evlist, &opts);
+
+	err = perf_evlist__open(evlist);
+	if (err < 0) {
+		pr_debug("perf_evlist__open: %s\n",
+			 strerror_r(errno, sbuf, sizeof(sbuf)));
+		goto out_delete_evlist;
+	}
+
+	err = perf_evlist__mmap(evlist, opts.mmap_pages, false);
+	if (err < 0) {
+		pr_debug("perf_evlist__mmap: %s\n",
+			 strerror_r(errno, sbuf, sizeof(sbuf)));
+		goto out_delete_evlist;
+	}
+
+	perf_evlist__enable(evlist);
+	epoll_pwait_loop();
+	perf_evlist__disable(evlist);
+
+	for (i = 0; i < evlist->nr_mmaps; i++) {
+		union perf_event *event;
+
+		while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+			const u32 type = event->header.type;
+
+			if (type == PERF_RECORD_SAMPLE)
+				count++;
+		}
+	}
+
+	if (count != (NR_ITERS + 1) / 2) {
+		fprintf(stderr, " (filter result incorrect)");
+		err = -EBADF;
+	}
+
+out_delete_evlist:
+	perf_evlist__delete(evlist);
+	if (err)
+		return TEST_FAIL;
+	return 0;
+}
+
+int test__bpf(void)
+{
+	int err;
+	void *obj_buf;
+	size_t obj_buf_sz;
+
+	test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz);
+	if (!obj_buf || !obj_buf_sz) {
+		if (verbose == 0)
+			fprintf(stderr, " (fix 'perf test LLVM' first)");
+		return TEST_SKIP;
+	}
+
+	err = prepare_bpf(obj_buf, obj_buf_sz);
+	if (err)
+		goto out;
+
+	err = do_test();
+	if (err)
+		goto out;
+out:
+	bpf__unprobe();
+	bpf__clear();
+	if (err)
+		return TEST_FAIL;
+	return 0;
+}
+
+#else
+int test__bpf(void)
+{
+	return TEST_SKIP;
+}
+#endif
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 1a349e8..c32c836 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -183,6 +183,10 @@ static struct test {
 		.cleanup = test__llvm_cleanup,
 	},
 	{
+		.desc = "Test BPF filter",
+		.func = test__bpf,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
index 236bf39..fd5fdb0 100644
--- a/tools/perf/tests/llvm.c
+++ b/tools/perf/tests/llvm.c
@@ -174,3 +174,22 @@ void test__llvm_cleanup(void)
 			(~((unsigned long)page_size - 1));
 	munmap((void *)boundary, buf_end - boundary);
 }
+
+void
+test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz)
+{
+	*p_obj_buf = NULL;
+	*p_obj_buf_sz = 0;
+
+	if (!p_test_llvm__bpf_result) {
+		test__llvm_prepare();
+		test__llvm();
+		test__llvm_cleanup();
+	}
+
+	if (!p_test_llvm__bpf_result)
+		return;
+
+	*p_obj_buf = p_test_llvm__bpf_result->object;
+	*p_obj_buf_sz = p_test_llvm__bpf_result->size;
+}
diff --git a/tools/perf/tests/llvm.h b/tools/perf/tests/llvm.h
index 1e89e46..2fd7ed6 100644
--- a/tools/perf/tests/llvm.h
+++ b/tools/perf/tests/llvm.h
@@ -10,5 +10,6 @@ struct test_llvm__bpf_result {
 
 extern struct test_llvm__bpf_result *p_test_llvm__bpf_result;
 extern const char test_llvm__bpf_prog[];
+void test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz);
 
 #endif
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 0d79f04..f8ded73 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -65,6 +65,7 @@ int test__thread_map(void);
 int test__llvm(void);
 void test__llvm_prepare(void);
 void test__llvm_cleanup(void);
+int test__bpf(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index c2aafe2..95e529b 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -153,6 +153,20 @@ sync_bpf_program_pev(struct bpf_program *prog)
 	return 0;
 }
 
+int bpf__prepare_load_buffer(void *obj_buf, size_t obj_buf_sz,
+			     const char *name)
+{
+	struct bpf_object *obj;
+
+	obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, name);
+	if (!obj) {
+		pr_debug("bpf: failed to load buffer\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 int bpf__prepare_load(const char *filename, bool source)
 {
 	struct bpf_object *obj;
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 97aed65..dead4d4 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -19,6 +19,8 @@ typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
 
 #ifdef HAVE_LIBBPF_SUPPORT
 int bpf__prepare_load(const char *filename, bool source);
+int bpf__prepare_load_buffer(void *obj_buf, size_t obj_buf_sz,
+			     const char *name);
 int bpf__strerror_prepare_load(const char *filename, bool source,
 			       int err, char *buf, size_t size);
 int bpf__probe(void);
@@ -39,6 +41,12 @@ static inline int bpf__prepare_load(const char *filename __maybe_unused,
 	return -1;
 }
 
+static inline int bpf__prepare_load_buffer(void *obj_buf __maybe_unused,
+	size_t obj_buf_sz __maybe_unused, const char *name __maybe_unused)
+{
+	return bpf__prepare_load(NULL, false);
+}
+
 static inline int bpf__probe(void) { return 0; }
 static inline int bpf__unprobe(void) { return 0; }
 static inline int bpf__load(void) { return 0; }
-- 
2.1.0



* [PATCH 19/31] bpf tools: Load a program with different instances using preprocessor
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (17 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 18/31] perf test: Add 'perf test BPF' Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 20/31] perf probe: Reset args and nargs for probe_trace_event when failure Wang Nan
                   ` (11 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

With this patch, a caller of libbpf is able to control the loaded
programs by installing a preprocessor callback for a BPF program. With
a preprocessor, different instances can be created from one BPF program.

This patch will be used by perf to generate a different prologue for
each 'struct probe_trace_event' instance matched by one
'struct perf_probe_event'.

bpf_program__set_prep() is added to support this feature. The caller
should pass libbpf the number of instances to be created and a
preprocessor function which will be called when doing the real loading.
The callback should return an instruction array for each instance.

The fd field in struct bpf_program is replaced by instance, which has an
nr field and an fds array. bpf_program__nth_fd() is introduced for
reading the fd of an instance. The old interface bpf_program__fd() is
reimplemented by returning the first fd.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-29-git-send-email-wangnan0@huawei.com
[wangnan: Add missing '!',
          allows bpf_program__unload() when prog->instance.nr == -1
]
---
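To illustrate how a preprocessor callback is expected to be used, here
is a minimal stand-alone sketch. The callback type and the result struct
mirror the declarations this patch adds to libbpf.h, but they are
redeclared locally so the sketch builds without libbpf; 'struct bpf_insn'
is an opaque stand-in. example_prep() and count_loadable() are
hypothetical helpers, and count_loadable() only imitates the
per-instance loop in bpf_program__load() without loading anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins so the sketch builds without libbpf (assumptions). */
struct bpf_insn { unsigned long long raw; };
struct bpf_program;

struct bpf_prog_prep_result {
	struct bpf_insn *new_insn_ptr;	/* NULL: don't load this instance */
	int new_insn_cnt;
	int *pfd;			/* if not NULL, receives the fd */
};

typedef int (*bpf_program_prep_t)(struct bpf_program *, int n,
				  struct bpf_insn *, int insn_cnt,
				  struct bpf_prog_prep_result *res);

/*
 * Hypothetical preprocessor: hand back the unmodified instructions for
 * even-numbered instances and ask the loader to skip odd ones.
 */
int example_prep(struct bpf_program *prog, int n,
		 struct bpf_insn *insns, int insn_cnt,
		 struct bpf_prog_prep_result *res)
{
	(void)prog;
	if (n < 0 || !insns || insn_cnt <= 0)
		return -1;
	if (n % 2) {
		res->new_insn_ptr = NULL;
		res->new_insn_cnt = 0;
		return 0;
	}
	res->new_insn_ptr = insns;
	res->new_insn_cnt = insn_cnt;
	return 0;
}

/*
 * Imitates the per-instance loop: returns how many instances the
 * callback asked to load, or -1 if the callback failed.
 */
int count_loadable(bpf_program_prep_t prep, struct bpf_insn *insns,
		   int insn_cnt, int nr_instance)
{
	int i, nr = 0;

	assert(prep && nr_instance > 0);
	for (i = 0; i < nr_instance; i++) {
		struct bpf_prog_prep_result res;

		memset(&res, 0, sizeof(res));
		if (prep(NULL, i, insns, insn_cnt, &res))
			return -1;
		if (res.new_insn_ptr && res.new_insn_cnt)
			nr++;
	}
	return nr;
}
```

In real code the callback would be installed with
bpf_program__set_prep(prog, nr_instance, example_prep) before loading,
and the fd of each loaded instance read back with
bpf_program__nth_fd(prog, n). Skipped instances keep fd -1, for which
bpf_program__nth_fd() returns -ENOENT.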
 tools/lib/bpf/libbpf.c | 143 +++++++++++++++++++++++++++++++++++++++++++++----
 tools/lib/bpf/libbpf.h |  22 ++++++++
 2 files changed, 156 insertions(+), 9 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 4252fc2..6a07b26 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -98,7 +98,11 @@ struct bpf_program {
 	} *reloc_desc;
 	int nr_reloc;
 
-	int fd;
+	struct {
+		int nr;
+		int *fds;
+	} instance;
+	bpf_program_prep_t preprocessor;
 
 	struct bpf_object *obj;
 	void *priv;
@@ -152,10 +156,24 @@ struct bpf_object {
 
 static void bpf_program__unload(struct bpf_program *prog)
 {
+	int i;
+
 	if (!prog)
 		return;
 
-	zclose(prog->fd);
+	/*
+	 * If the object is opened but the program is never loaded,
+	 * it is possible that prog->instance.nr == -1.
+	 */
+	if (prog->instance.nr > 0) {
+		for (i = 0; i < prog->instance.nr; i++)
+			zclose(prog->instance.fds[i]);
+	} else if (prog->instance.nr != -1)
+		pr_warning("Internal error: instance.nr is %d\n",
+			   prog->instance.nr);
+
+	prog->instance.nr = -1;
+	zfree(&prog->instance.fds);
 }
 
 static void bpf_program__exit(struct bpf_program *prog)
@@ -206,7 +224,8 @@ bpf_program__init(void *data, size_t size, char *name, int idx,
 	memcpy(prog->insns, data,
 	       prog->insns_cnt * sizeof(struct bpf_insn));
 	prog->idx = idx;
-	prog->fd = -1;
+	prog->instance.fds = NULL;
+	prog->instance.nr = -1;
 
 	return 0;
 errout:
@@ -795,13 +814,71 @@ static int
 bpf_program__load(struct bpf_program *prog,
 		  char *license, u32 kern_version)
 {
-	int err, fd;
+	int err = 0, fd, i;
+
+	if (prog->instance.nr < 0 || !prog->instance.fds) {
+		if (prog->preprocessor) {
+			pr_warning("Internal error: can't load program '%s'\n",
+				   prog->section_name);
+			return -EINVAL;
+		}
+
+		prog->instance.fds = malloc(sizeof(int));
+		if (!prog->instance.fds) {
+			pr_warning("Not enough memory for fds\n");
+			return -ENOMEM;
+		}
+		prog->instance.nr = 1;
+		prog->instance.fds[0] = -1;
+	}
+
+	if (!prog->preprocessor) {
+		if (prog->instance.nr != 1)
+			pr_warning("Program '%s' inconsistent: nr(%d) not 1\n",
+				   prog->section_name, prog->instance.nr);
 
-	err = load_program(prog->insns, prog->insns_cnt,
-			   license, kern_version, &fd);
-	if (!err)
-		prog->fd = fd;
+		err = load_program(prog->insns, prog->insns_cnt,
+				   license, kern_version, &fd);
+		if (!err)
+			prog->instance.fds[0] = fd;
+		goto out;
+	}
+
+	for (i = 0; i < prog->instance.nr; i++) {
+		struct bpf_prog_prep_result result;
+		bpf_program_prep_t preprocessor = prog->preprocessor;
+
+		bzero(&result, sizeof(result));
+		err = preprocessor(prog, i, prog->insns,
+				   prog->insns_cnt, &result);
+		if (err) {
+			pr_warning("Preprocessing %dth instance of program '%s' failed\n",
+					i, prog->section_name);
+			goto out;
+		}
+
+		if (!result.new_insn_ptr || !result.new_insn_cnt) {
+			pr_debug("Skip loading %dth instance of program '%s'\n",
+					i, prog->section_name);
+			prog->instance.fds[i] = -1;
+			continue;
+		}
+
+		err = load_program(result.new_insn_ptr,
+				   result.new_insn_cnt,
+				   license, kern_version, &fd);
+
+		if (err) {
+			pr_warning("Loading %dth instance of program '%s' failed\n",
+					i, prog->section_name);
+			goto out;
+		}
 
+		if (result.pfd)
+			*result.pfd = fd;
+		prog->instance.fds[i] = fd;
+	}
+out:
 	if (err)
 		pr_warning("failed to load program '%s'\n",
 			   prog->section_name);
@@ -1052,5 +1129,53 @@ const char *bpf_program__title(struct bpf_program *prog, bool dup)
 
 int bpf_program__fd(struct bpf_program *prog)
 {
-	return prog->fd;
+	return bpf_program__nth_fd(prog, 0);
+}
+
+int bpf_program__set_prep(struct bpf_program *prog, int nr_instance,
+			  bpf_program_prep_t prep)
+{
+	int *instance_fds;
+
+	if (nr_instance <= 0 || !prep)
+		return -EINVAL;
+
+	if (prog->instance.nr > 0 || prog->instance.fds) {
+		pr_warning("Can't set pre-processor after loading\n");
+		return -EINVAL;
+	}
+
+	instance_fds = malloc(sizeof(int) * nr_instance);
+	if (!instance_fds) {
+		pr_warning("alloc memory failed for instance of fds\n");
+		return -ENOMEM;
+	}
+
+	/* fill all fd with -1 */
+	memset(instance_fds, 0xff, sizeof(int) * nr_instance);
+
+	prog->instance.nr = nr_instance;
+	prog->instance.fds = instance_fds;
+	prog->preprocessor = prep;
+	return 0;
+}
+
+int bpf_program__nth_fd(struct bpf_program *prog, int n)
+{
+	int fd;
+
+	if (n >= prog->instance.nr || n < 0) {
+		pr_warning("Can't get the %dth fd from program %s: only %d instances\n",
+			   n, prog->section_name, prog->instance.nr);
+		return -EINVAL;
+	}
+
+	fd = prog->instance.fds[n];
+	if (fd < 0) {
+		pr_warning("%dth instance of program '%s' is invalid\n",
+			   n, prog->section_name);
+		return -ENOENT;
+	}
+
+	return fd;
 }
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index f16170c..d82b89e 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -67,6 +67,28 @@ const char *bpf_program__title(struct bpf_program *prog, bool dup);
 
 int bpf_program__fd(struct bpf_program *prog);
 
+struct bpf_insn;
+struct bpf_prog_prep_result {
+	/*
+	 * If not NULL, load new instruction array.
+	 * If set to NULL, don't load this instance.
+	 */
+	struct bpf_insn *new_insn_ptr;
+	int new_insn_cnt;
+
+	/* If not NULL, result fd is set to it */
+	int *pfd;
+};
+
+typedef int (*bpf_program_prep_t)(struct bpf_program *, int n,
+				  struct bpf_insn *, int insn_cnt,
+				  struct bpf_prog_prep_result *res);
+
+int bpf_program__set_prep(struct bpf_program *prog, int nr_instance,
+			  bpf_program_prep_t prep);
+
+int bpf_program__nth_fd(struct bpf_program *prog, int n);
+
 /*
  * We don't need __attribute__((packed)) now since it is
  * unnecessary for 'bpf_map_def' because they are all aligned.
-- 
2.1.0



* [PATCH 20/31] perf probe: Reset args and nargs for probe_trace_event when failure
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (18 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 19/31] bpf tools: Load a program with different instances using preprocessor Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 21/31] perf tools: Move linux/filter.h to tools/include Wang Nan
                   ` (10 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

When a failure occurs in add_probe_trace_event(), args in
probe_trace_event is left incomplete. Since the information in it may
be used later, this patch frees the allocated memory, sets args to NULL
and resets nargs to 0 to avoid a dangling pointer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-31-git-send-email-wangnan0@huawei.com
---
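The cleanup pattern can be sketched in isolation as follows. This is a
simplified illustration, not the probe-finder code: 'struct sketch_tev'
is a hypothetical miniature of 'struct probe_trace_event', and
fill_args() with its fail_at parameter exists only to simulate a
failure halfway through filling the array.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of 'struct probe_trace_event' (assumption). */
struct sketch_tev {
	int nargs;
	int *args;
};

/*
 * Fill tev->args with nargs entries. If the simulated failure at index
 * fail_at triggers, free the partly filled array and reset both fields
 * so that callers never see a stale pointer or a stale count.
 */
int fill_args(struct sketch_tev *tev, int nargs, int fail_at)
{
	int i, ret = 0;

	assert(tev && nargs > 0);
	tev->nargs = 0;
	tev->args = calloc(nargs, sizeof(*tev->args));
	if (!tev->args)
		return -1;
	tev->nargs = nargs;

	for (i = 0; i < nargs; i++) {
		if (i == fail_at) {
			ret = -1;
			break;
		}
		tev->args[i] = i;
	}

	if (ret) {
		/* Same idea as the hunk below: reset count, free, NULL. */
		tev->nargs = 0;
		free(tev->args);
		tev->args = NULL;
	}
	return ret;
}
```

After a failed call both fields are back in a consistent empty state,
so later code that walks nargs entries or frees args again is harmless.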
 tools/perf/util/probe-finder.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 29c43c068..5ab9cd6 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1228,6 +1228,10 @@ static int add_probe_trace_event(Dwarf_Die *sc_die, struct probe_finder *pf)
 
 end:
 	free(args);
+	if (ret) {
+		tev->nargs = 0;
+		zfree(&tev->args);
+	}
 	return ret;
 }
 
-- 
2.1.0



* [PATCH 21/31] perf tools: Move linux/filter.h to tools/include
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (19 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 20/31] perf probe: Reset args and nargs for probe_trace_event when failure Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-31 20:35   ` Arnaldo Carvalho de Melo
                     ` (2 more replies)
  2015-08-29  4:21 ` [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches Wang Nan
                   ` (9 subsequent siblings)
  30 siblings, 3 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, He Kuang, Wang Nan,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: He Kuang <hekuang@huawei.com>

This patch moves filter.h from include/linux/filter.h to
tools/include/linux/filter.h so that other libraries can use the macros
in it, like libbpf which will be introduced by further patches.
Currently, the moved filter.h only contains the macros needed by libbpf,
to avoid introducing too many dependencies.

MANIFEST is also updated for 'make perf-*-src-pkg'.

One change:
  imm field of BPF_EMIT_CALL becomes ((FUNC) - BPF_FUNC_unspec) to
  suit user space code generator.

Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-32-git-send-email-wangnan0@huawei.com
---
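As an illustration of how a user space code generator might use these
macros, the sketch below builds a two-instruction program, "r0 = value;
exit". To stay self-contained it carries local copies of 'struct
bpf_insn' and of the few opcode constants it needs, which normally come
from <linux/bpf.h>; uint8_t bit-fields are used in place of __u8 and
rely on a compiler extension that gcc and clang accept. The two macros
are copied from the new header. build_return_const() is a hypothetical
helper, not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of uapi definitions, normally from <linux/bpf.h>. */
struct bpf_insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate */
};

#define BPF_JMP		0x05
#define BPF_ALU64	0x07
#define BPF_K		0x00
#define BPF_MOV		0xb0
#define BPF_EXIT	0x90
#define BPF_REG_0	0

/* Copied from tools/include/linux/filter.h as added by this patch. */
#define BPF_MOV64_IMM(DST, IMM)					\
	((struct bpf_insn) {					\
		.code  = BPF_ALU64 | BPF_MOV | BPF_K,		\
		.dst_reg = DST,					\
		.src_reg = 0,					\
		.off   = 0,					\
		.imm   = IMM })

#define BPF_EXIT_INSN()						\
	((struct bpf_insn) {					\
		.code  = BPF_JMP | BPF_EXIT,			\
		.dst_reg = 0,					\
		.src_reg = 0,					\
		.off   = 0,					\
		.imm   = 0 })

/*
 * Hypothetical helper: emit "r0 = value; exit" into buf.
 * Returns the number of instructions written, or -1 if buf is too small.
 */
int build_return_const(struct bpf_insn *buf, int buf_cnt, int value)
{
	assert(buf);
	if (buf_cnt < 2)
		return -1;
	buf[0] = BPF_MOV64_IMM(BPF_REG_0, value);
	buf[1] = BPF_EXIT_INSN();
	return 2;
}
```

Because each macro expands to a compound literal, the instructions can
be assigned one by one as above or used directly in an array
initializer.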
 tools/include/linux/filter.h | 237 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/MANIFEST          |   1 +
 2 files changed, 238 insertions(+)
 create mode 100644 tools/include/linux/filter.h

diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h
new file mode 100644
index 0000000..11d2b1c
--- /dev/null
+++ b/tools/include/linux/filter.h
@@ -0,0 +1,237 @@
+/*
+ * Linux Socket Filter Data Structures
+ */
+#ifndef __TOOLS_LINUX_FILTER_H
+#define __TOOLS_LINUX_FILTER_H
+
+#include <linux/bpf.h>
+
+/* ArgX, context and stack frame pointer register positions. Note,
+ * Arg1, Arg2, Arg3, etc are used as argument mappings of function
+ * calls in BPF_CALL instruction.
+ */
+#define BPF_REG_ARG1	BPF_REG_1
+#define BPF_REG_ARG2	BPF_REG_2
+#define BPF_REG_ARG3	BPF_REG_3
+#define BPF_REG_ARG4	BPF_REG_4
+#define BPF_REG_ARG5	BPF_REG_5
+#define BPF_REG_CTX	BPF_REG_6
+#define BPF_REG_FP	BPF_REG_10
+
+/* Additional register mappings for converted user programs. */
+#define BPF_REG_A	BPF_REG_0
+#define BPF_REG_X	BPF_REG_7
+#define BPF_REG_TMP	BPF_REG_8
+
+/* BPF program can access up to 512 bytes of stack space. */
+#define MAX_BPF_STACK	512
+
+/* Helper macros for filter block array initializers. */
+
+/* ALU ops on registers, bpf_add|sub|...: dst_reg += src_reg */
+
+#define BPF_ALU64_REG(OP, DST, SRC)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU64 | BPF_OP(OP) | BPF_X,	\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = 0 })
+
+#define BPF_ALU32_REG(OP, DST, SRC)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU | BPF_OP(OP) | BPF_X,		\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = 0 })
+
+/* ALU ops on immediates, bpf_add|sub|...: dst_reg += imm32 */
+
+#define BPF_ALU64_IMM(OP, DST, IMM)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU64 | BPF_OP(OP) | BPF_K,	\
+		.dst_reg = DST,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = IMM })
+
+#define BPF_ALU32_IMM(OP, DST, IMM)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU | BPF_OP(OP) | BPF_K,		\
+		.dst_reg = DST,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = IMM })
+
+/* Endianness conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
+
+#define BPF_ENDIAN(TYPE, DST, LEN)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU | BPF_END | BPF_SRC(TYPE),	\
+		.dst_reg = DST,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = LEN })
+
+/* Short form of mov, dst_reg = src_reg */
+
+#define BPF_MOV64_REG(DST, SRC)					\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU64 | BPF_MOV | BPF_X,		\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = 0 })
+
+#define BPF_MOV32_REG(DST, SRC)					\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU | BPF_MOV | BPF_X,		\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = 0 })
+
+/* Short form of mov, dst_reg = imm32 */
+
+#define BPF_MOV64_IMM(DST, IMM)					\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU64 | BPF_MOV | BPF_K,		\
+		.dst_reg = DST,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = IMM })
+
+#define BPF_MOV32_IMM(DST, IMM)					\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU | BPF_MOV | BPF_K,		\
+		.dst_reg = DST,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = IMM })
+
+/* Short form of mov based on type,
+ * BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32
+ */
+
+#define BPF_MOV64_RAW(TYPE, DST, SRC, IMM)			\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU64 | BPF_MOV | BPF_SRC(TYPE),	\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = IMM })
+
+#define BPF_MOV32_RAW(TYPE, DST, SRC, IMM)			\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU | BPF_MOV | BPF_SRC(TYPE),	\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = IMM })
+
+/* Direct packet access, R0 = *(uint *) (skb->data + imm32) */
+
+#define BPF_LD_ABS(SIZE, IMM)					\
+	((struct bpf_insn) {					\
+		.code  = BPF_LD | BPF_SIZE(SIZE) | BPF_ABS,	\
+		.dst_reg = 0,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = IMM })
+
+/* Indirect packet access, R0 = *(uint *) (skb->data + src_reg + imm32) */
+
+#define BPF_LD_IND(SIZE, SRC, IMM)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_LD | BPF_SIZE(SIZE) | BPF_IND,	\
+		.dst_reg = 0,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = IMM })
+
+/* Memory load, dst_reg = *(uint *) (src_reg + off16) */
+
+#define BPF_LDX_MEM(SIZE, DST, SRC, OFF)			\
+	((struct bpf_insn) {					\
+		.code  = BPF_LDX | BPF_SIZE(SIZE) | BPF_MEM,	\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = OFF,					\
+		.imm   = 0 })
+
+/* Memory store, *(uint *) (dst_reg + off16) = src_reg */
+
+#define BPF_STX_MEM(SIZE, DST, SRC, OFF)			\
+	((struct bpf_insn) {					\
+		.code  = BPF_STX | BPF_SIZE(SIZE) | BPF_MEM,	\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = OFF,					\
+		.imm   = 0 })
+
+/* Memory store, *(uint *) (dst_reg + off16) = imm32 */
+
+#define BPF_ST_MEM(SIZE, DST, OFF, IMM)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_ST | BPF_SIZE(SIZE) | BPF_MEM,	\
+		.dst_reg = DST,					\
+		.src_reg = 0,					\
+		.off   = OFF,					\
+		.imm   = IMM })
+
+/* Conditional jumps against registers,
+ * if (dst_reg 'op' src_reg) goto pc + off16
+ */
+
+#define BPF_JMP_REG(OP, DST, SRC, OFF)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_JMP | BPF_OP(OP) | BPF_X,		\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = OFF,					\
+		.imm   = 0 })
+
+/* Conditional jumps against immediates,
+ * if (dst_reg 'op' imm32) goto pc + off16
+ */
+
+#define BPF_JMP_IMM(OP, DST, IMM, OFF)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_JMP | BPF_OP(OP) | BPF_K,		\
+		.dst_reg = DST,					\
+		.src_reg = 0,					\
+		.off   = OFF,					\
+		.imm   = IMM })
+
+/* Function call */
+
+#define BPF_EMIT_CALL(FUNC)					\
+	((struct bpf_insn) {					\
+		.code  = BPF_JMP | BPF_CALL,			\
+		.dst_reg = 0,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = ((FUNC) - BPF_FUNC_unspec) })
+
+/* Raw code statement block */
+
+#define BPF_RAW_INSN(CODE, DST, SRC, OFF, IMM)			\
+	((struct bpf_insn) {					\
+		.code  = CODE,					\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = OFF,					\
+		.imm   = IMM })
+
+/* Program exit */
+
+#define BPF_EXIT_INSN()						\
+	((struct bpf_insn) {					\
+		.code  = BPF_JMP | BPF_EXIT,			\
+		.dst_reg = 0,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = 0 })
+
+#endif /* __TOOLS_LINUX_FILTER_H */
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index 56fe0c9..14e8b98 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -42,6 +42,7 @@ tools/include/asm-generic/bitops.h
 tools/include/linux/atomic.h
 tools/include/linux/bitops.h
 tools/include/linux/compiler.h
+tools/include/linux/filter.h
 tools/include/linux/hash.h
 tools/include/linux/kernel.h
 tools/include/linux/list.h
-- 
2.1.0



* [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (20 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 21/31] perf tools: Move linux/filter.h to tools/include Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-31 20:39   ` Arnaldo Carvalho de Melo
  2015-09-01  6:59   ` Wang Nan
  2015-08-29  4:21 ` [PATCH 23/31] perf tools: Introduce arch_get_reg_info() for x86 Wang Nan
                   ` (8 subsequent siblings)
  30 siblings, 2 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

If both LIBBPF and DWARF are detected, it is possible to create a
prologue for eBPF programs to help them access kernel data.
HAVE_BPF_PROLOGUE and CONFIG_BPF_PROLOGUE are added as flags for this
feature.

PERF_HAVE_ARCH_GET_REG_INFO indicates that an architecture supports
converting the name of a register to its offset in 'struct pt_regs'.
Without this support, BPF_PROLOGUE is turned off.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-33-git-send-email-wangnan0@huawei.com
---
 tools/perf/config/Makefile           | 12 ++++++++++++
 tools/perf/util/include/dwarf-regs.h |  7 +++++++
 2 files changed, 19 insertions(+)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 38a4144..d46765b7 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -314,6 +314,18 @@ ifndef NO_LIBELF
       CFLAGS += -DHAVE_LIBBPF_SUPPORT
       $(call detected,CONFIG_LIBBPF)
     endif
+
+    ifndef NO_DWARF
+      ifneq ($(origin PERF_HAVE_ARCH_GET_REG_INFO), undefined)
+        CFLAGS += -DHAVE_BPF_PROLOGUE
+        $(call detected,CONFIG_BPF_PROLOGUE)
+      else
+        msg := $(warning BPF prologue is not supported by architecture $(ARCH));
+      endif
+    else
+      msg := $(warning DWARF support is off, BPF prologue is disabled);
+    endif
+
   endif # NO_LIBBPF
 endif # NO_LIBELF
 
diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
index 8f14965..3dda083 100644
--- a/tools/perf/util/include/dwarf-regs.h
+++ b/tools/perf/util/include/dwarf-regs.h
@@ -5,4 +5,11 @@
 const char *get_arch_regstr(unsigned int n);
 #endif
 
+#ifdef HAVE_BPF_PROLOGUE
+/*
+ * Arch should support fetching the offset of a register in pt_regs
+ * by its name.
+ */
+int arch_get_reg_info(const char *name, int *offset);
+#endif
 #endif
-- 
2.1.0



* [PATCH 23/31] perf tools: Introduce arch_get_reg_info() for x86
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (21 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-31 20:43   ` Arnaldo Carvalho de Melo
  2015-08-29  4:21 ` [PATCH 24/31] perf tools: Add prologue for BPF programs for fetching arguments Wang Nan
                   ` (7 subsequent siblings)
  30 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, He Kuang, Wang Nan,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: He Kuang <hekuang@huawei.com>

arch_get_reg_info() is a helper function which converts a register name
such as "%ax" to the offset of that register in 'struct pt_regs'. The
BPF prologue generator needs this offset.

This patch replaces the original string table with a 'struct reg_info'
table, which records the offset of each register alongside its name.

x86 has two sub-archs (x86_32 and x86_64), but we can only get pt_regs
for the sub-arch perf is compiled for, so this patch fills the offset
with '-1' for the other sub-arch. This introduces a limitation: a perf
binary compiled for x86_32 cannot generate a prologue for BPF programs
targeting an x86_64 kernel. The limitation is acceptable because this
is a very rare use case.
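
The lookup described above can be sketched in isolation as follows. This
is only an illustration: 'struct demo_regs', the DEMO_REG macros and
demo_get_reg_info() are hypothetical stand-ins, not the names used by the
patch, and the register set is cut down to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for 'struct pt_regs'; field names are illustrative. */
struct demo_regs {
	unsigned long ax;
	unsigned long cx;
	unsigned long dx;
};

struct demo_reg_info {
	const char *name;	/* register string as it appears in debuginfo */
	int offset;		/* offset in struct demo_regs, -1 if unavailable */
};

#define DEMO_REG(n, f)	{ .name = n, .offset = (int)offsetof(struct demo_regs, f) }
#define DEMO_REG_NA(n)	{ .name = n, .offset = -1 }

static const struct demo_reg_info demo_table[] = {
	DEMO_REG("%ax", ax),
	DEMO_REG("%cx", cx),
	DEMO_REG("%dx", dx),
	DEMO_REG_NA("%r8"),	/* models a register of the other sub-arch */
};

/* Returns 0 and stores the offset on success, -1 otherwise. */
static int demo_get_reg_info(const char *name, int *offset)
{
	size_t i;

	if (!name || !offset)
		return -1;

	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		assert(demo_table[i].name);	/* every entry must be named */
		if (strcmp(demo_table[i].name, name) != 0)
			continue;
		if (demo_table[i].offset < 0)
			return -1;	/* known name, but no usable offset */
		*offset = demo_table[i].offset;
		return 0;
	}
	return -1;			/* unknown name */
}
```

A name that is present but carries offset -1 fails in the same way as an
unknown name, which is how the cross-sub-arch limitation shows up to the
caller.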

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-34-git-send-email-wangnan0@huawei.com
---
 tools/perf/arch/x86/Makefile          |   1 +
 tools/perf/arch/x86/util/Build        |   2 +
 tools/perf/arch/x86/util/dwarf-regs.c | 104 ++++++++++++++++++++++++----------
 3 files changed, 78 insertions(+), 29 deletions(-)

diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
index 21322e0..a84a6f6f 100644
--- a/tools/perf/arch/x86/Makefile
+++ b/tools/perf/arch/x86/Makefile
@@ -2,3 +2,4 @@ ifndef NO_DWARF
 PERF_HAVE_DWARF_REGS := 1
 endif
 HAVE_KVM_STAT_SUPPORT := 1
+PERF_HAVE_ARCH_GET_REG_INFO := 1
diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 2c55e1b..09429f6 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -3,6 +3,8 @@ libperf-y += tsc.o
 libperf-y += pmu.o
 libperf-y += kvm-stat.o
 
+# BPF_PROLOGUE also needs dwarf-regs.o. However, if CONFIG_BPF_PROLOGUE
+# is true, CONFIG_DWARF must be true as well.
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 
 libperf-$(CONFIG_LIBUNWIND)          += unwind-libunwind.o
diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
index be22dd4..9928caf 100644
--- a/tools/perf/arch/x86/util/dwarf-regs.c
+++ b/tools/perf/arch/x86/util/dwarf-regs.c
@@ -22,44 +22,67 @@
 
 #include <stddef.h>
 #include <dwarf-regs.h>
+#include <string.h>
+#include <linux/ptrace.h>
+#include <linux/kernel.h> /* for offsetof */
+#include <util/bpf-loader.h>
+
+struct reg_info {
+	const char	*name;		/* Reg string in debuginfo      */
+	int		offset;		/* Reg offset in struct pt_regs */
+};
 
 /*
  * Generic dwarf analysis helpers
  */
-
+/*
+ * x86_64 compiling can't access pt_regs for x86_32, so fill offset
+ * with -1.
+ */
+#ifdef __x86_64__
+# define REG_INFO(n, f) { .name = n, .offset = -1, }
+#else
+# define REG_INFO(n, f) { .name = n, .offset = offsetof(struct pt_regs, f), }
+#endif
 #define X86_32_MAX_REGS 8
-const char *x86_32_regs_table[X86_32_MAX_REGS] = {
-	"%ax",
-	"%cx",
-	"%dx",
-	"%bx",
-	"$stack",	/* Stack address instead of %sp */
-	"%bp",
-	"%si",
-	"%di",
+
+struct reg_info x86_32_regs_table[X86_32_MAX_REGS] = {
+	REG_INFO("%ax", eax),
+	REG_INFO("%cx", ecx),
+	REG_INFO("%dx", edx),
+	REG_INFO("%bx", ebx),
+	REG_INFO("$stack", esp),	/* Stack address instead of %sp */
+	REG_INFO("%bp", ebp),
+	REG_INFO("%si", esi),
+	REG_INFO("%di", edi),
 };
 
+#undef REG_INFO
+#ifdef __x86_64__
+# define REG_INFO(n, f) { .name = n, .offset = offsetof(struct pt_regs, f), }
+#else
+# define REG_INFO(n, f) { .name = n, .offset = -1, }
+#endif
 #define X86_64_MAX_REGS 16
-const char *x86_64_regs_table[X86_64_MAX_REGS] = {
-	"%ax",
-	"%dx",
-	"%cx",
-	"%bx",
-	"%si",
-	"%di",
-	"%bp",
-	"%sp",
-	"%r8",
-	"%r9",
-	"%r10",
-	"%r11",
-	"%r12",
-	"%r13",
-	"%r14",
-	"%r15",
+struct reg_info x86_64_regs_table[X86_64_MAX_REGS] = {
+	REG_INFO("%ax",		rax),
+	REG_INFO("%dx",		rdx),
+	REG_INFO("%cx",		rcx),
+	REG_INFO("%bx",		rbx),
+	REG_INFO("%si",		rsi),
+	REG_INFO("%di",		rdi),
+	REG_INFO("%bp",		rbp),
+	REG_INFO("%sp",		rsp),
+	REG_INFO("%r8",		r8),
+	REG_INFO("%r9",		r9),
+	REG_INFO("%r10",	r10),
+	REG_INFO("%r11",	r11),
+	REG_INFO("%r12",	r12),
+	REG_INFO("%r13",	r13),
+	REG_INFO("%r14",	r14),
+	REG_INFO("%r15",	r15),
 };
 
-/* TODO: switching by dwarf address size */
 #ifdef __x86_64__
 #define ARCH_MAX_REGS X86_64_MAX_REGS
 #define arch_regs_table x86_64_regs_table
@@ -71,5 +94,28 @@ const char *x86_64_regs_table[X86_64_MAX_REGS] = {
 /* Return architecture dependent register string (for kprobe-tracer) */
 const char *get_arch_regstr(unsigned int n)
 {
-	return (n <= ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
+	return (n < ARCH_MAX_REGS) ? arch_regs_table[n].name : NULL;
 }
+
+#ifdef HAVE_BPF_PROLOGUE
+int arch_get_reg_info(const char *name, int *offset)
+{
+	int i;
+	struct reg_info *info;
+
+	if (!name || !offset)
+		return -1;
+
+	for (i = 0; i < ARCH_MAX_REGS; i++) {
+		info = &arch_regs_table[i];
+		if (strcmp(info->name, name) == 0) {
+			if (info->offset < 0)
+				return -1;
+			*offset = info->offset;
+			return 0;
+		}
+	}
+
+	return -1;
+}
+#endif
-- 
2.1.0



* [PATCH 24/31] perf tools: Add prologue for BPF programs for fetching arguments
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (22 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 23/31] perf tools: Introduce arch_get_reg_info() for x86 Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:21 ` [PATCH 25/31] perf tools: Generate prologue for BPF programs Wang Nan
                   ` (6 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

This patch generates a prologue for a BPF program, which fetches arguments
for it. With this patch, the program can take arguments as follows:

 SEC("lock_page=__lock_page page->flags")
 int lock_page(struct pt_regs *ctx, int err, unsigned long flags)
 {
	 return 1;
 }

This patch passes at most 3 arguments, in r3, r4 and r5. r1 is still
the ctx pointer. r2 indicates whether dereferencing succeeded (0 on
success, 1 on failure).

This patch uses r6 to hold ctx (struct pt_regs) and r7 to hold the stack
pointer for the result. The result of each argument is first stored on
the stack:

 low address
 BPF_REG_FP - 24  ARG3
 BPF_REG_FP - 16  ARG2
 BPF_REG_FP - 8   ARG1
 BPF_REG_FP
 high address

The results are then loaded into r3, r4 and r5.
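
As a small illustration of this layout, the slot offset and destination
register for each argument can be computed as below. The DEMO_* constants
and function names are hypothetical; they only mirror the numbers given
in this message (8-byte slots, at most 3 arguments, starting at r3).

```c
#include <assert.h>

#define DEMO_REG_SIZE		8	/* bytes per slot (64-bit BPF register) */
#define DEMO_MAX_ARGS		3	/* at most three arguments are passed */
#define DEMO_FIRST_ARG_REG	3	/* arguments land in r3, r4 and r5 */

/* Offset of the slot for argument 'i' (0-based), relative to fp. */
static int demo_slot_offset(int i)
{
	assert(i >= 0 && i < DEMO_MAX_ARGS);
	return -DEMO_REG_SIZE * (i + 1);
}

/* Number of the BPF register that finally receives argument 'i'. */
static int demo_arg_reg(int i)
{
	assert(i >= 0 && i < DEMO_MAX_ARGS);
	return DEMO_FIRST_ARG_REG + i;
}
```

So ARG1 lives at fp - 8 and ends up in r3, while ARG3 lives at fp - 24
and ends up in r5.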

The output prologue for offn(...off2(off1(reg))) should be:

     r6 <- r1			// save ctx into a callee saved register
     r7 <- fp
     r7 <- r7 - stack_offset	// pointer to result slot
     /* load r3 with the offset in pt_regs of 'reg' */
     (r7) <- r3			// make slot valid
     r3 <- r3 + off1		// prepare to read unsafe pointer
     r2 <- 8
     r1 <- r7			// result put onto stack
     call probe_read		// read unsafe pointer
     jnei r0, 0, err		// error checking
     r3 <- (r7)			// read result
     r3 <- r3 + off2		// prepare to read unsafe pointer
     r2 <- 8
     r1 <- r7
     call probe_read
     jnei r0, 0, err
     ...
     /* load r3, r4, r5 from stack */
     goto success
err:
     r2 <- 1
     /* load r3, r4, r5 with 0 */
     goto usercode
success:
     r2 <- 0
usercode:
     r1 <- r6	// restore ctx
     // original user code
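
What this prologue computes for a single argument can be modelled in
plain C. The sketch below is not BPF code: demo_mem is a toy address
space, demo_probe_read() is a hypothetical stand-in for probe_read, and
the return value of demo_fetch() plays the role of r2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MEM_SIZE 64

/* Toy address space: valid addresses are [0, DEMO_MEM_SIZE). */
static uint8_t demo_mem[DEMO_MEM_SIZE];

/* Stand-in for probe_read: copy 8 bytes if the whole range is valid. */
static int demo_probe_read(uint64_t *dst, uint64_t addr)
{
	if (addr > DEMO_MEM_SIZE - sizeof(*dst))
		return -1;
	memcpy(dst, &demo_mem[addr], sizeof(*dst));
	return 0;
}

/*
 * Evaluate offn(...off2(off1(reg))). Returns the value meant for r2:
 * 0 on success, 1 if any read failed (in which case *result is 0).
 */
static int demo_fetch(uint64_t reg, const long *offs, int n, uint64_t *result)
{
	uint64_t slot = reg;	/* "(r7) <- r3": make the slot valid */
	int i;

	assert(result && (offs || n == 0));

	for (i = 0; i < n; i++) {
		uint64_t addr = slot + (uint64_t)offs[i]; /* r3 <- r3 + off */

		if (demo_probe_read(&slot, addr)) {	/* jnei r0, 0, err */
			*result = 0;
			return 1;
		}
	}
	*result = slot;
	return 0;
}
```

With n == 0 the function simply returns the register value, which
corresponds to a bare-register argument.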

If all arguments reside in registers (no dereferencing is required),
gen_prologue_fastpath() is used to create a fast prologue:

     r3 <- (r1 + offset of reg1)
     r4 <- (r1 + offset of reg2)
     r5 <- (r1 + offset of reg3)
     r2 <- 0
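
The choice between the two paths comes down to whether any argument
carries a dereference chain. A minimal sketch, assuming trimmed-down
hypothetical types ('struct demo_arg', 'struct demo_ref') in place of the
real probe argument structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down versions of the probe argument types. */
struct demo_ref {
	struct demo_ref *next;
	long offset;
};

struct demo_arg {
	const char *value;	/* base register name, e.g. "%di" */
	struct demo_ref *ref;	/* NULL when the argument is a bare register */
};

/* The fast path applies only if no argument needs dereferencing. */
static bool demo_can_use_fastpath(const struct demo_arg *args, int nargs)
{
	int i;

	assert(args || nargs == 0);

	for (i = 0; i < nargs; i++)
		if (args[i].ref)
			return false;
	return true;
}
```

A single argument with a non-NULL ref is enough to force the slow path
for the whole prologue.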

P.S.

eBPF calling convention is defined as:

* r0		- return value from in-kernel function, and exit value
                  for eBPF program
* r1 - r5	- arguments from eBPF program to in-kernel function
* r6 - r9	- callee saved registers that in-kernel function will
                  preserve
* r10		- read-only frame pointer to access stack

Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-35-git-send-email-wangnan0@huawei.com
---
 tools/perf/util/Build          |   1 +
 tools/perf/util/bpf-prologue.c | 442 +++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-prologue.h |  34 ++++
 3 files changed, 477 insertions(+)
 create mode 100644 tools/perf/util/bpf-prologue.c
 create mode 100644 tools/perf/util/bpf-prologue.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index c0ca4a1..fd2f084 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -84,6 +84,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-bts.o
 libperf-y += parse-branch-options.o
 
 libperf-$(CONFIG_LIBBPF) += bpf-loader.o
+libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
 libperf-$(CONFIG_LIBELF) += symbol-elf.o
 libperf-$(CONFIG_LIBELF) += probe-file.o
 libperf-$(CONFIG_LIBELF) += probe-event.o
diff --git a/tools/perf/util/bpf-prologue.c b/tools/perf/util/bpf-prologue.c
new file mode 100644
index 0000000..2a5f4c7
--- /dev/null
+++ b/tools/perf/util/bpf-prologue.c
@@ -0,0 +1,442 @@
+/*
+ * bpf-prologue.c
+ *
+ * Copyright (C) 2015 He Kuang <hekuang@huawei.com>
+ * Copyright (C) 2015 Huawei Inc.
+ */
+
+#include <bpf/libbpf.h>
+#include "perf.h"
+#include "debug.h"
+#include "bpf-prologue.h"
+#include "probe-finder.h"
+#include <dwarf-regs.h>
+#include <linux/filter.h>
+
+#define BPF_REG_SIZE		8
+
+#define JMP_TO_ERROR_CODE	-1
+#define JMP_TO_SUCCESS_CODE	-2
+#define JMP_TO_USER_CODE	-3
+
+struct bpf_insn_pos {
+	struct bpf_insn *begin;
+	struct bpf_insn *end;
+	struct bpf_insn *pos;
+};
+
+static inline int
+pos_get_cnt(struct bpf_insn_pos *pos)
+{
+	return pos->pos - pos->begin;
+}
+
+static int
+append_insn(struct bpf_insn new_insn, struct bpf_insn_pos *pos)
+{
+	if (!pos->pos)
+		return -ERANGE;
+
+	if (pos->pos + 1 >= pos->end) {
+		pr_err("bpf prologue: prologue too long\n");
+		pos->pos = NULL;
+		return -ERANGE;
+	}
+
+	*(pos->pos)++ = new_insn;
+	return 0;
+}
+
+static int
+check_pos(struct bpf_insn_pos *pos)
+{
+	if (!pos->pos || pos->pos >= pos->end)
+		return -ERANGE;
+	return 0;
+}
+
+/* Give it a shorter name */
+#define ins(i, p) append_insn((i), (p))
+
+/*
+ * Given a register name (in 'reg'), generate an instruction to
+ * load that register into an eBPF register (target_reg):
+ *   'ldx target_reg, offset(ctx_reg)', where:
+ * ctx_reg is pre-initialized to point to 'struct pt_regs'.
+ */
+static int
+gen_ldx_reg_from_ctx(struct bpf_insn_pos *pos, int ctx_reg,
+		     const char *reg, int target_reg)
+{
+	int offset;
+
+	if (arch_get_reg_info(reg, &offset)) {
+		pr_err("bpf: prologue: failed to get register %s\n",
+		       reg);
+		return -1;
+	}
+	ins(BPF_LDX_MEM(BPF_DW, target_reg, ctx_reg, offset), pos);
+
+	if (check_pos(pos))
+		return -ERANGE;
+	return 0;
+}
+
+/*
+ * Generate a BPF_FUNC_probe_read function call.
+ *
+ * src_base_addr_reg is a register holding base address,
+ * dst_addr_reg is a register holding dest address (on stack),
+ * result is:
+ *
+ *  *[dst_addr_reg] = *([src_base_addr_reg] + offset)
+ *
+ * Arguments of BPF_FUNC_probe_read:
+ *     ARG1: ptr to stack (dest)
+ *     ARG2: size (8)
+ *     ARG3: unsafe ptr (src)
+ */
+static int
+gen_read_mem(struct bpf_insn_pos *pos,
+	     int src_base_addr_reg,
+	     int dst_addr_reg,
+	     long offset)
+{
+	/* mov arg3, src_base_addr_reg */
+	if (src_base_addr_reg != BPF_REG_ARG3)
+		ins(BPF_MOV64_REG(BPF_REG_ARG3, src_base_addr_reg), pos);
+	/* add arg3, #offset */
+	if (offset)
+		ins(BPF_ALU64_IMM(BPF_ADD, BPF_REG_ARG3, offset), pos);
+
+	/* mov arg2, #reg_size */
+	ins(BPF_ALU64_IMM(BPF_MOV, BPF_REG_ARG2, BPF_REG_SIZE), pos);
+
+	/* mov arg1, dst_addr_reg */
+	if (dst_addr_reg != BPF_REG_ARG1)
+		ins(BPF_MOV64_REG(BPF_REG_ARG1, dst_addr_reg), pos);
+
+	/* Call probe_read  */
+	ins(BPF_EMIT_CALL(BPF_FUNC_probe_read), pos);
+	/*
+	 * Error processing: if the read fails, jump to the error code.
+	 * The jump offset is a placeholder that will be relocated; its
+	 * target should be the start of the error processing code.
+	 */
+	ins(BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, JMP_TO_ERROR_CODE),
+	    pos);
+
+	if (check_pos(pos))
+		return -ERANGE;
+	return 0;
+}
+
+/*
+ * Each arg should be a bare register. Fetch and save them into argument
+ * registers (r3 - r5).
+ *
+ * BPF_REG_1 should have been initialized with pointer to
+ * 'struct pt_regs'.
+ */
+static int
+gen_prologue_fastpath(struct bpf_insn_pos *pos,
+		      struct probe_trace_arg *args, int nargs)
+{
+	int i;
+
+	for (i = 0; i < nargs; i++)
+		if (gen_ldx_reg_from_ctx(pos, BPF_REG_1, args[i].value,
+					 BPF_PROLOGUE_START_ARG_REG + i))
+			goto errout;
+
+	if (check_pos(pos))
+		goto errout;
+	return 0;
+errout:
+	return -1;
+}
+
+/*
+ * Slow path:
+ *   At least one argument has the form of 'offset($rx)'.
+ *
+ * The following code first stores them on the stack, then loads all of
+ * them into r3 - r5.
+ * Before final loading, the final result should be:
+ *
+ * low address
+ * BPF_REG_FP - 24  ARG3
+ * BPF_REG_FP - 16  ARG2
+ * BPF_REG_FP - 8   ARG1
+ * BPF_REG_FP
+ * high address
+ *
+ * For each argument (described as: offn(...off2(off1(reg)))),
+ * generates following code:
+ *
+ *  r7 <- fp
+ *  r7 <- r7 - stack_offset  // Ideal code should initialize r7 using
+ *                           // fp before generating args. However,
+ *                           // eBPF won't regard r7 as stack pointer
+ *                           // if it is generated by minus 8 from
+ *                           // another stack pointer except fp.
+ *                           // This is why we have to set r7
+ *                           // to fp for each variable.
+ *  r3 <- value of 'reg'-> generated using gen_ldx_reg_from_ctx()
+ *  (r7) <- r3       // skip following instructions for bare reg
+ *  r3 <- r3 + off1  . // skip if off1 == 0
+ *  r2 <- 8           \
+ *  r1 <- r7           |-> generated by gen_read_mem()
+ *  call probe_read    /
+ *  jnei r0, 0, err  ./
+ *  r3 <- (r7)
+ *  r3 <- r3 + off2  . // skip if off2 == 0
+ *  r2 <- 8           \  // r2 may be broken by probe_read, so set again
+ *  r1 <- r7           |-> generated by gen_read_mem()
+ *  call probe_read    /
+ *  jnei r0, 0, err  ./
+ *  ...
+ */
+static int
+gen_prologue_slowpath(struct bpf_insn_pos *pos,
+		      struct probe_trace_arg *args, int nargs)
+{
+	int i;
+
+	for (i = 0; i < nargs; i++) {
+		struct probe_trace_arg *arg = &args[i];
+		const char *reg = arg->value;
+		struct probe_trace_arg_ref *ref = NULL;
+		int stack_offset = (i + 1) * -8;
+
+		pr_debug("prologue: fetch arg %d, base reg is %s\n",
+			 i, reg);
+
+		/* value of base register is stored into ARG3 */
+		if (gen_ldx_reg_from_ctx(pos, BPF_REG_CTX, reg,
+					 BPF_REG_ARG3)) {
+			pr_err("prologue: failed to get offset of register %s\n",
+			       reg);
+			goto errout;
+		}
+
+		/* Make r7 the stack pointer. */
+		ins(BPF_MOV64_REG(BPF_REG_7, BPF_REG_FP), pos);
+		/* r7 += stack_offset, i.e. -8 * (i + 1) */
+		ins(BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, stack_offset), pos);
+		/*
+		 * Store r3 (base register) onto stack
+		 * Ensure fp[offset] is set.
+		 * fp is the only valid base register when storing
+		 * into stack. We are not allowed to use r7 as base
+		 * register here.
+		 */
+		ins(BPF_STX_MEM(BPF_DW, BPF_REG_FP, BPF_REG_ARG3,
+				stack_offset), pos);
+
+		ref = arg->ref;
+		while (ref) {
+			pr_debug("prologue: arg %d: offset %ld\n",
+				 i, ref->offset);
+			if (gen_read_mem(pos, BPF_REG_3, BPF_REG_7,
+					 ref->offset)) {
+				pr_err("prologue: failed to generate probe_read function call\n");
+				goto errout;
+			}
+
+			ref = ref->next;
+			/*
+			 * Load previous result into ARG3. Use
+			 * BPF_REG_FP instead of r7 because verifier
+			 * allows FP based addressing only.
+			 */
+			if (ref)
+				ins(BPF_LDX_MEM(BPF_DW, BPF_REG_ARG3,
+						BPF_REG_FP, stack_offset), pos);
+		}
+	}
+
+	/* Final pass: read to registers */
+	for (i = 0; i < nargs; i++)
+		ins(BPF_LDX_MEM(BPF_DW, BPF_PROLOGUE_START_ARG_REG + i,
+				BPF_REG_FP, -BPF_REG_SIZE * (i + 1)), pos);
+
+	ins(BPF_JMP_IMM(BPF_JA, BPF_REG_0, 0, JMP_TO_SUCCESS_CODE), pos);
+
+	if (check_pos(pos))
+		goto errout;
+	return 0;
+errout:
+	return -1;
+}
+
+static int
+prologue_relocate(struct bpf_insn_pos *pos, struct bpf_insn *error_code,
+	    struct bpf_insn *success_code, struct bpf_insn *user_code)
+{
+	struct bpf_insn *insn;
+
+	if (check_pos(pos))
+		return -ERANGE;
+
+	for (insn = pos->begin; insn < pos->pos; insn++) {
+		u8 class = BPF_CLASS(insn->code);
+		u8 opcode;
+
+		if (class != BPF_JMP)
+			continue;
+		opcode = BPF_OP(insn->code);
+		if (opcode == BPF_CALL)
+			continue;
+
+		switch (insn->off) {
+		case JMP_TO_ERROR_CODE:
+			insn->off = error_code - (insn + 1);
+			break;
+		case JMP_TO_SUCCESS_CODE:
+			insn->off = success_code - (insn + 1);
+			break;
+		case JMP_TO_USER_CODE:
+			insn->off = user_code - (insn + 1);
+			break;
+		default:
+			pr_err("bpf prologue: internal error: relocation failed\n");
+			return -1;
+		}
+	}
+	return 0;
+}
+
+int bpf__gen_prologue(struct probe_trace_arg *args, int nargs,
+		      struct bpf_insn *new_prog, size_t *new_cnt,
+		      size_t cnt_space)
+{
+	struct bpf_insn *success_code = NULL;
+	struct bpf_insn *error_code = NULL;
+	struct bpf_insn *user_code = NULL;
+	struct bpf_insn_pos pos;
+	bool fastpath = true;
+	int i;
+
+	if (!new_prog || !new_cnt)
+		return -EINVAL;
+
+	if (cnt_space > BPF_MAXINSNS)
+		cnt_space = BPF_MAXINSNS;
+	pos.begin = new_prog;
+	pos.end = new_prog + cnt_space;
+	pos.pos = new_prog;
+
+	if (!nargs) {
+		ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0),
+		    &pos);
+
+		if (check_pos(&pos))
+			goto errout;
+
+		*new_cnt = pos_get_cnt(&pos);
+		return 0;
+	}
+
+	if (nargs > BPF_PROLOGUE_MAX_ARGS)
+		nargs = BPF_PROLOGUE_MAX_ARGS;
+
+	/* First pass: validation */
+	for (i = 0; i < nargs; i++) {
+		struct probe_trace_arg_ref *ref = args[i].ref;
+
+		if (args[i].value[0] == '@') {
+			/* TODO: fetch global variable */
+			pr_err("bpf: prologue: global %s%+ld not supported\n",
+				args[i].value, ref ? ref->offset : 0);
+			return -ENOTSUP;
+		}
+
+		while (ref) {
+			/* fastpath is true if all args have ref == NULL */
+			fastpath = false;
+
+			/*
+			 * Instruction encodes immediate value using
+			 * s32, ref->offset is long. On systems which
+			 * can't fill long in s32, refuse to process if
+			 * ref->offset too large (or small).
+			 */
+#ifdef __LP64__
+#define OFFSET_MAX	((1LL << 31) - 1)
+#define OFFSET_MIN	((1LL << 31) * -1)
+			if (ref->offset > OFFSET_MAX ||
+					ref->offset < OFFSET_MIN) {
+				pr_err("bpf: prologue: offset out of bound: %ld\n",
+				       ref->offset);
+				return -E2BIG;
+			}
+#endif
+			ref = ref->next;
+		}
+	}
+	pr_debug("prologue: pass validation\n");
+
+	if (fastpath) {
+		/* If all variables are registers... */
+		pr_debug("prologue: fast path\n");
+		if (gen_prologue_fastpath(&pos, args, nargs))
+			goto errout;
+	} else {
+		pr_debug("prologue: slow path\n");
+
+		/* Initialization: move ctx to a callee saved register. */
+		ins(BPF_MOV64_REG(BPF_REG_CTX, BPF_REG_ARG1), &pos);
+
+		if (gen_prologue_slowpath(&pos, args, nargs))
+			goto errout;
+		/*
+		 * start of ERROR_CODE (only slow path needs error code)
+		 *   mov r2 <- 1
+		 *   goto usercode
+		 */
+		error_code = pos.pos;
+		ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 1),
+		    &pos);
+
+		for (i = 0; i < nargs; i++)
+			ins(BPF_ALU64_IMM(BPF_MOV,
+					  BPF_PROLOGUE_START_ARG_REG + i,
+					  0),
+			    &pos);
+		ins(BPF_JMP_IMM(BPF_JA, BPF_REG_0, 0, JMP_TO_USER_CODE),
+				&pos);
+	}
+
+	/*
+	 * start of SUCCESS_CODE:
+	 *   mov r2 <- 0
+	 *   goto usercode  // skip
+	 */
+	success_code = pos.pos;
+	ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0), &pos);
+
+	/*
+	 * start of USER_CODE:
+	 *   Restore ctx to r1
+	 */
+	user_code = pos.pos;
+	if (!fastpath) {
+		/*
+		 * Only slow path needs restoring of ctx. In fast path,
+		 * register are loaded directly from r1.
+		 */
+		ins(BPF_MOV64_REG(BPF_REG_ARG1, BPF_REG_CTX), &pos);
+		if (prologue_relocate(&pos, error_code, success_code,
+				      user_code))
+			goto errout;
+	}
+
+	if (check_pos(&pos))
+		goto errout;
+
+	*new_cnt = pos_get_cnt(&pos);
+	return 0;
+errout:
+	return -ERANGE;
+}
diff --git a/tools/perf/util/bpf-prologue.h b/tools/perf/util/bpf-prologue.h
new file mode 100644
index 0000000..f1e4c5d
--- /dev/null
+++ b/tools/perf/util/bpf-prologue.h
@@ -0,0 +1,34 @@
+/*
+ * Copyright (C) 2015, He Kuang <hekuang@huawei.com>
+ * Copyright (C) 2015, Huawei Inc.
+ */
+#ifndef __BPF_PROLOGUE_H
+#define __BPF_PROLOGUE_H
+
+#include <linux/compiler.h>
+#include <linux/filter.h>
+#include "probe-event.h"
+
+#define BPF_PROLOGUE_MAX_ARGS 3
+#define BPF_PROLOGUE_START_ARG_REG BPF_REG_3
+#define BPF_PROLOGUE_FETCH_RESULT_REG BPF_REG_2
+
+#ifdef HAVE_BPF_PROLOGUE
+int bpf__gen_prologue(struct probe_trace_arg *args, int nargs,
+		      struct bpf_insn *new_prog, size_t *new_cnt,
+		      size_t cnt_space);
+#else
+static inline int
+bpf__gen_prologue(struct probe_trace_arg *args __maybe_unused,
+		  int nargs __maybe_unused,
+		  struct bpf_insn *new_prog __maybe_unused,
+		  size_t *new_cnt,
+		  size_t cnt_space __maybe_unused)
+{
+	if (!new_cnt)
+		return -EINVAL;
+	*new_cnt = 0;
+	return 0;
+}
+#endif
+#endif /* __BPF_PROLOGUE_H */
-- 
2.1.0



* [PATCH 25/31] perf tools: Generate prologue for BPF programs
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (23 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 24/31] perf tools: Add prologue for BPF programs for fetching arguments Wang Nan
@ 2015-08-29  4:21 ` Wang Nan
  2015-08-29  4:22 ` [PATCH 26/31] perf tools: Use same BPF program if arguments are identical Wang Nan
                   ` (5 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:21 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

This patch generates a prologue for each 'struct probe_trace_event' to
fetch arguments for BPF programs.

After bpf__probe(), iterate over each program to check whether a
prologue is required. If no 'struct perf_probe_event' that a program
will attach to has any argument, simply skip preprocessor hooking. For
programs that do require a prologue, call bpf__gen_prologue() and paste
the original instructions after the prologue.
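
The "paste after prologue" step amounts to laying out two instruction
arrays back to back in one buffer. A minimal sketch, assuming a
hypothetical 'struct demo_insn' in place of 'struct bpf_insn' and a
separate prologue array (the patch itself generates the prologue
directly into the destination buffer):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for 'struct bpf_insn'. */
struct demo_insn {
	int code;
};

/*
 * Lay out 'prologue' followed by 'orig' in 'buf', which has room for
 * 'buf_cnt' instructions. Returns the total instruction count, or 0 if
 * the result does not fit.
 */
static size_t demo_prepend(struct demo_insn *buf, size_t buf_cnt,
			   const struct demo_insn *prologue, size_t prologue_cnt,
			   const struct demo_insn *orig, size_t orig_cnt)
{
	assert(buf && prologue && orig);

	/* Written this way so the size check itself cannot overflow. */
	if (prologue_cnt > buf_cnt || orig_cnt > buf_cnt - prologue_cnt)
		return 0;

	memcpy(buf, prologue, prologue_cnt * sizeof(*buf));
	memcpy(buf + prologue_cnt, orig, orig_cnt * sizeof(*buf));
	return prologue_cnt + orig_cnt;
}
```

The explicit capacity check is the part worth noting: the combined
program must still fit within the instruction limit of the buffer.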

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-36-git-send-email-wangnan0@huawei.com
---
 tools/perf/util/bpf-loader.c | 120 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 119 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 95e529b..66d9bea 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -5,10 +5,13 @@
  * Copyright (C) 2015 Huawei Inc.
  */
 
+#include <linux/bpf.h>
 #include <bpf/libbpf.h>
 #include "perf.h"
 #include "debug.h"
 #include "bpf-loader.h"
+#include "bpf-prologue.h"
+#include "llvm-utils.h"
 #include "probe-event.h"
 #include "probe-finder.h"
 #include "llvm-utils.h"
@@ -42,6 +45,8 @@ struct bpf_prog_priv {
 		struct perf_probe_event *ppev;
 		struct perf_probe_event pev;
 	};
+	bool need_prologue;
+	struct bpf_insn *insns_buf;
 };
 
 static void
@@ -53,6 +58,7 @@ bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
 	/* check if pev is initialized */
 	if (priv && priv->pev_ready)
 		clear_perf_probe_event(&priv->pev);
+	zfree(&priv->insns_buf);
 	free(priv);
 }
 
@@ -239,6 +245,103 @@ int bpf__unprobe(void)
 	return ret < 0 ? ret : 0;
 }
 
+static int
+preproc_gen_prologue(struct bpf_program *prog, int n,
+		     struct bpf_insn *orig_insns, int orig_insns_cnt,
+		     struct bpf_prog_prep_result *res)
+{
+	struct probe_trace_event *tev;
+	struct perf_probe_event *pev;
+	struct bpf_prog_priv *priv;
+	struct bpf_insn *buf;
+	size_t prologue_cnt = 0;
+	int err;
+
+	err = bpf_program__get_private(prog, (void **)&priv);
+	if (err || !priv || !priv->pev_ready)
+		goto errout;
+
+	pev = &priv->pev;
+
+	if (n < 0 || n >= pev->ntevs)
+		goto errout;
+
+	tev = &pev->tevs[n];
+
+	buf = priv->insns_buf;
+	err = bpf__gen_prologue(tev->args, tev->nargs,
+				buf, &prologue_cnt,
+				BPF_MAXINSNS - orig_insns_cnt);
+	if (err) {
+		const char *title;
+
+		title = bpf_program__title(prog, false);
+		if (!title)
+			title = "??";
+
+		pr_debug("Failed to generate prologue for program %s\n",
+			 title);
+		return err;
+	}
+
+	memcpy(&buf[prologue_cnt], orig_insns,
+	       sizeof(struct bpf_insn) * orig_insns_cnt);
+
+	res->new_insn_ptr = buf;
+	res->new_insn_cnt = prologue_cnt + orig_insns_cnt;
+	res->pfd = NULL;
+	return 0;
+
+errout:
+	pr_debug("Internal error in preproc_gen_prologue\n");
+	return -EINVAL;
+}
+
+static int hook_load_preprocessor(struct bpf_program *prog)
+{
+	struct perf_probe_event *pev;
+	struct bpf_prog_priv *priv;
+	bool need_prologue = false;
+	int err, i;
+
+	err = bpf_program__get_private(prog, (void **)&priv);
+	if (err || !priv) {
+		pr_debug("Internal error when hooking preprocessor\n");
+		return -EINVAL;
+	}
+
+	pev = &priv->pev;
+	for (i = 0; i < pev->ntevs; i++) {
+		struct probe_trace_event *tev = &pev->tevs[i];
+
+		if (tev->nargs > 0) {
+			need_prologue = true;
+			break;
+		}
+	}
+
+	/*
+	 * Since no tev has any argument, we don't need to generate a
+	 * prologue.
+	 */
+	if (!need_prologue) {
+		priv->need_prologue = false;
+		return 0;
+	}
+
+	priv->need_prologue = true;
+	priv->insns_buf = malloc(sizeof(struct bpf_insn) *
+					BPF_MAXINSNS);
+	if (!priv->insns_buf) {
+		pr_debug("Not enough memory: alloc insns_buf failed\n");
+		return -ENOMEM;
+	}
+
+	err = bpf_program__set_prep(prog, pev->ntevs,
+				    preproc_gen_prologue);
+	return err;
+}
+
 int bpf__probe(void)
 {
 	int err, nr_events = 0;
@@ -289,6 +392,17 @@ int bpf__probe(void)
 			err = sync_bpf_program_pev(prog);
 			if (err)
 				goto out;
+			/*
+			 * After probing, let's consider the prologue, which
+			 * adds an argument fetcher to BPF programs.
+			 *
+			 * hook_load_preprocessor() hooks a pre-processor
+			 * to bpf_program and lets it generate the prologue
+			 * dynamically during loading.
+			 */
+			err = hook_load_preprocessor(prog);
+			if (err)
+				goto out;
 		}
 	}
 out:
@@ -349,7 +463,11 @@ int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg)
 			for (i = 0; i < pev->ntevs; i++) {
 				tev = &pev->tevs[i];
 
-				fd = bpf_program__fd(prog);
+				if (priv->need_prologue)
+					fd = bpf_program__nth_fd(prog, i);
+				else
+					fd = bpf_program__fd(prog);
+
 				if (fd < 0) {
 					pr_debug("bpf: failed to get file descriptor\n");
 					return fd;
-- 
2.1.0



* [PATCH 26/31] perf tools: Use same BPF program if arguments are identical
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (24 preceding siblings ...)
  2015-08-29  4:21 ` [PATCH 25/31] perf tools: Generate prologue for BPF programs Wang Nan
@ 2015-08-29  4:22 ` Wang Nan
  2015-08-29  4:22 ` [PATCH 27/31] perf record: Support custom vmlinux path Wang Nan
                   ` (4 subsequent siblings)
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:22 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Wang Nan,
	Brendan Gregg, Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

This patch allows creating only one BPF program for different
'probe_trace_event's (tev) generated by one 'perf_probe_event' (pev), if
their prologues are identical.

This is done by comparing the argument lists of different tevs and
mapping each tev to a prologue type using a mapping array. This patch
uses qsort to sort the tevs. After sorting, tevs with identical argument
lists are grouped together.
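
The sort-then-group idea can be sketched as follows. For brevity each
event's argument list is flattened into one hypothetical signature
string and compared with strcmp(); the patch compares the real argument
structures instead, and all demo_* names are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One entry per event; 'sig' is a hypothetical flattened argument list. */
struct demo_ev {
	const char *sig;
	int idx;		/* position in the caller's original array */
};

static int demo_cmp(const void *a, const void *b)
{
	const struct demo_ev *ea = a, *eb = b;

	return strcmp(ea->sig, eb->sig);
}

/*
 * Fill mapping[i] with the type of event i and return the number of
 * distinct types, or -1 on allocation failure.
 */
static int demo_map_types(const char *const *sigs, int n, int *mapping)
{
	struct demo_ev *evs;
	int i, type = 0;

	assert(sigs && mapping && n > 0);

	evs = malloc(sizeof(*evs) * n);
	if (!evs)
		return -1;

	for (i = 0; i < n; i++) {
		evs[i].sig = sigs[i];
		evs[i].idx = i;
	}
	qsort(evs, n, sizeof(*evs), demo_cmp);

	for (i = 0; i < n; i++) {
		/* A new type starts whenever the signature changes. */
		if (i > 0 && strcmp(evs[i].sig, evs[i - 1].sig) != 0)
			type++;
		mapping[evs[i].idx] = type;
	}
	free(evs);
	return type + 1;
}
```

Keeping the original index next to each sorted entry is what allows the
result to be written back as a per-event mapping after the sort has
reordered the entries.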

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-37-git-send-email-wangnan0@huawei.com
---
 tools/perf/util/bpf-loader.c | 133 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 126 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 66d9bea..a23aaf0 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -47,6 +47,8 @@ struct bpf_prog_priv {
 	};
 	bool need_prologue;
 	struct bpf_insn *insns_buf;
+	int nr_types;
+	int *type_mapping;
 };
 
 static void
@@ -59,6 +61,7 @@ bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
 	if (priv && priv->pev_ready)
 		clear_perf_probe_event(&priv->pev);
 	zfree(&priv->insns_buf);
+	zfree(&priv->type_mapping);
 	free(priv);
 }
 
@@ -255,7 +258,7 @@ preproc_gen_prologue(struct bpf_program *prog, int n,
 	struct bpf_prog_priv *priv;
 	struct bpf_insn *buf;
 	size_t prologue_cnt = 0;
-	int err;
+	int i, err;
 
 	err = bpf_program__get_private(prog, (void **)&priv);
 	if (err || !priv || !priv->pev_ready)
@@ -263,10 +266,20 @@ preproc_gen_prologue(struct bpf_program *prog, int n,
 
 	pev = &priv->pev;
 
-	if (n < 0 || n >= pev->ntevs)
+	if (n < 0 || n >= priv->nr_types)
 		goto errout;
 
-	tev = &pev->tevs[n];
+	/* Find a tev belongs to that type */
+	for (i = 0; i < pev->ntevs; i++)
+		if (priv->type_mapping[i] == n)
+			break;
+
+	if (i >= pev->ntevs) {
+		pr_debug("Internal error: prologue type %d not found\n", n);
+		return -ENOENT;
+	}
+
+	tev = &pev->tevs[i];
 
 	buf = priv->insns_buf;
 	err = bpf__gen_prologue(tev->args, tev->nargs,
@@ -297,6 +310,98 @@ errout:
 	return -EINVAL;
 }
 
+/*
+ * compare_tev_args is reflexive, transitive and antisymmetric.
+ * I can show that but this margin is too narrow to contain.
+ */
+static int compare_tev_args(const void *ptev1, const void *ptev2)
+{
+	int i, ret;
+	const struct probe_trace_event *tev1 =
+		*(const struct probe_trace_event **)ptev1;
+	const struct probe_trace_event *tev2 =
+		*(const struct probe_trace_event **)ptev2;
+
+	ret = tev2->nargs - tev1->nargs;
+	if (ret)
+		return ret;
+
+	for (i = 0; i < tev1->nargs; i++) {
+		struct probe_trace_arg *arg1, *arg2;
+		struct probe_trace_arg_ref *ref1, *ref2;
+
+		arg1 = &tev1->args[i];
+		arg2 = &tev2->args[i];
+
+		ret = strcmp(arg1->value, arg2->value);
+		if (ret)
+			return ret;
+
+		ref1 = arg1->ref;
+		ref2 = arg2->ref;
+
+		while (ref1 && ref2) {
+			ret = ref2->offset - ref1->offset;
+			if (ret)
+				return ret;
+
+			ref1 = ref1->next;
+			ref2 = ref2->next;
+		}
+
+		if (ref1 || ref2)
+			return ref2 ? 1 : -1;
+	}
+
+	return 0;
+}
+
+static int map_prologue(struct perf_probe_event *pev, int *mapping,
+			int *nr_types)
+{
+	int i, type = 0;
+	struct {
+		struct probe_trace_event *tev;
+		int idx;
+	} *stevs;
+	size_t array_sz = sizeof(*stevs) * pev->ntevs;
+
+	stevs = malloc(array_sz);
+	if (!stevs) {
+		pr_debug("No ehough memory: alloc stevs failed\n");
+		return -ENOMEM;
+	}
+
+	pr_debug("In map_prologue, ntevs=%d\n", pev->ntevs);
+	for (i = 0; i < pev->ntevs; i++) {
+		stevs[i].tev = &pev->tevs[i];
+		stevs[i].idx = i;
+	}
+	qsort(stevs, pev->ntevs, sizeof(*stevs),
+	      compare_tev_args);
+
+	for (i = 0; i < pev->ntevs; i++) {
+		if (i == 0) {
+			mapping[stevs[i].idx] = type;
+			pr_debug("mapping[%d]=%d\n", stevs[i].idx,
+				 type);
+			continue;
+		}
+
+		if (compare_tev_args(stevs + i, stevs + i - 1) == 0)
+			mapping[stevs[i].idx] = type;
+		else
+			mapping[stevs[i].idx] = ++type;
+
+		pr_debug("mapping[%d]=%d\n", stevs[i].idx,
+			 mapping[stevs[i].idx]);
+	}
+	free(stevs);
+	*nr_types = type + 1;
+
+	return 0;
+}
+
 static int hook_load_preprocessor(struct bpf_program *prog)
 {
 	struct perf_probe_event *pev;
@@ -337,7 +442,19 @@ static int hook_load_preprocessor(struct bpf_program *prog)
 		return -ENOMEM;
 	}
 
-	err = bpf_program__set_prep(prog, pev->ntevs,
+	priv->type_mapping = malloc(sizeof(int) * pev->ntevs);
+	if (!priv->type_mapping) {
+		pr_debug("No enough memory: alloc type_mapping failed\n");
+		return -ENOMEM;
+	}
+	memset(priv->type_mapping, 0xff,
+	       sizeof(int) * pev->ntevs);
+
+	err = map_prologue(pev, priv->type_mapping, &priv->nr_types);
+	if (err)
+		return err;
+
+	err = bpf_program__set_prep(prog, priv->nr_types,
 				    preproc_gen_prologue);
 	return err;
 }
@@ -463,9 +580,11 @@ int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg)
 			for (i = 0; i < pev->ntevs; i++) {
 				tev = &pev->tevs[i];
 
-				if (priv->need_prologue)
-					fd = bpf_program__nth_fd(prog, i);
-				else
+				if (priv->need_prologue) {
+					int type = priv->type_mapping[i];
+
+					fd = bpf_program__nth_fd(prog, type);
+				} else
 					fd = bpf_program__fd(prog);
 
 				if (fd < 0) {
-- 
2.1.0

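The grouping step described in the commit message can be sketched outside
of perf as follows. This is only an illustration under simplifying
assumptions: 'struct demo_tev', 'struct sort_ent', compare_ent() and
map_types() are hypothetical names, and a single string stands in for the
full argument list that compare_tev_args() walks in the patch. The idea is
the same: sort wrappers that remember the original index, then give equal
neighbours the same type id.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a probe_trace_event: only the args matter here. */
struct demo_tev {
	const char *args;	/* e.g. "%di %si" */
};

/* Sortable wrapper that remembers the original index, like 'stevs'. */
struct sort_ent {
	const struct demo_tev *tev;
	int idx;
};

static int compare_ent(const void *p1, const void *p2)
{
	const struct sort_ent *e1 = p1, *e2 = p2;

	return strcmp(e1->tev->args, e2->tev->args);
}

/*
 * Fill mapping[i] with the type id of tevs[i].
 * Returns the number of distinct types, or -1 if allocation fails.
 */
static int map_types(const struct demo_tev *tevs, int ntevs, int *mapping)
{
	struct sort_ent *ents;
	int i, type = 0;

	if (ntevs <= 0)
		return 0;

	ents = malloc(sizeof(*ents) * ntevs);
	if (!ents)
		return -1;

	for (i = 0; i < ntevs; i++) {
		ents[i].tev = &tevs[i];
		ents[i].idx = i;
	}
	qsort(ents, ntevs, sizeof(*ents), compare_ent);

	/* Equal neighbours share a type; a difference starts a new one. */
	mapping[ents[0].idx] = type;
	for (i = 1; i < ntevs; i++) {
		if (compare_ent(&ents[i], &ents[i - 1]) != 0)
			type++;
		mapping[ents[i].idx] = type;
	}
	free(ents);

	return type + 1;
}
```

With such a mapping, one program per type is enough, and the fd for tev i
is looked up through mapping[i] rather than through i itself.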


* [PATCH 27/31] perf record: Support custom vmlinux path
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (25 preceding siblings ...)
  2015-08-29  4:22 ` [PATCH 26/31] perf tools: Use same BPF program if arguments are identical Wang Nan
@ 2015-08-29  4:22 ` Wang Nan
  2015-09-01 20:19   ` Arnaldo Carvalho de Melo
  2015-08-29  4:22 ` [PATCH 28/31] perf probe: Init symbol as kprobe Wang Nan
                   ` (3 subsequent siblings)
  30 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:22 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, He Kuang, Wang Nan,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

From: He Kuang <hekuang@huawei.com>

Make the perf-record command support the --vmlinux option if BPF_PROLOGUE
is on.

'perf record' needs vmlinux as the source of DWARF info to generate
prologues for BPF programs, so the path to vmlinux must be specifiable.

The short name 'k' has been taken by 'clockid', so this patch skips the
short option name and uses '--vmlinux' for the vmlinux path.

Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-38-git-send-email-wangnan0@huawei.com
---
 tools/perf/builtin-record.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 212718c..8eb39d5 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1100,6 +1100,10 @@ struct option __record_options[] = {
 		   "clang binary to use for compiling BPF scriptlets"),
 	OPT_STRING(0, "clang-opt", &llvm_param.clang_opt, "clang options",
 		   "options passed to clang when compiling BPF scriptlets"),
+#ifdef HAVE_BPF_PROLOGUE
+	OPT_STRING(0, "vmlinux", &symbol_conf.vmlinux_name,
+		   "file", "vmlinux pathname"),
+#endif
 #endif
 	OPT_END()
 };
-- 
2.1.0



* [PATCH 28/31] perf probe: Init symbol as kprobe
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (26 preceding siblings ...)
  2015-08-29  4:22 ` [PATCH 27/31] perf record: Support custom vmlinux path Wang Nan
@ 2015-08-29  4:22 ` Wang Nan
  2015-09-01 20:11   ` Arnaldo Carvalho de Melo
  2015-08-29  4:22 ` [PATCH 29/31] perf tools: Support attach BPF program on uprobe events Wang Nan
                   ` (2 subsequent siblings)
  30 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:22 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Before this patch, add_perf_probe_events() initialized symbol maps for
uprobes only if the first 'struct perf_probe_event' passed to it was a
uprobe event. This trick worked because the command line syntax of
'perf probe' guarantees that, if the probe_event array contains any
kprobe, its first element is a kprobe.

However, with the incoming BPF uprobe support, that constraint no longer
holds: 'perf record' will also create kprobes and uprobes through a BPF
object, so it is possible to pass an array that contains kprobes but
whose first element is a uprobe.

This patch initializes symbol maps for kprobes even if all of the events
are uprobes, because the extra cost should be small enough.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-39-git-send-email-wangnan0@huawei.com
---
 tools/perf/util/probe-event.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index e720913..b94a8d7 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2789,7 +2789,7 @@ int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
 {
 	int i, ret;
 
-	ret = init_symbol_maps(pevs->uprobes);
+	ret = init_symbol_maps(false);
 	if (ret < 0)
 		return ret;
 
-- 
2.1.0



* [PATCH 29/31] perf tools: Support attach BPF program on uprobe events
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (27 preceding siblings ...)
  2015-08-29  4:22 ` [PATCH 28/31] perf probe: Init symbol as kprobe Wang Nan
@ 2015-08-29  4:22 ` Wang Nan
  2015-08-29  4:22 ` [PATCH 30/31] perf tools: Fix cross compiling error Wang Nan
  2015-08-29  4:22 ` [PATCH 31/31] tools lib traceevent: Support function __get_dynamic_array_len Wang Nan
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:22 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, Wang Nan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

This patch adds new syntax to the BPF object section name to support
probing on uprobe events. A BPF program can now be written like this:

 SEC(
 "target=/lib64/libc.so.6\n"
 "libcwrite=__write"
 )
 int libcwrite(void *ctx)
 {
     return 1;
 }

In the section name of a program, 'key=value' style options can be
placed before the main config string, one per line. For now the only
option key is "target", which selects the file to probe with uprobes.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-40-git-send-email-wangnan0@huawei.com
---
 tools/perf/util/bpf-loader.c | 88 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 81 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index a23aaf0..2735389 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -66,6 +66,84 @@ bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
 }
 
 static int
+do_config(const char *key, const char *value,
+	  struct perf_probe_event *pev)
+{
+	pr_debug("config bpf program: %s=%s\n", key, value);
+	if (strcmp(key, "target") == 0) {
+		pev->uprobes = true;
+		pev->target = strdup(value);
+		return 0;
+	}
+
+	pr_warning("BPF: WARNING: invalid config option in object: %s=%s\n",
+		   key, value);
+	pr_warning("\tHint: Currently only valid option is 'target=<file>'\n");
+	return 0;
+}
+
+static const char *
+parse_config_kvpair(const char *config_str, struct perf_probe_event *pev)
+{
+	char *text = strdup(config_str);
+	char *sep, *line;
+	const char *main_str = NULL;
+	int err = 0;
+
+	if (!text) {
+		pr_debug("No enough memory: dup config_str failed\n");
+		return NULL;
+	}
+
+	line = text;
+	while ((sep = strchr(line, '\n'))) {
+		char *equ;
+
+		*sep = '\0';
+		equ = strchr(line, '=');
+		if (!equ) {
+			pr_warning("WARNING: invalid config in BPF object: %s\n",
+				   line);
+			pr_warning("\tShould be 'key=value'.\n");
+			goto nextline;
+		}
+		*equ = '\0';
+
+		err = do_config(line, equ + 1, pev);
+		if (err)
+			break;
+nextline:
+		line = sep + 1;
+	}
+
+	if (!err)
+		main_str = config_str + (line - text);
+	free(text);
+
+	return main_str;
+}
+
+static int
+parse_config(const char *config_str, struct perf_probe_event *pev)
+{
+	const char *main_str;
+	int err;
+
+	main_str = parse_config_kvpair(config_str, pev);
+	if (!main_str)
+		return -EINVAL;
+
+	err = parse_perf_probe_command(main_str, pev);
+	if (err < 0) {
+		pr_debug("bpf: '%s' is not a valid config string\n",
+			 config_str);
+		/* parse failed, don't need clear pev. */
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
 config_bpf_program(struct bpf_program *prog, struct perf_probe_event *pev)
 {
 	struct bpf_prog_priv *priv = NULL;
@@ -79,13 +157,9 @@ config_bpf_program(struct bpf_program *prog, struct perf_probe_event *pev)
 	}
 
 	pr_debug("bpf: config program '%s'\n", config_str);
-	err = parse_perf_probe_command(config_str, pev);
-	if (err < 0) {
-		pr_debug("bpf: '%s' is not a valid config string\n",
-			 config_str);
-		/* parse failed, don't need clear pev. */
-		return -EINVAL;
-	}
+	err = parse_config(config_str, pev);
+	if (err)
+		return err;
 
 	if (pev->group && strcmp(pev->group, PERF_BPF_PROBE_GROUP)) {
 		pr_debug("bpf: '%s': group for event is set and not '%s'.\n",
-- 
2.1.0

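For readers who want to experiment with the section name syntax without
building perf, here is a minimal standalone sketch of the same idea. It
is not the code from the patch: 'struct demo_config' and demo_parse_kv()
are hypothetical names, and instead of duplicating the whole string it
scans it in place. It consumes leading "key=value" lines, remembers the
value of "target", and returns the remaining main config string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result holder; the patch stores this in the pev instead. */
struct demo_config {
	char *target;	/* value of "target=", or NULL if not given */
};

/*
 * Consume leading "key=value\n" lines from 'str'. Returns a pointer
 * (into 'str') to the main config string, or NULL if allocation fails.
 */
static const char *demo_parse_kv(const char *str, struct demo_config *cfg)
{
	const char *line = str;
	const char *nl;

	while ((nl = strchr(line, '\n')) != NULL) {
		/* Only look for '=' inside the current line. */
		const char *eq = memchr(line, '=', nl - line);

		if (eq && (size_t)(eq - line) == strlen("target") &&
		    strncmp(line, "target", eq - line) == 0) {
			size_t len = nl - (eq + 1);
			char *val = malloc(len + 1);

			if (!val)
				return NULL;
			memcpy(val, eq + 1, len);
			val[len] = '\0';
			free(cfg->target);
			cfg->target = val;
		}
		/* Unknown or malformed lines are skipped in this sketch. */
		line = nl + 1;
	}

	/* Whatever follows the last newline is the main config string. */
	return line;
}
```

A section name without any newline is returned unchanged, so existing
kprobe-style section names keep working with this scheme.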


* [PATCH 30/31] perf tools: Fix cross compiling error
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (28 preceding siblings ...)
  2015-08-29  4:22 ` [PATCH 29/31] perf tools: Support attach BPF program on uprobe events Wang Nan
@ 2015-08-29  4:22 ` Wang Nan
  2015-08-29  4:22 ` [PATCH 31/31] tools lib traceevent: Support function __get_dynamic_array_len Wang Nan
  30 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:22 UTC (permalink / raw)
  To: acme, mingo, ast; +Cc: linux-kernel, lizefan, pi3orama, Wang Nan

Cross compiling perf for another platform fails due to a missing symbol:

  ...
  AR       /pathofperf/libperf.a
  LD       /pathofperf/tests/perf-in.o
  LD       /pathofperf/perf-in.o
  LINK     /pathofperf/perf
/pathofperf/libperf.a(libperf-in.o): In function `intel_pt_synth_branch_sample':
/usr/src/kernel/tools/perf/util/intel-pt.c:899: undefined reference to `tsc_to_perf_time'
/pathofperf/libperf.a(libperf-in.o): In function `intel_pt_synth_transaction_sample':
/usr/src/kernel/tools/perf/util/intel-pt.c:992: undefined reference to `tsc_to_perf_time'
/pathofperf/libperf.a(libperf-in.o): In function `intel_pt_synth_instruction_sample':
/usr/src/kernel/tools/perf/util/intel-pt.c:943: undefined reference to `tsc_to_perf_time'
  ...

This is because the newly introduced intel-pt-decoder can be compiled
for platforms other than x86, but tsc.c, which it requires, is compiled
for x86 only.

This patch fixes the build error by allowing tsc.c to be compiled
whenever CONFIG_AUXTRACE is set, regardless of the target platform.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1440766442-48116-1-git-send-email-wangnan0@huawei.com
---
 tools/perf/util/Build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index fd2f084..c8d9c7e 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -74,7 +74,7 @@ libperf-y += stat-shadow.o
 libperf-y += record.o
 libperf-y += srcline.o
 libperf-y += data.o
-libperf-$(CONFIG_X86) += tsc.o
+libperf-$(CONFIG_AUXTRACE) += tsc.o
 libperf-y += cloexec.o
 libperf-y += thread-stack.o
 libperf-$(CONFIG_AUXTRACE) += auxtrace.o
-- 
2.1.0



* [PATCH 31/31] tools lib traceevent: Support function __get_dynamic_array_len
  2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
                   ` (29 preceding siblings ...)
  2015-08-29  4:22 ` [PATCH 30/31] perf tools: Fix cross compiling error Wang Nan
@ 2015-08-29  4:22 ` Wang Nan
  2015-09-08 14:31   ` [tip:perf/core] " tip-bot for He Kuang
  30 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-08-29  4:22 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, He Kuang,
	Arnaldo Carvalho de Melo, Ingo Molnar, Jiri Olsa,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, Steven Rostedt,
	Wang Nan

From: He Kuang <hekuang@huawei.com>

Support the helper function __get_dynamic_array_len() in libtraceevent.
This function is used together with __print_array() or __print_hex(),
but currently it is not in the list of functions handled by
process_function().

The total allocated length of the dynamic array is embedded in the top
half of the __data_loc_##item field. This patch adds a new arg type,
PRINT_DYNAMIC_ARRAY_LEN, so that eval_num_arg() can evaluate to the
length.

Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: pi3orama@163.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/1437448130-134621-2-git-send-email-hekuang@huawei.com
---
 tools/lib/traceevent/event-parse.c                 | 56 +++++++++++++++++++++-
 tools/lib/traceevent/event-parse.h                 |  1 +
 .../perf/util/scripting-engines/trace-event-perl.c |  1 +
 .../util/scripting-engines/trace-event-python.c    |  1 +
 4 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 4d88593..1244797 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -848,6 +848,7 @@ static void free_arg(struct print_arg *arg)
 		free(arg->bitmask.bitmask);
 		break;
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 		free(arg->dynarray.index);
 		break;
 	case PRINT_OP:
@@ -2729,6 +2730,42 @@ process_dynamic_array(struct event_format *event, struct print_arg *arg, char **
 }
 
 static enum event_type
+process_dynamic_array_len(struct event_format *event, struct print_arg *arg,
+			  char **tok)
+{
+	struct format_field *field;
+	enum event_type type;
+	char *token;
+
+	if (read_expect_type(EVENT_ITEM, &token) < 0)
+		goto out_free;
+
+	arg->type = PRINT_DYNAMIC_ARRAY_LEN;
+
+	/* Find the field */
+	field = pevent_find_field(event, token);
+	if (!field)
+		goto out_free;
+
+	arg->dynarray.field = field;
+	arg->dynarray.index = 0;
+
+	if (read_expected(EVENT_DELIM, ")") < 0)
+		goto out_err;
+
+	type = read_token(&token);
+	*tok = token;
+
+	return type;
+
+ out_free:
+	free_token(token);
+ out_err:
+	*tok = NULL;
+	return EVENT_ERROR;
+}
+
+static enum event_type
 process_paren(struct event_format *event, struct print_arg *arg, char **tok)
 {
 	struct print_arg *item_arg;
@@ -2975,6 +3012,10 @@ process_function(struct event_format *event, struct print_arg *arg,
 		free_token(token);
 		return process_dynamic_array(event, arg, tok);
 	}
+	if (strcmp(token, "__get_dynamic_array_len") == 0) {
+		free_token(token);
+		return process_dynamic_array_len(event, arg, tok);
+	}
 
 	func = find_func_handler(event->pevent, token);
 	if (func) {
@@ -3655,14 +3696,25 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
 			goto out_warning_op;
 		}
 		break;
+	case PRINT_DYNAMIC_ARRAY_LEN:
+		offset = pevent_read_number(pevent,
+					    data + arg->dynarray.field->offset,
+					    arg->dynarray.field->size);
+		/*
+		 * The total allocated length of the dynamic array is
+		 * stored in the top half of the field, and the offset
+		 * is in the bottom half of the 32 bit field.
+		 */
+		val = (unsigned long long)(offset >> 16);
+		break;
 	case PRINT_DYNAMIC_ARRAY:
 		/* Without [], we pass the address to the dynamic data */
 		offset = pevent_read_number(pevent,
 					    data + arg->dynarray.field->offset,
 					    arg->dynarray.field->size);
 		/*
-		 * The actual length of the dynamic array is stored
-		 * in the top half of the field, and the offset
+		 * The total allocated length of the dynamic array is
+		 * stored in the top half of the field, and the offset
 		 * is in the bottom half of the 32 bit field.
 		 */
 		offset &= 0xffff;
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 204befb..6fc83c7 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -294,6 +294,7 @@ enum print_arg_type {
 	PRINT_OP,
 	PRINT_FUNC,
 	PRINT_BITMASK,
+	PRINT_DYNAMIC_ARRAY_LEN,
 };
 
 struct print_arg {
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index 1bd593b..544509c 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -221,6 +221,7 @@ static void define_event_symbols(struct event_format *event,
 		break;
 	case PRINT_BSTRING:
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 	case PRINT_STRING:
 	case PRINT_BITMASK:
 		break;
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index ace2484..aa9e125 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -251,6 +251,7 @@ static void define_event_symbols(struct event_format *event,
 		/* gcc warns for these? */
 	case PRINT_BSTRING:
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 	case PRINT_FUNC:
 	case PRINT_BITMASK:
 		/* we should warn... */
-- 
2.1.0

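The field layout that the two eval_num_arg() cases rely on can be shown
in isolation. The helpers below are a small sketch with hypothetical
names (demo_data_loc_offset() and demo_data_loc_len() do not exist in
libtraceevent); they assume the 32 bit field has already been read into
host byte order, which is what pevent_read_number() does in the patch.

```c
#include <stdint.h>

/* Bottom half of the 32 bit field: offset of the dynamic data. */
static inline unsigned int demo_data_loc_offset(uint32_t data_loc)
{
	return data_loc & 0xffff;
}

/* Top half of the 32 bit field: total allocated length of the array. */
static inline unsigned int demo_data_loc_len(uint32_t data_loc)
{
	return data_loc >> 16;
}
```

PRINT_DYNAMIC_ARRAY uses the first form to locate the data, while the new
PRINT_DYNAMIC_ARRAY_LEN case uses the second form to report the length.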


* Re: [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected
  2015-08-29  4:21 ` [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected Wang Nan
@ 2015-08-31 19:20   ` Arnaldo Carvalho de Melo
  2015-09-01 10:37     ` Wangnan (F)
  2015-09-01 10:38     ` Jiri Olsa
  2015-09-02  2:53   ` [PATCH] perf tools: Don't set leader if parser doesn't collect an evsel Wang Nan
  1 sibling, 2 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 19:20 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Wang Nan, mingo, ast, linux-kernel, lizefan, pi3orama,
	Masami Hiramatsu, Namhyung Kim

Em Sat, Aug 29, 2015 at 04:21:36AM +0000, Wang Nan escreveu:
> If parse_events__scanner() collects no entry, perf_evlist__last(evlist)
> is invalid, and setting cmdline_group_boundary then touches invalid
> memory.
> 
> It could happen in the current BPF implementation. See [1]. Although it
> can be fixed, for safety reasons it would be better to introduce this
> check.
> 
> Instead of checking number of entries, check data.list instead, so we
> can add dummy evsel here.

Event parsing fixes should have Jiri Olsa on the CC list, Jiri, is this
ok?

From what I can see it looks Ok, my question, just from looking at this
patch, is if it is valid to get to this point with an empty data.list,
i.e. was it ever possible and this is a bug irrespective of eBPF?

- Arnaldo
 
> [1]: http://lkml.kernel.org/n/1436445342-1402-19-git-send-email-wangnan0@huawei.com
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/r/1440742821-44548-3-git-send-email-wangnan0@huawei.com
> ---
>  tools/perf/util/parse-events.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index d826e6f..14cd7e3 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -1143,10 +1143,14 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>  		int entries = data.idx - evlist->nr_entries;
>  		struct perf_evsel *last;
>  
> +		if (!list_empty(&data.list)) {
> +			last = list_entry(data.list.prev,
> +					  struct perf_evsel, node);
> +			last->cmdline_group_boundary = true;
> +		}
> +
>  		perf_evlist__splice_list_tail(evlist, &data.list, entries);
>  		evlist->nr_groups += data.nr_groups;
> -		last = perf_evlist__last(evlist);
> -		last->cmdline_group_boundary = true;
>  
>  		return 0;
>  	}
> -- 
> 2.1.0


* Re: [PATCH 03/31] perf tools: Introduce dummy evsel
  2015-08-29  4:21 ` [PATCH 03/31] perf tools: Introduce dummy evsel Wang Nan
@ 2015-08-31 19:38   ` Arnaldo Carvalho de Melo
  2015-09-03  0:11   ` Namhyung Kim
  2015-09-06  5:55   ` [PATCH] perf tools: Allow BPF placeholder dummy events to collect --filter options Wang Nan
  2 siblings, 0 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 19:38 UTC (permalink / raw)
  To: Wang Nan
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra

Em Sat, Aug 29, 2015 at 04:21:37AM +0000, Wang Nan escreveu:
> This patch allows linking dummy evsel onto evlist as a placeholder. It
> is for following patch which allows passing BPF object using '--event
> object.o'.

Summary: this patch ended up adding too many subtle clever tricks to
achieve what it needs to accomplish, please try to clearly describe the
problem, then describe how you implemented it.
 
> Doesn't link other event selectors, if passing a BPF object file to
> '--event', nothing is linked onto evlist.

-ENOPARSE, can you rewrite the above sentence?

You mean you will segregate the events that are related to eBPF to process
them in a separate step? Like, for instance, putting them in a separate
evlist or perhaps flipping a bit like: evsel->process_me_later = true
and then avoid those and process them at some later stage?

> Instead, events described in BPF object file are probed and linked in
> a delayed manner because we want do all probing work together.
> Therefore, evsel for events in BPF object would be linked at the end
> of evlist. Which causes a small problem that, if passing '--filter'
> setting after object file, the filter option won't be correctly
> applied to those events.
 
> This patch links dummy onto evlist, so following --filter can be
> collected by the dummy evsel. For this reason dummy evsels are set to
> PERF_TYPE_TRACEPOINT.

Looks like a roundabout way of applying the --filter to the eBPF, but I
really need to read the patch then... See more below.
 
> Due to the possible existence of dummy evsels,
> perf_evlist__purge_dummy() must be called right after parse_options().
> This patch adds it to record, top, trace and stat builtin commands.
> A later patch moves it down, after the real BPF events are processed.
 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Link: http://lkml.kernel.org/r/1440742821-44548-4-git-send-email-wangnan0@huawei.com
> ---
>  tools/perf/builtin-record.c    |  2 ++
>  tools/perf/builtin-stat.c      |  1 +
>  tools/perf/builtin-top.c       |  1 +
>  tools/perf/builtin-trace.c     |  1 +
>  tools/perf/util/evlist.c       | 19 +++++++++++++++++++
>  tools/perf/util/evlist.h       |  1 +
>  tools/perf/util/evsel.c        | 32 ++++++++++++++++++++++++++++++++
>  tools/perf/util/evsel.h        |  6 ++++++
>  tools/perf/util/parse-events.c | 25 +++++++++++++++++++++----
>  9 files changed, 84 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index a660022..81829de 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1112,6 +1112,8 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
>  
>  	argc = parse_options(argc, argv, record_options, record_usage,
>  			    PARSE_OPT_STOP_AT_NON_OPTION);
> +	perf_evlist__purge_dummy(rec->evlist);
> +
>  	if (!argc && target__none(&rec->opts.target))
>  		usage_with_options(record_usage, record_options);
>  
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 7aa039b..99b62f1 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1208,6 +1208,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
>  
>  	argc = parse_options(argc, argv, options, stat_usage,
>  		PARSE_OPT_STOP_AT_NON_OPTION);
> +	perf_evlist__purge_dummy(evsel_list);
>  
>  	interval = stat_config.interval;
>  
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 8c465c8..246203b 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1198,6 +1198,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
>  	perf_config(perf_top_config, &top);
>  
>  	argc = parse_options(argc, argv, options, top_usage, 0);
> +	perf_evlist__purge_dummy(top.evlist);
>  	if (argc)
>  		usage_with_options(top_usage, options);
>  
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 4e3abba..57712b9 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -3099,6 +3099,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
>  
>  	argc = parse_options_subcommand(argc, argv, trace_options, trace_subcommands,
>  				 trace_usage, PARSE_OPT_STOP_AT_NON_OPTION);
> +	perf_evlist__purge_dummy(trace.evlist);
>  
>  	if (trace.trace_pgfaults) {
>  		trace.opts.sample_address = true;
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 8d00039..8a4e64d 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1696,3 +1696,22 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
>  
>  	tracking_evsel->tracking = true;
>  }
> +
> +void perf_evlist__purge_dummy(struct perf_evlist *evlist)

If it removes more than one event, then it should be named accordingly,
either:

  perf_evlist__purge_dummies() or perf_evlist__purge_dummy_events().

I would prefer the former.

But we already have a "dummy" event:

[root@zoo linux]# perf stat -e dummy -a sleep 10s

 Performance counter stats for 'system wide':

                 0      dummy                                                       

      10.003173114 seconds time elapsed

[root@zoo linux]#

It has some specific purpose, but then now, with your patch, we need to
figure out which dummy is which, so I think this needs rethinking.


> +{
> +	struct perf_evsel *pos, *n;
> +
> +	/*
> +	 * Remove all dummy events.
> +	 * During linking, we don't touch anything except link
> +	 * it into evlist. As a result, we don't
> +	 * need to adjust evlist->nr_entries during removal.
> +	 */
> +
> +	evlist__for_each_safe(evlist, n, pos) {
> +		if (perf_evsel__is_dummy(pos)) {
> +			list_del_init(&pos->node);
> +			perf_evsel__delete(pos);
> +		}
> +	}
> +}
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index b39a619..7f15727 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -181,6 +181,7 @@ bool perf_evlist__valid_read_format(struct perf_evlist *evlist);
>  void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
>  				   struct list_head *list,
>  				   int nr_entries);
> +void perf_evlist__purge_dummy(struct perf_evlist *evlist);
>  
>  static inline struct perf_evsel *perf_evlist__first(struct perf_evlist *evlist)
>  {
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index bac25f4..01267f4 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -213,6 +213,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
>  	evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
>  	perf_evsel__calc_id_pos(evsel);
>  	evsel->cmdline_group_boundary = false;
> +	evsel->is_dummy = false;
>  }
>  
>  struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
> @@ -225,6 +226,37 @@ struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
>  	return evsel;
>  }
>  
> +struct perf_evsel *perf_evsel__new_dummy(const char *name)
> +{
> +	struct perf_evsel *evsel = zalloc(perf_evsel__object.size);
> +
> +	if (!evsel)
> +		return NULL;
> +
> +	/*
> +	 * Don't need call perf_evsel__init() for dummy evsel.
> +	 * Keep it simple.
> +	 */
> +	evsel->name = strdup(name);
> +	if (!evsel->name)
> +		goto out_free;
> +
> +	INIT_LIST_HEAD(&evsel->node);
> +	INIT_LIST_HEAD(&evsel->config_terms);
> +
> +	evsel->cmdline_group_boundary = false;
> +	/*
> +	 * Set dummy evsel as TRACEPOINT event so it can collect filter
> +	 * options.
> +	 */
> +	evsel->attr.type = PERF_TYPE_TRACEPOINT;
> +	evsel->is_dummy = true;
> +	return evsel;
> +out_free:
> +	free(evsel);
> +	return NULL;
> +}
> +
>  struct perf_evsel *perf_evsel__newtp_idx(const char *sys, const char *name, int idx)
>  {
>  	struct perf_evsel *evsel = zalloc(perf_evsel__object.size);
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 298e6bb..0b8e47d 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -118,6 +118,7 @@ struct perf_evsel {
>  	struct perf_evsel	*leader;
>  	char			*group_name;
>  	bool			cmdline_group_boundary;
> +	bool			is_dummy;
>  	struct list_head	config_terms;
>  };
>  
> @@ -153,6 +154,11 @@ int perf_evsel__object_config(size_t object_size,
>  			      void (*fini)(struct perf_evsel *evsel));
>  
>  struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx);
> +struct perf_evsel *perf_evsel__new_dummy(const char *name);
> +static inline bool perf_evsel__is_dummy(struct perf_evsel *evsel)
> +{
> +	return evsel->is_dummy;
> +}
>  
>  static inline struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr)
>  {
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 14cd7e3..71d91fb 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -1141,7 +1141,7 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>  	perf_pmu__parse_cleanup();
>  	if (!ret) {
>  		int entries = data.idx - evlist->nr_entries;
> -		struct perf_evsel *last;
> +		struct perf_evsel *last = NULL;
>  
>  		if (!list_empty(&data.list)) {
>  			last = list_entry(data.list.prev,
> @@ -1149,8 +1149,25 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>  			last->cmdline_group_boundary = true;
>  		}
>  
> -		perf_evlist__splice_list_tail(evlist, &data.list, entries);
> -		evlist->nr_groups += data.nr_groups;
> +		if (last && perf_evsel__is_dummy(last)) {
> +			if (!list_is_singular(&data.list)) {
> +				parse_events_evlist_error(&data, 0,
> +					"Dummy evsel error: not on a singular list");
> +				return -1;
> +			}
> +			/*
> +			 * We are introducing a dummy event. Don't touch
> +			 * anything, just link it.

What is the advantage of "just linking it"? What do we achieve by that?
You said what you want to avoid, i.e. "alerting" (altering)
evlist->nr_entries, but why is that important, and which part do you
want to reuse?

> +			 * Don't use perf_evlist__splice_list_tail() since
> +			 * it alerts evlist->nr_entries, which affect header
> +			 * of resulting perf.data.
> +			 */
> +			list_splice_tail(&data.list, &evlist->entries);
> +		} else {
> +			perf_evlist__splice_list_tail(evlist, &data.list, entries);
> +			evlist->nr_groups += data.nr_groups;
> +		}
>  
>  		return 0;
>  	}
> @@ -1256,7 +1273,7 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
>  	struct perf_evsel *last = NULL;
>  	int err;
>  
> -	if (evlist->nr_entries > 0)
> +	if (!list_empty(&evlist->entries))

So here is part of that clever trick: evlist->nr_entries, which so far
we could rely on being the number of evsels in evlist->entries, can no
longer be trusted for that, argh :-\
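
To make that concrete, a minimal standalone sketch (hypothetical demo_*
names, not perf code) of what happens once a node is linked without
bumping the cached count. The list walk and the counter disagree, which
is why the emptiness check has to look at the list itself:

```c
#include <stddef.h>

/* Hypothetical stand-ins for perf's evsel/evlist; not the real structures. */
struct demo_evsel {
	struct demo_evsel *next;
};

struct demo_evlist {
	struct demo_evsel *head;
	struct demo_evsel *tail;
	int nr_entries;		/* cached count, in the spirit of evlist->nr_entries */
};

/* Link a node at the tail without touching the cached count. */
static void demo_link_tail(struct demo_evlist *evlist, struct demo_evsel *evsel)
{
	evsel->next = NULL;
	if (evlist->tail)
		evlist->tail->next = evsel;
	else
		evlist->head = evsel;
	evlist->tail = evsel;
}

/* Normal path: link the node and keep the cached count in sync. */
static void demo_evlist__add(struct demo_evlist *evlist, struct demo_evsel *evsel)
{
	demo_link_tail(evlist, evsel);
	evlist->nr_entries++;
}

/* "Just link it": the node is on the list, the cached count is left alone. */
static void demo_evlist__link_only(struct demo_evlist *evlist, struct demo_evsel *evsel)
{
	demo_link_tail(evlist, evsel);
}

/* Ground truth: walk the list and count the nodes actually linked. */
static int demo_evlist__count(const struct demo_evlist *evlist)
{
	const struct demo_evsel *pos;
	int n = 0;

	for (pos = evlist->head; pos; pos = pos->next)
		n++;
	return n;
}
```

With one node added via demo_evlist__link_only(), nr_entries stays at 0
while demo_evlist__count() reports 1, so every "nr_entries > 0" test
elsewhere would need auditing.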

>  		last = perf_evlist__last(evlist);
>  
>  	do {
> -- 
> 2.1.0


* Re: [PATCH 21/31] perf tools: Move linux/filter.h to tools/include
  2015-08-29  4:21 ` [PATCH 21/31] perf tools: Move linux/filter.h to tools/include Wang Nan
@ 2015-08-31 20:35   ` Arnaldo Carvalho de Melo
  2015-09-01 19:39   ` Arnaldo Carvalho de Melo
  2015-09-08 14:31   ` [tip:perf/core] perf tools: Copy " tip-bot for He Kuang
  2 siblings, 0 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 20:35 UTC (permalink / raw)
  To: Wang Nan
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

Em Sat, Aug 29, 2015 at 04:21:55AM +0000, Wang Nan escreveu:
> From: He Kuang <hekuang@huawei.com>
> 
> This patch moves filter.h from include/linux/kernel.h to

Does this really _move_ a file from include/linux/kernel.h to some other
place? Aren't there any other users of that file in the kernel sources?
:-)

Looking at the patch I see that it doesn't move anything, it _copies_
the file, right?

- Arnaldo

> tools/include/linux/filter.h so that other libraries can use the macros
> in it, such as libbpf, which will be introduced by later patches.
> Currently, the moved filter.h only contains the macros needed by libbpf,
> to avoid introducing too many dependencies.
> 
> MANIFEST is also updated for 'make perf-*-src-pkg'.
> 
> One change:
>   imm field of BPF_EMIT_CALL becomes ((FUNC) - BPF_FUNC_unspec) to
>   suit user space code generator.
> 
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/n/1436445342-1402-32-git-send-email-wangnan0@huawei.com
> ---
>  tools/include/linux/filter.h | 237 +++++++++++++++++++++++++++++++++++++++++++
>  tools/perf/MANIFEST          |   1 +
>  2 files changed, 238 insertions(+)
>  create mode 100644 tools/include/linux/filter.h
> 
> diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h
> new file mode 100644
> index 0000000..11d2b1c
> --- /dev/null
> +++ b/tools/include/linux/filter.h
> @@ -0,0 +1,237 @@
> +/*
> + * Linux Socket Filter Data Structures
> + */
> +#ifndef __TOOLS_LINUX_FILTER_H
> +#define __TOOLS_LINUX_FILTER_H
> +
> +#include <linux/bpf.h>
> +
> +/* ArgX, context and stack frame pointer register positions. Note,
> + * Arg1, Arg2, Arg3, etc are used as argument mappings of function
> + * calls in BPF_CALL instruction.
> + */
> +#define BPF_REG_ARG1	BPF_REG_1
> +#define BPF_REG_ARG2	BPF_REG_2
> +#define BPF_REG_ARG3	BPF_REG_3
> +#define BPF_REG_ARG4	BPF_REG_4
> +#define BPF_REG_ARG5	BPF_REG_5
> +#define BPF_REG_CTX	BPF_REG_6
> +#define BPF_REG_FP	BPF_REG_10
> +
> +/* Additional register mappings for converted user programs. */
> +#define BPF_REG_A	BPF_REG_0
> +#define BPF_REG_X	BPF_REG_7
> +#define BPF_REG_TMP	BPF_REG_8
> +
> +/* BPF program can access up to 512 bytes of stack space. */
> +#define MAX_BPF_STACK	512
> +
> +/* Helper macros for filter block array initializers. */
> +
> +/* ALU ops on registers, bpf_add|sub|...: dst_reg += src_reg */
> +
> +#define BPF_ALU64_REG(OP, DST, SRC)				\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU64 | BPF_OP(OP) | BPF_X,	\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = 0,					\
> +		.imm   = 0 })
> +
> +#define BPF_ALU32_REG(OP, DST, SRC)				\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU | BPF_OP(OP) | BPF_X,		\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = 0,					\
> +		.imm   = 0 })
> +
> +/* ALU ops on immediates, bpf_add|sub|...: dst_reg += imm32 */
> +
> +#define BPF_ALU64_IMM(OP, DST, IMM)				\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU64 | BPF_OP(OP) | BPF_K,	\
> +		.dst_reg = DST,					\
> +		.src_reg = 0,					\
> +		.off   = 0,					\
> +		.imm   = IMM })
> +
> +#define BPF_ALU32_IMM(OP, DST, IMM)				\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU | BPF_OP(OP) | BPF_K,		\
> +		.dst_reg = DST,					\
> +		.src_reg = 0,					\
> +		.off   = 0,					\
> +		.imm   = IMM })
> +
> +/* Endianness conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
> +
> +#define BPF_ENDIAN(TYPE, DST, LEN)				\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU | BPF_END | BPF_SRC(TYPE),	\
> +		.dst_reg = DST,					\
> +		.src_reg = 0,					\
> +		.off   = 0,					\
> +		.imm   = LEN })
> +
> +/* Short form of mov, dst_reg = src_reg */
> +
> +#define BPF_MOV64_REG(DST, SRC)					\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU64 | BPF_MOV | BPF_X,		\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = 0,					\
> +		.imm   = 0 })
> +
> +#define BPF_MOV32_REG(DST, SRC)					\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU | BPF_MOV | BPF_X,		\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = 0,					\
> +		.imm   = 0 })
> +
> +/* Short form of mov, dst_reg = imm32 */
> +
> +#define BPF_MOV64_IMM(DST, IMM)					\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU64 | BPF_MOV | BPF_K,		\
> +		.dst_reg = DST,					\
> +		.src_reg = 0,					\
> +		.off   = 0,					\
> +		.imm   = IMM })
> +
> +#define BPF_MOV32_IMM(DST, IMM)					\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU | BPF_MOV | BPF_K,		\
> +		.dst_reg = DST,					\
> +		.src_reg = 0,					\
> +		.off   = 0,					\
> +		.imm   = IMM })
> +
> +/* Short form of mov based on type,
> + * BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32
> + */
> +
> +#define BPF_MOV64_RAW(TYPE, DST, SRC, IMM)			\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU64 | BPF_MOV | BPF_SRC(TYPE),	\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = 0,					\
> +		.imm   = IMM })
> +
> +#define BPF_MOV32_RAW(TYPE, DST, SRC, IMM)			\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ALU | BPF_MOV | BPF_SRC(TYPE),	\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = 0,					\
> +		.imm   = IMM })
> +
> +/* Direct packet access, R0 = *(uint *) (skb->data + imm32) */
> +
> +#define BPF_LD_ABS(SIZE, IMM)					\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_LD | BPF_SIZE(SIZE) | BPF_ABS,	\
> +		.dst_reg = 0,					\
> +		.src_reg = 0,					\
> +		.off   = 0,					\
> +		.imm   = IMM })
> +
> +/* Indirect packet access, R0 = *(uint *) (skb->data + src_reg + imm32) */
> +
> +#define BPF_LD_IND(SIZE, SRC, IMM)				\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_LD | BPF_SIZE(SIZE) | BPF_IND,	\
> +		.dst_reg = 0,					\
> +		.src_reg = SRC,					\
> +		.off   = 0,					\
> +		.imm   = IMM })
> +
> +/* Memory load, dst_reg = *(uint *) (src_reg + off16) */
> +
> +#define BPF_LDX_MEM(SIZE, DST, SRC, OFF)			\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_LDX | BPF_SIZE(SIZE) | BPF_MEM,	\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = OFF,					\
> +		.imm   = 0 })
> +
> +/* Memory store, *(uint *) (dst_reg + off16) = src_reg */
> +
> +#define BPF_STX_MEM(SIZE, DST, SRC, OFF)			\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_STX | BPF_SIZE(SIZE) | BPF_MEM,	\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = OFF,					\
> +		.imm   = 0 })
> +
> +/* Memory store, *(uint *) (dst_reg + off16) = imm32 */
> +
> +#define BPF_ST_MEM(SIZE, DST, OFF, IMM)				\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_ST | BPF_SIZE(SIZE) | BPF_MEM,	\
> +		.dst_reg = DST,					\
> +		.src_reg = 0,					\
> +		.off   = OFF,					\
> +		.imm   = IMM })
> +
> +/* Conditional jumps against registers,
> + * if (dst_reg 'op' src_reg) goto pc + off16
> + */
> +
> +#define BPF_JMP_REG(OP, DST, SRC, OFF)				\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_JMP | BPF_OP(OP) | BPF_X,		\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = OFF,					\
> +		.imm   = 0 })
> +
> +/* Conditional jumps against immediates,
> + * if (dst_reg 'op' imm32) goto pc + off16
> + */
> +
> +#define BPF_JMP_IMM(OP, DST, IMM, OFF)				\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_JMP | BPF_OP(OP) | BPF_K,		\
> +		.dst_reg = DST,					\
> +		.src_reg = 0,					\
> +		.off   = OFF,					\
> +		.imm   = IMM })
> +
> +/* Function call */
> +
> +#define BPF_EMIT_CALL(FUNC)					\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_JMP | BPF_CALL,			\
> +		.dst_reg = 0,					\
> +		.src_reg = 0,					\
> +		.off   = 0,					\
> +		.imm   = ((FUNC) - BPF_FUNC_unspec) })
> +
> +/* Raw code statement block */
> +
> +#define BPF_RAW_INSN(CODE, DST, SRC, OFF, IMM)			\
> +	((struct bpf_insn) {					\
> +		.code  = CODE,					\
> +		.dst_reg = DST,					\
> +		.src_reg = SRC,					\
> +		.off   = OFF,					\
> +		.imm   = IMM })
> +
> +/* Program exit */
> +
> +#define BPF_EXIT_INSN()						\
> +	((struct bpf_insn) {					\
> +		.code  = BPF_JMP | BPF_EXIT,			\
> +		.dst_reg = 0,					\
> +		.src_reg = 0,					\
> +		.off   = 0,					\
> +		.imm   = 0 })
> +
> +#endif /* __TOOLS_LINUX_FILTER_H */
> diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
> index 56fe0c9..14e8b98 100644
> --- a/tools/perf/MANIFEST
> +++ b/tools/perf/MANIFEST
> @@ -42,6 +42,7 @@ tools/include/asm-generic/bitops.h
>  tools/include/linux/atomic.h
>  tools/include/linux/bitops.h
>  tools/include/linux/compiler.h
> +tools/include/linux/filter.h
>  tools/include/linux/hash.h
>  tools/include/linux/kernel.h
>  tools/include/linux/list.h
> -- 
> 2.1.0
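
For reference, a small standalone sketch of how initializer macros of
this kind are typically used to emit a program ("r0 = 0; exit"). The
demo_* names and the local struct are hypothetical stand-ins so that it
builds without <linux/bpf.h>; the opcode values are meant to mirror the
uapi BPF headers, and the bitfield layout assumes a GCC/Clang-style
compiler:

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for 'struct bpf_insn', renamed to avoid clashing with it. */
struct demo_bpf_insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate */
};

/* Opcode pieces, assumed to match the uapi definitions. */
#define DEMO_BPF_JMP	0x05
#define DEMO_BPF_ALU64	0x07
#define DEMO_BPF_K	0x00
#define DEMO_BPF_MOV	0xb0
#define DEMO_BPF_EXIT	0x90

/* Same shape as BPF_MOV64_IMM() in the patch: dst_reg = imm32 */
#define DEMO_MOV64_IMM(DST, IMM)				\
	((struct demo_bpf_insn) {				\
		.code  = DEMO_BPF_ALU64 | DEMO_BPF_MOV | DEMO_BPF_K, \
		.dst_reg = DST,					\
		.src_reg = 0,					\
		.off   = 0,					\
		.imm   = IMM })

/* Same shape as BPF_EXIT_INSN() in the patch. */
#define DEMO_EXIT_INSN()					\
	((struct demo_bpf_insn) {				\
		.code  = DEMO_BPF_JMP | DEMO_BPF_EXIT,		\
		.dst_reg = 0,					\
		.src_reg = 0,					\
		.off   = 0,					\
		.imm   = 0 })

/*
 * Emit "r0 = 0; exit" into prog. Returns the number of instructions
 * written, or 0 if the buffer holds fewer than two instructions.
 */
static size_t demo_build_return_zero(struct demo_bpf_insn *prog, size_t max)
{
	if (max < 2)
		return 0;
	prog[0] = DEMO_MOV64_IMM(0, 0);
	prog[1] = DEMO_EXIT_INSN();
	return 2;
}
```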


* Re: [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches
  2015-08-29  4:21 ` [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches Wang Nan
@ 2015-08-31 20:39   ` Arnaldo Carvalho de Melo
  2015-09-01  6:59   ` Wang Nan
  1 sibling, 0 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 20:39 UTC (permalink / raw)
  To: Wang Nan
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

Em Sat, Aug 29, 2015 at 04:21:56AM +0000, Wang Nan escreveu:
> If both LIBBPF and DWARF are detected, it is possible to create a prologue
> for eBPF programs to help them access kernel data. HAVE_BPF_PROLOGUE
> and CONFIG_BPF_PROLOGUE are added as flags for this feature.
> 
> PERF_HAVE_ARCH_GET_REG_OFFSET indicates an architecture supports
> converting name of a register to its offset in 'struct pt_regs'.
> Without this support, BPF_PROLOGUE should be turned off.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/n/1436445342-1402-33-git-send-email-wangnan0@huawei.com
> ---
>  tools/perf/config/Makefile           | 12 ++++++++++++
>  tools/perf/util/include/dwarf-regs.h |  7 +++++++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
> index 38a4144..d46765b7 100644
> --- a/tools/perf/config/Makefile
> +++ b/tools/perf/config/Makefile
> @@ -314,6 +314,18 @@ ifndef NO_LIBELF
>        CFLAGS += -DHAVE_LIBBPF_SUPPORT
>        $(call detected,CONFIG_LIBBPF)
>      endif
> +
> +    ifndef NO_DWARF
> +      ifneq ($(origin PERF_HAVE_ARCH_GET_REG_INFO), undefined)
> +        CFLAGS += -DHAVE_BPF_PROLOGUE
> +        $(call detected,CONFIG_BPF_PROLOGUE)
> +      else
> +        msg := $(warning BPF prologue is not supported by architecture $(ARCH));

Shouldn't this be replaced with something like:

        msg := $(warning BPF prologue is not supported by architecture $(ARCH), missing ARCH_GET_REG_INFO);

Or even in lowercase?

> +      endif
> +    else
> +      msg := $(warning DWARF support is off, BPF prologue is disabled);
> +    endif
> +
>    endif # NO_LIBBPF
>  endif # NO_LIBELF
>  
> diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
> index 8f14965..3dda083 100644
> --- a/tools/perf/util/include/dwarf-regs.h
> +++ b/tools/perf/util/include/dwarf-regs.h
> @@ -5,4 +5,11 @@
>  const char *get_arch_regstr(unsigned int n);
>  #endif

Shouldn't this test against PERF_HAVE_ARCH_GET_REG_INFO instead? I.e.
is arch_get_reg_info() only allowed to work with eBPF? I guess not,
right?
  
> +#ifdef HAVE_BPF_PROLOGUE
> +/*
> + * Arch should support fetching the offset of a register in pt_regs
> + * by its name.
> + */
> +int arch_get_reg_info(const char *name, int *offset);
> +#endif
>  #endif
> -- 
> 2.1.0


* Re: [PATCH 23/31] perf tools: Introduce arch_get_reg_info() for x86
  2015-08-29  4:21 ` [PATCH 23/31] perf tools: Introduce arch_get_reg_info() for x86 Wang Nan
@ 2015-08-31 20:43   ` Arnaldo Carvalho de Melo
  2015-09-01  2:39     ` Wangnan (F)
  0 siblings, 1 reply; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 20:43 UTC (permalink / raw)
  To: Wang Nan
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

Em Sat, Aug 29, 2015 at 04:21:57AM +0000, Wang Nan escreveu:
> From: He Kuang <hekuang@huawei.com>
> 
> arch_get_reg_info() is a helper function which converts register name
> like "%rax" to offset of a register in 'struct pt_regs', which is
> required by BPF prologue generator.

Is this something like:

/* Query offset/name of register from its name/offset */
extern int regs_query_register_offset(const char *name);

in ptrace? Can't we reuse that name and even code?

Is that what was done here, with only a rename?

- Arnaldo

> This patch replaces original string table by a 'struct reg_info' table,
> which records offset of registers according to its name.
> 
> For x86, since there are two sub-archs (x86_32 and x86_64) but we can
> only get pt_regs for the arch we are currently on, this patch fills
> offset with '-1' for another sub-arch. This introduces a limitation to
> perf prologue that, we are unable to generate prologue on a x86_32
> compiled perf for BPF programs targeted on x86_64 kernel. This
> limitation is acceptable, because this is a very rare usecase.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/n/1436445342-1402-34-git-send-email-wangnan0@huawei.com
> ---
>  tools/perf/arch/x86/Makefile          |   1 +
>  tools/perf/arch/x86/util/Build        |   2 +
>  tools/perf/arch/x86/util/dwarf-regs.c | 104 ++++++++++++++++++++++++----------
>  3 files changed, 78 insertions(+), 29 deletions(-)
> 
> diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
> index 21322e0..a84a6f6f 100644
> --- a/tools/perf/arch/x86/Makefile
> +++ b/tools/perf/arch/x86/Makefile
> @@ -2,3 +2,4 @@ ifndef NO_DWARF
>  PERF_HAVE_DWARF_REGS := 1
>  endif
>  HAVE_KVM_STAT_SUPPORT := 1
> +PERF_HAVE_ARCH_GET_REG_INFO := 1
> diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
> index 2c55e1b..09429f6 100644
> --- a/tools/perf/arch/x86/util/Build
> +++ b/tools/perf/arch/x86/util/Build
> @@ -3,6 +3,8 @@ libperf-y += tsc.o
>  libperf-y += pmu.o
>  libperf-y += kvm-stat.o
>  
> +# BPF_PROLOGUE also need dwarf-regs.o. However, if CONFIG_BPF_PROLOGUE
> +# is true, CONFIG_DWARF must true.
>  libperf-$(CONFIG_DWARF) += dwarf-regs.o
>  
>  libperf-$(CONFIG_LIBUNWIND)          += unwind-libunwind.o
> diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
> index be22dd4..9928caf 100644
> --- a/tools/perf/arch/x86/util/dwarf-regs.c
> +++ b/tools/perf/arch/x86/util/dwarf-regs.c
> @@ -22,44 +22,67 @@
>  
>  #include <stddef.h>
>  #include <dwarf-regs.h>
> +#include <string.h>
> +#include <linux/ptrace.h>
> +#include <linux/kernel.h> /* for offsetof */
> +#include <util/bpf-loader.h>
> +
> +struct reg_info {
> +	const char	*name;		/* Reg string in debuginfo      */
> +	int		offset;		/* Reg offset in struct pt_regs */
> +};
>  
>  /*
>   * Generic dwarf analysis helpers
>   */
> -
> +/*
> + * x86_64 compiling can't access pt_regs for x86_32, so fill offset
> + * with -1.
> + */
> +#ifdef __x86_64__
> +# define REG_INFO(n, f) { .name = n, .offset = -1, }
> +#else
> +# define REG_INFO(n, f) { .name = n, .offset = offsetof(struct pt_regs, f), }
> +#endif
>  #define X86_32_MAX_REGS 8
> -const char *x86_32_regs_table[X86_32_MAX_REGS] = {
> -	"%ax",
> -	"%cx",
> -	"%dx",
> -	"%bx",
> -	"$stack",	/* Stack address instead of %sp */
> -	"%bp",
> -	"%si",
> -	"%di",
> +
> +struct reg_info x86_32_regs_table[X86_32_MAX_REGS] = {
> +	REG_INFO("%ax", eax),
> +	REG_INFO("%cx", ecx),
> +	REG_INFO("%dx", edx),
> +	REG_INFO("%bx", ebx),
> +	REG_INFO("$stack", esp),	/* Stack address instead of %sp */
> +	REG_INFO("%bp", ebp),
> +	REG_INFO("%si", esi),
> +	REG_INFO("%di", edi),
>  };
>  
> +#undef REG_INFO
> +#ifdef __x86_64__
> +# define REG_INFO(n, f) { .name = n, .offset = offsetof(struct pt_regs, f), }
> +#else
> +# define REG_INFO(n, f) { .name = n, .offset = -1, }
> +#endif
>  #define X86_64_MAX_REGS 16
> -const char *x86_64_regs_table[X86_64_MAX_REGS] = {
> -	"%ax",
> -	"%dx",
> -	"%cx",
> -	"%bx",
> -	"%si",
> -	"%di",
> -	"%bp",
> -	"%sp",
> -	"%r8",
> -	"%r9",
> -	"%r10",
> -	"%r11",
> -	"%r12",
> -	"%r13",
> -	"%r14",
> -	"%r15",
> +struct reg_info x86_64_regs_table[X86_64_MAX_REGS] = {
> +	REG_INFO("%ax",		rax),
> +	REG_INFO("%dx",		rdx),
> +	REG_INFO("%cx",		rcx),
> +	REG_INFO("%bx",		rbx),
> +	REG_INFO("%si",		rsi),
> +	REG_INFO("%di",		rdi),
> +	REG_INFO("%bp",		rbp),
> +	REG_INFO("%sp",		rsp),
> +	REG_INFO("%r8",		r8),
> +	REG_INFO("%r9",		r9),
> +	REG_INFO("%r10",	r10),
> +	REG_INFO("%r11",	r11),
> +	REG_INFO("%r12",	r12),
> +	REG_INFO("%r13",	r13),
> +	REG_INFO("%r14",	r14),
> +	REG_INFO("%r15",	r15),
>  };
>  
> -/* TODO: switching by dwarf address size */
>  #ifdef __x86_64__
>  #define ARCH_MAX_REGS X86_64_MAX_REGS
>  #define arch_regs_table x86_64_regs_table
> @@ -71,5 +94,28 @@ const char *x86_64_regs_table[X86_64_MAX_REGS] = {
>  /* Return architecture dependent register string (for kprobe-tracer) */
>  const char *get_arch_regstr(unsigned int n)
>  {
> -	return (n <= ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
> +	return (n <= ARCH_MAX_REGS) ? arch_regs_table[n].name : NULL;
>  }
> +
> +#ifdef HAVE_BPF_PROLOGUE
> +int arch_get_reg_info(const char *name, int *offset)
> +{
> +	int i;
> +	struct reg_info *info;
> +
> +	if (!name || !offset)
> +		return -1;
> +
> +	for (i = 0; i < ARCH_MAX_REGS; i++) {
> +		info = &arch_regs_table[i];
> +		if (strcmp(info->name, name) == 0) {
> +			if (info->offset < 0)
> +				return -1;
> +			*offset = info->offset;
> +			return 0;
> +		}
> +	}
> +
> +	return -1;
> +}
> +#endif
> -- 
> 2.1.0


* Re: [PATCH 23/31] perf tools: Introduce arch_get_reg_info() for x86
  2015-08-31 20:43   ` Arnaldo Carvalho de Melo
@ 2015-09-01  2:39     ` Wangnan (F)
  0 siblings, 0 replies; 94+ messages in thread
From: Wangnan (F) @ 2015-09-01  2:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra



On 2015/9/1 4:43, Arnaldo Carvalho de Melo wrote:
> Em Sat, Aug 29, 2015 at 04:21:57AM +0000, Wang Nan escreveu:
>> From: He Kuang <hekuang@huawei.com>
>>
>> arch_get_reg_info() is a helper function which converts register name
>> like "%rax" to offset of a register in 'struct pt_regs', which is
>> required by BPF prologue generator.
> Is this something like:
>
> /* Query offset/name of register from its name/offset */
> extern int regs_query_register_offset(const char *name);
>
> in ptrace? Can't we reuse that name and even code?

Unfortunately we can't reuse its code, because pt_regs is defined
differently on the user and kernel sides.

In arch/x86/kernel/ptrace.c we have:

struct pt_regs_offset {
         const char *name;
         int offset;
};

#define REG_OFFSET_NAME(r) {.name = #r, .offset = offsetof(struct pt_regs, r)}
#define REG_OFFSET_END {.name = NULL, .offset = 0}

static const struct pt_regs_offset regoffset_table[] = {
#ifdef CONFIG_X86_64
...
     REG_OFFSET_NAME(r15),
     REG_OFFSET_NAME(r14),
     REG_OFFSET_NAME(r13),
...

The definition of REG_OFFSET_NAME relies on the field name and the string
name of a register being identical. This is true for the kernel, but not
for userspace.

For example, for x86_64, 'struct pt_regs' is defined in
arch/x86/include/asm/ptrace.h for the kernel, and the register names are
'ax, cx, dx, si, di ...'. In contrast, for userspace it is defined in
arch/x86/include/uapi/asm/ptrace.h, and the register names become
'rax, rcx, rdx, rsi, rdi ...'.

Since the logic of regs_query_register_offset() is very simple, changing
REG_OFFSET_NAME() makes it a totally different function.

But yes, let's reuse its name. And it may be worth considering reusing
its code for other archs.
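
A minimal sketch of what such a changed table could look like in
userspace, assuming a hypothetical 'struct demo_pt_regs' and demo_* names
that are not in the patch. The macro takes the debuginfo string and the
field name separately, so the two no longer have to match:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical register layout standing in for the uapi 'struct pt_regs'. */
struct demo_pt_regs {
	unsigned long rax;
	unsigned long rcx;
	unsigned long rdx;
};

struct demo_reg_offset {
	const char *name;	/* string seen in debuginfo, e.g. "%ax" */
	int offset;		/* offset of the matching field */
};

/* Two arguments: the string and the field name are decoupled. */
#define DEMO_REG_OFFSET_NAME(n, f) \
	{ .name = n, .offset = offsetof(struct demo_pt_regs, f) }
#define DEMO_REG_OFFSET_END { .name = NULL, .offset = 0 }

static const struct demo_reg_offset demo_regoffset_table[] = {
	DEMO_REG_OFFSET_NAME("%ax", rax),
	DEMO_REG_OFFSET_NAME("%cx", rcx),
	DEMO_REG_OFFSET_NAME("%dx", rdx),
	DEMO_REG_OFFSET_END,
};

/* Return the offset recorded for 'name', or -1 if it is not in the table. */
static int demo_regs_query_register_offset(const char *name)
{
	const struct demo_reg_offset *roff;

	for (roff = demo_regoffset_table; roff->name != NULL; roff++)
		if (strcmp(roff->name, name) == 0)
			return roff->offset;
	return -1;
}
```

The lookup loop stays the same as in the kernel helper; only the table
construction differs.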

Thank you.

> Is that what was done here, with only a rename?
>
> - Arnaldo
>
>> This patch replaces original string table by a 'struct reg_info' table,
>> which records offset of registers according to its name.
>>
>> For x86, since there are two sub-archs (x86_32 and x86_64) but we can
>> only get pt_regs for the arch we are currently on, this patch fills
>> offset with '-1' for another sub-arch. This introduces a limitation to
>> perf prologue that, we are unable to generate prologue on a x86_32
>> compiled perf for BPF programs targeted on x86_64 kernel. This
>> limitation is acceptable, because this is a very rare usecase.
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Signed-off-by: He Kuang <hekuang@huawei.com>
>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: David Ahern <dsahern@gmail.com>
>> Cc: He Kuang <hekuang@huawei.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Kaixu Xia <xiakaixu@huawei.com>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Link: http://lkml.kernel.org/n/1436445342-1402-34-git-send-email-wangnan0@huawei.com
>> ---
>>   tools/perf/arch/x86/Makefile          |   1 +
>>   tools/perf/arch/x86/util/Build        |   2 +
>>   tools/perf/arch/x86/util/dwarf-regs.c | 104 ++++++++++++++++++++++++----------
>>   3 files changed, 78 insertions(+), 29 deletions(-)
>>
>> diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
>> index 21322e0..a84a6f6f 100644
>> --- a/tools/perf/arch/x86/Makefile
>> +++ b/tools/perf/arch/x86/Makefile
>> @@ -2,3 +2,4 @@ ifndef NO_DWARF
>>   PERF_HAVE_DWARF_REGS := 1
>>   endif
>>   HAVE_KVM_STAT_SUPPORT := 1
>> +PERF_HAVE_ARCH_GET_REG_INFO := 1
>> diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
>> index 2c55e1b..09429f6 100644
>> --- a/tools/perf/arch/x86/util/Build
>> +++ b/tools/perf/arch/x86/util/Build
>> @@ -3,6 +3,8 @@ libperf-y += tsc.o
>>   libperf-y += pmu.o
>>   libperf-y += kvm-stat.o
>>   
>> +# BPF_PROLOGUE also need dwarf-regs.o. However, if CONFIG_BPF_PROLOGUE
>> +# is true, CONFIG_DWARF must true.
>>   libperf-$(CONFIG_DWARF) += dwarf-regs.o
>>   
>>   libperf-$(CONFIG_LIBUNWIND)          += unwind-libunwind.o
>> diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
>> index be22dd4..9928caf 100644
>> --- a/tools/perf/arch/x86/util/dwarf-regs.c
>> +++ b/tools/perf/arch/x86/util/dwarf-regs.c
>> @@ -22,44 +22,67 @@
>>   
>>   #include <stddef.h>
>>   #include <dwarf-regs.h>
>> +#include <string.h>
>> +#include <linux/ptrace.h>
>> +#include <linux/kernel.h> /* for offsetof */
>> +#include <util/bpf-loader.h>
>> +
>> +struct reg_info {
>> +	const char	*name;		/* Reg string in debuginfo      */
>> +	int		offset;		/* Reg offset in struct pt_regs */
>> +};
>>   
>>   /*
>>    * Generic dwarf analysis helpers
>>    */
>> -
>> +/*
>> + * x86_64 compiling can't access pt_regs for x86_32, so fill offset
>> + * with -1.
>> + */
>> +#ifdef __x86_64__
>> +# define REG_INFO(n, f) { .name = n, .offset = -1, }
>> +#else
>> +# define REG_INFO(n, f) { .name = n, .offset = offsetof(struct pt_regs, f), }
>> +#endif
>>   #define X86_32_MAX_REGS 8
>> -const char *x86_32_regs_table[X86_32_MAX_REGS] = {
>> -	"%ax",
>> -	"%cx",
>> -	"%dx",
>> -	"%bx",
>> -	"$stack",	/* Stack address instead of %sp */
>> -	"%bp",
>> -	"%si",
>> -	"%di",
>> +
>> +struct reg_info x86_32_regs_table[X86_32_MAX_REGS] = {
>> +	REG_INFO("%ax", eax),
>> +	REG_INFO("%cx", ecx),
>> +	REG_INFO("%dx", edx),
>> +	REG_INFO("%bx", ebx),
>> +	REG_INFO("$stack", esp),	/* Stack address instead of %sp */
>> +	REG_INFO("%bp", ebp),
>> +	REG_INFO("%si", esi),
>> +	REG_INFO("%di", edi),
>>   };
>>   
>> +#undef REG_INFO
>> +#ifdef __x86_64__
>> +# define REG_INFO(n, f) { .name = n, .offset = offsetof(struct pt_regs, f), }
>> +#else
>> +# define REG_INFO(n, f) { .name = n, .offset = -1, }
>> +#endif
>>   #define X86_64_MAX_REGS 16
>> -const char *x86_64_regs_table[X86_64_MAX_REGS] = {
>> -	"%ax",
>> -	"%dx",
>> -	"%cx",
>> -	"%bx",
>> -	"%si",
>> -	"%di",
>> -	"%bp",
>> -	"%sp",
>> -	"%r8",
>> -	"%r9",
>> -	"%r10",
>> -	"%r11",
>> -	"%r12",
>> -	"%r13",
>> -	"%r14",
>> -	"%r15",
>> +struct reg_info x86_64_regs_table[X86_64_MAX_REGS] = {
>> +	REG_INFO("%ax",		rax),
>> +	REG_INFO("%dx",		rdx),
>> +	REG_INFO("%cx",		rcx),
>> +	REG_INFO("%bx",		rbx),
>> +	REG_INFO("%si",		rsi),
>> +	REG_INFO("%di",		rdi),
>> +	REG_INFO("%bp",		rbp),
>> +	REG_INFO("%sp",		rsp),
>> +	REG_INFO("%r8",		r8),
>> +	REG_INFO("%r9",		r9),
>> +	REG_INFO("%r10",	r10),
>> +	REG_INFO("%r11",	r11),
>> +	REG_INFO("%r12",	r12),
>> +	REG_INFO("%r13",	r13),
>> +	REG_INFO("%r14",	r14),
>> +	REG_INFO("%r15",	r15),
>>   };
>>   
>> -/* TODO: switching by dwarf address size */
>>   #ifdef __x86_64__
>>   #define ARCH_MAX_REGS X86_64_MAX_REGS
>>   #define arch_regs_table x86_64_regs_table
>> @@ -71,5 +94,28 @@ const char *x86_64_regs_table[X86_64_MAX_REGS] = {
>>   /* Return architecture dependent register string (for kprobe-tracer) */
>>   const char *get_arch_regstr(unsigned int n)
>>   {
>> -	return (n <= ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
>> +	return (n <= ARCH_MAX_REGS) ? arch_regs_table[n].name : NULL;
>>   }
>> +
>> +#ifdef HAVE_BPF_PROLOGUE
>> +int arch_get_reg_info(const char *name, int *offset)
>> +{
>> +	int i;
>> +	struct reg_info *info;
>> +
>> +	if (!name || !offset)
>> +		return -1;
>> +
>> +	for (i = 0; i < ARCH_MAX_REGS; i++) {
>> +		info = &arch_regs_table[i];
>> +		if (strcmp(info->name, name) == 0) {
>> +			if (info->offset < 0)
>> +				return -1;
>> +			*offset = info->offset;
>> +			return 0;
>> +		}
>> +	}
>> +
>> +	return -1;
>> +}
>> +#endif
>> -- 
>> 2.1.0



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 17/31] perf tests: Enforce LLVM test for BPF test
  2015-08-29  4:21 ` [PATCH 17/31] perf tests: Enforce LLVM test for BPF test Wang Nan
@ 2015-09-01  5:59   ` Wangnan (F)
  0 siblings, 0 replies; 94+ messages in thread
From: Wangnan (F) @ 2015-09-01  5:59 UTC (permalink / raw)
  To: acme, mingo, ast
  Cc: linux-kernel, lizefan, pi3orama, He Kuang, Brendan Gregg,
	Daniel Borkmann, David Ahern, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra



On 2015/8/29 12:21, Wang Nan wrote:
> diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
> index c1518bd..8c98409 100644
> --- a/tools/perf/tests/Build
> +++ b/tools/perf/tests/Build
> @@ -32,7 +32,14 @@ perf-y += sample-parsing.o
>   perf-y += parse-no-sample-id-all.o
>   perf-y += kmod-path.o
>   perf-y += thread-map.o
> -perf-y += llvm.o
> +perf-y += llvm.o llvm-src.o
> +
> +$(OUTPUT)tests/llvm-src.c: tests/bpf-script-example.c
This rule requires a $(call rule_mkdir). It will be fixed in the next pull request.
> +	$(Q)echo '#include <tests/llvm.h>' > $@
> +	$(Q)echo 'const char test_llvm__bpf_prog[] =' >> $@
> +	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
> +	$(Q)echo ';' >> $@
> +
>   
>   perf-$(CONFIG_X86) += perf-time-to-tsc.o
>   
>



^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches
  2015-08-29  4:21 ` [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches Wang Nan
  2015-08-31 20:39   ` Arnaldo Carvalho de Melo
@ 2015-09-01  6:59   ` Wang Nan
  2015-09-01  6:59     ` [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86 Wang Nan
  2015-09-02 14:08     ` [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches Namhyung Kim
  1 sibling, 2 replies; 94+ messages in thread
From: Wang Nan @ 2015-09-01  6:59 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, Wang Nan, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	Zefan Li, pi3orama, Arnaldo Carvalho de Melo

If both LIBBPF and DWARF are detected, it is possible to create a prologue
for eBPF programs to help them access kernel data. HAVE_BPF_PROLOGUE
and CONFIG_BPF_PROLOGUE are added as flags for this feature.

PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET indicates that an architecture
supports converting the name of a register to its offset in
'struct pt_regs'. Without this support, BPF_PROLOGUE should be turned off.

HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET is introduced as the CFLAGS
counterpart of PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET.
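
As a rough illustration of how C code might consume these flags, here is a
minimal sketch. demo_reg_offset() is a hypothetical caller invented for this
example and is not part of the patch. It assumes, as the Makefile logic in
this patch arranges, that HAVE_BPF_PROLOGUE is only defined when
HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET is defined too.

```c
#include <errno.h>

#ifdef HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
/* Provided by the arch code when the arch Makefile sets the flag. */
int regs_query_register_offset(const char *name);
#endif

/*
 * Hypothetical caller (not from the patch): resolve a register name to
 * its pt_regs offset, or report that the feature was compiled out.
 */
static int demo_reg_offset(const char *name)
{
#ifdef HAVE_BPF_PROLOGUE
	/* Assumes HAVE_BPF_PROLOGUE implies the arch flag above. */
	return regs_query_register_offset(name);
#else
	(void)name;
	return -ENOTSUP;
#endif
}
```

Built without either -D flag, the helper compiles to the "unsupported"
branch, which is the behaviour one would expect on an architecture that
does not set PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET.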

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/1436445342-1402-33-git-send-email-wangnan0@huawei.com
[wangnan:
 - Introduce new CFLAGS to control BPF prologue and arch_get_reg_info()
   separately.
 - Rename ARCH_GET_REG_INFO to ARCH_REGS_QUERY_REGISTER_OFFSET,
   arch_get_reg_info() to regs_query_register_offset(), change its API accordingly
   to make it similar to kernel's regs_query_register_offset().
]
---
 tools/perf/config/Makefile           | 17 +++++++++++++++++
 tools/perf/util/include/dwarf-regs.h |  8 ++++++++
 2 files changed, 25 insertions(+)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 38a4144..33785a1 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -110,6 +110,11 @@ FEATURE_CHECK_CFLAGS-bpf = -I. -I$(srctree)/tools/include -I$(srctree)/arch/$(AR
 # include ARCH specific config
 -include $(src-perf)/arch/$(ARCH)/Makefile
 
+ifneq ($(origin PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET), undefined)
+  CFLAGS += -DHAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
+endif
+
+
 include $(src-perf)/config/utilities.mak
 
 ifeq ($(call get-executable,$(FLEX)),)
@@ -314,6 +319,18 @@ ifndef NO_LIBELF
       CFLAGS += -DHAVE_LIBBPF_SUPPORT
       $(call detected,CONFIG_LIBBPF)
     endif
+
+    ifndef NO_DWARF
+      ifneq ($(origin PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET), undefined)
+        CFLAGS += -DHAVE_BPF_PROLOGUE
+        $(call detected,CONFIG_BPF_PROLOGUE)
+      else
+        msg := $(warning BPF prologue is not supported by architecture $(ARCH), missing regs_query_register_offset());
+      endif
+    else
+      msg := $(warning DWARF support is off, BPF prologue is disabled);
+    endif
+
   endif # NO_LIBBPF
 endif # NO_LIBELF
 
diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
index 8f14965..07c644e 100644
--- a/tools/perf/util/include/dwarf-regs.h
+++ b/tools/perf/util/include/dwarf-regs.h
@@ -5,4 +5,12 @@
 const char *get_arch_regstr(unsigned int n);
 #endif
 
+#ifdef HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
+/*
+ * Arch should support fetching the offset of a register in pt_regs
+ * by its name. See kernel's regs_query_register_offset in
+ * arch/xxx/kernel/ptrace.c.
+ */
+int regs_query_register_offset(const char *name);
+#endif
 #endif
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86
  2015-09-01  6:59   ` Wang Nan
@ 2015-09-01  6:59     ` Wang Nan
  2015-09-01 11:47       ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-02 14:08     ` [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches Namhyung Kim
  1 sibling, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-09-01  6:59 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, Wang Nan, He Kuang, Alexei Starovoitov,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Zefan Li, pi3orama, Arnaldo Carvalho de Melo

regs_query_register_offset() is a helper function which converts a
register name like "%ax" to the offset of that register in 'struct pt_regs',
which is required by the BPF prologue generator. Since the function is
identical to the kernel's, reuse the code from arch/x86/kernel/ptrace.c.

The comment inside dwarf-regs.c lists the differences between this
implementation and the kernel code.

get_arch_regstr() switches to regoffset_table and the old string table
is dropped.
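
The lookup described above can be sketched in isolation as follows. This is
only an illustration: struct demo_regs is a hypothetical three-field
stand-in for 'struct pt_regs', and every demo_* name is made up for the
example. It shows one table, terminated by a NULL-name sentinel, serving
both the name-to-offset query and the index-to-name query.

```c
#include <errno.h>   /* EINVAL */
#include <stddef.h>  /* offsetof, NULL */
#include <string.h>  /* strcmp */

/* Hypothetical stand-in for struct pt_regs; the real layout is arch-defined. */
struct demo_regs {
	unsigned long ax;
	unsigned long dx;
	unsigned long cx;
};

struct demo_regs_offset {
	const char *name;
	int offset;
};

#define DEMO_REG(n, f)	{ .name = n, .offset = offsetof(struct demo_regs, f) }
#define DEMO_REG_END	{ .name = NULL, .offset = 0 }

/* Entries are kept in the order the index-based lookup should report. */
static const struct demo_regs_offset demo_table[] = {
	DEMO_REG("%ax", ax),
	DEMO_REG("%dx", dx),
	DEMO_REG("%cx", cx),
	DEMO_REG_END,
};

/* Walk the table up to the NULL-name sentinel; -EINVAL if nothing matches. */
static int demo_query_register_offset(const char *name)
{
	const struct demo_regs_offset *roff;

	for (roff = demo_table; roff->name != NULL; roff++)
		if (!strcmp(roff->name, name))
			return roff->offset;
	return -EINVAL;
}

/* Minus 1 so the sentinel is never handed out by the index lookup. */
#define DEMO_MAX_REGS ((sizeof(demo_table) / sizeof(demo_table[0])) - 1)

static const char *demo_regstr(unsigned int n)
{
	return (n < DEMO_MAX_REGS) ? demo_table[n].name : NULL;
}
```

Deriving the register count from the table size avoids keeping a separate
MAX_REGS constant in sync with the table.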

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/Makefile          |   1 +
 tools/perf/arch/x86/util/Build        |   1 +
 tools/perf/arch/x86/util/dwarf-regs.c | 122 ++++++++++++++++++++++++----------
 3 files changed, 90 insertions(+), 34 deletions(-)

diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
index 21322e0..09ba923 100644
--- a/tools/perf/arch/x86/Makefile
+++ b/tools/perf/arch/x86/Makefile
@@ -2,3 +2,4 @@ ifndef NO_DWARF
 PERF_HAVE_DWARF_REGS := 1
 endif
 HAVE_KVM_STAT_SUPPORT := 1
+PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 2c55e1b..d4d1f23 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -4,6 +4,7 @@ libperf-y += pmu.o
 libperf-y += kvm-stat.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
+libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
 
 libperf-$(CONFIG_LIBUNWIND)          += unwind-libunwind.o
 libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
index a08de0a..de5b936 100644
--- a/tools/perf/arch/x86/util/dwarf-regs.c
+++ b/tools/perf/arch/x86/util/dwarf-regs.c
@@ -21,55 +21,109 @@
  */
 
 #include <stddef.h>
+#include <errno.h> /* for EINVAL */
+#include <string.h> /* for strcmp */
+#include <linux/ptrace.h> /* for struct pt_regs */
+#include <linux/kernel.h> /* for offsetof */
 #include <dwarf-regs.h>
 
 /*
- * Generic dwarf analysis helpers
+ * See arch/x86/kernel/ptrace.c.
+ * Different from it:
+ *
+ *  - Since struct pt_regs is defined differently for user and kernel,
+ *    but we want to use 'ax, bx' instead of 'rax, rbx' (which are the
+ *    field names in user's pt_regs), we make REG_OFFSET_NAME accept
+ *    both the string name and the reg field name.
+ *
+ *  - Since accessing x86_32's pt_regs from an x86_64 build is difficult
+ *    and vice versa, we simply fill the offset with -1, so
+ *    get_arch_regstr() still works but regs_query_register_offset()
+ *    returns an error.
+ *    The only inconvenience caused by this is that we cannot
+ *    generate a BPF prologue for an x86_64 kernel if perf is built for
+ *    x86_32. This is really a rare use case.
+ *
+ *  - The order differs from the kernel's ptrace.c because get_arch_regstr()
+ *    needs the register order defined by DWARF.
  */
 
-#define X86_32_MAX_REGS 8
-const char *x86_32_regs_table[X86_32_MAX_REGS] = {
-	"%ax",
-	"%cx",
-	"%dx",
-	"%bx",
-	"$stack",	/* Stack address instead of %sp */
-	"%bp",
-	"%si",
-	"%di",
+struct pt_regs_offset {
+	const char *name;
+	int offset;
+};
+
+#define REG_OFFSET_END {.name = NULL, .offset = 0}
+
+#ifdef __x86_64__
+# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
+# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = -1}
+#else
+# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = -1}
+# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
+#endif
+
+static const struct pt_regs_offset x86_32_regoffset_table[] = {
+	REG_OFFSET_NAME_32("%ax",	eax),
+	REG_OFFSET_NAME_32("%cx",	ecx),
+	REG_OFFSET_NAME_32("%dx",	edx),
+	REG_OFFSET_NAME_32("%bx",	ebx),
+	REG_OFFSET_NAME_32("$stack",	esp),	/* Stack address instead of %sp */
+	REG_OFFSET_NAME_32("%bp",	ebp),
+	REG_OFFSET_NAME_32("%si",	esi),
+	REG_OFFSET_NAME_32("%di",	edi),
+	REG_OFFSET_END,
 };
 
-#define X86_64_MAX_REGS 16
-const char *x86_64_regs_table[X86_64_MAX_REGS] = {
-	"%ax",
-	"%dx",
-	"%cx",
-	"%bx",
-	"%si",
-	"%di",
-	"%bp",
-	"%sp",
-	"%r8",
-	"%r9",
-	"%r10",
-	"%r11",
-	"%r12",
-	"%r13",
-	"%r14",
-	"%r15",
+static const struct pt_regs_offset x86_64_regoffset_table[] = {
+	REG_OFFSET_NAME_64("%ax",	rax),
+	REG_OFFSET_NAME_64("%dx",	rdx),
+	REG_OFFSET_NAME_64("%cx",	rcx),
+	REG_OFFSET_NAME_64("%bx",	rbx),
+	REG_OFFSET_NAME_64("%si",	rsi),
+	REG_OFFSET_NAME_64("%di",	rdi),
+	REG_OFFSET_NAME_64("%bp",	rbp),
+	REG_OFFSET_NAME_64("%sp",	rsp),
+	REG_OFFSET_NAME_64("%r8",	r8),
+	REG_OFFSET_NAME_64("%r9",	r9),
+	REG_OFFSET_NAME_64("%r10",	r10),
+	REG_OFFSET_NAME_64("%r11",	r11),
+	REG_OFFSET_NAME_64("%r12",	r12),
+	REG_OFFSET_NAME_64("%r13",	r13),
+	REG_OFFSET_NAME_64("%r14",	r14),
+	REG_OFFSET_NAME_64("%r15",	r15),
+	REG_OFFSET_END,
 };
 
 /* TODO: switching by dwarf address size */
 #ifdef __x86_64__
-#define ARCH_MAX_REGS X86_64_MAX_REGS
-#define arch_regs_table x86_64_regs_table
+#define regoffset_table x86_64_regoffset_table
 #else
-#define ARCH_MAX_REGS X86_32_MAX_REGS
-#define arch_regs_table x86_32_regs_table
+#define regoffset_table x86_32_regoffset_table
 #endif
 
+/* Minus 1 for the ending REG_OFFSET_END */
+#define ARCH_MAX_REGS ((sizeof(regoffset_table) / sizeof(regoffset_table[0])) - 1)
+
 /* Return architecture dependent register string (for kprobe-tracer) */
 const char *get_arch_regstr(unsigned int n)
 {
-	return (n < ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
+	return (n < ARCH_MAX_REGS) ? regoffset_table[n].name : NULL;
+}
+
+/* Reuse code from arch/x86/kernel/ptrace.c */
+/**
+ * regs_query_register_offset() - query register offset from its name
+ * @name:	the name of a register
+ *
+ * regs_query_register_offset() returns the offset of a register in struct
+ * pt_regs from its name. If the name is invalid, this returns -EINVAL;
+ */
+int regs_query_register_offset(const char *name)
+{
+	const struct pt_regs_offset *roff;
+	for (roff = regoffset_table; roff->name != NULL; roff++)
+		if (!strcmp(roff->name, name))
+			return roff->offset;
+	return -EINVAL;
 }
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected
  2015-08-31 19:20   ` Arnaldo Carvalho de Melo
@ 2015-09-01 10:37     ` Wangnan (F)
  2015-09-01 10:38     ` Jiri Olsa
  1 sibling, 0 replies; 94+ messages in thread
From: Wangnan (F) @ 2015-09-01 10:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, Masami Hiramatsu,
	Namhyung Kim



On 2015/9/1 3:20, Arnaldo Carvalho de Melo wrote:
> Em Sat, Aug 29, 2015 at 04:21:36AM +0000, Wang Nan escreveu:
>> If parse_events__scanner() collects no entry, perf_evlist__last(evlist)
>> is invalid, so setting cmdline_group_boundary touches invalid memory.
>>
>> This could happen in the current BPF implementation. See [1]. Although it
>> can be fixed, for safety reasons it would be better to introduce this
>> check.
>>
>> Instead of checking the number of entries, check data.list, so we
>> can add a dummy evsel here.
> Event parsing fixes should have Jiri Olsa on the CC list, Jiri, is this
> ok?
>
>  From what I can see it looks Ok, my question, just from looking at this
> patch, is if it is valid to get to this point with an empty data.list,
> i.e. was it ever possible and this is a bug irrespective of eBPF?

It should not be an existing bug in perf. Other places also rely on the
list being non-empty, for example parse_events__set_leader(). Fortunately,
that cannot trigger a problem, because we don't allow a BPF object to be
wrapped in "{}" lexically ("{./aaa.o}" will be interpreted as the file
'{./aaa.o' followed by an extra '}').
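
For reference, the guard added by the patch boils down to the pattern
sketched below. This is a self-contained illustration only: the demo_*
list helpers are hypothetical stand-ins for the kernel-style list API used
in perf, and struct demo_evsel stands in for struct perf_evsel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly-linked list (hypothetical stand-in). */
struct demo_list {
	struct demo_list *prev, *next;
};

struct demo_evsel {
	struct demo_list node;
	bool cmdline_group_boundary;
};

static void demo_list_init(struct demo_list *h)
{
	h->prev = h->next = h;
}

static bool demo_list_empty(const struct demo_list *h)
{
	return h->next == h;
}

static void demo_list_add_tail(struct demo_list *n, struct demo_list *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/*
 * Mark the last collected entry, but only if the parser collected one.
 * Returns the marked entry, or NULL when the list is empty.
 */
static struct demo_evsel *demo_mark_boundary(struct demo_list *head)
{
	struct demo_evsel *last;

	if (demo_list_empty(head))
		return NULL;

	/* Recover the containing struct from its embedded list node. */
	last = (struct demo_evsel *)((char *)head->prev -
				     offsetof(struct demo_evsel, node));
	last->cmdline_group_boundary = true;
	return last;
}
```

Without the emptiness check, head->prev would point back at the list head
itself, and the computed "last" entry would not be a real evsel.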


> - Arnaldo
>   
>> [1]: http://lkml.kernel.org/n/1436445342-1402-19-git-send-email-wangnan0@huawei.com
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Link: http://lkml.kernel.org/r/1440742821-44548-3-git-send-email-wangnan0@huawei.com
>> ---
>>   tools/perf/util/parse-events.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
>> index d826e6f..14cd7e3 100644
>> --- a/tools/perf/util/parse-events.c
>> +++ b/tools/perf/util/parse-events.c
>> @@ -1143,10 +1143,14 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>>   		int entries = data.idx - evlist->nr_entries;
>>   		struct perf_evsel *last;
>>   
>> +		if (!list_empty(&data.list)) {
>> +			last = list_entry(data.list.prev,
>> +					  struct perf_evsel, node);
>> +			last->cmdline_group_boundary = true;
>> +		}
>> +
>>   		perf_evlist__splice_list_tail(evlist, &data.list, entries);
>>   		evlist->nr_groups += data.nr_groups;
>> -		last = perf_evlist__last(evlist);
>> -		last->cmdline_group_boundary = true;
>>   
>>   		return 0;
>>   	}
>> -- 
>> 2.1.0



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected
  2015-08-31 19:20   ` Arnaldo Carvalho de Melo
  2015-09-01 10:37     ` Wangnan (F)
@ 2015-09-01 10:38     ` Jiri Olsa
  2015-09-01 12:44       ` Wangnan (F)
  1 sibling, 1 reply; 94+ messages in thread
From: Jiri Olsa @ 2015-09-01 10:38 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Wang Nan, mingo, ast, linux-kernel, lizefan, pi3orama,
	Masami Hiramatsu, Namhyung Kim

On Mon, Aug 31, 2015 at 04:20:03PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Sat, Aug 29, 2015 at 04:21:36AM +0000, Wang Nan escreveu:
> > If parse_events__scanner() collects no entry, perf_evlist__last(evlist)
> > is invalid, so setting cmdline_group_boundary touches invalid memory.
> > 
> > This could happen in the current BPF implementation. See [1]. Although it
> > can be fixed, for safety reasons it would be better to introduce this
> > check.
> > 
> > Instead of checking the number of entries, check data.list, so we
> > can add a dummy evsel here.
> 
> Event parsing fixes should have Jiri Olsa on the CC list, Jiri, is this
> ok?
> 
> From what I can see it looks Ok, my question, just from looking at this
> patch, is if it is valid to get to this point with an empty data.list,
> i.e. was it ever possible and this is a bug irrespective of eBPF?

Good point. I believe it either fails or adds event(s) to the list.
I haven't checked how eBPF is connected with event parsing. Is there a
git tree I could check?

thanks,
jirka

^ permalink raw reply	[flat|nested] 94+ messages in thread

* RE: [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86
  2015-09-01  6:59     ` [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86 Wang Nan
@ 2015-09-01 11:47       ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-01 13:52         ` Wangnan (F)
  2015-09-01 14:14         ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 94+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-01 11:47 UTC (permalink / raw)
  To: 'Wang Nan', acme
  Cc: linux-kernel, He Kuang, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, Jiri Olsa, Kaixu Xia, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Zefan Li, pi3orama,
	Arnaldo Carvalho de Melo


> From: Wang Nan [mailto:wangnan0@huawei.com]
> 
> regs_query_register_offset() is a helper function which converts a
> register name like "%ax" to the offset of that register in 'struct pt_regs',
> which is required by the BPF prologue generator. Since the function is
> identical to the kernel's, reuse the code from arch/x86/kernel/ptrace.c.
> 
> The comment inside dwarf-regs.c lists the differences between this
> implementation and the kernel code.

Hmm, this also introduces duplicated code...
It might be a good time to move it into arch/x86/lib/ and
reuse it directly from the perf code.

Thank you,

> 
> get_arch_regstr() switches to regoffset_table and the old string table
> is dropped.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/arch/x86/Makefile          |   1 +
>  tools/perf/arch/x86/util/Build        |   1 +
>  tools/perf/arch/x86/util/dwarf-regs.c | 122 ++++++++++++++++++++++++----------
>  3 files changed, 90 insertions(+), 34 deletions(-)
> 
> diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
> index 21322e0..09ba923 100644
> --- a/tools/perf/arch/x86/Makefile
> +++ b/tools/perf/arch/x86/Makefile
> @@ -2,3 +2,4 @@ ifndef NO_DWARF
>  PERF_HAVE_DWARF_REGS := 1
>  endif
>  HAVE_KVM_STAT_SUPPORT := 1
> +PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
> diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
> index 2c55e1b..d4d1f23 100644
> --- a/tools/perf/arch/x86/util/Build
> +++ b/tools/perf/arch/x86/util/Build
> @@ -4,6 +4,7 @@ libperf-y += pmu.o
>  libperf-y += kvm-stat.o
> 
>  libperf-$(CONFIG_DWARF) += dwarf-regs.o
> +libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
> 
>  libperf-$(CONFIG_LIBUNWIND)          += unwind-libunwind.o
>  libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
> diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
> index a08de0a..de5b936 100644
> --- a/tools/perf/arch/x86/util/dwarf-regs.c
> +++ b/tools/perf/arch/x86/util/dwarf-regs.c
> @@ -21,55 +21,109 @@
>   */
> 
>  #include <stddef.h>
> +#include <errno.h> /* for EINVAL */
> +#include <string.h> /* for strcmp */
> +#include <linux/ptrace.h> /* for struct pt_regs */
> +#include <linux/kernel.h> /* for offsetof */
>  #include <dwarf-regs.h>
> 
>  /*
> - * Generic dwarf analysis helpers
> + * See arch/x86/kernel/ptrace.c.
> + * Different from it:
> + *
> + *  - Since struct pt_regs is defined differently for user and kernel,
> + *    but we want to use 'ax, bx' instead of 'rax, rbx' (which is struct
> + *    field name of user's pt_regs), we make REG_OFFSET_NAME to accept
> + *    both string name and reg field name.
> + *
> + *  - Since accessing x86_32's pt_regs from x86_64 building is difficult
> + *    and vise versa, we simply fill offset with -1, so
> + *    get_arch_regstr() still works but regs_query_register_offset()
> + *    returns error.
> + *    The only inconvenience caused by it now is that we are not allowed
> + *    to generate BPF prologue for a x86_64 kernel if perf is built for
> + *    x86_32. This is really a rare usecase.
> + *
> + *  - Order is different from kernel's ptrace.c for get_arch_regstr(), which
> + *    is defined by dwarf.
>   */
> 
> -#define X86_32_MAX_REGS 8
> -const char *x86_32_regs_table[X86_32_MAX_REGS] = {
> -	"%ax",
> -	"%cx",
> -	"%dx",
> -	"%bx",
> -	"$stack",	/* Stack address instead of %sp */
> -	"%bp",
> -	"%si",
> -	"%di",
> +struct pt_regs_offset {
> +	const char *name;
> +	int offset;
> +};
> +
> +#define REG_OFFSET_END {.name = NULL, .offset = 0}
> +
> +#ifdef __x86_64__
> +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
> +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = -1}
> +#else
> +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = -1}
> +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
> +#endif
> +
> +static const struct pt_regs_offset x86_32_regoffset_table[] = {
> +	REG_OFFSET_NAME_32("%ax",	eax),
> +	REG_OFFSET_NAME_32("%cx",	ecx),
> +	REG_OFFSET_NAME_32("%dx",	edx),
> +	REG_OFFSET_NAME_32("%bx",	ebx),
> +	REG_OFFSET_NAME_32("$stack",	esp),	/* Stack address instead of %sp */
> +	REG_OFFSET_NAME_32("%bp",	ebp),
> +	REG_OFFSET_NAME_32("%si",	esi),
> +	REG_OFFSET_NAME_32("%di",	edi),
> +	REG_OFFSET_END,
>  };
> 
> -#define X86_64_MAX_REGS 16
> -const char *x86_64_regs_table[X86_64_MAX_REGS] = {
> -	"%ax",
> -	"%dx",
> -	"%cx",
> -	"%bx",
> -	"%si",
> -	"%di",
> -	"%bp",
> -	"%sp",
> -	"%r8",
> -	"%r9",
> -	"%r10",
> -	"%r11",
> -	"%r12",
> -	"%r13",
> -	"%r14",
> -	"%r15",
> +static const struct pt_regs_offset x86_64_regoffset_table[] = {
> +	REG_OFFSET_NAME_64("%ax",	rax),
> +	REG_OFFSET_NAME_64("%dx",	rdx),
> +	REG_OFFSET_NAME_64("%cx",	rcx),
> +	REG_OFFSET_NAME_64("%bx",	rbx),
> +	REG_OFFSET_NAME_64("%si",	rsi),
> +	REG_OFFSET_NAME_64("%di",	rdi),
> +	REG_OFFSET_NAME_64("%bp",	rbp),
> +	REG_OFFSET_NAME_64("%sp",	rsp),
> +	REG_OFFSET_NAME_64("%r8",	r8),
> +	REG_OFFSET_NAME_64("%r9",	r9),
> +	REG_OFFSET_NAME_64("%r10",	r10),
> +	REG_OFFSET_NAME_64("%r11",	r11),
> +	REG_OFFSET_NAME_64("%r12",	r12),
> +	REG_OFFSET_NAME_64("%r13",	r13),
> +	REG_OFFSET_NAME_64("%r14",	r14),
> +	REG_OFFSET_NAME_64("%r15",	r15),
> +	REG_OFFSET_END,
>  };
> 
>  /* TODO: switching by dwarf address size */
>  #ifdef __x86_64__
> -#define ARCH_MAX_REGS X86_64_MAX_REGS
> -#define arch_regs_table x86_64_regs_table
> +#define regoffset_table x86_64_regoffset_table
>  #else
> -#define ARCH_MAX_REGS X86_32_MAX_REGS
> -#define arch_regs_table x86_32_regs_table
> +#define regoffset_table x86_32_regoffset_table
>  #endif
> 
> +/* Minus 1 for the ending REG_OFFSET_END */
> +#define ARCH_MAX_REGS ((sizeof(regoffset_table) / sizeof(regoffset_table[0])) - 1)
> +
>  /* Return architecture dependent register string (for kprobe-tracer) */
>  const char *get_arch_regstr(unsigned int n)
>  {
> -	return (n < ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
> +	return (n < ARCH_MAX_REGS) ? regoffset_table[n].name : NULL;
> +}
> +
> +/* Reuse code from arch/x86/kernel/ptrace.c */
> +/**
> + * regs_query_register_offset() - query register offset from its name
> + * @name:	the name of a register
> + *
> + * regs_query_register_offset() returns the offset of a register in struct
> + * pt_regs from its name. If the name is invalid, this returns -EINVAL;
> + */
> +int regs_query_register_offset(const char *name)
> +{
> +	const struct pt_regs_offset *roff;
> +	for (roff = regoffset_table; roff->name != NULL; roff++)
> +		if (!strcmp(roff->name, name))
> +			return roff->offset;
> +	return -EINVAL;
>  }
> --
> 1.8.3.4


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected
  2015-09-01 10:38     ` Jiri Olsa
@ 2015-09-01 12:44       ` Wangnan (F)
  0 siblings, 0 replies; 94+ messages in thread
From: Wangnan (F) @ 2015-09-01 12:44 UTC (permalink / raw)
  To: Jiri Olsa, Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, mingo, ast, linux-kernel, lizefan, pi3orama,
	Masami Hiramatsu, Namhyung Kim



On 2015/9/1 18:38, Jiri Olsa wrote:
> On Mon, Aug 31, 2015 at 04:20:03PM -0300, Arnaldo Carvalho de Melo wrote:
>> Em Sat, Aug 29, 2015 at 04:21:36AM +0000, Wang Nan escreveu:
>>> If parse_events__scanner() collects no entry, perf_evlist__last(evlist)
>>> is invalid, so setting cmdline_group_boundary touches invalid memory.
>>>
>>> This could happen in the current BPF implementation. See [1]. Although it
>>> can be fixed, for safety reasons it would be better to introduce this
>>> check.
>>>
>>> Instead of checking the number of entries, check data.list, so we
>>> can add a dummy evsel here.
>> Event parsing fixes should have Jiri Olsa on the CC list, Jiri, is this
>> ok?
>>
>>  From what I can see it looks Ok, my question, just from looking at this
>> patch, is if it is valid to get to this point with an empty data.list,
>> i.e. was it ever possible and this is a bug irrespective of eBPF?
> good point, I believe it's either fail or event(s) added to the list
> I haven't checked how's eBPF connected with event parsing, is there a
> git tree I could check?

Please check:

https://git.kernel.org/cgit/linux/kernel/git/pi3orama/linux.git/log/?h=perf/ebpf

The following commits are related to parsing:

commit d7d91228cad0a78eae5ea9526a8a78debf3cf584
commit 2606fe61219899cb386823eddc1bc231ff5067a6

Thank you.

> thanks,
> jirka



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86
  2015-09-01 11:47       ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-01 13:52         ` Wangnan (F)
  2015-09-01 14:50           ` Arnaldo Carvalho de Melo
  2015-09-01 14:14         ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 94+ messages in thread
From: Wangnan (F) @ 2015-09-01 13:52 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI, acme
  Cc: linux-kernel, He Kuang, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, Jiri Olsa, Kaixu Xia, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Zefan Li, pi3orama,
	Arnaldo Carvalho de Melo



On 2015/9/1 19:47, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: Wang Nan [mailto:wangnan0@huawei.com]
>>
>> regs_query_register_offset() is a helper function which converts a
>> register name like "%ax" to the offset of that register in 'struct pt_regs',
>> which is required by the BPF prologue generator. Since the function is
>> identical to the kernel's, reuse the code from arch/x86/kernel/ptrace.c.
>>
>> The comment inside dwarf-regs.c lists the differences between this
>> implementation and the kernel code.
> Hmm, this also introduces duplicated code...
> It might be a good time to move it into arch/x86/lib/ and
> reuse it directly from the perf code.

So you want to move it from ./arch/x86/kernel/ptrace.c to arch/x86/lib and
let perf link against arch/x86/lib/lib.a to use it?

I think it is worth doing as a separate piece of work. Currently we lack the
scaffolding to compile and link against the kernel-side library. Moreover,
we should also consider other archs. It does not seem very easy.

This is not the only file which could benefit from the kernel's arch/x86/lib.
The newly introduced tools/perf/util/intel-pt-decoder/insn.c is another, and
I believe there are more. Therefore I think it should be separate work from
the perf BPF patches. So how about keeping this patch for now? Or do you have
another idea?

Thank you.

> Thank you,
>
>> get_arch_regstr() switches to regoffset_table and the old string table
>> is dropped.
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Signed-off-by: He Kuang <hekuang@huawei.com>
>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: David Ahern <dsahern@gmail.com>
>> Cc: He Kuang <hekuang@huawei.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Kaixu Xia <xiakaixu@huawei.com>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> ---
>>   tools/perf/arch/x86/Makefile          |   1 +
>>   tools/perf/arch/x86/util/Build        |   1 +
>>   tools/perf/arch/x86/util/dwarf-regs.c | 122 ++++++++++++++++++++++++----------
>>   3 files changed, 90 insertions(+), 34 deletions(-)
>>
>> diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
>> index 21322e0..09ba923 100644
>> --- a/tools/perf/arch/x86/Makefile
>> +++ b/tools/perf/arch/x86/Makefile
>> @@ -2,3 +2,4 @@ ifndef NO_DWARF
>>   PERF_HAVE_DWARF_REGS := 1
>>   endif
>>   HAVE_KVM_STAT_SUPPORT := 1
>> +PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
>> diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
>> index 2c55e1b..d4d1f23 100644
>> --- a/tools/perf/arch/x86/util/Build
>> +++ b/tools/perf/arch/x86/util/Build
>> @@ -4,6 +4,7 @@ libperf-y += pmu.o
>>   libperf-y += kvm-stat.o
>>
>>   libperf-$(CONFIG_DWARF) += dwarf-regs.o
>> +libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
>>
>>   libperf-$(CONFIG_LIBUNWIND)          += unwind-libunwind.o
>>   libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
>> diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
>> index a08de0a..de5b936 100644
>> --- a/tools/perf/arch/x86/util/dwarf-regs.c
>> +++ b/tools/perf/arch/x86/util/dwarf-regs.c
>> @@ -21,55 +21,109 @@
>>    */
>>
>>   #include <stddef.h>
>> +#include <errno.h> /* for EINVAL */
>> +#include <string.h> /* for strcmp */
>> +#include <linux/ptrace.h> /* for struct pt_regs */
>> +#include <linux/kernel.h> /* for offsetof */
>>   #include <dwarf-regs.h>
>>
>>   /*
>> - * Generic dwarf analysis helpers
>> + * See arch/x86/kernel/ptrace.c.
>> + * Different from it:
>> + *
>> + *  - Since struct pt_regs is defined differently for user and kernel,
>> + *    but we want to use 'ax, bx' instead of 'rax, rbx' (which is struct
>> + *    field name of user's pt_regs), we make REG_OFFSET_NAME to accept
>> + *    both string name and reg field name.
>> + *
>> + *  - Since accessing x86_32's pt_regs from x86_64 building is difficult
>> + *    and vise versa, we simply fill offset with -1, so
>> + *    get_arch_regstr() still works but regs_query_register_offset()
>> + *    returns error.
>> + *    The only inconvenience caused by it now is that we are not allowed
>> + *    to generate BPF prologue for a x86_64 kernel if perf is built for
>> + *    x86_32. This is really a rare usecase.
>> + *
>> + *  - Order is different from kernel's ptrace.c for get_arch_regstr(), which
>> + *    is defined by dwarf.
>>    */
>>
>> -#define X86_32_MAX_REGS 8
>> -const char *x86_32_regs_table[X86_32_MAX_REGS] = {
>> -	"%ax",
>> -	"%cx",
>> -	"%dx",
>> -	"%bx",
>> -	"$stack",	/* Stack address instead of %sp */
>> -	"%bp",
>> -	"%si",
>> -	"%di",
>> +struct pt_regs_offset {
>> +	const char *name;
>> +	int offset;
>> +};
>> +
>> +#define REG_OFFSET_END {.name = NULL, .offset = 0}
>> +
>> +#ifdef __x86_64__
>> +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
>> +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = -1}
>> +#else
>> +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = -1}
>> +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
>> +#endif
>> +
>> +static const struct pt_regs_offset x86_32_regoffset_table[] = {
>> +	REG_OFFSET_NAME_32("%ax",	eax),
>> +	REG_OFFSET_NAME_32("%cx",	ecx),
>> +	REG_OFFSET_NAME_32("%dx",	edx),
>> +	REG_OFFSET_NAME_32("%bx",	ebx),
>> +	REG_OFFSET_NAME_32("$stack",	esp),	/* Stack address instead of %sp */
>> +	REG_OFFSET_NAME_32("%bp",	ebp),
>> +	REG_OFFSET_NAME_32("%si",	esi),
>> +	REG_OFFSET_NAME_32("%di",	edi),
>> +	REG_OFFSET_END,
>>   };
>>
>> -#define X86_64_MAX_REGS 16
>> -const char *x86_64_regs_table[X86_64_MAX_REGS] = {
>> -	"%ax",
>> -	"%dx",
>> -	"%cx",
>> -	"%bx",
>> -	"%si",
>> -	"%di",
>> -	"%bp",
>> -	"%sp",
>> -	"%r8",
>> -	"%r9",
>> -	"%r10",
>> -	"%r11",
>> -	"%r12",
>> -	"%r13",
>> -	"%r14",
>> -	"%r15",
>> +static const struct pt_regs_offset x86_64_regoffset_table[] = {
>> +	REG_OFFSET_NAME_64("%ax",	rax),
>> +	REG_OFFSET_NAME_64("%dx",	rdx),
>> +	REG_OFFSET_NAME_64("%cx",	rcx),
>> +	REG_OFFSET_NAME_64("%bx",	rbx),
>> +	REG_OFFSET_NAME_64("%si",	rsi),
>> +	REG_OFFSET_NAME_64("%di",	rdi),
>> +	REG_OFFSET_NAME_64("%bp",	rbp),
>> +	REG_OFFSET_NAME_64("%sp",	rsp),
>> +	REG_OFFSET_NAME_64("%r8",	r8),
>> +	REG_OFFSET_NAME_64("%r9",	r9),
>> +	REG_OFFSET_NAME_64("%r10",	r10),
>> +	REG_OFFSET_NAME_64("%r11",	r11),
>> +	REG_OFFSET_NAME_64("%r12",	r12),
>> +	REG_OFFSET_NAME_64("%r13",	r13),
>> +	REG_OFFSET_NAME_64("%r14",	r14),
>> +	REG_OFFSET_NAME_64("%r15",	r15),
>> +	REG_OFFSET_END,
>>   };
>>
>>   /* TODO: switching by dwarf address size */
>>   #ifdef __x86_64__
>> -#define ARCH_MAX_REGS X86_64_MAX_REGS
>> -#define arch_regs_table x86_64_regs_table
>> +#define regoffset_table x86_64_regoffset_table
>>   #else
>> -#define ARCH_MAX_REGS X86_32_MAX_REGS
>> -#define arch_regs_table x86_32_regs_table
>> +#define regoffset_table x86_32_regoffset_table
>>   #endif
>>
>> +/* Minus 1 for the ending REG_OFFSET_END */
>> +#define ARCH_MAX_REGS ((sizeof(regoffset_table) / sizeof(regoffset_table[0])) - 1)
>> +
>>   /* Return architecture dependent register string (for kprobe-tracer) */
>>   const char *get_arch_regstr(unsigned int n)
>>   {
>> -	return (n < ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
>> +	return (n < ARCH_MAX_REGS) ? regoffset_table[n].name : NULL;
>> +}
>> +
>> +/* Reuse code from arch/x86/kernel/ptrace.c */
>> +/**
>> + * regs_query_register_offset() - query register offset from its name
>> + * @name:	the name of a register
>> + *
>> + * regs_query_register_offset() returns the offset of a register in struct
>> + * pt_regs from its name. If the name is invalid, this returns -EINVAL;
>> + */
>> +int regs_query_register_offset(const char *name)
>> +{
>> +	const struct pt_regs_offset *roff;
>> +	for (roff = regoffset_table; roff->name != NULL; roff++)
>> +		if (!strcmp(roff->name, name))
>> +			return roff->offset;
>> +	return -EINVAL;
>>   }
>> --
>> 1.8.3.4



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86
  2015-09-01 11:47       ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-01 13:52         ` Wangnan (F)
@ 2015-09-01 14:14         ` Arnaldo Carvalho de Melo
  2015-09-01 15:54           ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 1 reply; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 14:14 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI
  Cc: 'Wang Nan',
	acme, linux-kernel, He Kuang, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, Jiri Olsa, Kaixu Xia, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Zefan Li, pi3orama

Em Tue, Sep 01, 2015 at 11:47:41AM +0000, 平松雅巳 / HIRAMATU,MASAMI escreveu:
> > From: Wang Nan [mailto:wangnan0@huawei.com]
> > 
> > regs_query_register_offset() is a helper function which converts
> > register name like "%rax" to offset of a register in 'struct pt_regs',
> > which is required by BPF prologue generator. Since the function is
> > identical, try to reuse the code in arch/x86/kernel/ptrace.c.
> > 
> > Comment inside dwarf-regs.c list the differences between this
> > implementation and kernel code.
> 
> Hmm, this also introduce a duplication of the code...
> It might be a good time to move them into arch/x86/lib/ and
> reuse it directly from perf code.

It is strange, having tried sharing stuff directly from the kernel,
to now be in a position where I advocate against it...

Copy'n'pasting what I said in another message:

-----
We would go back to sharing stuff with the kernel, but this time around
we would be using something that everybody knows is being shared, which
doesn't eliminate the possibility that at some point changes made with
the kernel in mind would break the tools/ using code.

Perhaps it is better to keep copying what we want and introduce
infrastructure to check for differences and warn us as soon as possible
so that we would do the copy, test if it doesn't break what we use, etc.

I.e. we wouldn't be putting any new burden on the "kernel people", i.e.
the burden of having to check that changes they make don't break tools/
living code, nor any out of the blue breakage on tools/ for tools/
developers to fix when changes are made on the kernel "side"
-----

The "stop sharing directly stuff with the kernel" stance was taken after
a report from Linus about breakage due to tools/ using kernel files
directly: a change made in some RCU files broke the tools/perf/
build, even with tools/perf/ not using anything RCU related so far.

Looking at tools/perf/MANIFEST, the file used to create a detached
tarball so that perf can be built outside the kernel sources, there are
still some kernel source files listed, but those probably need to be
copied too...

- Arnaldo
 
> Thank you,
> 
> > 
> > get_arch_regstr() switches to regoffset_table and the old string table
> > is dropped.
> > 
> > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > Signed-off-by: He Kuang <hekuang@huawei.com>
> > Cc: Alexei Starovoitov <ast@plumgrid.com>
> > Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: David Ahern <dsahern@gmail.com>
> > Cc: He Kuang <hekuang@huawei.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Kaixu Xia <xiakaixu@huawei.com>
> > Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Cc: Paul Mackerras <paulus@samba.org>
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Cc: Zefan Li <lizefan@huawei.com>
> > Cc: pi3orama@163.com
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > ---
> >  tools/perf/arch/x86/Makefile          |   1 +
> >  tools/perf/arch/x86/util/Build        |   1 +
> >  tools/perf/arch/x86/util/dwarf-regs.c | 122 ++++++++++++++++++++++++----------
> >  3 files changed, 90 insertions(+), 34 deletions(-)
> > 
> > diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
> > index 21322e0..09ba923 100644
> > --- a/tools/perf/arch/x86/Makefile
> > +++ b/tools/perf/arch/x86/Makefile
> > @@ -2,3 +2,4 @@ ifndef NO_DWARF
> >  PERF_HAVE_DWARF_REGS := 1
> >  endif
> >  HAVE_KVM_STAT_SUPPORT := 1
> > +PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
> > diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
> > index 2c55e1b..d4d1f23 100644
> > --- a/tools/perf/arch/x86/util/Build
> > +++ b/tools/perf/arch/x86/util/Build
> > @@ -4,6 +4,7 @@ libperf-y += pmu.o
> >  libperf-y += kvm-stat.o
> > 
> >  libperf-$(CONFIG_DWARF) += dwarf-regs.o
> > +libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
> > 
> >  libperf-$(CONFIG_LIBUNWIND)          += unwind-libunwind.o
> >  libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
> > diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
> > index a08de0a..de5b936 100644
> > --- a/tools/perf/arch/x86/util/dwarf-regs.c
> > +++ b/tools/perf/arch/x86/util/dwarf-regs.c
> > @@ -21,55 +21,109 @@
> >   */
> > 
> >  #include <stddef.h>
> > +#include <errno.h> /* for EINVAL */
> > +#include <string.h> /* for strcmp */
> > +#include <linux/ptrace.h> /* for struct pt_regs */
> > +#include <linux/kernel.h> /* for offsetof */
> >  #include <dwarf-regs.h>
> > 
> >  /*
> > - * Generic dwarf analysis helpers
> > + * See arch/x86/kernel/ptrace.c.
> > + * Different from it:
> > + *
> > + *  - Since struct pt_regs is defined differently for user and kernel,
> > + *    but we want to use 'ax, bx' instead of 'rax, rbx' (which is struct
> > + *    field name of user's pt_regs), we make REG_OFFSET_NAME to accept
> > + *    both string name and reg field name.
> > + *
> > + *  - Since accessing x86_32's pt_regs from x86_64 building is difficult
> > + *    and vise versa, we simply fill offset with -1, so
> > + *    get_arch_regstr() still works but regs_query_register_offset()
> > + *    returns error.
> > + *    The only inconvenience caused by it now is that we are not allowed
> > + *    to generate BPF prologue for a x86_64 kernel if perf is built for
> > + *    x86_32. This is really a rare usecase.
> > + *
> > + *  - Order is different from kernel's ptrace.c for get_arch_regstr(), which
> > + *    is defined by dwarf.
> >   */
> > 
> > -#define X86_32_MAX_REGS 8
> > -const char *x86_32_regs_table[X86_32_MAX_REGS] = {
> > -	"%ax",
> > -	"%cx",
> > -	"%dx",
> > -	"%bx",
> > -	"$stack",	/* Stack address instead of %sp */
> > -	"%bp",
> > -	"%si",
> > -	"%di",
> > +struct pt_regs_offset {
> > +	const char *name;
> > +	int offset;
> > +};
> > +
> > +#define REG_OFFSET_END {.name = NULL, .offset = 0}
> > +
> > +#ifdef __x86_64__
> > +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
> > +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = -1}
> > +#else
> > +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = -1}
> > +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
> > +#endif
> > +
> > +static const struct pt_regs_offset x86_32_regoffset_table[] = {
> > +	REG_OFFSET_NAME_32("%ax",	eax),
> > +	REG_OFFSET_NAME_32("%cx",	ecx),
> > +	REG_OFFSET_NAME_32("%dx",	edx),
> > +	REG_OFFSET_NAME_32("%bx",	ebx),
> > +	REG_OFFSET_NAME_32("$stack",	esp),	/* Stack address instead of %sp */
> > +	REG_OFFSET_NAME_32("%bp",	ebp),
> > +	REG_OFFSET_NAME_32("%si",	esi),
> > +	REG_OFFSET_NAME_32("%di",	edi),
> > +	REG_OFFSET_END,
> >  };
> > 
> > -#define X86_64_MAX_REGS 16
> > -const char *x86_64_regs_table[X86_64_MAX_REGS] = {
> > -	"%ax",
> > -	"%dx",
> > -	"%cx",
> > -	"%bx",
> > -	"%si",
> > -	"%di",
> > -	"%bp",
> > -	"%sp",
> > -	"%r8",
> > -	"%r9",
> > -	"%r10",
> > -	"%r11",
> > -	"%r12",
> > -	"%r13",
> > -	"%r14",
> > -	"%r15",
> > +static const struct pt_regs_offset x86_64_regoffset_table[] = {
> > +	REG_OFFSET_NAME_64("%ax",	rax),
> > +	REG_OFFSET_NAME_64("%dx",	rdx),
> > +	REG_OFFSET_NAME_64("%cx",	rcx),
> > +	REG_OFFSET_NAME_64("%bx",	rbx),
> > +	REG_OFFSET_NAME_64("%si",	rsi),
> > +	REG_OFFSET_NAME_64("%di",	rdi),
> > +	REG_OFFSET_NAME_64("%bp",	rbp),
> > +	REG_OFFSET_NAME_64("%sp",	rsp),
> > +	REG_OFFSET_NAME_64("%r8",	r8),
> > +	REG_OFFSET_NAME_64("%r9",	r9),
> > +	REG_OFFSET_NAME_64("%r10",	r10),
> > +	REG_OFFSET_NAME_64("%r11",	r11),
> > +	REG_OFFSET_NAME_64("%r12",	r12),
> > +	REG_OFFSET_NAME_64("%r13",	r13),
> > +	REG_OFFSET_NAME_64("%r14",	r14),
> > +	REG_OFFSET_NAME_64("%r15",	r15),
> > +	REG_OFFSET_END,
> >  };
> > 
> >  /* TODO: switching by dwarf address size */
> >  #ifdef __x86_64__
> > -#define ARCH_MAX_REGS X86_64_MAX_REGS
> > -#define arch_regs_table x86_64_regs_table
> > +#define regoffset_table x86_64_regoffset_table
> >  #else
> > -#define ARCH_MAX_REGS X86_32_MAX_REGS
> > -#define arch_regs_table x86_32_regs_table
> > +#define regoffset_table x86_32_regoffset_table
> >  #endif
> > 
> > +/* Minus 1 for the ending REG_OFFSET_END */
> > +#define ARCH_MAX_REGS ((sizeof(regoffset_table) / sizeof(regoffset_table[0])) - 1)
> > +
> >  /* Return architecture dependent register string (for kprobe-tracer) */
> >  const char *get_arch_regstr(unsigned int n)
> >  {
> > -	return (n < ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
> > +	return (n < ARCH_MAX_REGS) ? regoffset_table[n].name : NULL;
> > +}
> > +
> > +/* Reuse code from arch/x86/kernel/ptrace.c */
> > +/**
> > + * regs_query_register_offset() - query register offset from its name
> > + * @name:	the name of a register
> > + *
> > + * regs_query_register_offset() returns the offset of a register in struct
> > + * pt_regs from its name. If the name is invalid, this returns -EINVAL;
> > + */
> > +int regs_query_register_offset(const char *name)
> > +{
> > +	const struct pt_regs_offset *roff;
> > +	for (roff = regoffset_table; roff->name != NULL; roff++)
> > +		if (!strcmp(roff->name, name))
> > +			return roff->offset;
> > +	return -EINVAL;
> >  }
> > --
> > 1.8.3.4
> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86
  2015-09-01 13:52         ` Wangnan (F)
@ 2015-09-01 14:50           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 14:50 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: 平松雅巳 / HIRAMATU,MASAMI,
	linux-kernel, He Kuang, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, Jiri Olsa, Kaixu Xia, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Zefan Li, pi3orama

Em Tue, Sep 01, 2015 at 09:52:30PM +0800, Wangnan (F) escreveu:
> On 2015/9/1 19:47, 平松雅巳 / HIRAMATU,MASAMI wrote:
> >>From: Wang Nan [mailto:wangnan0@huawei.com]
> >>regs_query_register_offset() is a helper function which converts
> >>register name like "%rax" to offset of a register in 'struct pt_regs',
> >>which is required by BPF prologue generator. Since the function is
> >>identical, try to reuse the code in arch/x86/kernel/ptrace.c.

> >>Comment inside dwarf-regs.c list the differences between this
> >>implementation and kernel code.
> >Hmm, this also introduce a duplication of the code...
> >It might be a good time to move them into arch/x86/lib/ and
> >reuse it directly from perf code.
 
> So you want to move it from ./arch/x86/kernel/ptrace.c to arch/x86/lib and
> let perf link against arch/x86/lib/lib.a to use it?
 
> I think that deserves a separate piece of work. Currently we lack the
> scaffolding to compile and link against the kernel side library. Moreover,
> we would also have to consider other archs. It does not seem very easy.
 
> This is not the only file which could benefit from the kernel's arch/x86/lib:
> the newly introduced tools/perf/util/intel-pt-decoder/insn.c is another, and
> I believe there are more. Therefore I think it should be done separately from
> the perf BPF patches. So how about keeping this patch for now? Or do you have
> another idea?

I would go with what Wang did at this time; it's a step in the right
direction in the sense that we're trying to use the same function names
and semantics and, as much as possible, the same implementation,
possibly in verbatim form.

Doing the work to fully share it is something being discussed, but that
should not get in the way of eBPF work, IMHO.

- Arnaldo

^ permalink raw reply	[flat|nested] 94+ messages in thread

* RE: [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86
  2015-09-01 14:14         ` Arnaldo Carvalho de Melo
@ 2015-09-01 15:54           ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-06  6:02             ` Wangnan (F)
  0 siblings, 1 reply; 94+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-01 15:54 UTC (permalink / raw)
  To: 'Arnaldo Carvalho de Melo'
  Cc: 'Wang Nan',
	acme, linux-kernel, He Kuang, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, Jiri Olsa, Kaixu Xia, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Zefan Li, pi3orama

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 10042 bytes --]

> From: Arnaldo Carvalho de Melo [mailto:acme@redhat.com]
> 
> Em Tue, Sep 01, 2015 at 11:47:41AM +0000, 平松雅巳 / HIRAMATU,MASAMI escreveu:
> > > From: Wang Nan [mailto:wangnan0@huawei.com]
> > >
> > > regs_query_register_offset() is a helper function which converts
> > > register name like "%rax" to offset of a register in 'struct pt_regs',
> > > which is required by BPF prologue generator. Since the function is
> > > identical, try to reuse the code in arch/x86/kernel/ptrace.c.
> > >
> > > Comment inside dwarf-regs.c list the differences between this
> > > implementation and kernel code.
> >
> > Hmm, this also introduce a duplication of the code...
> > It might be a good time to move them into arch/x86/lib/ and
> > reuse it directly from perf code.
> 
> It is strange, having tried sharing stuff directly from the kernel,
> to now be in a position where I advocate against it...
> 
> Copy'n'pasting what I said in another message:
> 
> -----
> We would go back to sharing stuff with the kernel, but this time around
> we would be using something that everybody knows is being shared, which
> doesn't eliminate the possibility that at some point changes made with
> the kernel in mind would break the tools/ using code.
> 
> Perhaps it is better to keep copying what we want and introduce
> infrastructure to check for differences and warn us as soon as possible
> so that we would do the copy, test if it doesn't break what we use, etc.
> 
> I.e. we wouldn't be putting any new burden on the "kernel people", i.e.
> the burden of having to check that changes they make don't break tools/
> living code, nor any out of the blue breakage on tools/ for tools/
> developers to fix when changes are made on the kernel "side" -----
> ---
> 
> The "stop sharing directly stuff with the kernel" stance was taken after
> a report from Linus about breakage due to tools/ using kernel files
> directly and then a change made in some RCU files broke the tools/perf/
> build, even with tools/perf/ not using anything RCU related so far.
> 
> Looking at tools/perf/MANIFEST, the file used to create a detached
> tarball so that perf can be built outside the kernel sources there are
> still some kernel source files listed, but those probably need to be
> copied too...

OK, so let's apply this.

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

And also we'll need a testcase for this.
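
As a starting point, here is a minimal sketch of what such a testcase might
check. It is not the perf test harness: struct fake_pt_regs is a hypothetical
stand-in for the real struct pt_regs, the table is shortened to four entries,
and test_regs_query_register_offset() is an assumed name. The lookup and
get_arch_regstr() bodies follow the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the x86_64 user-space struct pt_regs. */
struct fake_pt_regs {
	unsigned long r15, r14, r13, r12, rbp, rbx;
	unsigned long r11, r10, r9, r8, rax, rcx, rdx, rsi, rdi;
};

struct pt_regs_offset {
	const char *name;
	int offset;
};

#define REG_OFFSET_NAME(n, r) \
	{ .name = n, .offset = (int)offsetof(struct fake_pt_regs, r) }
#define REG_OFFSET_END { .name = NULL, .offset = 0 }

/* DWARF register order, as in the patch; shortened for this sketch. */
static const struct pt_regs_offset regoffset_table[] = {
	REG_OFFSET_NAME("%ax", rax),
	REG_OFFSET_NAME("%dx", rdx),
	REG_OFFSET_NAME("%cx", rcx),
	REG_OFFSET_NAME("%bx", rbx),
	REG_OFFSET_END,
};

/* Minus 1 for the ending REG_OFFSET_END */
#define ARCH_MAX_REGS \
	((sizeof(regoffset_table) / sizeof(regoffset_table[0])) - 1)

const char *get_arch_regstr(unsigned int n)
{
	return (n < ARCH_MAX_REGS) ? regoffset_table[n].name : NULL;
}

int regs_query_register_offset(const char *name)
{
	const struct pt_regs_offset *roff;

	assert(name != NULL);
	for (roff = regoffset_table; roff->name != NULL; roff++)
		if (!strcmp(roff->name, name))
			return roff->offset;
	return -EINVAL;
}

/* Returns 0 on success, -1 on the first failed check. */
int test_regs_query_register_offset(void)
{
	unsigned int n;

	/* Every name handed out by get_arch_regstr() must resolve. */
	for (n = 0; n < ARCH_MAX_REGS; n++) {
		const char *name = get_arch_regstr(n);

		if (!name)
			return -1;
		if (regs_query_register_offset(name) != regoffset_table[n].offset)
			return -1;
	}
	/* One past the last real entry yields no name. */
	if (get_arch_regstr(ARCH_MAX_REGS) != NULL)
		return -1;
	/* Unknown names are rejected. */
	if (regs_query_register_offset("%nosuchreg") != -EINVAL)
		return -1;
	return 0;
}
```

A real test would include the actual header and table instead of the stand-ins,
and would probably also cover the other-bitness table, where the offsets are
filled with -1, so callers have to treat any negative return as an error.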

Thank you,

> 
> - Arnaldo
> 
> > Thank you,
> >
> > >
> > > get_arch_regstr() switches to regoffset_table and the old string table
> > > is dropped.
> > >
> > > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > > Signed-off-by: He Kuang <hekuang@huawei.com>
> > > Cc: Alexei Starovoitov <ast@plumgrid.com>
> > > Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> > > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > > Cc: David Ahern <dsahern@gmail.com>
> > > Cc: He Kuang <hekuang@huawei.com>
> > > Cc: Jiri Olsa <jolsa@kernel.org>
> > > Cc: Kaixu Xia <xiakaixu@huawei.com>
> > > Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > > Cc: Namhyung Kim <namhyung@kernel.org>
> > > Cc: Paul Mackerras <paulus@samba.org>
> > > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > > Cc: Zefan Li <lizefan@huawei.com>
> > > Cc: pi3orama@163.com
> > > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > ---
> > >  tools/perf/arch/x86/Makefile          |   1 +
> > >  tools/perf/arch/x86/util/Build        |   1 +
> > >  tools/perf/arch/x86/util/dwarf-regs.c | 122 ++++++++++++++++++++++++----------
> > >  3 files changed, 90 insertions(+), 34 deletions(-)
> > >
> > > diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
> > > index 21322e0..09ba923 100644
> > > --- a/tools/perf/arch/x86/Makefile
> > > +++ b/tools/perf/arch/x86/Makefile
> > > @@ -2,3 +2,4 @@ ifndef NO_DWARF
> > >  PERF_HAVE_DWARF_REGS := 1
> > >  endif
> > >  HAVE_KVM_STAT_SUPPORT := 1
> > > +PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
> > > diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
> > > index 2c55e1b..d4d1f23 100644
> > > --- a/tools/perf/arch/x86/util/Build
> > > +++ b/tools/perf/arch/x86/util/Build
> > > @@ -4,6 +4,7 @@ libperf-y += pmu.o
> > >  libperf-y += kvm-stat.o
> > >
> > >  libperf-$(CONFIG_DWARF) += dwarf-regs.o
> > > +libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
> > >
> > >  libperf-$(CONFIG_LIBUNWIND)          += unwind-libunwind.o
> > >  libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
> > > diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
> > > index a08de0a..de5b936 100644
> > > --- a/tools/perf/arch/x86/util/dwarf-regs.c
> > > +++ b/tools/perf/arch/x86/util/dwarf-regs.c
> > > @@ -21,55 +21,109 @@
> > >   */
> > >
> > >  #include <stddef.h>
> > > +#include <errno.h> /* for EINVAL */
> > > +#include <string.h> /* for strcmp */
> > > +#include <linux/ptrace.h> /* for struct pt_regs */
> > > +#include <linux/kernel.h> /* for offsetof */
> > >  #include <dwarf-regs.h>
> > >
> > >  /*
> > > - * Generic dwarf analysis helpers
> > > + * See arch/x86/kernel/ptrace.c.
> > > + * Different from it:
> > > + *
> > > + *  - Since struct pt_regs is defined differently for user and kernel,
> > > + *    but we want to use 'ax, bx' instead of 'rax, rbx' (which is struct
> > > + *    field name of user's pt_regs), we make REG_OFFSET_NAME to accept
> > > + *    both string name and reg field name.
> > > + *
> > > + *  - Since accessing x86_32's pt_regs from x86_64 building is difficult
> > > + *    and vise versa, we simply fill offset with -1, so
> > > + *    get_arch_regstr() still works but regs_query_register_offset()
> > > + *    returns error.
> > > + *    The only inconvenience caused by it now is that we are not allowed
> > > + *    to generate BPF prologue for a x86_64 kernel if perf is built for
> > > + *    x86_32. This is really a rare usecase.
> > > + *
> > > + *  - Order is different from kernel's ptrace.c for get_arch_regstr(), which
> > > + *    is defined by dwarf.
> > >   */
> > >
> > > -#define X86_32_MAX_REGS 8
> > > -const char *x86_32_regs_table[X86_32_MAX_REGS] = {
> > > -	"%ax",
> > > -	"%cx",
> > > -	"%dx",
> > > -	"%bx",
> > > -	"$stack",	/* Stack address instead of %sp */
> > > -	"%bp",
> > > -	"%si",
> > > -	"%di",
> > > +struct pt_regs_offset {
> > > +	const char *name;
> > > +	int offset;
> > > +};
> > > +
> > > +#define REG_OFFSET_END {.name = NULL, .offset = 0}
> > > +
> > > +#ifdef __x86_64__
> > > +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
> > > +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = -1}
> > > +#else
> > > +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = -1}
> > > +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
> > > +#endif
> > > +
> > > +static const struct pt_regs_offset x86_32_regoffset_table[] = {
> > > +	REG_OFFSET_NAME_32("%ax",	eax),
> > > +	REG_OFFSET_NAME_32("%cx",	ecx),
> > > +	REG_OFFSET_NAME_32("%dx",	edx),
> > > +	REG_OFFSET_NAME_32("%bx",	ebx),
> > > +	REG_OFFSET_NAME_32("$stack",	esp),	/* Stack address instead of %sp */
> > > +	REG_OFFSET_NAME_32("%bp",	ebp),
> > > +	REG_OFFSET_NAME_32("%si",	esi),
> > > +	REG_OFFSET_NAME_32("%di",	edi),
> > > +	REG_OFFSET_END,
> > >  };
> > >
> > > -#define X86_64_MAX_REGS 16
> > > -const char *x86_64_regs_table[X86_64_MAX_REGS] = {
> > > -	"%ax",
> > > -	"%dx",
> > > -	"%cx",
> > > -	"%bx",
> > > -	"%si",
> > > -	"%di",
> > > -	"%bp",
> > > -	"%sp",
> > > -	"%r8",
> > > -	"%r9",
> > > -	"%r10",
> > > -	"%r11",
> > > -	"%r12",
> > > -	"%r13",
> > > -	"%r14",
> > > -	"%r15",
> > > +static const struct pt_regs_offset x86_64_regoffset_table[] = {
> > > +	REG_OFFSET_NAME_64("%ax",	rax),
> > > +	REG_OFFSET_NAME_64("%dx",	rdx),
> > > +	REG_OFFSET_NAME_64("%cx",	rcx),
> > > +	REG_OFFSET_NAME_64("%bx",	rbx),
> > > +	REG_OFFSET_NAME_64("%si",	rsi),
> > > +	REG_OFFSET_NAME_64("%di",	rdi),
> > > +	REG_OFFSET_NAME_64("%bp",	rbp),
> > > +	REG_OFFSET_NAME_64("%sp",	rsp),
> > > +	REG_OFFSET_NAME_64("%r8",	r8),
> > > +	REG_OFFSET_NAME_64("%r9",	r9),
> > > +	REG_OFFSET_NAME_64("%r10",	r10),
> > > +	REG_OFFSET_NAME_64("%r11",	r11),
> > > +	REG_OFFSET_NAME_64("%r12",	r12),
> > > +	REG_OFFSET_NAME_64("%r13",	r13),
> > > +	REG_OFFSET_NAME_64("%r14",	r14),
> > > +	REG_OFFSET_NAME_64("%r15",	r15),
> > > +	REG_OFFSET_END,
> > >  };
> > >
> > >  /* TODO: switching by dwarf address size */
> > >  #ifdef __x86_64__
> > > -#define ARCH_MAX_REGS X86_64_MAX_REGS
> > > -#define arch_regs_table x86_64_regs_table
> > > +#define regoffset_table x86_64_regoffset_table
> > >  #else
> > > -#define ARCH_MAX_REGS X86_32_MAX_REGS
> > > -#define arch_regs_table x86_32_regs_table
> > > +#define regoffset_table x86_32_regoffset_table
> > >  #endif
> > >
> > > +/* Minus 1 for the ending REG_OFFSET_END */
> > > +#define ARCH_MAX_REGS ((sizeof(regoffset_table) / sizeof(regoffset_table[0])) - 1)
> > > +
> > >  /* Return architecture dependent register string (for kprobe-tracer) */
> > >  const char *get_arch_regstr(unsigned int n)
> > >  {
> > > -	return (n < ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
> > > +	return (n < ARCH_MAX_REGS) ? regoffset_table[n].name : NULL;
> > > +}
> > > +
> > > +/* Reuse code from arch/x86/kernel/ptrace.c */
> > > +/**
> > > + * regs_query_register_offset() - query register offset from its name
> > > + * @name:	the name of a register
> > > + *
> > > + * regs_query_register_offset() returns the offset of a register in struct
> > > + * pt_regs from its name. If the name is invalid, this returns -EINVAL;
> > > + */
> > > +int regs_query_register_offset(const char *name)
> > > +{
> > > +	const struct pt_regs_offset *roff;
> > > +	for (roff = regoffset_table; roff->name != NULL; roff++)
> > > +		if (!strcmp(roff->name, name))
> > > +			return roff->offset;
> > > +	return -EINVAL;
> > >  }
> > > --
> > > 1.8.3.4
> >

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 21/31] perf tools: Move linux/filter.h to tools/include
  2015-08-29  4:21 ` [PATCH 21/31] perf tools: Move linux/filter.h to tools/include Wang Nan
  2015-08-31 20:35   ` Arnaldo Carvalho de Melo
@ 2015-09-01 19:39   ` Arnaldo Carvalho de Melo
  2015-09-01 19:47     ` Arnaldo Carvalho de Melo
  2015-09-01 21:08     ` pi3orama
  2015-09-08 14:31   ` [tip:perf/core] perf tools: Copy " tip-bot for He Kuang
  2 siblings, 2 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 19:39 UTC (permalink / raw)
  To: Wang Nan
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

Em Sat, Aug 29, 2015 at 04:21:55AM +0000, Wang Nan escreveu:
> From: He Kuang <hekuang@huawei.com>
> 
> This patch moves filter.h from include/linux/kernel.h to

I said that before: this is not moving anything, it is copying :-)

> tools/include/linux/filter.h to enable other libraries to use the macros in
> it, like libbpf, which will be introduced by further patches. Currently,
> the moved filter.h only contains the macros needed by libbpf, to avoid
> introducing too many dependencies.
> 
> MANIFEST is also updated for 'make perf-*-src-pkg'.

So, I did a:

$ diff -u include/linux/filter.h tools/include/linux/filter.h

And noticed these:

-/* Endianess conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
+/* Endianness conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */

-/* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */
+/* Short form of mov based on type,
+ * BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32
+ */

-/* Conditional jumps against registers, if (dst_reg 'op' src_reg) goto pc + off16 */
+/* Conditional jumps against registers,
+ * if (dst_reg 'op' src_reg) goto pc + off16
+ */

-/* Conditional jumps against immediates, if (dst_reg 'op' imm32) goto pc + off16 */
+/* Conditional jumps against immediates,
+ * if (dst_reg 'op' imm32) goto pc + off16
+ */

------------------------------------------------------------------

Please refrain from doing that... I.e. spell checking is kinda useful,
introducing gratuitous further drift from include/linux/FOO.h to
tools/include/linux/FOO.h is not.

So either resist the urge to do these stylistic changes or do those changes in
include/linux/FOO.h and _then_ copy it to tools/include/linux/FOO.h.

If the copy was already done, fix both, so that when we do that diff again, we
can see what is really different in kernel and userspace copies and that maybe
will help us spot things that aren't diverging over time.
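
The kind of check meant here could also be automated. A minimal sketch,
assuming both copies have already been read into memory; first_diff_line() is
a hypothetical helper for illustration, nothing like it exists in the tree
as far as this thread shows:

```c
#include <stddef.h>

/*
 * Return the 1-based number of the first line at which the two
 * NUL-terminated buffers differ, or 0 if they are identical.
 */
long first_diff_line(const char *kernel_copy, const char *tools_copy)
{
	long line = 1;
	size_t i;

	for (i = 0; kernel_copy[i] == tools_copy[i]; i++) {
		if (kernel_copy[i] == '\0')
			return 0;	/* both ended together: strict copy */
		if (kernel_copy[i] == '\n')
			line++;
	}
	return line;	/* mismatch, or one buffer ended early */
}
```

With a strict copy this returns 0; a spelling fix or a reflowed comment in
only one of the two files makes it point at the first line that drifted,
which is the noise the diff above was full of.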

- Arnaldo

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 21/31] perf tools: Move linux/filter.h to tools/include
  2015-09-01 19:39   ` Arnaldo Carvalho de Melo
@ 2015-09-01 19:47     ` Arnaldo Carvalho de Melo
  2015-09-01 21:08     ` pi3orama
  1 sibling, 0 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 19:47 UTC (permalink / raw)
  To: Wang Nan
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

On Tue, Sep 01, 2015 at 04:39:59PM -0300, Arnaldo Carvalho de Melo wrote:
> On Sat, Aug 29, 2015 at 04:21:55AM +0000, Wang Nan wrote:
> > From: He Kuang <hekuang@huawei.com>
> -/* Conditional jumps against immediates, if (dst_reg 'op' imm32) goto pc + off16 */
> +/* Conditional jumps against immediates,
> + * if (dst_reg 'op' imm32) goto pc + off16
> + */
 
> Please refrain from doing that... I.e. spell checking is kinda useful,
> introducing gratuitous further drift from include/linux/FOO.h to
> tools/include/linux/FOO.h is not.

I've done it this time. I.e. left it as a strict copy, modulo the unused
parts that were purposefully removed in the original patch. Trying to
cherry pick as many patches from this series as possible, to help in
getting the eBPF patchkit in.

- Arnaldo

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 28/31] perf probe: Init symbol as kprobe
  2015-08-29  4:22 ` [PATCH 28/31] perf probe: Init symbol as kprobe Wang Nan
@ 2015-09-01 20:11   ` Arnaldo Carvalho de Melo
  2015-09-02  1:22     ` Wangnan (F)
  2015-09-02  1:38     ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 2 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 20:11 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Wang Nan, mingo, ast, linux-kernel, lizefan, pi3orama,
	Brendan Gregg, Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa,
	Kaixu Xia, Namhyung Kim, Paul Mackerras, Peter Zijlstra

On Sat, Aug 29, 2015 at 04:22:02AM +0000, Wang Nan wrote:
> Before this patch, add_perf_probe_events() initializes symbol maps only
> for uprobes if the first 'struct perf_probe_event' passed to it is a
> uprobe event. This is a trick that works because 'perf probe''s command
> line syntax requires the first element of the probe_event array to be a
> kprobe if the array contains any kprobe.
> 
> However, with the incoming BPF uprobe support, that constraint no longer
> holds, since 'perf record' will also set k/u probes through a BPF
> object, and it is possible to pass an array that contains a kprobe but
> whose first element is a uprobe.
> 
> This patch initializes symbol maps for kprobes even if all of the events
> are uprobes, because the extra cost should be small enough.

Masami, are you Ok with this one?
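
For readers following along, the problem described in the commit message
can be sketched roughly as below. This is only an illustration: struct
toy_probe_event and both helper names are hypothetical and are not the
perf structures or functions. The patch itself does not scan the array;
it simply always initializes the kernel maps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced probe event: only the flag this sketch needs. */
struct toy_probe_event {
	bool uprobes;	/* true for a uprobe, false for a kprobe */
};

/* Old heuristic: trust the first element to describe the whole array. */
bool need_kernel_maps_first_only(const struct toy_probe_event *pevs, size_t n)
{
	return n > 0 && !pevs[0].uprobes;
}

/* Exact answer: kernel maps are needed if any element is a kprobe. */
bool need_kernel_maps_any(const struct toy_probe_event *pevs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!pevs[i].uprobes)
			return true;
	return false;
}
```

With an array of { uprobe, kprobe } the first helper says the kernel
maps are not needed while the second says they are, which is the case
the BPF loader can now produce.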

- Arnaldo
 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/n/1436445342-1402-39-git-send-email-wangnan0@huawei.com
> ---
>  tools/perf/util/probe-event.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index e720913..b94a8d7 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2789,7 +2789,7 @@ int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
>  {
>  	int i, ret;
>  
> -	ret = init_symbol_maps(pevs->uprobes);
> +	ret = init_symbol_maps(false);
>  	if (ret < 0)
>  		return ret;
>  
> -- 
> 2.1.0

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 27/31] perf record: Support custom vmlinux path
  2015-08-29  4:22 ` [PATCH 27/31] perf record: Support custom vmlinux path Wang Nan
@ 2015-09-01 20:19   ` Arnaldo Carvalho de Melo
  2015-09-01 20:21     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 20:19 UTC (permalink / raw)
  To: Wang Nan
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

On Sat, Aug 29, 2015 at 04:22:01AM +0000, Wang Nan wrote:
> From: He Kuang <hekuang@huawei.com>
> 
> Make perf-record command support --vmlinux option if BPF_PROLOGUE is on.

Ok, this should be supported, i.e. letting the user specify a vmlinux
path to use.

But it shouldn't be _required_, i.e. we have things like vmlinux_path to
try to find it in well known places.

Right now it will search for it in the process of trying to load its
symtab, but I think we should have a function that tries to find a
vmlinux that matches the build-id of the running kernel, for things that
want to have access directly to the ELF file with debuginfo without
having to load the symtab in a struct dso, etc.
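
A rough sketch of what such a helper could look like, just to make the
idea concrete. Everything here is an assumption: find_matching_vmlinux(),
the read_build_id_fn callback and the fake reader with its paths are
made-up names for illustration, not existing perf APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical callback: fills 'buf' with the hex build-id of 'path'
 * and returns 0, or returns -1 if the file is missing or has no build-id.
 */
typedef int (*read_build_id_fn)(const char *path, char *buf, size_t len);

/*
 * Walk a list of candidate paths and return the first one whose build-id
 * equals 'wanted', or NULL when none matches.
 */
const char *find_matching_vmlinux(const char *const *candidates, size_t n,
				  const char *wanted, read_build_id_fn read_id)
{
	char id[128];
	size_t i;

	for (i = 0; i < n; i++) {
		if (read_id(candidates[i], id, sizeof(id)) < 0)
			continue;	/* unreadable or no build-id: try the next one */
		if (strcmp(id, wanted) == 0)
			return candidates[i];
	}
	return NULL;
}

/* Stand-in reader for trying the sketch out: pretends only two files exist. */
int fake_read_build_id(const char *path, char *buf, size_t len)
{
	static const char *const table[][2] = {
		{ "/boot/vmlinux-old", "aaaa" },
		{ "/usr/lib/debug/vmlinux", "bbbb" },
	};
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (strcmp(path, table[i][0]) == 0 && strlen(table[i][1]) < len) {
			strcpy(buf, table[i][1]);
			return 0;
		}
	}
	return -1;
}
```

Passing the reader as a callback keeps the search loop independent of
how the build-id is actually read, so no symtab has to be loaded just to
pick the file.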

I'll look at the next patches to check how you make use of this info...

- Arnaldo

> 'perf record' needs vmlinux as the source of DWARF info to generate
> prologue for BPF programs, so path of vmlinux should be specified.
> 
> Short name 'k' has been taken by 'clockid'. This patch skips the short
> option name and uses '--vmlinux' for the vmlinux path.
> 
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/n/1436445342-1402-38-git-send-email-wangnan0@huawei.com
> ---
>  tools/perf/builtin-record.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 212718c..8eb39d5 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1100,6 +1100,10 @@ struct option __record_options[] = {
>  		   "clang binary to use for compiling BPF scriptlets"),
>  	OPT_STRING(0, "clang-opt", &llvm_param.clang_opt, "clang options",
>  		   "options passed to clang when compiling BPF scriptlets"),
> +#ifdef HAVE_BPF_PROLOGUE
> +	OPT_STRING(0, "vmlinux", &symbol_conf.vmlinux_name,
> +		   "file", "vmlinux pathname"),
> +#endif
>  #endif
>  	OPT_END()
>  };
> -- 
> 2.1.0

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 27/31] perf record: Support custom vmlinux path
  2015-09-01 20:19   ` Arnaldo Carvalho de Melo
@ 2015-09-01 20:21     ` Arnaldo Carvalho de Melo
  2015-09-01 21:00       ` pi3orama
  0 siblings, 1 reply; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 20:21 UTC (permalink / raw)
  To: Wang Nan, He Kuang
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra

On Tue, Sep 01, 2015 at 05:19:17PM -0300, Arnaldo Carvalho de Melo wrote:
> On Sat, Aug 29, 2015 at 04:22:01AM +0000, Wang Nan wrote:
> > From: He Kuang <hekuang@huawei.com>
> > 
> > Make perf-record command support --vmlinux option if BPF_PROLOGUE is on.
> 
> Ok, this should be supported, i.e. letting the user specify a vmlinux
> path to use.
> 
> But it shouldn't be _required_, i.e. we have things like vmlinux_path to
> try to find it in well known places.
> 
> Right now it will search for it in the process of trying to load its
> symtab, but I think we should have a function that tries to find a
> vmlinux that matches the build-id of the running kernel, for things that
> want to have access directly to the ELF file with debuginfo without
> having to load the symtab in a struct dso, etc.
> 
> I'll look at the next patches to check how you make use of this info...

So, the do it all from 'perf record' is not yet in this patchkit,
right? At least not in [ N/31 ] with N > 27, can you point me to it?

- Arnaldo
 
> - Arnaldo
> 
> > 'perf record' needs vmlinux as the source of DWARF info to generate
> > prologue for BPF programs, so path of vmlinux should be specified.
> > 
> > Short name 'k' has been taken by 'clockid'. This patch skips the short
> > option name and uses '--vmlinux' for the vmlinux path.
> > 
> > Signed-off-by: He Kuang <hekuang@huawei.com>
> > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > Cc: Alexei Starovoitov <ast@plumgrid.com>
> > Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: David Ahern <dsahern@gmail.com>
> > Cc: He Kuang <hekuang@huawei.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Kaixu Xia <xiakaixu@huawei.com>
> > Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Cc: Paul Mackerras <paulus@samba.org>
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Cc: Zefan Li <lizefan@huawei.com>
> > Cc: pi3orama@163.com
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Link: http://lkml.kernel.org/n/1436445342-1402-38-git-send-email-wangnan0@huawei.com
> > ---
> >  tools/perf/builtin-record.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index 212718c..8eb39d5 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -1100,6 +1100,10 @@ struct option __record_options[] = {
> >  		   "clang binary to use for compiling BPF scriptlets"),
> >  	OPT_STRING(0, "clang-opt", &llvm_param.clang_opt, "clang options",
> >  		   "options passed to clang when compiling BPF scriptlets"),
> > +#ifdef HAVE_BPF_PROLOGUE
> > +	OPT_STRING(0, "vmlinux", &symbol_conf.vmlinux_name,
> > +		   "file", "vmlinux pathname"),
> > +#endif
> >  #endif
> >  	OPT_END()
> >  };
> > -- 
> > 2.1.0

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 27/31] perf record: Support custom vmlinux path
  2015-09-01 20:21     ` Arnaldo Carvalho de Melo
@ 2015-09-01 21:00       ` pi3orama
  2015-09-01 21:33         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 94+ messages in thread
From: pi3orama @ 2015-09-01 21:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Wang Nan, He Kuang, mingo, ast, linux-kernel, lizefan,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra



Sent from my iPhone

> On Sep 2, 2015, at 4:21 AM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> 
> On Tue, Sep 01, 2015 at 05:19:17PM -0300, Arnaldo Carvalho de Melo wrote:
>> On Sat, Aug 29, 2015 at 04:22:01AM +0000, Wang Nan wrote:
>>> From: He Kuang <hekuang@huawei.com>
>>> 
>>> Make perf-record command support --vmlinux option if BPF_PROLOGUE is on.
>> 
>> Ok, this should be supported, i.e. letting the user specify a vmlinux
>> path to use.
>> 
>> But it shouldn't be _required_, i.e. we have things like vmlinux_path to
>> try to find it in well known places.
>> 
>> Right now it will search for it in the process of trying to load its
>> symtab, but I think we should have a function that tries to find a
>> vmlinux that matches the build-id of the running kernel, for things that
>> want to have access directly to the ELF file with debuginfo without
>> having to load the symtab in a struct dso, etc.
>> 
>> I'll look at the next patches to check how you make use of this info...
> 
> So, the do it all from 'perf record' is not yet in this patchkit,
> right? At least not in [ N/31 ] with N > 27, can you point me to it?
> 

It is for patch 8/31, which creates kprobe points using add_perf_probe_events().

Without this patch it won't search for debug info, which prevents us from using arguments like this:

SEC("lock_page=lock_page page->flags")

or probing at a line number, unless it finds a valid vmlinux in the default path.

Thank you.

> - Arnaldo
> 
>> - Arnaldo
>> 
>>> 'perf record' needs vmlinux as the source of DWARF info to generate
>>> prologue for BPF programs, so path of vmlinux should be specified.
>>> 
>>> Short name 'k' has been taken by 'clockid'. This patch skips the short
>>> option name and uses '--vmlinux' for the vmlinux path.
>>> 
>>> Signed-off-by: He Kuang <hekuang@huawei.com>
>>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>>> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
>>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>>> Cc: David Ahern <dsahern@gmail.com>
>>> Cc: He Kuang <hekuang@huawei.com>
>>> Cc: Jiri Olsa <jolsa@kernel.org>
>>> Cc: Kaixu Xia <xiakaixu@huawei.com>
>>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>> Cc: Paul Mackerras <paulus@samba.org>
>>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>> Cc: Zefan Li <lizefan@huawei.com>
>>> Cc: pi3orama@163.com
>>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>>> Link: http://lkml.kernel.org/n/1436445342-1402-38-git-send-email-wangnan0@huawei.com
>>> ---
>>> tools/perf/builtin-record.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>> 
>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>>> index 212718c..8eb39d5 100644
>>> --- a/tools/perf/builtin-record.c
>>> +++ b/tools/perf/builtin-record.c
>>> @@ -1100,6 +1100,10 @@ struct option __record_options[] = {
>>>           "clang binary to use for compiling BPF scriptlets"),
>>>    OPT_STRING(0, "clang-opt", &llvm_param.clang_opt, "clang options",
>>>           "options passed to clang when compiling BPF scriptlets"),
>>> +#ifdef HAVE_BPF_PROLOGUE
>>> +    OPT_STRING(0, "vmlinux", &symbol_conf.vmlinux_name,
>>> +           "file", "vmlinux pathname"),
>>> +#endif
>>> #endif
>>>    OPT_END()
>>> };
>>> -- 
>>> 2.1.0


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 21/31] perf tools: Move linux/filter.h to tools/include
  2015-09-01 19:39   ` Arnaldo Carvalho de Melo
  2015-09-01 19:47     ` Arnaldo Carvalho de Melo
@ 2015-09-01 21:08     ` pi3orama
  2015-09-01 21:43       ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 94+ messages in thread
From: pi3orama @ 2015-09-01 21:08 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Wang Nan, mingo, ast, linux-kernel, lizefan, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra



Sent from my iPhone

> On Sep 2, 2015, at 3:39 AM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> 
> On Sat, Aug 29, 2015 at 04:21:55AM +0000, Wang Nan wrote:
>> From: He Kuang <hekuang@huawei.com>
>> 
>> This patch moves filter.h from include/linux/filter.h to
> 
> I said that before: this is not moving anything, it is copying :-)
> 
>> tools/include/linux/filter.h to enable other libraries to use the macros
>> in it, like libbpf, which will be introduced by further patches. Currently,
>> the moved filter.h only contains the macros needed by libbpf, to avoid
>> introducing too many dependencies.
>> 
>> MANIFEST is also updated for 'make perf-*-src-pkg'.
> 
> So, I did a:
> 
> $ diff -u include/linux/filter.h tools/include/linux/filter.h
> 
> And noticed these:
> 
> -/* Endianess conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
> +/* Endianness conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
> 
> -/* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */
> +/* Short form of mov based on type,
> + * BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32
> + */
> 
> -/* Conditional jumps against registers, if (dst_reg 'op' src_reg) goto pc + off16 */
> +/* Conditional jumps against registers,
> + * if (dst_reg 'op' src_reg) goto pc + off16
> + */
> 
> -/* Conditional jumps against immediates, if (dst_reg 'op' imm32) goto pc + off16 */
> +/* Conditional jumps against immediates,
> + * if (dst_reg 'op' imm32) goto pc + off16
> + */
> 
> ------------------------------------------------------------------
> 

I think these changes are made after we made this patch.

Thank you for checking it.


> Please refrain from doing that... I.e. spell checking is kinda useful,
> introducing gratuitous further drift from include/linux/FOO.h to
> tools/include/linux/FOO.h is not.
> 
> So either resist the urge to do these stylistic changes or do those changes in
> include/linux/FOO.h and _then_ copy it to tools/include/linux/FOO.h.
> 
> If the copy was already done, fix both, so that when we do that diff again, we
> can see what is really different in kernel and userspace copies and that maybe
> will help us spot things that aren't diverging over time.
> 
> - Arnaldo


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 27/31] perf record: Support custom vmlinux path
  2015-09-01 21:00       ` pi3orama
@ 2015-09-01 21:33         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 21:33 UTC (permalink / raw)
  To: pi3orama
  Cc: Wang Nan, He Kuang, mingo, ast, linux-kernel, lizefan,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra

On Wed, Sep 02, 2015 at 05:00:39AM +0800, pi3orama wrote:
> Sent from my iPhone
> > On Sep 2, 2015, at 4:21 AM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> > On Tue, Sep 01, 2015 at 05:19:17PM -0300, Arnaldo Carvalho de Melo wrote:
> >> On Sat, Aug 29, 2015 at 04:22:01AM +0000, Wang Nan wrote:
> >>> From: He Kuang <hekuang@huawei.com>

> >>> Make perf-record command support --vmlinux option if BPF_PROLOGUE is on.

> >> Ok, this should be supported, i.e. letting the user specify a vmlinux
> >> path to use.

> >> But it shouldn't be _required_, i.e. we have things like vmlinux_path to
> >> try to find it in well known places.

> >> Right now it will search for it in the process of trying to load its
> >> symtab, but I think we should have a function that tries to find a
> >> vmlinux that matches the build-id of the running kernel, for things that
> >> want to have access directly to the ELF file with debuginfo without
> >> having to load the symtab in a struct dso, etc.

> >> I'll look at the next patches to check how you make use of this info...
> > 
> > So, the do it all from 'perf record' is not yet in this patchkit,
> > right? At least not in [ N/31 ] with N > 27, can you point me to it?
 
> It is for patch 8/31, which creates kprobe points using add_perf_probe_events().
> 
> Without this patch it won't search for debug info, which prevents us from using arguments like this:
> 
> SEC("lock_page=lock_page page->flags")
> 
> or probing at a line number, unless it finds a valid vmlinux in the default path.

Argh, that is because init_symbol_maps() uses symbol__init() that is
also being used in 'perf record' by now... I.e. it was designed to be
called just once, at tool start :-\

Will have to get my head around how this is being used to try to
untangle this mess...

- Arnaldo

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 21/31] perf tools: Move linux/filter.h to tools/include
  2015-09-01 21:08     ` pi3orama
@ 2015-09-01 21:43       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 21:43 UTC (permalink / raw)
  To: pi3orama
  Cc: Wang Nan, mingo, ast, linux-kernel, lizefan, He Kuang,
	Brendan Gregg, Daniel Borkmann, David Ahern, Jiri Olsa,
	Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, acme

On Wed, Sep 02, 2015 at 05:08:27AM +0800, pi3orama wrote:
> Sent from my iPhone
> > On Sep 2, 2015, at 3:39 AM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:

> > On Sat, Aug 29, 2015 at 04:21:55AM +0000, Wang Nan wrote:
> >> From: He Kuang <hekuang@huawei.com>
> >> This patch moves filter.h from include/linux/filter.h to

> > I said that before: this is not moving anything, it is copying :-)

> >> tools/include/linux/filter.h to enable other libraries to use the macros
> >> in it, like libbpf, which will be introduced by further patches. Currently,
> >> the moved filter.h only contains the macros needed by libbpf, to avoid
> >> introducing too many dependencies.

> >> MANIFEST is also updated for 'make perf-*-src-pkg'.

> > So, I did a:
> > 
> > $ diff -u include/linux/filter.h tools/include/linux/filter.h
> > 
> > And noticed these:
> > 
> > -/* Endianess conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
> > +/* Endianness conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
> > 
> > -/* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */
> > +/* Short form of mov based on type,
> > + * BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32
> > + */
> > 
> > -/* Conditional jumps against registers, if (dst_reg 'op' src_reg) goto pc + off16 */
> > +/* Conditional jumps against registers,
> > + * if (dst_reg 'op' src_reg) goto pc + off16
> > + */
> > 
> > -/* Conditional jumps against immediates, if (dst_reg 'op' imm32) goto pc + off16 */
> > +/* Conditional jumps against immediates,
> > + * if (dst_reg 'op' imm32) goto pc + off16
> > + */
> > 
> > ------------------------------------------------------------------
> > 
> 
> I think these changes are made after we made this patch.

Don't think so, for instance, this one:

/* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */

[acme@zoo linux]$ git log -p include/linux/filter.h | grep 'Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32'
 /* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */
 /* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */
 /* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */
+/* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */
[acme@zoo linux]$

Was introduced, in just one line, and never again touched, just appearing as context in
subsequent patches.

I bet this was related to checkpatch.pl complaining that the line is longer than 80 columns ;-\

Ditto for:

[acme@zoo linux]$ git log -p include/linux/filter.h | grep 'Endiann\?ess conversion'
 /* Endianess conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
 /* Endianess conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
+/* Endianess conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
[acme@zoo linux]$
 
> Thank you for checking it.

np.

- Arnaldo

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 28/31] perf probe: Init symbol as kprobe
  2015-09-01 20:11   ` Arnaldo Carvalho de Melo
@ 2015-09-02  1:22     ` Wangnan (F)
  2015-09-02  1:38     ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 0 replies; 94+ messages in thread
From: Wangnan (F) @ 2015-09-02  1:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Masami Hiramatsu
  Cc: mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra



On 2015/9/2 4:11, Arnaldo Carvalho de Melo wrote:
> On Sat, Aug 29, 2015 at 04:22:02AM +0000, Wang Nan wrote:
>> Before this patch, add_perf_probe_events() initializes symbol maps only
>> for uprobes if the first 'struct perf_probe_event' passed to it is a
>> uprobe event. This is a trick that works because 'perf probe''s command
>> line syntax requires the first element of the probe_event array to be a
>> kprobe if the array contains any kprobe.
>>
>> However, with the incoming BPF uprobe support, that constraint no longer
>> holds, since 'perf record' will also set k/u probes through a BPF
>> object, and it is possible to pass an array that contains a kprobe but
>> whose first element is a uprobe.
>>
>> This patch initializes symbol maps for kprobes even if all of the events
>> are uprobes, because the extra cost should be small enough.
> Masami, are you Ok with this one?

I think he would be okay with it because it is his idea :)

Please refer to: http://lkml.kernel.org/n/558E5F42.1060705@hitachi.com

Thank you.
> - Arnaldo
>   
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: David Ahern <dsahern@gmail.com>
>> Cc: He Kuang <hekuang@huawei.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Kaixu Xia <xiakaixu@huawei.com>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Link: http://lkml.kernel.org/n/1436445342-1402-39-git-send-email-wangnan0@huawei.com
>> ---
>>   tools/perf/util/probe-event.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
>> index e720913..b94a8d7 100644
>> --- a/tools/perf/util/probe-event.c
>> +++ b/tools/perf/util/probe-event.c
>> @@ -2789,7 +2789,7 @@ int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
>>   {
>>   	int i, ret;
>>   
>> -	ret = init_symbol_maps(pevs->uprobes);
>> +	ret = init_symbol_maps(false);
>>   	if (ret < 0)
>>   		return ret;
>>   
>> -- 
>> 2.1.0



^ permalink raw reply	[flat|nested] 94+ messages in thread

* RE: Re: [PATCH 28/31] perf probe: Init symbol as kprobe
  2015-09-01 20:11   ` Arnaldo Carvalho de Melo
  2015-09-02  1:22     ` Wangnan (F)
@ 2015-09-02  1:38     ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 0 replies; 94+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-02  1:38 UTC (permalink / raw)
  To: 'Arnaldo Carvalho de Melo'
  Cc: Wang Nan, mingo, ast, linux-kernel, lizefan, pi3orama,
	Brendan Gregg, Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa,
	Kaixu Xia, Namhyung Kim, Paul Mackerras, Peter Zijlstra

> From: Arnaldo Carvalho de Melo [mailto:acme@redhat.com]
> 
> On Sat, Aug 29, 2015 at 04:22:02AM +0000, Wang Nan wrote:
> > Before this patch, add_perf_probe_events() initializes symbol maps only
> > for uprobes if the first 'struct perf_probe_event' passed to it is a
> > uprobe event. This is a trick that works because 'perf probe''s command
> > line syntax requires the first element of the probe_event array to be a
> > kprobe if the array contains any kprobe.
> >
> > However, with the incoming BPF uprobe support, that constraint no longer
> > holds, since 'perf record' will also set k/u probes through a BPF
> > object, and it is possible to pass an array that contains a kprobe but
> > whose first element is a uprobe.
> >
> > This patch initializes symbol maps for kprobes even if all of the events
> > are uprobes, because the extra cost should be small enough.
> 
> Masami, are you Ok with this one?

Yeah, looks OK for me ! :)

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thanks!

> 
> - Arnaldo
> 
> > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > Cc: Alexei Starovoitov <ast@plumgrid.com>
> > Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: David Ahern <dsahern@gmail.com>
> > Cc: He Kuang <hekuang@huawei.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Kaixu Xia <xiakaixu@huawei.com>
> > Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Cc: Paul Mackerras <paulus@samba.org>
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Cc: Zefan Li <lizefan@huawei.com>
> > Cc: pi3orama@163.com
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Link: http://lkml.kernel.org/n/1436445342-1402-39-git-send-email-wangnan0@huawei.com
> > ---
> >  tools/perf/util/probe-event.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> > index e720913..b94a8d7 100644
> > --- a/tools/perf/util/probe-event.c
> > +++ b/tools/perf/util/probe-event.c
> > @@ -2789,7 +2789,7 @@ int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> >  {
> >  	int i, ret;
> >
> > -	ret = init_symbol_maps(pevs->uprobes);
> > +	ret = init_symbol_maps(false);
> >  	if (ret < 0)
> >  		return ret;
> >
> > --
> > 2.1.0

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH] perf tools: Don't set leader if parser doesn't collect an evsel
  2015-08-29  4:21 ` [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected Wang Nan
  2015-08-31 19:20   ` Arnaldo Carvalho de Melo
@ 2015-09-02  2:53   ` Wang Nan
  2015-09-02  3:01     ` Wangnan (F)
  2015-09-02  5:57     ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 2 replies; 94+ messages in thread
From: Wang Nan @ 2015-09-02  2:53 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, Wang Nan, Alexei Starovoitov, Jiri Olsa,
	Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama

Similar to patch 'perf tools: Don't set cmdline_group_boundary if no
evsel is collected', in the case where the parser collects no evsel (at
this point it shouldn't happen), parse_events__set_leader() is not safe.

This patch checks list_empty() before calling __perf_evlist__set_leader()
for safety.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---

I'd like to queue this patch into my next pull request. Since it is not
a real bug, it may be dropped.

---
 tools/perf/util/parse-events.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index f2c0317..836d226 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -793,6 +793,9 @@ void parse_events__set_leader(char *name, struct list_head *list)
 {
 	struct perf_evsel *leader;
 
+	if (list_empty(list))
+		return;
+
 	__perf_evlist__set_leader(list);
 	leader = list_entry(list->next, struct perf_evsel, node);
 	leader->group_name = name ? strdup(name) : NULL;
-- 
1.8.3.4
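
To show why the guard matters, here is a minimal standalone sketch. It
assumes a stripped-down stand-in for the kernel list API, and the names
struct toy_evsel, toy_set_leader() and dup_string() are hypothetical,
not the perf code. On an empty circular list, list->next points back at
the list head itself, so list_entry() would hand back a pointer that is
not a real evsel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

int list_empty(const struct list_head *h)
{
	return h->next == h;
}

void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, stripped-down evsel: only the fields this sketch needs. */
struct toy_evsel {
	struct list_head node;
	char *group_name;
};

/* Portable replacement for strdup(); returns NULL on allocation failure. */
char *dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/*
 * Returns 1 if a leader was named, 0 if the list was empty.
 * Without the list_empty() check, list_entry() would be applied to the
 * list head itself and the write to group_name would corrupt memory.
 */
int toy_set_leader(const char *name, struct list_head *list)
{
	struct toy_evsel *leader;

	if (list_empty(list))
		return 0;

	leader = list_entry(list->next, struct toy_evsel, node);
	leader->group_name = name ? dup_string(name) : NULL;
	return 1;
}
```

Calling toy_set_leader() on a freshly initialized list returns 0 and
touches nothing; after one list_add_tail() it names the first entry.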


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't set leader if parser doesn't collect an evsel
  2015-09-02  2:53   ` [PATCH] perf tools: Don't set leader if parser doesn't collect an evsel Wang Nan
@ 2015-09-02  3:01     ` Wangnan (F)
  2015-09-02  5:57     ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 0 replies; 94+ messages in thread
From: Wangnan (F) @ 2015-09-02  3:01 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, Alexei Starovoitov, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama



On 2015/9/2 10:53, Wang Nan wrote:
> Similar to patch 'perf tools: Don't set cmdline_group_boundary if no
> evsel is collected', in the case where the parser collects no evsel (at
> this point it shouldn't happen), parse_events__set_leader() is not safe.
>
> This patch checks list_empty() before calling __perf_evlist__set_leader()
> for safety.
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>
> I'd like to queue this patch into my next pull request. Since it is not
> a real bug, it may be dropped.

I think merging this into patch 2/31 would be better. If we decide to
drop it, then only one patch needs to be considered.

> ---
>   tools/perf/util/parse-events.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index f2c0317..836d226 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -793,6 +793,9 @@ void parse_events__set_leader(char *name, struct list_head *list)
>   {
>   	struct perf_evsel *leader;
>   
> +	if (list_empty(list))
> +		return;
> +
>   	__perf_evlist__set_leader(list);
>   	leader = list_entry(list->next, struct perf_evsel, node);
>   	leader->group_name = name ? strdup(name) : NULL;



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 07/31] perf probe: Attach trace_probe_event with perf_probe_event
  2015-08-29  4:21 ` [PATCH 07/31] perf probe: Attach trace_probe_event with perf_probe_event Wang Nan
@ 2015-09-02  4:32   ` Namhyung Kim
  2015-09-02  5:40     ` Wangnan (F)
  0 siblings, 1 reply; 94+ messages in thread
From: Namhyung Kim @ 2015-09-02  4:32 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Paul Mackerras, Peter Zijlstra

Hi,

On Sat, Aug 29, 2015 at 04:21:41AM +0000, Wang Nan wrote:
> This patch drops struct __event_package structure. Instead, it adds
> trace_probe_event into 'struct perf_probe_event'.
> 
> trace_probe_event information gives further patches a chance to access
> actual probe points and actual arguments. Using them, bpf_loader will
> be able to attach one bpf program to different probing points of an
> inline function (which has multiple probing points) and glob
> functions. Moreover, by reading arguments information, bpf code for
> reading those arguments can be generated.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/n/1436445342-1402-22-git-send-email-wangnan0@huawei.com
> ---

[SNIP]

> +int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> +			  bool cleanup)
> +{
> +	int i, ret;
>  
>  	ret = init_symbol_maps(pevs->uprobes);
> -	if (ret < 0) {
> -		free(pkgs);
> +	if (ret < 0)
>  		return ret;
> -	}
>  
>  	/* Loop 1: convert all events */
>  	for (i = 0; i < npevs; i++) {
> -		pkgs[i].pev = &pevs[i];
>  		/* Init kprobe blacklist if needed */
> -		if (!pkgs[i].pev->uprobes)
> +		if (pevs[i].uprobes)

Missing '!'.

Thanks,
Namhyung


>  			kprobe_blacklist__init();
>  		/* Convert with or without debuginfo */
> -		ret  = convert_to_probe_trace_events(pkgs[i].pev,
> -						     &pkgs[i].tevs);
> -		if (ret < 0)
> +		ret  = convert_to_probe_trace_events(&pevs[i], &pevs[i].tevs);
> +		if (ret < 0) {
> +			cleanup = true;
>  			goto end;
> -		pkgs[i].ntevs = ret;
> +		}
> +		pevs[i].ntevs = ret;
>  	}
>  	/* This just release blacklist only if allocated */
>  	kprobe_blacklist__release();
>  
>  	/* Loop 2: add all events */
>  	for (i = 0; i < npevs; i++) {
> -		ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs,
> -					       pkgs[i].ntevs,
> +		ret = __add_probe_trace_events(&pevs[i], pevs[i].tevs,
> +					       pevs[i].ntevs,
>  					       probe_conf.force_add);
>  		if (ret < 0)
>  			break;
>  	}
>  end:
>  	/* Loop 3: cleanup and free trace events  */
> -	for (i = 0; i < npevs; i++) {
> -		for (j = 0; j < pkgs[i].ntevs; j++)
> -			clear_probe_trace_event(&pkgs[i].tevs[j]);
> -		zfree(&pkgs[i].tevs);
> -	}
> -	free(pkgs);
> +	for (i = 0; cleanup && (i < npevs); i++)
> +		cleanup_perf_probe_event(&pevs[i]);
>  	exit_symbol_maps();
>  
>  	return ret;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 07/31] perf probe: Attach trace_probe_event with perf_probe_event
  2015-09-02  4:32   ` Namhyung Kim
@ 2015-09-02  5:40     ` Wangnan (F)
  0 siblings, 0 replies; 94+ messages in thread
From: Wangnan (F) @ 2015-09-02  5:40 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: acme, mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Paul Mackerras, Peter Zijlstra



On 2015/9/2 12:32, Namhyung Kim wrote:
> Hi,
>
> On Sat, Aug 29, 2015 at 04:21:41AM +0000, Wang Nan wrote:
>> This patch drops struct __event_package structure. Instead, it adds
>> trace_probe_event into 'struct perf_probe_event'.
>>
>> trace_probe_event information gives further patches a chance to access
>> actual probe points and actual arguments. Using them, bpf_loader will
>> be able to attach one bpf program to different probing points of an
>> inline function (which has multiple probing points) and glob
>> functions. Moreover, by reading arguments information, bpf code for
>> reading those arguments can be generated.
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: David Ahern <dsahern@gmail.com>
>> Cc: He Kuang <hekuang@huawei.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Kaixu Xia <xiakaixu@huawei.com>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Link: http://lkml.kernel.org/n/1436445342-1402-22-git-send-email-wangnan0@huawei.com
>> ---
> [SNIP]
>
>> +int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
>> +			  bool cleanup)
>> +{
>> +	int i, ret;
>>   
>>   	ret = init_symbol_maps(pevs->uprobes);
>> -	if (ret < 0) {
>> -		free(pkgs);
>> +	if (ret < 0)
>>   		return ret;
>> -	}
>>   
>>   	/* Loop 1: convert all events */
>>   	for (i = 0; i < npevs; i++) {
>> -		pkgs[i].pev = &pevs[i];
>>   		/* Init kprobe blacklist if needed */
>> -		if (!pkgs[i].pev->uprobes)
>> +		if (pevs[i].uprobes)
> Missing '!'.

It's my fault. Already fixed in my local tree.

Thank you for your review!
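
For reference, a minimal standalone sketch of the condition as it was
presumably intended. It is not the perf code: struct fake_probe_event,
blacklist_init() and convert_all() are hypothetical stand-ins used only
for illustration. The blacklist is initialised only for events that are
not uprobes:

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct perf_probe_event; only the flag matters. */
struct fake_probe_event {
	bool uprobes;
};

static int blacklist_inits;

/* Hypothetical stand-in for kprobe_blacklist__init(); it only counts calls. */
static void blacklist_init(void)
{
	blacklist_inits++;
}

/* Returns how many times the blacklist was initialised for the given events. */
static int convert_all(const struct fake_probe_event *pevs, int npevs)
{
	int i;

	blacklist_inits = 0;
	for (i = 0; i < npevs; i++) {
		/* The blacklist only applies to kprobes, hence the '!'. */
		if (!pevs[i].uprobes)
			blacklist_init();
	}
	return blacklist_inits;
}
```

With the '!' dropped, the counter would instead go up once per uprobe
event, which is the inversion pointed out in the review.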

> Thanks,
> Namhyung
>
>
>>   			kprobe_blacklist__init();
>>   		/* Convert with or without debuginfo */
>> -		ret  = convert_to_probe_trace_events(pkgs[i].pev,
>> -						     &pkgs[i].tevs);
>> -		if (ret < 0)
>> +		ret  = convert_to_probe_trace_events(&pevs[i], &pevs[i].tevs);
>> +		if (ret < 0) {
>> +			cleanup = true;
>>   			goto end;
>> -		pkgs[i].ntevs = ret;
>> +		}
>> +		pevs[i].ntevs = ret;
>>   	}
>>   	/* This just release blacklist only if allocated */
>>   	kprobe_blacklist__release();
>>   
>>   	/* Loop 2: add all events */
>>   	for (i = 0; i < npevs; i++) {
>> -		ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs,
>> -					       pkgs[i].ntevs,
>> +		ret = __add_probe_trace_events(&pevs[i], pevs[i].tevs,
>> +					       pevs[i].ntevs,
>>   					       probe_conf.force_add);
>>   		if (ret < 0)
>>   			break;
>>   	}
>>   end:
>>   	/* Loop 3: cleanup and free trace events  */
>> -	for (i = 0; i < npevs; i++) {
>> -		for (j = 0; j < pkgs[i].ntevs; j++)
>> -			clear_probe_trace_event(&pkgs[i].tevs[j]);
>> -		zfree(&pkgs[i].tevs);
>> -	}
>> -	free(pkgs);
>> +	for (i = 0; cleanup && (i < npevs); i++)
>> +		cleanup_perf_probe_event(&pevs[i]);
>>   	exit_symbol_maps();
>>   
>>   	return ret;



^ permalink raw reply	[flat|nested] 94+ messages in thread

* RE: [PATCH] perf tools: Don't set leader if parser doesn't collect an evsel
  2015-09-02  2:53   ` [PATCH] perf tools: Don't set leader if parser doesn't collect an evsel Wang Nan
  2015-09-02  3:01     ` Wangnan (F)
@ 2015-09-02  5:57     ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-02  6:09       ` Wangnan (F)
       [not found]       ` <1441176553-116129-1-git-send-email-wangnan0@huawei.com>
  1 sibling, 2 replies; 94+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-02  5:57 UTC (permalink / raw)
  To: 'Wang Nan', acme
  Cc: linux-kernel, Alexei Starovoitov, Jiri Olsa, Namhyung Kim,
	Zefan Li, pi3orama

> From: Wang Nan [mailto:wangnan0@huawei.com]
> 
> Similar to patch 'perf tools: Don't set cmdline_group_boundary if no
> evsel is collected', in case when parser collects no evsel (at this
> point it shouldn't happen), parse_events__set_leader() is not safe.
> 
> This patch checks list_empty before calling __perf_evlist__set_leader()
> for safety reasons.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
> 
> I'd like to queue this patch into my next pull request. Since it is not
> a real bug, it may be dropped.
> 
> ---
>  tools/perf/util/parse-events.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index f2c0317..836d226 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -793,6 +793,9 @@ void parse_events__set_leader(char *name, struct list_head *list)
>  {
>  	struct perf_evsel *leader;
> 
> +	if (list_empty(list))

Would we need to warn/debug something here?

Thank you,

> +		return;
> +
>  	__perf_evlist__set_leader(list);
>  	leader = list_entry(list->next, struct perf_evsel, node);
>  	leader->group_name = name ? strdup(name) : NULL;
> --
> 1.8.3.4

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't set leader if parser doesn't collect an evsel
  2015-09-02  5:57     ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-02  6:09       ` Wangnan (F)
       [not found]       ` <1441176553-116129-1-git-send-email-wangnan0@huawei.com>
  1 sibling, 0 replies; 94+ messages in thread
From: Wangnan (F) @ 2015-09-02  6:09 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI, acme
  Cc: linux-kernel, Alexei Starovoitov, Jiri Olsa, Namhyung Kim,
	Zefan Li, pi3orama



On 2015/9/2 13:57, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: Wang Nan [mailto:wangnan0@huawei.com]
>>
>> Similar to patch 'perf tools: Don't set cmdline_group_boundary if no
>> evsel is collected', in case when parser collects no evsel (at this
>> point it shouldn't happen), parse_events__set_leader() is not safe.
>>
>> This patch checks list_empty before calling __perf_evlist__set_leader()
>> for safety reasons.
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> ---
>>
>> I'd like to queue this patch into my next pull request. Since it is not
>> a real bug, it may be dropped.
>>
>> ---
>>   tools/perf/util/parse-events.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
>> index f2c0317..836d226 100644
>> --- a/tools/perf/util/parse-events.c
>> +++ b/tools/perf/util/parse-events.c
>> @@ -793,6 +793,9 @@ void parse_events__set_leader(char *name, struct list_head *list)
>>   {
>>   	struct perf_evsel *leader;
>>
>> +	if (list_empty(list))
> Would we need to warn/debug something here?

OK, let's add a WARN message here and in the other 2 places.

Thank you.

> Thank you,
>
>> +		return;
>> +
>>   	__perf_evlist__set_leader(list);
>>   	leader = list_entry(list->next, struct perf_evsel, node);
>>   	leader->group_name = name ? strdup(name) : NULL;
>> --
>> 1.8.3.4



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel
       [not found]       ` <1441176553-116129-1-git-send-email-wangnan0@huawei.com>
@ 2015-09-02  6:53         ` Wangnan (F)
  2015-09-02 10:31           ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-02 11:54           ` Jiri Olsa
  0 siblings, 2 replies; 94+ messages in thread
From: Wangnan (F) @ 2015-09-02  6:53 UTC (permalink / raw)
  To: masami.hiramatsu.pt, acme
  Cc: Alexei Starovoitov, Jiri Olsa, Namhyung Kim, Zefan Li, pi3orama,
	linux-kernel

Sorry, I forgot to CC the kernel mailing list...

On 2015/9/2 14:49, Wang Nan wrote:
> If parse_events__scanner() collects no entry, perf_evlist__last(evlist)
> is invalid.
>
> Although it shouldn't happen at this point, before calling
> perf_evlist__last(), we should ensure the list is not empty for safety
> reasons.
>
> There are 3 places that need this check:
>
>   1. Before setting cmdline_group_boundary;
>   2. Before __perf_evlist__set_leader();
>   3. In foreach_evsel_in_last_glob.
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>
> Merge all 3 list_empty() test together into one patch.
>
> Add warning messages.
>
> Improve commit message.
>
> ---
>   tools/perf/util/parse-events.c | 22 +++++++++++++++++++---
>   1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index d826e6f..069848d 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -793,6 +793,11 @@ void parse_events__set_leader(char *name, struct list_head *list)
>   {
>   	struct perf_evsel *leader;
>   
> +	if (list_empty(list)) {
> +		__WARN_printf("WARNING: failed to set leader: empty list");
> +		return;
> +	}
> +
>   	__perf_evlist__set_leader(list);
>   	leader = list_entry(list->next, struct perf_evsel, node);
>   	leader->group_name = name ? strdup(name) : NULL;
> @@ -1143,10 +1148,15 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>   		int entries = data.idx - evlist->nr_entries;
>   		struct perf_evsel *last;
>   
> +		if (!list_empty(&data.list)) {
> +			last = list_entry(data.list.prev,
> +					  struct perf_evsel, node);
> +			last->cmdline_group_boundary = true;
> +		} else
> +			__WARN_printf("WARNING: event parser found nothing");
> +
>   		perf_evlist__splice_list_tail(evlist, &data.list, entries);
>   		evlist->nr_groups += data.nr_groups;
> -		last = perf_evlist__last(evlist);
> -		last->cmdline_group_boundary = true;
>   
>   		return 0;
>   	}
> @@ -1252,7 +1262,13 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
>   	struct perf_evsel *last = NULL;
>   	int err;
>   
> -	if (evlist->nr_entries > 0)
> +	/*
> +	 * Don't return when list_empty, give func a chance to report
> +	 * error when it found last == NULL.
> +	 *
> +	 * So no need to WARN here, let *func do this.
> +	 */
> +	if (!list_empty(&evlist->entries))
>   		last = perf_evlist__last(evlist);
>   
>   	do {



^ permalink raw reply	[flat|nested] 94+ messages in thread

* RE: [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel
  2015-09-02  6:53         ` [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel Wangnan (F)
@ 2015-09-02 10:31           ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-02 11:54           ` Jiri Olsa
  1 sibling, 0 replies; 94+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-02 10:31 UTC (permalink / raw)
  To: 'Wangnan (F)', acme
  Cc: Alexei Starovoitov, Jiri Olsa, Namhyung Kim, Zefan Li, pi3orama,
	linux-kernel

> From: Wangnan (F) [mailto:wangnan0@huawei.com]
> 
> Sorry, forget to CC kernel mailing list...
> 
> On 2015/9/2 14:49, Wang Nan wrote:
> > If parse_events__scanner() collects no entry, perf_evlist__last(evlist)
> > is invalid.
> >
> > Although it shouldn't happen at this point, before calling
> > perf_evlist__last(), we should ensure the list is not empty for safety
> > reason.
> >
> > There are 3 places need this checking:
> >
> >   1. Before setting cmdline_group_boundary;
> >   2. Before __perf_evlist__set_leader();
> >   3. In foreach_evsel_in_last_glob.
> >

This looks OK to me.

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thanks!

> > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Cc: Alexei Starovoitov <ast@plumgrid.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Cc: Zefan Li <lizefan@huawei.com>
> > Cc: pi3orama@163.com
> > ---
> >
> > Merge all 3 list_empty() test together into one patch.
> >
> > Add warning messages.
> >
> > Improve commit message.
> >
> > ---
> >   tools/perf/util/parse-events.c | 22 +++++++++++++++++++---
> >   1 file changed, 19 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> > index d826e6f..069848d 100644
> > --- a/tools/perf/util/parse-events.c
> > +++ b/tools/perf/util/parse-events.c
> > @@ -793,6 +793,11 @@ void parse_events__set_leader(char *name, struct list_head *list)
> >   {
> >   	struct perf_evsel *leader;
> >
> > +	if (list_empty(list)) {
> > +		__WARN_printf("WARNING: failed to set leader: empty list");
> > +		return;
> > +	}
> > +
> >   	__perf_evlist__set_leader(list);
> >   	leader = list_entry(list->next, struct perf_evsel, node);
> >   	leader->group_name = name ? strdup(name) : NULL;
> > @@ -1143,10 +1148,15 @@ int parse_events(struct perf_evlist *evlist, const char *str,
> >   		int entries = data.idx - evlist->nr_entries;
> >   		struct perf_evsel *last;
> >
> > +		if (!list_empty(&data.list)) {
> > +			last = list_entry(data.list.prev,
> > +					  struct perf_evsel, node);
> > +			last->cmdline_group_boundary = true;
> > +		} else
> > +			__WARN_printf("WARNING: event parser found nothing");
> > +
> >   		perf_evlist__splice_list_tail(evlist, &data.list, entries);
> >   		evlist->nr_groups += data.nr_groups;
> > -		last = perf_evlist__last(evlist);
> > -		last->cmdline_group_boundary = true;
> >
> >   		return 0;
> >   	}
> > @@ -1252,7 +1262,13 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
> >   	struct perf_evsel *last = NULL;
> >   	int err;
> >
> > -	if (evlist->nr_entries > 0)
> > +	/*
> > +	 * Don't return when list_empty, give func a chance to report
> > +	 * error when it found last == NULL.
> > +	 *
> > +	 * So no need to WARN here, let *func do this.
> > +	 */
> > +	if (!list_empty(&evlist->entries))
> >   		last = perf_evlist__last(evlist);
> >
> >   	do {
> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel
  2015-09-02  6:53         ` [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel Wangnan (F)
  2015-09-02 10:31           ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-02 11:54           ` Jiri Olsa
  2015-09-02 12:05             ` pi3orama
  1 sibling, 1 reply; 94+ messages in thread
From: Jiri Olsa @ 2015-09-02 11:54 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: masami.hiramatsu.pt, acme, Alexei Starovoitov, Jiri Olsa,
	Namhyung Kim, Zefan Li, pi3orama, linux-kernel

On Wed, Sep 02, 2015 at 02:53:58PM +0800, Wangnan (F) wrote:
> Sorry, forget to CC kernel mailing list...
> 
> On 2015/9/2 14:49, Wang Nan wrote:
> >If parse_events__scanner() collects no entry, perf_evlist__last(evlist)
> >is invalid.
> >
> >Although it shouldn't happen at this point, before calling
> >perf_evlist__last(), we should ensure the list is not empty for safety
> >reason.
> >
> >There are 3 places need this checking:
> >
> >  1. Before setting cmdline_group_boundary;
> >  2. Before __perf_evlist__set_leader();
> >  3. In foreach_evsel_in_last_glob.
> >
> >Signed-off-by: Wang Nan <wangnan0@huawei.com>
> >Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> >Cc: Alexei Starovoitov <ast@plumgrid.com>
> >Cc: Jiri Olsa <jolsa@kernel.org>
> >Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> >Cc: Namhyung Kim <namhyung@kernel.org>
> >Cc: Zefan Li <lizefan@huawei.com>
> >Cc: pi3orama@163.com
> >---
> >
> >Merge all 3 list_empty() test together into one patch.
> >
> >Add warning messages.
> >
> >Improve commit message.
> >
> >---
> >  tools/perf/util/parse-events.c | 22 +++++++++++++++++++---
> >  1 file changed, 19 insertions(+), 3 deletions(-)
> >
> >diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> >index d826e6f..069848d 100644
> >--- a/tools/perf/util/parse-events.c
> >+++ b/tools/perf/util/parse-events.c
> >@@ -793,6 +793,11 @@ void parse_events__set_leader(char *name, struct list_head *list)
> >  {
> >  	struct perf_evsel *leader;
> >+	if (list_empty(list)) {
> >+		__WARN_printf("WARNING: failed to set leader: empty list");
> >+		return;
> >+	}
> >+
> >  	__perf_evlist__set_leader(list);
> >  	leader = list_entry(list->next, struct perf_evsel, node);
> >  	leader->group_name = name ? strdup(name) : NULL;
> >@@ -1143,10 +1148,15 @@ int parse_events(struct perf_evlist *evlist, const char *str,
> >  		int entries = data.idx - evlist->nr_entries;
> >  		struct perf_evsel *last;
> >+		if (!list_empty(&data.list)) {
> >+			last = list_entry(data.list.prev,
> >+					  struct perf_evsel, node);
> >+			last->cmdline_group_boundary = true;
> >+		} else
> >+			__WARN_printf("WARNING: event parser found nothing");

we need to unify error printing in this object ;-) with this one it's 3

__WARN_printf(...
fprintf(stderr,...
printf(...
WARN_ONCE(...

;-)


> >+
> >  		perf_evlist__splice_list_tail(evlist, &data.list, entries);
> >  		evlist->nr_groups += data.nr_groups;
> >-		last = perf_evlist__last(evlist);
> >-		last->cmdline_group_boundary = true;
> >  		return 0;
> >  	}
> >@@ -1252,7 +1262,13 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
> >  	struct perf_evsel *last = NULL;
> >  	int err;
> >-	if (evlist->nr_entries > 0)
> >+	/*
> >+	 * Don't return when list_empty, give func a chance to report
> >+	 * error when it found last == NULL.
> >+	 *
> >+	 * So no need to WARN here, let *func do this.
> >+	 */
> >+	if (!list_empty(&evlist->entries))

why is it better than to check evlist->nr_entries?
evlist->nr_entries is equivalent to !list_empty(&evlist->entries) in here, right?


jirka

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel
  2015-09-02 11:54           ` Jiri Olsa
@ 2015-09-02 12:05             ` pi3orama
  2015-09-02 12:46               ` Jiri Olsa
  2015-09-02 13:55               ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 94+ messages in thread
From: pi3orama @ 2015-09-02 12:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Wangnan (F),
	masami.hiramatsu.pt, acme, Alexei Starovoitov, Jiri Olsa,
	Namhyung Kim, Zefan Li, linux-kernel



Sent from my iPhone

> On Sep 2, 2015, at 7:54 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
>> On Wed, Sep 02, 2015 at 02:53:58PM +0800, Wangnan (F) wrote:
>> Sorry, forget to CC kernel mailing list...
>> 
>>> On 2015/9/2 14:49, Wang Nan wrote:
>>> If parse_events__scanner() collects no entry, perf_evlist__last(evlist)
>>> is invalid.
>>> 
>>> Although it shouldn't happen at this point, before calling
>>> perf_evlist__last(), we should ensure the list is not empty for safety
>>> reason.
>>> 
>>> There are 3 places need this checking:
>>> 
>>> 1. Before setting cmdline_group_boundary;
>>> 2. Before __perf_evlist__set_leader();
>>> 3. In foreach_evsel_in_last_glob.
>>> 
>>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>>> Cc: Jiri Olsa <jolsa@kernel.org>
>>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>> Cc: Zefan Li <lizefan@huawei.com>
>>> Cc: pi3orama@163.com
>>> ---
>>> 
>>> Merge all 3 list_empty() test together into one patch.
>>> 
>>> Add warning messages.
>>> 
>>> Improve commit message.
>>> 
>>> ---
>>> tools/perf/util/parse-events.c | 22 +++++++++++++++++++---
>>> 1 file changed, 19 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
>>> index d826e6f..069848d 100644
>>> --- a/tools/perf/util/parse-events.c
>>> +++ b/tools/perf/util/parse-events.c
>>> @@ -793,6 +793,11 @@ void parse_events__set_leader(char *name, struct list_head *list)
>>> {
>>>    struct perf_evsel *leader;
>>> +    if (list_empty(list)) {
>>> +        __WARN_printf("WARNING: failed to set leader: empty list");
>>> +        return;
>>> +    }
>>> +
>>>    __perf_evlist__set_leader(list);
>>>    leader = list_entry(list->next, struct perf_evsel, node);
>>>    leader->group_name = name ? strdup(name) : NULL;
>>> @@ -1143,10 +1148,15 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>>>        int entries = data.idx - evlist->nr_entries;
>>>        struct perf_evsel *last;
>>> +        if (!list_empty(&data.list)) {
>>> +            last = list_entry(data.list.prev,
>>> +                      struct perf_evsel, node);
>>> +            last->cmdline_group_boundary = true;
>>> +        } else
>>> +            __WARN_printf("WARNING: event parser found nothing");
> 
> we need to unify error printing in this object ;-) with this one it's 3
> 
> __WARN_printf(...
> fprintf(stderr,...
> printf(...
> WARN_ONCE(...
> 
> ;-)
> 
> 
>>> +
>>>        perf_evlist__splice_list_tail(evlist, &data.list, entries);
>>>        evlist->nr_groups += data.nr_groups;
>>> -        last = perf_evlist__last(evlist);
>>> -        last->cmdline_group_boundary = true;
>>>        return 0;
>>>    }
>>> @@ -1252,7 +1262,13 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
>>>    struct perf_evsel *last = NULL;
>>>    int err;
>>> -    if (evlist->nr_entries > 0)
>>> +    /*
>>> +     * Don't return when list_empty, give func a chance to report
>>> +     * error when it found last == NULL.
>>> +     *
>>> +     * So no need to WARN here, let *func do this.
>>> +     */
>>> +    if (!list_empty(&evlist->entries))
> 
> why is it better than to check evlist->nr_entries?
> evlist->nr_entries is equivalent to !list_empty(&evlist->entries) in here, right?
> 

By checking the list itself we don't rely on the assumption that nr_entries
reflects the actual number of elements in that list, which makes the logic of
this code more self-contained. Don't you think so?

At this point they are equivalent, but the whole patch is a preventive measure.

Thank you.
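
A minimal sketch of the guard being discussed, written as a standalone
userspace example rather than the perf code. The list type is only
modelled on the kernel's list_head, and struct fake_evsel, last_evsel()
and the other helper names are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical stand-in for struct perf_evsel. */
struct fake_evsel {
	bool cmdline_group_boundary;
	struct list_node node;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_add_tail_node(struct list_node *new, struct list_node *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Recover the containing fake_evsel from its embedded node. */
static struct fake_evsel *node_to_evsel(struct list_node *n)
{
	return (struct fake_evsel *)((char *)n - offsetof(struct fake_evsel, node));
}

/*
 * Return the last element, or NULL if the list is empty. Without the
 * check, head->prev is the head itself and the conversion would yield
 * a pointer to an evsel that does not exist.
 */
static struct fake_evsel *last_evsel(struct list_node *head)
{
	if (list_is_empty(head))
		return NULL;
	return node_to_evsel(head->prev);
}
```

The check looks only at the list links, so it gives the right answer
even if a separately kept counter were ever out of step with the list.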

> 
> jirka


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 18/31] perf test: Add 'perf test BPF'
  2015-08-29  4:21 ` [PATCH 18/31] perf test: Add 'perf test BPF' Wang Nan
@ 2015-09-02 12:45   ` Namhyung Kim
  2015-09-05 12:21     ` Wang Nan
  0 siblings, 1 reply; 94+ messages in thread
From: Namhyung Kim @ 2015-09-02 12:45 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Peter Zijlstra

On Sat, Aug 29, 2015 at 04:21:52AM +0000, Wang Nan wrote:
> This patch adds a BPF testcase for testing BPF event filtering.
> 
> By utilizing the result of 'perf test LLVM', this patch compiles the
> eBPF sample program and then tests its ability. The BPF script in 'perf
> test LLVM' collects half of the executions of epoll_pwait(). This patch
> runs it 111 times, so the result should contain 56 samples.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Link: http://lkml.kernel.org/n/1440151770-129878-16-git-send-email-wangnan0@huawei.com
> ---

[SNIP]

> +static int prepare_bpf(void *obj_buf, size_t obj_buf_sz)
> +{
> +	int err;
> +	char errbuf[BUFSIZ];
> +
> +	err = bpf__prepare_load_buffer(obj_buf, obj_buf_sz, NULL);
> +	if (err) {
> +		bpf__strerror_prepare_load("[buffer]", false, err, errbuf,
> +					   sizeof(errbuf));
> +		fprintf(stderr, " (%s)", errbuf);
> +		return TEST_FAIL;
> +	}
> +
> +	err = bpf__probe();
> +	if (err) {
> +		bpf__strerror_load(err, errbuf, sizeof(errbuf));
> +		fprintf(stderr, " (%s)", errbuf);
> +		if (getuid() != 0)

geteuid() ?

Thanks,
Namhyung


> +			fprintf(stderr, " (try run as root)");
> +		return TEST_FAIL;
> +	}
> +
> +	err = bpf__load();
> +	if (err) {
> +		bpf__strerror_load(err, errbuf, sizeof(errbuf));
> +		fprintf(stderr, " (%s)", errbuf);
> +		return TEST_FAIL;
> +	}
> +
> +	return 0;
> +}
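
A short sketch of the geteuid() variant suggested above. The helper
names running_as_root() and root_hint() are hypothetical; getuid() and
geteuid() are the standard POSIX calls. Permission checks are generally
made against the effective UID, so a set-user-ID binary can have a
non-zero real UID and still run with root privileges:

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * True when the process runs with an effective UID of 0. The real UID
 * returned by getuid() may differ, for example in a set-user-ID binary.
 */
static bool running_as_root(void)
{
	return geteuid() == 0;
}

/* Hint text for an error path; NULL means there is no hint to print. */
static const char *root_hint(void)
{
	return running_as_root() ? NULL : " (try run as root)";
}
```

An error path would then print the hint only when root_hint() returns a
non-NULL string.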

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel
  2015-09-02 12:05             ` pi3orama
@ 2015-09-02 12:46               ` Jiri Olsa
  2015-09-02 13:55               ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 94+ messages in thread
From: Jiri Olsa @ 2015-09-02 12:46 UTC (permalink / raw)
  To: pi3orama
  Cc: Wangnan (F),
	masami.hiramatsu.pt, acme, Alexei Starovoitov, Jiri Olsa,
	Namhyung Kim, Zefan Li, linux-kernel

On Wed, Sep 02, 2015 at 08:05:54PM +0800, pi3orama wrote:

SNIP

> >>>        perf_evlist__splice_list_tail(evlist, &data.list, entries);
> >>>        evlist->nr_groups += data.nr_groups;
> >>> -        last = perf_evlist__last(evlist);
> >>> -        last->cmdline_group_boundary = true;
> >>>        return 0;
> >>>    }
> >>> @@ -1252,7 +1262,13 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
> >>>    struct perf_evsel *last = NULL;
> >>>    int err;
> >>> -    if (evlist->nr_entries > 0)
> >>> +    /*
> >>> +     * Don't return when list_empty, give func a chance to report
> >>> +     * error when it found last == NULL.
> >>> +     *
> >>> +     * So no need to WARN here, let *func do this.
> >>> +     */
> >>> +    if (!list_empty(&evlist->entries))
> > 
> > why is it better than to check evlist->nr_entries?
> > evlist->nr_entries is equivalent to !list_empty(&evlist->entries) in here, right?
> > 
> 
> By checking list we won't rely on the assumption that nr_entries reflects the
> actual number of elements in that list, makes the logic of this code more compact.
> Don't you think so?
> 
> At this point they are equivalent, but the whole patch is preventive action.

ok, fair enough ;-)

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel
  2015-09-02 12:05             ` pi3orama
  2015-09-02 12:46               ` Jiri Olsa
@ 2015-09-02 13:55               ` Arnaldo Carvalho de Melo
  2015-09-02 14:04                 ` pi3orama
  1 sibling, 1 reply; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-02 13:55 UTC (permalink / raw)
  To: pi3orama
  Cc: Jiri Olsa, Wangnan (F),
	masami.hiramatsu.pt, Alexei Starovoitov, Jiri Olsa, Namhyung Kim,
	Zefan Li, acme, linux-kernel

On Wed, Sep 02, 2015 at 08:05:54PM +0800, pi3orama wrote:
> Sent from my iPhone
> > On Sep 2, 2015, at 7:54 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> >> On Wed, Sep 02, 2015 at 02:53:58PM +0800, Wangnan (F) wrote:
> >>> @@ -1252,7 +1262,13 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
> >>>    struct perf_evsel *last = NULL;
> >>>    int err;
> >>> -    if (evlist->nr_entries > 0)
> >>> +    /*
> >>> +     * Don't return when list_empty, give func a chance to report
> >>> +     * error when it found last == NULL.
> >>> +     *
> >>> +     * So no need to WARN here, let *func do this.
> >>> +     */
> >>> +    if (!list_empty(&evlist->entries))

> > why is it better than to check evlist->nr_entries?
> > evlist->nr_entries is equivalent to !list_empty(&evlist->entries) in here, right?
 
> By checking list we won't rely on the assumption that nr_entries reflects the
> actual number of elements in that list, makes the logic of this code more compact.

But why would we want to break that assumption?

If I see FOO->entries and FOO->nr_entries, it is reasonable to expect
that whatever data structure FOO->entries may be has FOO->nr_entries
elements in it. Let's not break that assumption.

- Arnaldo

> Don't you think so?
> 
> At this point they are equivalent, but the whole patch is preventive action.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel
  2015-09-02 13:55               ` Arnaldo Carvalho de Melo
@ 2015-09-02 14:04                 ` pi3orama
  2015-09-02 14:43                   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 94+ messages in thread
From: pi3orama @ 2015-09-02 14:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Wangnan (F),
	masami.hiramatsu.pt, Alexei Starovoitov, Jiri Olsa, Namhyung Kim,
	Zefan Li, acme, linux-kernel



Sent from my iPhone

> On Sep 2, 2015, at 9:55 PM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> 
> On Wed, Sep 02, 2015 at 08:05:54PM +0800, pi3orama wrote:
>> Sent from my iPhone
>>> On Sep 2, 2015, at 7:54 PM, Jiri Olsa <jolsa@redhat.com> wrote:
>>>>> On Wed, Sep 02, 2015 at 02:53:58PM +0800, Wangnan (F) wrote:
>>>>> @@ -1252,7 +1262,13 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
>>>>>   struct perf_evsel *last = NULL;
>>>>>   int err;
>>>>> -    if (evlist->nr_entries > 0)
>>>>> +    /*
>>>>> +     * Don't return when list_empty, give func a chance to report
>>>>> +     * error when it found last == NULL.
>>>>> +     *
>>>>> +     * So no need to WARN here, let *func do this.
>>>>> +     */
>>>>> +    if (!list_empty(&evlist->entries))
> 
>>> why is it better than to check evlist->nr_entries?
>>> evlist->nr_entries is equivalent to !list_empty(&evlist->entries) in here, right?
> 
>> By checking list we won't rely on the assumption that nr_entries reflects the
>> actual number of elements in that list, makes the logic of this code more compact.
> 
> But why would we want to break that assumption?
> 
> If I see FOO->entries and FOO->nr_entries, it is reasonable to expect
> that whatever data structure FOO->entries may be has FOO->nr_entries in
> it, lets not break that assumption.

Then we should enforce it. For example, check the list collected by the
parser and report an error if it is empty, so that nobody (like me) can add
nothing to the list and still report success. I don't insist on this patch.
In my newest patch set I use a real dummy evsel as a placeholder, so we
won't hit an empty list again.

Thank you.

> 
> - Arnaldo
> 
>> Don't you think so?
>> 
>> At this point they are equivalent, but the whole patch is preventive action.
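
The "enforce it" idea discussed in this subthread can be sketched as below. This is an illustration under stated assumptions, not perf code: sk_list_head, sk_counted_list and every sk_* function are hypothetical stand-ins for the kernel-style list and for the entries/nr_entries pair in struct perf_evlist.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a kernel-style circular doubly linked list. */
struct sk_list_head {
	struct sk_list_head *next, *prev;
};

/* Hypothetical container pairing a list with its element counter. */
struct sk_counted_list {
	struct sk_list_head entries;
	int nr_entries;
};

static void sk_counted_list_init(struct sk_counted_list *cl)
{
	cl->entries.next = &cl->entries;
	cl->entries.prev = &cl->entries;
	cl->nr_entries = 0;
}

/* Link a node at the tail WITHOUT touching any counter (the risky path). */
static void sk_raw_add_tail(struct sk_list_head *node,
			    struct sk_list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Link a node and keep nr_entries in step with the list. */
static void sk_counted_list_add(struct sk_counted_list *cl,
				struct sk_list_head *node)
{
	sk_raw_add_tail(node, &cl->entries);
	cl->nr_entries++;
}

/* Walk the list and report whether nr_entries matches its real length. */
static bool sk_counted_list_consistent(const struct sk_counted_list *cl)
{
	const struct sk_list_head *pos;
	int n = 0;

	for (pos = cl->entries.next; pos != &cl->entries; pos = pos->next)
		n++;
	return n == cl->nr_entries;
}
```

A check like sk_counted_list_consistent() would catch a caller that links a node directly and bypasses the counter, which is the situation both sides of the discussion want to avoid.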


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches
  2015-09-01  6:59   ` Wang Nan
  2015-09-01  6:59     ` [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86 Wang Nan
@ 2015-09-02 14:08     ` Namhyung Kim
  1 sibling, 0 replies; 94+ messages in thread
From: Namhyung Kim @ 2015-09-02 14:08 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, linux-kernel, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Paul Mackerras, Peter Zijlstra, Zefan Li,
	pi3orama, Arnaldo Carvalho de Melo

On Tue, Sep 01, 2015 at 06:59:48AM +0000, Wang Nan wrote:
> If both LIBBPF and DWARF are detected, it is possible to create prologue
> for eBPF programs to help them accessing kernel data. HAVE_BPF_PROLOGUE
> and CONFIG_BPF_PROLOGUE is added as flags for this feature.
> 
> PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET indicates an architecture
> supports converting name of a register to its offset in
> 'struct pt_regs'. Without this support, BPF_PROLOGUE should be turned off.
> 
> HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET is introduced as the corresponding
> CFLAGS of PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Link: http://lkml.kernel.org/n/1436445342-1402-33-git-send-email-wangnan0@huawei.com
> [wangnan:
>  - Introduce new CFLAGS to control BPF prologue and arch_get_reg_info()
>    separately.
>  - Rename ARCH_GET_REG_INFO to ARCH_REGS_QUERY_REGISTER_OFFSET,
>    arch_get_reg_info() to regs_query_register_offset(), change its API accordingly
>    to make it similar to kernel's regs_query_register_offset().
> ]
> ---
>  tools/perf/config/Makefile           | 17 +++++++++++++++++
>  tools/perf/util/include/dwarf-regs.h |  8 ++++++++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
> index 38a4144..33785a1 100644
> --- a/tools/perf/config/Makefile
> +++ b/tools/perf/config/Makefile
> @@ -110,6 +110,11 @@ FEATURE_CHECK_CFLAGS-bpf = -I. -I$(srctree)/tools/include -I$(srctree)/arch/$(AR
>  # include ARCH specific config
>  -include $(src-perf)/arch/$(ARCH)/Makefile
>  
> +ifneq ($(origin PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET), undefined)

Why not just use

  ifdef PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET

?


> +  CFLAGS += -DHAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
> +endif
> +
> +
>  include $(src-perf)/config/utilities.mak
>  
>  ifeq ($(call get-executable,$(FLEX)),)
> @@ -314,6 +319,18 @@ ifndef NO_LIBELF
>        CFLAGS += -DHAVE_LIBBPF_SUPPORT
>        $(call detected,CONFIG_LIBBPF)
>      endif
> +
> +    ifndef NO_DWARF
> +      ifneq ($(origin PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET), undefined)

Ditto.

Thanks,
Namhyung


> +        CFLAGS += -DHAVE_BPF_PROLOGUE
> +        $(call detected,CONFIG_BPF_PROLOGUE)
> +      else
> +        msg := $(warning BPF prologue is not supported by architecture $(ARCH), missing regs_query_register_offset());
> +      endif
> +    else
> +      msg := $(warning DWARF support is off, BPF prologue is disabled);
> +    endif
> +
>    endif # NO_LIBBPF
>  endif # NO_LIBELF
>  
> diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
> index 8f14965..07c644e 100644
> --- a/tools/perf/util/include/dwarf-regs.h
> +++ b/tools/perf/util/include/dwarf-regs.h
> @@ -5,4 +5,12 @@
>  const char *get_arch_regstr(unsigned int n);
>  #endif
>  
> +#ifdef HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
> +/*
> + * Arch should support fetching the offset of a register in pt_regs
> + * by its name. See kernel's regs_query_register_offset in
> + * arch/xxx/kernel/ptrace.c.
> + */
> +int regs_query_register_offset(const char *name);
> +#endif
>  #endif
> -- 
> 1.8.3.4
> 
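
For readers unfamiliar with the interface declared in dwarf-regs.h above, a name-to-offset lookup of this kind can be sketched as follows. This is an assumption-laden illustration, not the x86 implementation from the series: sk_pt_regs is a hypothetical four-register frame, and the sk_* names are invented here. The -EINVAL return for unknown names follows the convention of the kernel function of the same name.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced register frame; the real struct pt_regs is arch specific. */
struct sk_pt_regs {
	unsigned long ax, bx, cx, dx;
};

struct sk_reg_offset {
	const char *name;
	int offset;
};

/* Build one table entry from a member name of struct sk_pt_regs. */
#define SK_REG_OFFSET(r) { #r, (int)offsetof(struct sk_pt_regs, r) }

static const struct sk_reg_offset sk_reg_table[] = {
	SK_REG_OFFSET(ax),
	SK_REG_OFFSET(bx),
	SK_REG_OFFSET(cx),
	SK_REG_OFFSET(dx),
};

/* Return the byte offset of register 'name', or -EINVAL if it is unknown. */
static int sk_regs_query_register_offset(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sk_reg_table) / sizeof(sk_reg_table[0]); i++) {
		if (!strcmp(sk_reg_table[i].name, name))
			return sk_reg_table[i].offset;
	}
	return -EINVAL;
}
```

A prologue generator could then translate a register name obtained from DWARF into a load at that offset from the pt_regs pointer.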

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel
  2015-09-02 14:04                 ` pi3orama
@ 2015-09-02 14:43                   ` Arnaldo Carvalho de Melo
  2015-09-02 22:24                     ` pi3orama
  0 siblings, 1 reply; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-02 14:43 UTC (permalink / raw)
  To: pi3orama
  Cc: Jiri Olsa, Wangnan (F),
	masami.hiramatsu.pt, Alexei Starovoitov, Jiri Olsa, Namhyung Kim,
	Zefan Li, acme, linux-kernel

On Wed, Sep 02, 2015 at 10:04:21PM +0800, pi3orama wrote:
> Sent from my iPhone
> > On Sep 2, 2015, at 9:55 PM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> > On Wed, Sep 02, 2015 at 08:05:54PM +0800, pi3orama wrote:
> >> Sent from my iPhone
> >>> On Sep 2, 2015, at 7:54 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> >>>>> On Wed, Sep 02, 2015 at 02:53:58PM +0800, Wangnan (F) wrote:
> >>>>> @@ -1252,7 +1262,13 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
> >>>>>   struct perf_evsel *last = NULL;
> >>>>>   int err;
> >>>>> -    if (evlist->nr_entries > 0)
> >>>>> +    /*
> >>>>> +     * Don't return when list_empty, give func a chance to report
> >>>>> +     * error when it found last == NULL.
> >>>>> +     *
> >>>>> +     * So no need to WARN here, let *func do this.
> >>>>> +     */
> >>>>> +    if (!list_empty(&evlist->entries))

> >>> why is it better than to check evlist->nr_entries?
> >>> evlist->nr_entries is equivalent to !list_empty(&evlist->entries) in here, right?

> >> By checking list we won't rely on the assumption that nr_entries reflects the
> >> actual number of elements in that list, makes the logic of this code more compact.

> > But why would we want to break that assumption?

> > If I see FOO->entries and FOO->nr_entries, it is reasonable to expect
> > that whatever data structure FOO->entries may be has FOO->nr_entries in
> > it, lets not break that assumption.
 
> Then we should enforce it.

Agreed, but it is a reasonable expectation, right? It's a general
pattern, one that we expect, and when it breaks like that it may lead
to bugs :-)

> For example, check the list collected by parser, report an error if
> the list is empty, to avoid someone like me adding nothing on the list
> but report success. I'm not insistent on this patch. In my newest
> patch set I use real dummy evsel as placeholder so we won't meet empty
> list again.

Ok, I'll look at the new patch then. I keep thinking that if you need a
separate list for eBPF because you will do something special with it,
that is not a problem: just keep it as a separate list until you can
insert it into the evlist, and then open the evlist, mmap it, etc.

If the parsing routines only have access to a perf_evlist pointer, then
we can have something like evlist->pending_entries +
evlist->nr_pending_entries.

If you have already detailed why it needs to be left in the evlist (will
some operation be done on all evsels, even the ones that need eBPF
specific work, before you do this final eBPF specific stuff?), I'll try
to find it. If not, describing the sequence of events that justifies
this or the "dummy evsel as a placeholder" would help in reviewing this.
Treat us like 7 year old kids (aka use patience + details) :-)

- Arnaldo

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel
  2015-09-02 14:43                   ` Arnaldo Carvalho de Melo
@ 2015-09-02 22:24                     ` pi3orama
  0 siblings, 0 replies; 94+ messages in thread
From: pi3orama @ 2015-09-02 22:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Wangnan (F),
	masami.hiramatsu.pt, Alexei Starovoitov, Jiri Olsa, Namhyung Kim,
	Zefan Li, acme, linux-kernel



Sent from my iPhone

> On Sep 2, 2015, at 10:43 PM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> 
> On Wed, Sep 02, 2015 at 10:04:21PM +0800, pi3orama wrote:
>> Sent from my iPhone
>>> On Sep 2, 2015, at 9:55 PM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
>>> On Wed, Sep 02, 2015 at 08:05:54PM +0800, pi3orama wrote:
>>>> Sent from my iPhone
>>>>> On Sep 2, 2015, at 7:54 PM, Jiri Olsa <jolsa@redhat.com> wrote:
>>>>>>> On Wed, Sep 02, 2015 at 02:53:58PM +0800, Wangnan (F) wrote:
>>>>>>> @@ -1252,7 +1262,13 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
>>>>>>>  struct perf_evsel *last = NULL;
>>>>>>>  int err;
>>>>>>> -    if (evlist->nr_entries > 0)
>>>>>>> +    /*
>>>>>>> +     * Don't return when list_empty, give func a chance to report
>>>>>>> +     * error when it found last == NULL.
>>>>>>> +     *
>>>>>>> +     * So no need to WARN here, let *func do this.
>>>>>>> +     */
>>>>>>> +    if (!list_empty(&evlist->entries))
> 
>>>>> why is it better than to check evlist->nr_entries?
>>>>> evlist->nr_entries is equivalent to !list_empty(&evlist->entries) in here, right?
> 
>>>> By checking list we won't rely on the assumption that nr_entries reflects the
>>>> actual number of elements in that list, makes the logic of this code more compact.
> 
>>> But why would we want to break that assumption?
> 
>>> If I see FOO->entries and FOO->nr_entries, it is reasonable to expect
>>> that whatever data structure FOO->entries may be has FOO->nr_entries in
>>> it, lets not break that assumption.
> 
>> Then we should enforce it.
> 
> Agreed, but it is a reasonable expectation, right? Its a general
> pattern, one that we expect and when it breaks like that, that may lead
> to bugs :-)
> 
>> For example, check the list collected by parser, report an error if
>> the list is empty, to avoid someone like me adding nothing on the list
>> but report success. I'm not insistent on this patch. In my newest
>> patch set I use real dummy evsel as placeholder so we won't meet empty
>> list again.
> 
> Ok, I'll look at the new patch then, I keep thinking that if you need to
> have a separate list for eBPF, that you will do something special on it,
> etc, then that is not a problem just keep it as a separate list till you
> can insert it in the evlist to then open the evlist, mmap it, etc.
> 
> If in the parsing routines you have access only to a perf_evlist
> pointer, well, then we can have something like an
> evlist->pending_entries + evlist->nr_pending_entries. Something like
> that.
> 
> If you have detailed why you need it to be left in the evlist (will some
> operation be done on all evsels, even the ones that need eBPF specific
> work before you do this final eBPF specific stuff?), I'll try to find
> it, if not, describing the sequence of events that justifies this or the
> "dummy evsel as a placeholder" would help reviewing this, treat us like
> 7 year old kids (aka use patience + details) :-)

Simply because adding a placeholder makes things simpler: it makes BPF
object "events" compatible with other types of events, so we don't need
to maintain a separate mechanism during parsing.

You probably remember how we sync filters between the placeholder and the
real events. Actually, a filter is not the only setting that can be applied
to an event after its evsel is collected. We also have config terms, groups
and modifiers. Although we currently support only filters, we are planning
to add config terms for configuring BPF objects, so that we can use commands
like:
 # perf record --event abc.o/key=value/ ...

Modifiers are also useful:
 # perf record --event abc.o:G ...

And so are groups:
 # perf record --event {abc.o,def.o,cycles}/key=value/...

Think about how we would do this with a separate list for BPF objects. In
all of the above processing we would have to change the current code to
detect whether we are dealing with BPF, and treat BPF objects and normal
events differently.

Instead, with a placeholder dummy event, we don't need to modify the
existing implementation, and we don't need to consider BPF if we later
decide to add more settings with new syntax. BPF object "events" would be
naturally compatible with normal dummy events (the only thing to consider
is that we need to treat them as TRACEPOINT events, not SOFTWARE events).
We use the placeholder to collect settings, and when the real BPF events
are created, we sync the settings to them.

I believe the above is a strong enough case for the dummy event. Do you
agree?

You can review how I do this now on github (sorry, I can't send patches
because I'm at home and won't be able to access the company's SMTP server
for 3 days due to a holiday):
https://github.com/WangNan0/linux/commit/c6fe9842d27ae1d228be2c7bb6c20216ddc49632
https://github.com/WangNan0/linux/commit/802426eddb9ea15386e70063d935d95caa30c045

Thank you.

> - Arnaldo
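
The "collect settings on the placeholder, then sync them to the real events" flow described above can be sketched as follows. This is only an illustration under stated assumptions: sk_evsel is a hypothetical, heavily reduced stand-in for struct perf_evsel, only the filter setting is modelled, and the sk_* functions are not the ones in the linked commits.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct perf_evsel. */
struct sk_evsel {
	const char *name;
	char *filter;	/* owned copy, NULL when no filter is set */
};

/* Replace the filter of one event with a private copy of 'filter'. */
static int sk_evsel_set_filter(struct sk_evsel *evsel, const char *filter)
{
	size_t len = strlen(filter) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, filter, len);
	free(evsel->filter);
	evsel->filter = copy;
	return 0;
}

/* Copy whatever the placeholder collected onto each real event. */
static int sk_sync_from_placeholder(const struct sk_evsel *placeholder,
				    struct sk_evsel *real, size_t nr_real)
{
	size_t i;

	if (!placeholder->filter)
		return 0;	/* nothing was collected */

	for (i = 0; i < nr_real; i++) {
		if (sk_evsel_set_filter(&real[i], placeholder->filter))
			return -1;
	}
	return 0;
}
```

With this shape, option parsing only ever sees the placeholder, and the BPF-specific code runs once, after the real events exist.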


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 03/31] perf tools: Introduce dummy evsel
  2015-08-29  4:21 ` [PATCH 03/31] perf tools: Introduce dummy evsel Wang Nan
  2015-08-31 19:38   ` Arnaldo Carvalho de Melo
@ 2015-09-03  0:11   ` Namhyung Kim
  2015-09-03  0:42     ` pi3orama
  2015-09-06  5:55   ` [PATCH] perf tools: Allow BPF placeholder dummy events to collect --filter options Wang Nan
  2 siblings, 1 reply; 94+ messages in thread
From: Namhyung Kim @ 2015-09-03  0:11 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Peter Zijlstra

Hi,

On Sat, Aug 29, 2015 at 04:21:37AM +0000, Wang Nan wrote:
> This patch allows linking dummy evsel onto evlist as a placeholder. It
> is for following patch which allows passing BPF object using '--event
> object.o'.
> 
> Doesn't link other event selectors, if passing a BPF object file to
> '--event', nothing is linked onto evlist. Instead, events described in
> BPF object file are probed and linked in a delayed manner because we
> want do all probing work together. Therefore, evsel for events in BPF
> object would be linked at the end of evlist. Which causes a small
> problem that, if passing '--filter' setting after object file, the
> filter option won't be correctly applied to those events.
> 
> This patch links dummy onto evlist, so following --filter can be
> collected by the dummy evsel. For this reason dummy evsels are set to
> PERF_TYPE_TRACEPOINT.

I understand the need for the dummy event.  But we already have a dummy
event, so having a similar one is confusing IMHO.  What about using the
existing dummy event instead?  You could save a link to the bpf object
in the dummy evsel (to check it later) and allow setting filters on
dummy events.

> 
> Due to the possibility of existence of dummy evsel,
> perf_evlist__purge_dummy() must be called right after parse_options().
> This patch adds it to record, top, trace and stat builtin commands.
> Further patch moves it down after real BPF events are processed with.

IMHO it'd be better to do this kind of job in a single place -
e.g. perf_evlist__config() ? - so that other commands benefit from
it easily.

Thanks,
Namhyung


> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Link: http://lkml.kernel.org/r/1440742821-44548-4-git-send-email-wangnan0@huawei.com
> ---
>  tools/perf/builtin-record.c    |  2 ++
>  tools/perf/builtin-stat.c      |  1 +
>  tools/perf/builtin-top.c       |  1 +
>  tools/perf/builtin-trace.c     |  1 +
>  tools/perf/util/evlist.c       | 19 +++++++++++++++++++
>  tools/perf/util/evlist.h       |  1 +
>  tools/perf/util/evsel.c        | 32 ++++++++++++++++++++++++++++++++
>  tools/perf/util/evsel.h        |  6 ++++++
>  tools/perf/util/parse-events.c | 25 +++++++++++++++++++++----
>  9 files changed, 84 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index a660022..81829de 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1112,6 +1112,8 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
>  
>  	argc = parse_options(argc, argv, record_options, record_usage,
>  			    PARSE_OPT_STOP_AT_NON_OPTION);
> +	perf_evlist__purge_dummy(rec->evlist);
> +
>  	if (!argc && target__none(&rec->opts.target))
>  		usage_with_options(record_usage, record_options);
>  
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 7aa039b..99b62f1 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1208,6 +1208,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
>  
>  	argc = parse_options(argc, argv, options, stat_usage,
>  		PARSE_OPT_STOP_AT_NON_OPTION);
> +	perf_evlist__purge_dummy(evsel_list);
>  
>  	interval = stat_config.interval;
>  
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 8c465c8..246203b 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1198,6 +1198,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
>  	perf_config(perf_top_config, &top);
>  
>  	argc = parse_options(argc, argv, options, top_usage, 0);
> +	perf_evlist__purge_dummy(top.evlist);
>  	if (argc)
>  		usage_with_options(top_usage, options);
>  
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 4e3abba..57712b9 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -3099,6 +3099,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
>  
>  	argc = parse_options_subcommand(argc, argv, trace_options, trace_subcommands,
>  				 trace_usage, PARSE_OPT_STOP_AT_NON_OPTION);
> +	perf_evlist__purge_dummy(trace.evlist);
>  
>  	if (trace.trace_pgfaults) {
>  		trace.opts.sample_address = true;
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 8d00039..8a4e64d 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1696,3 +1696,22 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
>  
>  	tracking_evsel->tracking = true;
>  }
> +
> +void perf_evlist__purge_dummy(struct perf_evlist *evlist)
> +{
> +	struct perf_evsel *pos, *n;
> +
> +	/*
> +	 * Remove all dummy events.
> +	 * During linking, we don't touch anything except link
> +	 * it into evlist. As a result, we don't
> +	 * need to adjust evlist->nr_entries during removal.
> +	 */
> +
> +	evlist__for_each_safe(evlist, n, pos) {
> +		if (perf_evsel__is_dummy(pos)) {
> +			list_del_init(&pos->node);
> +			perf_evsel__delete(pos);
> +		}
> +	}
> +}
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index b39a619..7f15727 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -181,6 +181,7 @@ bool perf_evlist__valid_read_format(struct perf_evlist *evlist);
>  void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
>  				   struct list_head *list,
>  				   int nr_entries);
> +void perf_evlist__purge_dummy(struct perf_evlist *evlist);
>  
>  static inline struct perf_evsel *perf_evlist__first(struct perf_evlist *evlist)
>  {
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index bac25f4..01267f4 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -213,6 +213,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
>  	evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
>  	perf_evsel__calc_id_pos(evsel);
>  	evsel->cmdline_group_boundary = false;
> +	evsel->is_dummy = false;
>  }
>  
>  struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
> @@ -225,6 +226,37 @@ struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
>  	return evsel;
>  }
>  
> +struct perf_evsel *perf_evsel__new_dummy(const char *name)
> +{
> +	struct perf_evsel *evsel = zalloc(perf_evsel__object.size);
> +
> +	if (!evsel)
> +		return NULL;
> +
> +	/*
> +	 * Don't need call perf_evsel__init() for dummy evsel.
> +	 * Keep it simple.
> +	 */
> +	evsel->name = strdup(name);
> +	if (!evsel->name)
> +		goto out_free;
> +
> +	INIT_LIST_HEAD(&evsel->node);
> +	INIT_LIST_HEAD(&evsel->config_terms);
> +
> +	evsel->cmdline_group_boundary = false;
> +	/*
> +	 * Set dummy evsel as TRACEPOINT event so it can collect filter
> +	 * options.
> +	 */
> +	evsel->attr.type = PERF_TYPE_TRACEPOINT;
> +	evsel->is_dummy = true;
> +	return evsel;
> +out_free:
> +	free(evsel);
> +	return NULL;
> +}
> +
>  struct perf_evsel *perf_evsel__newtp_idx(const char *sys, const char *name, int idx)
>  {
>  	struct perf_evsel *evsel = zalloc(perf_evsel__object.size);
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 298e6bb..0b8e47d 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -118,6 +118,7 @@ struct perf_evsel {
>  	struct perf_evsel	*leader;
>  	char			*group_name;
>  	bool			cmdline_group_boundary;
> +	bool			is_dummy;
>  	struct list_head	config_terms;
>  };
>  
> @@ -153,6 +154,11 @@ int perf_evsel__object_config(size_t object_size,
>  			      void (*fini)(struct perf_evsel *evsel));
>  
>  struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx);
> +struct perf_evsel *perf_evsel__new_dummy(const char *name);
> +static inline bool perf_evsel__is_dummy(struct perf_evsel *evsel)
> +{
> +	return evsel->is_dummy;
> +}
>  
>  static inline struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr)
>  {
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 14cd7e3..71d91fb 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -1141,7 +1141,7 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>  	perf_pmu__parse_cleanup();
>  	if (!ret) {
>  		int entries = data.idx - evlist->nr_entries;
> -		struct perf_evsel *last;
> +		struct perf_evsel *last = NULL;
>  
>  		if (!list_empty(&data.list)) {
>  			last = list_entry(data.list.prev,
> @@ -1149,8 +1149,25 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>  			last->cmdline_group_boundary = true;
>  		}
>  
> -		perf_evlist__splice_list_tail(evlist, &data.list, entries);
> -		evlist->nr_groups += data.nr_groups;
> +		if (last && perf_evsel__is_dummy(last)) {
> +			if (!list_is_singular(&data.list)) {
> +				parse_events_evlist_error(&data, 0,
> +					"Dummy evsel error: not on a singular list");
> +				return -1;
> +			}
> +			/*
> +			 * We are introducing a dummy event. Don't touch
> +			 * anything, just link it.
> +			 *
> +			 * Don't use perf_evlist__splice_list_tail() since
> +			 * it alerts evlist->nr_entries, which affect header
> +			 * of resulting perf.data.
> +			 */
> +			list_splice_tail(&data.list, &evlist->entries);
> +		} else {
> +			perf_evlist__splice_list_tail(evlist, &data.list, entries);
> +			evlist->nr_groups += data.nr_groups;
> +		}
>  
>  		return 0;
>  	}
> @@ -1256,7 +1273,7 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
>  	struct perf_evsel *last = NULL;
>  	int err;
>  
> -	if (evlist->nr_entries > 0)
> +	if (!list_empty(&evlist->entries))
>  		last = perf_evlist__last(evlist);
>  
>  	do {
> -- 
> 2.1.0
> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 14/31] perf tools: Suppress probing messages when probing by BPF loading
  2015-08-29  4:21 ` [PATCH 14/31] perf tools: Suppress probing messages when probing by BPF loading Wang Nan
@ 2015-09-03  0:20   ` Namhyung Kim
  2015-09-03  2:42     ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-03 12:10     ` [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe Masami Hiramatsu
  0 siblings, 2 replies; 94+ messages in thread
From: Namhyung Kim @ 2015-09-03  0:20 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Peter Zijlstra

On Sat, Aug 29, 2015 at 04:21:48AM +0000, Wang Nan wrote:
> This patch suppresses message output by add_perf_probe_events() and
> del_perf_probe_events() if they are triggered by BPF loading. Before
> this patch, when using 'perf record' with BPF object/source as event
> selector, following message will be output:
> 
>      Added new event:
>            perf_bpf_probe:lock_page_ret (on __lock_page%return)
>         You can now use it in all perf tools, such as:
> 	            perf record -e perf_bpf_probe:lock_page_ret -aR sleep 1
>      ...
>      Removed event: perf_bpf_probe:lock_page_ret
> 
> Which is misleading, especially 'use it in all perf tools' because they
> will be removed after 'perf record' exits.
> 
> In this patch, a 'silent' field is appended into probe_conf to control
> output. bpf__{,un}probe() set it to true when calling
> {add,del}_perf_probe_events().

I think that printing those messages should be done in cmd_probe()
rather than in add/del_perf_probe_events().

Thanks,
Namhyung


> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kaixu Xia <xiakaixu@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Link: http://lkml.kernel.org/n/1440151770-129878-12-git-send-email-wangnan0@huawei.com
> ---
>  tools/perf/util/bpf-loader.c  |  6 ++++++
>  tools/perf/util/probe-event.c | 17 ++++++++++++-----
>  tools/perf/util/probe-event.h |  1 +
>  tools/perf/util/probe-file.c  |  5 ++++-
>  4 files changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
> index c3bc0a8..77eeb99 100644
> --- a/tools/perf/util/bpf-loader.c
> +++ b/tools/perf/util/bpf-loader.c
> @@ -188,6 +188,7 @@ static bool is_probed;
>  int bpf__unprobe(void)
>  {
>  	struct strfilter *delfilter;
> +	bool old_silent = probe_conf.silent;
>  	int ret;
>  
>  	if (!is_probed)
> @@ -199,7 +200,9 @@ int bpf__unprobe(void)
>  		return -ENOMEM;
>  	}
>  
> +	probe_conf.silent = true;
>  	ret = del_perf_probe_events(delfilter);
> +	probe_conf.silent = old_silent;
>  	strfilter__delete(delfilter);
>  	if (ret < 0 && is_probed)
>  		pr_debug("Error: failed to delete events: %s\n",
> @@ -215,6 +218,7 @@ int bpf__probe(void)
>  	struct bpf_object *obj, *tmp;
>  	struct bpf_program *prog;
>  	struct perf_probe_event *pevs;
> +	bool old_silent = probe_conf.silent;
>  
>  	pevs = calloc(MAX_PROBES, sizeof(pevs[0]));
>  	if (!pevs)
> @@ -235,9 +239,11 @@ int bpf__probe(void)
>  		}
>  	}
>  
> +	probe_conf.silent = true;
>  	probe_conf.max_probes = MAX_PROBES;
>  	/* Let add_perf_probe_events generates probe_trace_event (tevs) */
>  	err = add_perf_probe_events(pevs, nr_events, false);
> +	probe_conf.silent = old_silent;
>  
>  	/* add_perf_probe_events return negative when fail */
>  	if (err < 0) {
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 57a7bae..e720913 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -52,7 +52,9 @@
>  #define PERFPROBE_GROUP "probe"
>  
>  bool probe_event_dry_run;	/* Dry run flag */
> -struct probe_conf probe_conf;
> +struct probe_conf probe_conf = {
> +	.silent = false,
> +};
>  
>  #define semantic_error(msg ...) pr_err("Semantic error :" msg)
>  
> @@ -2192,10 +2194,12 @@ static int show_perf_probe_event(const char *group, const char *event,
>  
>  	ret = perf_probe_event__sprintf(group, event, pev, module, &buf);
>  	if (ret >= 0) {
> -		if (use_stdout)
> +		if (use_stdout && !probe_conf.silent)
>  			printf("%s\n", buf.buf);
> -		else
> +		else if (!probe_conf.silent)
>  			pr_info("%s\n", buf.buf);
> +		else
> +			pr_debug("%s\n", buf.buf);
>  	}
>  	strbuf_release(&buf);
>  
> @@ -2418,7 +2422,10 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
>  	}
>  
>  	ret = 0;
> -	pr_info("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
> +	if (!probe_conf.silent)
> +		pr_info("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
> +	else
> +		pr_debug("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
>  	for (i = 0; i < ntevs; i++) {
>  		tev = &tevs[i];
>  		/* Skip if the symbol is out of .text or blacklisted */
> @@ -2454,7 +2461,7 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
>  		warn_uprobe_event_compat(tev);
>  
>  	/* Note that it is possible to skip all events because of blacklist */
> -	if (ret >= 0 && event) {
> +	if (ret >= 0 && event && !probe_conf.silent) {
>  		/* Show how to use the event. */
>  		pr_info("\nYou can now use it in all perf tools, such as:\n\n");
>  		pr_info("\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
> diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
> index 915f0d8..3ab9c3e 100644
> --- a/tools/perf/util/probe-event.h
> +++ b/tools/perf/util/probe-event.h
> @@ -13,6 +13,7 @@ struct probe_conf {
>  	bool	force_add;
>  	bool	no_inlines;
>  	int	max_probes;
> +	bool	silent;
>  };
>  extern struct probe_conf probe_conf;
>  extern bool probe_event_dry_run;
> diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
> index bbb2437..db7bd4c 100644
> --- a/tools/perf/util/probe-file.c
> +++ b/tools/perf/util/probe-file.c
> @@ -267,7 +267,10 @@ static int __del_trace_probe_event(int fd, struct str_node *ent)
>  		goto error;
>  	}
>  
> -	pr_info("Removed event: %s\n", ent->s);
> +	if (!probe_conf.silent)
> +		pr_info("Removed event: %s\n", ent->s);
> +	else
> +		pr_debug("Removed event: %s\n", ent->s);
>  	return 0;
>  error:
>  	pr_warning("Failed to delete event: %s\n",
> -- 
> 2.1.0
> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 03/31] perf tools: Introduce dummy evsel
  2015-09-03  0:11   ` Namhyung Kim
@ 2015-09-03  0:42     ` pi3orama
  0 siblings, 0 replies; 94+ messages in thread
From: pi3orama @ 2015-09-03  0:42 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Wang Nan, acme, mingo, ast, linux-kernel, lizefan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Peter Zijlstra



Sent from my iPhone

> On Sep 3, 2015, at 8:11 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> Hi,
> 
>> On Sat, Aug 29, 2015 at 04:21:37AM +0000, Wang Nan wrote:
>> This patch allows linking a dummy evsel onto evlist as a placeholder. It
>> is for a following patch which allows passing a BPF object using '--event
>> object.o'.
>> 
>> Unlike other event selectors, when a BPF object file is passed to
>> '--event', nothing is linked onto evlist. Instead, events described in
>> the BPF object file are probed and linked in a delayed manner because we
>> want to do all probing work together. Therefore, evsels for events in a
>> BPF object would be linked at the end of evlist. This causes a small
>> problem: if a '--filter' setting is passed after the object file, the
>> filter option won't be correctly applied to those events.
>> 
>> This patch links a dummy evsel onto evlist, so a following --filter can
>> be collected by the dummy evsel. For this reason dummy evsels are set to
>> PERF_TYPE_TRACEPOINT.
> 
> I understand the need of the dummy event.  But we already have dummy
> event so it's confusing to have similar event IMHO.  So what about
> using existing dummy event instead?  You can save a link to a bpf
> object in the dummy evsel (to check it later) and change to allow
> setting filter on dummy events IMHO.
> 

Yes, in my work-in-progress implementation I use the existing dummy event
and connect it to the object by setting its name to the name of the object.
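
A minimal sketch of such a name-based lookup, for illustration only. The
types and the helper below (sketch_evsel, sketch_bpf_obj, find_placeholder)
are hypothetical reduced stand-ins, not perf APIs, and the real
implementation may differ:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for perf's evsel. */
struct sketch_evsel {
	const char *name;	/* for a placeholder: the BPF object's name */
	int is_dummy;		/* non-zero for a placeholder evsel */
	const char *filter;	/* filter collected from a later --filter */
};

/* Hypothetical stand-in for a loaded BPF object. */
struct sketch_bpf_obj {
	const char *name;
};

/* Return the placeholder whose name matches the object's name, or NULL. */
static struct sketch_evsel *
find_placeholder(struct sketch_evsel *evsels, size_t n,
		 const struct sketch_bpf_obj *obj)
{
	size_t i;

	assert(obj != NULL && obj->name != NULL);
	for (i = 0; i < n; i++) {
		if (evsels[i].is_dummy && evsels[i].name &&
		    strcmp(evsels[i].name, obj->name) == 0)
			return &evsels[i];
	}
	return NULL;
}
```

With a lookup like this, a filter collected by the placeholder could be
copied to the real events once the object has been probed.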

Thank you.

>> 
>> Because a dummy evsel may exist, perf_evlist__purge_dummy() must be
>> called right after parse_options(). This patch adds it to the record,
>> top, trace and stat builtin commands. A further patch moves it down
>> to after the real BPF events are processed.
> 
> IMHO it'd be better to do this kind of job in a single place -
> e.g. perf_evlist__config() ? - so that other commands get benefits
> from it easily.
> 
> Thanks,
> Namhyung
> 
> 
>> 
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: David Ahern <dsahern@gmail.com>
>> Cc: He Kuang <hekuang@huawei.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Kaixu Xia <xiakaixu@huawei.com>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> Link: http://lkml.kernel.org/r/1440742821-44548-4-git-send-email-wangnan0@huawei.com
>> ---
>> tools/perf/builtin-record.c    |  2 ++
>> tools/perf/builtin-stat.c      |  1 +
>> tools/perf/builtin-top.c       |  1 +
>> tools/perf/builtin-trace.c     |  1 +
>> tools/perf/util/evlist.c       | 19 +++++++++++++++++++
>> tools/perf/util/evlist.h       |  1 +
>> tools/perf/util/evsel.c        | 32 ++++++++++++++++++++++++++++++++
>> tools/perf/util/evsel.h        |  6 ++++++
>> tools/perf/util/parse-events.c | 25 +++++++++++++++++++++----
>> 9 files changed, 84 insertions(+), 4 deletions(-)
>> 
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index a660022..81829de 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -1112,6 +1112,8 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
>> 
>>    argc = parse_options(argc, argv, record_options, record_usage,
>>                PARSE_OPT_STOP_AT_NON_OPTION);
>> +    perf_evlist__purge_dummy(rec->evlist);
>> +
>>    if (!argc && target__none(&rec->opts.target))
>>        usage_with_options(record_usage, record_options);
>> 
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 7aa039b..99b62f1 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -1208,6 +1208,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
>> 
>>    argc = parse_options(argc, argv, options, stat_usage,
>>        PARSE_OPT_STOP_AT_NON_OPTION);
>> +    perf_evlist__purge_dummy(evsel_list);
>> 
>>    interval = stat_config.interval;
>> 
>> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
>> index 8c465c8..246203b 100644
>> --- a/tools/perf/builtin-top.c
>> +++ b/tools/perf/builtin-top.c
>> @@ -1198,6 +1198,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
>>    perf_config(perf_top_config, &top);
>> 
>>    argc = parse_options(argc, argv, options, top_usage, 0);
>> +    perf_evlist__purge_dummy(top.evlist);
>>    if (argc)
>>        usage_with_options(top_usage, options);
>> 
>> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
>> index 4e3abba..57712b9 100644
>> --- a/tools/perf/builtin-trace.c
>> +++ b/tools/perf/builtin-trace.c
>> @@ -3099,6 +3099,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
>> 
>>    argc = parse_options_subcommand(argc, argv, trace_options, trace_subcommands,
>>                 trace_usage, PARSE_OPT_STOP_AT_NON_OPTION);
>> +    perf_evlist__purge_dummy(trace.evlist);
>> 
>>    if (trace.trace_pgfaults) {
>>        trace.opts.sample_address = true;
>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
>> index 8d00039..8a4e64d 100644
>> --- a/tools/perf/util/evlist.c
>> +++ b/tools/perf/util/evlist.c
>> @@ -1696,3 +1696,22 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
>> 
>>    tracking_evsel->tracking = true;
>> }
>> +
>> +void perf_evlist__purge_dummy(struct perf_evlist *evlist)
>> +{
>> +    struct perf_evsel *pos, *n;
>> +
>> +    /*
>> +     * Remove all dummy events.
>> +     * During linking, we don't touch anything except link
>> +     * it into evlist. As a result, we don't
>> +     * need to adjust evlist->nr_entries during removal.
>> +     */
>> +
>> +    evlist__for_each_safe(evlist, n, pos) {
>> +        if (perf_evsel__is_dummy(pos)) {
>> +            list_del_init(&pos->node);
>> +            perf_evsel__delete(pos);
>> +        }
>> +    }
>> +}
>> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
>> index b39a619..7f15727 100644
>> --- a/tools/perf/util/evlist.h
>> +++ b/tools/perf/util/evlist.h
>> @@ -181,6 +181,7 @@ bool perf_evlist__valid_read_format(struct perf_evlist *evlist);
>> void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
>>                   struct list_head *list,
>>                   int nr_entries);
>> +void perf_evlist__purge_dummy(struct perf_evlist *evlist);
>> 
>> static inline struct perf_evsel *perf_evlist__first(struct perf_evlist *evlist)
>> {
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index bac25f4..01267f4 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -213,6 +213,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
>>    evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
>>    perf_evsel__calc_id_pos(evsel);
>>    evsel->cmdline_group_boundary = false;
>> +    evsel->is_dummy = false;
>> }
>> 
>> struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
>> @@ -225,6 +226,37 @@ struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
>>    return evsel;
>> }
>> 
>> +struct perf_evsel *perf_evsel__new_dummy(const char *name)
>> +{
>> +    struct perf_evsel *evsel = zalloc(perf_evsel__object.size);
>> +
>> +    if (!evsel)
>> +        return NULL;
>> +
>> +    /*
>> +     * Don't need call perf_evsel__init() for dummy evsel.
>> +     * Keep it simple.
>> +     */
>> +    evsel->name = strdup(name);
>> +    if (!evsel->name)
>> +        goto out_free;
>> +
>> +    INIT_LIST_HEAD(&evsel->node);
>> +    INIT_LIST_HEAD(&evsel->config_terms);
>> +
>> +    evsel->cmdline_group_boundary = false;
>> +    /*
>> +     * Set dummy evsel as TRACEPOINT event so it can collect filter
>> +     * options.
>> +     */
>> +    evsel->attr.type = PERF_TYPE_TRACEPOINT;
>> +    evsel->is_dummy = true;
>> +    return evsel;
>> +out_free:
>> +    free(evsel);
>> +    return NULL;
>> +}
>> +
>> struct perf_evsel *perf_evsel__newtp_idx(const char *sys, const char *name, int idx)
>> {
>>    struct perf_evsel *evsel = zalloc(perf_evsel__object.size);
>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>> index 298e6bb..0b8e47d 100644
>> --- a/tools/perf/util/evsel.h
>> +++ b/tools/perf/util/evsel.h
>> @@ -118,6 +118,7 @@ struct perf_evsel {
>>    struct perf_evsel    *leader;
>>    char            *group_name;
>>    bool            cmdline_group_boundary;
>> +    bool            is_dummy;
>>    struct list_head    config_terms;
>> };
>> 
>> @@ -153,6 +154,11 @@ int perf_evsel__object_config(size_t object_size,
>>                  void (*fini)(struct perf_evsel *evsel));
>> 
>> struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx);
>> +struct perf_evsel *perf_evsel__new_dummy(const char *name);
>> +static inline bool perf_evsel__is_dummy(struct perf_evsel *evsel)
>> +{
>> +    return evsel->is_dummy;
>> +}
>> 
>> static inline struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr)
>> {
>> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
>> index 14cd7e3..71d91fb 100644
>> --- a/tools/perf/util/parse-events.c
>> +++ b/tools/perf/util/parse-events.c
>> @@ -1141,7 +1141,7 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>>    perf_pmu__parse_cleanup();
>>    if (!ret) {
>>        int entries = data.idx - evlist->nr_entries;
>> -        struct perf_evsel *last;
>> +        struct perf_evsel *last = NULL;
>> 
>>        if (!list_empty(&data.list)) {
>>            last = list_entry(data.list.prev,
>> @@ -1149,8 +1149,25 @@ int parse_events(struct perf_evlist *evlist, const char *str,
>>            last->cmdline_group_boundary = true;
>>        }
>> 
>> -        perf_evlist__splice_list_tail(evlist, &data.list, entries);
>> -        evlist->nr_groups += data.nr_groups;
>> +        if (last && perf_evsel__is_dummy(last)) {
>> +            if (!list_is_singular(&data.list)) {
>> +                parse_events_evlist_error(&data, 0,
>> +                    "Dummy evsel error: not on a singular list");
>> +                return -1;
>> +            }
>> +            /*
>> +             * We are introducing a dummy event. Don't touch
>> +             * anything, just link it.
>> +             *
>> +             * Don't use perf_evlist__splice_list_tail() since
>> +             * it alerts evlist->nr_entries, which affect header
>> +             * of resulting perf.data.
>> +             */
>> +            list_splice_tail(&data.list, &evlist->entries);
>> +        } else {
>> +            perf_evlist__splice_list_tail(evlist, &data.list, entries);
>> +            evlist->nr_groups += data.nr_groups;
>> +        }
>> 
>>        return 0;
>>    }
>> @@ -1256,7 +1273,7 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
>>    struct perf_evsel *last = NULL;
>>    int err;
>> 
>> -    if (evlist->nr_entries > 0)
>> +    if (!list_empty(&evlist->entries))
>>        last = perf_evlist__last(evlist);
>> 
>>    do {
>> -- 
>> 2.1.0
>> 


^ permalink raw reply	[flat|nested] 94+ messages in thread

* RE: Re: [PATCH 14/31] perf tools: Suppress probing messages when probing by BPF loading
  2015-09-03  0:20   ` Namhyung Kim
@ 2015-09-03  2:42     ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-03 12:10     ` [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe Masami Hiramatsu
  1 sibling, 0 replies; 94+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-03  2:42 UTC (permalink / raw)
  To: 'Namhyung Kim', Wang Nan
  Cc: acme, mingo, ast, linux-kernel, lizefan, pi3orama, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Peter Zijlstra

> From: Namhyung Kim [mailto:namhyung@gmail.com] On Behalf Of Namhyung Kim
> 
> On Sat, Aug 29, 2015 at 04:21:48AM +0000, Wang Nan wrote:
> > This patch suppresses messages output by add_perf_probe_events() and
> > del_perf_probe_events() if they are triggered by BPF loading. Before
> > this patch, when using 'perf record' with a BPF object/source as event
> > selector, the following messages are output:
> >
> >      Added new event:
> >        perf_bpf_probe:lock_page_ret (on __lock_page%return)
> >
> >      You can now use it in all perf tools, such as:
> >
> >        perf record -e perf_bpf_probe:lock_page_ret -aR sleep 1
> >      ...
> >      Removed event: perf_bpf_probe:lock_page_ret
> >
> > This is misleading, especially 'use it in all perf tools', because the
> > events are removed after 'perf record' exits.
> >
> > In this patch, a 'silent' field is added to probe_conf to control
> > output. bpf__{,un}probe() set it to true when calling
> > {add,del}_perf_probe_events().
> 
> I think that printing those messages should be done in cmd_probe()
> rather than add/del_perf_probe_events()..

Well... I'll try to clean up the messages.

Thanks!

> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe
  2015-09-03  0:20   ` Namhyung Kim
  2015-09-03  2:42     ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-03 12:10     ` Masami Hiramatsu
  2015-09-03 12:18       ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-03 20:28       ` Arnaldo Carvalho de Melo
  1 sibling, 2 replies; 94+ messages in thread
From: Masami Hiramatsu @ 2015-09-03 12:10 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo
  Cc: Wang Nan, Kaixu Xia, Peter Zijlstra, Daniel Borkmann,
	linux-kernel, He Kuang, lizefan, Jiri Olsa, David Ahern,
	Brendan Gregg, mingo, ast

Output the normal result of adding/deleting probes in builtin-probe
instead of showing it from add/del_perf_probe_events().
The whole result string is stored in the "result" strbuf parameter.
To ignore the result string, pass NULL as "result".
Note that all warning/debug messages are still emitted by
add/del_perf_probe_events().

Suggested-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
---
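The calling convention described above (an optional result buffer, with
NULL meaning "no report wanted") can be sketched in isolation as below.
This is only an illustration under stated assumptions: struct result_buf,
result_addf() and add_events() are hypothetical names, and the fixed-size
buffer stands in for perf's growable struct strbuf.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for perf's struct strbuf. */
struct result_buf {
	char text[256];
	size_t len;
};

/* Append formatted text; a NULL buffer means the caller wants no report. */
static void result_addf(struct result_buf *rb, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!rb)
		return;
	assert(rb->len < sizeof(rb->text));
	va_start(ap, fmt);
	n = vsnprintf(rb->text + rb->len, sizeof(rb->text) - rb->len, fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	/* vsnprintf returns the untruncated length; clamp to what was stored. */
	if ((size_t)n >= sizeof(rb->text) - rb->len)
		rb->len = sizeof(rb->text) - 1;
	else
		rb->len += (size_t)n;
}

/* Hypothetical library routine: does the work but never prints by itself. */
static int add_events(const char *const *names, int n,
		      struct result_buf *result)
{
	int i;

	result_addf(result, "Added new event%s\n", n > 1 ? "s:" : ":");
	for (i = 0; i < n; i++)
		result_addf(result, "  %s\n", names[i]);
	return 0;
}
```

A command-line caller would pass a buffer and print it afterwards, even
on failure; an internal caller such as a BPF loader could pass NULL and
get no user-visible output.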
 tools/perf/builtin-probe.c    |    9 +++++++--
 tools/perf/util/probe-event.c |   33 ++++++++++++++++++++-------------
 tools/perf/util/probe-event.h |    6 ++++--
 tools/perf/util/probe-file.c  |    5 +++--
 tools/perf/util/probe-file.h  |    4 +++-
 5 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index b81cec3..d11ad21 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -402,6 +402,7 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
 		    "Enable kernel symbol demangling"),
 	OPT_END()
 	};
+	struct strbuf buf = STRBUF_INIT;
 	int ret;
 
 	set_option_flag(options, 'a', "add", PARSE_OPT_EXCLUSIVE);
@@ -483,7 +484,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
 		return ret;
 #endif
 	case 'd':
-		ret = del_perf_probe_events(params.filter);
+		ret = del_perf_probe_events(params.filter, &buf);
+		/* Even if failed, we should show the result first */
+		pr_info("%s", buf.buf);
 		if (ret < 0) {
 			pr_err_with_code("  Error: Failed to delete events.", ret);
 			return ret;
@@ -496,7 +499,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
 			usage_with_options(probe_usage, options);
 		}
 
-		ret = add_perf_probe_events(params.events, params.nevents);
+		ret = add_perf_probe_events(params.events, params.nevents, &buf);
+		/* Even if failed, we should show the result first */
+		pr_info("%s", buf.buf);
 		if (ret < 0) {
 			pr_err_with_code("  Error: Failed to add events.", ret);
 			return ret;
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index eb5f18b..1a3ed7c 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2395,7 +2395,8 @@ static int probe_trace_event__set_name(struct probe_trace_event *tev,
 
 static int __add_probe_trace_events(struct perf_probe_event *pev,
 				     struct probe_trace_event *tevs,
-				     int ntevs, bool allow_suffix)
+				     int ntevs, bool allow_suffix,
+				     struct strbuf *buf)
 {
 	int i, fd, ret;
 	struct probe_trace_event *tev = NULL;
@@ -2415,7 +2416,9 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
 	}
 
 	ret = 0;
-	pr_info("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
+	if (buf)
+		strbuf_addf(buf, "Added new event%s\n",
+			    (ntevs > 1) ? "s:" : ":");
 	for (i = 0; i < ntevs; i++) {
 		tev = &tevs[i];
 		/* Skip if the symbol is out of .text or blacklisted */
@@ -2432,9 +2435,12 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
 		if (ret < 0)
 			break;
 
-		/* We use tev's name for showing new events */
-		show_perf_probe_event(tev->group, tev->event, pev,
-				      tev->point.module, false);
+		if (buf) {
+			/* We use tev's name for showing new events */
+			perf_probe_event__sprintf(tev->group, tev->event,
+						  pev, tev->point.module, buf);
+			strbuf_addch(buf, '\n');
+		}
 		/* Save the last valid name */
 		event = tev->event;
 		group = tev->group;
@@ -2451,10 +2457,10 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
 		warn_uprobe_event_compat(tev);
 
 	/* Note that it is possible to skip all events because of blacklist */
-	if (ret >= 0 && event) {
+	if (ret >= 0 && event && buf) {
 		/* Show how to use the event. */
-		pr_info("\nYou can now use it in all perf tools, such as:\n\n");
-		pr_info("\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
+		strbuf_addf(buf, "\nYou can now use it in all perf tools, such as:\n\n");
+		strbuf_addf(buf, "\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
 	}
 
 	strlist__delete(namelist);
@@ -2765,7 +2771,8 @@ struct __event_package {
 	int				ntevs;
 };
 
-int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
+int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
+			  struct strbuf *result)
 {
 	int i, j, ret;
 	struct __event_package *pkgs;
@@ -2802,7 +2809,7 @@ int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
 	for (i = 0; i < npevs; i++) {
 		ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs,
 					       pkgs[i].ntevs,
-					       probe_conf.force_add);
+					       probe_conf.force_add, result);
 		if (ret < 0)
 			break;
 	}
@@ -2819,7 +2826,7 @@ end:
 	return ret;
 }
 
-int del_perf_probe_events(struct strfilter *filter)
+int del_perf_probe_events(struct strfilter *filter, struct strbuf *result)
 {
 	int ret, ret2, ufd = -1, kfd = -1;
 	char *str = strfilter__string(filter);
@@ -2834,11 +2841,11 @@ int del_perf_probe_events(struct strfilter *filter)
 	if (ret < 0)
 		goto out;
 
-	ret = probe_file__del_events(kfd, filter);
+	ret = probe_file__del_events(kfd, filter, result);
 	if (ret < 0 && ret != -ENOENT)
 		goto error;
 
-	ret2 = probe_file__del_events(ufd, filter);
+	ret2 = probe_file__del_events(ufd, filter, result);
 	if (ret2 < 0 && ret2 != -ENOENT) {
 		ret = ret2;
 		goto error;
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 6e7ec68..9855dbf 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -137,8 +137,10 @@ extern void line_range__clear(struct line_range *lr);
 /* Initialize line range */
 extern int line_range__init(struct line_range *lr);
 
-extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs);
-extern int del_perf_probe_events(struct strfilter *filter);
+extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
+				 struct strbuf *result);
+extern int del_perf_probe_events(struct strfilter *filter,
+				 struct strbuf *result);
 extern int show_perf_probe_events(struct strfilter *filter);
 extern int show_line_range(struct line_range *lr, const char *module,
 			   bool user);
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index bbb2437..e22fa12 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -267,7 +267,6 @@ static int __del_trace_probe_event(int fd, struct str_node *ent)
 		goto error;
 	}
 
-	pr_info("Removed event: %s\n", ent->s);
 	return 0;
 error:
 	pr_warning("Failed to delete event: %s\n",
@@ -275,7 +274,7 @@ error:
 	return ret;
 }
 
-int probe_file__del_events(int fd, struct strfilter *filter)
+int probe_file__del_events(int fd, struct strfilter *filter, struct strbuf *buf)
 {
 	struct strlist *namelist;
 	struct str_node *ent;
@@ -293,6 +292,8 @@ int probe_file__del_events(int fd, struct strfilter *filter)
 			ret = __del_trace_probe_event(fd, ent);
 			if (ret < 0)
 				break;
+			if (buf)
+				strbuf_addf(buf, "Removed event: %s\n", ent->s);
 		}
 	}
 	strlist__delete(namelist);
diff --git a/tools/perf/util/probe-file.h b/tools/perf/util/probe-file.h
index ada94a2..ee89ef0 100644
--- a/tools/perf/util/probe-file.h
+++ b/tools/perf/util/probe-file.h
@@ -1,6 +1,7 @@
 #ifndef __PROBE_FILE_H
 #define __PROBE_FILE_H
 
+#include "strbuf.h"
 #include "strlist.h"
 #include "strfilter.h"
 #include "probe-event.h"
@@ -13,6 +14,7 @@ int probe_file__open_both(int *kfd, int *ufd, int flag);
 struct strlist *probe_file__get_namelist(int fd);
 struct strlist *probe_file__get_rawlist(int fd);
 int probe_file__add_event(int fd, struct probe_trace_event *tev);
-int probe_file__del_events(int fd, struct strfilter *filter);
+int probe_file__del_events(int fd, struct strfilter *filter,
+			   struct strbuf *buf);
 
 #endif



^ permalink raw reply related	[flat|nested] 94+ messages in thread

* RE: [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe
  2015-09-03 12:10     ` [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe Masami Hiramatsu
@ 2015-09-03 12:18       ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-03 17:25         ` Namhyung Kim
  2015-09-03 20:28       ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 94+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-03 12:18 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI,
	Namhyung Kim, Arnaldo Carvalho de Melo
  Cc: Wang Nan, Kaixu Xia, Peter Zijlstra, Daniel Borkmann,
	linux-kernel, He Kuang, lizefan, Jiri Olsa, David Ahern,
	Brendan Gregg, mingo, ast

Hi Namhyung,

I hope this is what you suggested.

Thank you,

-- 
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: masami.hiramatsu.pt@hitachi.com


> -----Original Message-----
> From: Masami Hiramatsu [mailto:masami.hiramatsu.pt@hitachi.com]
> Sent: Thursday, September 03, 2015 9:11 PM
> To: Namhyung Kim; Arnaldo Carvalho de Melo
> Cc: Wang Nan; Kaixu Xia; Peter Zijlstra; Daniel Borkmann; linux-kernel@vger.kernel.org; He Kuang; lizefan@huawei.com;
> Jiri Olsa; David Ahern; Brendan Gregg; mingo@kernel.org; ast@plumgrid.com
> Subject: [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe
> 
> Output the normal result of adding/deleting probes in builtin-probe
> instead of showing it from add/del_perf_probe_events().
> The whole result string is stored in the "result" strbuf parameter.
> To ignore the result string, pass NULL as "result".
> Note that all warning/debug messages are still emitted by
> add/del_perf_probe_events().
> 
> Suggested-by: Namhyung Kim <namhyung@gmail.com>
> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> ---
>  tools/perf/builtin-probe.c    |    9 +++++++--
>  tools/perf/util/probe-event.c |   33 ++++++++++++++++++++-------------
>  tools/perf/util/probe-event.h |    6 ++++--
>  tools/perf/util/probe-file.c  |    5 +++--
>  tools/perf/util/probe-file.h  |    4 +++-
>  5 files changed, 37 insertions(+), 20 deletions(-)
> 
> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index b81cec3..d11ad21 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -402,6 +402,7 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>  		    "Enable kernel symbol demangling"),
>  	OPT_END()
>  	};
> +	struct strbuf buf = STRBUF_INIT;
>  	int ret;
> 
>  	set_option_flag(options, 'a', "add", PARSE_OPT_EXCLUSIVE);
> @@ -483,7 +484,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>  		return ret;
>  #endif
>  	case 'd':
> -		ret = del_perf_probe_events(params.filter);
> +		ret = del_perf_probe_events(params.filter, &buf);
> +		/* Even if failed, we should show the result first */
> +		pr_info("%s", buf.buf);
>  		if (ret < 0) {
>  			pr_err_with_code("  Error: Failed to delete events.", ret);
>  			return ret;
> @@ -496,7 +499,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>  			usage_with_options(probe_usage, options);
>  		}
> 
> -		ret = add_perf_probe_events(params.events, params.nevents);
> +		ret = add_perf_probe_events(params.events, params.nevents, &buf);
> +		/* Even if failed, we should show the result first */
> +		pr_info("%s", buf.buf);
>  		if (ret < 0) {
>  			pr_err_with_code("  Error: Failed to add events.", ret);
>  			return ret;
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index eb5f18b..1a3ed7c 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2395,7 +2395,8 @@ static int probe_trace_event__set_name(struct probe_trace_event *tev,
> 
>  static int __add_probe_trace_events(struct perf_probe_event *pev,
>  				     struct probe_trace_event *tevs,
> -				     int ntevs, bool allow_suffix)
> +				     int ntevs, bool allow_suffix,
> +				     struct strbuf *buf)
>  {
>  	int i, fd, ret;
>  	struct probe_trace_event *tev = NULL;
> @@ -2415,7 +2416,9 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
>  	}
> 
>  	ret = 0;
> -	pr_info("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
> +	if (buf)
> +		strbuf_addf(buf, "Added new event%s\n",
> +			    (ntevs > 1) ? "s:" : ":");
>  	for (i = 0; i < ntevs; i++) {
>  		tev = &tevs[i];
>  		/* Skip if the symbol is out of .text or blacklisted */
> @@ -2432,9 +2435,12 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
>  		if (ret < 0)
>  			break;
> 
> -		/* We use tev's name for showing new events */
> -		show_perf_probe_event(tev->group, tev->event, pev,
> -				      tev->point.module, false);
> +		if (buf) {
> +			/* We use tev's name for showing new events */
> +			perf_probe_event__sprintf(tev->group, tev->event,
> +						  pev, tev->point.module, buf);
> +			strbuf_addch(buf, '\n');
> +		}
>  		/* Save the last valid name */
>  		event = tev->event;
>  		group = tev->group;
> @@ -2451,10 +2457,10 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
>  		warn_uprobe_event_compat(tev);
> 
>  	/* Note that it is possible to skip all events because of blacklist */
> -	if (ret >= 0 && event) {
> +	if (ret >= 0 && event && buf) {
>  		/* Show how to use the event. */
> -		pr_info("\nYou can now use it in all perf tools, such as:\n\n");
> -		pr_info("\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
> +		strbuf_addf(buf, "\nYou can now use it in all perf tools, such as:\n\n");
> +		strbuf_addf(buf, "\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
>  	}
> 
>  	strlist__delete(namelist);
> @@ -2765,7 +2771,8 @@ struct __event_package {
>  	int				ntevs;
>  };
> 
> -int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
> +int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> +			  struct strbuf *result)
>  {
>  	int i, j, ret;
>  	struct __event_package *pkgs;
> @@ -2802,7 +2809,7 @@ int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
>  	for (i = 0; i < npevs; i++) {
>  		ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs,
>  					       pkgs[i].ntevs,
> -					       probe_conf.force_add);
> +					       probe_conf.force_add, result);
>  		if (ret < 0)
>  			break;
>  	}
> @@ -2819,7 +2826,7 @@ end:
>  	return ret;
>  }
> 
> -int del_perf_probe_events(struct strfilter *filter)
> +int del_perf_probe_events(struct strfilter *filter, struct strbuf *result)
>  {
>  	int ret, ret2, ufd = -1, kfd = -1;
>  	char *str = strfilter__string(filter);
> @@ -2834,11 +2841,11 @@ int del_perf_probe_events(struct strfilter *filter)
>  	if (ret < 0)
>  		goto out;
> 
> -	ret = probe_file__del_events(kfd, filter);
> +	ret = probe_file__del_events(kfd, filter, result);
>  	if (ret < 0 && ret != -ENOENT)
>  		goto error;
> 
> -	ret2 = probe_file__del_events(ufd, filter);
> +	ret2 = probe_file__del_events(ufd, filter, result);
>  	if (ret2 < 0 && ret2 != -ENOENT) {
>  		ret = ret2;
>  		goto error;
> diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
> index 6e7ec68..9855dbf 100644
> --- a/tools/perf/util/probe-event.h
> +++ b/tools/perf/util/probe-event.h
> @@ -137,8 +137,10 @@ extern void line_range__clear(struct line_range *lr);
>  /* Initialize line range */
>  extern int line_range__init(struct line_range *lr);
> 
> -extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs);
> -extern int del_perf_probe_events(struct strfilter *filter);
> +extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> +				 struct strbuf *result);
> +extern int del_perf_probe_events(struct strfilter *filter,
> +				 struct strbuf *result);
>  extern int show_perf_probe_events(struct strfilter *filter);
>  extern int show_line_range(struct line_range *lr, const char *module,
>  			   bool user);
> diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
> index bbb2437..e22fa12 100644
> --- a/tools/perf/util/probe-file.c
> +++ b/tools/perf/util/probe-file.c
> @@ -267,7 +267,6 @@ static int __del_trace_probe_event(int fd, struct str_node *ent)
>  		goto error;
>  	}
> 
> -	pr_info("Removed event: %s\n", ent->s);
>  	return 0;
>  error:
>  	pr_warning("Failed to delete event: %s\n",
> @@ -275,7 +274,7 @@ error:
>  	return ret;
>  }
> 
> -int probe_file__del_events(int fd, struct strfilter *filter)
> +int probe_file__del_events(int fd, struct strfilter *filter, struct strbuf *buf)
>  {
>  	struct strlist *namelist;
>  	struct str_node *ent;
> @@ -293,6 +292,8 @@ int probe_file__del_events(int fd, struct strfilter *filter)
>  			ret = __del_trace_probe_event(fd, ent);
>  			if (ret < 0)
>  				break;
> +			if (buf)
> +				strbuf_addf(buf, "Removed event: %s\n", ent->s);
>  		}
>  	}
>  	strlist__delete(namelist);
> diff --git a/tools/perf/util/probe-file.h b/tools/perf/util/probe-file.h
> index ada94a2..ee89ef0 100644
> --- a/tools/perf/util/probe-file.h
> +++ b/tools/perf/util/probe-file.h
> @@ -1,6 +1,7 @@
>  #ifndef __PROBE_FILE_H
>  #define __PROBE_FILE_H
> 
> +#include "strbuf.h"
>  #include "strlist.h"
>  #include "strfilter.h"
>  #include "probe-event.h"
> @@ -13,6 +14,7 @@ int probe_file__open_both(int *kfd, int *ufd, int flag);
>  struct strlist *probe_file__get_namelist(int fd);
>  struct strlist *probe_file__get_rawlist(int fd);
>  int probe_file__add_event(int fd, struct probe_trace_event *tev);
> -int probe_file__del_events(int fd, struct strfilter *filter);
> +int probe_file__del_events(int fd, struct strfilter *filter,
> +			   struct strbuf *buf);
> 
>  #endif
> 


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe
  2015-09-03 12:18       ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-03 17:25         ` Namhyung Kim
  0 siblings, 0 replies; 94+ messages in thread
From: Namhyung Kim @ 2015-09-03 17:25 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI
  Cc: Arnaldo Carvalho de Melo, Wang Nan, Kaixu Xia, Peter Zijlstra,
	Daniel Borkmann, linux-kernel, He Kuang, lizefan, Jiri Olsa,
	David Ahern, Brendan Gregg, mingo, ast

On Thu, Sep 03, 2015 at 12:18:28PM +0000, 平松雅巳 / HIRAMATU,MASAMI wrote:
> Hi Namhyung,
> 
> So, I hope this would be what you've suggested.
> 
> Thank you,

Hi Masami,

I had something different in mind, but this can be OK.  Anyway, I'll send my
idea soon.

Thanks,
Namhyung


> 
> -- 
> Masami HIRAMATSU
> Linux Technology Research Center, System Productivity Research Dept.
> Center for Technology Innovation - Systems Engineering
> Hitachi, Ltd., Research & Development Group
> E-mail: masami.hiramatsu.pt@hitachi.com
> 
> 
> > -----Original Message-----
> > From: Masami Hiramatsu [mailto:masami.hiramatsu.pt@hitachi.com]
> > Sent: Thursday, September 03, 2015 9:11 PM
> > To: Namhyung Kim; Arnaldo Carvalho de Melo
> > Cc: Wang Nan; Kaixu Xia; Peter Zijlstra; Daniel Borkmann; linux-kernel@vger.kernel.org; He Kuang; lizefan@huawei.com;
> > Jiri Olsa; David Ahern; Brendan Gregg; mingo@kernel.org; ast@plumgrid.com
> > Subject: [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe
> > 
> > Output the normal results of adding/deleting probes from builtin-probe
> > instead of printing them inside add/del_perf_probe_events.
> > The whole result string is stored in the "result" strbuf parameter.
> > To ignore the result string, pass NULL as "result".
> > Note that all warning/debug messages are still printed by
> > add/del_perf_probe_events.
> > 
> > Suggested-by: Namhyung Kim <namhyung@gmail.com>
> > Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > ---
> >  tools/perf/builtin-probe.c    |    9 +++++++--
> >  tools/perf/util/probe-event.c |   33 ++++++++++++++++++++-------------
> >  tools/perf/util/probe-event.h |    6 ++++--
> >  tools/perf/util/probe-file.c  |    5 +++--
> >  tools/perf/util/probe-file.h  |    4 +++-
> >  5 files changed, 37 insertions(+), 20 deletions(-)
> > 
> > diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> > index b81cec3..d11ad21 100644
> > --- a/tools/perf/builtin-probe.c
> > +++ b/tools/perf/builtin-probe.c
> > @@ -402,6 +402,7 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> >  		    "Enable kernel symbol demangling"),
> >  	OPT_END()
> >  	};
> > +	struct strbuf buf = STRBUF_INIT;
> >  	int ret;
> > 
> >  	set_option_flag(options, 'a', "add", PARSE_OPT_EXCLUSIVE);
> > @@ -483,7 +484,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> >  		return ret;
> >  #endif
> >  	case 'd':
> > -		ret = del_perf_probe_events(params.filter);
> > +		ret = del_perf_probe_events(params.filter, &buf);
> > +		/* Even if failed, we should show the result first */
> > +		pr_info("%s", buf.buf);
> >  		if (ret < 0) {
> >  			pr_err_with_code("  Error: Failed to delete events.", ret);
> >  			return ret;
> > @@ -496,7 +499,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> >  			usage_with_options(probe_usage, options);
> >  		}
> > 
> > -		ret = add_perf_probe_events(params.events, params.nevents);
> > +		ret = add_perf_probe_events(params.events, params.nevents, &buf);
> > +		/* Even if failed, we should show the result first */
> > +		pr_info("%s", buf.buf);
> >  		if (ret < 0) {
> >  			pr_err_with_code("  Error: Failed to add events.", ret);
> >  			return ret;
> > diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> > index eb5f18b..1a3ed7c 100644
> > --- a/tools/perf/util/probe-event.c
> > +++ b/tools/perf/util/probe-event.c
> > @@ -2395,7 +2395,8 @@ static int probe_trace_event__set_name(struct probe_trace_event *tev,
> > 
> >  static int __add_probe_trace_events(struct perf_probe_event *pev,
> >  				     struct probe_trace_event *tevs,
> > -				     int ntevs, bool allow_suffix)
> > +				     int ntevs, bool allow_suffix,
> > +				     struct strbuf *buf)
> >  {
> >  	int i, fd, ret;
> >  	struct probe_trace_event *tev = NULL;
> > @@ -2415,7 +2416,9 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
> >  	}
> > 
> >  	ret = 0;
> > -	pr_info("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
> > +	if (buf)
> > +		strbuf_addf(buf, "Added new event%s\n",
> > +			    (ntevs > 1) ? "s:" : ":");
> >  	for (i = 0; i < ntevs; i++) {
> >  		tev = &tevs[i];
> >  		/* Skip if the symbol is out of .text or blacklisted */
> > @@ -2432,9 +2435,12 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
> >  		if (ret < 0)
> >  			break;
> > 
> > -		/* We use tev's name for showing new events */
> > -		show_perf_probe_event(tev->group, tev->event, pev,
> > -				      tev->point.module, false);
> > +		if (buf) {
> > +			/* We use tev's name for showing new events */
> > +			perf_probe_event__sprintf(tev->group, tev->event,
> > +						  pev, tev->point.module, buf);
> > +			strbuf_addch(buf, '\n');
> > +		}
> >  		/* Save the last valid name */
> >  		event = tev->event;
> >  		group = tev->group;
> > @@ -2451,10 +2457,10 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
> >  		warn_uprobe_event_compat(tev);
> > 
> >  	/* Note that it is possible to skip all events because of blacklist */
> > -	if (ret >= 0 && event) {
> > +	if (ret >= 0 && event && buf) {
> >  		/* Show how to use the event. */
> > -		pr_info("\nYou can now use it in all perf tools, such as:\n\n");
> > -		pr_info("\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
> > +		strbuf_addf(buf, "\nYou can now use it in all perf tools, such as:\n\n");
> > +		strbuf_addf(buf, "\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
> >  	}
> > 
> >  	strlist__delete(namelist);
> > @@ -2765,7 +2771,8 @@ struct __event_package {
> >  	int				ntevs;
> >  };
> > 
> > -int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
> > +int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> > +			  struct strbuf *result)
> >  {
> >  	int i, j, ret;
> >  	struct __event_package *pkgs;
> > @@ -2802,7 +2809,7 @@ int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
> >  	for (i = 0; i < npevs; i++) {
> >  		ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs,
> >  					       pkgs[i].ntevs,
> > -					       probe_conf.force_add);
> > +					       probe_conf.force_add, result);
> >  		if (ret < 0)
> >  			break;
> >  	}
> > @@ -2819,7 +2826,7 @@ end:
> >  	return ret;
> >  }
> > 
> > -int del_perf_probe_events(struct strfilter *filter)
> > +int del_perf_probe_events(struct strfilter *filter, struct strbuf *result)
> >  {
> >  	int ret, ret2, ufd = -1, kfd = -1;
> >  	char *str = strfilter__string(filter);
> > @@ -2834,11 +2841,11 @@ int del_perf_probe_events(struct strfilter *filter)
> >  	if (ret < 0)
> >  		goto out;
> > 
> > -	ret = probe_file__del_events(kfd, filter);
> > +	ret = probe_file__del_events(kfd, filter, result);
> >  	if (ret < 0 && ret != -ENOENT)
> >  		goto error;
> > 
> > -	ret2 = probe_file__del_events(ufd, filter);
> > +	ret2 = probe_file__del_events(ufd, filter, result);
> >  	if (ret2 < 0 && ret2 != -ENOENT) {
> >  		ret = ret2;
> >  		goto error;
> > diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
> > index 6e7ec68..9855dbf 100644
> > --- a/tools/perf/util/probe-event.h
> > +++ b/tools/perf/util/probe-event.h
> > @@ -137,8 +137,10 @@ extern void line_range__clear(struct line_range *lr);
> >  /* Initialize line range */
> >  extern int line_range__init(struct line_range *lr);
> > 
> > -extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs);
> > -extern int del_perf_probe_events(struct strfilter *filter);
> > +extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> > +				 struct strbuf *result);
> > +extern int del_perf_probe_events(struct strfilter *filter,
> > +				 struct strbuf *result);
> >  extern int show_perf_probe_events(struct strfilter *filter);
> >  extern int show_line_range(struct line_range *lr, const char *module,
> >  			   bool user);
> > diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
> > index bbb2437..e22fa12 100644
> > --- a/tools/perf/util/probe-file.c
> > +++ b/tools/perf/util/probe-file.c
> > @@ -267,7 +267,6 @@ static int __del_trace_probe_event(int fd, struct str_node *ent)
> >  		goto error;
> >  	}
> > 
> > -	pr_info("Removed event: %s\n", ent->s);
> >  	return 0;
> >  error:
> >  	pr_warning("Failed to delete event: %s\n",
> > @@ -275,7 +274,7 @@ error:
> >  	return ret;
> >  }
> > 
> > -int probe_file__del_events(int fd, struct strfilter *filter)
> > +int probe_file__del_events(int fd, struct strfilter *filter, struct strbuf *buf)
> >  {
> >  	struct strlist *namelist;
> >  	struct str_node *ent;
> > @@ -293,6 +292,8 @@ int probe_file__del_events(int fd, struct strfilter *filter)
> >  			ret = __del_trace_probe_event(fd, ent);
> >  			if (ret < 0)
> >  				break;
> > +			if (buf)
> > +				strbuf_addf(buf, "Removed event: %s\n", ent->s);
> >  		}
> >  	}
> >  	strlist__delete(namelist);
> > diff --git a/tools/perf/util/probe-file.h b/tools/perf/util/probe-file.h
> > index ada94a2..ee89ef0 100644
> > --- a/tools/perf/util/probe-file.h
> > +++ b/tools/perf/util/probe-file.h
> > @@ -1,6 +1,7 @@
> >  #ifndef __PROBE_FILE_H
> >  #define __PROBE_FILE_H
> > 
> > +#include "strbuf.h"
> >  #include "strlist.h"
> >  #include "strfilter.h"
> >  #include "probe-event.h"
> > @@ -13,6 +14,7 @@ int probe_file__open_both(int *kfd, int *ufd, int flag);
> >  struct strlist *probe_file__get_namelist(int fd);
> >  struct strlist *probe_file__get_rawlist(int fd);
> >  int probe_file__add_event(int fd, struct probe_trace_event *tev);
> > -int probe_file__del_events(int fd, struct strfilter *filter);
> > +int probe_file__del_events(int fd, struct strfilter *filter,
> > +			   struct strbuf *buf);
> > 
> >  #endif
> > 
> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe
  2015-09-03 12:10     ` [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe Masami Hiramatsu
  2015-09-03 12:18       ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-03 20:28       ` Arnaldo Carvalho de Melo
  2015-09-04  1:30         ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 1 reply; 94+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-03 20:28 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Namhyung Kim, Wang Nan, Kaixu Xia, Peter Zijlstra,
	Daniel Borkmann, linux-kernel, He Kuang, lizefan, Jiri Olsa,
	David Ahern, Brendan Gregg, mingo, ast

Em Thu, Sep 03, 2015 at 09:10:35PM +0900, Masami Hiramatsu escreveu:
> Output the normal results of adding/deleting probes from builtin-probe
> instead of printing them inside add/del_perf_probe_events.
> The whole result string is stored in the "result" strbuf parameter.
> To ignore the result string, pass NULL as "result".
> Note that all warning/debug messages are still printed by
> add/del_perf_probe_events.
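
For readers following the thread, the calling convention described above
(the library appends its normal output to a caller-supplied buffer, and a
NULL buffer means "discard it") can be sketched roughly as below. This is
only an illustration under stated assumptions: struct result_buf,
result_buf_addf() and del_events() are hypothetical stand-ins written for
this sketch, not perf's strbuf API and not the functions touched by the
patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string buffer; a stand-in, not perf's struct strbuf. */
struct result_buf {
	char *buf;	/* NUL-terminated text, or NULL while empty */
	size_t len;	/* bytes used, excluding the trailing NUL */
	size_t alloc;	/* bytes allocated */
};

/* Append formatted text to rb. Returns 0 on success, -1 on failure. */
static int result_buf_addf(struct result_buf *rb, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(rb && fmt);

	/* First pass: measure how many bytes the formatted text needs. */
	va_start(ap, fmt);
	n = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (n < 0)
		return -1;

	if (rb->len + (size_t)n + 1 > rb->alloc) {
		size_t want = rb->len + (size_t)n + 1;
		char *tmp = realloc(rb->buf, want);

		if (!tmp)
			return -1;
		rb->buf = tmp;
		rb->alloc = want;
	}

	/* Second pass: format directly after the existing contents. */
	va_start(ap, fmt);
	vsnprintf(rb->buf + rb->len, (size_t)n + 1, fmt, ap);
	va_end(ap);
	rb->len += (size_t)n;
	return 0;
}

/*
 * Library-side function (hypothetical): it never prints the normal
 * result itself. If 'result' is NULL the messages are simply dropped.
 */
static int del_events(const char **names, int n, struct result_buf *result)
{
	int i;

	for (i = 0; i < n; i++) {
		/* The real deletion work would happen here. */
		if (result &&
		    result_buf_addf(result, "Removed event: %s\n", names[i]) < 0)
			return -1;
	}
	return 0;
}
```

A command-line front end would pass a buffer and print it once the call
returns, even on failure, while a caller that wants silence would pass
NULL. Note that in this sketch the buffer pointer stays NULL when nothing
was appended, so a caller has to check it before printing.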

Please provide the before and after output of the affected tools.

But I'll wait for you to react to Namhyung's RFC.

- Arnaldo
 
> Suggested-by: Namhyung Kim <namhyung@gmail.com>
> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> ---
>  tools/perf/builtin-probe.c    |    9 +++++++--
>  tools/perf/util/probe-event.c |   33 ++++++++++++++++++++-------------
>  tools/perf/util/probe-event.h |    6 ++++--
>  tools/perf/util/probe-file.c  |    5 +++--
>  tools/perf/util/probe-file.h  |    4 +++-
>  5 files changed, 37 insertions(+), 20 deletions(-)
> 
> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index b81cec3..d11ad21 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -402,6 +402,7 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>  		    "Enable kernel symbol demangling"),
>  	OPT_END()
>  	};
> +	struct strbuf buf = STRBUF_INIT;
>  	int ret;
>  
>  	set_option_flag(options, 'a', "add", PARSE_OPT_EXCLUSIVE);
> @@ -483,7 +484,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>  		return ret;
>  #endif
>  	case 'd':
> -		ret = del_perf_probe_events(params.filter);
> +		ret = del_perf_probe_events(params.filter, &buf);
> +		/* Even if failed, we should show the result first */
> +		pr_info("%s", buf.buf);
>  		if (ret < 0) {
>  			pr_err_with_code("  Error: Failed to delete events.", ret);
>  			return ret;
> @@ -496,7 +499,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>  			usage_with_options(probe_usage, options);
>  		}
>  
> -		ret = add_perf_probe_events(params.events, params.nevents);
> +		ret = add_perf_probe_events(params.events, params.nevents, &buf);
> +		/* Even if failed, we should show the result first */
> +		pr_info("%s", buf.buf);
>  		if (ret < 0) {
>  			pr_err_with_code("  Error: Failed to add events.", ret);
>  			return ret;
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index eb5f18b..1a3ed7c 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2395,7 +2395,8 @@ static int probe_trace_event__set_name(struct probe_trace_event *tev,
>  
>  static int __add_probe_trace_events(struct perf_probe_event *pev,
>  				     struct probe_trace_event *tevs,
> -				     int ntevs, bool allow_suffix)
> +				     int ntevs, bool allow_suffix,
> +				     struct strbuf *buf)
>  {
>  	int i, fd, ret;
>  	struct probe_trace_event *tev = NULL;
> @@ -2415,7 +2416,9 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
>  	}
>  
>  	ret = 0;
> -	pr_info("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
> +	if (buf)
> +		strbuf_addf(buf, "Added new event%s\n",
> +			    (ntevs > 1) ? "s:" : ":");
>  	for (i = 0; i < ntevs; i++) {
>  		tev = &tevs[i];
>  		/* Skip if the symbol is out of .text or blacklisted */
> @@ -2432,9 +2435,12 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
>  		if (ret < 0)
>  			break;
>  
> -		/* We use tev's name for showing new events */
> -		show_perf_probe_event(tev->group, tev->event, pev,
> -				      tev->point.module, false);
> +		if (buf) {
> +			/* We use tev's name for showing new events */
> +			perf_probe_event__sprintf(tev->group, tev->event,
> +						  pev, tev->point.module, buf);
> +			strbuf_addch(buf, '\n');
> +		}
>  		/* Save the last valid name */
>  		event = tev->event;
>  		group = tev->group;
> @@ -2451,10 +2457,10 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
>  		warn_uprobe_event_compat(tev);
>  
>  	/* Note that it is possible to skip all events because of blacklist */
> -	if (ret >= 0 && event) {
> +	if (ret >= 0 && event && buf) {
>  		/* Show how to use the event. */
> -		pr_info("\nYou can now use it in all perf tools, such as:\n\n");
> -		pr_info("\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
> +		strbuf_addf(buf, "\nYou can now use it in all perf tools, such as:\n\n");
> +		strbuf_addf(buf, "\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
>  	}
>  
>  	strlist__delete(namelist);
> @@ -2765,7 +2771,8 @@ struct __event_package {
>  	int				ntevs;
>  };
>  
> -int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
> +int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> +			  struct strbuf *result)
>  {
>  	int i, j, ret;
>  	struct __event_package *pkgs;
> @@ -2802,7 +2809,7 @@ int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
>  	for (i = 0; i < npevs; i++) {
>  		ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs,
>  					       pkgs[i].ntevs,
> -					       probe_conf.force_add);
> +					       probe_conf.force_add, result);
>  		if (ret < 0)
>  			break;
>  	}
> @@ -2819,7 +2826,7 @@ end:
>  	return ret;
>  }
>  
> -int del_perf_probe_events(struct strfilter *filter)
> +int del_perf_probe_events(struct strfilter *filter, struct strbuf *result)
>  {
>  	int ret, ret2, ufd = -1, kfd = -1;
>  	char *str = strfilter__string(filter);
> @@ -2834,11 +2841,11 @@ int del_perf_probe_events(struct strfilter *filter)
>  	if (ret < 0)
>  		goto out;
>  
> -	ret = probe_file__del_events(kfd, filter);
> +	ret = probe_file__del_events(kfd, filter, result);
>  	if (ret < 0 && ret != -ENOENT)
>  		goto error;
>  
> -	ret2 = probe_file__del_events(ufd, filter);
> +	ret2 = probe_file__del_events(ufd, filter, result);
>  	if (ret2 < 0 && ret2 != -ENOENT) {
>  		ret = ret2;
>  		goto error;
> diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
> index 6e7ec68..9855dbf 100644
> --- a/tools/perf/util/probe-event.h
> +++ b/tools/perf/util/probe-event.h
> @@ -137,8 +137,10 @@ extern void line_range__clear(struct line_range *lr);
>  /* Initialize line range */
>  extern int line_range__init(struct line_range *lr);
>  
> -extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs);
> -extern int del_perf_probe_events(struct strfilter *filter);
> +extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> +				 struct strbuf *result);
> +extern int del_perf_probe_events(struct strfilter *filter,
> +				 struct strbuf *result);
>  extern int show_perf_probe_events(struct strfilter *filter);
>  extern int show_line_range(struct line_range *lr, const char *module,
>  			   bool user);
> diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
> index bbb2437..e22fa12 100644
> --- a/tools/perf/util/probe-file.c
> +++ b/tools/perf/util/probe-file.c
> @@ -267,7 +267,6 @@ static int __del_trace_probe_event(int fd, struct str_node *ent)
>  		goto error;
>  	}
>  
> -	pr_info("Removed event: %s\n", ent->s);
>  	return 0;
>  error:
>  	pr_warning("Failed to delete event: %s\n",
> @@ -275,7 +274,7 @@ error:
>  	return ret;
>  }
>  
> -int probe_file__del_events(int fd, struct strfilter *filter)
> +int probe_file__del_events(int fd, struct strfilter *filter, struct strbuf *buf)
>  {
>  	struct strlist *namelist;
>  	struct str_node *ent;
> @@ -293,6 +292,8 @@ int probe_file__del_events(int fd, struct strfilter *filter)
>  			ret = __del_trace_probe_event(fd, ent);
>  			if (ret < 0)
>  				break;
> +			if (buf)
> +				strbuf_addf(buf, "Removed event: %s\n", ent->s);
>  		}
>  	}
>  	strlist__delete(namelist);
> diff --git a/tools/perf/util/probe-file.h b/tools/perf/util/probe-file.h
> index ada94a2..ee89ef0 100644
> --- a/tools/perf/util/probe-file.h
> +++ b/tools/perf/util/probe-file.h
> @@ -1,6 +1,7 @@
>  #ifndef __PROBE_FILE_H
>  #define __PROBE_FILE_H
>  
> +#include "strbuf.h"
>  #include "strlist.h"
>  #include "strfilter.h"
>  #include "probe-event.h"
> @@ -13,6 +14,7 @@ int probe_file__open_both(int *kfd, int *ufd, int flag);
>  struct strlist *probe_file__get_namelist(int fd);
>  struct strlist *probe_file__get_rawlist(int fd);
>  int probe_file__add_event(int fd, struct probe_trace_event *tev);
> -int probe_file__del_events(int fd, struct strfilter *filter);
> +int probe_file__del_events(int fd, struct strfilter *filter,
> +			   struct strbuf *buf);
>  
>  #endif
> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* RE: [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe
  2015-09-03 20:28       ` Arnaldo Carvalho de Melo
@ 2015-09-04  1:30         ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 0 replies; 94+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-04  1:30 UTC (permalink / raw)
  To: 'Arnaldo Carvalho de Melo'
  Cc: Namhyung Kim, Wang Nan, Kaixu Xia, Peter Zijlstra,
	Daniel Borkmann, linux-kernel, He Kuang, lizefan, Jiri Olsa,
	David Ahern, Brendan Gregg, mingo, ast

> From: Arnaldo Carvalho de Melo [mailto:acme@kernel.org]
> 
> Em Thu, Sep 03, 2015 at 09:10:35PM +0900, Masami Hiramatsu escreveu:
> > Output the normal results of adding/deleting probes from builtin-probe
> > instead of printing them inside add/del_perf_probe_events.
> > The whole result string is stored in the "result" strbuf parameter.
> > To ignore the result string, pass NULL as "result".
> > Note that all warning/debug messages are still printed by
> > add/del_perf_probe_events.
> 
> Please provide the before and after output of the affected tools.
> 
> But I'll wait for you to react to Namhyung's RFC.

Yeah, I think his series is much better than this ad hoc fix :)
I'll reply to him ASAP.

Thanks!

> 
> - Arnaldo
> 
> > Suggested-by: Namhyung Kim <namhyung@gmail.com>
> > Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > ---
> >  tools/perf/builtin-probe.c    |    9 +++++++--
> >  tools/perf/util/probe-event.c |   33 ++++++++++++++++++++-------------
> >  tools/perf/util/probe-event.h |    6 ++++--
> >  tools/perf/util/probe-file.c  |    5 +++--
> >  tools/perf/util/probe-file.h  |    4 +++-
> >  5 files changed, 37 insertions(+), 20 deletions(-)
> >
> > diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> > index b81cec3..d11ad21 100644
> > --- a/tools/perf/builtin-probe.c
> > +++ b/tools/perf/builtin-probe.c
> > @@ -402,6 +402,7 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> >  		    "Enable kernel symbol demangling"),
> >  	OPT_END()
> >  	};
> > +	struct strbuf buf = STRBUF_INIT;
> >  	int ret;
> >
> >  	set_option_flag(options, 'a', "add", PARSE_OPT_EXCLUSIVE);
> > @@ -483,7 +484,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> >  		return ret;
> >  #endif
> >  	case 'd':
> > -		ret = del_perf_probe_events(params.filter);
> > +		ret = del_perf_probe_events(params.filter, &buf);
> > +		/* Even if failed, we should show the result first */
> > +		pr_info("%s", buf.buf);
> >  		if (ret < 0) {
> >  			pr_err_with_code("  Error: Failed to delete events.", ret);
> >  			return ret;
> > @@ -496,7 +499,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> >  			usage_with_options(probe_usage, options);
> >  		}
> >
> > -		ret = add_perf_probe_events(params.events, params.nevents);
> > +		ret = add_perf_probe_events(params.events, params.nevents, &buf);
> > +		/* Even if failed, we should show the result first */
> > +		pr_info("%s", buf.buf);
> >  		if (ret < 0) {
> >  			pr_err_with_code("  Error: Failed to add events.", ret);
> >  			return ret;
> > diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> > index eb5f18b..1a3ed7c 100644
> > --- a/tools/perf/util/probe-event.c
> > +++ b/tools/perf/util/probe-event.c
> > @@ -2395,7 +2395,8 @@ static int probe_trace_event__set_name(struct probe_trace_event *tev,
> >
> >  static int __add_probe_trace_events(struct perf_probe_event *pev,
> >  				     struct probe_trace_event *tevs,
> > -				     int ntevs, bool allow_suffix)
> > +				     int ntevs, bool allow_suffix,
> > +				     struct strbuf *buf)
> >  {
> >  	int i, fd, ret;
> >  	struct probe_trace_event *tev = NULL;
> > @@ -2415,7 +2416,9 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
> >  	}
> >
> >  	ret = 0;
> > -	pr_info("Added new event%s\n", (ntevs > 1) ? "s:" : ":");
> > +	if (buf)
> > +		strbuf_addf(buf, "Added new event%s\n",
> > +			    (ntevs > 1) ? "s:" : ":");
> >  	for (i = 0; i < ntevs; i++) {
> >  		tev = &tevs[i];
> >  		/* Skip if the symbol is out of .text or blacklisted */
> > @@ -2432,9 +2435,12 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
> >  		if (ret < 0)
> >  			break;
> >
> > -		/* We use tev's name for showing new events */
> > -		show_perf_probe_event(tev->group, tev->event, pev,
> > -				      tev->point.module, false);
> > +		if (buf) {
> > +			/* We use tev's name for showing new events */
> > +			perf_probe_event__sprintf(tev->group, tev->event,
> > +						  pev, tev->point.module, buf);
> > +			strbuf_addch(buf, '\n');
> > +		}
> >  		/* Save the last valid name */
> >  		event = tev->event;
> >  		group = tev->group;
> > @@ -2451,10 +2457,10 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
> >  		warn_uprobe_event_compat(tev);
> >
> >  	/* Note that it is possible to skip all events because of blacklist */
> > -	if (ret >= 0 && event) {
> > +	if (ret >= 0 && event && buf) {
> >  		/* Show how to use the event. */
> > -		pr_info("\nYou can now use it in all perf tools, such as:\n\n");
> > -		pr_info("\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
> > +		strbuf_addf(buf, "\nYou can now use it in all perf tools, such as:\n\n");
> > +		strbuf_addf(buf, "\tperf record -e %s:%s -aR sleep 1\n\n", group, event);
> >  	}
> >
> >  	strlist__delete(namelist);
> > @@ -2765,7 +2771,8 @@ struct __event_package {
> >  	int				ntevs;
> >  };
> >
> > -int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
> > +int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> > +			  struct strbuf *result)
> >  {
> >  	int i, j, ret;
> >  	struct __event_package *pkgs;
> > @@ -2802,7 +2809,7 @@ int add_perf_probe_events(struct perf_probe_event *pevs, int npevs)
> >  	for (i = 0; i < npevs; i++) {
> >  		ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs,
> >  					       pkgs[i].ntevs,
> > -					       probe_conf.force_add);
> > +					       probe_conf.force_add, result);
> >  		if (ret < 0)
> >  			break;
> >  	}
> > @@ -2819,7 +2826,7 @@ end:
> >  	return ret;
> >  }
> >
> > -int del_perf_probe_events(struct strfilter *filter)
> > +int del_perf_probe_events(struct strfilter *filter, struct strbuf *result)
> >  {
> >  	int ret, ret2, ufd = -1, kfd = -1;
> >  	char *str = strfilter__string(filter);
> > @@ -2834,11 +2841,11 @@ int del_perf_probe_events(struct strfilter *filter)
> >  	if (ret < 0)
> >  		goto out;
> >
> > -	ret = probe_file__del_events(kfd, filter);
> > +	ret = probe_file__del_events(kfd, filter, result);
> >  	if (ret < 0 && ret != -ENOENT)
> >  		goto error;
> >
> > -	ret2 = probe_file__del_events(ufd, filter);
> > +	ret2 = probe_file__del_events(ufd, filter, result);
> >  	if (ret2 < 0 && ret2 != -ENOENT) {
> >  		ret = ret2;
> >  		goto error;
> > diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
> > index 6e7ec68..9855dbf 100644
> > --- a/tools/perf/util/probe-event.h
> > +++ b/tools/perf/util/probe-event.h
> > @@ -137,8 +137,10 @@ extern void line_range__clear(struct line_range *lr);
> >  /* Initialize line range */
> >  extern int line_range__init(struct line_range *lr);
> >
> > -extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs);
> > -extern int del_perf_probe_events(struct strfilter *filter);
> > +extern int add_perf_probe_events(struct perf_probe_event *pevs, int npevs,
> > +				 struct strbuf *result);
> > +extern int del_perf_probe_events(struct strfilter *filter,
> > +				 struct strbuf *result);
> >  extern int show_perf_probe_events(struct strfilter *filter);
> >  extern int show_line_range(struct line_range *lr, const char *module,
> >  			   bool user);
> > diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
> > index bbb2437..e22fa12 100644
> > --- a/tools/perf/util/probe-file.c
> > +++ b/tools/perf/util/probe-file.c
> > @@ -267,7 +267,6 @@ static int __del_trace_probe_event(int fd, struct str_node *ent)
> >  		goto error;
> >  	}
> >
> > -	pr_info("Removed event: %s\n", ent->s);
> >  	return 0;
> >  error:
> >  	pr_warning("Failed to delete event: %s\n",
> > @@ -275,7 +274,7 @@ error:
> >  	return ret;
> >  }
> >
> > -int probe_file__del_events(int fd, struct strfilter *filter)
> > +int probe_file__del_events(int fd, struct strfilter *filter, struct strbuf *buf)
> >  {
> >  	struct strlist *namelist;
> >  	struct str_node *ent;
> > @@ -293,6 +292,8 @@ int probe_file__del_events(int fd, struct strfilter *filter)
> >  			ret = __del_trace_probe_event(fd, ent);
> >  			if (ret < 0)
> >  				break;
> > +			if (buf)
> > +				strbuf_addf(buf, "Removed event: %s\n", ent->s);
> >  		}
> >  	}
> >  	strlist__delete(namelist);
> > diff --git a/tools/perf/util/probe-file.h b/tools/perf/util/probe-file.h
> > index ada94a2..ee89ef0 100644
> > --- a/tools/perf/util/probe-file.h
> > +++ b/tools/perf/util/probe-file.h
> > @@ -1,6 +1,7 @@
> >  #ifndef __PROBE_FILE_H
> >  #define __PROBE_FILE_H
> >
> > +#include "strbuf.h"
> >  #include "strlist.h"
> >  #include "strfilter.h"
> >  #include "probe-event.h"
> > @@ -13,6 +14,7 @@ int probe_file__open_both(int *kfd, int *ufd, int flag);
> >  struct strlist *probe_file__get_namelist(int fd);
> >  struct strlist *probe_file__get_rawlist(int fd);
> >  int probe_file__add_event(int fd, struct probe_trace_event *tev);
> > -int probe_file__del_events(int fd, struct strfilter *filter);
> > +int probe_file__del_events(int fd, struct strfilter *filter,
> > +			   struct strbuf *buf);
> >
> >  #endif
> >


* Re: [PATCH 18/31] perf test: Add 'perf test BPF'
  2015-09-02 12:45   ` Namhyung Kim
@ 2015-09-05 12:21     ` Wang Nan
  0 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-09-05 12:21 UTC (permalink / raw)
  To: Namhyung Kim, Wang Nan
  Cc: acme, mingo, ast, linux-kernel, lizefan, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Peter Zijlstra



On 09/02/2015 08:45 PM, Namhyung Kim wrote:
> On Sat, Aug 29, 2015 at 04:21:52AM +0000, Wang Nan wrote:
>> This patch adds a BPF testcase for testing BPF event filtering.
>>
>> Using the result of 'perf test LLVM', this patch compiles the eBPF
>> sample program and then tests that it works. The BPF script in 'perf
>> test LLVM' collects half of the executions of epoll_pwait(). This patch
>> calls it 111 times, so the result should contain 56 samples.
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: David Ahern <dsahern@gmail.com>
>> Cc: He Kuang <hekuang@huawei.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Kaixu Xia <xiakaixu@huawei.com>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> Link: http://lkml.kernel.org/n/1440151770-129878-16-git-send-email-wangnan0@huawei.com
>> ---
>
> [SNIP]
>
>> +static int prepare_bpf(void *obj_buf, size_t obj_buf_sz)
>> +{
>> +	int err;
>> +	char errbuf[BUFSIZ];
>> +
>> +	err = bpf__prepare_load_buffer(obj_buf, obj_buf_sz, NULL);
>> +	if (err) {
>> +		bpf__strerror_prepare_load("[buffer]", false, err, errbuf,
>> +					   sizeof(errbuf));
>> +		fprintf(stderr, " (%s)", errbuf);
>> +		return TEST_FAIL;
>> +	}
>> +
>> +	err = bpf__probe();
>> +	if (err) {
>> +		bpf__strerror_load(err, errbuf, sizeof(errbuf));
>> +		fprintf(stderr, " (%s)", errbuf);
>> +		if (getuid() != 0)
>
> geteuid() ?
>
> Thanks,
> Namhyung
>

Already changed.

Thank you.

>
>> +			fprintf(stderr, " (try run as root)");
>> +		return TEST_FAIL;
>> +	}
>> +
>> +	err = bpf__load();
>> +	if (err) {
>> +		bpf__strerror_load(err, errbuf, sizeof(errbuf));
>> +		fprintf(stderr, " (%s)", errbuf);
>> +		return TEST_FAIL;
>> +	}
>> +
>> +	return 0;
>> +}



* [PATCH] perf tools: Allow BPF placeholder dummy events to collect --filter options
  2015-08-29  4:21 ` [PATCH 03/31] perf tools: Introduce dummy evsel Wang Nan
  2015-08-31 19:38   ` Arnaldo Carvalho de Melo
  2015-09-03  0:11   ` Namhyung Kim
@ 2015-09-06  5:55   ` Wang Nan
  2015-09-06  5:56     ` [PATCH] perf tools: Sync setting of real bpf events with placeholder Wang Nan
  2 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-09-06  5:55 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, Wang Nan, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	Zefan Li, pi3orama

This patch improves the collection and setting of filters: it allows
--filter and --exclude-perf to be set on BPF placeholder events (which
are software dummy events).

perf_evsel__is_dummy(), perf_evsel__is_bpf_placeholder() and
perf_evsel__can_filter() are introduced for this.

Test result:

 # perf record --event dummy --exclude-perf
 --exclude-perf option should follow a -e tracepoint option

 # perf record --event dummy:u --exclude-perf ls
 --exclude-perf option should follow a -e tracepoint option

 # perf record --event test.o --exclude-perf ls
 Added new event:
   perf_bpf_probe:func_vfs_write (on vfs_write)
 ...

 # perf record --event dummy.o --exclude-perf ls
 Added new event:
   perf_bpf_probe:func_write (on sys_write)
 ...

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
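The name check described above can be sketched in isolation. This is only
a rough illustration, not the code in the patch: name_has_suffix() and
dummy_name_is_bpf_placeholder() are hypothetical names, and the sketch
assumes the caller has already established that the evsel is a software
dummy event. Unlike the SUFFIX_CMP macro, it guards explicitly against
names shorter than the suffix.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when 'name' ends with 'suffix'. */
bool name_has_suffix(const char *name, const char *suffix)
{
	size_t name_len, suffix_len;

	assert(name && suffix);
	name_len = strlen(name);
	suffix_len = strlen(suffix);

	/* Guard against names shorter than the suffix. */
	if (name_len < suffix_len)
		return false;
	return strcmp(name + name_len - suffix_len, suffix) == 0;
}

/*
 * Hypothetical stand-in for the name part of the placeholder check.
 * Assumption: the event is already known to be a software dummy event,
 * so only the name decides.
 */
bool dummy_name_is_bpf_placeholder(const char *name)
{
	static const char dummy[] = "dummy";

	if (!name)
		return false;
	/* A name that does not start with "dummy" is not a plain dummy. */
	if (strncmp(name, dummy, sizeof(dummy) - 1))
		return true;
	/* Rare case such as "dummy_crazy.bpf": decide by suffix. */
	return name_has_suffix(name, ".o") ||
	       name_has_suffix(name, ".c") ||
	       name_has_suffix(name, ".bpf");
}
```

With this sketch, "dummy" and "dummy:u" are rejected while "test.o" and
"dummy.o" are accepted, which matches the test results listed above.
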
 tools/perf/util/evlist.c       |  7 +++++++
 tools/perf/util/evsel.c        | 32 ++++++++++++++++++++++++++++++++
 tools/perf/util/evsel.h        | 23 +++++++++++++++++++++++
 tools/perf/util/parse-events.c |  4 ++--
 4 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 93db4c1..29212dc 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1223,6 +1223,13 @@ int perf_evlist__apply_filters(struct perf_evlist *evlist, struct perf_evsel **e
 			continue;
 
 		/*
+		 * Filters are allowed to be set to dummy event for BPF object
+		 * placeholder. Don't really apply them.
+		 */
+		if (perf_evsel__is_dummy(evsel))
+			continue;
+
+		/*
 		 * filters only work for tracepoint event, which doesn't have cpu limit.
 		 * So evlist and evsel should always be same.
 		 */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 73cf9fc..e307ea2 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2344,3 +2344,35 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 			 err, strerror_r(err, sbuf, sizeof(sbuf)),
 			 perf_evsel__name(evsel));
 }
+
+bool perf_evsel__is_bpf_placeholder(struct perf_evsel *evsel)
+{
+	if (!perf_evsel__is_dummy(evsel))
+		return false;
+	if (!evsel->name)
+		return false;
+	/*
+	 * If evsel->name doesn't start with 'dummy', it must be a BPF
+	 * placeholder.
+	 */
+	if (strncmp(evsel->name, perf_evsel__sw_names[PERF_COUNT_SW_DUMMY],
+			strlen(perf_evsel__sw_names[PERF_COUNT_SW_DUMMY])))
+		return true;
+	/*
+	 * Very rare case: evsel->name is 'dummy_crazy.bpf'.
+	 *
+	 * Let's check the name suffix. A BPF file should end with one of:
+	 * '.o', '.c' or '.bpf'.
+	 */
+#define SUFFIX_CMP(s)\
+	strcmp(evsel->name + strlen(evsel->name) - (sizeof(s) - 1), s)
+
+	if (SUFFIX_CMP(".o") == 0)
+		return true;
+	if (SUFFIX_CMP(".c") == 0)
+		return true;
+	if (SUFFIX_CMP(".bpf") == 0)
+		return true;
+	return false;
+#undef SUFFIX_CMP
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index fd22f83..864fd3f 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -372,11 +372,34 @@ bool perf_evsel__fallback(struct perf_evsel *evsel, int err,
 int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 			      int err, char *msg, size_t size);
 
+bool perf_evsel__is_bpf_placeholder(struct perf_evsel *evsel);
+
 static inline int perf_evsel__group_idx(struct perf_evsel *evsel)
 {
 	return evsel->idx - evsel->leader->idx;
 }
 
+static inline bool perf_evsel__is_dummy(struct perf_evsel *evsel)
+{
+	if (!evsel)
+		return false;
+	if (evsel->attr.type != PERF_TYPE_SOFTWARE)
+		return false;
+	if (evsel->attr.config != PERF_COUNT_SW_DUMMY)
+		return false;
+	return true;
+}
+
+static inline int perf_evsel__can_filter(struct perf_evsel *evsel)
+{
+	if (!evsel)
+		return false;
+	if (evsel->attr.type == PERF_TYPE_TRACEPOINT)
+		return true;
+
+	return perf_evsel__is_bpf_placeholder(evsel);
+}
+
 #define for_each_group_member(_evsel, _leader) 					\
 for ((_evsel) = list_entry((_leader)->node.next, struct perf_evsel, node); 	\
      (_evsel) && (_evsel)->leader == (_leader);					\
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index b560f5f..d961e90 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1346,7 +1346,7 @@ static int set_filter(struct perf_evsel *evsel, const void *arg)
 {
 	const char *str = arg;
 
-	if (evsel == NULL || evsel->attr.type != PERF_TYPE_TRACEPOINT) {
+	if (!perf_evsel__can_filter(evsel)) {
 		fprintf(stderr,
 			"--filter option should follow a -e tracepoint option\n");
 		return -1;
@@ -1375,7 +1375,7 @@ static int add_exclude_perf_filter(struct perf_evsel *evsel,
 {
 	char new_filter[64];
 
-	if (evsel == NULL || evsel->attr.type != PERF_TYPE_TRACEPOINT) {
+	if (!perf_evsel__can_filter(evsel)) {
 		fprintf(stderr,
 			"--exclude-perf option should follow a -e tracepoint option\n");
 		return -1;
-- 
1.8.3.4



* [PATCH] perf tools: Sync setting of real bpf events with placeholder
  2015-09-06  5:55   ` [PATCH] perf tools: Allow BPF placeholder dummy events to collect --filter options Wang Nan
@ 2015-09-06  5:56     ` Wang Nan
  0 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-09-06  5:56 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, Wang Nan, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	Zefan Li, pi3orama

When adding the real events described in a BPF object, this patch syncs
their filter and tracking settings with the corresponding dummy
placeholder. After this patch, a command like:

 # perf record --test-filter.o --exclude-perf ls

works as expected.

After all settings are synced, the placeholders are removed from the
evlist so they won't appear in the final perf.data.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
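The sync step described above can be modelled roughly as follows. This is
a simplified sketch under stated assumptions, not the code in the patch:
struct sketch_evsel and sketch_sync_with_placeholder() are hypothetical,
arrays replace the evlist and list_head, and the filter pointer is shared
instead of being copied as perf_evsel__set_filter() would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for struct perf_evsel. */
struct sketch_evsel {
	const char *name;
	const char *filter;	/* NULL when no filter was given */
	bool tracking;
	bool is_placeholder;
};

/*
 * Copy filter and tracking settings from the placeholder named
 * 'obj_name' to the 'nr_real' events created from that object.
 * Returns 0 on success, -1 if no matching placeholder exists.
 */
int sketch_sync_with_placeholder(const struct sketch_evsel *evsels, size_t nr,
				 const char *obj_name,
				 struct sketch_evsel *real, size_t nr_real)
{
	const struct sketch_evsel *dummy = NULL;
	bool tracking_set = false;
	size_t i;

	assert(evsels && obj_name && real);

	/* Find the placeholder whose name matches the object name. */
	for (i = 0; i < nr; i++) {
		if (evsels[i].is_placeholder &&
		    strcmp(evsels[i].name, obj_name) == 0) {
			dummy = &evsels[i];
			break;
		}
	}
	if (!dummy)
		return -1;

	for (i = 0; i < nr_real; i++) {
		/* The sketch shares the pointer; it does not duplicate it. */
		if (dummy->filter)
			real[i].filter = dummy->filter;
		/* Only the first real event becomes the tracking event. */
		if (dummy->tracking && !tracking_set)
			real[i].tracking = tracking_set = true;
	}
	return 0;
}
```

Matching the placeholder by object name is what lets each BPF object on
the command line carry its own filter to the events it creates.
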
 tools/perf/util/bpf-loader.c |  8 ++++-
 tools/perf/util/bpf-loader.h |  1 +
 tools/perf/util/evlist.c     | 75 +++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 79 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 2880dbf..3400538 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -325,6 +325,12 @@ int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg)
 	int err;
 
 	bpf_object__for_each_safe(obj, tmp) {
+		const char *obj_name;
+
+		obj_name = bpf_object__get_name(obj);
+		if (!obj_name)
+			obj_name = "[unknown].o";
+
 		bpf_object__for_each_program(prog, obj) {
 			struct probe_trace_event *tev;
 			struct perf_probe_event *pev;
@@ -348,7 +354,7 @@ int bpf__foreach_tev(bpf_prog_iter_callback_t func, void *arg)
 					return fd;
 				}
 
-				err = (*func)(tev, fd, arg);
+				err = (*func)(tev, obj_name, fd, arg);
 				if (err) {
 					pr_debug("bpf: call back failed, stop iterate\n");
 					return err;
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 34656f8..5bac423 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -13,6 +13,7 @@
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
 typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
+					const char *obj_name,
 					int fd, void *arg);
 
 #ifdef HAVE_LIBBPF_SUPPORT
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 29212dc..7e36563 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -197,7 +197,54 @@ error:
 	return -ENOMEM;
 }
 
-static int add_bpf_event(struct probe_trace_event *tev, int fd,
+static void sync_with_bpf_placeholder(struct perf_evlist *evlist,
+				      const char *obj_name,
+				      struct list_head *list)
+{
+	struct perf_evsel *dummy_evsel, *pos;
+
+	const char *filter;
+	bool tracking_set = false;
+	bool found = false;
+
+	evlist__for_each(evlist, dummy_evsel) {
+		if (!perf_evsel__is_bpf_placeholder(dummy_evsel))
+			continue;
+
+		if (strcmp(dummy_evsel->name, obj_name) == 0) {
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		pr_debug("Failed to find dummy event of '%s'\n",
+			 obj_name);
+		return;
+	}
+
+	filter = dummy_evsel->filter;
+
+	list_for_each_entry(pos, list, node) {
+		if (filter && perf_evsel__set_filter(pos, filter)) {
+			pr_debug("Failed to set filter '%s' to evsel %s\n",
+				 filter, pos->name);
+		}
+
+		/* Sync tracking */
+		if (dummy_evsel->tracking && !tracking_set)
+			pos->tracking = tracking_set = true;
+
+		/*
+		 * If someday we allow to add config terms or modifiers
+		 * to placeholder, we should sync them with real events
+		 * here. Currently only tracking needs to be considered.
+		 */
+	}
+}
+
+static int add_bpf_event(struct probe_trace_event *tev,
+			 const char *obj_name, int fd,
 			 void *arg)
 {
 	struct perf_evlist *evlist = arg;
@@ -205,8 +252,8 @@ static int add_bpf_event(struct probe_trace_event *tev, int fd,
 	struct list_head list;
 	int err, idx, entries;
 
-	pr_debug("add bpf event %s:%s and attach bpf program %d\n",
-			tev->group, tev->event, fd);
+	pr_debug("add bpf event %s:%s and attach bpf program %d (from %s)\n",
+			tev->group, tev->event, fd, obj_name);
 	INIT_LIST_HEAD(&list);
 	idx = evlist->nr_entries;
 
@@ -228,13 +275,33 @@ static int add_bpf_event(struct probe_trace_event *tev, int fd,
 	list_for_each_entry(pos, &list, node)
 		pos->bpf_fd = fd;
 	entries = idx - evlist->nr_entries;
+
+	sync_with_bpf_placeholder(evlist, obj_name, &list);
+	/*
+	 * Currently we don't need to link those new events at the
+	 * same place where dummy node reside because order of
+	 * events in cmdline won't be used after
+	 * 'perf_evlist__add_bpf'.
+	 */
 	perf_evlist__splice_list_tail(evlist, &list, entries);
 	return 0;
 }
 
 int perf_evlist__add_bpf(struct perf_evlist *evlist)
 {
-	return bpf__foreach_tev(add_bpf_event, evlist);
+	struct perf_evsel *pos, *n;
+	int err;
+
+	err = bpf__foreach_tev(add_bpf_event, evlist);
+
+	evlist__for_each_safe(evlist, n, pos) {
+		if (perf_evsel__is_bpf_placeholder(pos)) {
+			list_del_init(&pos->node);
+			perf_evsel__delete(pos);
+			evlist->nr_entries--;
+		}
+	}
+	return err;
 }
 
 static int perf_evlist__add_attrs(struct perf_evlist *evlist,
-- 
1.8.3.4



* Re: [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86
  2015-09-01 15:54           ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-06  6:02             ` Wangnan (F)
  2015-09-06  6:04               ` [PATCH] perf test: Enforce LLVM test, add kbuild test Wang Nan
  0 siblings, 1 reply; 94+ messages in thread
From: Wangnan (F) @ 2015-09-06  6:02 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI,
	'Arnaldo Carvalho de Melo'
  Cc: acme, linux-kernel, He Kuang, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, Jiri Olsa, Kaixu Xia, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Zefan Li, pi3orama



On 2015/9/1 23:54, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: Arnaldo Carvalho de Melo [mailto:acme@redhat.com]
>>
>> Em Tue, Sep 01, 2015 at 11:47:41AM +0000, 平松雅巳 / HIRAMATU,MASAMI escreveu:
>>>> From: Wang Nan [mailto:wangnan0@huawei.com]
>>>>
>>>> regs_query_register_offset() is a helper function which converts
>>>> register name like "%rax" to offset of a register in 'struct pt_regs',
>>>> which is required by BPF prologue generator. Since the function is
>>>> identical, try to reuse the code in arch/x86/kernel/ptrace.c.
>>>>
>>>> Comment inside dwarf-regs.c list the differences between this
>>>> implementation and kernel code.
>>> Hmm, this also introduces a duplication of the code...
>>> It might be a good time to move them into arch/x86/lib/ and
>>> reuse it directly from perf code.
>> It is strange, having tried sharing stuff directly from the kernel,
>> to now be in a position where I advocate against it...
>>
>> Copy'n'pasting what I said in another message:
>>
>> -----
>> We would go back to sharing stuff with the kernel, but this time around
>> we would be using something that everybody knows is being shared, which
>> doesn't eliminate the possibility that at some point changes made with
>> the kernel in mind would break the tools/ using code.
>>
>> Perhaps it is better to keep copying what we want and introduce
>> infrastructure to check for differences and warn us as soon as possible
>> so that we would do the copy, test if it doesn't break what we use, etc.
>>
>> I.e. we wouldn't be putting any new burden on the "kernel people", i.e.
>> the burden of having to check that changes they make don't break tools/
>> living code, nor any out of the blue breakage on tools/ for tools/
>> developers to fix when changes are made on the kernel "side" -----
>> ---
>>
>> The "stop sharing directly stuff with the kernel" stance was taken after
>> a report from Linus about breakage due to tools/ using kernel files
>> directly and then a change made in some RCU files broke the tools/perf/
>> build, even with tools/perf/ not using anything RCU related so far.
>>
>> Looking at tools/perf/MANIFEST, the file used to create a detached
>> tarball so that perf can be built outside the kernel sources there are
>> still some kernel source files listed, but those probably need to be
>> copied too...
> OK, so let this apply.
>
> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>
> And also we'll need a testcase for this.

I created a testcase for the whole BPF prologue, so I think that covers
this as well.

I'll post it in reply to this mail.

Thank you.

> Thank you,
>
>> - Arnaldo
>>
>>> Thank you,
>>>
>>>> get_arch_regstr() switches to regoffset_table and the old string table
>>>> is dropped.
>>>>
>>>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>>>> Signed-off-by: He Kuang <hekuang@huawei.com>
>>>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>>>> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
>>>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>>>> Cc: David Ahern <dsahern@gmail.com>
>>>> Cc: He Kuang <hekuang@huawei.com>
>>>> Cc: Jiri Olsa <jolsa@kernel.org>
>>>> Cc: Kaixu Xia <xiakaixu@huawei.com>
>>>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>>> Cc: Paul Mackerras <paulus@samba.org>
>>>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>>> Cc: Zefan Li <lizefan@huawei.com>
>>>> Cc: pi3orama@163.com
>>>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>>>> ---
>>>>   tools/perf/arch/x86/Makefile          |   1 +
>>>>   tools/perf/arch/x86/util/Build        |   1 +
>>>>   tools/perf/arch/x86/util/dwarf-regs.c | 122 ++++++++++++++++++++++++----------
>>>>   3 files changed, 90 insertions(+), 34 deletions(-)
>>>>
>>>> diff --git a/tools/perf/arch/x86/Makefile b/tools/perf/arch/x86/Makefile
>>>> index 21322e0..09ba923 100644
>>>> --- a/tools/perf/arch/x86/Makefile
>>>> +++ b/tools/perf/arch/x86/Makefile
>>>> @@ -2,3 +2,4 @@ ifndef NO_DWARF
>>>>   PERF_HAVE_DWARF_REGS := 1
>>>>   endif
>>>>   HAVE_KVM_STAT_SUPPORT := 1
>>>> +PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
>>>> diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
>>>> index 2c55e1b..d4d1f23 100644
>>>> --- a/tools/perf/arch/x86/util/Build
>>>> +++ b/tools/perf/arch/x86/util/Build
>>>> @@ -4,6 +4,7 @@ libperf-y += pmu.o
>>>>   libperf-y += kvm-stat.o
>>>>
>>>>   libperf-$(CONFIG_DWARF) += dwarf-regs.o
>>>> +libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
>>>>
>>>>   libperf-$(CONFIG_LIBUNWIND)          += unwind-libunwind.o
>>>>   libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
>>>> diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
>>>> index a08de0a..de5b936 100644
>>>> --- a/tools/perf/arch/x86/util/dwarf-regs.c
>>>> +++ b/tools/perf/arch/x86/util/dwarf-regs.c
>>>> @@ -21,55 +21,109 @@
>>>>    */
>>>>
>>>>   #include <stddef.h>
>>>> +#include <errno.h> /* for EINVAL */
>>>> +#include <string.h> /* for strcmp */
>>>> +#include <linux/ptrace.h> /* for struct pt_regs */
>>>> +#include <linux/kernel.h> /* for offsetof */
>>>>   #include <dwarf-regs.h>
>>>>
>>>>   /*
>>>> - * Generic dwarf analysis helpers
>>>> + * See arch/x86/kernel/ptrace.c.
>>>> + * Different from it:
>>>> + *
>>>> + *  - Since struct pt_regs is defined differently for user and kernel,
>>>> + *    but we want to use 'ax, bx' instead of 'rax, rbx' (which is struct
>>>> + *    field name of user's pt_regs), we make REG_OFFSET_NAME to accept
>>>> + *    both string name and reg field name.
>>>> + *
>>>> + *  - Since accessing x86_32's pt_regs from x86_64 building is difficult
>>>> + *    and vice versa, we simply fill offset with -1, so
>>>> + *    get_arch_regstr() still works but regs_query_register_offset()
>>>> + *    returns error.
>>>> + *    The only inconvenience caused by it now is that we are not allowed
>>>> + *    to generate BPF prologue for a x86_64 kernel if perf is built for
>>>> + *    x86_32. This is really a rare usecase.
>>>> + *
>>>> + *  - Order is different from kernel's ptrace.c for get_arch_regstr(), which
>>>> + *    is defined by dwarf.
>>>>    */
>>>>
>>>> -#define X86_32_MAX_REGS 8
>>>> -const char *x86_32_regs_table[X86_32_MAX_REGS] = {
>>>> -	"%ax",
>>>> -	"%cx",
>>>> -	"%dx",
>>>> -	"%bx",
>>>> -	"$stack",	/* Stack address instead of %sp */
>>>> -	"%bp",
>>>> -	"%si",
>>>> -	"%di",
>>>> +struct pt_regs_offset {
>>>> +	const char *name;
>>>> +	int offset;
>>>> +};
>>>> +
>>>> +#define REG_OFFSET_END {.name = NULL, .offset = 0}
>>>> +
>>>> +#ifdef __x86_64__
>>>> +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
>>>> +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = -1}
>>>> +#else
>>>> +# define REG_OFFSET_NAME_64(n, r) {.name = n, .offset = -1}
>>>> +# define REG_OFFSET_NAME_32(n, r) {.name = n, .offset = offsetof(struct pt_regs, r)}
>>>> +#endif
>>>> +
>>>> +static const struct pt_regs_offset x86_32_regoffset_table[] = {
>>>> +	REG_OFFSET_NAME_32("%ax",	eax),
>>>> +	REG_OFFSET_NAME_32("%cx",	ecx),
>>>> +	REG_OFFSET_NAME_32("%dx",	edx),
>>>> +	REG_OFFSET_NAME_32("%bx",	ebx),
>>>> +	REG_OFFSET_NAME_32("$stack",	esp),	/* Stack address instead of %sp */
>>>> +	REG_OFFSET_NAME_32("%bp",	ebp),
>>>> +	REG_OFFSET_NAME_32("%si",	esi),
>>>> +	REG_OFFSET_NAME_32("%di",	edi),
>>>> +	REG_OFFSET_END,
>>>>   };
>>>>
>>>> -#define X86_64_MAX_REGS 16
>>>> -const char *x86_64_regs_table[X86_64_MAX_REGS] = {
>>>> -	"%ax",
>>>> -	"%dx",
>>>> -	"%cx",
>>>> -	"%bx",
>>>> -	"%si",
>>>> -	"%di",
>>>> -	"%bp",
>>>> -	"%sp",
>>>> -	"%r8",
>>>> -	"%r9",
>>>> -	"%r10",
>>>> -	"%r11",
>>>> -	"%r12",
>>>> -	"%r13",
>>>> -	"%r14",
>>>> -	"%r15",
>>>> +static const struct pt_regs_offset x86_64_regoffset_table[] = {
>>>> +	REG_OFFSET_NAME_64("%ax",	rax),
>>>> +	REG_OFFSET_NAME_64("%dx",	rdx),
>>>> +	REG_OFFSET_NAME_64("%cx",	rcx),
>>>> +	REG_OFFSET_NAME_64("%bx",	rbx),
>>>> +	REG_OFFSET_NAME_64("%si",	rsi),
>>>> +	REG_OFFSET_NAME_64("%di",	rdi),
>>>> +	REG_OFFSET_NAME_64("%bp",	rbp),
>>>> +	REG_OFFSET_NAME_64("%sp",	rsp),
>>>> +	REG_OFFSET_NAME_64("%r8",	r8),
>>>> +	REG_OFFSET_NAME_64("%r9",	r9),
>>>> +	REG_OFFSET_NAME_64("%r10",	r10),
>>>> +	REG_OFFSET_NAME_64("%r11",	r11),
>>>> +	REG_OFFSET_NAME_64("%r12",	r12),
>>>> +	REG_OFFSET_NAME_64("%r13",	r13),
>>>> +	REG_OFFSET_NAME_64("%r14",	r14),
>>>> +	REG_OFFSET_NAME_64("%r15",	r15),
>>>> +	REG_OFFSET_END,
>>>>   };
>>>>
>>>>   /* TODO: switching by dwarf address size */
>>>>   #ifdef __x86_64__
>>>> -#define ARCH_MAX_REGS X86_64_MAX_REGS
>>>> -#define arch_regs_table x86_64_regs_table
>>>> +#define regoffset_table x86_64_regoffset_table
>>>>   #else
>>>> -#define ARCH_MAX_REGS X86_32_MAX_REGS
>>>> -#define arch_regs_table x86_32_regs_table
>>>> +#define regoffset_table x86_32_regoffset_table
>>>>   #endif
>>>>
>>>> +/* Minus 1 for the ending REG_OFFSET_END */
>>>> +#define ARCH_MAX_REGS ((sizeof(regoffset_table) / sizeof(regoffset_table[0])) - 1)
>>>> +
>>>>   /* Return architecture dependent register string (for kprobe-tracer) */
>>>>   const char *get_arch_regstr(unsigned int n)
>>>>   {
>>>> -	return (n < ARCH_MAX_REGS) ? arch_regs_table[n] : NULL;
>>>> +	return (n < ARCH_MAX_REGS) ? regoffset_table[n].name : NULL;
>>>> +}
>>>> +
>>>> +/* Reuse code from arch/x86/kernel/ptrace.c */
>>>> +/**
>>>> + * regs_query_register_offset() - query register offset from its name
>>>> + * @name:	the name of a register
>>>> + *
>>>> + * regs_query_register_offset() returns the offset of a register in struct
>>>> + * pt_regs from its name. If the name is invalid, this returns -EINVAL;
>>>> + */
>>>> +int regs_query_register_offset(const char *name)
>>>> +{
>>>> +	const struct pt_regs_offset *roff;
>>>> +	for (roff = regoffset_table; roff->name != NULL; roff++)
>>>> +		if (!strcmp(roff->name, name))
>>>> +			return roff->offset;
>>>> +	return -EINVAL;
>>>>   }
>>>> --
>>>> 1.8.3.4
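
For readers following the thread, the name-to-offset lookup in the quoted
patch can be sketched on its own. This is a minimal illustration under
assumptions: struct sketch_regs is a made-up three-register layout standing
in for struct pt_regs, and all sketch_* names are hypothetical. Only the
technique (an offsetof() table terminated by a NULL name and scanned
linearly) follows the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical register layout; the real one is struct pt_regs. */
struct sketch_regs {
	unsigned long ax;
	unsigned long dx;
	unsigned long cx;
};

struct sketch_reg_offset {
	const char *name;
	int offset;
};

#define SKETCH_REG(n, field) \
	{ .name = n, .offset = (int)offsetof(struct sketch_regs, field) }
#define SKETCH_REG_END { .name = NULL, .offset = 0 }

/* Table order is the lookup order; the NULL name terminates it. */
static const struct sketch_reg_offset sketch_table[] = {
	SKETCH_REG("%ax", ax),
	SKETCH_REG("%dx", dx),
	SKETCH_REG("%cx", cx),
	SKETCH_REG_END,
};

/* Return the byte offset of the named register, or -EINVAL. */
int sketch_query_register_offset(const char *name)
{
	const struct sketch_reg_offset *roff;

	assert(name);
	for (roff = sketch_table; roff->name != NULL; roff++)
		if (!strcmp(roff->name, name))
			return roff->offset;
	return -EINVAL;
}
```

Note that an entry filled with -1, as the patch does for the other word
size, is returned to the caller as-is, so callers have to treat any
negative result as an error.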




* [PATCH] perf test: Enforce LLVM test, add kbuild test
  2015-09-06  6:02             ` Wangnan (F)
@ 2015-09-06  6:04               ` Wang Nan
  2015-09-06  6:04                 ` [PATCH] perf test: Test BPF prologue Wang Nan
  0 siblings, 1 reply; 94+ messages in thread
From: Wang Nan @ 2015-09-06  6:04 UTC (permalink / raw)
  To: acme, masami.hiramatsu.pt
  Cc: linux-kernel, Wang Nan, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Namhyung Kim, Peter Zijlstra, Zefan Li, pi3orama

This patch strengthens the existing LLVM test so that it compiles more
than one BPF source file. The compiled results are stored and can be used
by other testcases. Except for the first testcase (LLVM_TESTCASE_BASE),
a failing testcase does not fail the whole test.

It also adds a kbuild testcase to check whether kernel headers can be
found correctly.

For example:

 # perf test LLVM

   38: Test LLVM searching and compiling                        : (llvm.kbuild-dir can be fixed) Ok

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
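Both test sources need LINUX_VERSION_CODE, which the test derives from the
running kernel's release string. The packing can be sketched as below.
This is an illustration only: sketch_version_code() is a hypothetical name
and the sscanf() format is an assumption, since the parsing code is not
part of the hunks shown here. The packing itself follows the patch, and
for a 4.2 kernel it gives 0x40200 as in the #error hint.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a release string such as "4.2.0-rc1" and pack it the way
 * LINUX_VERSION_CODE is packed. Returns -1 if the string does not
 * start with three dot-separated numbers.
 */
long sketch_version_code(const char *release)
{
	int version, patchlevel, sublevel;

	assert(release);
	if (sscanf(release, "%d.%d.%d", &version, &patchlevel, &sublevel) != 3)
		return -1;
	return ((long)version << 16) + (patchlevel << 8) + sublevel;
}
```

In the real test the release string would come from uname(); passing it
in as a parameter just keeps the sketch easy to check.
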
 tools/perf/tests/Build                    |  11 ++-
 tools/perf/tests/bpf-script-example.c     |   4 +
 tools/perf/tests/bpf-script-test-kbuild.c |  21 ++++
 tools/perf/tests/bpf.c                    |   3 +-
 tools/perf/tests/llvm.c                   | 154 ++++++++++++++++++++++--------
 tools/perf/tests/llvm.h                   |  10 +-
 6 files changed, 156 insertions(+), 47 deletions(-)
 create mode 100644 tools/perf/tests/bpf-script-test-kbuild.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 5cfb420..2bd5f37 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -32,17 +32,24 @@ perf-y += sample-parsing.o
 perf-y += parse-no-sample-id-all.o
 perf-y += kmod-path.o
 perf-y += thread-map.o
-perf-y += llvm.o llvm-src.o
+perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o
 perf-y += bpf.o
 perf-y += topology.o
 
-$(OUTPUT)tests/llvm-src.c: tests/bpf-script-example.c
+$(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c
 	$(call rule_mkdir)
 	$(Q)echo '#include <tests/llvm.h>' > $@
 	$(Q)echo 'const char test_llvm__bpf_prog[] =' >> $@
 	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
 	$(Q)echo ';' >> $@
 
+$(OUTPUT)tests/llvm-src-kbuild.c: tests/bpf-script-test-kbuild.c
+	$(call rule_mkdir)
+	$(Q)echo '#include <tests/llvm.h>' > $@
+	$(Q)echo 'const char test_llvm__bpf_test_kbuild_prog[] =' >> $@
+	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
+	$(Q)echo ';' >> $@
+
 perf-$(CONFIG_X86) += perf-time-to-tsc.o
 ifdef CONFIG_AUXTRACE
 perf-$(CONFIG_X86) += insn-x86.o
diff --git a/tools/perf/tests/bpf-script-example.c b/tools/perf/tests/bpf-script-example.c
index 410a70b..0ec9c2c 100644
--- a/tools/perf/tests/bpf-script-example.c
+++ b/tools/perf/tests/bpf-script-example.c
@@ -1,3 +1,7 @@
+/*
+ * bpf-script-example.c
+ * Test basic LLVM building
+ */
 #ifndef LINUX_VERSION_CODE
 # error Need LINUX_VERSION_CODE
 # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
diff --git a/tools/perf/tests/bpf-script-test-kbuild.c b/tools/perf/tests/bpf-script-test-kbuild.c
new file mode 100644
index 0000000..a11f589
--- /dev/null
+++ b/tools/perf/tests/bpf-script-test-kbuild.c
@@ -0,0 +1,21 @@
+/*
+ * bpf-script-test-kbuild.c
+ * Test include from kernel header
+ */
+#ifndef LINUX_VERSION_CODE
+# error Need LINUX_VERSION_CODE
+# error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
+#endif
+#define SEC(NAME) __attribute__((section(NAME), used))
+
+#include <uapi/linux/fs.h>
+#include <uapi/asm/ptrace.h>
+
+SEC("func=vfs_llseek")
+int bpf_func__vfs_llseek(struct pt_regs *ctx)
+{
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
+int _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index e256c12..64aaab68 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -143,7 +143,8 @@ int test__bpf(void)
 		return TEST_SKIP;
 	}
 
-	test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz);
+	test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz, LLVM_TESTCASE_BASE);
+
 	if (!obj_buf || !obj_buf_sz) {
 		if (verbose == 0)
 			fprintf(stderr, " (fix 'perf test LLVM' first)");
diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
index fd5fdb0..75cd99f 100644
--- a/tools/perf/tests/llvm.c
+++ b/tools/perf/tests/llvm.c
@@ -9,6 +9,22 @@
 #include "debug.h"
 #include "llvm.h"
 
+#define SHARED_BUF_INIT_SIZE	(1 << 20)
+struct llvm_testcase {
+	const char *source;
+	const char *errmsg;
+	struct test_llvm__bpf_result *result;
+	bool tried;
+} llvm_testcases[NR_LLVM_TESTCASES + 1] = {
+	[LLVM_TESTCASE_BASE]	= {.source = test_llvm__bpf_prog,
+				   .errmsg = "Basic LLVM compiling failed",
+				   .tried = false},
+	[LLVM_TESTCASE_KBUILD]	= {.source = test_llvm__bpf_test_kbuild_prog,
+				   .errmsg = "llvm.kbuild-dir can be fixed",
+				   .tried = false},
+	{.source = NULL}
+};
+
 static int perf_config_cb(const char *var, const char *val,
 			  void *arg __maybe_unused)
 {
@@ -36,7 +52,7 @@ static int test__bpf_parsing(void *obj_buf __maybe_unused,
 #endif
 
 static char *
-compose_source(void)
+compose_source(const char *raw_source)
 {
 	struct utsname utsname;
 	int version, patchlevel, sublevel, err;
@@ -56,25 +72,27 @@ compose_source(void)
 
 	version_code = (version << 16) + (patchlevel << 8) + sublevel;
 	err = asprintf(&code, "#define LINUX_VERSION_CODE 0x%08lx;\n%s",
-		       version_code, test_llvm__bpf_prog);
+		       version_code, raw_source);
 	if (err < 0)
 		return NULL;
 
 	return code;
 }
 
-#define SHARED_BUF_INIT_SIZE	(1 << 20)
-struct test_llvm__bpf_result *p_test_llvm__bpf_result;
 
-int test__llvm(void)
+static int __test__llvm(int i)
 {
-	char *tmpl_new, *clang_opt_new;
 	void *obj_buf;
 	size_t obj_buf_sz;
 	int err, old_verbose;
-	char *source;
+	const char *tmpl_old, *clang_opt_old;
+	char *tmpl_new, *clang_opt_new, *source;
+	const char *raw_source = llvm_testcases[i].source;
+	struct test_llvm__bpf_result *result = llvm_testcases[i].result;
 
 	perf_config(perf_config_cb, NULL);
+	clang_opt_old = llvm_param.clang_opt;
+	tmpl_old = llvm_param.clang_bpf_cmd_template;
 
 	/*
 	 * Skip this test if user's .perfconfig doesn't set [llvm] section
@@ -99,15 +117,17 @@ int test__llvm(void)
 	if (!llvm_param.clang_opt)
 		llvm_param.clang_opt = strdup("");
 
-	source = compose_source();
+	source = compose_source(raw_source);
 	if (!source) {
 		pr_err("Failed to compose source code\n");
 		return -1;
 	}
 
 	/* Quote __EOF__ so strings in source won't be expanded by shell */
-	err = asprintf(&tmpl_new, "cat << '__EOF__' | %s\n%s\n__EOF__\n",
-		       llvm_param.clang_bpf_cmd_template, source);
+	err = asprintf(&tmpl_new, "cat << '__EOF__' | %s %s \n%s\n__EOF__\n",
+		       llvm_param.clang_bpf_cmd_template,
+		       !old_verbose ? "2>/dev/null" : "",
+		       source);
 	free(source);
 	source = NULL;
 	if (err < 0) {
@@ -123,73 +143,123 @@ int test__llvm(void)
 	llvm_param.clang_opt = clang_opt_new;
 	err = llvm__compile_bpf("-", &obj_buf, &obj_buf_sz);
 
+	free((void *)llvm_param.clang_bpf_cmd_template);
+	free((void *)llvm_param.clang_opt);
+	llvm_param.clang_bpf_cmd_template = tmpl_old;
+	llvm_param.clang_opt = clang_opt_old;
+
 	verbose = old_verbose;
-	if (err) {
-		if (!verbose)
-			fprintf(stderr, " (use -v to see error message)");
+	if (err)
 		return -1;
-	}
 
 	err = test__bpf_parsing(obj_buf, obj_buf_sz);
-	if (!err && p_test_llvm__bpf_result) {
+	if (!err && result) {
 		if (obj_buf_sz > SHARED_BUF_INIT_SIZE) {
 			pr_err("Resulting object too large\n");
 		} else {
-			p_test_llvm__bpf_result->size = obj_buf_sz;
-			memcpy(p_test_llvm__bpf_result->object,
-			       obj_buf, obj_buf_sz);
+			result->size = obj_buf_sz;
+			memcpy(result->object, obj_buf, obj_buf_sz);
 		}
 	}
 	free(obj_buf);
 	return err;
 }
 
+int test__llvm(void)
+{
+	int i, ret;
+
+	for (i = 0; llvm_testcases[i].source; i++) {
+		ret = __test__llvm(i);
+		if (i == 0 && ret) {
+			/*
+			 * First testcase tests basic LLVM compiling. If it
+			 * fails, no need to check others.
+			 */
+			if (!verbose)
+				fprintf(stderr, " (use -v to see error message)");
+			return ret;
+		} else if (ret) {
+			if (!verbose && llvm_testcases[i].errmsg)
+				fprintf(stderr, " (%s)", llvm_testcases[i].errmsg);
+			return 0;
+		}
+	}
+	return 0;
+}
+
 void test__llvm_prepare(void)
 {
-	p_test_llvm__bpf_result = mmap(NULL, SHARED_BUF_INIT_SIZE,
-				       PROT_READ | PROT_WRITE,
-				       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
-	if (!p_test_llvm__bpf_result)
-		return;
-	memset((void *)p_test_llvm__bpf_result, '\0', SHARED_BUF_INIT_SIZE);
+	int i;
+
+	for (i = 0; llvm_testcases[i].source; i++) {
+		struct test_llvm__bpf_result *result;
+
+		result = mmap(NULL, SHARED_BUF_INIT_SIZE,
+			      PROT_READ | PROT_WRITE,
+			      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+		if (!result)
+			return;
+		memset((void *)result, '\0', SHARED_BUF_INIT_SIZE);
+
+		llvm_testcases[i].result = result;
+	}
 }
 
 void test__llvm_cleanup(void)
 {
-	unsigned long boundary, buf_end;
+	int i;
 
-	if (!p_test_llvm__bpf_result)
-		return;
-	if (p_test_llvm__bpf_result->size == 0) {
-		munmap((void *)p_test_llvm__bpf_result, SHARED_BUF_INIT_SIZE);
-		p_test_llvm__bpf_result = NULL;
-		return;
-	}
+	for (i = 0; llvm_testcases[i].source; i++) {
+		struct test_llvm__bpf_result *result;
+		unsigned long boundary, buf_end;
 
-	buf_end = (unsigned long)p_test_llvm__bpf_result + SHARED_BUF_INIT_SIZE;
+		result = llvm_testcases[i].result;
+		llvm_testcases[i].tried = true;
 
-	boundary = (unsigned long)(p_test_llvm__bpf_result);
-	boundary += p_test_llvm__bpf_result->size;
-	boundary = (boundary + (page_size - 1)) &
+		if (!result)
+			continue;
+
+		if (result->size == 0) {
+			munmap((void *)result, SHARED_BUF_INIT_SIZE);
+			result = NULL;
+			llvm_testcases[i].result = NULL;
+			continue;
+		}
+
+		buf_end = (unsigned long)result + SHARED_BUF_INIT_SIZE;
+
+		boundary = (unsigned long)(result);
+		boundary += result->size;
+		boundary = (boundary + (page_size - 1)) &
 			(~((unsigned long)page_size - 1));
-	munmap((void *)boundary, buf_end - boundary);
+		munmap((void *)boundary, buf_end - boundary);
+	}
 }
 
 void
-test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz)
+test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz, int index)
 {
+	struct test_llvm__bpf_result *result;
+
 	*p_obj_buf = NULL;
 	*p_obj_buf_sz = 0;
 
-	if (!p_test_llvm__bpf_result) {
+	if (index > NR_LLVM_TESTCASES)
+		return;
+
+	result = llvm_testcases[index].result;
+
+	if (!result && !llvm_testcases[index].tried) {
 		test__llvm_prepare();
 		test__llvm();
 		test__llvm_cleanup();
 	}
 
-	if (!p_test_llvm__bpf_result)
+	result = llvm_testcases[index].result;
+	if (!result)
 		return;
 
-	*p_obj_buf = p_test_llvm__bpf_result->object;
-	*p_obj_buf_sz = p_test_llvm__bpf_result->size;
+	*p_obj_buf = result->object;
+	*p_obj_buf_sz = result->size;
 }
diff --git a/tools/perf/tests/llvm.h b/tools/perf/tests/llvm.h
index 2fd7ed6..78ec01d 100644
--- a/tools/perf/tests/llvm.h
+++ b/tools/perf/tests/llvm.h
@@ -8,8 +8,14 @@ struct test_llvm__bpf_result {
 	char object[];
 };
 
-extern struct test_llvm__bpf_result *p_test_llvm__bpf_result;
 extern const char test_llvm__bpf_prog[];
-void test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz);
+extern const char test_llvm__bpf_test_kbuild_prog[];
+
+enum test_llvm__testcase {
+	LLVM_TESTCASE_BASE,
+	LLVM_TESTCASE_KBUILD,
+	NR_LLVM_TESTCASES,
+};
+void test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz, int index);
 
 #endif
-- 
1.8.3.4



* [PATCH] perf test: Test BPF prologue
  2015-09-06  6:04               ` [PATCH] perf test: Enforce LLVM test, add kbuild test Wang Nan
@ 2015-09-06  6:04                 ` Wang Nan
  0 siblings, 0 replies; 94+ messages in thread
From: Wang Nan @ 2015-09-06  6:04 UTC (permalink / raw)
  To: acme, masami.hiramatsu.pt
  Cc: linux-kernel, Wang Nan, Alexei Starovoitov, Brendan Gregg,
	Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia,
	Namhyung Kim, Peter Zijlstra, Zefan Li, pi3orama

This patch introduces a new BPF script to test the BPF prologue. The new
script probes null_lseek, the function that is reached through a function
pointer when we lseek on '/dev/null'.

null_lseek is chosen because it is called through a function pointer, so
we don't need to consider inlining and LTO.

By extracting file->f_mode, bpf-script-test-prologue.c can tell whether
the file is writable or read-only. According to llseek_loop() and
bpf-script-test-prologue.c, one fourth of all lseeks should be collected.
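
The expected count can be checked without a kernel by replaying the
arguments that llseek_loop() passes and applying the same conditions as
the BPF filter. The following is only a user-space sketch: the sketch_*
names are made up for illustration, NR_ITERS is assumed to be 111 (its
definition is not part of this patch), and 'err' is assumed to be 0, i.e.
the prologue fetched every argument successfully.

```c
#include <stdbool.h>

/* Illustrative constants; SKETCH_NR_ITERS = 111 is an assumption. */
#define SKETCH_NR_ITERS		111
#define SKETCH_SEEK_SET		0
#define SKETCH_SEEK_CUR		1
#define SKETCH_FMODE_READ	0x1
#define SKETCH_FMODE_WRITE	0x2

/* Same decisions as bpf_func__null_lseek(), with err assumed to be 0. */
static bool sketch_filter(unsigned long f_mode, unsigned long offset,
			  unsigned long orig)
{
	if (f_mode & SKETCH_FMODE_WRITE)
		return false;
	if (offset & 1)
		return false;
	if (orig == SKETCH_SEEK_CUR)
		return false;
	return true;
}

/* Replay the lseek() arguments of llseek_loop(); count accepted calls. */
static int sketch_count_samples(int nr_iters)
{
	/* fds[0] is opened O_RDONLY, fds[1] is opened O_RDWR. */
	const unsigned long f_mode[2] = {
		SKETCH_FMODE_READ,
		SKETCH_FMODE_READ | SKETCH_FMODE_WRITE,
	};
	int i, count = 0;

	for (i = 0; i < nr_iters; i++) {
		unsigned long orig = (i / 2) % 2 ? SKETCH_SEEK_CUR
						 : SKETCH_SEEK_SET;

		count += sketch_filter(f_mode[i % 2], i, orig);
		count += sketch_filter(f_mode[(i + 1) % 2], i, orig);
	}
	return count;
}
```

Only the read-only descriptor with an even offset and SEEK_SET passes,
which happens once for every i that is a multiple of 4. With the assumed
NR_ITERS of 111 that agrees with the (NR_ITERS + 1) / 4 passed to
__test__bpf().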

This patch also improves test__bpf so it can run multiple BPF programs on
different test functions.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/tests/Build                      |  9 ++-
 tools/perf/tests/bpf-script-test-prologue.c | 35 +++++++++++
 tools/perf/tests/bpf.c                      | 93 +++++++++++++++++++++++------
 tools/perf/tests/llvm.c                     |  5 ++
 tools/perf/tests/llvm.h                     |  8 +++
 5 files changed, 130 insertions(+), 20 deletions(-)
 create mode 100644 tools/perf/tests/bpf-script-test-prologue.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 2bd5f37..3e98a97 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -32,7 +32,7 @@ perf-y += sample-parsing.o
 perf-y += parse-no-sample-id-all.o
 perf-y += kmod-path.o
 perf-y += thread-map.o
-perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o
+perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o
 perf-y += bpf.o
 perf-y += topology.o
 
@@ -50,6 +50,13 @@ $(OUTPUT)tests/llvm-src-kbuild.c: tests/bpf-script-test-kbuild.c
 	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
 	$(Q)echo ';' >> $@
 
+$(OUTPUT)tests/llvm-src-prologue.c: tests/bpf-script-test-prologue.c
+	$(call rule_mkdir)
+	$(Q)echo '#include <tests/llvm.h>' > $@
+	$(Q)echo 'const char test_llvm__bpf_test_prologue_prog[] =' >> $@
+	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
+	$(Q)echo ';' >> $@
+
 perf-$(CONFIG_X86) += perf-time-to-tsc.o
 ifdef CONFIG_AUXTRACE
 perf-$(CONFIG_X86) += insn-x86.o
diff --git a/tools/perf/tests/bpf-script-test-prologue.c b/tools/perf/tests/bpf-script-test-prologue.c
new file mode 100644
index 0000000..7230e62
--- /dev/null
+++ b/tools/perf/tests/bpf-script-test-prologue.c
@@ -0,0 +1,35 @@
+/*
+ * bpf-script-test-prologue.c
+ * Test BPF prologue
+ */
+#ifndef LINUX_VERSION_CODE
+# error Need LINUX_VERSION_CODE
+# error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
+#endif
+#define SEC(NAME) __attribute__((section(NAME), used))
+
+#include <uapi/linux/fs.h>
+
+#define FMODE_READ		0x1
+#define FMODE_WRITE		0x2
+
+static void (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
+	(void *) 6;
+
+SEC("func=null_lseek file->f_mode offset orig")
+int bpf_func__null_lseek(void *ctx, int err, unsigned long f_mode,
+			 unsigned long offset, unsigned long orig)
+{
+	if (err)
+		return 0;
+	if (f_mode & FMODE_WRITE)
+		return 0;
+	if (offset & 1)
+		return 0;
+	if (orig == SEEK_CUR)
+		return 0;
+	return 1;
+}
+
+char _license[] SEC("license") = "GPL";
+int _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 64aaab68..6305b3d 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -19,14 +19,37 @@ static int epoll_pwait_loop(void)
 	return 0;
 }
 
-static int prepare_bpf(void *obj_buf, size_t obj_buf_sz)
+#ifdef HAVE_BPF_PROLOGUE
+
+static int llseek_loop(void)
+{
+	int fds[2], i;
+
+	fds[0] = open("/dev/null", O_RDONLY);
+	fds[1] = open("/dev/null", O_RDWR);
+
+	if (fds[0] < 0 || fds[1] < 0)
+		return -1;
+
+	for (i = 0; i < NR_ITERS; i++) {
+		lseek(fds[i % 2], i, (i / 2) % 2 ? SEEK_CUR : SEEK_SET);
+		lseek(fds[(i + 1) % 2], i, (i / 2) % 2 ? SEEK_CUR : SEEK_SET);
+	}
+	close(fds[0]);
+	close(fds[1]);
+	return 0;
+}
+
+#endif
+
+static int prepare_bpf(const char *name, void *obj_buf, size_t obj_buf_sz)
 {
 	int err;
 	char errbuf[BUFSIZ];
 
-	err = bpf__prepare_load_buffer(obj_buf, obj_buf_sz, NULL);
+	err = bpf__prepare_load_buffer(obj_buf, obj_buf_sz, name);
 	if (err) {
-		bpf__strerror_prepare_load("[buffer]", false, err, errbuf,
+		bpf__strerror_prepare_load(name, false, err, errbuf,
 					   sizeof(errbuf));
 		fprintf(stderr, " (%s)", errbuf);
 		return TEST_FAIL;
@@ -49,7 +72,7 @@ static int prepare_bpf(void *obj_buf, size_t obj_buf_sz)
 	return 0;
 }
 
-static int do_test(void)
+static int do_test(int (*func)(void), int expect)
 {
 	struct record_opts opts = {
 		.target = {
@@ -106,7 +129,7 @@ static int do_test(void)
 	}
 
 	perf_evlist__enable(evlist);
-	epoll_pwait_loop();
+	(*func)();
 	perf_evlist__disable(evlist);
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
@@ -120,8 +143,8 @@ static int do_test(void)
 		}
 	}
 
-	if (count != (NR_ITERS + 1) / 2) {
-		fprintf(stderr, " (filter result incorrect)");
+	if (count != expect) {
+		fprintf(stderr, " (filter result incorrect: %d != %d)", count, expect);
 		err = -EBADF;
 	}
 
@@ -132,30 +155,30 @@ out_delete_evlist:
 	return 0;
 }
 
-int test__bpf(void)
+static int __test__bpf(int index, const char *name,
+		       const char *message_compile,
+		       const char *message_load,
+		       int (*func)(void), int expect)
 {
 	int err;
 	void *obj_buf;
 	size_t obj_buf_sz;
 
-	if (geteuid() != 0) {
-		fprintf(stderr, " (try run as root)");
-		return TEST_SKIP;
-	}
-
-	test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz, LLVM_TESTCASE_BASE);
-
+	test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz, index);
 	if (!obj_buf || !obj_buf_sz) {
 		if (verbose == 0)
-			fprintf(stderr, " (fix 'perf test LLVM' first)");
+			fprintf(stderr, " (%s)", message_compile);
 		return TEST_SKIP;
 	}
 
-	err = prepare_bpf(obj_buf, obj_buf_sz);
-	if (err)
+	err = prepare_bpf(name, obj_buf, obj_buf_sz);
+	if (err) {
+		if ((verbose == 0) && (message_load[0] != '\0'))
+			fprintf(stderr, " (%s)", message_load);
 		goto out;
+	}
 
-	err = do_test();
+	err = do_test(func, expect);
 	if (err)
 		goto out;
 out:
@@ -166,6 +189,38 @@ out:
 	return 0;
 }
 
+int test__bpf(void)
+{
+	int err;
+
+	if (geteuid() != 0) {
+		fprintf(stderr, " (try run as root)");
+		return TEST_SKIP;
+	}
+
+	err = __test__bpf(LLVM_TESTCASE_BASE,
+			  "[basic_bpf_test]",
+			  "fix 'perf test LLVM' first",
+			  "load bpf object failed",
+			  &epoll_pwait_loop,
+			  (NR_ITERS + 1) / 2);
+	if (err)
+		return err;
+
+#ifdef HAVE_BPF_PROLOGUE
+	err = __test__bpf(LLVM_TESTCASE_BPF_PROLOGUE,
+			  "[bpf_prologue_test]",
+			  "fix kbuild first",
+			  "check your vmlinux setting?",
+			  &llseek_loop,
+			  (NR_ITERS + 1) / 4);
+	return err;
+#else
+	fprintf(stderr, " (skip BPF prologue test)");
+	return TEST_OK;
+#endif
+}
+
 #else
 int test__bpf(void)
 {
diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
index 75cd99f..e722e8a 100644
--- a/tools/perf/tests/llvm.c
+++ b/tools/perf/tests/llvm.c
@@ -22,6 +22,11 @@ struct llvm_testcase {
 	[LLVM_TESTCASE_KBUILD]	= {.source = test_llvm__bpf_test_kbuild_prog,
 				   .errmsg = "llvm.kbuild-dir can be fixed",
 				   .tried = false},
+	/* Don't output if this one fail. */
+	[LLVM_TESTCASE_BPF_PROLOGUE]	= {
+				   .source = test_llvm__bpf_test_prologue_prog,
+				   .errmsg = "failed for unknown reason",
+				   .tried = false},
 	{.source = NULL}
 };
 
diff --git a/tools/perf/tests/llvm.h b/tools/perf/tests/llvm.h
index 78ec01d..c00c1be 100644
--- a/tools/perf/tests/llvm.h
+++ b/tools/perf/tests/llvm.h
@@ -10,10 +10,18 @@ struct test_llvm__bpf_result {
 
 extern const char test_llvm__bpf_prog[];
 extern const char test_llvm__bpf_test_kbuild_prog[];
+extern const char test_llvm__bpf_test_prologue_prog[];
 
 enum test_llvm__testcase {
 	LLVM_TESTCASE_BASE,
 	LLVM_TESTCASE_KBUILD,
+	/*
+	 * We must put LLVM_TESTCASE_BPF_PROLOGUE after
+	 * LLVM_TESTCASE_KBUILD, so if kbuild test failed,
+	 * don't need to try this one, because it depend on
+	 * kernel header.
+	 */
+	LLVM_TESTCASE_BPF_PROLOGUE,
 	NR_LLVM_TESTCASES,
 };
 void test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz, int index);
-- 
1.8.3.4



* [tip:perf/core] perf tools: Copy linux/filter.h to tools/include
  2015-08-29  4:21 ` [PATCH 21/31] perf tools: Move linux/filter.h to tools/include Wang Nan
  2015-08-31 20:35   ` Arnaldo Carvalho de Melo
  2015-09-01 19:39   ` Arnaldo Carvalho de Melo
@ 2015-09-08 14:31   ` tip-bot for He Kuang
  2 siblings, 0 replies; 94+ messages in thread
From: tip-bot for He Kuang @ 2015-09-08 14:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: brendan.d.gregg, xiakaixu, hekuang, acme, ast, mingo,
	masami.hiramatsu.pt, namhyung, paulus, dsahern, jolsa, hpa,
	daniel, wangnan0, a.p.zijlstra, linux-kernel, tglx, lizefan

Commit-ID:  dabf626f7f0e5cbef0d1cfb5143e40213f079bb8
Gitweb:     http://git.kernel.org/tip/dabf626f7f0e5cbef0d1cfb5143e40213f079bb8
Author:     He Kuang <hekuang@huawei.com>
AuthorDate: Sat, 29 Aug 2015 04:21:55 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 2 Sep 2015 16:30:46 -0300

perf tools: Copy linux/filter.h to tools/include

This patch copies filter.h from include/linux/filter.h to
tools/include/linux/filter.h so that other libraries can use its macros,
such as libbpf, which will be introduced by further patches.

Currently, the filter.h copy contains only the macros needed by libbpf,
to avoid introducing too many dependencies.

tools/perf/MANIFEST is also updated for 'make perf-*-src-pkg'.

One change:
  The 'imm' field of BPF_EMIT_CALL becomes ((FUNC) - BPF_FUNC_unspec) to
  suit user space code generator.
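
As a rough illustration of that change: with the user-space form of the
macro, 'imm' is simply the helper's numeric ID. The sketch below does not
include <linux/bpf.h>; the sketch_* and SKETCH_* names are local stand-ins
assumed to mirror the uapi definitions, and only two helper IDs are
listed.

```c
#include <stdint.h>

/* Local stand-in assumed to mirror the layout of struct bpf_insn. */
struct sketch_bpf_insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate constant */
};

/* Stand-ins for the BPF_JMP class and the BPF_CALL opcode. */
#define SKETCH_BPF_JMP	0x05
#define SKETCH_BPF_CALL	0x80

/* Subset of helper IDs; 6 matches the '(void *) 6' trace_printk stub. */
enum sketch_bpf_func_id {
	SKETCH_BPF_FUNC_unspec = 0,
	SKETCH_BPF_FUNC_trace_printk = 6,
};

/* User-space form: imm is the helper ID relative to 'unspec'. */
#define SKETCH_BPF_EMIT_CALL(FUNC)				\
	((struct sketch_bpf_insn) {				\
		.code  = SKETCH_BPF_JMP | SKETCH_BPF_CALL,	\
		.dst_reg = 0,					\
		.src_reg = 0,					\
		.off   = 0,					\
		.imm   = ((FUNC) - SKETCH_BPF_FUNC_unspec) })

/* Build the call instruction for the trace_printk helper. */
static struct sketch_bpf_insn sketch_call_trace_printk(void)
{
	return SKETCH_BPF_EMIT_CALL(SKETCH_BPF_FUNC_trace_printk);
}
```

A code generator in user space only knows helper IDs, not kernel function
addresses, so encoding the ID directly is presumably why the macro was
adjusted here.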

Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440822125-52691-22-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
[ Removed stylistic changes, so that a diff to the original file gets reduced ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../bpf/libbpf.h => tools/include/linux/filter.h   | 131 ++++++++++++++-------
 tools/perf/MANIFEST                                |   1 +
 2 files changed, 87 insertions(+), 45 deletions(-)

diff --git a/samples/bpf/libbpf.h b/tools/include/linux/filter.h
similarity index 61%
copy from samples/bpf/libbpf.h
copy to tools/include/linux/filter.h
index 7235e29..3276625 100644
--- a/samples/bpf/libbpf.h
+++ b/tools/include/linux/filter.h
@@ -1,22 +1,32 @@
-/* eBPF mini library */
-#ifndef __LIBBPF_H
-#define __LIBBPF_H
-
-struct bpf_insn;
-
-int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
-		   int max_entries);
-int bpf_update_elem(int fd, void *key, void *value, unsigned long long flags);
-int bpf_lookup_elem(int fd, void *key, void *value);
-int bpf_delete_elem(int fd, void *key);
-int bpf_get_next_key(int fd, void *key, void *next_key);
-
-int bpf_prog_load(enum bpf_prog_type prog_type,
-		  const struct bpf_insn *insns, int insn_len,
-		  const char *license, int kern_version);
-
-#define LOG_BUF_SIZE 65536
-extern char bpf_log_buf[LOG_BUF_SIZE];
+/*
+ * Linux Socket Filter Data Structures
+ */
+#ifndef __TOOLS_LINUX_FILTER_H
+#define __TOOLS_LINUX_FILTER_H
+
+#include <linux/bpf.h>
+
+/* ArgX, context and stack frame pointer register positions. Note,
+ * Arg1, Arg2, Arg3, etc are used as argument mappings of function
+ * calls in BPF_CALL instruction.
+ */
+#define BPF_REG_ARG1	BPF_REG_1
+#define BPF_REG_ARG2	BPF_REG_2
+#define BPF_REG_ARG3	BPF_REG_3
+#define BPF_REG_ARG4	BPF_REG_4
+#define BPF_REG_ARG5	BPF_REG_5
+#define BPF_REG_CTX	BPF_REG_6
+#define BPF_REG_FP	BPF_REG_10
+
+/* Additional register mappings for converted user programs. */
+#define BPF_REG_A	BPF_REG_0
+#define BPF_REG_X	BPF_REG_7
+#define BPF_REG_TMP	BPF_REG_8
+
+/* BPF program can access up to 512 bytes of stack space. */
+#define MAX_BPF_STACK	512
+
+/* Helper macros for filter block array initializers. */
 
 /* ALU ops on registers, bpf_add|sub|...: dst_reg += src_reg */
 
@@ -54,6 +64,16 @@ extern char bpf_log_buf[LOG_BUF_SIZE];
 		.off   = 0,					\
 		.imm   = IMM })
 
+/* Endianess conversion, cpu_to_{l,b}e(), {l,b}e_to_cpu() */
+
+#define BPF_ENDIAN(TYPE, DST, LEN)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU | BPF_END | BPF_SRC(TYPE),	\
+		.dst_reg = DST,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = LEN })
+
 /* Short form of mov, dst_reg = src_reg */
 
 #define BPF_MOV64_REG(DST, SRC)					\
@@ -64,6 +84,14 @@ extern char bpf_log_buf[LOG_BUF_SIZE];
 		.off   = 0,					\
 		.imm   = 0 })
 
+#define BPF_MOV32_REG(DST, SRC)					\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU | BPF_MOV | BPF_X,		\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = 0 })
+
 /* Short form of mov, dst_reg = imm32 */
 
 #define BPF_MOV64_IMM(DST, IMM)					\
@@ -74,32 +102,31 @@ extern char bpf_log_buf[LOG_BUF_SIZE];
 		.off   = 0,					\
 		.imm   = IMM })
 
-/* BPF_LD_IMM64 macro encodes single 'load 64-bit immediate' insn */
-#define BPF_LD_IMM64(DST, IMM)					\
-	BPF_LD_IMM64_RAW(DST, 0, IMM)
-
-#define BPF_LD_IMM64_RAW(DST, SRC, IMM)				\
+#define BPF_MOV32_IMM(DST, IMM)					\
 	((struct bpf_insn) {					\
-		.code  = BPF_LD | BPF_DW | BPF_IMM,		\
+		.code  = BPF_ALU | BPF_MOV | BPF_K,		\
 		.dst_reg = DST,					\
-		.src_reg = SRC,					\
-		.off   = 0,					\
-		.imm   = (__u32) (IMM) }),			\
-	((struct bpf_insn) {					\
-		.code  = 0, /* zero is reserved opcode */	\
-		.dst_reg = 0,					\
 		.src_reg = 0,					\
 		.off   = 0,					\
-		.imm   = ((__u64) (IMM)) >> 32 })
+		.imm   = IMM })
 
-#ifndef BPF_PSEUDO_MAP_FD
-# define BPF_PSEUDO_MAP_FD	1
-#endif
+/* Short form of mov based on type,  BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */
 
-/* pseudo BPF_LD_IMM64 insn used to refer to process-local map_fd */
-#define BPF_LD_MAP_FD(DST, MAP_FD)				\
-	BPF_LD_IMM64_RAW(DST, BPF_PSEUDO_MAP_FD, MAP_FD)
+#define BPF_MOV64_RAW(TYPE, DST, SRC, IMM)			\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU64 | BPF_MOV | BPF_SRC(TYPE),	\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = IMM })
 
+#define BPF_MOV32_RAW(TYPE, DST, SRC, IMM)			\
+	((struct bpf_insn) {					\
+		.code  = BPF_ALU | BPF_MOV | BPF_SRC(TYPE),	\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = IMM })
 
 /* Direct packet access, R0 = *(uint *) (skb->data + imm32) */
 
@@ -111,6 +138,16 @@ extern char bpf_log_buf[LOG_BUF_SIZE];
 		.off   = 0,					\
 		.imm   = IMM })
 
+/* Indirect packet access, R0 = *(uint *) (skb->data + src_reg + imm32) */
+
+#define BPF_LD_IND(SIZE, SRC, IMM)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_LD | BPF_SIZE(SIZE) | BPF_IND,	\
+		.dst_reg = 0,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = IMM })
+
 /* Memory load, dst_reg = *(uint *) (src_reg + off16) */
 
 #define BPF_LDX_MEM(SIZE, DST, SRC, OFF)			\
@@ -161,6 +198,16 @@ extern char bpf_log_buf[LOG_BUF_SIZE];
 		.off   = OFF,					\
 		.imm   = IMM })
 
+/* Function call */
+
+#define BPF_EMIT_CALL(FUNC)					\
+	((struct bpf_insn) {					\
+		.code  = BPF_JMP | BPF_CALL,			\
+		.dst_reg = 0,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = ((FUNC) - BPF_FUNC_unspec) })
+
 /* Raw code statement block */
 
 #define BPF_RAW_INSN(CODE, DST, SRC, OFF, IMM)			\
@@ -181,10 +228,4 @@ extern char bpf_log_buf[LOG_BUF_SIZE];
 		.off   = 0,					\
 		.imm   = 0 })
 
-/* create RAW socket and bind to interface 'name' */
-int open_raw_sock(const char *name);
-
-struct perf_event_attr;
-int perf_event_open(struct perf_event_attr *attr, int pid, int cpu,
-		    int group_fd, unsigned long flags);
-#endif
+#endif /* __TOOLS_LINUX_FILTER_H */
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index af009bd..2a958a8 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -41,6 +41,7 @@ tools/include/asm-generic/bitops.h
 tools/include/linux/atomic.h
 tools/include/linux/bitops.h
 tools/include/linux/compiler.h
+tools/include/linux/filter.h
 tools/include/linux/hash.h
 tools/include/linux/kernel.h
 tools/include/linux/list.h


* [tip:perf/core] tools lib traceevent: Support function __get_dynamic_array_len
  2015-08-29  4:22 ` [PATCH 31/31] tools lib traceevent: Support function __get_dynamic_array_len Wang Nan
@ 2015-09-08 14:31   ` tip-bot for He Kuang
  0 siblings, 0 replies; 94+ messages in thread
From: tip-bot for He Kuang @ 2015-09-08 14:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: masami.hiramatsu.pt, lizefan, namhyung, mingo, ast, acme, tglx,
	rostedt, wangnan0, a.p.zijlstra, hekuang, jolsa, linux-kernel,
	hpa

Commit-ID:  76055940c1afc8d445992fb0278b80cf205bbf97
Gitweb:     http://git.kernel.org/tip/76055940c1afc8d445992fb0278b80cf205bbf97
Author:     He Kuang <hekuang@huawei.com>
AuthorDate: Sat, 29 Aug 2015 04:22:05 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 2 Sep 2015 16:30:46 -0300

tools lib traceevent: Support function __get_dynamic_array_len

Support the helper function __get_dynamic_array_len() in libtraceevent.
This function is used together with __print_array() or __print_hex(), but
currently it is not available in the function list of
process_function().

The total allocated length of the dynamic array is embedded in the top
half of the __data_loc_##item field. This patch adds a new arg type,
PRINT_DYNAMIC_ARRAY_LEN, to return the length to eval_num_arg().
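
The field layout described above can be sketched as a small decoder. This
is only an illustration of the bit split used in eval_num_arg(); the
sketch_* names are hypothetical and not part of libtraceevent.

```c
#include <stdint.h>

/* Hypothetical container for the two halves of a __data_loc field. */
struct sketch_data_loc {
	unsigned int offset;	/* bottom 16 bits: offset of the data */
	unsigned int len;	/* top 16 bits: total allocated length */
};

/* Split a 32-bit __data_loc value into its offset and length. */
static struct sketch_data_loc sketch_decode_data_loc(uint32_t raw)
{
	struct sketch_data_loc loc = {
		.offset = raw & 0xffff,
		.len    = raw >> 16,
	};

	return loc;
}
```

For example, a raw value of 0x00100040 would decode to an offset of 0x40
and a length of 16, which is the value __get_dynamic_array_len() is meant
to yield.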

Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440822125-52691-32-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/event-parse.c                 | 56 +++++++++++++++++++++-
 tools/lib/traceevent/event-parse.h                 |  1 +
 .../perf/util/scripting-engines/trace-event-perl.c |  1 +
 .../util/scripting-engines/trace-event-python.c    |  1 +
 4 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 4d88593..1244797 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -848,6 +848,7 @@ static void free_arg(struct print_arg *arg)
 		free(arg->bitmask.bitmask);
 		break;
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 		free(arg->dynarray.index);
 		break;
 	case PRINT_OP:
@@ -2729,6 +2730,42 @@ process_dynamic_array(struct event_format *event, struct print_arg *arg, char **
 }
 
 static enum event_type
+process_dynamic_array_len(struct event_format *event, struct print_arg *arg,
+			  char **tok)
+{
+	struct format_field *field;
+	enum event_type type;
+	char *token;
+
+	if (read_expect_type(EVENT_ITEM, &token) < 0)
+		goto out_free;
+
+	arg->type = PRINT_DYNAMIC_ARRAY_LEN;
+
+	/* Find the field */
+	field = pevent_find_field(event, token);
+	if (!field)
+		goto out_free;
+
+	arg->dynarray.field = field;
+	arg->dynarray.index = 0;
+
+	if (read_expected(EVENT_DELIM, ")") < 0)
+		goto out_err;
+
+	type = read_token(&token);
+	*tok = token;
+
+	return type;
+
+ out_free:
+	free_token(token);
+ out_err:
+	*tok = NULL;
+	return EVENT_ERROR;
+}
+
+static enum event_type
 process_paren(struct event_format *event, struct print_arg *arg, char **tok)
 {
 	struct print_arg *item_arg;
@@ -2975,6 +3012,10 @@ process_function(struct event_format *event, struct print_arg *arg,
 		free_token(token);
 		return process_dynamic_array(event, arg, tok);
 	}
+	if (strcmp(token, "__get_dynamic_array_len") == 0) {
+		free_token(token);
+		return process_dynamic_array_len(event, arg, tok);
+	}
 
 	func = find_func_handler(event->pevent, token);
 	if (func) {
@@ -3655,14 +3696,25 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
 			goto out_warning_op;
 		}
 		break;
+	case PRINT_DYNAMIC_ARRAY_LEN:
+		offset = pevent_read_number(pevent,
+					    data + arg->dynarray.field->offset,
+					    arg->dynarray.field->size);
+		/*
+		 * The total allocated length of the dynamic array is
+		 * stored in the top half of the field, and the offset
+		 * is in the bottom half of the 32 bit field.
+		 */
+		val = (unsigned long long)(offset >> 16);
+		break;
 	case PRINT_DYNAMIC_ARRAY:
 		/* Without [], we pass the address to the dynamic data */
 		offset = pevent_read_number(pevent,
 					    data + arg->dynarray.field->offset,
 					    arg->dynarray.field->size);
 		/*
-		 * The actual length of the dynamic array is stored
-		 * in the top half of the field, and the offset
+		 * The total allocated length of the dynamic array is
+		 * stored in the top half of the field, and the offset
 		 * is in the bottom half of the 32 bit field.
 		 */
 		offset &= 0xffff;
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 204befb..6fc83c7 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -294,6 +294,7 @@ enum print_arg_type {
 	PRINT_OP,
 	PRINT_FUNC,
 	PRINT_BITMASK,
+	PRINT_DYNAMIC_ARRAY_LEN,
 };
 
 struct print_arg {
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index 1bd593b..544509c 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -221,6 +221,7 @@ static void define_event_symbols(struct event_format *event,
 		break;
 	case PRINT_BSTRING:
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 	case PRINT_STRING:
 	case PRINT_BITMASK:
 		break;
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index ace2484..aa9e125 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -251,6 +251,7 @@ static void define_event_symbols(struct event_format *event,
 		/* gcc warns for these? */
 	case PRINT_BSTRING:
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 	case PRINT_FUNC:
 	case PRINT_BITMASK:
 		/* we should warn... */


end of thread, other threads:[~2015-09-08 14:34 UTC | newest]

Thread overview: 94+ messages
2015-08-29  4:21 [GIT PULL 00/31] perf tools: filtering events using eBPF programs Wang Nan
2015-08-29  4:21 ` [PATCH 01/31] bpf tools: New API to get name from a BPF object Wang Nan
2015-08-29  4:21 ` [PATCH 02/31] perf tools: Don't set cmdline_group_boundary if no evsel is collected Wang Nan
2015-08-31 19:20   ` Arnaldo Carvalho de Melo
2015-09-01 10:37     ` Wangnan (F)
2015-09-01 10:38     ` Jiri Olsa
2015-09-01 12:44       ` Wangnan (F)
2015-09-02  2:53   ` [PATCH] perf tools: Don't set leader if parser doesn't collect an evsel Wang Nan
2015-09-02  3:01     ` Wangnan (F)
2015-09-02  5:57     ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-02  6:09       ` Wangnan (F)
     [not found]       ` <1441176553-116129-1-git-send-email-wangnan0@huawei.com>
2015-09-02  6:53         ` [PATCH] perf tools: Don't write to evsel if parser doesn't collect evsel Wangnan (F)
2015-09-02 10:31           ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-02 11:54           ` Jiri Olsa
2015-09-02 12:05             ` pi3orama
2015-09-02 12:46               ` Jiri Olsa
2015-09-02 13:55               ` Arnaldo Carvalho de Melo
2015-09-02 14:04                 ` pi3orama
2015-09-02 14:43                   ` Arnaldo Carvalho de Melo
2015-09-02 22:24                     ` pi3orama
2015-08-29  4:21 ` [PATCH 03/31] perf tools: Introduce dummy evsel Wang Nan
2015-08-31 19:38   ` Arnaldo Carvalho de Melo
2015-09-03  0:11   ` Namhyung Kim
2015-09-03  0:42     ` pi3orama
2015-09-06  5:55   ` [PATCH] perf tools: Allow BPF placeholder dummy events to collect --filter options Wang Nan
2015-09-06  5:56     ` [PATCH] perf tools: Sync setting of real bpf events with placeholder Wang Nan
2015-08-29  4:21 ` [PATCH 04/31] perf tools: Make perf depend on libbpf Wang Nan
2015-08-29  4:21 ` [PATCH 05/31] perf ebpf: Add the libbpf glue Wang Nan
2015-08-29  4:21 ` [PATCH 06/31] perf tools: Enable passing bpf object file to --event Wang Nan
2015-08-29  4:21 ` [PATCH 07/31] perf probe: Attach trace_probe_event with perf_probe_event Wang Nan
2015-09-02  4:32   ` Namhyung Kim
2015-09-02  5:40     ` Wangnan (F)
2015-08-29  4:21 ` [PATCH 08/31] perf record, bpf: Parse and probe eBPF programs probe points Wang Nan
2015-08-29  4:21 ` [PATCH 09/31] perf bpf: Collect 'struct perf_probe_event' for bpf_program Wang Nan
2015-08-29  4:21 ` [PATCH 10/31] perf record: Load all eBPF object into kernel Wang Nan
2015-08-29  4:21 ` [PATCH 11/31] perf tools: Add bpf_fd field to evsel and config it Wang Nan
2015-08-29  4:21 ` [PATCH 12/31] perf tools: Allow filter option to be applied to bof object Wang Nan
2015-08-29  4:21 ` [PATCH 13/31] perf tools: Attach eBPF program to perf event Wang Nan
2015-08-29  4:21 ` [PATCH 14/31] perf tools: Suppress probing messages when probing by BPF loading Wang Nan
2015-09-03  0:20   ` Namhyung Kim
2015-09-03  2:42     ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-03 12:10     ` [PATCH perf/core ] perf-probe: Output the result of adding/deleting probe in buildin-probe Masami Hiramatsu
2015-09-03 12:18       ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-03 17:25         ` Namhyung Kim
2015-09-03 20:28       ` Arnaldo Carvalho de Melo
2015-09-04  1:30         ` 平松雅巳 / HIRAMATU,MASAMI
2015-08-29  4:21 ` [PATCH 15/31] perf record: Add clang options for compiling BPF scripts Wang Nan
2015-08-29  4:21 ` [PATCH 16/31] perf tools: Infrastructure for compiling scriptlets when passing '.c' to --event Wang Nan
2015-08-29  4:21 ` [PATCH 17/31] perf tests: Enforce LLVM test for BPF test Wang Nan
2015-09-01  5:59   ` Wangnan (F)
2015-08-29  4:21 ` [PATCH 18/31] perf test: Add 'perf test BPF' Wang Nan
2015-09-02 12:45   ` Namhyung Kim
2015-09-05 12:21     ` Wang Nan
2015-08-29  4:21 ` [PATCH 19/31] bpf tools: Load a program with different instances using preprocessor Wang Nan
2015-08-29  4:21 ` [PATCH 20/31] perf probe: Reset args and nargs for probe_trace_event when failure Wang Nan
2015-08-29  4:21 ` [PATCH 21/31] perf tools: Move linux/filter.h to tools/include Wang Nan
2015-08-31 20:35   ` Arnaldo Carvalho de Melo
2015-09-01 19:39   ` Arnaldo Carvalho de Melo
2015-09-01 19:47     ` Arnaldo Carvalho de Melo
2015-09-01 21:08     ` pi3orama
2015-09-01 21:43       ` Arnaldo Carvalho de Melo
2015-09-08 14:31   ` [tip:perf/core] perf tools: Copy " tip-bot for He Kuang
2015-08-29  4:21 ` [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches Wang Nan
2015-08-31 20:39   ` Arnaldo Carvalho de Melo
2015-09-01  6:59   ` Wang Nan
2015-09-01  6:59     ` [PATCH 23/31] perf tools: Introduce regs_query_register_offset() for x86 Wang Nan
2015-09-01 11:47       ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-01 13:52         ` Wangnan (F)
2015-09-01 14:50           ` Arnaldo Carvalho de Melo
2015-09-01 14:14         ` Arnaldo Carvalho de Melo
2015-09-01 15:54           ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-06  6:02             ` Wangnan (F)
2015-09-06  6:04               ` [PATCH] perf test: Enforce LLVM test, add kbuild test Wang Nan
2015-09-06  6:04                 ` [PATCH] perf test: Test BPF prologue Wang Nan
2015-09-02 14:08     ` [PATCH 22/31] perf tools: Add BPF_PROLOGUE config options for further patches Namhyung Kim
2015-08-29  4:21 ` [PATCH 23/31] perf tools: Introduce arch_get_reg_info() for x86 Wang Nan
2015-08-31 20:43   ` Arnaldo Carvalho de Melo
2015-09-01  2:39     ` Wangnan (F)
2015-08-29  4:21 ` [PATCH 24/31] perf tools: Add prologue for BPF programs for fetching arguments Wang Nan
2015-08-29  4:21 ` [PATCH 25/31] perf tools: Generate prologue for BPF programs Wang Nan
2015-08-29  4:22 ` [PATCH 26/31] perf tools: Use same BPF program if arguments are identical Wang Nan
2015-08-29  4:22 ` [PATCH 27/31] perf record: Support custom vmlinux path Wang Nan
2015-09-01 20:19   ` Arnaldo Carvalho de Melo
2015-09-01 20:21     ` Arnaldo Carvalho de Melo
2015-09-01 21:00       ` pi3orama
2015-09-01 21:33         ` Arnaldo Carvalho de Melo
2015-08-29  4:22 ` [PATCH 28/31] perf probe: Init symbol as kprobe Wang Nan
2015-09-01 20:11   ` Arnaldo Carvalho de Melo
2015-09-02  1:22     ` Wangnan (F)
2015-09-02  1:38     ` 平松雅巳 / HIRAMATU,MASAMI
2015-08-29  4:22 ` [PATCH 29/31] perf tools: Support attach BPF program on uprobe events Wang Nan
2015-08-29  4:22 ` [PATCH 30/31] perf tools: Fix cross compiling error Wang Nan
2015-08-29  4:22 ` [PATCH 31/31] tools lib traceevent: Support function __get_dynamic_array_len Wang Nan
2015-09-08 14:31   ` [tip:perf/core] " tip-bot for He Kuang
