* [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support
@ 2016-01-25  9:55 Wang Nan
  2016-01-25  9:55 ` [PATCH 01/54] perf test: Add libbpf relocation checker Wang Nan
                   ` (54 more replies)
  0 siblings, 55 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Hi Arnaldo,

The following changes since commit 512e583b2d4a35b644c8ff36e033b90be7e91c2e:

  perf hists browser: Offer non-symbol specific menu options for --sort without 'sym' (2016-01-22 14:28:48 -0300)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/pi3orama/linux.git tags/perf-core-for-acme

for you to fetch changes up to 7c8463658d92b82c2a8db9f405b50ae814b91f71:

  perf tools: Don't warn about out of order event if write_backward is used (2016-01-25 09:49:15 +0000)

----------------------------------------------------------------
perf improvements:

 - Bug fixes:
   Add a libbpf relocation checker for an LLVM bug
   Fix symbol searching for offline modules

 - Building scripts:
   Use feature-dump results for build-test

 - BPF related improvements:
   Enable indices syntax for initializing BPF maps
   Support BPF output events

 - perf/core:
   Add write_backward attribute bit to support reading from
      overwrite ring buffer

 - perf record improvements:
   Enable perf record to dump output into different files
   Support reading from overwrite ring buffer based on write_backward
     attribute

Signed-off-by: Wang Nan <wangnan0@huawei.com>

----------------------------------------------------------------
He Kuang (1):
      perf tools: Support perf event alias name

Wang Nan (53):
      perf test: Add libbpf relocation checker
      perf bpf: Check relocation target section
      tools build: Allow subprojects select all feature checkers
      perf build: Select all feature checkers for feature-dump
      perf build: Use feature dump file for build-test
      perf test: Check environment before start real BPF test
      perf tools: Fix symbols searching for offline module in buildid-cache
      perf test: Improve bp_signal
      perf tools: Add API to config maps in bpf object
      perf tools: Enable BPF object configure syntax
      perf record: Apply config to BPF objects before recording
      perf tools: Enable passing event to BPF object
      perf tools: Support setting different slots in a BPF map separately
      perf tools: Enable indices setting syntax for BPF maps
      perf tools: Introduce bpf-output event
      perf data: Support converting data from bpf_perf_event_output()
      perf core: Introduce new ioctl options to pause and resume ring buffer
      perf core: Set event's default overflow_handler
      perf core: Prepare writing into ring buffer from end
      perf core: Add backward attribute to perf event
      perf core: Reduce perf event output overhead by new overflow handler
      perf tools: Introduce API to pause ring buffer
      perf tools: Only validate is_pos for tracking evsels
      perf tools: Print write_backward value in perf_event_attr__fprintf
      perf tools: Move timestamp creation to util
      perf tools: Make ordered_events reusable
      perf record: Extract synthesize code to record__synthesize()
      perf tools: Add perf_data_file__switch() helper
      perf record: Turns auxtrace_snapshot_enable into 3 states
      perf record: Introduce record__finish_output() to finish a perf.data
      perf record: Use OPT_BOOLEAN_SET for buildid cache related options
      perf record: Add '--timestamp-filename' option to append timestamp to output filename
      perf record: Split output into multiple files via '--switch-output'
      perf record: Force enable --timestamp-filename when --switch-output is provided
      perf record: Disable buildid cache options by default in switch output mode
      perf record: Re-synthesize tracking events after output switching
      perf record: Generate tracking events for process forked by perf
      perf record: Ensure return non-zero rc when mmap fail
      perf record: Prevent reading invalid data in record__mmap_read
      perf tools: Add evlist channel helpers
      perf tools: Automatically add new channel according to evlist
      perf tools: Operate multiple channels
      perf tools: Squash overwrite setting into channel
      perf record: Don't read from and poll overwrite channel
      perf record: Don't poll on overwrite channel
      perf tools: Detect avalibility of write_backward
      perf tools: Enable overwrite settings
      perf tools: Set write_backward attribut bit for overwrite events
      perf record: Toggle overwrite ring buffer for reading
      perf record: Rename variable to make code clear
      perf record: Read from backward ring buffer
      perf record: Allow generate tracking events at the end of output
      perf tools: Don't warn about out of order event if write_backward is used

 include/linux/perf_event.h                    |  22 +-
 include/uapi/linux/perf_event.h               |   4 +-
 kernel/events/core.c                          |  73 ++-
 kernel/events/internal.h                      |  11 +
 kernel/events/ring_buffer.c                   |  63 ++-
 tools/build/Makefile.feature                  |  21 +-
 tools/lib/bpf/libbpf.c                        |  34 +-
 tools/perf/Makefile.perf                      |  11 +-
 tools/perf/builtin-buildid-cache.c            |  14 +-
 tools/perf/builtin-record.c                   | 608 ++++++++++++++++++----
 tools/perf/perf.h                             |   2 +
 tools/perf/tests/.gitignore                   |   1 +
 tools/perf/tests/Build                        |   9 +-
 tools/perf/tests/bp_signal.c                  | 140 +++++-
 tools/perf/tests/bpf-script-test-relocation.c |  50 ++
 tools/perf/tests/bpf.c                        |  63 ++-
 tools/perf/tests/llvm.c                       |  17 +-
 tools/perf/tests/llvm.h                       |   5 +-
 tools/perf/tests/make                         |  31 ++
 tools/perf/util/bpf-loader.c                  | 699 ++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h                  |  59 +++
 tools/perf/util/build-id.c                    |  44 ++
 tools/perf/util/build-id.h                    |   1 +
 tools/perf/util/data-convert-bt.c             | 112 ++++-
 tools/perf/util/data.c                        |  36 ++
 tools/perf/util/data.h                        |  11 +-
 tools/perf/util/evlist.c                      | 314 ++++++++++--
 tools/perf/util/evlist.h                      |  67 ++-
 tools/perf/util/evsel.c                       |  30 ++
 tools/perf/util/evsel.h                       |  13 +
 tools/perf/util/ordered-events.c              |   5 +
 tools/perf/util/parse-events.c                | 139 ++++-
 tools/perf/util/parse-events.h                |  24 +-
 tools/perf/util/parse-events.l                |  18 +-
 tools/perf/util/parse-events.y                | 123 ++++-
 tools/perf/util/record.c                      |  11 +
 tools/perf/util/session.c                     |  22 +-
 tools/perf/util/symbol.c                      |   4 +
 tools/perf/util/util.c                        |  17 +
 tools/perf/util/util.h                        |   1 +
 40 files changed, 2689 insertions(+), 240 deletions(-)
 create mode 100644 tools/perf/tests/bpf-script-test-relocation.c

-- 
1.8.3.4


* [PATCH 01/54] perf test: Add libbpf relocation checker
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-01-26 14:58   ` Arnaldo Carvalho de Melo
  2016-02-03 10:13   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-01-25  9:55 ` [PATCH 02/54] perf bpf: Check relocation target section Wang Nan
                   ` (53 subsequent siblings)
  54 siblings, 2 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

A bug in LLVM can make it generate unneeded relocation information;
see [1] and [2]. Libbpf should therefore check the target section
of a relocation symbol.

This patch adds a test case that references a global variable (BPF
doesn't support global variables). Until libbpf is fixed, the new test
case can be loaded into the kernel and the global variable acts like
the first map, which is incorrect.
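
The pass/fail rule this test relies on can be sketched in isolation.
The snippet below is only an illustration, not code from the patch:
the enum values and check_load_result() are hypothetical stand-ins
for perf's test result codes and for the comparison done in
__test__bpf(). A test case that is expected to be rejected by the
loader must fail when loading succeeds.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for perf's test result codes. */
enum { SKETCH_TEST_OK = 0, SKETCH_TEST_FAIL = -1 };

/*
 * expect_load: true for ordinary test cases, false for the relocation
 *              test case, which the loader must reject.
 * loaded:      whether the loader actually accepted the object.
 */
static int check_load_result(bool expect_load, bool loaded)
{
	/* Any mismatch is a failure, including an unexpected success. */
	return expect_load == loaded ? SKETCH_TEST_OK : SKETCH_TEST_FAIL;
}
```

With this rule, check_load_result(false, true) reports a failure,
which corresponds to the "Success unexpectedly" message in the
verbose output.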

Result:
 # ~/perf test BPF
 37: Test BPF filter                                          :
 37.1: Test basic BPF filtering                               : Ok
 37.2: Test BPF prologue generation                           : Ok
 37.3: Test BPF relocation checker                            : FAILED!

 # ~/perf test -v BPF
 ...
 libbpf: loading object '[bpf_relocation_test]' from buffer
 libbpf: section .strtab, size 126, link 0, flags 0, type=3
 libbpf: section .text, size 0, link 0, flags 6, type=1
 libbpf: section .data, size 0, link 0, flags 3, type=1
 libbpf: section .bss, size 0, link 0, flags 3, type=8
 libbpf: section func=sys_write, size 104, link 0, flags 6, type=1
 libbpf: found program func=sys_write
 libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9
 libbpf: section maps, size 16, link 0, flags 3, type=1
 libbpf: maps in [bpf_relocation_test]: 16 bytes
 libbpf: section license, size 4, link 0, flags 3, type=1
 libbpf: license of [bpf_relocation_test] is GPL
 libbpf: section version, size 4, link 0, flags 3, type=1
 libbpf: kernel version of [bpf_relocation_test] is 40400
 libbpf: section .symtab, size 144, link 1, flags 0, type=2
 libbpf: map 0 is "my_table"
 libbpf: collecting relocating info for: 'func=sys_write'
 libbpf: relocation: insn_idx=7
 Success unexpectedly: libbpf error when dealing with relocation
 test child finished with -1
 ---- end ----
 Test BPF filter subtest 2: FAILED!

[1] https://llvm.org/bugs/show_bug.cgi?id=26243
[2] https://patchwork.ozlabs.org/patch/571385/

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/Makefile.perf                      |  2 +-
 tools/perf/tests/.gitignore                   |  1 +
 tools/perf/tests/Build                        |  9 ++++-
 tools/perf/tests/bpf-script-test-relocation.c | 50 +++++++++++++++++++++++++++
 tools/perf/tests/bpf.c                        | 26 +++++++++++---
 tools/perf/tests/llvm.c                       | 17 ++++++---
 tools/perf/tests/llvm.h                       |  5 ++-
 7 files changed, 98 insertions(+), 12 deletions(-)
 create mode 100644 tools/perf/tests/bpf-script-test-relocation.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 5d34815..97ce869 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -618,7 +618,7 @@ clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean
 	$(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32
 	$(call QUIET_CLEAN, core-gen)   $(RM)  *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex* \
 		$(OUTPUT)util/intel-pt-decoder/inat-tables.c $(OUTPUT)fixdep \
-		$(OUTPUT)tests/llvm-src-{base,kbuild,prologue}.c
+		$(OUTPUT)tests/llvm-src-{base,kbuild,prologue,relocation}.c
 	$(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
 	$(python-clean)
 
diff --git a/tools/perf/tests/.gitignore b/tools/perf/tests/.gitignore
index bf016c4..8cc30e7 100644
--- a/tools/perf/tests/.gitignore
+++ b/tools/perf/tests/.gitignore
@@ -1,3 +1,4 @@
 llvm-src-base.c
 llvm-src-kbuild.c
 llvm-src-prologue.c
+llvm-src-relocation.c
diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 614899b..1ba628e 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -31,7 +31,7 @@ perf-y += sample-parsing.o
 perf-y += parse-no-sample-id-all.o
 perf-y += kmod-path.o
 perf-y += thread-map.o
-perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o
+perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o llvm-src-relocation.o
 perf-y += bpf.o
 perf-y += topology.o
 perf-y += cpumap.o
@@ -59,6 +59,13 @@ $(OUTPUT)tests/llvm-src-prologue.c: tests/bpf-script-test-prologue.c tests/Build
 	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
 	$(Q)echo ';' >> $@
 
+$(OUTPUT)tests/llvm-src-relocation.c: tests/bpf-script-test-relocation.c tests/Build
+	$(call rule_mkdir)
+	$(Q)echo '#include <tests/llvm.h>' > $@
+	$(Q)echo 'const char test_llvm__bpf_test_relocation[] =' >> $@
+	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
+	$(Q)echo ';' >> $@
+
 ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64))
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 endif
diff --git a/tools/perf/tests/bpf-script-test-relocation.c b/tools/perf/tests/bpf-script-test-relocation.c
new file mode 100644
index 0000000..93af774
--- /dev/null
+++ b/tools/perf/tests/bpf-script-test-relocation.c
@@ -0,0 +1,50 @@
+/*
+ * bpf-script-test-relocation.c
+ * Test BPF loader checking relocation
+ */
+#ifndef LINUX_VERSION_CODE
+# error Need LINUX_VERSION_CODE
+# error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
+#endif
+#define BPF_ANY 0
+#define BPF_MAP_TYPE_ARRAY 2
+#define BPF_FUNC_map_lookup_elem 1
+#define BPF_FUNC_map_update_elem 2
+
+static void *(*bpf_map_lookup_elem)(void *map, void *key) =
+	(void *) BPF_FUNC_map_lookup_elem;
+static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) =
+	(void *) BPF_FUNC_map_update_elem;
+
+struct bpf_map_def {
+	unsigned int type;
+	unsigned int key_size;
+	unsigned int value_size;
+	unsigned int max_entries;
+};
+
+#define SEC(NAME) __attribute__((section(NAME), used))
+struct bpf_map_def SEC("maps") my_table = {
+	.type = BPF_MAP_TYPE_ARRAY,
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.max_entries = 1,
+};
+
+int this_is_a_global_val;
+
+SEC("func=sys_write")
+int bpf_func__sys_write(void *ctx)
+{
+	int key = 0;
+	int value = 0;
+
+	/*
+	 * Incorrect relocation. Should not allow this program be
+	 * loaded into kernel.
+	 */
+	bpf_map_update_elem(&this_is_a_global_val, &key, &value, 0);
+	return 0;
+}
+char _license[] SEC("license") = "GPL";
+int _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 33689a0..952ca99 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -71,6 +71,15 @@ static struct {
 		(NR_ITERS + 1) / 4,
 	},
 #endif
+	{
+		LLVM_TESTCASE_BPF_RELOCATION,
+		"Test BPF relocation checker",
+		"[bpf_relocation_test]",
+		"fix 'perf test LLVM' first",
+		"libbpf error when dealing with relocation",
+		NULL,
+		0,
+	},
 };
 
 static int do_test(struct bpf_object *obj, int (*func)(void),
@@ -190,7 +199,7 @@ static int __test__bpf(int idx)
 
 	ret = test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz,
 				       bpf_testcase_table[idx].prog_id,
-				       true);
+				       true, NULL);
 	if (ret != TEST_OK || !obj_buf || !obj_buf_sz) {
 		pr_debug("Unable to get BPF object, %s\n",
 			 bpf_testcase_table[idx].msg_compile_fail);
@@ -202,14 +211,21 @@ static int __test__bpf(int idx)
 
 	obj = prepare_bpf(obj_buf, obj_buf_sz,
 			  bpf_testcase_table[idx].name);
-	if (!obj) {
+	if ((!!bpf_testcase_table[idx].target_func) != (!!obj)) {
+		if (!obj)
+			pr_debug("Fail to load BPF object: %s\n",
+				 bpf_testcase_table[idx].msg_load_fail);
+		else
+			pr_debug("Success unexpectedly: %s\n",
+				 bpf_testcase_table[idx].msg_load_fail);
 		ret = TEST_FAIL;
 		goto out;
 	}
 
-	ret = do_test(obj,
-		      bpf_testcase_table[idx].target_func,
-		      bpf_testcase_table[idx].expect_result);
+	if (obj)
+		ret = do_test(obj,
+			      bpf_testcase_table[idx].target_func,
+			      bpf_testcase_table[idx].expect_result);
 out:
 	bpf__clear();
 	return ret;
diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
index 06f45c1..70edcdf 100644
--- a/tools/perf/tests/llvm.c
+++ b/tools/perf/tests/llvm.c
@@ -35,6 +35,7 @@ static int test__bpf_parsing(void *obj_buf __maybe_unused,
 static struct {
 	const char *source;
 	const char *desc;
+	bool should_load_fail;
 } bpf_source_table[__LLVM_TESTCASE_MAX] = {
 	[LLVM_TESTCASE_BASE] = {
 		.source = test_llvm__bpf_base_prog,
@@ -48,14 +49,19 @@ static struct {
 		.source = test_llvm__bpf_test_prologue_prog,
 		.desc = "Compile source for BPF prologue generation test",
 	},
+	[LLVM_TESTCASE_BPF_RELOCATION] = {
+		.source = test_llvm__bpf_test_relocation,
+		.desc = "Compile source for BPF relocation test",
+		.should_load_fail = true,
+	},
 };
 
-
 int
 test_llvm__fetch_bpf_obj(void **p_obj_buf,
 			 size_t *p_obj_buf_sz,
 			 enum test_llvm__testcase idx,
-			 bool force)
+			 bool force,
+			 bool *should_load_fail)
 {
 	const char *source;
 	const char *desc;
@@ -68,6 +74,8 @@ test_llvm__fetch_bpf_obj(void **p_obj_buf,
 
 	source = bpf_source_table[idx].source;
 	desc = bpf_source_table[idx].desc;
+	if (should_load_fail)
+		*should_load_fail = bpf_source_table[idx].should_load_fail;
 
 	perf_config(perf_config_cb, NULL);
 
@@ -136,14 +144,15 @@ int test__llvm(int subtest)
 	int ret;
 	void *obj_buf = NULL;
 	size_t obj_buf_sz = 0;
+	bool should_load_fail = false;
 
 	if ((subtest < 0) || (subtest >= __LLVM_TESTCASE_MAX))
 		return TEST_FAIL;
 
 	ret = test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz,
-				       subtest, false);
+				       subtest, false, &should_load_fail);
 
-	if (ret == TEST_OK) {
+	if (ret == TEST_OK && !should_load_fail) {
 		ret = test__bpf_parsing(obj_buf, obj_buf_sz);
 		if (ret != TEST_OK) {
 			pr_debug("Failed to parse test case '%s'\n",
diff --git a/tools/perf/tests/llvm.h b/tools/perf/tests/llvm.h
index 5150b4d..0eaa604 100644
--- a/tools/perf/tests/llvm.h
+++ b/tools/perf/tests/llvm.h
@@ -7,14 +7,17 @@
 extern const char test_llvm__bpf_base_prog[];
 extern const char test_llvm__bpf_test_kbuild_prog[];
 extern const char test_llvm__bpf_test_prologue_prog[];
+extern const char test_llvm__bpf_test_relocation[];
 
 enum test_llvm__testcase {
 	LLVM_TESTCASE_BASE,
 	LLVM_TESTCASE_KBUILD,
 	LLVM_TESTCASE_BPF_PROLOGUE,
+	LLVM_TESTCASE_BPF_RELOCATION,
 	__LLVM_TESTCASE_MAX,
 };
 
 int test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz,
-			     enum test_llvm__testcase index, bool force);
+			     enum test_llvm__testcase index, bool force,
+			     bool *should_load_fail);
 #endif
-- 
1.8.3.4


* [PATCH 02/54] perf bpf: Check relocation target section
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
  2016-01-25  9:55 ` [PATCH 01/54] perf test: Add libbpf relocation checker Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-02-03 10:14   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-01-25  9:55 ` [PATCH 03/54] tools build: Allow subprojects select all feature checkers Wang Nan
                   ` (52 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Libbpf should check the target section before doing a relocation to
ensure the relocation is correct. Without the check, a bug in LLVM
causes errors; see [1]. Also, if an incorrect BPF script uses both a
global variable and a map, the global variable would be treated as a
map and relocated without error.

This patch saves the index of the maps section in obj->efile and
compares the target section of each relocation symbol against it
during relocation.
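
The comparison can be modelled without libelf. The sketch below is
an illustration only: struct sketch_sym, struct sketch_rel and
sketch_check_relocs() are hypothetical names, and the real code works
on GElf_Sym and GElf_Rel entries instead. It rejects any relocation
whose symbol does not live in the maps section.

```c
#include <stddef.h>

/* Simplified stand-in for an ELF symbol: only the section index matters. */
struct sketch_sym {
	const char *name;
	unsigned int shndx;	/* section the symbol is defined in */
};

/* Simplified stand-in for a relocation entry: index into the symbol table. */
struct sketch_rel {
	size_t sym_idx;
};

/*
 * Return 0 when every relocation points at a symbol in the maps section,
 * or -1 at the first relocation that targets any other section.
 * maps_shndx is -1 when the object has no maps section, so nothing matches.
 */
static int sketch_check_relocs(const struct sketch_rel *rels, size_t nrels,
			       const struct sketch_sym *syms, size_t nsyms,
			       int maps_shndx)
{
	size_t i;

	for (i = 0; i < nrels; i++) {
		if (rels[i].sym_idx >= nsyms)
			return -1;	/* malformed relocation */
		if (maps_shndx < 0 ||
		    syms[rels[i].sym_idx].shndx != (unsigned int)maps_shndx)
			return -1;	/* not a map: refuse to relocate */
	}
	return 0;
}
```

In the verbose output of the test, the rejected symbol points to
section 65522, which is 0xfff2, the reserved SHN_COMMON index that
ELF uses for uninitialized global variables. It therefore can never
equal the index of the maps section.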

The previous patch introduced a test case for this problem.
After this patch:

 # ~/perf test BPF
 37: Test BPF filter                                          :
 37.1: Test basic BPF filtering                               : Ok
 37.2: Test BPF prologue generation                           : Ok
 37.3: Test BPF relocation checker                            : Ok

 # perf test -v BPF
 ...
 37.3: Test BPF relocation checker                            :
 ...
 libbpf: loading object '[bpf_relocation_test]' from buffer
 libbpf: section .strtab, size 126, link 0, flags 0, type=3
 libbpf: section .text, size 0, link 0, flags 6, type=1
 libbpf: section .data, size 0, link 0, flags 3, type=1
 libbpf: section .bss, size 0, link 0, flags 3, type=8
 libbpf: section func=sys_write, size 104, link 0, flags 6, type=1
 libbpf: found program func=sys_write
 libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9
 libbpf: section maps, size 16, link 0, flags 3, type=1
 libbpf: maps in [bpf_relocation_test]: 16 bytes
 libbpf: section license, size 4, link 0, flags 3, type=1
 libbpf: license of [bpf_relocation_test] is GPL
 libbpf: section version, size 4, link 0, flags 3, type=1
 libbpf: kernel version of [bpf_relocation_test] is 40400
 libbpf: section .symtab, size 144, link 1, flags 0, type=2
 libbpf: map 0 is "my_table"
 libbpf: collecting relocating info for: 'func=sys_write'
 libbpf: Program 'func=sys_write' contains non-map related relo data pointing to section 65522
 bpf: failed to load buffer
 Compile BPF program failed.
 test child finished with 0
 ---- end ----
 Test BPF filter subtest 2: Ok

[1] https://llvm.org/bugs/show_bug.cgi?id=26243

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/lib/bpf/libbpf.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 8334a5a..7e543c3 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -201,6 +201,7 @@ struct bpf_object {
 			Elf_Data *data;
 		} *reloc;
 		int nr_reloc;
+		int maps_shndx;
 	} efile;
 	/*
 	 * All loaded bpf_object is linked in a list, which is
@@ -350,6 +351,7 @@ static struct bpf_object *bpf_object__new(const char *path,
 	 */
 	obj->efile.obj_buf = obj_buf;
 	obj->efile.obj_buf_sz = obj_buf_sz;
+	obj->efile.maps_shndx = -1;
 
 	obj->loaded = false;
 
@@ -529,12 +531,12 @@ bpf_object__init_maps(struct bpf_object *obj, void *data,
 }
 
 static int
-bpf_object__init_maps_name(struct bpf_object *obj, int maps_shndx)
+bpf_object__init_maps_name(struct bpf_object *obj)
 {
 	int i;
 	Elf_Data *symbols = obj->efile.symbols;
 
-	if (!symbols || maps_shndx < 0)
+	if (!symbols || obj->efile.maps_shndx < 0)
 		return -EINVAL;
 
 	for (i = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) {
@@ -544,7 +546,7 @@ bpf_object__init_maps_name(struct bpf_object *obj, int maps_shndx)
 
 		if (!gelf_getsym(symbols, i, &sym))
 			continue;
-		if (sym.st_shndx != maps_shndx)
+		if (sym.st_shndx != obj->efile.maps_shndx)
 			continue;
 
 		map_name = elf_strptr(obj->efile.elf,
@@ -572,7 +574,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 	Elf *elf = obj->efile.elf;
 	GElf_Ehdr *ep = &obj->efile.ehdr;
 	Elf_Scn *scn = NULL;
-	int idx = 0, err = 0, maps_shndx = -1;
+	int idx = 0, err = 0;
 
 	/* Elf is corrupted/truncated, avoid calling elf_strptr. */
 	if (!elf_rawdata(elf_getscn(elf, ep->e_shstrndx), NULL)) {
@@ -625,7 +627,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 		else if (strcmp(name, "maps") == 0) {
 			err = bpf_object__init_maps(obj, data->d_buf,
 						    data->d_size);
-			maps_shndx = idx;
+			obj->efile.maps_shndx = idx;
 		} else if (sh.sh_type == SHT_SYMTAB) {
 			if (obj->efile.symbols) {
 				pr_warning("bpf: multiple SYMTAB in %s\n",
@@ -674,8 +676,8 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 		pr_warning("Corrupted ELF file: index of strtab invalid\n");
 		return LIBBPF_ERRNO__FORMAT;
 	}
-	if (maps_shndx >= 0)
-		err = bpf_object__init_maps_name(obj, maps_shndx);
+	if (obj->efile.maps_shndx >= 0)
+		err = bpf_object__init_maps_name(obj);
 out:
 	return err;
 }
@@ -697,7 +699,8 @@ bpf_object__find_prog_by_idx(struct bpf_object *obj, int idx)
 static int
 bpf_program__collect_reloc(struct bpf_program *prog,
 			   size_t nr_maps, GElf_Shdr *shdr,
-			   Elf_Data *data, Elf_Data *symbols)
+			   Elf_Data *data, Elf_Data *symbols,
+			   int maps_shndx)
 {
 	int i, nrels;
 
@@ -724,9 +727,6 @@ bpf_program__collect_reloc(struct bpf_program *prog,
 			return -LIBBPF_ERRNO__FORMAT;
 		}
 
-		insn_idx = rel.r_offset / sizeof(struct bpf_insn);
-		pr_debug("relocation: insn_idx=%u\n", insn_idx);
-
 		if (!gelf_getsym(symbols,
 				 GELF_R_SYM(rel.r_info),
 				 &sym)) {
@@ -735,6 +735,15 @@ bpf_program__collect_reloc(struct bpf_program *prog,
 			return -LIBBPF_ERRNO__FORMAT;
 		}
 
+		if (sym.st_shndx != maps_shndx) {
+			pr_warning("Program '%s' contains non-map related relo data pointing to section %u\n",
+				   prog->section_name, sym.st_shndx);
+			return -LIBBPF_ERRNO__RELOC;
+		}
+
+		insn_idx = rel.r_offset / sizeof(struct bpf_insn);
+		pr_debug("relocation: insn_idx=%u\n", insn_idx);
+
 		if (insns[insn_idx].code != (BPF_LD | BPF_IMM | BPF_DW)) {
 			pr_warning("bpf: relocation: invalid relo for insns[%d].code 0x%x\n",
 				   insn_idx, insns[insn_idx].code);
@@ -863,7 +872,8 @@ static int bpf_object__collect_reloc(struct bpf_object *obj)
 
 		err = bpf_program__collect_reloc(prog, nr_maps,
 						 shdr, data,
-						 obj->efile.symbols);
+						 obj->efile.symbols,
+						 obj->efile.maps_shndx);
 		if (err)
 			return err;
 	}
-- 
1.8.3.4


* [PATCH 03/54] tools build: Allow subprojects select all feature checkers
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
  2016-01-25  9:55 ` [PATCH 01/54] perf test: Add libbpf relocation checker Wang Nan
  2016-01-25  9:55 ` [PATCH 02/54] perf bpf: Check relocation target section Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-02-03 10:14   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-01-25  9:55 ` [PATCH 04/54] perf build: Select all feature checkers for feature-dump Wang Nan
                   ` (51 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Put the feature checkers that are not in the original FEATURE_TESTS
into a new list, and allow subprojects to select all feature checkers
by setting FEATURE_TESTS to 'all'.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/build/Makefile.feature | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 02db3cd..674c47d 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -27,7 +27,7 @@ endef
 #   the rule that uses them - an example for that is the 'bionic'
 #   feature check. ]
 #
-FEATURE_TESTS ?=			\
+FEATURE_TESTS_BASIC :=			\
 	backtrace			\
 	dwarf				\
 	fortify-source			\
@@ -56,6 +56,25 @@ FEATURE_TESTS ?=			\
 	get_cpuid			\
 	bpf
 
+# FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
+# of all feature tests
+FEATURE_TESTS_EXTRA :=			\
+	bionic				\
+	compile-32			\
+	compile-x32			\
+	cplus-demangle			\
+	hello				\
+	libbabeltrace			\
+	liberty				\
+	liberty-z			\
+	libunwind-debug-frame
+
+FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)
+
+ifeq ($(FEATURE_TESTS),all)
+  FEATURE_TESTS := $(FEATURE_TESTS_BASIC) $(FEATURE_TESTS_EXTRA)
+endif
+
 FEATURE_DISPLAY ?=			\
 	dwarf				\
 	glibc				\
-- 
1.8.3.4


* [PATCH 04/54] perf build: Select all feature checkers for feature-dump
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (2 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 03/54] tools build: Allow subprojects select all feature checkers Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-02-03 10:14   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-01-25  9:55 ` [PATCH 05/54] perf build: Use feature dump file for build-test Wang Nan
                   ` (50 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Set FEATURE_TESTS to 'all' so that all possible feature checkers are
executed. Without this setting, the output feature dump file misses
some features, for example liberty. Select all checkers so we won't
get an incomplete feature dump file.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Makefile.perf | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 97ce869..4d2c0e4 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -165,7 +165,16 @@ ifeq ($(filter-out $(NON_CONFIG_TARGETS),$(MAKECMDGOALS)),)
 endif
 endif
 
+# Set FEATURE_TESTS to 'all' so all possible feature checkers are
+# executed. Without this setting the output feature dump file miss
+# some feature, for example, liberity. Select all checker so we won't
+# get an incomplete feature dump file.
 ifeq ($(config),1)
+ifdef MAKECMDGOALS
+ifeq ($(filter feature-dump,$(MAKECMDGOALS)),feature-dump)
+FEATURE_TESTS := all
+endif
+endif
 include config/Makefile
 endif
 
-- 
1.8.3.4


* [PATCH 05/54] perf build: Use feature dump file for build-test
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (3 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 04/54] perf build: Select all feature checkers for feature-dump Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-01-26 16:59   ` Arnaldo Carvalho de Melo
  2016-01-25  9:55 ` [PATCH 06/54] perf test: Check environment before start real BPF test Wang Nan
                   ` (49 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

To prevent the feature checks from running too many times, this patch
uses the previously introduced feature-dump make target and the
FEATURES_DUMP variable to make sure the feature checkers run only
once when doing build-test for normal test cases.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/make | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index f918015..b8c86bd 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -15,6 +15,7 @@ else
 PERF := .
 PERF_O := $(PERF)
 O_OPT :=
+FULL_O := $(shell readlink -f $(PERF_O) || echo $(PERF_O))
 
 ifneq ($(O),)
   FULL_O := $(shell readlink -f $(O) || echo $(O))
@@ -313,11 +314,41 @@ make_kernelsrc_tools:
 	(make -C ../../tools $(PARALLEL_OPT) $(K_O_OPT) perf) > $@ 2>&1 && \
 	test -x $(KERNEL_O)/tools/perf/perf && rm -f $@ || (cat $@ ; false)
 
+FEATURES_DUMP_FILE := $(FULL_O)/BUILD_TEST_FEATURE_DUMP
+FEATURES_DUMP_FILE_STATIC := $(FULL_O)/BUILD_TEST_FEATURE_DUMP_STATIC
+
 all: $(run) $(run_O) tarpkg make_kernelsrc make_kernelsrc_tools
 	@echo OK
+	@rm -f $(FEATURES_DUMP_FILE) $(FEATURES_DUMP_FILE_STATIC)
 
 out: $(run_O)
 	@echo OK
+	@rm -f $(FEATURES_DUMP_FILE) $(FEATURES_DUMP_FILE_STATIC)
+
+$(FEATURES_DUMP_FILE):
+	$(call clean)
+	@cmd="cd $(PERF) && make FEATURE_DUMP_COPY=$@ $(O_OPT) feature-dump"; \
+	echo "- $@: $$cmd" && echo $$cmd && \
+	( eval $$cmd ) > /dev/null 2>&1
+
+$(FEATURES_DUMP_FILE_STATIC):
+	$(call clean)
+	@cmd="cd $(PERF) && make FEATURE_DUMP_COPY=$@ $(O_OPT) LDFLAGS='-static' feature-dump"; \
+	echo "- $@: $$cmd" && echo $$cmd && \
+	( eval $$cmd ) > /dev/null 2>&1
+
+# Add feature dump dependency for run/run_O targets
+$(foreach t,$(run) $(run_O),$(eval \
+	$(t): $(if $(findstring make_static,$(t)),\
+		$(FEATURES_DUMP_FILE_STATIC),\
+		$(FEATURES_DUMP_FILE))))
+
+# Append 'FEATURES_DUMP=' option to all test cases. For example:
+# make_no_libbpf: NO_LIBBPF=1  --> NO_LIBBPF=1 FEATURES_DUMP=/a/b/BUILD_TEST_FEATURE_DUMP
+# make_static: LDFLAGS=-static --> LDFLAGS=-static FEATURES_DUMP=/a/b/BUILD_TEST_FEATURE_DUMP_STATIC
+$(foreach t,$(run),$(if $(findstring make_static,$(t)),\
+			$(eval $(t) := $($(t)) FEATURES_DUMP=$(FEATURES_DUMP_FILE_STATIC)),\
+			$(eval $(t) := $($(t)) FEATURES_DUMP=$(FEATURES_DUMP_FILE))))
 
 .PHONY: all $(run) $(run_O) tarpkg clean make_kernelsrc make_kernelsrc_tools
 endif # ifndef MK
-- 
1.8.3.4


* [PATCH 06/54] perf test: Check environment before start real BPF test
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (4 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 05/54] perf build: Use feature dump file for build-test Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-02-03 10:18   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-01-25  9:55 ` [PATCH 07/54] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
                   ` (48 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Copying perf to a system with an old kernel results in:

 # perf test bpf
 37: Test BPF filter                                          :
 37.1: Test basic BPF filtering                               : FAILED!
 37.2: Test BPF prologue generation                           : Skip

However, when the kernel doesn't support a test case, the test should
return 'Skip'. 'FAILED!' should be reserved for cases where the kernel
supports a feature that then fails to work as advertised.

This patch checks the environment before running the real test case.
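
The probe-then-run idea can be sketched in isolation as follows. This
is only an illustration: the sketch_result values, run_with_probe()
and the probe/body helpers are hypothetical names and are not part of
perf; the real check is check_env() in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Result codes local to this sketch (hypothetical, not perf's enum). */
enum sketch_result { SKETCH_OK = 0, SKETCH_FAIL = -1, SKETCH_SKIP = -2 };

typedef bool (*probe_fn)(void);	/* true if the environment supports the feature */
typedef int (*body_fn)(void);	/* 0 on success, nonzero on failure */

/*
 * Run the probe first. A missing feature yields SKIP, so FAIL is only
 * reported when the feature is present but the test body misbehaves.
 */
static int run_with_probe(probe_fn env_ok, body_fn body)
{
	assert(env_ok != NULL && body != NULL);

	if (!env_ok())
		return SKETCH_SKIP;

	return body() == 0 ? SKETCH_OK : SKETCH_FAIL;
}

/* Hypothetical probes and test bodies used to exercise the runner. */
static bool probe_missing(void) { return false; }
static bool probe_present(void) { return true; }
static int body_good(void) { return 0; }
static int body_bad(void) { return -1; }
```

With these helpers, run_with_probe(probe_missing, body_bad) reports a
skip rather than a failure, which is the behaviour wanted for old
kernels.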

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/bpf.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 952ca99..4aed5cb 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -1,7 +1,11 @@
 #include <stdio.h>
 #include <sys/epoll.h>
+#include <util/util.h>
 #include <util/bpf-loader.h>
 #include <util/evlist.h>
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <bpf/bpf.h>
 #include "tests.h"
 #include "llvm.h"
 #include "debug.h"
@@ -243,6 +247,36 @@ const char *test__bpf_subtest_get_desc(int i)
 	return bpf_testcase_table[i].desc;
 }
 
+static int check_env(void)
+{
+	int err;
+	unsigned int kver_int;
+	char license[] = "GPL";
+
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+	};
+
+	err = fetch_kernel_version(&kver_int, NULL, 0);
+	if (err) {
+		pr_debug("Unable to get kernel version\n");
+		return err;
+	}
+
+	err = bpf_load_program(BPF_PROG_TYPE_KPROBE, insns,
+			       sizeof(insns) / sizeof(insns[0]),
+			       license, kver_int, NULL, 0);
+	if (err < 0) {
+		pr_err("Missing basic BPF support, skip this test: %s\n",
+		       strerror(errno));
+		return err;
+	}
+	close(err);
+
+	return 0;
+}
+
 int test__bpf(int i)
 {
 	int err;
@@ -255,6 +289,9 @@ int test__bpf(int i)
 		return TEST_SKIP;
 	}
 
+	if (check_env())
+		return TEST_SKIP;
+
 	err = __test__bpf(i);
 	return err;
 }
-- 
1.8.3.4


* [PATCH 07/54] perf tools: Fix symbols searching for offline module in buildid-cache
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (5 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 06/54] perf test: Check environment before start real BPF test Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-01-25  9:55 ` [PATCH 08/54] perf test: Improve bp_signal Wang Nan
                   ` (47 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Before this patch, if a sample is triggered inside an offline module
(a module not in /lib/modules/`uname -r`/), 'perf report' is unable to
get the correct symbol even if the module is in the buildid-cache.
For example:

 # rm -rf ~/.debug/
 # perf buildid-cache -a ./mymodule.ko
 # perf probe -m ./mymodule.ko -a get_mymodule_val
 Added new event:
   probe:get_mymodule_val (on get_mymodule_val in mymodule)

 You can now use it in all perf tools, such as:

 	perf record -e probe:get_mymodule_val -aR sleep 1

 # perf record -e probe:get_mymodule_val cat /proc/mymodule
 mymodule:3
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]

 # perf report --stdio
 [SNIP]
 #
 # Overhead  Command  Shared Object     Symbol
 # ........  .......  ................  ......................
 #
    100.00%  cat      [mymodule]        [k] 0x0000000000000001

 # perf report -vvvv --stdio
 dso__load_sym: adjusting symbol: st_value: 0 sh_addr: 0 sh_offset: 0x70
 symbol__new: get_mymodule_val 0x70-0x8a
 [SNIP]

This is caused by dso__load() -> dso__load_sym(). In dso__load(), kmod
is true only when the dso is a regular kernel module. All files loaded
from the buildid-cache are treated as user programs, so the following
dso__load_sym() sets map->pgoff incorrectly.

This patch gives kernel modules in the buildid-cache a chance to adjust
the value of kmod. After dso__load() gets the type of the symbols, if
the type is buildid, check the last 3 chars of the original filename
against '.ko', and adjust the value of kmod if the file is a kernel
module.
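
The suffix check itself can be sketched as a standalone function. This
is an illustration only: link_target_is_kmod() is a hypothetical name,
and the sketch assumes the symlink target has already been read into a
NUL-terminated string, which the real dso__build_id_is_kmod() in the
diff below does via readlink().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if the path component before the
 * last '/' ends in ".ko". A buildid-cache link target is expected to
 * look like ".../nf_nat_ipv4.ko/<build-id>".
 */
static bool link_target_is_kmod(const char *target)
{
	const char *slash;

	assert(target != NULL);

	slash = strrchr(target, '/');
	/* Need at least 3 chars before the slash to hold ".ko". */
	if (!slash || slash - target < 3)
		return false;

	return strncmp(slash - 3, ".ko", 3) == 0;
}
```

Only the last path separator matters here, so a ".ko" elsewhere in
the path does not cause a false match.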

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
---
 tools/perf/util/build-id.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/build-id.h |  1 +
 tools/perf/util/symbol.c   |  4 ++++
 3 files changed, 49 insertions(+)

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 6a7e273..6b18082 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -166,6 +166,50 @@ char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size)
 	return build_id__filename(build_id_hex, bf, size);
 }
 
+bool dso__build_id_is_kmod(const struct dso *dso, char *bf, size_t size)
+{
+	char *id_name, *ch;
+	struct stat sb;
+
+	id_name = dso__build_id_filename(dso, bf, size);
+	if (!id_name)
+		goto err;
+	if (access(id_name, F_OK))
+		goto err;
+	if (lstat(id_name, &sb) == -1)
+		goto err;
+	if ((size_t)sb.st_size > size - 1)
+		goto err;
+	if (readlink(id_name, bf, size - 1) < 0)
+		goto err;
+
+	bf[sb.st_size] = '\0';
+
+	/*
+	 * link should be:
+	 * ../../lib/modules/4.4.0-rc4/kernel/net/ipv4/netfilter/nf_nat_ipv4.ko/a09fe3eb3147dafa4e3b31dbd6257e4d696bdc92
+	 */
+	ch = strrchr(bf, '/');
+	if (!ch)
+		goto err;
+	if (ch - 3 < bf)
+		goto err;
+
+	return strncmp(".ko", ch - 3, 3) == 0;
+err:
+	/*
+	 * If dso__build_id_filename() worked, get id_name again,
+	 * because id_name points to bf, which has been overwritten.
+	 */
+	if (id_name)
+		id_name = dso__build_id_filename(dso, bf, size);
+	pr_err("Invalid build id: %s\n", id_name ? :
+					 dso->long_name ? :
+					 dso->short_name ? :
+					 "[unknown]");
+	return false;
+}
+
 #define dsos__for_each_with_build_id(pos, head)	\
 	list_for_each_entry(pos, head, node)	\
 		if (!pos->has_build_id)		\
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 27a14a8..64af3e2 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -16,6 +16,7 @@ int sysfs__sprintf_build_id(const char *root_dir, char *sbuild_id);
 int filename__sprintf_build_id(const char *pathname, char *sbuild_id);
 
 char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size);
+bool dso__build_id_is_kmod(const struct dso *dso, char *bf, size_t size);
 
 int build_id__mark_dso_hit(struct perf_tool *tool, union perf_event *event,
 			   struct perf_sample *sample, struct perf_evsel *evsel,
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 90cedfa..e7588dc 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1529,6 +1529,10 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 	if (!runtime_ss && syms_ss)
 		runtime_ss = syms_ss;
 
+	if (syms_ss && syms_ss->type == DSO_BINARY_TYPE__BUILD_ID_CACHE)
+		if (dso__build_id_is_kmod(dso, name, PATH_MAX))
+			kmod = true;
+
 	if (syms_ss)
 		ret = dso__load_sym(dso, map, syms_ss, runtime_ss, filter, kmod);
 	else
-- 
1.8.3.4


* [PATCH 08/54] perf test: Improve bp_signal
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (6 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 07/54] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-02-03 10:18   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-01-25  9:55 ` [PATCH 09/54] perf tools: Add API to config maps in bpf object Wang Nan
                   ` (46 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Will Deacon [1] has some questions on patch [2]. This patch improves
test__bp_signal so we can test:

 1. A watchpoint and a breakpoint that fire on the same instruction
 2. Nested signals

Test result:

 On x86_64 and ARM64 (results are similar with patch [2] on ARM64):

 # ./perf test -v signal
 17: Test breakpoint overflow signal handler                  :
 --- start ---
 test child forked, pid 10213
 count1 1, count2 3, count3 2, overflow 3, overflows_2 3
 test child finished with 0
 ---- end ----
 Test breakpoint overflow signal handler: Ok

So at least the 2 cases Will doubted are handled correctly.

[1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
[2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/bp_signal.c | 140 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 118 insertions(+), 22 deletions(-)

diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index fb80c9e..1d1bb48 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -29,14 +29,59 @@
 
 static int fd1;
 static int fd2;
+static int fd3;
 static int overflows;
+static int overflows_2;
+
+volatile long the_var;
+
+
+/*
+ * Use ASM to ensure watchpoint and breakpoint can be triggered
+ * at one instruction.
+ */
+#if defined (__x86_64__)
+extern void __test_function(volatile long *ptr);
+asm (
+	".globl __test_function\n"
+	"__test_function:\n"
+	"incq (%rdi)\n"
+	"ret\n");
+#elif defined (__aarch64__)
+extern void __test_function(volatile long *ptr);
+asm (
+	".globl __test_function\n"
+	"__test_function:\n"
+	"str x30, [x0]\n"
+	"ret\n");
+
+#else
+static void __test_function(volatile long *ptr)
+{
+	*ptr = 0x1234;
+}
+#endif
 
 __attribute__ ((noinline))
 static int test_function(void)
 {
+	__test_function(&the_var);
+	the_var++;
 	return time(NULL);
 }
 
+static void sig_handler_2(int signum __maybe_unused,
+			  siginfo_t *oh __maybe_unused,
+			  void *uc __maybe_unused)
+{
+	overflows_2++;
+	if (overflows_2 > 10) {
+		ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
+		ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+		ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
+	}
+}
+
 static void sig_handler(int signum __maybe_unused,
 			siginfo_t *oh __maybe_unused,
 			void *uc __maybe_unused)
@@ -54,10 +99,11 @@ static void sig_handler(int signum __maybe_unused,
 		 */
 		ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
 		ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+		ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
 	}
 }
 
-static int bp_event(void *fn, int setup_signal)
+static int __event(bool is_x, void *addr, int signal)
 {
 	struct perf_event_attr pe;
 	int fd;
@@ -67,8 +113,8 @@ static int bp_event(void *fn, int setup_signal)
 	pe.size = sizeof(struct perf_event_attr);
 
 	pe.config = 0;
-	pe.bp_type = HW_BREAKPOINT_X;
-	pe.bp_addr = (unsigned long) fn;
+	pe.bp_type = is_x ? HW_BREAKPOINT_X : HW_BREAKPOINT_W;
+	pe.bp_addr = (unsigned long) addr;
 	pe.bp_len = sizeof(long);
 
 	pe.sample_period = 1;
@@ -86,17 +132,25 @@ static int bp_event(void *fn, int setup_signal)
 		return TEST_FAIL;
 	}
 
-	if (setup_signal) {
-		fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
-		fcntl(fd, F_SETSIG, SIGIO);
-		fcntl(fd, F_SETOWN, getpid());
-	}
+	fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
+	fcntl(fd, F_SETSIG, signal);
+	fcntl(fd, F_SETOWN, getpid());
 
 	ioctl(fd, PERF_EVENT_IOC_RESET, 0);
 
 	return fd;
 }
 
+static int bp_event(void *addr, int signal)
+{
+	return __event(true, addr, signal);
+}
+
+static int wp_event(void *addr, int signal)
+{
+	return __event(false, addr, signal);
+}
+
 static long long bp_count(int fd)
 {
 	long long count;
@@ -114,7 +168,7 @@ static long long bp_count(int fd)
 int test__bp_signal(int subtest __maybe_unused)
 {
 	struct sigaction sa;
-	long long count1, count2;
+	long long count1, count2, count3;
 
 	/* setup SIGIO signal handler */
 	memset(&sa, 0, sizeof(struct sigaction));
@@ -126,21 +180,52 @@ int test__bp_signal(int subtest __maybe_unused)
 		return TEST_FAIL;
 	}
 
+	sa.sa_sigaction = (void *) sig_handler_2;
+	if (sigaction(SIGUSR1, &sa, NULL) < 0) {
+		pr_debug("failed setting up signal handler 2\n");
+		return TEST_FAIL;
+	}
+
 	/*
 	 * We create following events:
 	 *
-	 * fd1 - breakpoint event on test_function with SIGIO
+	 * fd1 - breakpoint event on __test_function with SIGIO
 	 *       signal configured. We should get signal
 	 *       notification each time the breakpoint is hit
 	 *
-	 * fd2 - breakpoint event on sig_handler without SIGIO
+	 * fd2 - breakpoint event on sig_handler with SIGUSR1
+	 *       configured. We should get SIGUSR1 each time when
+	 *       breakpoint is hit
+	 *
+	 * fd3 - watchpoint event on __test_function with SIGIO
 	 *       configured.
 	 *
 	 * Following processing should happen:
-	 *   - execute test_function
-	 *   - fd1 event breakpoint hit -> count1 == 1
-	 *   - SIGIO is delivered       -> overflows == 1
-	 *   - fd2 event breakpoint hit -> count2 == 1
+	 *   Exec:               Action:                       Result:
+	 *   incq (%rdi)       - fd1 event breakpoint hit   -> count1 == 1
+	 *                     - SIGIO is delivered
+	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 1
+	 *                     - SIGUSR1 is delivered
+	 *   sig_handler_2                                  -> overflows_2 == 1  (nested signal)
+	 *   sys_rt_sigreturn  - return from sig_handler_2
+	 *   overflows++                                    -> overflows = 1
+	 *   sys_rt_sigreturn  - return from sig_handler
+	 *   incq (%rdi)       - fd3 event watchpoint hit   -> count3 == 1       (wp and bp in one insn)
+	 *                     - SIGIO is delivered
+	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 2
+	 *                     - SIGUSR1 is delivered
+	 *   sig_handler_2                                  -> overflows_2 == 2  (nested signal)
+	 *   sys_rt_sigreturn  - return from sig_handler_2
+	 *   overflows++                                    -> overflows = 2
+	 *   sys_rt_sigreturn  - return from sig_handler
+	 *   the_var++         - fd3 event watchpoint hit   -> count3 == 2       (standalone watchpoint)
+	 *                     - SIGIO is delivered
+	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 3
+	 *                     - SIGUSR1 is delivered
+	 *   sig_handler_2                                  -> overflows_2 == 3  (nested signal)
+	 *   sys_rt_sigreturn  - return from sig_handler_2
+	 *   overflows++                                    -> overflows == 3
+	 *   sys_rt_sigreturn  - return from sig_handler
 	 *
 	 * The test case check following error conditions:
 	 * - we get stuck in signal handler because of debug
@@ -152,11 +237,13 @@ int test__bp_signal(int subtest __maybe_unused)
 	 *
 	 */
 
-	fd1 = bp_event(test_function, 1);
-	fd2 = bp_event(sig_handler, 0);
+	fd1 = bp_event(__test_function, SIGIO);
+	fd2 = bp_event(sig_handler, SIGUSR1);
+	fd3 = wp_event((void *)&the_var, SIGIO);
 
 	ioctl(fd1, PERF_EVENT_IOC_ENABLE, 0);
 	ioctl(fd2, PERF_EVENT_IOC_ENABLE, 0);
+	ioctl(fd3, PERF_EVENT_IOC_ENABLE, 0);
 
 	/*
 	 * Kick off the test by trigering 'fd1'
@@ -166,15 +253,18 @@ int test__bp_signal(int subtest __maybe_unused)
 
 	ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
 	ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+	ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
 
 	count1 = bp_count(fd1);
 	count2 = bp_count(fd2);
+	count3 = bp_count(fd3);
 
 	close(fd1);
 	close(fd2);
+	close(fd3);
 
-	pr_debug("count1 %lld, count2 %lld, overflow %d\n",
-		 count1, count2, overflows);
+	pr_debug("count1 %lld, count2 %lld, count3 %lld, overflow %d, overflows_2 %d\n",
+		 count1, count2, count3, overflows, overflows_2);
 
 	if (count1 != 1) {
 		if (count1 == 11)
@@ -183,12 +273,18 @@ int test__bp_signal(int subtest __maybe_unused)
 			pr_debug("failed: wrong count for bp1%lld\n", count1);
 	}
 
-	if (overflows != 1)
+	if (overflows != 3)
 		pr_debug("failed: wrong overflow hit\n");
 
-	if (count2 != 1)
+	if (overflows_2 != 3)
+		pr_debug("failed: wrong overflow_2 hit\n");
+
+	if (count2 != 3)
 		pr_debug("failed: wrong count for bp2\n");
 
-	return count1 == 1 && overflows == 1 && count2 == 1 ?
+	if (count3 != 2)
+		pr_debug("failed: wrong count for bp3\n");
+
+	return count1 == 1 && overflows == 3 && count2 == 3 && overflows_2 == 3 && count3 == 2 ?
 		TEST_OK : TEST_FAIL;
 }
-- 
1.8.3.4


* [PATCH 09/54] perf tools: Add API to config maps in bpf object
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (7 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 08/54] perf test: Improve bp_signal Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-02-03 23:29   ` Arnaldo Carvalho de Melo
  2016-01-25  9:55 ` [PATCH 10/54] perf tools: Enable BPF object configure syntax Wang Nan
                   ` (45 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

bpf__config_obj() is introduced as a core API to configure a BPF object
after loading. One configuration option for maps is introduced. After
this patch a BPF object can accept configuration like:

 maps:my_map.value=1234

(maps.my_map.value looks pretty. However, there's a small but
hard-to-fix problem related to flex's greedy matching. Please see [1].
':' is chosen to avoid it in a simpler way.)
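
How such a key breaks down into a map name and a config option can be
sketched as below. This is only an illustration: split_map_key() is a
hypothetical helper, not the code in this patch, which does the split
in place on a strdup()'d copy inside bpf__obj_config_map(). Rejecting
an empty map name is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split "maps:<mapname>.<opt>" into its two
 * parts. Returns 0 on success, -1 if the key is malformed or a
 * destination buffer is too small.
 */
static int split_map_key(const char *key, char *name, size_t name_sz,
			 char *opt, size_t opt_sz)
{
	static const char prefix[] = "maps:";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *rest, *dot;
	size_t name_len, opt_len;

	assert(key != NULL && name != NULL && opt != NULL);

	if (strncmp(key, prefix, prefix_len) != 0)
		return -1;

	rest = key + prefix_len;
	dot = strchr(rest, '.');	/* first '.' ends the map name */
	if (!dot)
		return -1;

	name_len = (size_t)(dot - rest);
	opt_len = strlen(dot + 1);
	if (name_len == 0 || opt_len == 0)
		return -1;
	if (name_len >= name_sz || opt_len >= opt_sz)
		return -1;

	memcpy(name, rest, name_len);
	name[name_len] = '\0';
	memcpy(opt, dot + 1, opt_len + 1);	/* copies the trailing NUL too */
	return 0;
}
```

For the key "maps:my_map.value" this yields the name "my_map" and the
option "value".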

This patch is more complex than the work it actually does because it
is designed with extension in mind. In designing BPF map configuration,
the following things should be considered:

 1. Array indices selection: perf should allow the user to set
    different values for different slots in an array, with syntax like:
    maps:my_map.value[0,3...6]=1234;

 2. A map can be configured by different config terms, each for a part
    of it. For example, set each slot to the pid of a thread;

 3. Type of value: integer is not the only valid value type. A perf
    event can also be put into a map after commit 35578d7984003097af2b1e3
    (bpf: Implement function bpf_perf_event_read() that get the selected
    hardware PMU conuter);

 4. For a hash table, it is possible to use a string or another type
    as the key;

 5. It is possible that a map configuration cannot be set up
    during parsing. A perf event is an example.

Therefore, this patch does the following:

 1. Instead of updating map elements during parsing, this patch stores
    map config options in 'struct bpf_map_priv'. Following patches
    will apply those configs at the proper time;

 2. Link map operations into a list so a map can have multiple config
    terms attached, so different parts can be configured separately;

 3. Make 'struct bpf_map_priv' extensible so following patches can
    add new types of keys and operations;

 4. Use the bpf_obj_config_map_funcs array to support more map config
    options.
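
The "record now, apply later" design in points 1 and 2 can be sketched
with a minimal queue of operations. Everything here is hypothetical
(sketch_map, sketch_op and the fixed 4-slot array stand in for a real
BPF array map); the patch itself uses list.h and 'struct bpf_map_op',
and leaves the apply step to following patches.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum sketch_op_type { SKETCH_OP_SET_VALUE };

/* One queued config term (hypothetical analogue of struct bpf_map_op). */
struct sketch_op {
	struct sketch_op *next;
	enum sketch_op_type type;
	uint64_t value;
};

/* A toy map: four slots plus the queue of pending operations. */
struct sketch_map {
	struct sketch_op *head, **tail;
	uint64_t slots[4];
};

static void sketch_map_init(struct sketch_map *m)
{
	assert(m != NULL);
	m->head = NULL;
	m->tail = &m->head;
	for (size_t i = 0; i < 4; i++)
		m->slots[i] = 0;
}

/* Parsing time: only record the operation, do not touch the slots. */
static int sketch_map_queue_set(struct sketch_map *m, uint64_t value)
{
	struct sketch_op *op = calloc(1, sizeof(*op));

	if (!op)
		return -1;
	op->type = SKETCH_OP_SET_VALUE;
	op->value = value;
	*m->tail = op;		/* append so ops apply in the order given */
	m->tail = &op->next;
	return 0;
}

/* Later: apply every queued operation to all slots, then free the queue. */
static void sketch_map_apply(struct sketch_map *m)
{
	struct sketch_op *op, *next;

	for (op = m->head; op; op = next) {
		next = op->next;
		if (op->type == SKETCH_OP_SET_VALUE)
			for (size_t i = 0; i < 4; i++)
				m->slots[i] = op->value;
		free(op);
	}
	m->head = NULL;
	m->tail = &m->head;
}
```

Queued operations leave the slots untouched until sketch_map_apply()
runs, and later terms override earlier ones because the queue is
applied in insertion order.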

Since the patch that changes the event parser to parse BPF object
config is relatively large, I put it in another commit. Code in this
patch can be tested after applying the next patch.

[1] http://lkml.kernel.org/g/564ED621.4050500@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c | 266 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h |  38 +++++++
 2 files changed, 304 insertions(+)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 540a7ef..7d361aa 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -739,6 +739,251 @@ int bpf__foreach_tev(struct bpf_object *obj,
 	return 0;
 }
 
+enum bpf_map_op_type {
+	BPF_MAP_OP_SET_VALUE,
+};
+
+enum bpf_map_key_type {
+	BPF_MAP_KEY_ALL,
+};
+
+struct bpf_map_op {
+	struct list_head list;
+	enum bpf_map_op_type op_type;
+	enum bpf_map_key_type key_type;
+	union {
+		u64 value;
+	} v;
+};
+
+struct bpf_map_priv {
+	struct list_head ops_list;
+};
+
+static void
+bpf_map_op__free(struct bpf_map_op *op)
+{
+	struct list_head *list = &op->list;
+	/*
+	 * bpf_map_op__free() needs to consider the following cases:
+	 *   1. When the op is created but not linked to any list:
+	 *      impossible. This only happens in bpf_map_op__alloc()
+	 *      and it would be freed directly;
+	 *   2. Normal case, when the op is linked to a list;
+	 *   3. After the op has already been removed.
+	 * Thanks to list.h, if it has been removed by list_del() then
+	 * list->{next,prev} should have been set to LIST_POISON{1,2}.
+	 */
+	if ((list->next != LIST_POISON1) && (list->prev != LIST_POISON2))
+		list_del(list);
+	free(op);
+}
+
+static void
+bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
+		    void *_priv)
+{
+	struct bpf_map_priv *priv = _priv;
+	struct bpf_map_op *pos, *n;
+
+	list_for_each_entry_safe(pos, n, &priv->ops_list, list)
+		bpf_map_op__free(pos);
+	free(priv);
+}
+
+static struct bpf_map_op *
+bpf_map_op__alloc(struct bpf_map *map)
+{
+	struct bpf_map_op *op;
+	struct bpf_map_priv *priv;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+	err = bpf_map__get_private(map, (void **)&priv);
+	if (err) {
+		pr_debug("Failed to get private from map %s\n", map_name);
+		return ERR_PTR(err);
+	}
+
+	if (!priv) {
+		priv = zalloc(sizeof(*priv));
+		if (!priv) {
+			pr_debug("No enough memory to alloc map private\n");
+			return ERR_PTR(-ENOMEM);
+		}
+		INIT_LIST_HEAD(&priv->ops_list);
+
+		if (bpf_map__set_private(map, priv, bpf_map_priv__clear)) {
+			free(priv);
+			return ERR_PTR(-BPF_LOADER_ERRNO__INTERNAL);
+		}
+	}
+
+	op = zalloc(sizeof(*op));
+	if (!op) {
+		pr_debug("Failed to alloc bpf_map_op\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	op->key_type = BPF_MAP_KEY_ALL;
+	list_add_tail(&op->list, &priv->ops_list);
+	return op;
+}
+
+static int
+bpf__obj_config_map_array_value(struct bpf_map *map,
+				struct parse_events_term *term)
+{
+	struct bpf_map_def def;
+	struct bpf_map_op *op;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("Unable to get map definition from '%s'\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	if (def.type != BPF_MAP_TYPE_ARRAY) {
+		pr_debug("Map %s type is not BPF_MAP_TYPE_ARRAY\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+	}
+	if (def.key_size < sizeof(unsigned int)) {
+		pr_debug("Map %s has incorrect key size\n", map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE;
+	}
+	switch (def.value_size) {
+	case 1:
+	case 2:
+	case 4:
+	case 8:
+		break;
+	default:
+		pr_debug("Map %s has incorrect value size\n", map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
+	}
+
+	op = bpf_map_op__alloc(map);
+	if (IS_ERR(op))
+		return PTR_ERR(op);
+	op->op_type = BPF_MAP_OP_SET_VALUE;
+	op->v.value = term->val.num;
+	return 0;
+}
+
+static int
+bpf__obj_config_map_value(struct bpf_map *map,
+			  struct parse_events_term *term,
+			  struct perf_evlist *evlist __maybe_unused)
+{
+	if (!term->err_val) {
+		pr_debug("Config value not set\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
+	}
+
+	if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM)
+		return bpf__obj_config_map_array_value(map, term);
+
+	pr_debug("ERROR: wrong value type\n");
+	return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
+}
+
+struct bpf_obj_config_map_func {
+	const char *config_opt;
+	int (*config_func)(struct bpf_map *, struct parse_events_term *,
+			   struct perf_evlist *);
+};
+
+struct bpf_obj_config_map_func bpf_obj_config_map_funcs[] = {
+	{"value", bpf__obj_config_map_value},
+};
+
+static int
+bpf__obj_config_map(struct bpf_object *obj,
+		    struct parse_events_term *term,
+		    struct perf_evlist *evlist,
+		    int *key_scan_pos)
+{
+	/* key is "maps:<mapname>.<config opt>" */
+	char *map_name = strdup(term->config + sizeof("maps:") - 1);
+	struct bpf_map *map;
+	int err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
+	char *map_opt;
+	size_t i;
+
+	if (!map_name)
+		return -ENOMEM;
+
+	map_opt = strchr(map_name, '.');
+	if (!map_opt) {
+		pr_debug("ERROR: Invalid map config: %s\n", map_name);
+		goto out;
+	}
+
+	*map_opt++ = '\0';
+	if (*map_opt == '\0') {
+		pr_debug("ERROR: Invalid map option: %s\n", term->config);
+		goto out;
+	}
+
+	map = bpf_object__get_map_by_name(obj, map_name);
+	if (!map) {
+		pr_debug("ERROR: Map %s is not exist\n", map_name);
+		err = -BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST;
+		goto out;
+	}
+
+	*key_scan_pos += map_opt - map_name;
+	for (i = 0; i < ARRAY_SIZE(bpf_obj_config_map_funcs); i++) {
+		struct bpf_obj_config_map_func *func =
+				&bpf_obj_config_map_funcs[i];
+
+		if (strcmp(map_opt, func->config_opt) == 0) {
+			err = func->config_func(map, term, evlist);
+			goto out;
+		}
+	}
+
+	pr_debug("ERROR: invalid config option '%s' for maps\n",
+		 map_opt);
+	err = -BPF_LOADER_ERRNO__OBJCONF_MAP_OPT;
+out:
+	if (!err)
+		*key_scan_pos += strlen(map_opt);
+	free(map_name);
+	return err;
+}
+
+int bpf__config_obj(struct bpf_object *obj,
+		    struct parse_events_term *term,
+		    struct perf_evlist *evlist,
+		    int *error_pos)
+{
+	int key_scan_pos = 0;
+	int err;
+
+	if (!obj || !term || !term->config)
+		return -EINVAL;
+
+	if (!prefixcmp(term->config, "maps:")) {
+		key_scan_pos = sizeof("maps:") - 1;
+		err = bpf__obj_config_map(obj, term, evlist, &key_scan_pos);
+		goto out;
+	}
+	err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
+out:
+	if (error_pos)
+		*error_pos = key_scan_pos;
+	return err;
+
+}
+
 #define ERRNO_OFFSET(e)		((e) - __BPF_LOADER_ERRNO__START)
 #define ERRCODE_OFFSET(c)	ERRNO_OFFSET(BPF_LOADER_ERRNO__##c)
 #define NR_ERRNO	(__BPF_LOADER_ERRNO__END - __BPF_LOADER_ERRNO__START)
@@ -753,6 +998,14 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(PROLOGUE)]	= "Failed to generate prologue",
 	[ERRCODE_OFFSET(PROLOGUE2BIG)]	= "Prologue too big for program",
 	[ERRCODE_OFFSET(PROLOGUEOOB)]	= "Offset out of bound for prologue",
+	[ERRCODE_OFFSET(OBJCONF_OPT)]	= "Invalid object config option",
+	[ERRCODE_OFFSET(OBJCONF_CONF)]	= "Config value not set (lost '=')",
+	[ERRCODE_OFFSET(OBJCONF_MAP_OPT)]	= "Invalid object maps config option",
+	[ERRCODE_OFFSET(OBJCONF_MAP_NOTEXIST)]	= "Target map not exist",
+	[ERRCODE_OFFSET(OBJCONF_MAP_VALUE)]	= "Incorrect value type for map",
+	[ERRCODE_OFFSET(OBJCONF_MAP_TYPE)]	= "Incorrect map type",
+	[ERRCODE_OFFSET(OBJCONF_MAP_KEYSIZE)]	= "Incorrect map key size",
+	[ERRCODE_OFFSET(OBJCONF_MAP_VALUESIZE)]	= "Incorrect map value size",
 };
 
 static int
@@ -872,3 +1125,16 @@ int bpf__strerror_load(struct bpf_object *obj,
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
+			     struct parse_events_term *term __maybe_unused,
+			     struct perf_evlist *evlist __maybe_unused,
+			     int *error_pos __maybe_unused, int err,
+			     char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,
+			    "Can't use this config term to this type of map");
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 6fdc045..2464db9 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -10,6 +10,7 @@
 #include <string.h>
 #include <bpf/libbpf.h>
 #include "probe-event.h"
+#include "evlist.h"
 #include "debug.h"
 
 enum bpf_loader_errno {
@@ -24,10 +25,19 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__PROLOGUE,	/* Failed to generate prologue */
 	BPF_LOADER_ERRNO__PROLOGUE2BIG,	/* Prologue too big for program */
 	BPF_LOADER_ERRNO__PROLOGUEOOB,	/* Offset out of bound for prologue */
+	BPF_LOADER_ERRNO__OBJCONF_OPT,	/* Invalid object config option */
+	BPF_LOADER_ERRNO__OBJCONF_CONF,	/* Config value not set (lost '=') */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_OPT,	/* Invalid object maps config option */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST,	/* Target map not exist */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE,	/* Incorrect value type for map */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,	/* Incorrect map type */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE,	/* Incorrect map key size */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE,/* Incorrect map value size */
 	__BPF_LOADER_ERRNO__END,
 };
 
 struct bpf_object;
+struct parse_events_term;
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
 typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
@@ -53,6 +63,14 @@ int bpf__strerror_load(struct bpf_object *obj, int err,
 		       char *buf, size_t size);
 int bpf__foreach_tev(struct bpf_object *obj,
 		     bpf_prog_iter_callback_t func, void *arg);
+
+int bpf__config_obj(struct bpf_object *obj, struct parse_events_term *term,
+		    struct perf_evlist *evlist, int *error_pos);
+int bpf__strerror_config_obj(struct bpf_object *obj,
+			     struct parse_events_term *term,
+			     struct perf_evlist *evlist,
+			     int *error_pos, int err, char *buf,
+			     size_t size);
 #else
 static inline struct bpf_object *
 bpf__prepare_load(const char *filename __maybe_unused,
@@ -84,6 +102,15 @@ bpf__foreach_tev(struct bpf_object *obj __maybe_unused,
 }
 
 static inline int
+bpf__config_obj(struct bpf_object *obj __maybe_unused,
+		struct parse_events_term *term __maybe_unused,
+		struct perf_evlist *evlist __maybe_unused,
+		int *error_pos __maybe_unused)
+{
+	return 0;
+}
+
+static inline int
 __bpf_strerror(char *buf, size_t size)
 {
 	if (!size)
@@ -118,5 +145,16 @@ static inline int bpf__strerror_load(struct bpf_object *obj __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int
+bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
+			 struct parse_events_term *term __maybe_unused,
+			 struct perf_evlist *evlist __maybe_unused,
+			 int *error_pos __maybe_unused,
+			 int err __maybe_unused,
+			 char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
1.8.3.4

* [PATCH 10/54] perf tools: Enable BPF object configure syntax
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (8 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 09/54] perf tools: Add API to config maps in bpf object Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-01-25  9:55 ` [PATCH 11/54] perf record: Apply config to BPF objects before recording Wang Nan
                   ` (44 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

This patch adds the final step of BPF map configuration. A new syntax
is added to the event parser so that users can configure BPF objects
through config terms enclosed in '/' '/'.

After this patch, the following syntax is available:

 # perf record -e ./test_bpf_map_1.c/maps:channel.value=10/ ...

It takes effect only after the following commits are applied.
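
As an illustration of the term layout, the sketch below splits the name
part of a config term (the text left of '=', which the lexer change in
this patch lets through as one token) into a map name and an option.
split_map_term() is a hypothetical helper written for this note; it is
not the parsing code used by perf, which lives in bpf-loader.c.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: split "maps:<map>.<option>"
 * into the map name and the option name.
 * Returns 0 on success, -1 if the term does not have that shape.
 */
static int split_map_term(const char *config,
			  char *map, size_t map_sz,
			  char *opt, size_t opt_sz)
{
	static const char prefix[] = "maps:";
	const char *name, *dot;
	size_t name_len, opt_len;

	if (strncmp(config, prefix, sizeof(prefix) - 1) != 0)
		return -1;		/* e.g. "xmaps:channel.value" */

	name = config + sizeof(prefix) - 1;
	dot = strchr(name, '.');
	if (!dot || dot == name || dot[1] == '\0')
		return -1;		/* missing map name or option */

	name_len = (size_t)(dot - name);
	opt_len = strlen(dot + 1);
	if (name_len >= map_sz || opt_len >= opt_sz)
		return -1;		/* caller's buffers are too small */

	memcpy(map, name, name_len);
	map[name_len] = '\0';
	memcpy(opt, dot + 1, opt_len + 1);	/* copies the NUL too */
	return 0;
}
```

With "maps:channel.value" this yields map "channel" and option "value";
whether the map exists and whether the option is valid are separate
checks, which is why the error cases below report them separately.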

Test result:

 # cat ./test_bpf_map_1.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
     (void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
     .type = BPF_MAP_TYPE_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = 1,
 };
 SEC("func=sys_nanosleep")
 int func(void *ctx)
 {
     int key = 0;
     char fmt[] = "%d\n";
     int *pval = map_lookup_elem(&channel, &key);
     if (!pval)
         return 0;
     trace_printk(fmt, sizeof(fmt), *pval);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 - Normal case:
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=10/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]

 - Error case:

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value/' usleep 10
 event syntax error: '..ps:channel.value/'
                                   \___ Config value not set (lost '=')
 Hint:	Valid config term:
      	maps:[<arraymap>].value=[value]
	     	(add -v to see detail)
	Run 'perf list' for a list of valid events

 Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event <event>   event selector. use 'perf list' to list available events

 # ./perf record -e './test_bpf_map_1.c/xmaps:channel.value=10/' usleep 10
 event syntax error: '..pf_map_1.c/xmaps:channel.value=10/'
                                   \___ Invalid object config option
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:xchannel.value=10/' usleep 10
 event syntax error: '..p_1.c/maps:xchannel.value=10/'
                                   \___ Target map not exist
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:channel.xvalue=10/' usleep 10
 event syntax error: '..ps:channel.xvalue=10/'
                                   \___ Invalid object maps config option
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=x10/' usleep 10
 event syntax error: '..nnel.value=x10/'
                                   \___ Incorrect value type for map
 [SNIP]

 After changing the map type from BPF_MAP_TYPE_ARRAY to '1' (BPF_MAP_TYPE_HASH):

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=10/' usleep 10
 event syntax error: '..ps:channel.value=10/'
                                   \___ Can't use this config term to this type of map

 Hint:	Valid config term:
     	maps:[<arraymap>].value=[value]
     	(add -v to see detail)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 56 +++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/parse-events.h |  3 ++-
 tools/perf/util/parse-events.l |  2 +-
 tools/perf/util/parse-events.y | 23 ++++++++++++++---
 4 files changed, 75 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4f7b0ef..1c2dc5d 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -628,17 +628,64 @@ errout:
 	return err;
 }
 
+static int
+parse_events_config_bpf(struct parse_events_evlist *data,
+		       struct bpf_object *obj,
+		       struct list_head *head_config)
+{
+	struct parse_events_term *term;
+	int error_pos;
+
+	if (!head_config || list_empty(head_config))
+		return 0;
+
+	list_for_each_entry(term, head_config, list) {
+		char errbuf[BUFSIZ];
+		int err;
+
+		if (term->type_term != PARSE_EVENTS__TERM_TYPE_USER) {
+			snprintf(errbuf, sizeof(errbuf),
+				 "Invalid config term for BPF object");
+			errbuf[BUFSIZ - 1] = '\0';
+
+			data->error->idx = term->err_term;
+			data->error->str = strdup(errbuf);
+			return -EINVAL;
+		}
+
+		err = bpf__config_obj(obj, term, NULL, &error_pos);
+		if (err) {
+			bpf__strerror_config_obj(obj, term, NULL,
+						 &error_pos, err, errbuf,
+						 sizeof(errbuf));
+			data->error->help = strdup(
+"Hint:\tValid config term:\n"
+"     \tmaps:[<arraymap>].value=[value]\n"
+"     \t(add -v to see detail)");
+			data->error->str = strdup(errbuf);
+			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
+				data->error->idx = term->err_val;
+			else
+				data->error->idx = term->err_term + error_pos;
+			return err;
+		}
+	}
+	return 0;
+
+}
+
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
-			  bool source)
+			  bool source,
+			  struct list_head *head_config)
 {
 	struct bpf_object *obj;
+	int err;
 
 	obj = bpf__prepare_load(bpf_file_name, source);
 	if (IS_ERR(obj)) {
 		char errbuf[BUFSIZ];
-		int err;
 
 		err = PTR_ERR(obj);
 
@@ -656,7 +703,10 @@ int parse_events_load_bpf(struct parse_events_evlist *data,
 		return err;
 	}
 
-	return parse_events_load_bpf_obj(data, list, obj);
+	err = parse_events_load_bpf_obj(data, list, obj);
+	if (err)
+		return err;
+	return parse_events_config_bpf(data, obj, head_config);
 }
 
 static int
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index f1a6db1..84694f3 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -126,7 +126,8 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
-			  bool source);
+			  bool source,
+			  struct list_head *head_config);
 /* Provide this function for perf test */
 struct bpf_object;
 int parse_events_load_bpf_obj(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 58c5831..4387728 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -122,7 +122,7 @@ num_dec		[0-9]+
 num_hex		0x[a-fA-F0-9]+
 num_raw_hex	[a-fA-F0-9]+
 name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
-name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.]*
+name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
 /* If you add a modifier you need to update check_modifier() */
 modifier_event	[ukhpPGHSDI]+
 modifier_bp	[rwx]{1,3}
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index ad37996..8992d16 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -64,6 +64,7 @@ static inc_group_count(struct list_head *list,
 %type <str> PE_PMU_EVENT_PRE PE_PMU_EVENT_SUF PE_KERNEL_PMU_EVENT
 %type <num> value_sym
 %type <head> event_config
+%type <head> event_bpf_config
 %type <term> event_term
 %type <head> event_pmu
 %type <head> event_legacy_symbol
@@ -455,27 +456,41 @@ PE_RAW
 }
 
 event_bpf_file:
-PE_BPF_OBJECT
+PE_BPF_OBJECT event_bpf_config
 {
 	struct parse_events_evlist *data = _data;
 	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_load_bpf(data, list, $1, false));
+	ABORT_ON(parse_events_load_bpf(data, list, $1, false, $2));
+	if ($2)
+		parse_events__free_terms($2);
 	$$ = list;
 }
 |
-PE_BPF_SOURCE
+PE_BPF_SOURCE event_bpf_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_load_bpf(data, list, $1, true));
+	ABORT_ON(parse_events_load_bpf(data, list, $1, true, $2));
+	if ($2)
+		parse_events__free_terms($2);
 	$$ = list;
 }
 
+event_bpf_config:
+'/' event_config '/'
+{
+	$$ = $2;
+}
+|
+{
+	$$ = NULL;
+}
+
 start_terms: event_config
 {
 	struct parse_events_terms *data = _data;
-- 
1.8.3.4

* [PATCH 11/54] perf record: Apply config to BPF objects before recording
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (9 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 10/54] perf tools: Enable BPF object configure syntax Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-01-25  9:55 ` [PATCH 12/54] perf tools: Enable passing event to BPF object Wang Nan
                   ` (43 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

bpf__apply_obj_config() is introduced as the core API for applying
object config options to all BPF objects. This patch also does the real
work of setting values for BPF_MAP_TYPE_ARRAY maps by inserting the
value stored in the map's private field into the BPF map.
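
The value parsed from the command line is kept as a 64-bit number, while
the map's value_size may be 1, 2, 4 or 8 bytes. A minimal sketch of the
narrowing step follows; store_narrowed() is a hypothetical stand-in that
writes into a plain buffer instead of calling bpf_map_update_elem(), so
it can be tried without a BPF map.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the map update: copy a 64-bit config value
 * into a buffer of val_size bytes, narrowing it first.
 * Returns 0 on success, -1 for an unsupported value size.
 */
static int store_narrowed(void *dst, size_t val_size, uint64_t val)
{
	switch (val_size) {
	case 1: {
		uint8_t v = (uint8_t)val;
		memcpy(dst, &v, sizeof(v));
		break;
	}
	case 2: {
		uint16_t v = (uint16_t)val;
		memcpy(dst, &v, sizeof(v));
		break;
	}
	case 4: {
		uint32_t v = (uint32_t)val;
		memcpy(dst, &v, sizeof(v));
		break;
	}
	case 8:
		memcpy(dst, &val, sizeof(val));
		break;
	default:
		return -1;	/* no integer type of this width */
	}
	return 0;
}
```

Narrowing into a correctly sized temporary, rather than passing a
pointer to the 64-bit value directly, keeps the result independent of
byte order: on a big-endian machine the first four bytes of a u64
holding 10 are all zero.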

This patch is required because not all BPF config can be applied during
parsing. A later patch will put events created by perf into
BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, and those events do not exist
until perf_evsel__open() has been called.

bpf_map_config_foreach_key() is introduced to iterate over each key
that needs to be configured. This function can be extended later to
support more map types and different key settings.
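
The iteration pattern can be sketched as a callback applied to every
index of an array map, stopping at the first error. The names below
(key_func_t, foreach_array_key, struct fill_arg, fill_slot) are
hypothetical and simplified; the real callback type in this patch takes
more parameters, and the callback updates a BPF map rather than a local
array.

```c
#include <stddef.h>

/* Hypothetical, simplified callback type: one call per key. */
typedef int (*key_func_t)(unsigned int key, void *arg);

/* Call func for keys 0..max_entries-1; return the first non-zero result. */
static int foreach_array_key(unsigned int max_entries,
			     key_func_t func, void *arg)
{
	unsigned int i;

	for (i = 0; i < max_entries; i++) {
		int err = func(i, arg);

		if (err)
			return err;
	}
	return 0;
}

/* Example callback state: a plain array standing in for the map. */
struct fill_arg {
	int *slots;
	unsigned int nr_slots;
	int value;
};

/* Example callback: store the value at the key, or fail if out of range. */
static int fill_slot(unsigned int key, void *arg)
{
	struct fill_arg *fa = arg;

	if (key >= fa->nr_slots)
		return -1;
	fa->slots[key] = fa->value;
	return 0;
}
```

Keeping the per-key action behind a callback is what lets the same loop
serve other operations on the map without changing the iteration code.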

In perf record, bpf__apply_obj_config() is called before recording
starts, so that all BPF config options take effect.

Test result:

 # cat ./test_bpf_map_1.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
     (void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
     .type = BPF_MAP_TYPE_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = 1,
 };
 SEC("func=sys_nanosleep")
 int func(void *ctx)
 {
     int key = 0;
     char fmt[] = "%d\n";
     int *pval = map_lookup_elem(&channel, &key);
     if (!pval)
         return 0;
     trace_printk(fmt, sizeof(fmt), *pval);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=11/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 1/1   #P:8
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            usleep-18593 [007] d... 2394714.395539: : 11
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=101/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 1/1   #P:8
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            usleep-18593 [007] d... 2394714.395539: : 11
            usleep-19000 [006] d... 2394831.057840: : 101

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c  |  11 +++
 tools/perf/util/bpf-loader.c | 180 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h |  15 ++++
 3 files changed, 206 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 319712a..c95c8ea 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -32,6 +32,7 @@
 #include "util/parse-branch-options.h"
 #include "util/parse-regs-options.h"
 #include "util/llvm-utils.h"
+#include "util/bpf-loader.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -534,6 +535,16 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		goto out_child;
 	}
 
+	err = bpf__apply_obj_config();
+	if (err) {
+		char errbuf[BUFSIZ];
+
+		bpf__strerror_apply_obj_config(err, errbuf, sizeof(errbuf));
+		pr_err("ERROR: Apply config to BPF failed: %s\n",
+			 errbuf);
+		goto out_child;
+	}
+
 	/*
 	 * Normally perf_session__new would do this, but it doesn't have the
 	 * evlist.
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 7d361aa..96fd18b 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -7,6 +7,7 @@
 
 #include <linux/bpf.h>
 #include <bpf/libbpf.h>
+#include <bpf/bpf.h>
 #include <linux/err.h>
 #include <linux/string.h>
 #include "perf.h"
@@ -984,6 +985,178 @@ out:
 
 }
 
+typedef int (*map_config_func_t)(const char *name, int map_fd,
+				 struct bpf_map_def *pdef,
+				 struct bpf_map_op *op,
+				 void *pkey, void *arg);
+
+static int
+foreach_key_array_all(map_config_func_t func,
+		      void *arg, const char *name,
+		      int map_fd, struct bpf_map_def *pdef,
+		      struct bpf_map_op *op)
+{
+	unsigned int i;
+	int err;
+
+	for (i = 0; i < pdef->max_entries; i++) {
+		err = func(name, map_fd, pdef, op, &i, arg);
+		if (err) {
+			pr_debug("ERROR: failed to insert value to %s[%u]\n",
+				 name, i);
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int
+bpf_map_config_foreach_key(struct bpf_map *map,
+			   map_config_func_t func,
+			   void *arg)
+{
+	int err, map_fd;
+	const char *name;
+	struct bpf_map_op *op;
+	struct bpf_map_def def;
+	struct bpf_map_priv *priv;
+
+	name = bpf_map__get_name(map);
+
+	err = bpf_map__get_private(map, (void **)&priv);
+	if (err) {
+		pr_debug("ERROR: failed to get private from map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	if (!priv || list_empty(&priv->ops_list)) {
+		pr_debug("INFO: nothing to config for map %s\n", name);
+		return 0;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("ERROR: failed to get definition from map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	map_fd = bpf_map__get_fd(map);
+	if (map_fd < 0) {
+		pr_debug("ERROR: failed to get fd from map %s\n", name);
+		return map_fd;
+	}
+
+	list_for_each_entry(op, &priv->ops_list, list) {
+		switch (def.type) {
+		case BPF_MAP_TYPE_ARRAY:
+			switch (op->key_type) {
+			case BPF_MAP_KEY_ALL:
+				return foreach_key_array_all(func, arg, name,
+							     map_fd, &def, op);
+			default:
+				pr_debug("ERROR: keytype for map '%s' invalid\n",
+					 name);
+				return -BPF_LOADER_ERRNO__INTERNAL;
+			}
+		default:
+			pr_debug("ERROR: type of '%s' incorrect\n", name);
+			return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+		}
+	}
+
+	return 0;
+}
+
+static int
+apply_config_value_for_key(int map_fd, void *pkey,
+			   size_t val_size, u64 val)
+{
+	int err = 0;
+
+	switch (val_size) {
+	case 1: {
+		u8 _val = (u8)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 2: {
+		u16 _val = (u16)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 4: {
+		u32 _val = (u32)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 8: {
+		err = bpf_map_update_elem(map_fd, pkey, &val, BPF_ANY);
+		break;
+	}
+	default:
+		pr_debug("ERROR: invalid value size\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
+	}
+	if (err && errno)
+		err = -errno;
+	return err;
+}
+
+static int
+apply_obj_config_map_for_key(const char *name, int map_fd,
+			     struct bpf_map_def *pdef __maybe_unused,
+			     struct bpf_map_op *op,
+			     void *pkey, void *arg __maybe_unused)
+{
+	int err;
+
+	switch (op->op_type) {
+	case BPF_MAP_OP_SET_VALUE:
+		err = apply_config_value_for_key(map_fd, pkey,
+						 pdef->value_size,
+						 op->v.value);
+		break;
+	default:
+		pr_debug("ERROR: unknown value type for '%s'\n", name);
+		err = -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	return err;
+}
+
+static int
+apply_obj_config_map(struct bpf_map *map)
+{
+	return bpf_map_config_foreach_key(map,
+					  apply_obj_config_map_for_key,
+					  NULL);
+}
+
+static int
+apply_obj_config_object(struct bpf_object *obj)
+{
+	struct bpf_map *map;
+	int err;
+
+	bpf_map__for_each(map, obj) {
+		err = apply_obj_config_map(map);
+		if (err)
+			return err;
+	}
+	return 0;
+}
+
+int bpf__apply_obj_config(void)
+{
+	struct bpf_object *obj, *tmp;
+	int err;
+
+	bpf_object__for_each_safe(obj, tmp) {
+		err = apply_obj_config_object(obj);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 #define ERRNO_OFFSET(e)		((e) - __BPF_LOADER_ERRNO__START)
 #define ERRCODE_OFFSET(c)	ERRNO_OFFSET(BPF_LOADER_ERRNO__##c)
 #define NR_ERRNO	(__BPF_LOADER_ERRNO__END - __BPF_LOADER_ERRNO__START)
@@ -1138,3 +1311,10 @@ int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 2464db9..db3c34c 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -71,6 +71,8 @@ int bpf__strerror_config_obj(struct bpf_object *obj,
 			     struct perf_evlist *evlist,
 			     int *error_pos, int err, char *buf,
 			     size_t size);
+int bpf__apply_obj_config(void);
+int bpf__strerror_apply_obj_config(int err, char *buf, size_t size);
 #else
 static inline struct bpf_object *
 bpf__prepare_load(const char *filename __maybe_unused,
@@ -111,6 +113,12 @@ bpf__config_obj(struct bpf_object *obj __maybe_unused,
 }
 
 static inline int
+bpf__apply_obj_config(void)
+{
+	return 0;
+}
+
+static inline int
 __bpf_strerror(char *buf, size_t size)
 {
 	if (!size)
@@ -156,5 +164,12 @@ bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int
+bpf__strerror_apply_obj_config(int err __maybe_unused,
+			       char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
1.8.3.4

* [PATCH 12/54] perf tools: Enable passing event to BPF object
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (10 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 11/54] perf record: Apply config to BPF objects before recording Wang Nan
@ 2016-01-25  9:55 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 13/54] perf tools: Support perf event alias name Wang Nan
                   ` (42 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:55 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

A new syntax is added to the parser so that users can pass predefined
perf events into BPF objects.

After this patch, BPF programs for perf are finally able to utilize
bpf_perf_event_read() introduced in commit 35578d7984003097af2b1e3
(bpf: Implement function bpf_perf_event_read() that get the selected
hardware PMU conuter).

Test result:

 # cat ./test_bpf_map_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
     (void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_read)(struct bpf_map_def *, int) =
     (void *)BPF_FUNC_perf_event_read;

 struct bpf_map_def SEC("maps") pmu_map = {
     .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = __NR_CPUS__,
 };
 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
     unsigned long long val;
     char fmt[] = "sys_write:        pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }

 SEC("func_write_return=sys_write%return")
 int func_write_return(void *ctx)
 {
     unsigned long long val = 0;
     char fmt[] = "sys_write_return: pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

Normal case:
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 [SNIP]
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.013 MB perf.data (7 samples) ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-17066 [000] d... 938449.863301: : sys_write:        pmu=1157327
               ls-17066 [000] dN.. 938449.863342: : sys_write_return: pmu=1225218
               ls-17066 [000] d... 938449.863349: : sys_write:        pmu=1241922
               ls-17066 [000] dN.. 938449.863369: : sys_write_return: pmu=1267445

Normal case (system wide):
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' -a
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.811 MB perf.data (120 samples) ]

 # cat /sys/kernel/debug/tracing/trace | grep -v '18446744073709551594' | grep -v perf | head -n 20
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            gmain-30828 [002] d... 2740551.068992: : sys_write:        pmu=84373
            gmain-30828 [002] d... 2740551.068992: : sys_write_return: pmu=87696
            gmain-30828 [002] d... 2740551.068996: : sys_write:        pmu=100658
            gmain-30828 [002] d... 2740551.068997: : sys_write_return: pmu=102572

Error case 1:

 # ./perf record -e './test_bpf_map_2.c' ls /
 [SNIP]
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.014 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-17115 [007] d... 2724279.665625: : sys_write:        pmu=18446744073709551614
               ls-17115 [007] dN.. 2724279.665651: : sys_write_return: pmu=18446744073709551614
               ls-17115 [007] d... 2724279.665658: : sys_write:        pmu=18446744073709551614
               ls-17115 [007] dN.. 2724279.665677: : sys_write_return: pmu=18446744073709551614

 (18446744073709551614 is 0xfffffffffffffffe (-2))
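
The large number comes from plain C integer conversion: the test program
declares perf_event_read() as returning int and stores the result in an
unsigned long long, so a negative error code is sign-extended. A minimal
sketch (as_printed() is a hypothetical helper for this note):

```c
/*
 * Hypothetical helper: reproduce what the test program does when it
 * assigns an int return value to an unsigned long long and prints it
 * with %llu.
 */
static unsigned long long as_printed(int ret)
{
	unsigned long long val = ret;	/* int -> unsigned long long */

	return val;
}
```

as_printed(-2) gives 18446744073709551614, the value seen above. By the
same arithmetic, the 18446744073709551594 filtered out with grep -v in
the system-wide case is -22.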

Error case 2:
 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=evt/' -a
 event syntax error: '..ps:pmu_map.event=evt/'
                                   \___ Event not found for map setting

 Hint:	Valid config terms:
      	maps:[<arraymap>].value=[value]
      	maps:[<eventmap>].event=[event]
 [SNIP]

Error case 3:
 # ls /proc/2348/task/
 2348  2505  2506  2507  2508
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' -p 2348
 ERROR: Apply config to BPF failed: Cannot set event to BPF maps in multi-thread tracing

Error case 4:
 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i to turn off inherit)

Error case 5:
 # ./perf record -i -e raw_syscalls:sys_enter -e './test_bpf_map_2.c/maps:pmu_map.event=raw_syscalls:sys_enter/' ls
 ERROR: Apply config to BPF failed: Can only put raw, hardware and BPF output event into a BPF map
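
The rule behind error case 5 can be sketched as a predicate on the
event's type and config. The constants below are local copies of the
perf ABI values (PERF_TYPE_HARDWARE, PERF_TYPE_SOFTWARE, PERF_TYPE_RAW,
PERF_COUNT_SW_BPF_OUTPUT) with an EX_ prefix so the example builds
without kernel headers; event_allowed_in_map() is a hypothetical name,
not a function from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of values from linux/perf_event.h (assumed unchanged). */
enum {
	EX_PERF_TYPE_HARDWARE	= 0,
	EX_PERF_TYPE_SOFTWARE	= 1,
	EX_PERF_TYPE_TRACEPOINT	= 2,
	EX_PERF_TYPE_RAW	= 4,
};
enum { EX_PERF_COUNT_SW_BPF_OUTPUT = 10 };

/* True for raw, hardware and BPF output events; false for the rest. */
static bool event_allowed_in_map(uint32_t type, uint64_t config)
{
	if (type == EX_PERF_TYPE_RAW || type == EX_PERF_TYPE_HARDWARE)
		return true;
	return type == EX_PERF_TYPE_SOFTWARE &&
	       config == EX_PERF_COUNT_SW_BPF_OUTPUT;
}
```

A tracepoint such as raw_syscalls:sys_enter has type
PERF_TYPE_TRACEPOINT, so it fails this check, while cycles
(PERF_TYPE_HARDWARE) passes.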

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 138 ++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/bpf-loader.h   |   5 ++
 tools/perf/util/evlist.c       |  16 +++++
 tools/perf/util/evlist.h       |   3 +
 tools/perf/util/parse-events.c |  15 +++--
 tools/perf/util/parse-events.h |   1 +
 6 files changed, 171 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 96fd18b..84b4581 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -742,6 +742,7 @@ int bpf__foreach_tev(struct bpf_object *obj,
 
 enum bpf_map_op_type {
 	BPF_MAP_OP_SET_VALUE,
+	BPF_MAP_OP_SET_EVSEL,
 };
 
 enum bpf_map_key_type {
@@ -754,6 +755,7 @@ struct bpf_map_op {
 	enum bpf_map_key_type key_type;
 	union {
 		u64 value;
+		struct perf_evsel *evsel;
 	} v;
 };
 
@@ -891,10 +893,73 @@ bpf__obj_config_map_value(struct bpf_map *map,
 	if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM)
 		return bpf__obj_config_map_array_value(map, term);
 
-	pr_debug("ERROR: wrong value type\n");
+	pr_debug("ERROR: wrong value type for 'value'\n");
 	return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
 }
 
+static int
+bpf__obj_config_map_array_event(struct bpf_map *map,
+				struct parse_events_term *term,
+				struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	struct bpf_map_def def;
+	struct bpf_map_op *op;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+	evsel = perf_evlist__find_evsel_by_str(evlist, term->val.str);
+	if (!evsel) {
+		pr_debug("Event (for '%s') '%s' doesn't exist\n",
+			 map_name, term->val.str);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_NOEVT;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("Unable to get map definition from '%s'\n",
+			 map_name);
+		return err;
+	}
+
+	/*
+	 * No need to check key_size and value_size:
+	 * kernel has already checked them.
+	 */
+	if (def.type != BPF_MAP_TYPE_PERF_EVENT_ARRAY) {
+		pr_debug("Map %s type is not BPF_MAP_TYPE_PERF_EVENT_ARRAY\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+	}
+
+	op = bpf_map_op__alloc(map);
+	if (IS_ERR(op))
+		return PTR_ERR(op);
+
+	op->v.evsel = evsel;
+	op->op_type = BPF_MAP_OP_SET_EVSEL;
+	return 0;
+}
+
+static int
+bpf__obj_config_map_event(struct bpf_map *map,
+			  struct parse_events_term *term,
+			  struct perf_evlist *evlist)
+{
+	if (!term->err_val) {
+		pr_debug("Config value not set\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
+	}
+
+	if (term->type_val == PARSE_EVENTS__TERM_TYPE_STR)
+		return bpf__obj_config_map_array_event(map, term, evlist);
+
+	pr_debug("ERROR: wrong value type for 'event'\n");
+	return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
+}
+
+
 struct bpf_obj_config_map_func {
 	const char *config_opt;
 	int (*config_func)(struct bpf_map *, struct parse_events_term *,
@@ -903,6 +968,7 @@ struct bpf_obj_config_map_func {
 
 struct bpf_obj_config_map_func bpf_obj_config_map_funcs[] = {
 	{"value", bpf__obj_config_map_value},
+	{"event", bpf__obj_config_map_event},
 };
 
 static int
@@ -1047,6 +1113,7 @@ bpf_map_config_foreach_key(struct bpf_map *map,
 	list_for_each_entry(op, &priv->ops_list, list) {
 		switch (def.type) {
 		case BPF_MAP_TYPE_ARRAY:
+		case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
 			switch (op->key_type) {
 			case BPF_MAP_KEY_ALL:
 				return foreach_key_array_all(func, arg, name,
@@ -1101,6 +1168,60 @@ apply_config_value_for_key(int map_fd, void *pkey,
 }
 
 static int
+apply_config_evsel_for_key(const char *name, int map_fd, void *pkey,
+			   struct perf_evsel *evsel)
+{
+	struct xyarray *xy = evsel->fd;
+	struct perf_event_attr *attr;
+	unsigned int key, events;
+	bool check_pass = false;
+	int *evt_fd;
+	int err;
+
+	if (!xy) {
+		pr_debug("ERROR: evsel not ready for map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	if (xy->row_size / xy->entry_size != 1) {
+		pr_debug("ERROR: Dimension of target event is incorrect for map %s\n",
+			 name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM;
+	}
+
+	attr = &evsel->attr;
+	if (attr->inherit) {
+		pr_debug("ERROR: Can't put inherit event into map %s\n", name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH;
+	}
+
+	if (attr->type == PERF_TYPE_RAW)
+		check_pass = true;
+	if (attr->type == PERF_TYPE_HARDWARE)
+		check_pass = true;
+	if (attr->type == PERF_TYPE_SOFTWARE &&
+			attr->config == PERF_COUNT_SW_BPF_OUTPUT)
+		check_pass = true;
+	if (!check_pass) {
+		pr_debug("ERROR: Event type is wrong for map %s\n", name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE;
+	}
+
+	events = xy->entries / (xy->row_size / xy->entry_size);
+	key = *((unsigned int *)pkey);
+	if (key >= events) {
+		pr_debug("ERROR: there is no event %d for map %s\n",
+			 key, name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_MAPSIZE;
+	}
+	evt_fd = xyarray__entry(xy, key, 0);
+	err = bpf_map_update_elem(map_fd, pkey, evt_fd, BPF_ANY);
+	if (err && errno)
+		err = -errno;
+	return err;
+}
+
+static int
 apply_obj_config_map_for_key(const char *name, int map_fd,
 			     struct bpf_map_def *pdef __maybe_unused,
 			     struct bpf_map_op *op,
@@ -1114,6 +1235,10 @@ apply_obj_config_map_for_key(const char *name, int map_fd,
 						 pdef->value_size,
 						 op->v.value);
 		break;
+	case BPF_MAP_OP_SET_EVSEL:
+		err = apply_config_evsel_for_key(name, map_fd, pkey,
+						 op->v.evsel);
+		break;
 	default:
 		pr_debug("ERROR: unknown value type for '%s'\n", name);
 		err = -BPF_LOADER_ERRNO__INTERNAL;
@@ -1179,6 +1304,11 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(OBJCONF_MAP_TYPE)]	= "Incorrect map type",
 	[ERRCODE_OFFSET(OBJCONF_MAP_KEYSIZE)]	= "Incorrect map key size",
 	[ERRCODE_OFFSET(OBJCONF_MAP_VALUESIZE)]	= "Incorrect map value size",
+	[ERRCODE_OFFSET(OBJCONF_MAP_NOEVT)]	= "Event not found for map setting",
+	[ERRCODE_OFFSET(OBJCONF_MAP_MAPSIZE)]	= "Invalid map size for event setting",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTDIM)]	= "Event dimension too large",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTINH)]	= "Doesn't support inherit event",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTTYPE)]	= "Wrong event type for map",
 };
 
 static int
@@ -1315,6 +1445,12 @@ int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
 {
 	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,
+			    "Cannot set event to BPF maps in multi-thread tracing");
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,
+			    "%s (Hint: use -i to turn off inherit)", emsg);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,
+			    "Can only put raw, hardware and BPF output event into a BPF map");
 	bpf__strerror_end(buf, size);
 	return 0;
 }
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index db3c34c..c9ce792 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -33,6 +33,11 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,	/* Incorrect map type */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE,	/* Incorrect map key size */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE,/* Incorrect map value size */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_NOEVT,	/* Event not found for map setting */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_MAPSIZE,	/* Invalid map size for event setting */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,	/* Event dimension too large */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,	/* Doesn't support inherit event */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,	/* Wrong event type for map */
 	__BPF_LOADER_ERRNO__END,
 };
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d81f13d..9b56390 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1723,3 +1723,19 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 
 	tracking_evsel->tracking = true;
 }
+
+struct perf_evsel *
+perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
+			       const char *str)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (!evsel->name)
+			continue;
+		if (strcmp(str, evsel->name) == 0)
+			return evsel;
+	}
+
+	return NULL;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7c4d9a2..a0d1522 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -294,4 +294,7 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 				     struct perf_evsel *tracking_evsel);
 
 void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr);
+
+struct perf_evsel *
+perf_evlist__find_evsel_by_str(struct perf_evlist *evlist, const char *str);
 #endif /* __PERF_EVLIST_H */
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 1c2dc5d..6e2543c 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -653,14 +653,16 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 			return -EINVAL;
 		}
 
-		err = bpf__config_obj(obj, term, NULL, &error_pos);
+		err = bpf__config_obj(obj, term, data->evlist, &error_pos);
 		if (err) {
-			bpf__strerror_config_obj(obj, term, NULL,
+			bpf__strerror_config_obj(obj, term, data->evlist,
 						 &error_pos, err, errbuf,
 						 sizeof(errbuf));
 			data->error->help = strdup(
-"Hint:\tValid config term:\n"
+"Hint:\tValid config terms:\n"
 "     \tmaps:[<arraymap>].value=[value]\n"
+"     \tmaps:[<eventmap>].event=[event]\n"
+"\n"
 "     \t(add -v to see detail)");
 			data->error->str = strdup(errbuf);
 			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
@@ -1442,9 +1444,10 @@ int parse_events(struct perf_evlist *evlist, const char *str,
 		 struct parse_events_error *err)
 {
 	struct parse_events_evlist data = {
-		.list  = LIST_HEAD_INIT(data.list),
-		.idx   = evlist->nr_entries,
-		.error = err,
+		.list   = LIST_HEAD_INIT(data.list),
+		.idx    = evlist->nr_entries,
+		.error  = err,
+		.evlist = evlist,
 	};
 	int ret;
 
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 84694f3..2a2b172 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -98,6 +98,7 @@ struct parse_events_evlist {
 	int			   idx;
 	int			   nr_groups;
 	struct parse_events_error *error;
+	struct perf_evlist	  *evlist;
 };
 
 struct parse_events_terms {
-- 
1.8.3.4


* [PATCH 13/54] perf tools: Support perf event alias name
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (11 preceding siblings ...)
  2016-01-25  9:55 ` [PATCH 12/54] perf tools: Enable passing event to BPF object Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-02-03 23:35   ` Arnaldo Carvalho de Melo
  2016-01-25  9:56 ` [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately Wang Nan
                   ` (41 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

From: He Kuang <hekuang@huawei.com>

This patch is useful when passing a perf event to a BPF map. Before
this patch, an event with config terms could not be passed to a BPF
map. For example:

 # perf record -a -e cycles/no-inherit,period=0x7fffffffffffffff/ \
                  -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/no-inherit,period=0x7fffffffffffffff//' ls /
 event syntax error: '..ps:pmu_map.event=cycles/'
                                   \___ Event not found for map setting

This fails because the '/' and ',' characters confuse the parser.

This patch adds new bison rules for giving a perf event an alias name,
which lets the command line refer to a previously defined perf event
through that name. With this patch, the goal above can be achieved
using:

 # perf record -a -e cyc=cycles/no-inherit,period=0x7fffffffffffffff/ \
                  -e './test_bpf_map_2.c/maps:pmu_map.event=cyc/' ls /

If no alias is provided (the normal case):

 # perf record -e cycles ...

The alias is set to the event's name automatically ('cycles' in the
above example).

To allow the parser to refer to an existing event selector, the event
list is passed in 'struct parse_events_evlist'.
perf_evlist__find_evsel_by_str() now finds an evsel through its alias.
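
The lookup by alias can be illustrated with a minimal standalone sketch.
Everything named demo_* below is hypothetical and only mirrors the idea;
it is not the perf code, which walks a 'struct perf_evlist' instead of a
plain array:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for 'struct perf_evsel'. */
struct demo_evsel {
	const char *name;	/* full event string */
	const char *alias;	/* alias given on the command line, or NULL */
};

/*
 * Return the first event whose alias equals 'str', or NULL if there is
 * none. Events without an alias are skipped, as in the patched lookup.
 */
static const struct demo_evsel *
demo_find_evsel_by_alias(const struct demo_evsel *evsels, size_t nr,
			 const char *str)
{
	size_t i;

	assert(str != NULL);
	for (i = 0; i < nr; i++) {
		if (!evsels[i].alias)
			continue;
		if (strcmp(str, evsels[i].alias) == 0)
			return &evsels[i];
	}
	return NULL;
}
```

In this sketch an event defined as 'cyc=cycles/.../' would carry the
alias "cyc", so a later 'maps:pmu_map.event=cyc' term can find it without
the parser ever seeing the embedded '/' and ','.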

Test result:
 # cat ./test_bpf_map_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
     (void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_read)(struct bpf_map_def *, int) =
     (void *)BPF_FUNC_perf_event_read;

 struct bpf_map_def SEC("maps") pmu_map = {
     .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = __NR_CPUS__,
 };
 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
     unsigned long long val;
     char fmt[] = "sys_write:        pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }

 SEC("func_write_return=sys_write%return")
 int func_write_return(void *ctx)
 {
     unsigned long long val = 0;
     char fmt[] = "sys_write_return: pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -a -e cyc=cycles/no-inherit,period=0x7fffffffffffffff/ \
                    -e './test_bpf_map_2.c/maps:pmu_map.event=cyc/' ls /
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.755 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-25328 [002] d... 940138.313178: : sys_write:        pmu=4503165
               ls-25328 [002] dN.. 940138.313207: : sys_write_return: pmu=4582975
               ls-25328 [002] d... 940138.313211: : sys_write:        pmu=4599840
               ls-25328 [002] dN.. 940138.313220: : sys_write_return: pmu=4633352
 # ./perf report --stdio
 Error:
 The perf.data file has no samples!
 ...
 (This is expected: the period of the cycles event is set to a very
 large value because the event is used only as a counter, so no
 sampling is needed.)

 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i or use /no-inherit/ to turn off inherit)

Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   |  2 +-
 tools/perf/util/evlist.c       |  4 ++--
 tools/perf/util/evsel.c        |  1 +
 tools/perf/util/evsel.h        |  1 +
 tools/perf/util/parse-events.c | 26 ++++++++++++++++++++++++++
 tools/perf/util/parse-events.h |  4 ++++
 tools/perf/util/parse-events.y | 15 ++++++++++++++-
 7 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 84b4581..2893b4e 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -1448,7 +1448,7 @@ int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
 	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,
 			    "Cannot set event to BPF maps in multi-thread tracing");
 	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,
-			    "%s (Hint: use -i to turn off inherit)", emsg);
+			    "%s (Hint: use -i or use /no-inherit/ to turn off inherit)", emsg);
 	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,
 			    "Can only put raw, hardware and BPF output event into a BPF map");
 	bpf__strerror_end(buf, size);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9b56390..890b08b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1731,9 +1731,9 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 	struct perf_evsel *evsel;
 
 	evlist__for_each(evlist, evsel) {
-		if (!evsel->name)
+		if (!evsel->alias)
 			continue;
-		if (strcmp(str, evsel->name) == 0)
+		if (strcmp(str, evsel->alias) == 0)
 			return evsel;
 	}
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4678086..7aebd5d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1076,6 +1076,7 @@ void perf_evsel__exit(struct perf_evsel *evsel)
 	thread_map__put(evsel->threads);
 	zfree(&evsel->group_name);
 	zfree(&evsel->name);
+	zfree(&evsel->alias);
 	perf_evsel__object.fini(evsel);
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8e75434..19885fb 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -89,6 +89,7 @@ struct perf_evsel {
 	int			idx;
 	u32			ids;
 	char			*name;
+	char			*alias;
 	double			scale;
 	const char		*unit;
 	struct event_format	*tp_format;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6e2543c..1e0ac77 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1091,6 +1091,30 @@ int parse_events__modifier_group(struct list_head *list,
 	return parse_events__modifier_event(list, event_mod, true);
 }
 
+int parse_events__set_event_alias(struct parse_events_evlist *data,
+				  struct list_head *list,
+				  const char *str,
+				  void *loc_alias_)
+{
+	struct perf_evsel *evsel;
+	YYLTYPE *loc_alias = loc_alias_;
+
+	if (!str)
+		return 0;
+
+	if (!list_is_singular(list)) {
+		struct parse_events_error *err = data->error;
+
+		err->idx = loc_alias->first_column;
+		err->str = strdup("One alias can be applied to one event only");
+		return -EINVAL;
+	}
+
+	evsel = list_first_entry(list, struct perf_evsel, node);
+	evsel->alias = strdup(str);
+	return evsel->alias ? 0 : -ENOMEM;
+}
+
 void parse_events__set_leader(char *name, struct list_head *list)
 {
 	struct perf_evsel *leader;
@@ -1283,6 +1307,8 @@ int parse_events_name(struct list_head *list, char *name)
 	__evlist__for_each(list, evsel) {
 		if (!evsel->name)
 			evsel->name = strdup(name);
+		if (!evsel->alias)
+			evsel->alias = strdup(name);
 	}
 
 	return 0;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 2a2b172..20ad3c2 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -172,4 +172,8 @@ extern int is_valid_tracepoint(const char *event_string);
 int valid_event_mount(const char *eventfs);
 char *parse_events_formats_error_string(char *additional_terms);
 
+int parse_events__set_event_alias(struct parse_events_evlist *data,
+				  struct list_head *list,
+				  const char *str,
+				  void *loc_alias_);
 #endif /* __PERF_PARSE_EVENTS_H */
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 8992d16..c3cbd7a 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -77,6 +77,7 @@ static inc_group_count(struct list_head *list,
 %type <head> event_bpf_file
 %type <head> event_def
 %type <head> event_mod
+%type <head> event_alias
 %type <head> event_name
 %type <head> event
 %type <head> events
@@ -193,13 +194,25 @@ event_name PE_MODIFIER_EVENT
 event_name
 
 event_name:
-PE_EVENT_NAME event_def
+PE_EVENT_NAME event_alias
 {
 	ABORT_ON(parse_events_name($2, $1));
 	free($1);
 	$$ = $2;
 }
 |
+event_alias
+
+event_alias:
+PE_NAME '=' event_def
+{
+	struct list_head *list = $3;
+	struct parse_events_evlist *data = _data;
+
+	ABORT_ON(parse_events__set_event_alias(data, list, $1, &@1));
+	$$ = list;
+}
+|
 event_def
 
 event_def: event_pmu |
-- 
1.8.3.4


* [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (12 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 13/54] perf tools: Support perf event alias name Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 15/54] perf tools: Enable indices setting syntax for BPF maps Wang Nan
                   ` (40 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

This patch introduces basic facilities for configuring different
slots in a BPF map one by one.

array.nr_ranges and array.ranges are introduced into 'struct
parse_events_term', where ranges is an array of index ranges (start,
length) to be configured by this config term, and nr_ranges
is the size of that array. The array is passed to 'struct bpf_map_priv'.
To indicate the new type of configuration, BPF_MAP_KEY_RANGES is
added as a new key type. bpf_map_config_foreach_key() is extended to
iterate over those indices instead of all possible keys.

The code in this commit is enabled by the following commit, which
enables the indices syntax for array configuration.
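
As a rough illustration of how (start, length) ranges select slots, here
is a standalone sketch. All demo_* names are hypothetical, and a plain
byte array stands in for the BPF map; the real code calls a per-key
callback that updates the map through its file descriptor:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of one (start, length) entry. */
struct demo_range {
	unsigned int start;
	size_t length;
};

/*
 * Write 'value' into every slot covered by 'ranges'. All ranges are
 * checked against 'max_entries' first, so nothing is written when any
 * index is too large. Returns 0 on success, -1 on an out-of-range index.
 */
static int demo_set_ranges(unsigned char *slots, unsigned int max_entries,
			   const struct demo_range *ranges, size_t nr,
			   unsigned char value)
{
	size_t i, j;

	assert(slots != NULL && ranges != NULL);

	/* Overflow-safe form of: start + length - 1 < max_entries */
	for (i = 0; i < nr; i++) {
		if (ranges[i].length > max_entries ||
		    ranges[i].start > max_entries - ranges[i].length)
			return -1;
	}

	for (i = 0; i < nr; i++)
		for (j = 0; j < ranges[i].length; j++)
			slots[ranges[i].start + j] = value;
	return 0;
}
```

Checking every range before writing any slot is a choice made for this
sketch. In the patch the range check runs when the config term is
parsed, well before the values are applied to the map.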

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 132 ++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/bpf-loader.h   |   1 +
 tools/perf/util/parse-events.c |  33 ++++++++++-
 tools/perf/util/parse-events.h |  12 ++++
 4 files changed, 170 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 2893b4e..6c25de8 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -17,6 +17,7 @@
 #include "llvm-utils.h"
 #include "probe-event.h"
 #include "probe-finder.h" // for MAX_PROBES
+#include "parse-events.h"
 #include "llvm-utils.h"
 
 #define DEFINE_PRINT_FN(name, level) \
@@ -747,6 +748,7 @@ enum bpf_map_op_type {
 
 enum bpf_map_key_type {
 	BPF_MAP_KEY_ALL,
+	BPF_MAP_KEY_RANGES,
 };
 
 struct bpf_map_op {
@@ -754,6 +756,9 @@ struct bpf_map_op {
 	enum bpf_map_op_type op_type;
 	enum bpf_map_key_type key_type;
 	union {
+		struct parse_events_array array;
+	} k;
+	union {
 		u64 value;
 		struct perf_evsel *evsel;
 	} v;
@@ -779,6 +784,8 @@ bpf_map_op__free(struct bpf_map_op *op)
 	 */
 	if ((list->next != LIST_POISON1) && (list->prev != LIST_POISON2))
 		list_del(list);
+	if (op->key_type == BPF_MAP_KEY_RANGES)
+		parse_events__clear_array(&op->k.array);
 	free(op);
 }
 
@@ -794,8 +801,30 @@ bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
 	free(priv);
 }
 
+static int
+bpf_map_op_setkey(struct bpf_map_op *op, struct parse_events_term *term,
+		  const char *map_name)
+{
+	op->key_type = BPF_MAP_KEY_ALL;
+
+	if (term->array.nr_ranges) {
+		size_t memsz = term->array.nr_ranges *
+				sizeof(op->k.array.ranges[0]);
+
+		op->k.array.ranges = memdup(term->array.ranges, memsz);
+		if (!op->k.array.ranges) {
+			pr_debug("No enough memory to alloc indices for %s\n",
+				 map_name);
+			return -ENOMEM;
+		}
+		op->key_type = BPF_MAP_KEY_RANGES;
+		op->k.array.nr_ranges = term->array.nr_ranges;
+	}
+	return 0;
+}
+
 static struct bpf_map_op *
-bpf_map_op__alloc(struct bpf_map *map)
+bpf_map_op__alloc(struct bpf_map *map, struct parse_events_term *term)
 {
 	struct bpf_map_op *op;
 	struct bpf_map_priv *priv;
@@ -829,7 +858,12 @@ bpf_map_op__alloc(struct bpf_map *map)
 		return ERR_PTR(-ENOMEM);
 	}
 
-	op->key_type = BPF_MAP_KEY_ALL;
+	err = bpf_map_op_setkey(op, term, map_name);
+	if (err) {
+		free(op);
+		return ERR_PTR(err);
+	}
+
 	list_add_tail(&op->list, &priv->ops_list);
 	return op;
 }
@@ -872,7 +906,7 @@ bpf__obj_config_map_array_value(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
 	}
 
-	op = bpf_map_op__alloc(map);
+	op = bpf_map_op__alloc(map, term);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 	op->op_type = BPF_MAP_OP_SET_VALUE;
@@ -933,7 +967,7 @@ bpf__obj_config_map_array_event(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
 	}
 
-	op = bpf_map_op__alloc(map);
+	op = bpf_map_op__alloc(map, term);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 
@@ -972,6 +1006,44 @@ struct bpf_obj_config_map_func bpf_obj_config_map_funcs[] = {
 };
 
 static int
+config_map_indices_range_check(struct parse_events_term *term,
+			       struct bpf_map *map,
+			       const char *map_name)
+{
+	struct parse_events_array *array = &term->array;
+	struct bpf_map_def def;
+	unsigned int i;
+	int err;
+
+	if (!array->nr_ranges)
+		return 0;
+	if (!array->ranges) {
+		pr_debug("ERROR: map %s: array->nr_ranges is %d but range array is NULL\n",
+			 map_name, (int)array->nr_ranges);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("ERROR: Unable to get map definition from '%s'\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	for (i = 0; i < array->nr_ranges; i++) {
+		unsigned int start = array->ranges[i].start;
+		size_t length = array->ranges[i].length;
+		unsigned int idx = start + length - 1;
+
+		if (idx >= def.max_entries) {
+			pr_debug("ERROR: index %d too large\n", idx);
+			return -BPF_LOADER_ERRNO__OBJCONF_MAP_IDX2BIG;
+		}
+	}
+	return 0;
+}
+
+static int
 bpf__obj_config_map(struct bpf_object *obj,
 		    struct parse_events_term *term,
 		    struct perf_evlist *evlist,
@@ -1007,6 +1079,13 @@ bpf__obj_config_map(struct bpf_object *obj,
 	}
 
 	*key_scan_pos += map_opt - map_name;
+
+	*key_scan_pos += strlen(map_opt);
+	err = config_map_indices_range_check(term, map, map_name);
+	if (err)
+		goto out;
+	*key_scan_pos -= strlen(map_opt);
+
 	for (i = 0; i < ARRAY_SIZE(bpf_obj_config_map_funcs); i++) {
 		struct bpf_obj_config_map_func *func =
 				&bpf_obj_config_map_funcs[i];
@@ -1077,6 +1156,33 @@ foreach_key_array_all(map_config_func_t func,
 }
 
 static int
+foreach_key_array_ranges(map_config_func_t func, void *arg,
+			 const char *name, int map_fd,
+			 struct bpf_map_def *pdef,
+			 struct bpf_map_op *op)
+{
+	unsigned int i, j;
+	int err;
+
+	for (i = 0; i < op->k.array.nr_ranges; i++) {
+		unsigned int start = op->k.array.ranges[i].start;
+		size_t length = op->k.array.ranges[i].length;
+
+		for (j = 0; j < length; j++) {
+			unsigned int idx = start + j;
+
+			err = func(name, map_fd, pdef, op, &idx, arg);
+			if (err) {
+				pr_debug("ERROR: failed to insert value to %s[%u]\n",
+					 name, idx);
+				return err;
+			}
+		}
+	}
+	return 0;
+}
+
+static int
 bpf_map_config_foreach_key(struct bpf_map *map,
 			   map_config_func_t func,
 			   void *arg)
@@ -1116,13 +1222,24 @@ bpf_map_config_foreach_key(struct bpf_map *map,
 		case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
 			switch (op->key_type) {
 			case BPF_MAP_KEY_ALL:
-				return foreach_key_array_all(func, arg, name,
-							     map_fd, &def, op);
+				err = foreach_key_array_all(func, arg, name,
+							    map_fd, &def, op);
+				if (err)
+					return err;
+				break;
+			case BPF_MAP_KEY_RANGES:
+				err = foreach_key_array_ranges(func, arg, name,
+							       map_fd, &def,
+							       op);
+				if (err)
+					return err;
+				break;
 			default:
 				pr_debug("ERROR: keytype for map '%s' invalid\n",
 					 name);
 				return -BPF_LOADER_ERRNO__INTERNAL;
-		}
+			}
+			break;
 		default:
 			pr_debug("ERROR: type of '%s' incorrect\n", name);
 			return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
@@ -1309,6 +1426,7 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTDIM)]	= "Event dimension too large",
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTINH)]	= "Doesn't support inherit event",
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTTYPE)]	= "Wrong event type for map",
+	[ERRCODE_OFFSET(OBJCONF_MAP_IDX2BIG)]	= "Index too large",
 };
 
 static int
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index c9ce792..30ee519 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -38,6 +38,7 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,	/* Event dimension too large */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,	/* Doesn't support inherit event */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,	/* Wrong event type for map */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_IDX2BIG,	/* Index too large */
 	__BPF_LOADER_ERRNO__END,
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 1e0ac77..f229663 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2148,8 +2148,39 @@ void parse_events__free_terms(struct list_head *terms)
 {
 	struct parse_events_term *term, *h;
 
-	list_for_each_entry_safe(term, h, terms, list)
+	list_for_each_entry_safe(term, h, terms, list) {
+		if (term->array.nr_ranges)
+			free(term->array.ranges);
 		free(term);
+	}
+}
+
+int parse_events__merge_arrays(struct parse_events_array *dest,
+			       struct parse_events_array *another)
+{
+	struct parse_events_array new;
+
+	if (!dest || !another)
+		return -EINVAL;
+
+	new.nr_ranges = dest->nr_ranges + another->nr_ranges;
+	new.ranges = malloc(sizeof(new.ranges[0]) * new.nr_ranges);
+	if (!new.ranges)
+		return -ENOMEM;
+
+	memcpy(&new.ranges[0], dest->ranges,
+	       sizeof(new.ranges[0]) * dest->nr_ranges);
+	memcpy(&new.ranges[dest->nr_ranges], another->ranges,
+	       sizeof(new.ranges[0]) * another->nr_ranges);
+	free(dest->ranges);
+	free(another->ranges);
+	*dest = new;
+	return 0;
+}
+
+void parse_events__clear_array(struct parse_events_array *a)
+{
+	free(a->ranges);
 }
 
 void parse_events_evlist_error(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 20ad3c2..c34615f 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -71,8 +71,17 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_INHERIT
 };
 
+struct parse_events_array {
+	size_t nr_ranges;
+	struct {
+		unsigned int start;
+		size_t length;
+	} *ranges;
+};
+
 struct parse_events_term {
 	char *config;
+	struct parse_events_array array;
 	union {
 		char *str;
 		u64  num;
@@ -117,6 +126,9 @@ int parse_events_term__sym_hw(struct parse_events_term **term,
 int parse_events_term__clone(struct parse_events_term **new,
 			     struct parse_events_term *term);
 void parse_events__free_terms(struct list_head *terms);
+int parse_events__merge_arrays(struct parse_events_array *dest,
+			       struct parse_events_array *another);
+void parse_events__clear_array(struct parse_events_array *a);
 int parse_events__modifier_event(struct list_head *list, char *str, bool add);
 int parse_events__modifier_group(struct list_head *list, char *event_mod);
 int parse_events_name(struct list_head *list, char *name);
-- 
1.8.3.4


* [PATCH 15/54] perf tools: Enable indices setting syntax for BPF maps
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (13 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 16/54] perf tools: Introduce bpf-output event Wang Nan
                   ` (39 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

This patch introduces a new syntax to the perf event parser:

 # perf record -e './test_bpf_map_3.c/maps:channel.value[0,1,2,3...5]=101/' usleep 2

By utilizing the basic facilities in bpf-loader.c which allow setting
different slots in a BPF map separately, the newly introduced syntax
allows perf to control specific elements in a BPF map.
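
To make the meaning of an index list such as '0,1,2,3...5' concrete, the
sketch below converts one into (start, length) ranges. It is a
hand-written illustration only: perf parses this syntax with flex/bison
rules, and the demo_* names as well as the inclusive 'a...b' reading are
assumptions made for the example. The '[all]' form is not handled:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mirror of one (start, length) entry. */
struct demo_range {
	unsigned int start;
	size_t length;
};

/*
 * Parse a list like "0,3...5" (without the surrounding brackets) into
 * 'out'. "a...b" is taken as the inclusive range a..b. Returns the
 * number of ranges stored, or -1 on a syntax error, a reversed range,
 * or when more than 'max_out' ranges would be needed.
 */
static int demo_parse_indices(const char *s, struct demo_range *out,
			      size_t max_out)
{
	size_t n = 0;

	assert(s != NULL && out != NULL);
	while (*s) {
		unsigned long first, last;
		char *end;

		if (*s < '0' || *s > '9')
			return -1;
		first = strtoul(s, &end, 10);
		last = first;
		s = end;

		if (s[0] == '.' && s[1] == '.' && s[2] == '.') {
			s += 3;
			if (*s < '0' || *s > '9')
				return -1;
			last = strtoul(s, &end, 10);
			s = end;
		}
		if (last < first || n == max_out)
			return -1;

		out[n].start = (unsigned int)first;
		out[n].length = (size_t)(last - first + 1);
		n++;

		if (*s == ',') {
			s++;
			if (!*s)	/* trailing comma */
				return -1;
		} else if (*s) {
			return -1;
		}
	}
	return (int)n;
}
```

With this reading, '0,1,2,3...5' yields four ranges, the last one being
(start 3, length 3), so slots 0 through 5 are all configured.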

Test result:

 # cat ./test_bpf_map_3.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
 	(void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(unsigned char),
 	.max_entries = 100,
 };
 SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
 int func(void *ctx, int err, long nsec)
 {
 	char fmt[] = "%ld\n";
 	long usec = nsec * 0x10624dd3 >> 38; // nsec / 1000
 	int key = (int)usec;
 	unsigned char *pval = map_lookup_elem(&channel, &key);

 	if (!pval)
 		return 0;
 	trace_printk(fmt, sizeof(fmt), (unsigned char)*pval);
 	return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/
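
 The 'nsec * 0x10624dd3 >> 38' expression in the program above replaces
 a division by 1000 with a multiply and a shift: 0x10624dd3 is
 274877907, which is 2^38 / 1000 rounded up. A small host-side sketch
 (demo_nsec_to_usec is a hypothetical helper, and 64-bit arithmetic is
 assumed) shows the same computation:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Approximate nsec / 1000 without a divide instruction.
 * 0x10624dd3 == 274877907 == ceil(2^38 / 1000); the rounding error is
 * small enough that the result matches integer division for every
 * nsec below one second, which is the range tv_nsec can take.
 */
static uint64_t demo_nsec_to_usec(uint64_t nsec)
{
	assert(nsec < 1000000000ULL);
	return (nsec * 0x10624dd3ULL) >> 38;
}
```

 For 'usleep 2' tv_nsec is 2000, so the key is 2; that slot was set to
 101 by the first command below.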

Normal case:
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0,1,2,3...5]=101/' usleep 2
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0...9,20...29]=102,maps:channel.value[10...19]=103/' usleep 3
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0...9,20...29]=102,maps:channel.value[10...19]=103/' usleep 15
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
           usleep-655   [006] d... 2745434.122814: : 102
           usleep-904   [006] d... 2745439.916264: : 103
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[all]=104/' usleep 99
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
           usleep-655   [006] d... 2745434.122814: : 102
           usleep-904   [006] d... 2745439.916264: : 103
           usleep-1537  [003] d... 2745538.053737: : 104

Error case:
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[10...1000]=104/' usleep 99
 event syntax error: '..annel.value[10...1000]=104/'
                                   \___ Index too large
 Hint:	Valid config terms:
      	maps:[<arraymap>].value<indices>=[value]
      	maps:[<eventmap>].event<indices>=[event]

      	where <indices> is something like [0,3...5] or [all]
      	(add -v to see detail)
 Run 'perf list' for a list of valid events

  Usage: perf record [<options>] [<command>]
     or: perf record [<options>] -- <command> [<options>]

     -e, --event <event>   event selector. use 'perf list' to list available events
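
The "Index too large" error above implies a bounds check of the parsed
ranges against the map's max_entries. Where perf performs that check is
not shown in this patch, so the following is only a sketch under that
assumption; struct idx_range and ranges_fit() are hypothetical names that
mirror the (start, length) pairs built by the grammar:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one parsed range, e.g. 3...5 is {3, 3}. */
struct idx_range {
	unsigned long start;
	unsigned long length;
};

/*
 * Return true when every index named by the ranges fits in a map with
 * max_entries slots. An empty list models [all] and always fits.
 */
static bool ranges_fit(const struct idx_range *ranges, size_t nr_ranges,
		       unsigned long max_entries)
{
	size_t i;

	for (i = 0; i < nr_ranges; i++) {
		if (ranges[i].start >= max_entries ||
		    ranges[i].length > max_entries - ranges[i].start)
			return false;
	}
	return true;
}
```

For the 100-entry map above, [0,3...5] fits while [10...1000] does not.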

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c |  5 ++-
 tools/perf/util/parse-events.l | 13 ++++++-
 tools/perf/util/parse-events.y | 85 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index f229663..03d18f4 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -660,9 +660,10 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 						 sizeof(errbuf));
 			data->error->help = strdup(
 "Hint:\tValid config terms:\n"
-"     \tmaps:[<arraymap>].value=[value]\n"
-"     \tmaps:[<eventmap>].event=[event]\n"
+"     \tmaps:[<arraymap>].value<indices>=[value]\n"
+"     \tmaps:[<eventmap>].event<indices>=[event]\n"
 "\n"
+"     \twhere <indices> is something like [0,3...5] or [all]\n"
 "     \t(add -v to see detail)");
 			data->error->str = strdup(errbuf);
 			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 4387728..8bb3437 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -9,8 +9,8 @@
 %{
 #include <errno.h>
 #include "../perf.h"
-#include "parse-events-bison.h"
 #include "parse-events.h"
+#include "parse-events-bison.h"
 
 char *parse_events_get_text(yyscan_t yyscanner);
 YYSTYPE *parse_events_get_lval(yyscan_t yyscanner);
@@ -111,6 +111,7 @@ do {							\
 %x mem
 %s config
 %x event
+%x array
 
 group		[^,{}/]*[{][^}]*[}][^,{}/]*
 event_pmu	[^,{}/]+[/][^/]*[/][^,{}/]*
@@ -176,6 +177,14 @@ modifier_bp	[rwx]{1,3}
 
 }
 
+<array>{
+"]"			{ BEGIN(config); return ']'; }
+{num_dec}		{ return value(yyscanner, 10); }
+{num_hex}		{ return value(yyscanner, 16); }
+,			{ return ','; }
+"\.\.\."		{ return PE_ARRAY_RANGE; }
+}
+
 <config>{
 	/*
 	 * Please update parse_events_formats_error_string any time
@@ -196,6 +205,8 @@ no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
+\[all\]			{ return PE_ARRAY_ALL; }
+"["			{ BEGIN(array); return '['; }
 }
 
 <mem>{
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index c3cbd7a..7e93b9f 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -48,6 +48,7 @@ static inc_group_count(struct list_head *list,
 %token PE_PREFIX_MEM PE_PREFIX_RAW PE_PREFIX_GROUP
 %token PE_ERROR
 %token PE_PMU_EVENT_PRE PE_PMU_EVENT_SUF PE_KERNEL_PMU_EVENT
+%token PE_ARRAY_ALL PE_ARRAY_RANGE
 %type <num> PE_VALUE
 %type <num> PE_VALUE_SYM_HW
 %type <num> PE_VALUE_SYM_SW
@@ -84,6 +85,9 @@ static inc_group_count(struct list_head *list,
 %type <head> group_def
 %type <head> group
 %type <head> groups
+%type <array> array
+%type <array> array_term
+%type <array> array_terms
 
 %union
 {
@@ -95,6 +99,7 @@ static inc_group_count(struct list_head *list,
 		char *sys;
 		char *event;
 	} tracepoint_name;
+	struct parse_events_array array;
 }
 %%
 
@@ -601,6 +606,86 @@ PE_TERM
 	ABORT_ON(parse_events_term__num(&term, (int)$1, NULL, 1, &@1, NULL));
 	$$ = term;
 }
+|
+PE_NAME array '=' PE_NAME
+{
+	struct parse_events_term *term;
+	int i;
+
+	ABORT_ON(parse_events_term__str(&term, PARSE_EVENTS__TERM_TYPE_USER,
+					$1, $4, &@1, &@4));
+
+	term->array = $2;
+	$$ = term;
+}
+|
+PE_NAME array '=' PE_VALUE
+{
+	struct parse_events_term *term;
+
+	ABORT_ON(parse_events_term__num(&term, PARSE_EVENTS__TERM_TYPE_USER,
+					$1, $4, &@1, &@4));
+	term->array = $2;
+	$$ = term;
+}
+
+array:
+'[' array_terms ']'
+{
+	$$ = $2;
+}
+|
+PE_ARRAY_ALL
+{
+	$$.nr_ranges = 0;
+	$$.ranges = NULL;
+}
+
+array_terms:
+array_terms ',' array_term
+{
+	struct parse_events_array new_array;
+
+	new_array.nr_ranges = $1.nr_ranges + $3.nr_ranges;
+	new_array.ranges = malloc(sizeof(new_array.ranges[0]) *
+				  new_array.nr_ranges);
+	ABORT_ON(!new_array.ranges);
+	memcpy(&new_array.ranges[0], $1.ranges,
+	       $1.nr_ranges * sizeof(new_array.ranges[0]));
+	memcpy(&new_array.ranges[$1.nr_ranges], $3.ranges,
+	       $3.nr_ranges * sizeof(new_array.ranges[0]));
+	free($1.ranges);
+	free($3.ranges);
+	$$ = new_array;
+}
+|
+array_term
+
+array_term:
+PE_VALUE
+{
+	struct parse_events_array array;
+
+	array.nr_ranges = 1;
+	array.ranges = malloc(sizeof(array.ranges[0]));
+	ABORT_ON(!array.ranges);
+	array.ranges[0].start = $1;
+	array.ranges[0].length = 1;
+	$$ = array;
+}
+|
+PE_VALUE PE_ARRAY_RANGE PE_VALUE
+{
+	struct parse_events_array array;
+
+	ABORT_ON($3 < $1);
+	array.nr_ranges = 1;
+	array.ranges = malloc(sizeof(array.ranges[0]));
+	ABORT_ON(!array.ranges);
+	array.ranges[0].start = $1;
+	array.ranges[0].length = $3 - $1 + 1;
+	$$ = array;
+}
 
 sep_dc: ':' |
 
-- 
1.8.3.4


* [PATCH 16/54] perf tools: Introduce bpf-output event
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (14 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 15/54] perf tools: Enable indices setting syntax for BPF maps Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 17/54] perf data: Support converting data from bpf_perf_event_output() Wang Nan
                   ` (38 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Commit a43eec304259a6c637f4014a6d4767159b6a3aa3 (bpf: introduce
bpf_perf_event_output() helper) added a helper that lets BPF programs
output data to the perf ring buffer through a new type of perf event,
PERF_COUNT_SW_BPF_OUTPUT. This patch enables perf to create perf
events of that type. Perf users can now use the following command line
to receive output data from BPF programs:

 # ./perf record -a -e evt=bpf-output/no-inherit/ \
                    -e ./test_bpf_output.c/maps:channel.event=evt/ ls /
 # ./perf script
	perf 12927 [004] 355971.129276:          0 evt=bpf-output/no-inherit/:  ffffffff811ed5f1 sys_write
	perf 12927 [004] 355971.129279:          0 evt=bpf-output/no-inherit/:  ffffffff811ed5f1 sys_write
	...

Test result:
 # cat ./test_bpf_output.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };

 #define SEC(NAME) __attribute__((section(NAME), used))
 static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
 };

 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
 	struct {
 		u64 ktime;
 		int cpuid;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output: %d\n";

 	output_data.cpuid = get_smp_processor_id();
 	output_data.ktime = ktime_get_ns();
 	int err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				    &output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data), err);
 	return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************ END ***************************/

 # ./perf record -a -e evt=bpf-output/no-inherit/ \
                    -e ./test_bpf_output.c/maps:channel.event=evt/ ls /
 # ./perf script | grep ls
              ls  4085 [000] 2746114.230215: evt=bpf-output/no-inherit/:  ffffffff811ed5f1 sys_write (/lib/modules/4.3.0-rc4+/build/vmlinux)
              ls  4085 [000] 2746114.230244: evt=bpf-output/no-inherit/:  ffffffff811ed5f1 sys_write (/lib/modules/4.3.0-rc4+/build/vmlinux)
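
For readers who open such an event without the perf tool, here is a
minimal sketch of the attribute setup this patch implies: software type,
BPF output config, raw samples and a sample period of 1. It assumes uapi
headers recent enough to define PERF_COUNT_SW_BPF_OUTPUT;
init_bpf_output_attr() is a hypothetical name, not a perf function:

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: fill a perf_event_attr with the type/config pair
 * that perf_evsel__is_bpf_output() tests for, plus the two fields the
 * patch forces in perf_evsel__new_idx().
 */
static void init_bpf_output_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_BPF_OUTPUT;
	attr->sample_type = PERF_SAMPLE_RAW;
	attr->sample_period = 1;
}
```

The filled attribute would then be passed to perf_event_open() once per
CPU, and the resulting fds stored in the BPF_MAP_TYPE_PERF_EVENT_ARRAY map.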

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 5 ++---
 tools/perf/util/evsel.c        | 5 +++++
 tools/perf/util/evsel.h        | 8 ++++++++
 tools/perf/util/parse-events.l | 1 +
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 6c25de8..92b815e 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -1312,13 +1312,12 @@ apply_config_evsel_for_key(const char *name, int map_fd, void *pkey,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel))
+		check_pass = true;
 	if (attr->type == PERF_TYPE_RAW)
 		check_pass = true;
 	if (attr->type == PERF_TYPE_HARDWARE)
 		check_pass = true;
-	if (attr->type == PERF_TYPE_SOFTWARE &&
-			attr->config == PERF_COUNT_SW_BPF_OUTPUT)
-		check_pass = true;
 	if (!check_pass) {
 		pr_debug("ERROR: Event type is wrong for map %s\n", name);
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 7aebd5d..7498d58 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -225,6 +225,11 @@ struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
 	if (evsel != NULL)
 		perf_evsel__init(evsel, attr, idx);
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		evsel->attr.sample_type |= PERF_SAMPLE_RAW;
+		evsel->attr.sample_period = 1;
+	}
+
 	return evsel;
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 19885fb..022fcff 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -365,6 +365,14 @@ static inline bool perf_evsel__is_function_event(struct perf_evsel *evsel)
 #undef FUNCTION_EVENT
 }
 
+static inline bool perf_evsel__is_bpf_output(struct perf_evsel *evsel)
+{
+	struct perf_event_attr *attr = &evsel->attr;
+
+	return (attr->config == PERF_COUNT_SW_BPF_OUTPUT) &&
+		(attr->type == PERF_TYPE_SOFTWARE);
+}
+
 struct perf_attr_details {
 	bool freq;
 	bool verbose;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 8bb3437..27d567f 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -249,6 +249,7 @@ cpu-migrations|migrations			{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COU
 alignment-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
 emulation-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
 dummy						{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
+bpf-output					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
 
 	/*
 	 * We have to handle the kernel PMU event cycles-ct/cycles-t/mem-loads/mem-stores separately.
-- 
1.8.3.4


* [PATCH 17/54] perf data: Support converting data from bpf_perf_event_output()
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (15 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 16/54] perf tools: Introduce bpf-output event Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 18/54] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
                   ` (37 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

bpf_perf_event_output() outputs data through sample->raw_data. This
patch adds support for converting that data into CTF. A Python script
can then be used to process output data from BPF programs.

Test result:

 # cat ./test_bpf_output_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };
 #define SEC(NAME) __attribute__((section(NAME), used))
 static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
 };

 static inline int __attribute__((always_inline))
 func(void *ctx, int type)
 {
 	struct {
 		u64 ktime;
 		int type;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output\n";
 	int err;

 	output_data.type = type;
 	output_data.ktime = ktime_get_ns();
 	err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				&output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data));
 	return 0;
 }
 SEC("func_begin=sys_nanosleep")
 int func_begin(void *ctx) {return func(ctx, 1);}
 SEC("func_end=sys_nanosleep%return")
 int func_end(void *ctx) { return func(ctx, 2);}
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 # ./perf record -e evt=bpf-output/no-inherit/ \
                 -e ./test_bpf_output_2.c/maps:channel.event=evt/ \
                 usleep 100000
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]

 # ./perf script
          usleep 14942 92503.198504: evt=bpf-output/no-inherit/:  ffffffff810e0ba1 sys_nanosleep (/lib/modules/4.3.0....
          usleep 14942 92503.298562: evt=bpf-output/no-inherit/:  ffffffff810585e9 kretprobe_trampoline_holder (/lib....

 # ./perf data convert --to-ctf ./out.ctf
 [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
 [ perf data convert: Converted and wrote 0.000 MB (2 samples) ]

 # babeltrace ./out.ctf
 [01:41:43.198504134] (+?.?????????) evt=bpf-output/no-inherit/: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E0BA1, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x32C0C07B, [1] = 0x5421, [2] = 0x1 ] }
 [01:41:43.298562257] (+0.100058123) evt=bpf-output/no-inherit/: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810585E9, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x38B77FAA, [1] = 0x5421, [2] = 0x2 ] }

 # cat ./test_bpf_output_2.py
 from babeltrace import TraceCollection
 tc = TraceCollection()
 tc.add_trace('./out.ctf', 'ctf')
 d = {1:[], 2:[]}
 for event in tc.events:
     if not event.name.startswith('evt=bpf-output/no-inherit/'):
         continue
     raw_data = event['raw_data']
     (time, type) = ((raw_data[0] + (raw_data[1] << 32)), raw_data[2])
     d[type].append(time)
 print(list(map(lambda i: d[2][i] - d[1][i], range(len(d[1])))))

 # python3 ./test_bpf_output_2.py
 [100056879]
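
The same decoding can be sketched in C. The converter emits raw_data as a
sequence of 32-bit words, so the packed { u64 ktime; int type; } record
comes back as three words. struct bpf_record and decode_record() are
hypothetical names, and the word order assumes the trace was recorded on
a little-endian machine, as the babeltrace output above suggests:

```c
#include <stdint.h>

/* Hypothetical decoded form of the packed { u64 ktime; int type; } record. */
struct bpf_record {
	uint64_t ktime;
	uint32_t type;
};

/*
 * Rebuild the record from three 32-bit words as printed by babeltrace:
 * words[0] is the low half of ktime, words[1] the high half and
 * words[2] the type.
 */
static struct bpf_record decode_record(const uint32_t words[3])
{
	struct bpf_record rec;

	rec.ktime = (uint64_t)words[0] | ((uint64_t)words[1] << 32);
	rec.type = words[2];
	return rec;
}
```

Feeding it the two raw_data arrays shown above gives a ktime difference
of 100056879 ns, matching the Python script.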

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data-convert-bt.c | 112 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 111 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 34cd1e4..62ccf8d 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -352,6 +352,84 @@ static int add_tracepoint_values(struct ctf_writer *cw,
 	return ret;
 }
 
+static int
+add_bpf_output_values(struct bt_ctf_event_class *event_class,
+		      struct bt_ctf_event *event,
+		      struct perf_sample *sample)
+{
+	struct bt_ctf_field_type *len_type, *seq_type;
+	struct bt_ctf_field *len_field, *seq_field;
+	unsigned int raw_size = sample->raw_size;
+	unsigned int nr_elements = raw_size / sizeof(u32);
+	unsigned int i;
+	int ret;
+
+	if (nr_elements * sizeof(u32) != raw_size)
+		pr_warning("Incorrect raw_size (%u) in bpf output event, skip %zu bytes\n",
+			   raw_size, raw_size - nr_elements * sizeof(u32));
+
+	len_type = bt_ctf_event_class_get_field_by_name(event_class, "raw_len");
+	len_field = bt_ctf_field_create(len_type);
+	if (!len_field) {
+		pr_err("failed to create 'raw_len' for bpf output event\n");
+		ret = -1;
+		goto put_len_type;
+	}
+
+	ret = bt_ctf_field_unsigned_integer_set_value(len_field, nr_elements);
+	if (ret) {
+		pr_err("failed to set field value for raw_len\n");
+		goto put_len_field;
+	}
+	ret = bt_ctf_event_set_payload(event, "raw_len", len_field);
+	if (ret) {
+		pr_err("failed to set payload to raw_len\n");
+		goto put_len_field;
+	}
+
+	seq_type = bt_ctf_event_class_get_field_by_name(event_class, "raw_data");
+	seq_field = bt_ctf_field_create(seq_type);
+	if (!seq_field) {
+		pr_err("failed to create 'raw_data' for bpf output event\n");
+		ret = -1;
+		goto put_seq_type;
+	}
+
+	ret = bt_ctf_field_sequence_set_length(seq_field, len_field);
+	if (ret) {
+		pr_err("failed to set length of 'raw_data'\n");
+		goto put_seq_field;
+	}
+
+	for (i = 0; i < nr_elements; i++) {
+		struct bt_ctf_field *elem_field =
+			bt_ctf_field_sequence_get_field(seq_field, i);
+
+		ret = bt_ctf_field_unsigned_integer_set_value(elem_field,
+				((u32 *)(sample->raw_data))[i]);
+
+		bt_ctf_field_put(elem_field);
+		if (ret) {
+			pr_err("failed to set raw_data[%d]\n", i);
+			goto put_seq_field;
+		}
+	}
+
+	ret = bt_ctf_event_set_payload(event, "raw_data", seq_field);
+	if (ret)
+		pr_err("failed to set payload for raw_data\n");
+
+put_seq_field:
+	bt_ctf_field_put(seq_field);
+put_seq_type:
+	bt_ctf_field_type_put(seq_type);
+put_len_field:
+	bt_ctf_field_put(len_field);
+put_len_type:
+	bt_ctf_field_type_put(len_type);
+	return ret;
+}
+
 static int add_generic_values(struct ctf_writer *cw,
 			      struct bt_ctf_event *event,
 			      struct perf_evsel *evsel,
@@ -597,6 +675,12 @@ static int process_sample_event(struct perf_tool *tool,
 			return -1;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		ret = add_bpf_output_values(event_class, event, sample);
+		if (ret)
+			return -1;
+	}
+
 	cs = ctf_stream(cw, get_sample_cpu(cw, sample, evsel));
 	if (cs) {
 		if (is_flush_needed(cs))
@@ -744,6 +828,25 @@ static int add_tracepoint_types(struct ctf_writer *cw,
 	return ret;
 }
 
+static int add_bpf_output_types(struct ctf_writer *cw,
+				struct bt_ctf_event_class *class)
+{
+	struct bt_ctf_field_type *len_type = cw->data.u32;
+	struct bt_ctf_field_type *seq_base_type = cw->data.u32_hex;
+	struct bt_ctf_field_type *seq_type;
+	int ret;
+
+	ret = bt_ctf_event_class_add_field(class, len_type, "raw_len");
+	if (ret)
+		return ret;
+
+	seq_type = bt_ctf_field_type_sequence_create(seq_base_type, "raw_len");
+	if (!seq_type)
+		return -1;
+
+	return bt_ctf_event_class_add_field(class, seq_type, "raw_data");
+}
+
 static int add_generic_types(struct ctf_writer *cw, struct perf_evsel *evsel,
 			     struct bt_ctf_event_class *event_class)
 {
@@ -755,7 +858,8 @@ static int add_generic_types(struct ctf_writer *cw, struct perf_evsel *evsel,
 	 *                              ctf event header
 	 *   PERF_SAMPLE_READ         - TODO
 	 *   PERF_SAMPLE_CALLCHAIN    - TODO
-	 *   PERF_SAMPLE_RAW          - tracepoint fields are handled separately
+	 *   PERF_SAMPLE_RAW          - tracepoint fields and BPF output
+	 *                              are handled separately
 	 *   PERF_SAMPLE_BRANCH_STACK - TODO
 	 *   PERF_SAMPLE_REGS_USER    - TODO
 	 *   PERF_SAMPLE_STACK_USER   - TODO
@@ -824,6 +928,12 @@ static int add_event(struct ctf_writer *cw, struct perf_evsel *evsel)
 			goto err;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		ret = add_bpf_output_types(cw, event_class);
+		if (ret)
+			goto err;
+	}
+
 	ret = bt_ctf_stream_class_add_event_class(cw->stream_class, event_class);
 	if (ret) {
 		pr("Failed to add event class into stream.\n");
-- 
1.8.3.4


* [PATCH 18/54] perf core: Introduce new ioctl options to pause and resume ring buffer
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (16 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 17/54] perf data: Support converting data from bpf_perf_event_output() Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 19/54] perf core: Set event's default overflow_handler Wang Nan
                   ` (36 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Add a new ioctl() to pause/resume ring buffer output.

In some situations we want to read from the ring buffer only when we
can ensure that nothing writes to it during the read. Without this
patch we have to turn off all events attached to the ring buffer to
achieve this.

This patch prepares for overwrite ring buffer support. Following
commits will introduce new methods that support reading from an
overwrite ring buffer. Before reading, the caller must ensure the ring
buffer is frozen, or the read is unreliable.
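
From user space the intended pattern is pause, read, resume. A minimal
sketch, assuming perf_fd is a perf event fd with a mapped ring buffer;
read_while_paused() and count_call() are hypothetical names, and the
fallback define copies the ioctl number from this patch for headers that
lack it:

```c
#include <linux/perf_event.h>
#include <sys/ioctl.h>

/* Added by this patch; older uapi headers will not have it. */
#ifndef PERF_EVENT_IOC_PAUSE_OUTPUT
#define PERF_EVENT_IOC_PAUSE_OUTPUT	_IOW('$', 9, __u32)
#endif

/* Example reader used for illustration: just counts how often it ran. */
static void count_call(void *arg)
{
	++*(int *)arg;
}

/*
 * Hypothetical helper: pause output on perf_fd, run the caller's
 * reader, then resume. Returns 0 on success and -1 if either ioctl
 * fails; the reader is not called when pausing fails.
 */
static int read_while_paused(int perf_fd, void (*reader)(void *), void *arg)
{
	if (ioctl(perf_fd, PERF_EVENT_IOC_PAUSE_OUTPUT, 1) < 0)
		return -1;
	reader(arg);
	if (ioctl(perf_fd, PERF_EVENT_IOC_PAUSE_OUTPUT, 0) < 0)
		return -1;
	return 0;
}
```

Records that arrive while the buffer is paused are counted as lost by the
kernel side of this patch, so the paused window should be kept short.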

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 13 +++++++++++++
 kernel/events/internal.h        | 11 +++++++++++
 kernel/events/ring_buffer.c     |  7 ++++++-
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 1afe962..a3c1903 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -401,6 +401,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
 #define PERF_EVENT_IOC_ID		_IOR('$', 7, __u64 *)
 #define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, __u32)
+#define PERF_EVENT_IOC_PAUSE_OUTPUT	_IOW('$', 9, __u32)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index bf82441..9e9c84da 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4241,6 +4241,19 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 	case PERF_EVENT_IOC_SET_BPF:
 		return perf_event_set_bpf_prog(event, arg);
 
+	case PERF_EVENT_IOC_PAUSE_OUTPUT: {
+		struct ring_buffer *rb;
+
+		rcu_read_lock();
+		rb = rcu_dereference(event->rb);
+		if (!rb) {
+			rcu_read_unlock();
+			return -EINVAL;
+		}
+		rb_toggle_paused(rb, !!arg);
+		rcu_read_unlock();
+		return 0;
+	}
 	default:
 		return -ENOTTY;
 	}
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 2bbad9c..6a93d1b 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -18,6 +18,7 @@ struct ring_buffer {
 #endif
 	int				nr_pages;	/* nr of data pages  */
 	int				overwrite;	/* can overwrite itself */
+	int				paused;		/* can write into ring buffer */
 
 	atomic_t			poll;		/* POLL_ for wakeups */
 
@@ -65,6 +66,16 @@ static inline void rb_free_rcu(struct rcu_head *rcu_head)
 	rb_free(rb);
 }
 
+static inline void
+rb_toggle_paused(struct ring_buffer *rb,
+		 bool pause)
+{
+	if (!pause && rb->nr_pages)
+		rb->paused = 0;
+	else
+		rb->paused = 1;
+}
+
 extern struct ring_buffer *
 rb_alloc(int nr_pages, long watermark, int cpu, int flags);
 extern void perf_event_wakeup(struct perf_event *event);
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index adfdc05..9f1a93f 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -125,8 +125,11 @@ int perf_output_begin(struct perf_output_handle *handle,
 	if (unlikely(!rb))
 		goto out;
 
-	if (unlikely(!rb->nr_pages))
+	if (unlikely(rb->paused)) {
+		if (rb->nr_pages)
+			local_inc(&rb->lost);
 		goto out;
+	}
 
 	handle->rb    = rb;
 	handle->event = event;
@@ -244,6 +247,8 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
 	INIT_LIST_HEAD(&rb->event_list);
 	spin_lock_init(&rb->event_lock);
 	init_irq_work(&rb->irq_work, rb_irq_work);
+
+	rb->paused = rb->nr_pages ? 0 : 1;
 }
 
 static void ring_buffer_put_async(struct ring_buffer *rb)
-- 
1.8.3.4


* [PATCH 19/54] perf core: Set event's default overflow_handler
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (17 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 18/54] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 20/54] perf core: Prepare writing into ring buffer from end Wang Nan
                   ` (35 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Set a default event->overflow_handler in perf_event_alloc() so that
__perf_event_overflow() no longer needs to check it.
Following commits can provide a different default overflow_handler.

No extra overhead is introduced into the hot path, because the original
code also had to read this handler from memory. A conditional branch
is avoided, so we actually remove some instructions.
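
The pattern can be sketched outside the kernel as follows. All names here
are hypothetical and the counters exist only to make the effect visible;
the point is that the NULL test moves from the overflow path to
allocation time:

```c
#include <stddef.h>

struct event;
typedef void (*overflow_fn)(struct event *ev);

struct event {
	overflow_fn handler;
	int default_hits;
	int custom_hits;
};

/* Stand-in for perf_event_output(). */
static void default_output(struct event *ev)
{
	ev->default_hits++;
}

/* Stand-in for a caller-supplied overflow handler. */
static void custom_output(struct event *ev)
{
	ev->custom_hits++;
}

/* Allocation time: fall back to the default when no handler is given. */
static void event_init(struct event *ev, overflow_fn handler)
{
	ev->handler = handler ? handler : default_output;
	ev->default_hits = 0;
	ev->custom_hits = 0;
}

/* Hot path: one indirect call, no NULL test. */
static void event_overflow(struct event *ev)
{
	ev->handler(ev);
}
```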

Initial idea comes from Peter at [1].

[1] http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 kernel/events/core.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9e9c84da..f79c4be 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6402,10 +6402,7 @@ static int __perf_event_overflow(struct perf_event *event,
 		irq_work_queue(&event->pending);
 	}
 
-	if (event->overflow_handler)
-		event->overflow_handler(event, data, regs);
-	else
-		perf_event_output(event, data, regs);
+	event->overflow_handler(event, data, regs);
 
 	if (*perf_event_fasync(event) && event->pending_kill) {
 		event->pending_wakeup = 1;
@@ -7874,8 +7871,13 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		context = parent_event->overflow_handler_context;
 	}
 
-	event->overflow_handler	= overflow_handler;
-	event->overflow_handler_context = context;
+	if (overflow_handler) {
+		event->overflow_handler	= overflow_handler;
+		event->overflow_handler_context = context;
+	} else {
+		event->overflow_handler = perf_event_output;
+		event->overflow_handler_context = NULL;
+	}
 
 	perf_event__state_init(event);
 
-- 
1.8.3.4


* [PATCH 20/54] perf core: Prepare writing into ring buffer from end
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (18 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 19/54] perf core: Set event's default overflow_handler Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 21/54] perf core: Add backward attribute to perf event Wang Nan
                   ` (34 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Convert perf_output_begin() to __perf_output_begin() and make the latter
function able to write records from the end of the ring buffer.
Following commits will utilize the 'backward' flag.

This patch doesn't introduce any extra performance overhead since we
use always_inline.
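
The space check for the backward case can be tried in user space. The
sketch below copies ring_buffer_has_space() from this patch and
reproduces CIRC_CNT()/CIRC_SPACE() as defined in <linux/circ_buf.h>; the
16-byte buffer size in the usage note is an arbitrary illustration, not a
value from the patch:

```c
#include <stdbool.h>

/* As in the kernel's <linux/circ_buf.h>; size must be a power of two. */
#define CIRC_CNT(head, tail, size)	(((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size)	CIRC_CNT((tail), ((head) + 1), (size))

/* User-space copy of the helper added by this patch. */
static bool ring_buffer_has_space(unsigned long head, unsigned long tail,
				  unsigned long data_size, unsigned int size,
				  bool backward)
{
	if (!backward)
		return CIRC_SPACE(head, tail, data_size) >= size;
	else
		return CIRC_SPACE(tail, head, data_size) >= size;
}
```

With a 16-byte buffer and tail at 0, writing 4 bytes backward moves head
to (unsigned long)-4 and leaves room for 11 more bytes, the same amount
that remains after a 4-byte forward write with head at 4.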

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 kernel/events/ring_buffer.c | 42 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 9f1a93f..0684e880 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -102,8 +102,21 @@ out:
 	preempt_enable();
 }
 
-int perf_output_begin(struct perf_output_handle *handle,
-		      struct perf_event *event, unsigned int size)
+static bool __always_inline
+ring_buffer_has_space(unsigned long head, unsigned long tail,
+		      unsigned long data_size, unsigned int size,
+		      bool backward)
+{
+	if (!backward)
+		return CIRC_SPACE(head, tail, data_size) >= size;
+	else
+		return CIRC_SPACE(tail, head, data_size) >= size;
+}
+
+static int __always_inline
+__perf_output_begin(struct perf_output_handle *handle,
+		    struct perf_event *event, unsigned int size,
+		    bool backward)
 {
 	struct ring_buffer *rb;
 	unsigned long tail, offset, head;
@@ -146,9 +159,12 @@ int perf_output_begin(struct perf_output_handle *handle,
 	do {
 		tail = READ_ONCE(rb->user_page->data_tail);
 		offset = head = local_read(&rb->head);
-		if (!rb->overwrite &&
-		    unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))
-			goto fail;
+		if (!rb->overwrite) {
+			if (unlikely(!ring_buffer_has_space(head, tail,
+							    perf_data_size(rb),
+							    size, backward)))
+				goto fail;
+		}
 
 		/*
 		 * The above forms a control dependency barrier separating the
@@ -162,9 +178,17 @@ int perf_output_begin(struct perf_output_handle *handle,
 		 * See perf_output_put_handle().
 		 */
 
-		head += size;
+		if (!backward)
+			head += size;
+		else
+			head -= size;
 	} while (local_cmpxchg(&rb->head, offset, head) != offset);
 
+	if (backward) {
+		offset = head;
+		head = (u64)(-head);
+	}
+
 	/*
 	 * We rely on the implied barrier() by local_cmpxchg() to ensure
 	 * none of the data stores below can be lifted up by the compiler.
@@ -206,6 +230,12 @@ out:
 	return -ENOSPC;
 }
 
+int perf_output_begin(struct perf_output_handle *handle,
+		      struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, false);
+}
+
 unsigned int perf_output_copy(struct perf_output_handle *handle,
 		      const void *buf, unsigned int len)
 {
-- 
1.8.3.4

* [PATCH 21/54] perf core: Add backward attribute to perf event
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (19 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 20/54] perf core: Prepare writing into ring buffer from end Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 22/54] perf core: Reduce perf event output overhead by new overflow handler Wang Nan
                   ` (33 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

A new bit, 'write_backward', is appended to perf_event_attr to indicate
that the event should write its ring buffer from the end to the beginning.

In perf_output_begin(), prepare the ring buffer according to this bit.

This patch introduces a small overhead into perf_output_begin():
an extra memory read and a conditional branch. A later patch removes
this overhead by using a custom output handler.
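
The uapi change takes one bit out of __reserved_1, so the flag word stays
64 bits wide and the existing bits keep their positions. A simplified
sketch of that arithmetic follows; struct flags_sketch is hypothetical and
collapses the 27 pre-existing flag bits into a single field.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the flag word of perf_event_attr. */
struct flags_sketch {
	uint64_t earlier_flags  : 27, /* all flags up to context_switch, collapsed */
		 write_backward :  1, /* new bit added by this patch */
		 reserved_1     : 36; /* was 37 before this patch */
};

/* 27 + 1 + 36 == 64, so the size of the word is unchanged. */
_Static_assert(sizeof(struct flags_sketch) == sizeof(uint64_t),
	       "flag word must stay 64 bits");

/* Same shape as is_write_backward() in the patch, on the sketch type. */
static int is_write_backward_sketch(const struct flags_sketch *f)
{
	return !!f->write_backward;
}
```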

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/linux/perf_event.h      | 5 +++++
 include/uapi/linux/perf_event.h | 3 ++-
 kernel/events/core.c            | 7 +++++++
 kernel/events/ring_buffer.c     | 2 ++
 4 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f9828a4..54c3fb2 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1032,6 +1032,11 @@ static inline bool has_aux(struct perf_event *event)
 	return event->pmu->setup_aux;
 }
 
+static inline bool is_write_backward(struct perf_event *event)
+{
+	return !!event->attr.write_backward;
+}
+
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
 extern void perf_output_end(struct perf_output_handle *handle);
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index a3c1903..43fc8d2 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -340,7 +340,8 @@ struct perf_event_attr {
 				comm_exec      :  1, /* flag comm events that are due to an exec */
 				use_clockid    :  1, /* use @clockid for time fields */
 				context_switch :  1, /* context switch data */
-				__reserved_1   : 37;
+				write_backward :  1, /* Write ring buffer from end to beginning */
+				__reserved_1   : 36;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f79c4be..8ad22a5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8107,6 +8107,13 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 		goto out;
 
 	/*
+	 * Either writing ring buffer from beginning or from end.
+	 * Mixing is not allowed.
+	 */
+	if (is_write_backward(output_event) != is_write_backward(event))
+		goto out;
+
+	/*
 	 * If both events generate aux data, they must be on the same PMU
 	 */
 	if (has_aux(event) && has_aux(output_event) &&
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 0684e880..28543e1 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -233,6 +233,8 @@ out:
 int perf_output_begin(struct perf_output_handle *handle,
 		      struct perf_event *event, unsigned int size)
 {
+	if (unlikely(is_write_backward(event)))
+		return __perf_output_begin(handle, event, size, true);
 	return __perf_output_begin(handle, event, size, false);
 }
 
-- 
1.8.3.4

* [PATCH 22/54] perf core: Reduce perf event output overhead by new overflow handler
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (20 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 21/54] perf core: Add backward attribute to perf event Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 23/54] perf tools: Introduce API to pause ring buffer Wang Nan
                   ` (32 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

By creating onward- and backward-specific overflow handlers and selecting
one according to the event's write_backward setting, normal sampling events
no longer need to check that setting on every output.

This is the last patch of the backward-writing patchset. After this patch,
no extra overhead is introduced into the fast path of sampling
output.
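
The idea can be sketched in userspace as follows. All names here
(event_sketch, event_init(), output_onward(), output_backward()) are
hypothetical and only model the selection logic, not the kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

struct event_sketch;
typedef int (*handler_fn)(struct event_sketch *);

/* Hypothetical, pared-down event: only what the selection needs. */
struct event_sketch {
	bool write_backward;
	handler_fn overflow_handler;
};

static int output_onward(struct event_sketch *e)   { (void)e; return +1; }
static int output_backward(struct event_sketch *e) { (void)e; return -1; }

/* Pick the handler once at allocation time, as perf_event_alloc() does. */
static void event_init(struct event_sketch *e, bool write_backward,
		       handler_fn custom)
{
	e->write_backward = write_backward;
	if (custom)
		e->overflow_handler = custom;
	else if (write_backward)
		e->overflow_handler = output_backward;
	else
		e->overflow_handler = output_onward;
}

/* Fast path: one indirect call, no test of write_backward. */
static int event_overflow(struct event_sketch *e)
{
	return e->overflow_handler(e);
}
```

The direction is decided when the event is set up, so the per-sample path
only pays for the indirect call it already made before this series.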

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/linux/perf_event.h  | 17 +++++++++++++++--
 kernel/events/core.c        | 41 ++++++++++++++++++++++++++++++++++++-----
 kernel/events/ring_buffer.c | 12 ++++++++++++
 3 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 54c3fb2..c0335b9 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -830,9 +830,15 @@ extern int perf_event_overflow(struct perf_event *event,
 				 struct perf_sample_data *data,
 				 struct pt_regs *regs);
 
+extern void perf_event_output_onward(struct perf_event *event,
+				     struct perf_sample_data *data,
+				     struct pt_regs *regs);
+extern void perf_event_output_backward(struct perf_event *event,
+				       struct perf_sample_data *data,
+				       struct pt_regs *regs);
 extern void perf_event_output(struct perf_event *event,
-				struct perf_sample_data *data,
-				struct pt_regs *regs);
+			      struct perf_sample_data *data,
+			      struct pt_regs *regs);
 
 extern void
 perf_event_header__init_id(struct perf_event_header *header,
@@ -1039,6 +1045,13 @@ static inline bool is_write_backward(struct perf_event *event)
 
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
+extern int perf_output_begin_onward(struct perf_output_handle *handle,
+				    struct perf_event *event,
+				    unsigned int size);
+extern int perf_output_begin_backward(struct perf_output_handle *handle,
+				      struct perf_event *event,
+				      unsigned int size);
+
 extern void perf_output_end(struct perf_output_handle *handle);
 extern unsigned int perf_output_copy(struct perf_output_handle *handle,
 			     const void *buf, unsigned int len);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8ad22a5..8a25e46 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5541,9 +5541,13 @@ void perf_prepare_sample(struct perf_event_header *header,
 	}
 }
 
-void perf_event_output(struct perf_event *event,
-			struct perf_sample_data *data,
-			struct pt_regs *regs)
+static void __always_inline
+__perf_event_output(struct perf_event *event,
+		    struct perf_sample_data *data,
+		    struct pt_regs *regs,
+		    int (*output_begin)(struct perf_output_handle *,
+					struct perf_event *,
+					unsigned int))
 {
 	struct perf_output_handle handle;
 	struct perf_event_header header;
@@ -5553,7 +5557,7 @@ void perf_event_output(struct perf_event *event,
 
 	perf_prepare_sample(&header, data, event, regs);
 
-	if (perf_output_begin(&handle, event, header.size))
+	if (output_begin(&handle, event, header.size))
 		goto exit;
 
 	perf_output_sample(&handle, &header, data, event);
@@ -5564,6 +5568,30 @@ exit:
 	rcu_read_unlock();
 }
 
+void
+perf_event_output_onward(struct perf_event *event,
+			 struct perf_sample_data *data,
+			 struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin_onward);
+}
+
+void
+perf_event_output_backward(struct perf_event *event,
+			   struct perf_sample_data *data,
+			   struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin_backward);
+}
+
+void
+perf_event_output(struct perf_event *event,
+		  struct perf_sample_data *data,
+		  struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin);
+}
+
 /*
  * read event_id
  */
@@ -7874,8 +7902,11 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	if (overflow_handler) {
 		event->overflow_handler	= overflow_handler;
 		event->overflow_handler_context = context;
+	} else if (is_write_backward(event)){
+		event->overflow_handler = perf_event_output_backward;
+		event->overflow_handler_context = NULL;
 	} else {
-		event->overflow_handler = perf_event_output;
+		event->overflow_handler = perf_event_output_onward;
 		event->overflow_handler_context = NULL;
 	}
 
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 28543e1..ca11809 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -230,6 +230,18 @@ out:
 	return -ENOSPC;
 }
 
+int perf_output_begin_onward(struct perf_output_handle *handle,
+			     struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, false);
+}
+
+int perf_output_begin_backward(struct perf_output_handle *handle,
+			       struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, true);
+}
+
 int perf_output_begin(struct perf_output_handle *handle,
 		      struct perf_event *event, unsigned int size)
 {
-- 
1.8.3.4

* [PATCH 23/54] perf tools: Introduce API to pause ring buffer
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (21 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 22/54] perf core: Reduce perf event output overhead by new overflow handler Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 24/54] perf tools: Only validate is_pos for tracking evsels Wang Nan
                   ` (31 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

perf_evsel__pause() is introduced to pause a ring buffer. Since the output
of an evsel is bound together, an ioctl() on the first file is enough.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evsel.c | 6 ++++++
 tools/perf/util/evsel.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 7498d58..0b562cf 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1009,6 +1009,12 @@ int perf_evsel__disable(struct perf_evsel *evsel)
 				     0);
 }
 
+int perf_evsel__pause(struct perf_evsel *evsel, bool pause)
+{
+	return perf_evsel__run_ioctl(evsel, 1, 1, PERF_EVENT_IOC_PAUSE_OUTPUT,
+				     (void *)(pause ? 1UL : 0UL));
+}
+
 int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 {
 	if (ncpus == 0 || nthreads == 0)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 022fcff..d5ae7ba 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -230,6 +230,7 @@ int perf_evsel__apply_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
 			     const char *filter);
 int perf_evsel__enable(struct perf_evsel *evsel);
 int perf_evsel__disable(struct perf_evsel *evsel);
+int perf_evsel__pause(struct perf_evsel *evsel, bool pause);
 
 int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
 			     struct cpu_map *cpus);
-- 
1.8.3.4

* [PATCH 24/54] perf tools: Only validate is_pos for tracking evsels
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (22 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 23/54] perf tools: Introduce API to pause ring buffer Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 25/54] perf tools: Print write_backward value in perf_event_attr__fprintf Wang Nan
                   ` (30 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

is_pos is only useful for tracking events (fork, mmap, exit, ...).
Perf collects those events through the evsel with 'tracking' set.
Therefore, there's no need to validate every evsel's is_pos against
evlist->is_pos.

This patch is required once perf supports PERF_SAMPLE_TAILSIZE.
Since there is an extra u64 at the end of this type of evsel, is_pos
for an evsel with PERF_SAMPLE_TAILSIZE set differs from that of other
evsels.
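
A minimal sketch of the relaxed check, assuming a pared-down evsel that
carries only the fields the loop reads. struct evsel_sketch and
valid_positions() are hypothetical names used for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down evsel: only the fields the check reads. */
struct evsel_sketch {
	int id_pos;
	int is_pos;
	bool tracking;
};

/* Models the loop in perf_evlist__valid_sample_type() after this patch. */
static bool valid_positions(const struct evsel_sketch *evsels, size_t n,
			    int list_id_pos, int list_is_pos)
{
	for (size_t i = 0; i < n; i++) {
		if (evsels[i].id_pos != list_id_pos)
			return false;
		/* is_pos only matters for the evsel collecting tracking events. */
		if (evsels[i].tracking && evsels[i].is_pos != list_is_pos)
			return false;
	}
	return true;
}
```

A non-tracking evsel with a different is_pos no longer fails validation,
while a mismatch on the tracking evsel still does.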

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 890b08b..c80aad1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1274,8 +1274,15 @@ bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
 		return false;
 
 	evlist__for_each(evlist, pos) {
-		if (pos->id_pos != evlist->id_pos ||
-		    pos->is_pos != evlist->is_pos)
+		if (pos->id_pos != evlist->id_pos)
+			return false;
+		/*
+		 * Only tracking events needs is_pos. Those events are
+		 * collected if evsel->tracking is selected.
+		 * For other evsel, is_pos is useless for other evsels,
+		 * so skip validating them.
+		 */
+		if (pos->tracking && pos->is_pos != evlist->is_pos)
 			return false;
 	}
 
-- 
1.8.3.4

* [PATCH 25/54] perf tools: Print write_backward value in perf_event_attr__fprintf
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (23 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 24/54] perf tools: Only validate is_pos for tracking evsels Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 26/54] perf tools: Move timestamp creation to util Wang Nan
                   ` (29 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Print the write_backward setting when printing a perf evsel's attributes.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evsel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0b562cf..66a47ba 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1296,6 +1296,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(comm_exec, p_unsigned);
 	PRINT_ATTRf(use_clockid, p_unsigned);
 	PRINT_ATTRf(context_switch, p_unsigned);
+	PRINT_ATTRf(write_backward, p_unsigned);
 
 	PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned);
 	PRINT_ATTRf(bp_type, p_unsigned);
-- 
1.8.3.4

* [PATCH 26/54] perf tools: Move timestamp creation to util
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (24 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 25/54] perf tools: Print write_backward value in perf_event_attr__fprintf Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-02-03 10:18   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-01-25  9:56 ` [PATCH 27/54] perf tools: Make ordered_events reusable Wang Nan
                   ` (28 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Timestamp generation becomes a publicly available helper, which
'perf record' will use to split its output into multiple files named
by timestamp.

For example:

 perf.data.2015122620363710
 perf.data.2015122620364092
 perf.data.2015122620365423
 ...
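
A userspace sketch of how such a suffix can be produced, following the
helper this patch moves. snprintf() stands in for perf's scnprintf(), and
the name timestamp_sketch() is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/* Writes YYYYMMDDHHMMSS plus two digits of hundredths of a second. */
static int timestamp_sketch(char *buf, size_t sz)
{
	struct timeval tv;
	struct tm tm;
	char dt[32];

	if (gettimeofday(&tv, NULL) || !localtime_r(&tv.tv_sec, &tm))
		return -1;

	if (!strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", &tm))
		return -1;

	/* tv_usec / 10000 is in 0..99, so "%02u" always yields two digits. */
	snprintf(buf, sz, "%s%02u", dt, (unsigned)(tv.tv_usec / 10000));
	return 0;
}
```

The result is 16 characters long for four-digit years, matching the
suffixes listed above.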

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-buildid-cache.c | 14 +-------------
 tools/perf/util/util.c             | 17 +++++++++++++++++
 tools/perf/util/util.h             |  1 +
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index d93bff7..632efc6 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -38,19 +38,7 @@ static int build_id_cache__kcore_buildid(const char *proc_dir, char *sbuildid)
 
 static int build_id_cache__kcore_dir(char *dir, size_t sz)
 {
-	struct timeval tv;
-	struct tm tm;
-	char dt[32];
-
-	if (gettimeofday(&tv, NULL) || !localtime_r(&tv.tv_sec, &tm))
-		return -1;
-
-	if (!strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", &tm))
-		return -1;
-
-	scnprintf(dir, sz, "%s%02u", dt, (unsigned)tv.tv_usec / 10000);
-
-	return 0;
+	return fetch_current_timestamp(dir, sz);
 }
 
 static bool same_kallsyms_reloc(const char *from_dir, char *to_dir)
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 7a2da7e..b9e2843 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -701,3 +701,20 @@ bool is_regular_file(const char *file)
 
 	return S_ISREG(st.st_mode);
 }
+
+int fetch_current_timestamp(char *buf, size_t sz)
+{
+	struct timeval tv;
+	struct tm tm;
+	char dt[32];
+
+	if (gettimeofday(&tv, NULL) || !localtime_r(&tv.tv_sec, &tm))
+		return -1;
+
+	if (!strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", &tm))
+		return -1;
+
+	scnprintf(buf, sz, "%s%02u", dt, (unsigned)tv.tv_usec / 10000);
+
+	return 0;
+}
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 61650f0..a861581 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -344,5 +344,6 @@ int fetch_kernel_version(unsigned int *puint,
 
 const char *perf_tip(const char *dirpath);
 bool is_regular_file(const char *file);
+int fetch_current_timestamp(char *buf, size_t sz);
 
 #endif /* GIT_COMPAT_UTIL_H */
-- 
1.8.3.4

* [PATCH 27/54] perf tools: Make ordered_events reusable
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (25 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 26/54] perf tools: Move timestamp creation to util Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 28/54] perf record: Extract synthesize code to record__synthesize() Wang Nan
                   ` (27 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

ordered_events__free() leaves the linked lists and timestamps uncleared,
so an ordered_events cannot be reused after ordered_events__free(). This is
inconvenient once 'perf record' supports generating multiple perf.data
outputs and processing build-ids for each of them.

Call ordered_events__init() in ordered_events__free() so that ordered_events
can be reused.
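
The pattern is: save the callback, wipe the object, then re-initialise it
with the saved callback. A minimal sketch, in which struct queue_sketch,
queue_init(), queue_free() and deliver_nop() are hypothetical stand-ins
for the ordered_events code:

```c
#include <string.h>

struct queue_sketch;
typedef int (*deliver_fn)(struct queue_sketch *);

/* Hypothetical, pared-down stand-in for struct ordered_events. */
struct queue_sketch {
	unsigned long long last_flush;
	unsigned int nr_events;
	deliver_fn deliver;
};

/* Placeholder callback so the sketch has something to preserve. */
static int deliver_nop(struct queue_sketch *q) { (void)q; return 0; }

static void queue_init(struct queue_sketch *q, deliver_fn deliver)
{
	q->last_flush = 0;
	q->nr_events = 0;
	q->deliver = deliver;
}

/* Release state, then re-initialise so the object can be reused. */
static void queue_free(struct queue_sketch *q)
{
	deliver_fn old_deliver = q->deliver; /* saved before the memset */

	/* (freeing of queued events would go here) */
	memset(q, 0, sizeof(*q));
	queue_init(q, old_deliver);
}
```

Saving the callback first matters because the memset() would otherwise
clear it along with the stale state.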

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/ordered-events.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
index b1b9e23..70c0dc8 100644
--- a/tools/perf/util/ordered-events.c
+++ b/tools/perf/util/ordered-events.c
@@ -299,6 +299,8 @@ void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t d
 
 void ordered_events__free(struct ordered_events *oe)
 {
+	ordered_events__deliver_t old_deliver = oe->deliver;
+
 	while (!list_empty(&oe->to_free)) {
 		struct ordered_event *event;
 
@@ -307,4 +309,7 @@ void ordered_events__free(struct ordered_events *oe)
 		free_dup_event(oe, event->event);
 		free(event);
 	}
+
+	memset(oe, '\0', sizeof(*oe));
+	ordered_events__init(oe, old_deliver);
 }
-- 
1.8.3.4

* [PATCH 28/54] perf record: Extract synthesize code to record__synthesize()
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (26 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 27/54] perf tools: Make ordered_events reusable Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-29 20:37   ` Arnaldo Carvalho de Melo
  2016-01-25  9:56 ` [PATCH 29/54] perf tools: Add perf_data_file__switch() helper Wang Nan
                   ` (26 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Create record__synthesize(). It can be used to create tracking events
for each perf.data once perf supports splitting its output into multiple
files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 132 +++++++++++++++++++++++++-------------------
 1 file changed, 76 insertions(+), 56 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index c95c8ea..5bd4f3d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -483,6 +483,81 @@ static void workload_exec_failed_signal(int signo __maybe_unused,
 
 static void snapshot_sig_handler(int sig);
 
+static int record__synthesize(struct record *rec)
+{
+	struct perf_session *session = rec->session;
+	struct machine *machine = &session->machines.host;
+	struct perf_data_file *file = &rec->file;
+	struct record_opts *opts = &rec->opts;
+	struct perf_tool *tool = &rec->tool;
+	int fd = perf_data_file__fd(file);
+	int err = 0;
+	static bool warned_kmaps = false, warned_modules = false;
+
+	if (file->is_pipe) {
+		err = perf_event__synthesize_attrs(tool, session,
+						   process_synthesized_event);
+		if (err < 0) {
+			pr_err("Couldn't synthesize attrs.\n");
+			goto out;
+		}
+
+		if (have_tracepoints(&rec->evlist->entries)) {
+			/*
+			 * FIXME err <= 0 here actually means that
+			 * there were no tracepoints so its not really
+			 * an error, just that we don't need to
+			 * synthesize anything.  We really have to
+			 * return this more properly and also
+			 * propagate errors that now are calling die()
+			 */
+			err = perf_event__synthesize_tracing_data(tool,	fd, rec->evlist,
+								  process_synthesized_event);
+			if (err <= 0) {
+				pr_err("Couldn't record tracing data.\n");
+				goto out;
+			}
+			rec->bytes_written += err;
+		}
+	}
+
+	if (rec->opts.full_auxtrace) {
+		err = perf_event__synthesize_auxtrace_info(rec->itr, tool,
+					session, process_synthesized_event);
+		if (err)
+			goto out;
+	}
+
+	err = perf_event__synthesize_kernel_mmap(tool, process_synthesized_event,
+						 machine);
+	if (err < 0 && !warned_kmaps) {
+		warned_kmaps = true;
+		pr_err("Couldn't record kernel reference relocation symbol\n"
+		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
+		       "Check /proc/kallsyms permission or run as root.\n");
+	}
+
+	err = perf_event__synthesize_modules(tool, process_synthesized_event,
+					     machine);
+	if (err < 0 && !warned_modules) {
+		warned_modules = true;
+		pr_err("Couldn't record kernel module information.\n"
+		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
+		       "Check /proc/modules permission or run as root.\n");
+	}
+
+	if (perf_guest) {
+		machines__process_guests(&session->machines,
+					 perf_event__synthesize_guest_os, tool);
+	}
+
+	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
+					    process_synthesized_event, opts->sample_address,
+					    opts->proc_map_timeout);
+out:
+	return err;
+}
+
 static int __cmd_record(struct record *rec, int argc, const char **argv)
 {
 	int err;
@@ -577,63 +652,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	if (file->is_pipe) {
-		err = perf_event__synthesize_attrs(tool, session,
-						   process_synthesized_event);
-		if (err < 0) {
-			pr_err("Couldn't synthesize attrs.\n");
-			goto out_child;
-		}
-
-		if (have_tracepoints(&rec->evlist->entries)) {
-			/*
-			 * FIXME err <= 0 here actually means that
-			 * there were no tracepoints so its not really
-			 * an error, just that we don't need to
-			 * synthesize anything.  We really have to
-			 * return this more properly and also
-			 * propagate errors that now are calling die()
-			 */
-			err = perf_event__synthesize_tracing_data(tool,	fd, rec->evlist,
-								  process_synthesized_event);
-			if (err <= 0) {
-				pr_err("Couldn't record tracing data.\n");
-				goto out_child;
-			}
-			rec->bytes_written += err;
-		}
-	}
-
-	if (rec->opts.full_auxtrace) {
-		err = perf_event__synthesize_auxtrace_info(rec->itr, tool,
-					session, process_synthesized_event);
-		if (err)
-			goto out_delete_session;
-	}
-
-	err = perf_event__synthesize_kernel_mmap(tool, process_synthesized_event,
-						 machine);
-	if (err < 0)
-		pr_err("Couldn't record kernel reference relocation symbol\n"
-		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
-		       "Check /proc/kallsyms permission or run as root.\n");
-
-	err = perf_event__synthesize_modules(tool, process_synthesized_event,
-					     machine);
+	err = record__synthesize(rec);
 	if (err < 0)
-		pr_err("Couldn't record kernel module information.\n"
-		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
-		       "Check /proc/modules permission or run as root.\n");
-
-	if (perf_guest) {
-		machines__process_guests(&session->machines,
-					 perf_event__synthesize_guest_os, tool);
-	}
-
-	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
-					    process_synthesized_event, opts->sample_address,
-					    opts->proc_map_timeout);
-	if (err != 0)
 		goto out_child;
 
 	if (rec->realtime_prio) {
-- 
1.8.3.4

* [PATCH 29/54] perf tools: Add perf_data_file__switch() helper
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (27 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 28/54] perf record: Extract synthesize code to record__synthesize() Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 30/54] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
                   ` (25 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

perf_data_file__switch() renames the current output file, closes it,
then opens a new one to continue recording. 'perf record' will use it
to split output into multiple perf.data files.
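
The rename-then-reopen sequence can be sketched outside perf with plain
POSIX calls. This is only an illustration: rotate_output() and its
parameters are hypothetical names, the open() flags are an assumption
(the patch delegates to perf_data_file__open()), and unlike the patch
the sketch reports a failed rename().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for perf_data_file__switch(): rename 'path' to
 * '<path>.<postfix>' and, unless at_exit is set, reopen 'path' and seek
 * to 'pos'. Returns the fd to keep writing to, or a negative errno.
 */
static int rotate_output(const char *path, int fd, const char *postfix,
			 off_t pos, bool at_exit)
{
	char new_path[4096];
	int n;

	assert(path && postfix);

	n = snprintf(new_path, sizeof(new_path), "%s.%s", path, postfix);
	if (n < 0 || (size_t)n >= sizeof(new_path))
		return -ENAMETOOLONG;

	/* The open descriptor keeps pointing at the renamed file. */
	if (rename(path, new_path) < 0)
		return -errno;

	/* Last dump: nothing more to record, keep the old descriptor. */
	if (at_exit)
		return fd;

	close(fd);
	fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
	if (fd < 0)
		return -errno;

	/* Leave room for the header that is written when finishing. */
	if (lseek(fd, pos, SEEK_SET) == (off_t)-1) {
		int err = -errno;

		close(fd);
		return err;
	}
	return fd;
}
```

A caller would pass the data offset of the header as 'pos', so that
event data in the new file starts where it did in the old one.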

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data.c | 36 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/data.h | 11 ++++++++++-
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 1921942..bfded6a 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -136,3 +136,39 @@ ssize_t perf_data_file__write(struct perf_data_file *file,
 {
 	return writen(file->fd, buf, size);
 }
+
+int perf_data_file__switch(struct perf_data_file *file,
+			   const char *postfix,
+			   size_t pos, bool at_exit)
+{
+	char *new_filepath;
+	int ret;
+
+	if (check_pipe(file))
+		return -EINVAL;
+	if (perf_data_file__is_read(file))
+		return -EINVAL;
+
+	if (asprintf(&new_filepath, "%s.%s", file->path, postfix) < 0)
+		return -ENOMEM;
+
+	rename(file->path, new_filepath);
+
+	if (!at_exit) {
+		close(file->fd);
+		ret = perf_data_file__open(file);
+		if (ret < 0)
+			goto out;
+
+		if (lseek(file->fd, pos, SEEK_SET) == (off_t)-1) {
+			ret = -errno;
+			pr_debug("Failed to lseek to %zu: %s",
+				 pos, strerror(errno));
+			goto out;
+		}
+	}
+	ret = file->fd;
+out:
+	free(new_filepath);
+	return ret;
+}
diff --git a/tools/perf/util/data.h b/tools/perf/util/data.h
index 2b15d0c..ae510ce 100644
--- a/tools/perf/util/data.h
+++ b/tools/perf/util/data.h
@@ -46,5 +46,14 @@ int perf_data_file__open(struct perf_data_file *file);
 void perf_data_file__close(struct perf_data_file *file);
 ssize_t perf_data_file__write(struct perf_data_file *file,
 			      void *buf, size_t size);
-
+/*
+ * If at_exit is set, only rename current perf.data to
+ * perf.data.<postfix>, continue write on original file.
+ * Set at_exit when flushing the last output.
+ *
+ * Return value is fd of new output.
+ */
+int perf_data_file__switch(struct perf_data_file *file,
+			   const char *postfix,
+			   size_t pos, bool at_exit);
 #endif /* __PERF_DATA_H */
-- 
1.8.3.4

* [PATCH 30/54] perf record: Turns auxtrace_snapshot_enable into 3 states
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (28 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 29/54] perf tools: Add perf_data_file__switch() helper Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 31/54] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
                   ` (24 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

auxtrace_snapshot_enabled has only two states (0/1). Turn it into a
three-state enum so the SIGUSR2 handler can safely do other work
without triggering an auxtrace snapshot.
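
A standalone sketch of the three-state idea, with shortened,
hypothetical names (the patch uses the auxtrace_snapshot_* prefix):
while the state is OFF, enable and disable requests are ignored, so a
shared signal handler can never start a snapshot that was not asked for.

```c
#include <assert.h>
#include <stdbool.h>

enum snapshot_state {
	SNAPSHOT_OFF = -1,	/* feature not requested at all */
	SNAPSHOT_DISABLED = 0,	/* requested, temporarily blocked */
	SNAPSHOT_ENABLED = 1,	/* requested, may be triggered */
};

static volatile enum snapshot_state snapshot_state = SNAPSHOT_OFF;

/* Called once at startup when snapshot mode was requested. */
static void snapshot_on(void)
{
	snapshot_state = SNAPSHOT_DISABLED;
}

/* No-op while OFF: the feature must first be switched on. */
static void snapshot_enable(void)
{
	if (snapshot_state != SNAPSHOT_OFF)
		snapshot_state = SNAPSHOT_ENABLED;
}

/* No-op while OFF, so OFF is never silently turned into DISABLED. */
static void snapshot_disable(void)
{
	if (snapshot_state != SNAPSHOT_OFF)
		snapshot_state = SNAPSHOT_DISABLED;
}

static bool snapshot_is_enabled(void)
{
	return snapshot_state == SNAPSHOT_ENABLED;
}
```

With only 0/1, the main loop's unconditional "enable" would arm the
handler even when snapshot mode was never requested; the OFF state
removes that case.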

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 59 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5bd4f3d..19e5046 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -121,7 +121,43 @@ out:
 static volatile int done;
 static volatile int signr = -1;
 static volatile int child_finished;
-static volatile int auxtrace_snapshot_enabled;
+
+static volatile enum {
+	AUXTRACE_SNAPSHOT_OFF = -1,
+	AUXTRACE_SNAPSHOT_DISABLED = 0,
+	AUXTRACE_SNAPSHOT_ENABLED = 1,
+} auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_OFF;
+
+static inline void
+auxtrace_snapshot_on(void)
+{
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_DISABLED;
+}
+
+static inline void
+auxtrace_snapshot_enable(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return;
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_ENABLED;
+}
+
+static inline void
+auxtrace_snapshot_disable(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return;
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_DISABLED;
+}
+
+static inline bool
+auxtrace_snapshot_is_enabled(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return false;
+	return auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_ENABLED;
+}
+
 static volatile int auxtrace_snapshot_err;
 static volatile int auxtrace_record__snapshot_started;
 
@@ -245,7 +281,7 @@ static void record__read_auxtrace_snapshot(struct record *rec)
 	} else {
 		auxtrace_snapshot_err = auxtrace_record__snapshot_finish(rec->itr);
 		if (!auxtrace_snapshot_err)
-			auxtrace_snapshot_enabled = 1;
+			auxtrace_snapshot_enable();
 	}
 }
 
@@ -578,10 +614,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
-	if (rec->opts.auxtrace_snapshot_mode)
+
+	if (rec->opts.auxtrace_snapshot_mode) {
 		signal(SIGUSR2, snapshot_sig_handler);
-	else
+		auxtrace_snapshot_on();
+	} else {
 		signal(SIGUSR2, SIG_IGN);
+	}
 
 	session = perf_session__new(file, false, tool);
 	if (session == NULL) {
@@ -707,12 +746,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		perf_evlist__enable(rec->evlist);
 	}
 
-	auxtrace_snapshot_enabled = 1;
+	auxtrace_snapshot_enable();
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
 		if (record__mmap_read_all(rec) < 0) {
-			auxtrace_snapshot_enabled = 0;
+			auxtrace_snapshot_disable();
 			err = -1;
 			goto out_child;
 		}
@@ -750,12 +789,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		 * disable events in this case.
 		 */
 		if (done && !disabled && !target__none(&opts->target)) {
-			auxtrace_snapshot_enabled = 0;
+			auxtrace_snapshot_disable();
 			perf_evlist__disable(rec->evlist);
 			disabled = true;
 		}
 	}
-	auxtrace_snapshot_enabled = 0;
+	auxtrace_snapshot_disable();
 
 	if (forks && workload_exec_errno) {
 		char msg[STRERR_BUFSIZE];
@@ -1315,9 +1354,9 @@ out_symbol_exit:
 
 static void snapshot_sig_handler(int sig __maybe_unused)
 {
-	if (!auxtrace_snapshot_enabled)
+	if (!auxtrace_snapshot_is_enabled())
 		return;
-	auxtrace_snapshot_enabled = 0;
+	auxtrace_snapshot_disable();
 	auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
 	auxtrace_record__snapshot_started = 1;
 }
-- 
1.8.3.4

* [PATCH 31/54] perf record: Introduce record__finish_output() to finish a perf.data
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (29 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 30/54] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 32/54] perf record: Use OPT_BOOLEAN_SET for buildid cache related options Wang Nan
                   ` (23 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Move the code for finalizing 'perf.data' to record__finish_output(). It
will be used by the following commits to split output into multiple files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 37 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 19e5046..dc7fb4d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -501,6 +501,29 @@ static void record__init_features(struct record *rec)
 	perf_header__clear_feat(&session->header, HEADER_STAT);
 }
 
+static void
+record__finish_output(struct record *rec)
+{
+	struct perf_data_file *file = &rec->file;
+	int fd = perf_data_file__fd(file);
+
+	if (file->is_pipe)
+		return;
+
+	rec->session->header.data_size += rec->bytes_written;
+	file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
+
+	if (!rec->no_buildid) {
+		process_buildids(rec);
+
+		if (rec->buildid_all)
+			dsos__hit_all(rec->session);
+	}
+	perf_session__write_header(rec->session, rec->evlist, fd, true);
+
+	return;
+}
+
 static volatile int workload_exec_errno;
 
 /*
@@ -828,18 +851,8 @@ out_child:
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
-	if (!err && !file->is_pipe) {
-		rec->session->header.data_size += rec->bytes_written;
-		file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
-
-		if (!rec->no_buildid) {
-			process_buildids(rec);
-
-			if (rec->buildid_all)
-				dsos__hit_all(rec->session);
-		}
-		perf_session__write_header(rec->session, rec->evlist, fd, true);
-	}
+	if (!err)
+		record__finish_output(rec);
 
 	if (!err && !quiet) {
 		char samples[128];
-- 
1.8.3.4

* [PATCH 32/54] perf record: Use OPT_BOOLEAN_SET for buildid cache related options
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (30 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 31/54] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-02-03 10:19   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-01-25  9:56 ` [PATCH 33/54] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
                   ` (22 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Let 'perf record' know whether the buildid cache was enabled
deliberately (via --no-no-buildid-cache), so that the buildid cache can
be turned off by default in some situations.

Output switching support needs this feature to turn off the buildid
cache by default.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dc7fb4d..c8d9c0b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -50,7 +50,9 @@ struct record {
 	const char		*progname;
 	int			realtime_prio;
 	bool			no_buildid;
+	bool			no_buildid_set;
 	bool			no_buildid_cache;
+	bool			no_buildid_cache_set;
 	bool			buildid_all;
 	unsigned long long	samples;
 };
@@ -1180,10 +1182,12 @@ struct option __record_options[] = {
 	OPT_BOOLEAN('P', "period", &record.opts.period, "Record the sample period"),
 	OPT_BOOLEAN('n', "no-samples", &record.opts.no_samples,
 		    "don't sample"),
-	OPT_BOOLEAN('N', "no-buildid-cache", &record.no_buildid_cache,
-		    "do not update the buildid cache"),
-	OPT_BOOLEAN('B', "no-buildid", &record.no_buildid,
-		    "do not collect buildids in perf.data"),
+	OPT_BOOLEAN_SET('N', "no-buildid-cache", &record.no_buildid_cache,
+			&record.no_buildid_cache_set,
+			"do not update the buildid cache"),
+	OPT_BOOLEAN_SET('B', "no-buildid", &record.no_buildid,
+			&record.no_buildid_set,
+			"do not collect buildids in perf.data"),
 	OPT_CALLBACK('G', "cgroup", &record.evlist, "name",
 		     "monitor event in cgroup name only",
 		     parse_cgroups),
-- 
1.8.3.4

* [PATCH 33/54] perf record: Add '--timestamp-filename' option to append timestamp to output filename
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (31 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 32/54] perf record: Use OPT_BOOLEAN_SET for buildid cache related options Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 34/54] perf record: Split output into multiple files via '--switch-output' Wang Nan
                   ` (21 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

This option appends the current timestamp to the output file name. For example:

 # perf record -a --timestamp-filename
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2015122622265847 ]
 [ perf record: Captured and wrote 0.742 MB perf.data (90 samples) ]
 # ls
 perf.data.2015122622265847

Once 'perf record' supports generating multiple output files, the
timestamp will be useful for telling them apart.
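
For illustration, a 16-digit suffix like the one above can be produced
as follows. This is a sketch, not perf's fetch_current_timestamp():
format_timestamp() is a hypothetical name, and reading the last two
digits as hundredths of a second is an assumption based on the sample
output.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/*
 * Hypothetical helper: write "YYYYMMDDHHMMSScc" (16 digits plus NUL)
 * into buf. Returns 0 on success, -1 on failure or if buf is too small.
 */
static int format_timestamp(char *buf, size_t sz)
{
	struct timeval tv;
	struct tm tm;
	time_t secs;
	char date[32];
	int n;

	if (gettimeofday(&tv, NULL))
		return -1;

	secs = tv.tv_sec;
	if (!localtime_r(&secs, &tm))
		return -1;

	if (!strftime(date, sizeof(date), "%Y%m%d%H%M%S", &tm))
		return -1;

	/* Two extra digits: hundredths of a second (0..99). */
	n = snprintf(buf, sz, "%s%02u", date,
		     (unsigned int)(tv.tv_usec / 10000));
	if (n < 0 || (size_t)n >= sz)
		return -1;
	return 0;
}
```

A fixed-width suffix keeps the file names sortable by time with a plain
'ls'.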

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 47 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index c8d9c0b..a561599 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -54,6 +54,7 @@ struct record {
 	bool			no_buildid_cache;
 	bool			no_buildid_cache_set;
 	bool			buildid_all;
+	bool			timestamp_filename;
 	unsigned long long	samples;
 };
 
@@ -526,6 +527,37 @@ record__finish_output(struct record *rec)
 	return;
 }
 
+static int
+record__switch_output(struct record *rec, bool at_exit)
+{
+	struct perf_data_file *file = &rec->file;
+	int fd, err;
+
+	/* Same Size:      "2015122520103046"*/
+	char timestamp[] = "InvalidTimestamp";
+
+	rec->samples = 0;
+	record__finish_output(rec);
+	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
+	if (err) {
+		pr_err("Failed to get current timestamp\n");
+		return -EINVAL;
+	}
+
+	fd = perf_data_file__switch(file, timestamp,
+				    rec->session->header.data_offset,
+				    at_exit);
+	if (fd >= 0 && !at_exit) {
+		rec->bytes_written = 0;
+		rec->session->header.data_size = 0;
+	}
+
+	if (!quiet)
+		fprintf(stderr, "[ perf record: Dump %s.%s ]\n",
+			file->path, timestamp);
+	return fd;
+}
+
 static volatile int workload_exec_errno;
 
 /*
@@ -853,8 +885,17 @@ out_child:
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
-	if (!err)
-		record__finish_output(rec);
+	if (!err) {
+		if (!rec->timestamp_filename) {
+			record__finish_output(rec);
+		} else {
+			fd = record__switch_output(rec, true);
+			if (fd < 0) {
+				status = fd;
+				goto out_delete_session;
+			}
+		}
+	}
 
 	if (!err && !quiet) {
 		char samples[128];
@@ -1231,6 +1272,8 @@ struct option __record_options[] = {
 		   "file", "vmlinux pathname"),
 	OPT_BOOLEAN(0, "buildid-all", &record.buildid_all,
 		    "Record build-id of all DSOs regardless of hits"),
+	OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
+		    "append timestamp to output filename"),
 	OPT_END()
 };
 
-- 
1.8.3.4

* [PATCH 34/54] perf record: Split output into multiple files via '--switch-output'
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (32 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 33/54] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 35/54] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
                   ` (20 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Allow 'perf record' to split its output into multiple files.

For example:

 # ~/perf record -a --timestamp-filename --switch-output &
 [1] 10763
 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622314468 ]

 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622314762 ]

 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 #[ perf record: Dump perf.data.2015122622315171 ]

 # fg
 perf record -a --timestamp-filename --switch-output
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2015122622315513 ]
 [ perf record: Captured and wrote 0.014 MB perf.data (296 samples) ]

 # ls -l
 total 920
 -rw------- 1 root root 797692 Dec 26 22:31 perf.data.2015122622314468
 -rw------- 1 root root  59960 Dec 26 22:31 perf.data.2015122622314762
 -rw------- 1 root root  59912 Dec 26 22:31 perf.data.2015122622315171
 -rw------- 1 root root  19220 Dec 26 22:31 perf.data.2015122622315513

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a561599..4e03a20 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -55,6 +55,7 @@ struct record {
 	bool			no_buildid_cache_set;
 	bool			buildid_all;
 	bool			timestamp_filename;
+	bool			switch_output;
 	unsigned long long	samples;
 };
 
@@ -163,6 +164,7 @@ auxtrace_snapshot_is_enabled(void)
 
 static volatile int auxtrace_snapshot_err;
 static volatile int auxtrace_record__snapshot_started;
+static volatile int switch_output_started;
 
 static void sig_handler(int sig)
 {
@@ -672,7 +674,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
-	if (rec->opts.auxtrace_snapshot_mode) {
+	if (rec->opts.auxtrace_snapshot_mode || rec->switch_output) {
 		signal(SIGUSR2, snapshot_sig_handler);
 		auxtrace_snapshot_on();
 	} else {
@@ -824,9 +826,25 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			}
 		}
 
+		if (switch_output_started) {
+			switch_output_started = 0;
+
+			if (!quiet)
+				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
+					waking);
+			waking = 0;
+			fd = record__switch_output(rec, false);
+			if (fd < 0) {
+				pr_err("Failed to switch to new file\n");
+				err = fd;
+				goto out_child;
+			}
+		}
+
 		if (hits == rec->samples) {
 			if (done || draining)
 				break;
+
 			err = perf_evlist__poll(rec->evlist, -1);
 			/*
 			 * Propagate error, only if there's any. Ignore positive
@@ -1274,6 +1292,8 @@ struct option __record_options[] = {
 		    "Record build-id of all DSOs regardless of hits"),
 	OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
 		    "append timestamp to output filename"),
+	OPT_BOOLEAN(0, "switch-output", &record.switch_output,
+		    "Switch output when receive SIGUSR2"),
 	OPT_END()
 };
 
@@ -1414,9 +1434,11 @@ out_symbol_exit:
 
 static void snapshot_sig_handler(int sig __maybe_unused)
 {
-	if (!auxtrace_snapshot_is_enabled())
-		return;
-	auxtrace_snapshot_disable();
-	auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
-	auxtrace_record__snapshot_started = 1;
+	if (auxtrace_snapshot_is_enabled()) {
+		auxtrace_snapshot_disable();
+		auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
+		auxtrace_record__snapshot_started = 1;
+	}
+
+	switch_output_started = 1;
 }
-- 
1.8.3.4

* [PATCH 35/54] perf record: Force enable --timestamp-filename when --switch-output is provided
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (33 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 34/54] perf record: Split output into multiple files via '--switch-output' Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 36/54] perf record: Disable buildid cache options by default in switch output mode Wang Nan
                   ` (19 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Without this patch, the last output doesn't have a timestamp appended if
--timestamp-filename is not explicitly provided. For example:

 # perf record -a --switch-output &
 [1] 11224
 # kill -s SIGUSR2 11224
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622372823 ]

 # fg
 perf record -a --switch-output
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.027 MB perf.data (540 samples) ]

 # ls -l
 total 836
 -rw------- 1 root root  33256 Dec 26 22:37 perf.data   <---- *Odd*
 -rw------- 1 root root 817156 Dec 26 22:37 perf.data.2015122622372823

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4e03a20..dcb6ae3 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1349,6 +1349,9 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		return -EINVAL;
 	}
 
+	if (rec->switch_output)
+		rec->timestamp_filename = true;
+
 	if (!rec->itr) {
 		rec->itr = auxtrace_record__init(rec->evlist, &err);
 		if (err)
-- 
1.8.3.4

* [PATCH 36/54] perf record: Disable buildid cache options by default in switch output mode
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (34 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 35/54] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 37/54] perf record: Re-synthesize tracking events after output switching Wang Nan
                   ` (18 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

The cost of buildid cache processing is high: perf reads all events in
the output perf.data, opens ELF files to read their buildids, then
copies them into the ~/.debug directory. In switch output mode, this
stops perf from receiving perf events for too long.

Enable no-buildid and no-buildid-cache by default if --switch-output
is provided. The user can still pass --no-no-buildid to explicitly
enable buildids in this case.
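
The resulting decision can be sketched as a small standalone function.
The struct and function names here are hypothetical; the fields mirror
the 'struct record' members used by this patch and by the
OPT_BOOLEAN_SET change earlier in the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the relevant 'struct record' fields. */
struct buildid_opts {
	bool no_buildid;
	bool no_buildid_set;		/* -B or --no-no-buildid was given */
	bool no_buildid_cache;
	bool no_buildid_cache_set;	/* -N or --no-no-buildid-cache was given */
	bool switch_output;
};

/*
 * Returns true when the buildid cache should be disabled. In switch
 * output mode both options default to "off" unless the user touched
 * one of them on the command line.
 */
static bool resolve_buildid_opts(struct buildid_opts *o)
{
	/* Explicit -B or -N always disables the cache. */
	if (o->no_buildid_cache || o->no_buildid)
		return true;

	if (!o->switch_output)
		return false;

	/*
	 * Both values are false here, so a set flag means the user
	 * passed --no-no-buildid or --no-no-buildid-cache: honour it.
	 */
	if (o->no_buildid_set || o->no_buildid_cache_set)
		return false;

	o->no_buildid = true;
	o->no_buildid_cache = true;
	return true;
}
```

Tracking "was set" separately from the value is what lets a default
depend on another option without overriding an explicit user choice.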

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dcb6ae3..238234e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1377,8 +1377,36 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 "If some relocation was applied (e.g. kexec) symbols may be misresolved\n"
 "even with a suitable vmlinux or kallsyms file.\n\n");
 
-	if (rec->no_buildid_cache || rec->no_buildid)
+	if (rec->no_buildid_cache || rec->no_buildid) {
 		disable_buildid_cache();
+	} else if (rec->switch_output) {
+		/*
+		 * In 'perf record --switch-output', disable buildid
+		 * generation by default to reduce data file switching
+		 * overhead. Still generate buildid if they are required
+		 * explicitly using
+		 *
+		 *  perf record --signal-trigger --no-no-buildid \
+		 *              --no-no-buildid-cache
+		 *
+		 * Following code equals to:
+		 *
+		 * if ((rec->no_buildid || !rec->no_buildid_set) &&
+		 *     (rec->no_buildid_cache || !rec->no_buildid_cache_set))
+		 *         disable_buildid_cache();
+		 */
+		bool disable = true;
+
+		if (rec->no_buildid_set && !rec->no_buildid)
+			disable = false;
+		if (rec->no_buildid_cache_set && !rec->no_buildid_cache)
+			disable = false;
+		if (disable) {
+			rec->no_buildid = true;
+			rec->no_buildid_cache = true;
+			disable_buildid_cache();
+		}
+	}
 
 	if (rec->evlist->nr_entries == 0 &&
 	    perf_evlist__add_default(rec->evlist) < 0) {
-- 
1.8.3.4

* [PATCH 37/54] perf record: Re-synthesize tracking events after output switching
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (35 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 36/54] perf record: Disable buildid cache options by default in switch output mode Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 38/54] perf record: Generate tracking events for process forked by perf Wang Nan
                   ` (17 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Tracking events describe kernel and threads. They are generated by
reading /proc/kallsyms, /proc/*/maps and /proc/*/task/* during
initialization of 'perf record', serialized into event sequences and put
at the head of 'perf.data'. In case of output switching, each output
file should contain those events.

This patch calls record__synthesize() during output switching, so the
event sequences described above can be collected again.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 238234e..de51134 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -529,6 +529,8 @@ record__finish_output(struct record *rec)
 	return;
 }
 
+static int record__synthesize(struct record *rec);
+
 static int
 record__switch_output(struct record *rec, bool at_exit)
 {
@@ -557,6 +559,15 @@ record__switch_output(struct record *rec, bool at_exit)
 	if (!quiet)
 		fprintf(stderr, "[ perf record: Dump %s.%s ]\n",
 			file->path, timestamp);
+
+	/* Reinit machine */
+	if (!at_exit) {
+		machines__exit(&rec->session->machines);
+		machines__init(&rec->session->machines);
+		perf_session__create_kernel_maps(rec->session);
+		perf_session__set_id_hdr_size(rec->session);
+		record__synthesize(rec);
+	}
 	return fd;
 }
 
-- 
1.8.3.4

* [PATCH 38/54] perf record: Generate tracking events for process forked by perf
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (36 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 37/54] perf record: Re-synthesize tracking events after output switching Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 39/54] perf record: Ensure return non-zero rc when mmap fail Wang Nan
                   ` (16 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

With 'perf record --switch-output' without -a, record__synthesize() in
record__switch_output() won't generate tracking events because there's
no thread_map in evlist. As a result, the newly created perf.data
doesn't contain map and comm information.

This patch creates a fake thread_map and directly calls
perf_event__synthesize_thread_map() for those events.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index de51134..e6a8b31 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -567,6 +567,23 @@ record__switch_output(struct record *rec, bool at_exit)
 		perf_session__create_kernel_maps(rec->session);
 		perf_session__set_id_hdr_size(rec->session);
 		record__synthesize(rec);
+
+		if (target__none(&rec->opts.target)) {
+			struct {
+				struct thread_map map;
+				struct thread_map_data map_data;
+			} thread_map;
+
+			thread_map.map.nr = 1;
+			thread_map.map.map[0].pid = rec->evlist->workload.pid;
+			thread_map.map.map[0].comm = NULL;
+			perf_event__synthesize_thread_map(&rec->tool,
+					&thread_map.map,
+					process_synthesized_event,
+					&rec->session->machines.host,
+					rec->opts.sample_address,
+					rec->opts.proc_map_timeout);
+		}
 	}
 	return fd;
 }
-- 
1.8.3.4

* [PATCH 39/54] perf record: Ensure return non-zero rc when mmap fail
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (37 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 38/54] perf record: Generate tracking events for process forked by perf Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 40/54] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
                   ` (15 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

perf_evlist__mmap_ex() can fail without setting errno (for example,
when it fails in a condition check while every system call succeeds).
If this happens, record__open() incorrectly returns 0. Forcing rc to
a non-zero value is a quick way to avoid this problem; otherwise we
would have to follow every possible code path in perf_evlist__mmap_ex()
to make sure errno is set by at least one system call before it returns
an error.
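
The fallback can be sketched in isolation as follows. This is only an
illustration, not code from this patch: failure_rc() is a hypothetical
helper name, and the real change is made inline in record__open().

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: turn the errno value seen after a failed call
 * into a negative return code. If errno was never set (0), fall back
 * to -EINVAL so the caller cannot mistake the failure for success.
 */
static int failure_rc(int saved_errno)
{
	assert(saved_errno >= 0);
	return saved_errno ? -saved_errno : -EINVAL;
}
```

With this shape, failure_rc(0) yields -EINVAL while failure_rc(EPERM)
still yields -EPERM, so a real errno is never masked.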

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e6a8b31..9265948 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -362,7 +362,10 @@ try_again:
 		} else {
 			pr_err("failed to mmap with %d (%s)\n", errno,
 				strerror_r(errno, msg, sizeof(msg)));
-			rc = -errno;
+			if (errno)
+				rc = -errno;
+			else
+				rc = -EINVAL;
 		}
 		goto out;
 	}
-- 
1.8.3.4

* [PATCH 40/54] perf record: Prevent reading invalid data in record__mmap_read
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (38 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 39/54] perf record: Ensure return non-zero rc when mmap fail Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 41/54] perf tools: Add evlist channel helpers Wang Nan
                   ` (14 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

When record__mmap_read() is asked to read more data than the size of
the ring buffer, drop that data to avoid accessing invalid memory.

This can happen when reading from an overwritable ring buffer, which
should be avoided. Check for it anyway, for robustness.
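
The size check can be sketched as a standalone predicate. This is a
minimal illustration under the assumption that the buffer size is a
power of two and 'mask' is that size minus one, as with md->mask;
ring_span_is_valid() is a hypothetical name, not a perf function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: the span [old, head) can only be read safely
 * if it is no larger than the ring buffer itself (mask + 1 bytes).
 * head and old are free-running positions, so head - old is the number
 * of bytes produced since the last read.
 */
static bool ring_span_is_valid(uint64_t head, uint64_t old, uint64_t mask)
{
	/* mask must be of the form 2^n - 1 */
	assert((mask & (mask + 1)) == 0);
	return head - old <= mask + 1;
}
```

For a 4096-byte buffer (mask 4095), a span of exactly 4096 bytes is
still accepted, while 4097 bytes is rejected and would be dropped.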

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 9265948..0a4f3ec 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -37,6 +37,7 @@
 #include <unistd.h>
 #include <sched.h>
 #include <sys/mman.h>
+#include <asm/bug.h>
 
 
 struct record {
@@ -95,6 +96,13 @@ static int record__mmap_read(struct record *rec, int idx)
 	rec->samples++;
 
 	size = head - old;
+	if (size > (unsigned long)(md->mask) + 1) {
+		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
+
+		md->prev = head;
+		perf_evlist__mmap_consume(rec->evlist, idx);
+		return 0;
+	}
 
 	if ((old & md->mask) + size != (head & md->mask)) {
 		buf = &data[old & md->mask];
-- 
1.8.3.4

* [PATCH 41/54] perf tools: Add evlist channel helpers
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (39 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 40/54] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 42/54] perf tools: Automatically add new channel according to evlist Wang Nan
                   ` (13 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

In this commit several helpers are introduced to support the concept
of channels. Channels hold different groups of evsels which are
configured differently. They will be used for overwritable evsels,
which allow perf to record some events continuously while capturing
snapshots of other events when something happens. Tracking events
(mmap, mmap2, fork, exit ...) are another kind of event worth putting
into a separate channel.

Channels are represented by an array of channel flags. Each channel
contains evlist->nr_mmaps mmaps. Channels are configured before
perf_evlist__mmap_ex(). In that function the nr_mmaps mmaps of each
channel are allocated together as one big array.
perf_evlist__channel_idx() converts between an index into the big
array and a (channel, index) pair. For API functions which accept idx,
_ex() versions are introduced which select an mmap from a given channel.
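
The index arithmetic can be sketched without any perf types. The code
below is a simplified illustration only: channel_idx() and NR_CHANNELS
are stand-in names, nr_mmaps is passed in directly instead of being
read from the evlist, and the enabled-channel check is left out.

```c
#include <assert.h>
#include <errno.h>

#define NR_CHANNELS 2	/* illustrative value, assumed for this sketch */

/*
 * On input, *p_channel < 0 means *p_idx is already a real index into
 * the big array; otherwise *p_idx is an index inside *p_channel.
 * On success, *p_channel holds the channel and *p_idx the real index.
 */
static int channel_idx(int nr_mmaps, int *p_channel, int *p_idx)
{
	int channel = *p_channel;
	int idx = *p_idx;

	assert(nr_mmaps > 0);
	if (idx < 0)
		return -EINVAL;
	if (channel < 0) {
		/* Real index: recover the channel and the in-channel index. */
		channel = idx / nr_mmaps;
		idx = idx % nr_mmaps;
	}
	if (channel >= NR_CHANNELS || idx >= nr_mmaps)
		return -E2BIG;

	*p_channel = channel;
	*p_idx = nr_mmaps * channel + idx;
	return 0;
}
```

With nr_mmaps = 4, (channel 1, idx 2) maps to real index 6, and
passing channel -1 with real index 6 resolves back to channel 1.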

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |   6 ++
 tools/perf/util/evlist.c    | 132 ++++++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/evlist.h    |  58 +++++++++++++++++++
 3 files changed, 190 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0a4f3ec..2d9e6c6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -356,6 +356,12 @@ try_again:
 		goto out;
 	}
 
+	perf_evlist__channel_reset(evlist);
+	rc = perf_evlist__channel_add(evlist, 0, true);
+	if (rc < 0)
+		goto out;
+	rc = 0;
+
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c80aad1..c565563 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -679,14 +679,51 @@ static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
 	return NULL;
 }
 
-union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx)
+{
+	int channel = *p_channel;
+	int _idx = *p_idx;
+
+	if (_idx < 0)
+		return -EINVAL;
+	/*
+	 * Negative channel means caller explicitly use real index.
+	 */
+	if (channel < 0) {
+		channel = perf_evlist__idx_channel(evlist, _idx);
+		_idx = _idx % evlist->nr_mmaps;
+	}
+	if (channel < 0)
+		return channel;
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	if (_idx >= evlist->nr_mmaps)
+		return -E2BIG;
+
+	*p_channel = channel;
+	*p_idx = evlist->nr_mmaps * channel + _idx;
+	return 0;
+}
+
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx)
 {
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
 	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head;
-	u64 old = md->prev;
-	unsigned char *data = md->base + page_size;
+	u64 old;
+	unsigned char *data;
 	union perf_event *event = NULL;
 
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return NULL;
+	}
+	old = md->prev;
+	data = md->base + page_size;
+
 	/*
 	 * Check if event was unmapped due to a POLLHUP/POLLERR.
 	 */
@@ -748,6 +785,11 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 	return event;
 }
 
+union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+{
+	return perf_evlist__mmap_read_ex(evlist, -1, idx);
+}
+
 static bool perf_mmap__empty(struct perf_mmap *md)
 {
 	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
@@ -766,10 +808,18 @@ static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 		__perf_evlist__munmap(evlist, idx);
 }
 
-void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx)
 {
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
 	struct perf_mmap *md = &evlist->mmap[idx];
 
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return;
+	}
+
 	if (!evlist->overwrite) {
 		u64 old = md->prev;
 
@@ -780,6 +830,11 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 		perf_evlist__mmap_put(evlist, idx);
 }
 
+void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+{
+	perf_evlist__mmap_consume_ex(evlist, -1, idx);
+}
+
 int __weak auxtrace_mmap__mmap(struct auxtrace_mmap *mm __maybe_unused,
 			       struct auxtrace_mmap_params *mp __maybe_unused,
 			       void *userpg __maybe_unused,
@@ -825,7 +880,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 	if (evlist->mmap == NULL)
 		return;
 
-	for (i = 0; i < evlist->nr_mmaps; i++)
+	for (i = 0; i < perf_evlist__mmap_nr(evlist); i++)
 		__perf_evlist__munmap(evlist, i);
 
 	zfree(&evlist->mmap);
@@ -833,10 +888,17 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
+	int total_mmaps;
+
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
-	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
+
+	total_mmaps = perf_evlist__mmap_nr(evlist);
+	if (!total_mmaps)
+		return -EINVAL;
+
+	evlist->mmap = zalloc(total_mmaps * sizeof(struct perf_mmap));
 	return evlist->mmap != NULL ? 0 : -ENOMEM;
 }
 
@@ -1137,6 +1199,12 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	int err;
+
+	perf_evlist__channel_reset(evlist);
+	err = perf_evlist__channel_add(evlist, 0, true);
+	if (err < 0)
+		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
@@ -1746,3 +1814,55 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 
 	return NULL;
 }
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist)
+{
+	int i;
+
+	for (i = PERF_EVLIST__NR_CHANNELS - 1; i >= 0; i--) {
+		unsigned long flags = evlist->channel_flags[i];
+
+		if (flags & PERF_EVLIST__CHANNEL_ENABLED)
+			return i + 1;
+	}
+	return 0;
+}
+
+int perf_evlist__mmap_nr(struct perf_evlist *evlist)
+{
+	return evlist->nr_mmaps * perf_evlist__channel_nr(evlist);
+}
+
+void perf_evlist__channel_reset(struct perf_evlist *evlist)
+{
+	int i;
+
+	BUG_ON(evlist->mmap);
+
+	for (i = 0; i < PERF_EVLIST__NR_CHANNELS; i++)
+		evlist->channel_flags[i] = 0;
+}
+
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default)
+{
+	int n = perf_evlist__channel_nr(evlist);
+	unsigned long *flags = evlist->channel_flags;
+
+	BUG_ON(evlist->mmap);
+
+	if (n >= PERF_EVLIST__NR_CHANNELS) {
+		pr_debug("ERROR: too many channels. Increase PERF_EVLIST__NR_CHANNELS\n");
+		return -ENOSPC;
+	}
+
+	if (is_default) {
+		memmove(&flags[1], &flags[0],
+			sizeof(evlist->channel_flags) -
+			sizeof(evlist->channel_flags[0]));
+		n = 0;
+	}
+	flags[n] = flag | PERF_EVLIST__CHANNEL_ENABLED;
+	return n;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a0d1522..1812652 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,6 +20,11 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
+#define PERF_EVLIST__NR_CHANNELS	1
+enum perf_evlist_mmap_flag {
+	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+};
+
 /**
  * struct perf_mmap - perf's ring buffer mmap details
  *
@@ -52,6 +57,7 @@ struct perf_evlist {
 		pid_t	pid;
 	} workload;
 	struct fdarray	 pollfd;
+	unsigned long channel_flags[PERF_EVLIST__NR_CHANNELS];
 	struct perf_mmap *mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
@@ -116,9 +122,61 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx);
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
 
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx);
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
+int perf_evlist__mmap_nr(struct perf_evlist *evlist);
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist);
+void perf_evlist__channel_reset(struct perf_evlist *evlist);
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default);
+
+static inline bool
+__perf_evlist__channel_check(struct perf_evlist *evlist, int channel,
+			     enum perf_evlist_mmap_flag bits)
+{
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return false;
+
+	return (evlist->channel_flags[channel] & bits) ? true : false;
+}
+#define perf_evlist__channel_check(e, c, b) \
+		__perf_evlist__channel_check(e, c, PERF_EVLIST__CHANNEL_##b)
+
+static inline bool
+perf_evlist__channel_is_enabled(struct perf_evlist *evlist, int channel)
+{
+	return perf_evlist__channel_check(evlist, channel, ENABLED);
+}
+
+static inline int
+perf_evlist__idx_channel(struct perf_evlist *evlist, int idx)
+{
+	int channel = idx / evlist->nr_mmaps;
+
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	return channel;
+}
+
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx);
+
+static inline struct perf_mmap *
+perf_evlist__get_mmap(struct perf_evlist *evlist,
+		      int channel, int idx)
+{
+	if (perf_evlist__channel_idx(evlist, &channel, &idx))
+		return NULL;
+
+	return &evlist->mmap[idx];
+}
 
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
-- 
1.8.3.4

* [PATCH 42/54] perf tools: Automatically add new channel according to evlist
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (40 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 41/54] perf tools: Add evlist channel helpers Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 43/54] perf tools: Operate multiple channels Wang Nan
                   ` (12 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

perf_evlist__channel_find() can be used to find a proper channel based
on the properties of an evsel. If the channel doesn't exist, it can
create a new one for it. After this patch there's no need to create the
default channel explicitly.
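
The find-or-create behaviour can be sketched on a plain flags array.
This is a simplified illustration, not the perf code: struct channels,
channels_nr() and channels_find() are hypothetical names, the flag is
passed in directly rather than derived from an evsel, and the
"default channel" insertion path is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CHANNELS	2	/* illustrative value */
#define CHANNEL_ENABLED	1UL

struct channels {
	unsigned long flags[NR_CHANNELS];
};

/* Number of channels in use: position of the last enabled slot + 1. */
static int channels_nr(const struct channels *c)
{
	int i;

	for (i = NR_CHANNELS - 1; i >= 0; i--)
		if (c->flags[i] & CHANNEL_ENABLED)
			return i + 1;
	return 0;
}

/* Return the channel whose flags match, optionally appending a new one. */
static int channels_find(struct channels *c, unsigned long flag, bool add_new)
{
	int i, n;

	assert(c);
	n = channels_nr(c);
	flag |= CHANNEL_ENABLED;
	for (i = 0; i < n; i++)
		if (c->flags[i] == flag)
			return i;
	if (!add_new)
		return -ENOENT;
	if (n >= NR_CHANNELS)
		return -ENOSPC;
	c->flags[n] = flag;
	return n;
}
```

Evsels that produce the same flags share one channel; a new channel is
only appended the first time a new flag combination is seen.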

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  5 -----
 tools/perf/util/evlist.c    | 47 ++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 2d9e6c6..30a3c5c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -357,11 +357,6 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	rc = perf_evlist__channel_add(evlist, 0, true);
-	if (rc < 0)
-		goto out;
-	rc = 0;
-
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c565563..10eeacf 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -943,6 +943,43 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
+static unsigned long
+perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
+{
+	return 0;
+}
+
+static int
+perf_evlist__channel_find(struct perf_evlist *evlist,
+			  struct perf_evsel *evsel,
+			  bool add_new)
+{
+	unsigned long flag = perf_evlist__channel_for_evsel(evsel);
+	int i;
+
+	flag |= PERF_EVLIST__CHANNEL_ENABLED;
+	for (i = 0; i < perf_evlist__channel_nr(evlist); i++)
+		if (evlist->channel_flags[i] == flag)
+			return i;
+	if (add_new)
+		return perf_evlist__channel_add(evlist, flag, false);
+	return -ENOENT;
+}
+
+static int
+perf_evlist__channel_complete(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	int err;
+
+	evlist__for_each(evlist, evsel) {
+		err = perf_evlist__channel_find(evlist, evsel, true);
+		if (err < 0)
+			return err;
+	}
+	return 0;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
@@ -1162,6 +1199,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 bool overwrite, unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite)
 {
+	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
@@ -1169,6 +1207,10 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
+	err = perf_evlist__channel_complete(evlist);
+	if (err)
+		return err;
+
 	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
 		return -ENOMEM;
 
@@ -1199,12 +1241,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
-	int err;
-
 	perf_evlist__channel_reset(evlist);
-	err = perf_evlist__channel_add(evlist, 0, true);
-	if (err < 0)
-		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
-- 
1.8.3.4

* [PATCH 43/54] perf tools: Operate multiple channels
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (41 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 42/54] perf tools: Automatically add new channel according to evlist Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 44/54] perf tools: Squash overwrite setting into channel Wang Nan
                   ` (11 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Before this patch perf operates on only the first channel. Make perf
mmap and read from multiple channels.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  3 ++-
 tools/perf/util/evlist.c    | 55 ++++++++++++++++++++++++++++++++++-----------
 tools/perf/util/evlist.h    |  2 +-
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 30a3c5c..a471ca6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -466,8 +466,9 @@ static int record__mmap_read_all(struct record *rec)
 	u64 bytes_written = rec->bytes_written;
 	int i;
 	int rc = 0;
+	int total_mmaps = perf_evlist__mmap_nr(rec->evlist);
 
-	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
+	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
 		if (rec->evlist->mmap[i].base) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 10eeacf..a363466 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -873,6 +873,21 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
 }
 
+static void
+__perf_evlist__munmap_channels(struct perf_evlist *evlist, int _idx)
+{
+	int _ch;
+
+	for (_ch = 0; _ch < perf_evlist__channel_nr(evlist); _ch++) {
+		int err, idx = _idx, ch = _ch;
+
+		err = perf_evlist__channel_idx(evlist, &ch, &idx);
+		if (err < 0)
+			continue;
+		__perf_evlist__munmap(evlist, idx);
+	}
+}
+
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	int i;
@@ -980,26 +995,38 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
+static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *outputs)
 {
 	struct perf_evsel *evsel;
 
 	evlist__for_each(evlist, evsel) {
-		int fd;
+		int fd, channel, idx, err;
+
+		channel = perf_evlist__channel_find(evlist, evsel, false);
+		if (channel < 0) {
+			pr_err("ERROR: unable to find suitable channel for %s\n",
+			       evsel->name);
+			return -1;
+		}
+
+		idx = _idx;
+		err = perf_evlist__channel_idx(evlist, &channel, &idx);
+		if (err < 0)
+			return err;
 
 		if (evsel->system_wide && thread)
 			continue;
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
-			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+		if (outputs[channel] == -1) {
+			outputs[channel] = fd;
+			if (__perf_evlist__mmap(evlist, idx, mp, outputs[channel]) < 0)
 				return -1;
 		} else {
-			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
+			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, outputs[channel]) != 0)
 				return -1;
 
 			perf_evlist__mmap_get(evlist, idx);
@@ -1039,14 +1066,15 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, outputs))
 				goto out_unmap;
 		}
 	}
@@ -1055,7 +1083,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 
 out_unmap:
 	for (cpu = 0; cpu < nr_cpus; cpu++)
-		__perf_evlist__munmap(evlist, cpu);
+		__perf_evlist__munmap_channels(evlist, cpu);
 	return -1;
 }
 
@@ -1067,13 +1095,14 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						outputs))
 			goto out_unmap;
 	}
 
@@ -1081,7 +1110,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 
 out_unmap:
 	for (thread = 0; thread < nr_threads; thread++)
-		__perf_evlist__munmap(evlist, thread);
+		__perf_evlist__munmap_channels(evlist, thread);
 	return -1;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1812652..b652587 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,7 +20,7 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
-#define PERF_EVLIST__NR_CHANNELS	1
+#define PERF_EVLIST__NR_CHANNELS	2
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 };
-- 
1.8.3.4

* [PATCH 44/54] perf tools: Squash overwrite setting into channel
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (42 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 43/54] perf tools: Operate multiple channels Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 45/54] perf record: Don't read from and poll overwrite channel Wang Nan
                   ` (10 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Make 'overwrite' a channel configuration rather than an evlist global
option. With this setting an evlist can have two channels: one normal
channel and one overwritable channel.
perf_evlist__channel_for_evsel() ensures that events with 'overwrite'
configuration are inserted into the overwritable channel.
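
The per-channel protection choice can be sketched as below. This is an
illustration only: channel_mmap_prot() is a hypothetical helper, and
the numeric value of CHANNEL_RDONLY is assumed for the example rather
than taken from evlist.h.

```c
#include <assert.h>
#include <sys/mman.h>

#define CHANNEL_ENABLED	1UL
#define CHANNEL_RDONLY	2UL	/* assumed value, for illustration only */

/*
 * A normal channel is mapped read/write so the reader can update the
 * tail pointer; an overwritable (read-only) channel is mapped with
 * PROT_READ only.
 */
static int channel_mmap_prot(unsigned long flags)
{
	assert(flags & CHANNEL_ENABLED);
	return (flags & CHANNEL_RDONLY) ? PROT_READ
					: (PROT_READ | PROT_WRITE);
}
```

This mirrors the old evlist-wide expression
PROT_READ | (overwrite ? 0 : PROT_WRITE), evaluated per channel.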

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  2 +-
 tools/perf/util/evlist.c    | 42 +++++++++++++++++++++++++++---------------
 tools/perf/util/evlist.h    |  5 ++---
 tools/perf/util/evsel.h     |  1 +
 4 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a471ca6..53bfe55 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -357,7 +357,7 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
+	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
 		if (errno == EPERM) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a363466..d728d82 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -731,7 +731,7 @@ union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
 		return NULL;
 
 	head = perf_mmap__read_head(md);
-	if (evlist->overwrite) {
+	if (perf_evlist__channel_check(evlist, channel, RDONLY)) {
 		/*
 		 * If we're further behind than half the buffer, there's a chance
 		 * the writer will bite our tail and mess up the samples under us.
@@ -820,7 +820,7 @@ void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
 		return;
 	}
 
-	if (!evlist->overwrite) {
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY)) {
 		u64 old = md->prev;
 
 		perf_mmap__write_tail(md, old);
@@ -918,7 +918,6 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 }
 
 struct mmap_params {
-	int prot;
 	int mask;
 	struct auxtrace_mmap_params auxtrace_mp;
 };
@@ -926,6 +925,15 @@ struct mmap_params {
 static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 			       struct mmap_params *mp, int fd)
 {
+	int channel = perf_evlist__idx_channel(evlist, idx);
+	int prot = PROT_READ;
+
+	if (channel < 0)
+		return -1;
+
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY))
+		prot |= PROT_WRITE;
+
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
 	 * make sure we don't prevent tools from consuming every last event in
@@ -942,7 +950,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	atomic_set(&evlist->mmap[idx].refcnt, 2);
 	evlist->mmap[idx].prev = 0;
 	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
+	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, prot,
 				      MAP_SHARED, fd, 0);
 	if (evlist->mmap[idx].base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
@@ -959,9 +967,13 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 }
 
 static unsigned long
-perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
+perf_evlist__channel_for_evsel(struct perf_evsel *evsel)
 {
-	return 0;
+	unsigned long flag = 0;
+
+	if (evsel->overwrite)
+		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	return flag;
 }
 
 static int
@@ -1211,11 +1223,10 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * perf_evlist__mmap_ex - Create mmaps to receive events.
  * @evlist: list of events
  * @pages: map length in pages
- * @overwrite: overwrite older events?
  * @auxtrace_pages - auxtrace map length in pages
  * @auxtrace_overwrite - overwrite older auxtrace data?
  *
- * If @overwrite is %false the user needs to signal event consumption using
+ * For writable channel, the user needs to signal event consumption using
  * perf_mmap__write_tail().  Using perf_evlist__mmap_read() does this
  * automatically.
  *
@@ -1225,16 +1236,13 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * Return: %0 on success, negative error code otherwise.
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite)
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite)
 {
 	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
-	struct mmap_params mp = {
-		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
-	};
+	struct mmap_params mp;
 
 	err = perf_evlist__channel_complete(evlist);
 	if (err)
@@ -1246,7 +1254,6 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
@@ -1270,8 +1277,13 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	struct perf_evsel *evsel;
+
 	perf_evlist__channel_reset(evlist);
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	evlist__for_each(evlist, evsel)
+		evsel->overwrite = overwrite;
+
+	return perf_evlist__mmap_ex(evlist, pages, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b652587..21a8b85 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -23,6 +23,7 @@ struct record_opts;
 #define PERF_EVLIST__NR_CHANNELS	2
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+	PERF_EVLIST__CHANNEL_RDONLY	= 2,
 };
 
 /**
@@ -45,7 +46,6 @@ struct perf_evlist {
 	int		 nr_entries;
 	int		 nr_groups;
 	int		 nr_mmaps;
-	bool		 overwrite;
 	bool		 enabled;
 	bool		 has_user_cpus;
 	size_t		 mmap_len;
@@ -203,8 +203,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 				  int unset);
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite);
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite);
 void perf_evlist__munmap(struct perf_evlist *evlist);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d5ae7ba..f5c7433 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -115,6 +115,7 @@ struct perf_evsel {
 	bool			tracking;
 	bool			per_pkg;
 	bool			precise_max;
+	bool			overwrite;
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
-- 
1.8.3.4


* [PATCH 45/54] perf record: Don't read from and poll overwrite channel
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (43 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 44/54] perf tools: Squash overwrite setting into channel Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 46/54] perf record: Don't poll on " Wang Nan
                   ` (9 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Reading from an overwritable ring buffer is unreliable. Introduce
record__mmap_should_read() to prevent 'perf record' from reading from
an overwrite ring buffer. The rule in record__mmap_should_read() will
change once perf supports reading from a backward-writing ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 53bfe55..503eee9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -461,6 +461,19 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static bool record__mmap_should_read(struct record *rec, int idx)
+{
+	int channel = -1;
+
+	if (!rec->evlist->mmap[idx].base)
+		return false;
+	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
+		return false;
+	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int record__mmap_read_all(struct record *rec)
 {
 	u64 bytes_written = rec->bytes_written;
@@ -471,7 +484,7 @@ static int record__mmap_read_all(struct record *rec)
 	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
-		if (rec->evlist->mmap[i].base) {
+		if (record__mmap_should_read(rec, i)) {
 			if (record__mmap_read(rec, i) != 0) {
 				rc = -1;
 				goto out;
-- 
1.8.3.4


* [PATCH 46/54] perf record: Don't poll on overwrite channel
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (44 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 45/54] perf record: Don't read from and poll overwrite channel Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 47/54] perf tools: Detect avalibility of write_backward Wang Nan
                   ` (8 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

There is no need to receive events from an overwrite ring buffer.
Instead, perf should let such events run in the background until
something happens. This patch makes poll ignore events from overwrite
ring buffers, except for POLLERR and POLLHUP.
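
As a rough illustration of why passing an empty event mask is enough:
poll(2) reports POLLERR and POLLHUP even when they are not requested.
The sketch below is not perf code; it uses a plain pipe instead of a
perf event fd, and poll_pipe() is a hypothetical helper written only
for this example.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one byte into a pipe, optionally close the
 * write end, then poll the read end with the requested event mask.
 * Returns the revents reported by poll(), or -1 on error.
 */
static short poll_pipe(short requested, int close_writer)
{
	int fds[2];
	struct pollfd pfd;
	short revents = -1;

	if (pipe(fds) < 0)
		return -1;

	if (write(fds[1], "x", 1) != 1)
		goto out;

	if (close_writer) {
		close(fds[1]);
		fds[1] = -1;
	}

	pfd.fd = fds[0];
	pfd.events = requested;	/* 0 means "not interested in data" */
	pfd.revents = 0;

	/* Zero timeout: only report what is pending right now. */
	if (poll(&pfd, 1, 0) >= 0)
		revents = pfd.revents;
out:
	close(fds[0]);
	if (fds[1] >= 0)
		close(fds[1]);
	return revents;
}
```

With requested == 0 the pending byte is not reported as POLLIN, but a
hang-up on the other end still shows up as POLLHUP. This appears to be
the behaviour the patch relies on when it passes revent = 0 for
overwrite channels.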

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d728d82..3c308a7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -461,9 +461,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
 {
-	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
 	 * Save the idx so that when we filter out fds POLLHUP'ed we can
 	 * close the associated evlist->mmap[] entry.
@@ -479,7 +479,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1);
+	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
@@ -1007,6 +1007,18 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
+static bool
+perf_evlist__should_poll(struct perf_evlist *evlist,
+			 struct perf_evsel *evsel,
+			 int channel)
+{
+	if (evsel->system_wide)
+		return false;
+	if (perf_evlist__channel_check(evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *outputs)
@@ -1015,6 +1027,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 
 	evlist__for_each(evlist, evsel) {
 		int fd, channel, idx, err;
+		short revent = POLLIN;
 
 		channel = perf_evlist__channel_find(evlist, evsel, false);
 		if (channel < 0) {
@@ -1044,6 +1057,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 			perf_evlist__mmap_get(evlist, idx);
 		}
 
+		if (!perf_evlist__should_poll(evlist, evsel, channel))
+			revent = 0;
 		/*
 		 * The system_wide flag causes a selected event to be opened
 		 * always without a pid.  Consequently it will never get a
@@ -1052,7 +1067,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
-- 
1.8.3.4


* [PATCH 47/54] perf tools: Detect avalibility of write_backward
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (45 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 46/54] perf record: Don't poll on " Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 48/54] perf tools: Enable overwrite settings Wang Nan
                   ` (7 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Detect the availability of write_backward and save the result in
record_opts. With write_backward, the start pointer of a ring buffer
that is mapped read-only can be found reliably.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/perf.h        |  1 +
 tools/perf/util/record.c | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 90129ac..00c25b1 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -71,6 +71,7 @@ struct record_opts {
 	bool	     sample_transaction;
 	unsigned     initial_delay;
 	bool         use_clockid;
+	bool	     has_write_backward;
 	clockid_t    clockid;
 	unsigned int proc_map_timeout;
 };
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 0467367..d01f155 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -85,6 +85,11 @@ static void perf_probe_comm_exec(struct perf_evsel *evsel)
 	evsel->attr.comm_exec = 1;
 }
 
+static void perf_probe_write_backward(struct perf_evsel *evsel)
+{
+	evsel->attr.write_backward = 1;
+}
+
 static void perf_probe_context_switch(struct perf_evsel *evsel)
 {
 	evsel->attr.context_switch = 1;
@@ -105,6 +110,11 @@ bool perf_can_record_switch_events(void)
 	return perf_probe_api(perf_probe_context_switch);
 }
 
+static bool perf_can_write_backward(void)
+{
+	return perf_probe_api(perf_probe_write_backward);
+}
+
 bool perf_can_record_cpu_wide(void)
 {
 	struct perf_event_attr attr = {
@@ -235,6 +245,7 @@ static int record_opts__config_freq(struct record_opts *opts)
 
 int record_opts__config(struct record_opts *opts)
 {
+	opts->has_write_backward = perf_can_write_backward();
 	return record_opts__config_freq(opts);
 }
 
-- 
1.8.3.4


* [PATCH 48/54] perf tools: Enable overwrite settings
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (46 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 47/54] perf tools: Detect avalibility of write_backward Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 49/54] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
                   ` (6 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

This patch adds the following option and config terms:

 # perf record --overwrite ...

   Set overwrite mode globally, as the default for all events;

 # perf record --event cycles/overwrite/ ...
 # perf record --event cycles/no-overwrite/ ...

   Set specific events to overwrite or no-overwrite mode.
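
The interaction between the global option and the per-event terms can
be sketched as follows. This assumes the per-event term is applied
after the global default, which is what the ordering in
perf_evsel__config() and apply_config_terms() in this patch suggests.
The enum and function names are hypothetical and exist only in this
example.

```c
#include <stdbool.h>

/* Hypothetical per-event term, as it might appear between the slashes. */
enum overwrite_term {
	TERM_NONE,		/* e.g. "cycles" */
	TERM_OVERWRITE,		/* e.g. "cycles/overwrite/" */
	TERM_NO_OVERWRITE,	/* e.g. "cycles/no-overwrite/" */
};

/*
 * Start from the global --overwrite setting, then let an explicit
 * per-event term override it.
 */
static bool resolve_overwrite(bool global_overwrite, enum overwrite_term term)
{
	bool overwrite = global_overwrite;

	switch (term) {
	case TERM_OVERWRITE:
		overwrite = true;
		break;
	case TERM_NO_OVERWRITE:
		overwrite = false;
		break;
	case TERM_NONE:
	default:
		break;
	}
	return overwrite;
}
```

Under this assumption, 'perf record --overwrite -e cycles/no-overwrite/'
would leave cycles in normal mode, while other events on the same
command line would use overwrite mode.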

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c    |  1 +
 tools/perf/perf.h              |  1 +
 tools/perf/util/evsel.c        |  4 ++++
 tools/perf/util/evsel.h        |  2 ++
 tools/perf/util/parse-events.c | 14 ++++++++++++++
 tools/perf/util/parse-events.h |  4 +++-
 tools/perf/util/parse-events.l |  2 ++
 7 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 503eee9..f416296 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1271,6 +1271,7 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 		     "number of mmap data pages and AUX area tracing mmap pages",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 00c25b1..ea7f6f5 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -58,6 +58,7 @@ struct record_opts {
 	bool	     full_auxtrace;
 	bool	     auxtrace_snapshot_mode;
 	bool	     record_switch_events;
+	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 66a47ba..d9b1f1d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -670,6 +670,9 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			 */
 			attr->inherit = term->val.inherit ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+			evsel->overwrite = term->val.overwrite ? 1 : 0;
+			break;
 		default:
 			break;
 		}
@@ -745,6 +748,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
+	evsel->overwrite    = opts->overwrite;
 
 	perf_evsel__set_sample_bit(evsel, IP);
 	perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index f5c7433..f987b0b 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -44,6 +44,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
 	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_INHERIT,
+	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -57,6 +58,7 @@ struct perf_evsel_config_term {
 		char	*callgraph;
 		u64	stack_user;
 		bool	inherit;
+		bool	overwrite;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 03d18f4..c1d4f39 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -855,6 +855,12 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
@@ -892,6 +898,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
 	case PARSE_EVENTS__TERM_TYPE_INHERIT:
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -961,6 +969,12 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 			ADD_CONFIG_TERM(INHERIT, inherit, term->val.num ? 0 : 1);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index c34615f..29cc804 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -68,7 +68,9 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_CALLGRAPH,
 	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
-	PARSE_EVENTS__TERM_TYPE_INHERIT
+	PARSE_EVENTS__TERM_TYPE_INHERIT,
+	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
+	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 };
 
 struct parse_events_array {
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 27d567f..2ef6f96 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -202,6 +202,8 @@ call-graph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
 stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
 inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
+overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
+no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
1.8.3.4


* [PATCH 49/54] perf tools: Set write_backward attribut bit for overwrite events
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (47 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 48/54] perf tools: Enable overwrite settings Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading Wang Nan
                   ` (5 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

The write_backward attribute makes the kernel fill the ring buffer from
its end, which makes reading from an overwrite ring buffer possible.
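
A toy model may help show why writing backward makes an overwritten
buffer readable. It is not the kernel's ring buffer: the one-byte
length header, the 64-byte size and all names below are assumptions
made for this sketch. The idea is that 'head' always points at the
start of the newest record, so a reader can walk from 'head' toward
older records and stop at the first one that no longer fits in a full
buffer length.

```c
#include <stdint.h>
#include <string.h>

#define RING_SIZE 64u			/* must be a power of two */
#define RING_MASK (RING_SIZE - 1)

struct ring {
	uint64_t head;			/* decreases as records are written */
	unsigned char data[RING_SIZE];
};

/*
 * Toy record layout: one length byte (total record size, including the
 * length byte itself) followed by len - 1 copies of 'tag'.
 */
static void ring_write_backward(struct ring *r, unsigned char tag,
				unsigned char len)
{
	uint64_t pos;
	unsigned char i;

	r->head -= len;		/* reserve space in front of the last record */
	pos = r->head;
	r->data[pos++ & RING_MASK] = len;
	for (i = 1; i < len; i++)
		r->data[pos++ & RING_MASK] = tag;
}

/*
 * Walk from the newest record (at head) toward older ones. A record
 * that would extend past one full buffer length has been partly
 * overwritten, so the walk stops there. Returns the number of intact
 * records; tags[] receives their tags, newest first.
 */
static int ring_read_backward(const struct ring *r, unsigned char *tags,
			      int max)
{
	uint64_t off = 0;
	int n = 0;

	while (n < max && off < RING_SIZE) {
		unsigned char len = r->data[(r->head + off) & RING_MASK];

		if (len < 2 || off + len > RING_SIZE)
			break;
		tags[n++] = r->data[(r->head + off + 1) & RING_MASK];
		off += len;
	}
	return n;
}
```

In a forward-writing overwrite buffer the reader would instead know
only where the newest record ends, and would have no reliable way to
find where the oldest surviving record begins.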

This patch sets this attribute if evsel->overwrite is selected
explicitly by the user.

Overwrite and write_backward are still controlled separately for legacy
read-only mmap users (most of them are in perf/tests).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  7 +++++++
 tools/perf/util/evlist.c    |  2 ++
 tools/perf/util/evlist.h    |  1 +
 tools/perf/util/evsel.c     | 13 +++++++++++++
 4 files changed, 23 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f416296..09aa4ee 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -332,6 +332,13 @@ static int record__open(struct record *rec)
 	perf_evlist__config(evlist, opts);
 
 	evlist__for_each(evlist, pos) {
+		if (pos->overwrite) {
+			if (!pos->attr.write_backward) {
+				ui__warning("Unable to read from overwrite ring buffer\n\n");
+				rc = -ENOSYS;
+				goto out;
+			}
+		}
 try_again:
 		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
 			if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3c308a7..807930b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -973,6 +973,8 @@ perf_evlist__channel_for_evsel(struct perf_evsel *evsel)
 
 	if (evsel->overwrite)
 		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	if (evsel->attr.write_backward)
+		flag |= PERF_EVLIST__CHANNEL_BACKWARD;
 	return flag;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 21a8b85..321224c 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -24,6 +24,7 @@ struct record_opts;
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 	PERF_EVLIST__CHANNEL_RDONLY	= 2,
+	PERF_EVLIST__CHANNEL_BACKWARD	= 4,
 };
 
 /**
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d9b1f1d..d821556 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -678,6 +678,19 @@ static void apply_config_terms(struct perf_evsel *evsel,
 		}
 	}
 
+	/*
+	 * Set backward after config term processing because it is
+	 * possible to set overwrite globally, without config
+	 * terms.
+	 */
+	if (evsel->overwrite) {
+		if (opts->has_write_backward)
+			attr->write_backward = 1;
+		else
+			pr_err("Reading from overwrite event %s is not supported\n",
+			       evsel->name);
+	}
+
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
 	if ((callgraph_buf != NULL) || (dump_size > 0)) {
 
-- 
1.8.3.4


* [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (48 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 49/54] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-26  8:25   ` Wangnan (F)
  2016-01-25  9:56 ` [PATCH 51/54] perf record: Rename variable to make code clear Wang Nan
                   ` (4 subsequent siblings)
  54 siblings, 1 reply; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Reading from an overwrite ring buffer is unreliable. perf_evsel__pause()
should be called before reading from one.

Toggle the overwrite event state (overwrite_evt_state) after receiving
'done' or a switch-output request.
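
The transitions can be summarised in a standalone sketch. It mirrors
the switch statement in record__toggle_overwrite_evsels() from the diff
below, but the type and function names here are shortened stand-ins for
this example, and the actual pause/resume call is left to the caller.

```c
enum evt_state {
	EVT_RUNNING,		/* events are writing into the ring buffer */
	EVT_DATA_PENDING,	/* events paused, buffer not yet read */
	EVT_EMPTY,		/* events paused, buffer already read */
};

enum evt_action {
	ACT_NONE,
	ACT_PAUSE,
	ACT_RESUME,
};

/*
 * Move *cur to 'next' and report what the caller has to do with the
 * overwrite events. An EMPTY buffer never goes back to DATA_PENDING:
 * nothing new can arrive while the events are paused.
 */
static enum evt_action evt_transition(enum evt_state *cur,
				      enum evt_state next)
{
	enum evt_action action = ACT_NONE;

	switch (*cur) {
	case EVT_RUNNING:
		if (next != EVT_RUNNING)
			action = ACT_PAUSE;
		break;
	case EVT_DATA_PENDING:
		if (next == EVT_RUNNING)
			action = ACT_RESUME;
		break;
	case EVT_EMPTY:
		if (next == EVT_RUNNING)
			action = ACT_RESUME;
		if (next == EVT_DATA_PENDING)
			next = EVT_EMPTY;
		break;
	}

	*cur = next;
	return action;
}
```

A typical round would be RUNNING -> DATA_PENDING (pause), then
DATA_PENDING -> EMPTY once the buffer has been read (no action), and
finally EMPTY -> RUNNING (resume) when recording continues after a
switch-output request.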

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 77 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 09aa4ee..69f089f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -39,6 +39,11 @@
 #include <sys/mman.h>
 #include <asm/bug.h>
 
+enum overwrite_evt_state {
+	OVERWRITE_EVT_RUNNING,
+	OVERWRITE_EVT_DATA_PENDING,
+	OVERWRITE_EVT_EMPTY,
+};
 
 struct record {
 	struct perf_tool	tool;
@@ -57,6 +62,7 @@ struct record {
 	bool			buildid_all;
 	bool			timestamp_filename;
 	bool			switch_output;
+	enum overwrite_evt_state overwrite_evt_state;
 	unsigned long long	samples;
 };
 
@@ -388,6 +394,7 @@ try_again:
 
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
+	rec->overwrite_evt_state = OVERWRITE_EVT_RUNNING;
 out:
 	return rc;
 }
@@ -468,6 +475,50 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static void
+record__toggle_overwrite_evsels(struct record *rec,
+				enum overwrite_evt_state state)
+{
+	struct perf_evsel *pos;
+	struct perf_evlist *evlist = rec->evlist;
+	enum overwrite_evt_state old_state = rec->overwrite_evt_state;
+	enum action {
+		NONE,
+		PAUSE,
+		RESUME,
+	} action = NONE;
+
+	switch (old_state) {
+	case OVERWRITE_EVT_RUNNING:
+		if (state != OVERWRITE_EVT_RUNNING)
+			action = PAUSE;
+		break;
+	case OVERWRITE_EVT_DATA_PENDING:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		break;
+	case OVERWRITE_EVT_EMPTY:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		if (state == OVERWRITE_EVT_DATA_PENDING)
+			state = OVERWRITE_EVT_EMPTY;
+		break;
+	default:
+		WARN_ONCE(1, "Shouldn't get there\n");
+	}
+
+	rec->overwrite_evt_state = state;
+
+	if (action == NONE)
+		return;
+
+	evlist__for_each(evlist, pos) {
+		if (!pos->overwrite)
+			continue;
+		perf_evsel__pause(pos, action == PAUSE);
+	}
+}
+
 static bool record__mmap_should_read(struct record *rec, int idx)
 {
 	int channel = -1;
@@ -512,6 +563,8 @@ static int record__mmap_read_all(struct record *rec)
 	if (bytes_written != rec->bytes_written)
 		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
 
+	if (rec->overwrite_evt_state == OVERWRITE_EVT_DATA_PENDING)
+		record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_EMPTY);
 out:
 	return rc;
 }
@@ -870,6 +923,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
+		/*
+		 * rec->overwrite_evt_state is possible to be
+		 * OVERWRITE_EVT_EMPTY here: when done == true and
+		 * hits != rec->samples after previous reading.
+		 *
+		 * record__toggle_overwrite_evsels ensure we never
+		 * convert OVERWRITE_EVT_EMPTY to OVERWRITE_EVT_DATA_PENDING.
+		 */
+		if (switch_output_started || done || draining)
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_DATA_PENDING);
+
 		if (record__mmap_read_all(rec) < 0) {
 			auxtrace_snapshot_disable();
 			err = -1;
@@ -888,7 +952,20 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 
 		if (switch_output_started) {
+			/*
+			 * SIGUSR2 raise after or during record__mmap_read_all().
+			 * continue to read again.
+			 */
+			if (rec->overwrite_evt_state == OVERWRITE_EVT_RUNNING)
+				continue;
+
 			switch_output_started = 0;
+			/*
+			 * Reenable events in overwrite ring buffer after
+			 * record__mmap_read_all(): we should have collected
+			 * data from it.
+			 */
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_RUNNING);
 
 			if (!quiet)
 				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
-- 
1.8.3.4


* [PATCH 51/54] perf record: Rename variable to make code clear
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (49 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 52/54] perf record: Read from backward ring buffer Wang Nan
                   ` (3 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

record__mmap_read() writes data from the ring buffer into perf.data.
'head' is maintained by the kernel and points to the last written
record. 'old' is maintained by perf and points to the record read in
the previous round. record__mmap_read() saves the data between 'old'
and 'head' to perf.data. The variable names are not easy to read. In
addition, when dealing with a backward-writing ring buffer, the
md->prev pointer should point to 'head' instead of the last byte it
got.

Add start and end pointers to make the code clearer, and set md->prev
to 'head' instead of the moved 'old' pointer. This patch doesn't change
behavior since:

    buf = &data[old & md->mask];
    size = head - old;
    old += size;     <--- Here, old == head
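
For readers unfamiliar with the start/end arithmetic, here is a
standalone sketch of copying the range [start, end) out of a
power-of-two ring buffer in at most two chunks. It is not the perf
function: ring_copy() is a hypothetical helper, it copies into a flat
output buffer instead of calling record__write(), and its wrap test is
written slightly differently from the one in record__mmap_read().

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy the bytes in [start, end) out of a ring buffer of size mask + 1.
 * 'start' and 'end' are free-running positions, as 'old' and 'head' are
 * in the commit message above. Returns the number of bytes copied, or 0
 * if the range is empty or larger than the buffer.
 */
static size_t ring_copy(const unsigned char *data, uint64_t mask,
			uint64_t start, uint64_t end, unsigned char *out)
{
	uint64_t size = end - start;
	size_t copied = 0;

	if (size == 0 || size > mask + 1)
		return 0;

	/* The range wraps: first copy up to the end of the buffer. */
	if ((start & mask) + size > mask + 1) {
		size_t chunk = mask + 1 - (start & mask);

		memcpy(out, &data[start & mask], chunk);
		copied += chunk;
		start += chunk;
	}

	/* Copy the remainder; after this step start would equal end. */
	memcpy(out + copied, &data[start & mask], end - start);
	copied += end - start;

	return copied;
}
```

For an 8-byte buffer holding "ABCDEFGH", ring_copy(data, 7, 6, 11, out)
copies "GH" from the tail and then "ABC" from the front, five bytes in
total.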

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 69f089f..12147ef 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -91,17 +91,18 @@ static int record__mmap_read(struct record *rec, int idx)
 	struct perf_mmap *md = &rec->evlist->mmap[idx];
 	u64 head = perf_mmap__read_head(md);
 	u64 old = md->prev;
+	u64 end = head, start = old;
 	unsigned char *data = md->base + page_size;
 	unsigned long size;
 	void *buf;
 	int rc = 0;
 
-	if (old == head)
+	if (start == end)
 		return 0;
 
 	rec->samples++;
 
-	size = head - old;
+	size = end - start;
 	if (size > (unsigned long)(md->mask) + 1) {
 		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
@@ -110,10 +111,10 @@ static int record__mmap_read(struct record *rec, int idx)
 		return 0;
 	}
 
-	if ((old & md->mask) + size != (head & md->mask)) {
-		buf = &data[old & md->mask];
-		size = md->mask + 1 - (old & md->mask);
-		old += size;
+	if ((start & md->mask) + size != (end & md->mask)) {
+		buf = &data[start & md->mask];
+		size = md->mask + 1 - (start & md->mask);
+		start += size;
 
 		if (record__write(rec, buf, size) < 0) {
 			rc = -1;
@@ -121,16 +122,16 @@ static int record__mmap_read(struct record *rec, int idx)
 		}
 	}
 
-	buf = &data[old & md->mask];
-	size = head - old;
-	old += size;
+	buf = &data[start & md->mask];
+	size = end - start;
+	start += size;
 
 	if (record__write(rec, buf, size) < 0) {
 		rc = -1;
 		goto out;
 	}
 
-	md->prev = old;
+	md->prev = head;
 	perf_evlist__mmap_consume(rec->evlist, idx);
 out:
 	return rc;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 52/54] perf record: Read from backward ring buffer
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (50 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 51/54] perf record: Rename variable to make code clear Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 53/54] perf record: Allow generate tracking events at the end of output Wang Nan
                   ` (2 subsequent siblings)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Introduce rb_find_range() to find the start and end positions in a
backward ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 69 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 67 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 12147ef..1f03db0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -86,6 +86,61 @@ static int process_synthesized_event(struct perf_tool *tool,
 	return record__write(rec, event, event->header.size);
 }
 
+static int
+backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
+{
+	struct perf_event_header *pheader;
+	u64 evt_head = head;
+	int size = mask + 1;
+
+	pr_debug2("backward_rb_find_range: buf=%p, head=%"PRIx64"\n", buf, head);
+	pheader = (struct perf_event_header *)(buf + (head & mask));
+	*start = head;
+	while (true) {
+		if (evt_head - head >= (unsigned int)size) {
+			pr_debug("Finished reading backward ring buffer: rewind\n");
+			if (evt_head - head > (unsigned int)size)
+				evt_head -= pheader->size;
+			*end = evt_head;
+			return 0;
+		}
+
+		pheader = (struct perf_event_header *)(buf + (evt_head & mask));
+
+		if (pheader->size == 0) {
+			pr_debug("Finished reading backward ring buffer: get start\n");
+			*end = evt_head;
+			return 0;
+		}
+
+		evt_head += pheader->size;
+		pr_debug3("move evt_head: %"PRIx64"\n", evt_head);
+	}
+	WARN_ONCE(1, "Shouldn't get here\n");
+	return -1;
+}
+
+static int
+rb_find_range(struct perf_evlist *evlist, int idx,
+	      void *data, int mask, u64 head, u64 old,
+	      u64 *start, u64 *end)
+{
+	int channel;
+
+	channel = perf_evlist__idx_channel(evlist, idx);
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY)) {
+		*start = old;
+		*end = head;
+		return 0;
+	}
+
+	if (perf_evlist__channel_check(evlist, channel, BACKWARD))
+		return backward_rb_find_range(data, mask, head, start, end);
+
+	WARN_ONCE(1, "Unable to find start position from a read-only ring buffer\n");
+	return -1;
+}
+
 static int record__mmap_read(struct record *rec, int idx)
 {
 	struct perf_mmap *md = &rec->evlist->mmap[idx];
@@ -97,6 +152,10 @@ static int record__mmap_read(struct record *rec, int idx)
 	void *buf;
 	int rc = 0;
 
+	if (rb_find_range(rec->evlist, idx, data, md->mask, head,
+			  old, &start, &end))
+		return -1;
+
 	if (start == end)
 		return 0;
 
@@ -528,8 +587,14 @@ static bool record__mmap_should_read(struct record *rec, int idx)
 		return false;
 	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
 		return false;
-	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
-		return false;
+	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY)) {
+		if (rec->overwrite_evt_state != OVERWRITE_EVT_DATA_PENDING)
+			return false;
+		if (perf_evlist__channel_check(rec->evlist, channel, BACKWARD))
+			return true;
+		else
+			return false;
+	}
 	return true;
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 53/54] perf record: Allow generate tracking events at the end of output
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (51 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 52/54] perf record: Read from backward ring buffer Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-25  9:56 ` [PATCH 54/54] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
  2016-01-26  9:11 ` [offlist] Re: [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wangnan (F)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

Before this patch, tracking events are generated from information in
/proc before any samples are recorded. However, with the introduction of
overwrite evsels in perf record, this becomes inconvenient: 'perf record'
can now run as a daemon for several hours and capture only the last
snapshot when it receives SIGUSR2. The tracking events generated at
the head of the output 'perf.data' are then too old, while most of the
tracking events produced while 'perf record' was running are dropped.

This patch generates tracking events at the end of the output. The
resulting event series better reflects the state of the system when
SIGUSR2 is received.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 62 +++++++++++++++++++++++++++++++--------------
 1 file changed, 43 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1f03db0..6eaa43d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -63,6 +63,7 @@ struct record {
 	bool			timestamp_filename;
 	bool			switch_output;
 	enum overwrite_evt_state overwrite_evt_state;
+	bool			tail_tracking;
 	unsigned long long	samples;
 };
 
@@ -683,6 +684,26 @@ record__finish_output(struct record *rec)
 
 static int record__synthesize(struct record *rec);
 
+static void record__synthesize_target(struct record *rec)
+{
+	if (target__none(&rec->opts.target)) {
+		struct {
+			struct thread_map map;
+			struct thread_map_data map_data;
+		} thread_map;
+
+		thread_map.map.nr = 1;
+		thread_map.map.map[0].pid = rec->evlist->workload.pid;
+		thread_map.map.map[0].comm = NULL;
+		perf_event__synthesize_thread_map(&rec->tool,
+				&thread_map.map,
+				process_synthesized_event,
+				&rec->session->machines.host,
+				rec->opts.sample_address,
+				rec->opts.proc_map_timeout);
+	}
+}
+
 static int
 record__switch_output(struct record *rec, bool at_exit)
 {
@@ -692,6 +713,11 @@ record__switch_output(struct record *rec, bool at_exit)
 	/* Same Size:      "2015122520103046"*/
 	char timestamp[] = "InvalidTimestamp";
 
+	if (rec->tail_tracking) {
+		record__synthesize(rec);
+		record__synthesize_target(rec);
+	}
+
 	rec->samples = 0;
 	record__finish_output(rec);
 	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
@@ -718,23 +744,10 @@ record__switch_output(struct record *rec, bool at_exit)
 		machines__init(&rec->session->machines);
 		perf_session__create_kernel_maps(rec->session);
 		perf_session__set_id_hdr_size(rec->session);
-		record__synthesize(rec);
 
-		if (target__none(&rec->opts.target)) {
-			struct {
-				struct thread_map map;
-				struct thread_map_data map_data;
-			} thread_map;
-
-			thread_map.map.nr = 1;
-			thread_map.map.map[0].pid = rec->evlist->workload.pid;
-			thread_map.map.map[0].comm = NULL;
-			perf_event__synthesize_thread_map(&rec->tool,
-					&thread_map.map,
-					process_synthesized_event,
-					&rec->session->machines.host,
-					rec->opts.sample_address,
-					rec->opts.proc_map_timeout);
+		if (!rec->tail_tracking) {
+			record__synthesize(rec);
+			record__synthesize_target(rec);
 		}
 	}
 	return fd;
@@ -930,9 +943,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	err = record__synthesize(rec);
-	if (err < 0)
-		goto out_child;
+	if (!rec->tail_tracking) {
+		err = record__synthesize(rec);
+		if (err < 0)
+			goto out_child;
+	}
 
 	if (rec->realtime_prio) {
 		struct sched_param param;
@@ -1073,6 +1088,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			disabled = true;
 		}
 	}
+
+	if (rec->tail_tracking) {
+		err = record__synthesize(rec);
+		if (err < 0)
+			goto out_child;
+	}
+
 	auxtrace_snapshot_disable();
 
 	if (forks && workload_exec_errno) {
@@ -1499,6 +1521,8 @@ struct option __record_options[] = {
 		    "append timestamp to output filename"),
 	OPT_BOOLEAN(0, "switch-output", &record.switch_output,
 		    "Switch output when receive SIGUSR2"),
+	OPT_BOOLEAN(0, "tail-tracking", &record.tail_tracking,
+		    "Generate tracking events at the end of output"),
 	OPT_END()
 };
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 54/54] perf tools: Don't warn about out of order event if write_backward is used
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (52 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 53/54] perf record: Allow generate tracking events at the end of output Wang Nan
@ 2016-01-25  9:56 ` Wang Nan
  2016-01-26  9:11 ` [offlist] Re: [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wangnan (F)
  54 siblings, 0 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-25  9:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel, Wang Nan

If the write_backward attribute is set, records are written into the
kernel ring buffer from end to beginning, but read from beginning to end,
so the timestamps of the records are in reverse order. To avoid the
'XX out of order events recorded' warning message in that case, suppress
the warning if write_backward is selected by at least one event.

Result:

Before this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
 [ perf record: Woken up 5 times to write data ]
 Warning:
 40 out of order events recorded.
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

After this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
 [ perf record: Woken up 5 times to write data ]
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/session.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 40b7a0d..132c6ab 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1516,10 +1516,27 @@ int perf_session__register_idle_thread(struct perf_session *session)
 	return err;
 }
 
+static void
+perf_session__warn_order(const struct perf_session *session)
+{
+	const struct ordered_events *oe = &session->ordered_events;
+	struct perf_evsel *evsel;
+	bool should_warn = true;
+
+	evlist__for_each(session->evlist, evsel) {
+		if (evsel->attr.write_backward)
+			should_warn = false;
+	}
+
+	if (!should_warn)
+		return;
+	if (oe->nr_unordered_events != 0)
+		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+}
+
 static void perf_session__warn_about_errors(const struct perf_session *session)
 {
 	const struct events_stats *stats = &session->evlist->stats;
-	const struct ordered_events *oe = &session->ordered_events;
 
 	if (session->tool->lost == perf_event__process_lost &&
 	    stats->nr_events[PERF_RECORD_LOST] != 0) {
@@ -1576,8 +1593,7 @@ static void perf_session__warn_about_errors(const struct perf_session *session)
 			    stats->nr_unprocessable_samples);
 	}
 
-	if (oe->nr_unordered_events != 0)
-		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+	perf_session__warn_order(session);
 
 	events_stats__auxtrace_error_warn(stats);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading
  2016-01-25  9:56 ` [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading Wang Nan
@ 2016-01-26  8:25   ` Wangnan (F)
  0 siblings, 0 replies; 79+ messages in thread
From: Wangnan (F) @ 2016-01-26  8:25 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo
  Cc: Brendan Gregg, Daniel Borkmann, David S. Miller, He Kuang,
	Jiri Olsa, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Will Deacon, linux-kernel



On 2016/1/25 17:56, Wang Nan wrote:
> Reading from a overwrite ring buffer is unrelible. perf_evsel__pause()
> should be called before reading from them.
>
> Toggel overwrite_evt_paused director after receiving done or switch
> output.
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>   tools/perf/builtin-record.c | 77 +++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 77 insertions(+)

[SNIP]

> +static void
> +record__toggle_overwrite_evsels(struct record *rec,
> +				enum overwrite_evt_state state)
> +{

[SNIP]

> +	rec->overwrite_evt_state = state;
> +
> +	if (action == NONE)
> +		return;
> +
> +	evlist__for_each(evlist, pos) {
> +		if (!pos->overwrite)
> +			continue;
> +		perf_evsel__pause(pos, action == PAUSE);
> +	}
> +}
> +
This part is incorrect. We should pause ring buffers for each CPU
in a channel, not each evsel.

Already fixed at:

https://git.kernel.org/cgit/linux/kernel/git/pi3orama/linux.git/commit/?h=perf/overwrite&id=fe59d9c6621c60087ce7e6e269f2f15f152d6d71

Thank you.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [offlist] Re: [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support
  2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (53 preceding siblings ...)
  2016-01-25  9:56 ` [PATCH 54/54] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
@ 2016-01-26  9:11 ` Wangnan (F)
  2016-01-26 14:11   ` Arnaldo Carvalho de Melo
  54 siblings, 1 reply; 79+ messages in thread
From: Wangnan (F) @ 2016-01-26  9:11 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alexei Starovoitov, Brendan Gregg, pi3orama, lizefan 00213767,
	linux-kernel

Hi Arnaldo,

We haven't made much progress on this patchset for several weeks.
Kernel support for bpf-output was already merged in v4.4, but the perf
side code is still missing in v4.5. Do you have any plans for it?

Brendan asked about this feature this month. I think he would be
disappointed because he is still unable to use it on a v4.5 kernel...

Thank you.


On 2016/1/25 17:55, Wang Nan wrote:
> Hi Arnaldo,
>
> The following changes since commit 512e583b2d4a35b644c8ff36e033b90be7e91c2e:
>
>    perf hists browser: Offer non-symbol specific menu options for --sort without 'sym' (2016-01-22 14:28:48 -0300)
>
> are available in the git repository at:
>
>    https://git.kernel.org/pub/scm/linux/kernel/git/pi3orama/linux.git tags/perf-core-for-acme
>
> for you to fetch changes up to 7c8463658d92b82c2a8db9f405b50ae814b91f71:
>
>    perf tools: Don't warn about out of order event if write_backward is used (2016-01-25 09:49:15 +0000)
>
> ----------------------------------------------------------------
> perf improvements:
>
>   - Bug fixes:
>     libbpf relocation checker for a llvm bug
>     fix symbol searching for offline modules
>
>   - Building scripts:
>     Use feature-dump results for build-test
>
>   - BPF related improvements:
>     Enable indices syntax to support init BPF maps
>     Support BPF output events support
>
>   - perf/core:
>     Add write_backward attribute bit to support reading from
>        overwrite ring buffer
>
>   - perf record improvements:
>     Enable perf record dump different output
>     Support reading from overwrite ring buffer based on write_backward
>       attribute
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>
> ----------------------------------------------------------------
> He Kuang (1):
>        perf tools: Support perf event alias name
>
> Wang Nan (53):
>        perf test: Add libbpf relocation checker
>        perf bpf: Check relocation target section
>        tools build: Allow subprojects select all feature checkers
>        perf build: Select all feature checkers for feature-dump
>        perf build: Use feature dump file for build-test
>        perf test: Check environment before start real BPF test
>        perf tools: Fix symbols searching for offline module in buildid-cache
>        perf test: Improve bp_signal
>        perf tools: Add API to config maps in bpf object
>        perf tools: Enable BPF object configure syntax
>        perf record: Apply config to BPF objects before recording
>        perf tools: Enable passing event to BPF object
>        perf tools: Support setting different slots in a BPF map separately
>        perf tools: Enable indices setting syntax for BPF maps
>        perf tools: Introduce bpf-output event
>        perf data: Support converting data from bpf_perf_event_output()
>        perf core: Introduce new ioctl options to pause and resume ring buffer
>        perf core: Set event's default overflow_handler
>        perf core: Prepare writing into ring buffer from end
>        perf core: Add backward attribute to perf event
>        perf core: Reduce perf event output overhead by new overflow handler
>        perf tools: Introduce API to pause ring buffer
>        perf tools: Only validate is_pos for tracking evsels
>        perf tools: Print write_backward value in perf_event_attr__fprintf
>        perf tools: Move timestamp creation to util
>        perf tools: Make ordered_events reusable
>        perf record: Extract synthesize code to record__synthesize()
>        perf tools: Add perf_data_file__switch() helper
>        perf record: Turns auxtrace_snapshot_enable into 3 states
>        perf record: Introduce record__finish_output() to finish a perf.data
>        perf record: Use OPT_BOOLEAN_SET for buildid cache related options
>        perf record: Add '--timestamp-filename' option to append timestamp to output filename
>        perf record: Split output into multiple files via '--switch-output'
>        perf record: Force enable --timestamp-filename when --switch-output is provided
>        perf record: Disable buildid cache options by default in switch output mode
>        perf record: Re-synthesize tracking events after output switching
>        perf record: Generate tracking events for process forked by perf
>        perf record: Ensure return non-zero rc when mmap fail
>        perf record: Prevent reading invalid data in record__mmap_read
>        perf tools: Add evlist channel helpers
>        perf tools: Automatically add new channel according to evlist
>        perf tools: Operate multiple channels
>        perf tools: Squash overwrite setting into channel
>        perf record: Don't read from and poll overwrite channel
>        perf record: Don't poll on overwrite channel
>        perf tools: Detect avalibility of write_backward
>        perf tools: Enable overwrite settings
>        perf tools: Set write_backward attribut bit for overwrite events
>        perf record: Toggle overwrite ring buffer for reading
>        perf record: Rename variable to make code clear
>        perf record: Read from backward ring buffer
>        perf record: Allow generate tracking events at the end of output
>        perf tools: Don't warn about out of order event if write_backward is used
>
>   include/linux/perf_event.h                    |  22 +-
>   include/uapi/linux/perf_event.h               |   4 +-
>   kernel/events/core.c                          |  73 ++-
>   kernel/events/internal.h                      |  11 +
>   kernel/events/ring_buffer.c                   |  63 ++-
>   tools/build/Makefile.feature                  |  21 +-
>   tools/lib/bpf/libbpf.c                        |  34 +-
>   tools/perf/Makefile.perf                      |  11 +-
>   tools/perf/builtin-buildid-cache.c            |  14 +-
>   tools/perf/builtin-record.c                   | 608 ++++++++++++++++++----
>   tools/perf/perf.h                             |   2 +
>   tools/perf/tests/.gitignore                   |   1 +
>   tools/perf/tests/Build                        |   9 +-
>   tools/perf/tests/bp_signal.c                  | 140 +++++-
>   tools/perf/tests/bpf-script-test-relocation.c |  50 ++
>   tools/perf/tests/bpf.c                        |  63 ++-
>   tools/perf/tests/llvm.c                       |  17 +-
>   tools/perf/tests/llvm.h                       |   5 +-
>   tools/perf/tests/make                         |  31 ++
>   tools/perf/util/bpf-loader.c                  | 699 ++++++++++++++++++++++++++
>   tools/perf/util/bpf-loader.h                  |  59 +++
>   tools/perf/util/build-id.c                    |  44 ++
>   tools/perf/util/build-id.h                    |   1 +
>   tools/perf/util/data-convert-bt.c             | 112 ++++-
>   tools/perf/util/data.c                        |  36 ++
>   tools/perf/util/data.h                        |  11 +-
>   tools/perf/util/evlist.c                      | 314 ++++++++++--
>   tools/perf/util/evlist.h                      |  67 ++-
>   tools/perf/util/evsel.c                       |  30 ++
>   tools/perf/util/evsel.h                       |  13 +
>   tools/perf/util/ordered-events.c              |   5 +
>   tools/perf/util/parse-events.c                | 139 ++++-
>   tools/perf/util/parse-events.h                |  24 +-
>   tools/perf/util/parse-events.l                |  18 +-
>   tools/perf/util/parse-events.y                | 123 ++++-
>   tools/perf/util/record.c                      |  11 +
>   tools/perf/util/session.c                     |  22 +-
>   tools/perf/util/symbol.c                      |   4 +
>   tools/perf/util/util.c                        |  17 +
>   tools/perf/util/util.h                        |   1 +
>   40 files changed, 2689 insertions(+), 240 deletions(-)
>   create mode 100644 tools/perf/tests/bpf-script-test-relocation.c
>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [offlist] Re: [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support
  2016-01-26  9:11 ` [offlist] Re: [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wangnan (F)
@ 2016-01-26 14:11   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 79+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-26 14:11 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Alexei Starovoitov, Brendan Gregg, pi3orama, lizefan 00213767,
	linux-kernel

Em Tue, Jan 26, 2016 at 05:11:28PM +0800, Wangnan (F) escreveu:
> Hi Arnaldo,
> 
> We didn't make too much progress on this patchset for several weeks.
> Kernel support of bpf-output has already been merged by v4.4, but perf
> side code is still missing in v4.5. Do you have any plan on it?

Yes, I have. I updated my main machine and had to reinstall the clang
environment, figured out that the one in Fedora doesn't support it, and
reported it to the Fedora guys; the next version will have it, so we'll
have one less roadblock.

But this is complex code that needs reviewing, as are several other
fronts, and in December I was out: vacations, life got in the way (a son,
heya!).

We'll get there, expect progress in the coming days and weeks.

- Arnaldo
 
> Brendan asked this feature this month. I think he would be disappointed
> because he still unable to use them on v4.5 kernel...
> 
> Thank you.
> 
> 
> On 2016/1/25 17:55, Wang Nan wrote:
> >Hi Arnaldo,
> >
> >The following changes since commit 512e583b2d4a35b644c8ff36e033b90be7e91c2e:
> >
> >   perf hists browser: Offer non-symbol specific menu options for --sort without 'sym' (2016-01-22 14:28:48 -0300)
> >
> >are available in the git repository at:
> >
> >   https://git.kernel.org/pub/scm/linux/kernel/git/pi3orama/linux.git tags/perf-core-for-acme
> >
> >for you to fetch changes up to 7c8463658d92b82c2a8db9f405b50ae814b91f71:
> >
> >   perf tools: Don't warn about out of order event if write_backward is used (2016-01-25 09:49:15 +0000)
> >
> >----------------------------------------------------------------
> >perf improvements:
> >
> >  - Bug fixes:
> >    libbpf relocation checker for a llvm bug
> >    fix symbol searching for offline modules
> >
> >  - Building scripts:
> >    Use feature-dump results for build-test
> >
> >  - BPF related improvements:
> >    Enable indices syntax to support init BPF maps
> >    Support BPF output events support
> >
> >  - perf/core:
> >    Add write_backward attribute bit to support reading from
> >       overwrite ring buffer
> >
> >  - perf record improvements:
> >    Enable perf record dump different output
> >    Support reading from overwrite ring buffer based on write_backward
> >      attribute
> >
> >Signed-off-by: Wang Nan <wangnan0@huawei.com>
> >
> >----------------------------------------------------------------
> >He Kuang (1):
> >       perf tools: Support perf event alias name
> >
> >Wang Nan (53):
> >       perf test: Add libbpf relocation checker
> >       perf bpf: Check relocation target section
> >       tools build: Allow subprojects select all feature checkers
> >       perf build: Select all feature checkers for feature-dump
> >       perf build: Use feature dump file for build-test
> >       perf test: Check environment before start real BPF test
> >       perf tools: Fix symbols searching for offline module in buildid-cache
> >       perf test: Improve bp_signal
> >       perf tools: Add API to config maps in bpf object
> >       perf tools: Enable BPF object configure syntax
> >       perf record: Apply config to BPF objects before recording
> >       perf tools: Enable passing event to BPF object
> >       perf tools: Support setting different slots in a BPF map separately
> >       perf tools: Enable indices setting syntax for BPF maps
> >       perf tools: Introduce bpf-output event
> >       perf data: Support converting data from bpf_perf_event_output()
> >       perf core: Introduce new ioctl options to pause and resume ring buffer
> >       perf core: Set event's default overflow_handler
> >       perf core: Prepare writing into ring buffer from end
> >       perf core: Add backward attribute to perf event
> >       perf core: Reduce perf event output overhead by new overflow handler
> >       perf tools: Introduce API to pause ring buffer
> >       perf tools: Only validate is_pos for tracking evsels
> >       perf tools: Print write_backward value in perf_event_attr__fprintf
> >       perf tools: Move timestamp creation to util
> >       perf tools: Make ordered_events reusable
> >       perf record: Extract synthesize code to record__synthesize()
> >       perf tools: Add perf_data_file__switch() helper
> >       perf record: Turns auxtrace_snapshot_enable into 3 states
> >       perf record: Introduce record__finish_output() to finish a perf.data
> >       perf record: Use OPT_BOOLEAN_SET for buildid cache related options
> >       perf record: Add '--timestamp-filename' option to append timestamp to output filename
> >       perf record: Split output into multiple files via '--switch-output'
> >       perf record: Force enable --timestamp-filename when --switch-output is provided
> >       perf record: Disable buildid cache options by default in switch output mode
> >       perf record: Re-synthesize tracking events after output switching
> >       perf record: Generate tracking events for process forked by perf
> >       perf record: Ensure return non-zero rc when mmap fail
> >       perf record: Prevent reading invalid data in record__mmap_read
> >       perf tools: Add evlist channel helpers
> >       perf tools: Automatically add new channel according to evlist
> >       perf tools: Operate multiple channels
> >       perf tools: Squash overwrite setting into channel
> >       perf record: Don't read from and poll overwrite channel
> >       perf record: Don't poll on overwrite channel
> >       perf tools: Detect avalibility of write_backward
> >       perf tools: Enable overwrite settings
> >       perf tools: Set write_backward attribut bit for overwrite events
> >       perf record: Toggle overwrite ring buffer for reading
> >       perf record: Rename variable to make code clear
> >       perf record: Read from backward ring buffer
> >       perf record: Allow generate tracking events at the end of output
> >       perf tools: Don't warn about out of order event if write_backward is used
> >
> >  include/linux/perf_event.h                    |  22 +-
> >  include/uapi/linux/perf_event.h               |   4 +-
> >  kernel/events/core.c                          |  73 ++-
> >  kernel/events/internal.h                      |  11 +
> >  kernel/events/ring_buffer.c                   |  63 ++-
> >  tools/build/Makefile.feature                  |  21 +-
> >  tools/lib/bpf/libbpf.c                        |  34 +-
> >  tools/perf/Makefile.perf                      |  11 +-
> >  tools/perf/builtin-buildid-cache.c            |  14 +-
> >  tools/perf/builtin-record.c                   | 608 ++++++++++++++++++----
> >  tools/perf/perf.h                             |   2 +
> >  tools/perf/tests/.gitignore                   |   1 +
> >  tools/perf/tests/Build                        |   9 +-
> >  tools/perf/tests/bp_signal.c                  | 140 +++++-
> >  tools/perf/tests/bpf-script-test-relocation.c |  50 ++
> >  tools/perf/tests/bpf.c                        |  63 ++-
> >  tools/perf/tests/llvm.c                       |  17 +-
> >  tools/perf/tests/llvm.h                       |   5 +-
> >  tools/perf/tests/make                         |  31 ++
> >  tools/perf/util/bpf-loader.c                  | 699 ++++++++++++++++++++++++++
> >  tools/perf/util/bpf-loader.h                  |  59 +++
> >  tools/perf/util/build-id.c                    |  44 ++
> >  tools/perf/util/build-id.h                    |   1 +
> >  tools/perf/util/data-convert-bt.c             | 112 ++++-
> >  tools/perf/util/data.c                        |  36 ++
> >  tools/perf/util/data.h                        |  11 +-
> >  tools/perf/util/evlist.c                      | 314 ++++++++++--
> >  tools/perf/util/evlist.h                      |  67 ++-
> >  tools/perf/util/evsel.c                       |  30 ++
> >  tools/perf/util/evsel.h                       |  13 +
> >  tools/perf/util/ordered-events.c              |   5 +
> >  tools/perf/util/parse-events.c                | 139 ++++-
> >  tools/perf/util/parse-events.h                |  24 +-
> >  tools/perf/util/parse-events.l                |  18 +-
> >  tools/perf/util/parse-events.y                | 123 ++++-
> >  tools/perf/util/record.c                      |  11 +
> >  tools/perf/util/session.c                     |  22 +-
> >  tools/perf/util/symbol.c                      |   4 +
> >  tools/perf/util/util.c                        |  17 +
> >  tools/perf/util/util.h                        |   1 +
> >  40 files changed, 2689 insertions(+), 240 deletions(-)
> >  create mode 100644 tools/perf/tests/bpf-script-test-relocation.c
> >
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 01/54] perf test: Add libbpf relocation checker
  2016-01-25  9:55 ` [PATCH 01/54] perf test: Add libbpf relocation checker Wang Nan
@ 2016-01-26 14:58   ` Arnaldo Carvalho de Melo
  2016-01-26 15:07     ` Arnaldo Carvalho de Melo
  2016-02-03 10:13   ` [tip:perf/core] " tip-bot for Wang Nan
  1 sibling, 1 reply; 79+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-26 14:58 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, acme, Brendan Gregg, Daniel Borkmann,
	David S. Miller, He Kuang, Jiri Olsa, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Will Deacon,
	linux-kernel

Em Mon, Jan 25, 2016 at 09:55:48AM +0000, Wang Nan escreveu:
> There's a bug in LLVM that it can generate unneeded relocation
> information. See [1] and [2]. Libbpf should check the target section
> of a relocation symbol.
> 
> This patch adds a testcase which reference a global variable (BPF
> doesn't support global variable). Before fixing libbpf, the new test
> case can be loaded into kernel, the global variable acts like the first
> map. It is incorrect.
> 
> Result:
>  # ~/perf test BPF
>  37: Test BPF filter                                          :
>  37.1: Test basic BPF filtering                               : Ok
>  37.2: Test BPF prologue generation                           : Ok
>  37.3: Test BPF relocation checker                            : FAILED!
> 
>  # ~/perf test -v BPF
>  ...
>  libbpf: loading object '[bpf_relocation_test]' from buffer
>  libbpf: section .strtab, size 126, link 0, flags 0, type=3
>  libbpf: section .text, size 0, link 0, flags 6, type=1
>  libbpf: section .data, size 0, link 0, flags 3, type=1
>  libbpf: section .bss, size 0, link 0, flags 3, type=8
>  libbpf: section func=sys_write, size 104, link 0, flags 6, type=1
>  libbpf: found program func=sys_write
>  libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9
>  libbpf: section maps, size 16, link 0, flags 3, type=1
>  libbpf: maps in [bpf_relocation_test]: 16 bytes
>  libbpf: section license, size 4, link 0, flags 3, type=1
>  libbpf: license of [bpf_relocation_test] is GPL
>  libbpf: section version, size 4, link 0, flags 3, type=1
>  libbpf: kernel version of [bpf_relocation_test] is 40400
>  libbpf: section .symtab, size 144, link 1, flags 0, type=2
>  libbpf: map 0 is "my_table"
>  libbpf: collecting relocating info for: 'func=sys_write'
>  libbpf: relocation: insn_idx=7
>  Success unexpectedly: libbpf error when dealing with relocation

"Success unexpectedly?" Reading the code to try to grok this message...


- Arnaldo

>  test child finished with -1
>  ---- end ----
>  Test BPF filter subtest 2: FAILED!
> 
> [1] https://llvm.org/bugs/show_bug.cgi?id=26243
> [2] https://patchwork.ozlabs.org/patch/571385/
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/Makefile.perf                      |  2 +-
>  tools/perf/tests/.gitignore                   |  1 +
>  tools/perf/tests/Build                        |  9 ++++-
>  tools/perf/tests/bpf-script-test-relocation.c | 50 +++++++++++++++++++++++++++
>  tools/perf/tests/bpf.c                        | 26 +++++++++++---
>  tools/perf/tests/llvm.c                       | 17 ++++++---
>  tools/perf/tests/llvm.h                       |  5 ++-
>  7 files changed, 98 insertions(+), 12 deletions(-)
>  create mode 100644 tools/perf/tests/bpf-script-test-relocation.c
> 
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 5d34815..97ce869 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -618,7 +618,7 @@ clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean
>  	$(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32
>  	$(call QUIET_CLEAN, core-gen)   $(RM)  *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex* \
>  		$(OUTPUT)util/intel-pt-decoder/inat-tables.c $(OUTPUT)fixdep \
> -		$(OUTPUT)tests/llvm-src-{base,kbuild,prologue}.c
> +		$(OUTPUT)tests/llvm-src-{base,kbuild,prologue,relocation}.c
>  	$(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
>  	$(python-clean)
>  
> diff --git a/tools/perf/tests/.gitignore b/tools/perf/tests/.gitignore
> index bf016c4..8cc30e7 100644
> --- a/tools/perf/tests/.gitignore
> +++ b/tools/perf/tests/.gitignore
> @@ -1,3 +1,4 @@
>  llvm-src-base.c
>  llvm-src-kbuild.c
>  llvm-src-prologue.c
> +llvm-src-relocation.c
> diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
> index 614899b..1ba628e 100644
> --- a/tools/perf/tests/Build
> +++ b/tools/perf/tests/Build
> @@ -31,7 +31,7 @@ perf-y += sample-parsing.o
>  perf-y += parse-no-sample-id-all.o
>  perf-y += kmod-path.o
>  perf-y += thread-map.o
> -perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o
> +perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o llvm-src-relocation.o
>  perf-y += bpf.o
>  perf-y += topology.o
>  perf-y += cpumap.o
> @@ -59,6 +59,13 @@ $(OUTPUT)tests/llvm-src-prologue.c: tests/bpf-script-test-prologue.c tests/Build
>  	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
>  	$(Q)echo ';' >> $@
>  
> +$(OUTPUT)tests/llvm-src-relocation.c: tests/bpf-script-test-relocation.c tests/Build
> +	$(call rule_mkdir)
> +	$(Q)echo '#include <tests/llvm.h>' > $@
> +	$(Q)echo 'const char test_llvm__bpf_test_relocation[] =' >> $@
> +	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
> +	$(Q)echo ';' >> $@
> +
>  ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64))
>  perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
>  endif
> diff --git a/tools/perf/tests/bpf-script-test-relocation.c b/tools/perf/tests/bpf-script-test-relocation.c
> new file mode 100644
> index 0000000..93af774
> --- /dev/null
> +++ b/tools/perf/tests/bpf-script-test-relocation.c
> @@ -0,0 +1,50 @@
> +/*
> + * bpf-script-test-relocation.c
> + * Test BPF loader checking relocation
> + */
> +#ifndef LINUX_VERSION_CODE
> +# error Need LINUX_VERSION_CODE
> +# error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
> +#endif
> +#define BPF_ANY 0
> +#define BPF_MAP_TYPE_ARRAY 2
> +#define BPF_FUNC_map_lookup_elem 1
> +#define BPF_FUNC_map_update_elem 2
> +
> +static void *(*bpf_map_lookup_elem)(void *map, void *key) =
> +	(void *) BPF_FUNC_map_lookup_elem;
> +static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) =
> +	(void *) BPF_FUNC_map_update_elem;
> +
> +struct bpf_map_def {
> +	unsigned int type;
> +	unsigned int key_size;
> +	unsigned int value_size;
> +	unsigned int max_entries;
> +};
> +
> +#define SEC(NAME) __attribute__((section(NAME), used))
> +struct bpf_map_def SEC("maps") my_table = {
> +	.type = BPF_MAP_TYPE_ARRAY,
> +	.key_size = sizeof(int),
> +	.value_size = sizeof(int),
> +	.max_entries = 1,
> +};
> +
> +int this_is_a_global_val;
> +
> +SEC("func=sys_write")
> +int bpf_func__sys_write(void *ctx)
> +{
> +	int key = 0;
> +	int value = 0;
> +
> +	/*
> +	 * Incorrect relocation. Should not allow this program be
> +	 * loaded into kernel.
> +	 */
> +	bpf_map_update_elem(&this_is_a_global_val, &key, &value, 0);
> +	return 0;
> +}
> +char _license[] SEC("license") = "GPL";
> +int _version SEC("version") = LINUX_VERSION_CODE;
> diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
> index 33689a0..952ca99 100644
> --- a/tools/perf/tests/bpf.c
> +++ b/tools/perf/tests/bpf.c
> @@ -71,6 +71,15 @@ static struct {
>  		(NR_ITERS + 1) / 4,
>  	},
>  #endif
> +	{
> +		LLVM_TESTCASE_BPF_RELOCATION,
> +		"Test BPF relocation checker",
> +		"[bpf_relocation_test]",
> +		"fix 'perf test LLVM' first",
> +		"libbpf error when dealing with relocation",
> +		NULL,
> +		0,
> +	},
>  };
>  
>  static int do_test(struct bpf_object *obj, int (*func)(void),
> @@ -190,7 +199,7 @@ static int __test__bpf(int idx)
>  
>  	ret = test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz,
>  				       bpf_testcase_table[idx].prog_id,
> -				       true);
> +				       true, NULL);
>  	if (ret != TEST_OK || !obj_buf || !obj_buf_sz) {
>  		pr_debug("Unable to get BPF object, %s\n",
>  			 bpf_testcase_table[idx].msg_compile_fail);
> @@ -202,14 +211,21 @@ static int __test__bpf(int idx)
>  
>  	obj = prepare_bpf(obj_buf, obj_buf_sz,
>  			  bpf_testcase_table[idx].name);
> -	if (!obj) {
> +	if ((!!bpf_testcase_table[idx].target_func) != (!!obj)) {
> +		if (!obj)
> +			pr_debug("Fail to load BPF object: %s\n",
> +				 bpf_testcase_table[idx].msg_load_fail);
> +		else
> +			pr_debug("Success unexpectedly: %s\n",
> +				 bpf_testcase_table[idx].msg_load_fail);
>  		ret = TEST_FAIL;
>  		goto out;
>  	}
>  
> -	ret = do_test(obj,
> -		      bpf_testcase_table[idx].target_func,
> -		      bpf_testcase_table[idx].expect_result);
> +	if (obj)
> +		ret = do_test(obj,
> +			      bpf_testcase_table[idx].target_func,
> +			      bpf_testcase_table[idx].expect_result);
>  out:
>  	bpf__clear();
>  	return ret;
> diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
> index 06f45c1..70edcdf 100644
> --- a/tools/perf/tests/llvm.c
> +++ b/tools/perf/tests/llvm.c
> @@ -35,6 +35,7 @@ static int test__bpf_parsing(void *obj_buf __maybe_unused,
>  static struct {
>  	const char *source;
>  	const char *desc;
> +	bool should_load_fail;
>  } bpf_source_table[__LLVM_TESTCASE_MAX] = {
>  	[LLVM_TESTCASE_BASE] = {
>  		.source = test_llvm__bpf_base_prog,
> @@ -48,14 +49,19 @@ static struct {
>  		.source = test_llvm__bpf_test_prologue_prog,
>  		.desc = "Compile source for BPF prologue generation test",
>  	},
> +	[LLVM_TESTCASE_BPF_RELOCATION] = {
> +		.source = test_llvm__bpf_test_relocation,
> +		.desc = "Compile source for BPF relocation test",
> +		.should_load_fail = true,
> +	},
>  };
>  
> -
>  int
>  test_llvm__fetch_bpf_obj(void **p_obj_buf,
>  			 size_t *p_obj_buf_sz,
>  			 enum test_llvm__testcase idx,
> -			 bool force)
> +			 bool force,
> +			 bool *should_load_fail)
>  {
>  	const char *source;
>  	const char *desc;
> @@ -68,6 +74,8 @@ test_llvm__fetch_bpf_obj(void **p_obj_buf,
>  
>  	source = bpf_source_table[idx].source;
>  	desc = bpf_source_table[idx].desc;
> +	if (should_load_fail)
> +		*should_load_fail = bpf_source_table[idx].should_load_fail;
>  
>  	perf_config(perf_config_cb, NULL);
>  
> @@ -136,14 +144,15 @@ int test__llvm(int subtest)
>  	int ret;
>  	void *obj_buf = NULL;
>  	size_t obj_buf_sz = 0;
> +	bool should_load_fail = false;
>  
>  	if ((subtest < 0) || (subtest >= __LLVM_TESTCASE_MAX))
>  		return TEST_FAIL;
>  
>  	ret = test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz,
> -				       subtest, false);
> +				       subtest, false, &should_load_fail);
>  
> -	if (ret == TEST_OK) {
> +	if (ret == TEST_OK && !should_load_fail) {
>  		ret = test__bpf_parsing(obj_buf, obj_buf_sz);
>  		if (ret != TEST_OK) {
>  			pr_debug("Failed to parse test case '%s'\n",
> diff --git a/tools/perf/tests/llvm.h b/tools/perf/tests/llvm.h
> index 5150b4d..0eaa604 100644
> --- a/tools/perf/tests/llvm.h
> +++ b/tools/perf/tests/llvm.h
> @@ -7,14 +7,17 @@
>  extern const char test_llvm__bpf_base_prog[];
>  extern const char test_llvm__bpf_test_kbuild_prog[];
>  extern const char test_llvm__bpf_test_prologue_prog[];
> +extern const char test_llvm__bpf_test_relocation[];
>  
>  enum test_llvm__testcase {
>  	LLVM_TESTCASE_BASE,
>  	LLVM_TESTCASE_KBUILD,
>  	LLVM_TESTCASE_BPF_PROLOGUE,
> +	LLVM_TESTCASE_BPF_RELOCATION,
>  	__LLVM_TESTCASE_MAX,
>  };
>  
>  int test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz,
> -			     enum test_llvm__testcase index, bool force);
> +			     enum test_llvm__testcase index, bool force,
> +			     bool *should_load_fail);
>  #endif
> -- 
> 1.8.3.4


* Re: [PATCH 01/54] perf test: Add libbpf relocation checker
  2016-01-26 14:58   ` Arnaldo Carvalho de Melo
@ 2016-01-26 15:07     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 79+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-26 15:07 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, acme, Brendan Gregg, Daniel Borkmann,
	David S. Miller, He Kuang, Jiri Olsa, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Will Deacon,
	linux-kernel

Em Tue, Jan 26, 2016 at 12:58:50PM -0200, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 25, 2016 at 09:55:48AM +0000, Wang Nan escreveu:
> > There's a bug in LLVM that it can generate unneeded relocation
> > information. See [1] and [2]. Libbpf should check the target section
> > of a relocation symbol.
> > 
> > This patch adds a testcase which reference a global variable (BPF
> > doesn't support global variable). Before fixing libbpf, the new test
> > case can be loaded into kernel, the global variable acts like the first
> > map. It is incorrect.
> > 
> > Result:
> >  # ~/perf test BPF
> >  37: Test BPF filter                                          :
> >  37.1: Test basic BPF filtering                               : Ok
> >  37.2: Test BPF prologue generation                           : Ok
> >  37.3: Test BPF relocation checker                            : FAILED!
> > 
> >  # ~/perf test -v BPF
> >  ...
> >  libbpf: loading object '[bpf_relocation_test]' from buffer
> >  libbpf: section .strtab, size 126, link 0, flags 0, type=3
> >  libbpf: section .text, size 0, link 0, flags 6, type=1
> >  libbpf: section .data, size 0, link 0, flags 3, type=1
> >  libbpf: section .bss, size 0, link 0, flags 3, type=8
> >  libbpf: section func=sys_write, size 104, link 0, flags 6, type=1
> >  libbpf: found program func=sys_write
> >  libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9
> >  libbpf: section maps, size 16, link 0, flags 3, type=1
> >  libbpf: maps in [bpf_relocation_test]: 16 bytes
> >  libbpf: section license, size 4, link 0, flags 3, type=1
> >  libbpf: license of [bpf_relocation_test] is GPL
> >  libbpf: section version, size 4, link 0, flags 3, type=1
> >  libbpf: kernel version of [bpf_relocation_test] is 40400
> >  libbpf: section .symtab, size 144, link 1, flags 0, type=2
> >  libbpf: map 0 is "my_table"
> >  libbpf: collecting relocating info for: 'func=sys_write'
> >  libbpf: relocation: insn_idx=7
> >  Success unexpectedly: libbpf error when dealing with relocation
> 
> "Success unexpectedly?" Reading the code to try to grok this message...

        obj = prepare_bpf(obj_buf, obj_buf_sz,
                          bpf_testcase_table[idx].name);
        if ((!!bpf_testcase_table[idx].target_func) != (!!obj)) {
                if (!obj)
                        pr_debug("Fail to load BPF object: %s\n",
                                 bpf_testcase_table[idx].msg_load_fail);
                else
                        pr_debug("Success unexpectedly: %s\n",
                                 bpf_testcase_table[idx].msg_load_fail);
                ret = TEST_FAIL;
                goto out;
        }


Ok, so in this case you have target_func == NULL, and you managed to
prepare the BPF object. That shouldn't have been the case, i.e.
prepare_bpf() should've returned NULL.

Perhaps replace that "Success unexpectedly" with "Unexpected success,
this script is invalid, should've been marked as such by function
libbpf_foo()"?

Now to apply the follow up patch to see how that will make this test
work as expected...

- Arnaldo


* Re: [PATCH 05/54] perf build: Use feature dump file for build-test
  2016-01-25  9:55 ` [PATCH 05/54] perf build: Use feature dump file for build-test Wang Nan
@ 2016-01-26 16:59   ` Arnaldo Carvalho de Melo
  2016-01-27  2:36     ` Wangnan (F)
  2016-01-27 11:22     ` [PATCH] tools build: Check basic headers for test-compile feature checker Wang Nan
  0 siblings, 2 replies; 79+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-26 16:59 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Brendan Gregg, Daniel Borkmann,
	David S. Miller, He Kuang, Jiri Olsa, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Will Deacon,
	linux-kernel

Em Mon, Jan 25, 2016 at 09:55:52AM +0000, Wang Nan escreveu:
> To prevent feature check run too many times, this patch utilizes
> previous introduced feature-dump make target and FEATURES_DUMP
> variable, makes sure the feature checkers run only once when doing
> build-test for normal test cases.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>

So, I'm having this problem when this patch is applied.

[acme@jouet linux]$ make -C tools clean
make: Entering directory '/home/acme/git/linux/tools'
  DESCEND  power/acpi
make[1]: Entering directory '/home/acme/git/linux/tools/power/acpi'
  DESCEND  tools/acpidbg
make[2]: Entering directory '/home/acme/git/linux/tools/power/acpi/tools/acpidbg'
find ./ \( -not -type d \) \
-and \( -name '*~' -o -name '*.[oas]' \) \
-type f -print \
 | xargs rm -f
rm -f ./acpidbg
make[2]: Leaving directory '/home/acme/git/linux/tools/power/acpi/tools/acpidbg'
  DESCEND  tools/acpidump
make[2]: Entering directory '/home/acme/git/linux/tools/power/acpi/tools/acpidump'
find ./ \( -not -type d \) \
-and \( -name '*~' -o -name '*.[oas]' \) \
-type f -print \
 | xargs rm -f
rm -f ./acpidump
make[2]: Leaving directory '/home/acme/git/linux/tools/power/acpi/tools/acpidump'
  DESCEND  tools/ec
make[2]: Entering directory '/home/acme/git/linux/tools/power/acpi/tools/ec'
find ./ \( -not -type d \) \
-and \( -name '*~' -o -name '*.[oas]' \) \
-type f -print \
 | xargs rm -f
rm -f ./ec
make[2]: Leaving directory '/home/acme/git/linux/tools/power/acpi/tools/ec'
make[1]: Leaving directory '/home/acme/git/linux/tools/power/acpi'
  DESCEND  cgroup
make[1]: Entering directory '/home/acme/git/linux/tools/cgroup'
rm -f cgroup_event_listener
make[1]: Leaving directory '/home/acme/git/linux/tools/cgroup'
  DESCEND  power/cpupower
make[1]: Entering directory '/home/acme/git/linux/tools/power/cpupower'
find ./ \( -not -type d \) -and \( -name '*~' -o -name '*.[oas]' \) -type f -print \
 | xargs rm -f
rm -f ./cpupower
rm -f ./libcpupower.so*
rm -rf ./po/*.gmo
rm -rf ./po/*.pot
make -C bench O=./ clean
make[2]: Entering directory '/home/acme/git/linux/tools/power/cpupower/bench'
rm -f .//*.o
rm -f .//cpufreq-bench
make[2]: Leaving directory '/home/acme/git/linux/tools/power/cpupower/bench'
make[1]: Leaving directory '/home/acme/git/linux/tools/power/cpupower'
  DESCEND  hv
make[1]: Entering directory '/home/acme/git/linux/tools/hv'
rm -f hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
make[1]: Leaving directory '/home/acme/git/linux/tools/hv'
  DESCEND  firewire
make[1]: Entering directory '/home/acme/git/linux/tools/firewire'
rm -rf *.o nosy-dump
make[1]: Leaving directory '/home/acme/git/linux/tools/firewire'
  DESCEND  lguest
make[1]: Entering directory '/home/acme/git/linux/tools/lguest'
rm -f lguest
rm -rf include
make[1]: Leaving directory '/home/acme/git/linux/tools/lguest'
  DESCEND  perf
make[1]: Entering directory '/home/acme/git/linux/tools/perf'
  CLEAN    libtraceevent
  CLEAN    libapi
  CLEAN    libsubcmd
  CLEAN    libbpf
  CLEAN    libsubcmd
  CLEAN    config
  CLEAN    core-objs
  CLEAN    core-progs
  CLEAN    core-gen
  SUBDIR   Documentation
  CLEAN    Documentation
  CLEAN    python
make[1]: Leaving directory '/home/acme/git/linux/tools/perf'
  DESCEND  testing/selftests
make[1]: Entering directory '/home/acme/git/linux/tools/testing/selftests'
for TARGET in breakpoints cpu-hotplug efivarfs exec firmware ftrace futex kcmp lib membarrier memfd memory-hotplug mount mqueue net powerpc pstore ptrace seccomp size static_keys sysctl timers user vm x86 zram; do \
	make -C $TARGET clean; \
done;
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/breakpoints'
rm -fr breakpoint_test
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/breakpoints'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/cpu-hotplug'
make[2]: Nothing to be done for 'clean'.
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/cpu-hotplug'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/efivarfs'
rm -f open-unlink create-read
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/efivarfs'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/exec'
rm -rf execveat execveat.symlink execveat.denatured script subdir subdir.moved execveat.moved xxxxx*
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/exec'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/firmware'
make[2]: Nothing to be done for 'clean'.
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/firmware'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/ftrace'
rm -rf logs/*
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/ftrace'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/futex'
for DIR in functional; do make -C $DIR clean ; done
make[3]: Entering directory '/home/acme/git/linux/tools/testing/selftests/futex/functional'
rm -f futex_wait_timeout futex_wait_wouldblock futex_requeue_pi futex_requeue_pi_signal_restart futex_requeue_pi_mismatched_ops futex_wait_uninitialized_heap futex_wait_private_mapped_file
make[3]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/futex/functional'
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/futex'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/kcmp'
rm -f kcmp_test kcmp-test-file
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/kcmp'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/lib'
make[2]: Nothing to be done for 'clean'.
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/lib'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/membarrier'
rm -f membarrier_test
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/membarrier'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/memfd'
rm -f memfd_test fuse_test
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/memfd'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/memory-hotplug'
make[2]: Nothing to be done for 'clean'.
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/memory-hotplug'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/mount'
rm -f unprivileged-remount-test
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/mount'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/mqueue'
rm -f mq_open_tests mq_perf_tests
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/mqueue'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/net'
rm -f socket psock_fanout psock_tpacket reuseport_bpf
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/net'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/powerpc'
rm -f tags
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/powerpc'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/pstore'
rm -rf logs/* *uuid
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/pstore'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/ptrace'
rm -f peeksiginfo
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/ptrace'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/seccomp'
rm -f seccomp_bpf
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/seccomp'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/size'
rm -f get_size
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/size'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/static_keys'
make[2]: Nothing to be done for 'clean'.
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/static_keys'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/sysctl'
make[2]: Nothing to be done for 'clean'.
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/sysctl'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/timers'
rm -f posix_timers nanosleep nsleep-lat set-timer-lat mqueue-lat inconsistency-check raw_skew threadtest rtctest alarmtimer-suspend valid-adjtimex adjtick change_skew skew_consistency clocksource-switch leap-a-day leapcrash set-tai set-2038
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/timers'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/user'
make[2]: Nothing to be done for 'clean'.
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/user'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/vm'
rm -f compaction_test hugepage-mmap hugepage-shm map_hugetlb mlock2-tests on-fault-limit thuge-gen transhuge-stress userfaultfd
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/vm'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/x86'
rm -f single_step_syscall_32 sysret_ss_attrs_32 syscall_nt_32 ptrace_syscall_32 entry_from_vm86_32 syscall_arg_fault_32 sigreturn_32 test_syscall_vdso_32 unwind_vdso_32 test_FCMOV_32 test_FCOMI_32 test_FISTTP_32 ldt_gdt_32 vdso_restorer_32 single_step_syscall_64 sysret_ss_attrs_64 syscall_nt_64 ptrace_syscall_64
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/x86'
make[2]: Entering directory '/home/acme/git/linux/tools/testing/selftests/zram'
rm -f err.log
make[2]: Leaving directory '/home/acme/git/linux/tools/testing/selftests/zram'
make[1]: Leaving directory '/home/acme/git/linux/tools/testing/selftests'
  DESCEND  power/x86/turbostat
make[1]: Entering directory '/home/acme/git/linux/tools/power/x86/turbostat'
make[1]: Leaving directory '/home/acme/git/linux/tools/power/x86/turbostat'
  DESCEND  spi
make[1]: Entering directory '/home/acme/git/linux/tools/spi'
rm -f spidev_test spidev_fdx
make[1]: Leaving directory '/home/acme/git/linux/tools/spi'
  DESCEND  usb
make[1]: Entering directory '/home/acme/git/linux/tools/usb'
rm -f testusb ffs-test
make[1]: Leaving directory '/home/acme/git/linux/tools/usb'
  DESCEND  virtio
make[1]: Entering directory '/home/acme/git/linux/tools/virtio'
rm -f *.o vringh_test virtio_test vhost_test/*.o vhost_test/.*.cmd \
              vhost_test/Module.symvers vhost_test/modules.order *.d
make[1]: Leaving directory '/home/acme/git/linux/tools/virtio'
  DESCEND  vm
make[1]: Entering directory '/home/acme/git/linux/tools/vm'
rm -f page-types slabinfo page_owner_sort
make -C ../lib/api clean
make[2]: Entering directory '/home/acme/git/linux/tools/lib/api'
  CLEAN    libapi
make[2]: Leaving directory '/home/acme/git/linux/tools/lib/api'
make[1]: Leaving directory '/home/acme/git/linux/tools/vm'
  DESCEND  net
make[1]: Entering directory '/home/acme/git/linux/tools/net'
rm -rf *.o bpf_jit_disasm bpf_dbg bpf_asm bpf_exp.yacc.* bpf_exp.lex.*
make[1]: Leaving directory '/home/acme/git/linux/tools/net'
  DESCEND  iio
make[1]: Entering directory '/home/acme/git/linux/tools/iio'
rm -f *.o iio_event_monitor lsiio generic_buffer
make[1]: Leaving directory '/home/acme/git/linux/tools/iio'
  DESCEND  power/x86/x86_energy_perf_policy
make[1]: Entering directory '/home/acme/git/linux/tools/power/x86/x86_energy_perf_policy'
rm -f x86_energy_perf_policy
make[1]: Leaving directory '/home/acme/git/linux/tools/power/x86/x86_energy_perf_policy'
  DESCEND  thermal/tmon
make[1]: Entering directory '/home/acme/git/linux/tools/thermal/tmon'
find . -name "*.o" | xargs rm -f
rm -f tmon
make[1]: Leaving directory '/home/acme/git/linux/tools/thermal/tmon'
  DESCEND  laptop/freefall
make[1]: Entering directory '/home/acme/git/linux/tools/laptop/freefall'
rm -f freefall
make[1]: Leaving directory '/home/acme/git/linux/tools/laptop/freefall'
  DESCEND  build
make[1]: Entering directory '/home/acme/git/linux/tools/build'
  CLEAN    fixdep
make[1]: Leaving directory '/home/acme/git/linux/tools/build'
  DESCEND  lib/bpf
make[1]: Entering directory '/home/acme/git/linux/tools/lib/bpf'
  CLEAN    libbpf
  CLEAN    core-gen
make[1]: Leaving directory '/home/acme/git/linux/tools/lib/bpf'
  DESCEND  lib/subcmd
make[1]: Entering directory '/home/acme/git/linux/tools/lib/subcmd'
  CLEAN    libsubcmd
make[1]: Leaving directory '/home/acme/git/linux/tools/lib/subcmd'
  DESCEND  lib/lockdep
make[1]: Entering directory '/home/acme/git/linux/tools/lib/lockdep'
rm -f *.o *~ liblockdep.a liblockdep.so.4.4.0 *.a *liblockdep*.so*  .*.d .*.cmd
rm -f tags TAGS
make[1]: Leaving directory '/home/acme/git/linux/tools/lib/lockdep'
make: Leaving directory '/home/acme/git/linux/tools'
[acme@jouet linux]$ git status
On branch perf/core
Untracked files:
  (use "git add <file>..." to include in what will be committed)

	perf.data
	perf.data.old
	tools/perf/BUILD_TEST_FEATURE_DUMP
	tools/perf/make_no_libbpf
	tools/perf/make_no_newt

nothing added to commit but untracked files present (use "git add" to track)
[acme@jouet linux]$ rm -f tools/perf/BUILD_TEST_FEATURE_DUMP tools/perf/make_no_libbpf tools/perf/make_no_newt
[acme@jouet linux]$ perf stat make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
Testing Makefile
- /home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP: cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP  feature-dump
cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
- make_doc: cd . && make -f Makefile   DESTDIR=/tmp/tmp.lLyAWJ2KUJ doc FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
- make_no_libperl: cd . && make -f Makefile   DESTDIR=/tmp/tmp.iPREXpyGhh NO_LIBPERL=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
cd . && make -f Makefile DESTDIR=/tmp/tmp.iPREXpyGhh NO_LIBPERL=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
  BUILD:   Doing 'make -j4' parallel build
  GEN      common-cmds.h
  CC       fixdep.o
  CC       perf-read-vdso32
In file included from /usr/include/features.h:389:0,
                 from /usr/include/stdio.h:27,
                 from perf-read-vdso.c:1:
/usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such file or directory
compilation terminated.
Makefile.perf:416: recipe for target 'perf-read-vdso32' failed
make[4]: *** [perf-read-vdso32] Error 1
make[4]: *** Waiting for unfinished jobs....
  LD       fixdep-in.o
  LINK     fixdep
  PERF_VERSION = 4.4.g80fcfd7
Makefile:68: recipe for target 'all' failed
make[3]: *** [all] Error 2
  test: test -x ./perf
tests/make:274: recipe for target 'make_no_libperl' failed
make[2]: *** [make_no_libperl] Error 1
tests/make:7: recipe for target 'all' failed
make[1]: *** [all] Error 2
Makefile:81: recipe for target 'build-test' failed
make: *** [build-test] Error 2
make: Leaving directory '/home/acme/git/linux/tools/perf'

 Performance counter stats for 'make -C tools/perf build-test':

      61660.694764      task-clock (msec)         #    3.494 CPUs utilized          
            13,836      context-switches          #    0.224 K/sec                  
             3,707      cpu-migrations            #    0.060 K/sec                  
         1,151,896      page-faults               #    0.019 M/sec                  
   190,413,042,688      cycles                    #    3.088 GHz                    
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
   206,130,087,553      instructions              #    1.08  insns per cycle        
    48,604,926,191      branches                  #  788.264 M/sec                  
       603,468,422      branch-misses             #    1.24% of all branches        

      17.647845909 seconds time elapsed

[acme@jouet linux]$ 


* Re: [PATCH 05/54] perf build: Use feature dump file for build-test
  2016-01-26 16:59   ` Arnaldo Carvalho de Melo
@ 2016-01-27  2:36     ` Wangnan (F)
  2016-01-27 13:54       ` Arnaldo Carvalho de Melo
  2016-01-27 11:22     ` [PATCH] tools build: Check basic headers for test-compile feature checker Wang Nan
  1 sibling, 1 reply; 79+ messages in thread
From: Wangnan (F) @ 2016-01-27  2:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alexei Starovoitov, Brendan Gregg, Daniel Borkmann,
	David S. Miller, He Kuang, Jiri Olsa, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Will Deacon,
	linux-kernel



On 2016/1/27 0:59, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 25, 2016 at 09:55:52AM +0000, Wang Nan escreveu:
>> To prevent feature check run too many times, this patch utilizes
>> previous introduced feature-dump make target and FEATURES_DUMP
>> variable, makes sure the feature checkers run only once when doing
>> build-test for normal test cases.
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
> So, I'm having this problem when this patch is applied.

[SNIP]

>
> nothing added to commit but untracked files present (use "git add" to track)
> [acme@jouet linux]$ rm -f tools/perf/BUILD_TEST_FEATURE_DUMP tools/perf/make_no_libbpf tools/perf/make_no_newt
> [acme@jouet linux]$ perf stat make -C tools/perf build-test
> make: Entering directory '/home/acme/git/linux/tools/perf'
> Testing Makefile
> - /home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP: cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP  feature-dump
> cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
> - make_doc: cd . && make -f Makefile   DESTDIR=/tmp/tmp.lLyAWJ2KUJ doc FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
> - make_no_libperl: cd . && make -f Makefile   DESTDIR=/tmp/tmp.iPREXpyGhh NO_LIBPERL=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
> cd . && make -f Makefile DESTDIR=/tmp/tmp.iPREXpyGhh NO_LIBPERL=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
>    BUILD:   Doing 'make -j4' parallel build
>    GEN      common-cmds.h
>    CC       fixdep.o
>    CC       perf-read-vdso32
> In file included from /usr/include/features.h:389:0,
>                   from /usr/include/stdio.h:27,
>                   from perf-read-vdso.c:1:
> /usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such file or directory
> compilation terminated.
> Makefile.perf:416: recipe for target 'perf-read-vdso32' failed
> make[4]: *** [perf-read-vdso32] Error 1
> make[4]: *** Waiting for unfinished jobs....
>    LD       fixdep-in.o
>    LINK     fixdep
>    PERF_VERSION = 4.4.g80fcfd7
> Makefile:68: recipe for target 'all' failed
> make[3]: *** [all] Error 2
>    test: test -x ./perf
> tests/make:274: recipe for target 'make_no_libperl' failed
> make[2]: *** [make_no_libperl] Error 1
> tests/make:7: recipe for target 'all' failed
> make[1]: *** [all] Error 2
> Makefile:81: recipe for target 'build-test' failed
> make: *** [build-test] Error 2
> make: Leaving directory '/home/acme/git/linux/tools/perf'
>

This is a problem with test-compile-32. In ./tools/build/feature/test-compile.c
we check the '-m32' compiler flag but don't check include files.

Could you please have a look at your environment? Do you have glibc-devel-i386
installed? What's the result of

  $ gcc -m32 tools/build/feature/test-compile.c

I guess on your platform you can compile and link test-compile.c without
gnu/stubs-32.h. If so, we need to improve test-compile.c to make it check
headers as well.

Another question is why you didn't hit this error before this patch. It seems
test-compile-32 should have passed then too...

Thank you.


* [PATCH] tools build: Check basic headers for test-compile feature checker
  2016-01-26 16:59   ` Arnaldo Carvalho de Melo
  2016-01-27  2:36     ` Wangnan (F)
@ 2016-01-27 11:22     ` Wang Nan
  2016-01-27 13:23       ` Jiri Olsa
  2016-02-03 10:15       ` [tip:perf/core] " tip-bot for Wang Nan
  1 sibling, 2 replies; 79+ messages in thread
From: Wang Nan @ 2016-01-27 11:22 UTC (permalink / raw)
  To: acme; +Cc: linux-kernel, Wang Nan, Jiri Olsa, Li Zefan

An i386 binary can be linked correctly even without the correct headers,
which causes problems. For example:

 $ mv /tmp/oxygen_root/usr/include/gnu/stubs-32.h{,.bak}
 $ make tools/perf
 Auto-detecting system features:
 ...                         dwarf: [ on  ]
 [SNIP]
   GEN      common-cmds.h
   CC       perf-read-vdso32
 In file included from /tmp/oxygen_root/usr/include/features.h:388:0,
                  from /tmp/oxygen_root/usr/include/stdio.h:27,
                  from perf-read-vdso.c:1:
 /tmp/oxygen_root/usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such file or directory
  # include <gnu/stubs-32.h>
                           ^
 compilation terminated.
 ...

This patch checks not only the compiler and linker but also basic
headers in the test-compile test case, making it fail on a platform
lacking the correct headers.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
---
 tools/build/feature/test-compile.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/build/feature/test-compile.c b/tools/build/feature/test-compile.c
index 31dbf45..c54e655 100644
--- a/tools/build/feature/test-compile.c
+++ b/tools/build/feature/test-compile.c
@@ -1,4 +1,6 @@
+#include <stdio.h>
 int main(void)
 {
+	printf("Hello World!\n");
 	return 0;
 }
-- 
1.8.3.4


* Re: [PATCH] tools build: Check basic headers for test-compile feature checker
  2016-01-27 11:22     ` [PATCH] tools build: Check basic headers for test-compile feature checker Wang Nan
@ 2016-01-27 13:23       ` Jiri Olsa
  2016-01-27 13:55         ` Arnaldo Carvalho de Melo
  2016-02-03 10:15       ` [tip:perf/core] " tip-bot for Wang Nan
  1 sibling, 1 reply; 79+ messages in thread
From: Jiri Olsa @ 2016-01-27 13:23 UTC (permalink / raw)
  To: Wang Nan; +Cc: acme, linux-kernel, Jiri Olsa, Li Zefan

On Wed, Jan 27, 2016 at 11:22:22AM +0000, Wang Nan wrote:
> An i386 binary can be linked correctly even without the correct headers,
> which causes problems. For example:
> 
>  $ mv /tmp/oxygen_root/usr/include/gnu/stubs-32.h{,.bak}
>  $ make tools/perf
>  Auto-detecting system features:
>  ...                         dwarf: [ on  ]
>  [SNIP]
>    GEN      common-cmds.h
>    CC       perf-read-vdso32
>  In file included from /tmp/oxygen_root/usr/include/features.h:388:0,
>                   from /tmp/oxygen_root/usr/include/stdio.h:27,
>                   from perf-read-vdso.c:1:
>  /tmp/oxygen_root/usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such file or directory
>   # include <gnu/stubs-32.h>
>                            ^
>  compilation terminated.
>  ...
> 
> This patch checks not only the compiler and linker but also basic
> headers in the test-compile test case, making it fail on a platform
> lacking the correct headers.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>

nice ;-)

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

> ---
>  tools/build/feature/test-compile.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/build/feature/test-compile.c b/tools/build/feature/test-compile.c
> index 31dbf45..c54e655 100644
> --- a/tools/build/feature/test-compile.c
> +++ b/tools/build/feature/test-compile.c
> @@ -1,4 +1,6 @@
> +#include <stdio.h>
>  int main(void)
>  {
> +	printf("Hello World!\n");
>  	return 0;
>  }
> -- 
> 1.8.3.4
> 


* Re: [PATCH 05/54] perf build: Use feature dump file for build-test
  2016-01-27  2:36     ` Wangnan (F)
@ 2016-01-27 13:54       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 79+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-27 13:54 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Alexei Starovoitov, Brendan Gregg, Daniel Borkmann,
	David S. Miller, He Kuang, Jiri Olsa, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Will Deacon,
	linux-kernel

Em Wed, Jan 27, 2016 at 10:36:54AM +0800, Wangnan (F) escreveu:
> On 2016/1/27 0:59, Arnaldo Carvalho de Melo wrote:
> >Em Mon, Jan 25, 2016 at 09:55:52AM +0000, Wang Nan escreveu:
> >>To prevent feature check run too many times, this patch utilizes
> >>previous introduced feature-dump make target and FEATURES_DUMP
> >>variable, makes sure the feature checkers run only once when doing
> >>build-test for normal test cases.

<SNIP>

> >So, I'm having this problem when this patch is applied.
 
> [SNIP]
> 
> >nothing added to commit but untracked files present (use "git add" to track)
> >[acme@jouet linux]$ rm -f tools/perf/BUILD_TEST_FEATURE_DUMP tools/perf/make_no_libbpf tools/perf/make_no_newt
> >[acme@jouet linux]$ perf stat make -C tools/perf build-test
> >make: Entering directory '/home/acme/git/linux/tools/perf'
> >Testing Makefile
> >- /home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP: cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP  feature-dump
> >cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
> >- make_doc: cd . && make -f Makefile   DESTDIR=/tmp/tmp.lLyAWJ2KUJ doc FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
> >- make_no_libperl: cd . && make -f Makefile   DESTDIR=/tmp/tmp.iPREXpyGhh NO_LIBPERL=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
> >cd . && make -f Makefile DESTDIR=/tmp/tmp.iPREXpyGhh NO_LIBPERL=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
> >   BUILD:   Doing 'make -j4' parallel build
> >   GEN      common-cmds.h
> >   CC       fixdep.o
> >   CC       perf-read-vdso32
> >In file included from /usr/include/features.h:389:0,
> >                  from /usr/include/stdio.h:27,
> >                  from perf-read-vdso.c:1:
> >/usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such file or directory
> >compilation terminated.
> >Makefile.perf:416: recipe for target 'perf-read-vdso32' failed
> >make[4]: *** [perf-read-vdso32] Error 1
> >make[4]: *** Waiting for unfinished jobs....
> >   LD       fixdep-in.o
> >   LINK     fixdep
> >   PERF_VERSION = 4.4.g80fcfd7
> >Makefile:68: recipe for target 'all' failed
> >make[3]: *** [all] Error 2
> >   test: test -x ./perf
> >tests/make:274: recipe for target 'make_no_libperl' failed
> >make[2]: *** [make_no_libperl] Error 1
> >tests/make:7: recipe for target 'all' failed
> >make[1]: *** [all] Error 2
> >Makefile:81: recipe for target 'build-test' failed
> >make: *** [build-test] Error 2
> >make: Leaving directory '/home/acme/git/linux/tools/perf'
 
> This is the problem of test-compile-32. In
> ./tools/build/feature/test-compile.c, we check the '-m32' compiler
> flag but don't check include files.
 
> Could you please have a look at your environment? Do you have
> glibc-devel-i386 installed? What's the result of
 
>  $ gcc -m32 tools/build/feature/test-compile.c

[acme@jouet linux]$ gcc -m32 tools/build/feature/test-compile.c 
/usr/bin/ld: cannot find crt1.o: No such file or directory
/usr/bin/ld: cannot find crti.o: No such file or directory
/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/5.3.1/libgcc_s.so when searching for -lgcc_s
/usr/bin/ld: cannot find -lgcc_s
/usr/bin/ld: cannot find -lc
/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/5.3.1/libgcc_s.so when searching for -lgcc_s
/usr/bin/ld: cannot find -lgcc_s
/usr/bin/ld: cannot find crtn.o: No such file or directory
collect2: error: ld returned 1 exit status
[acme@jouet linux]$ 

But this fails, in the same fashion, on my older devel machine, with fedora 21.
 
> I guess in your platform you can compile and link test-compile.c without
> gnu/stubs-32.h. Then we need to improve test-compile.c to make it check
> headers also.

Yeah, on the old system (fedora 21 x86_64 ivy bridge) there is no stubs-32.h
file anywhere, nor on the new one (fedora 23 x86_64 broadwell).
 
> Another question is why you don't meet this error before this patch.  It
> seems test-compile-32 should also pass...

What happened is that I got a new notebook, and in the new one, as you
correctly analysed, probably a file required for building that file is not
present, I'll dig deeper and try to add an informative message about
requirements to the test in question...

But the curious thing is that what I have now in my perf/core branch
passes 'make -C tools/perf build-test'; it's only when this specific
patch is added that it fails.

I.e. it seems that by reusing the features dump file we're triggering some
conditional compilation differently than without this reuse. I wonder if we
could try first having a mechanism that would do as before, i.e. the feature
detection, and then compare with what is passed via the FEATURES_DUMP= make
command line variable, flagging any difference, that would help us figure out
this problem...

Anyway, I'm running:

  make -C tools clean ; perf stat make -C tools/perf build-test

To recheck that it is just when this patch gets applied that the build fails.

- Arnaldo
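
The comparison suggested above (run feature detection as before, then flag
anything that differs from what FEATURES_DUMP= supplies) could be sketched
roughly as below. This is only an illustration under assumptions: the
helper names has_line() and feature_diffs() are hypothetical, and the
"name=value" one-per-line layout of the dump is assumed here, not taken
from this thread.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `line` (of length len) appears as a whole line in `text`. */
static int has_line(const char *text, const char *line, size_t len)
{
	const char *p = text;

	while (*p) {
		const char *end = strchr(p, '\n');
		size_t n = end ? (size_t)(end - p) : strlen(p);

		if (n == len && memcmp(p, line, len) == 0)
			return 1;
		if (!end)
			break;
		p = end + 1;
	}
	return 0;
}

/*
 * Hypothetical checker: count the lines of the freshly detected feature
 * list that do not appear verbatim in the reused dump, reporting each.
 */
static int feature_diffs(const char *detected, const char *dumped)
{
	const char *p = detected;
	int diffs = 0;

	while (*p) {
		const char *end = strchr(p, '\n');
		size_t n = end ? (size_t)(end - p) : strlen(p);

		/* Skip empty lines; report anything the dump disagrees on. */
		if (n && !has_line(dumped, p, n)) {
			fprintf(stderr, "feature mismatch: %.*s\n", (int)n, p);
			diffs++;
		}
		if (!end)
			break;
		p = end + 1;
	}
	return diffs;
}
```

A non-zero return would mean the reused dump no longer matches what
detection finds for this particular build configuration.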


* Re: [PATCH] tools build: Check basic headers for test-compile feature checker
  2016-01-27 13:23       ` Jiri Olsa
@ 2016-01-27 13:55         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 79+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-27 13:55 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Wang Nan, linux-kernel, Jiri Olsa, Li Zefan

Em Wed, Jan 27, 2016 at 02:23:59PM +0100, Jiri Olsa escreveu:
> On Wed, Jan 27, 2016 at 11:22:22AM +0000, Wang Nan wrote:
> > An i386 binary can be linked correctly even without the correct headers,
> > which causes problems. For example:
> > 
> >  $ mv /tmp/oxygen_root/usr/include/gnu/stubs-32.h{,.bak}
> >  $ make tools/perf
> >  Auto-detecting system features:
> >  ...                         dwarf: [ on  ]
> >  [SNIP]
> >    GEN      common-cmds.h
> >    CC       perf-read-vdso32
> >  In file included from /tmp/oxygen_root/usr/include/features.h:388:0,
> >                   from /tmp/oxygen_root/usr/include/stdio.h:27,
> >                   from perf-read-vdso.c:1:
> >  /tmp/oxygen_root/usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such file or directory
> >   # include <gnu/stubs-32.h>
> >                            ^
> >  compilation terminated.
> >  ...
> > 
> > This patch checks not only the compiler and linker but also basic
> > headers in the test-compile test case, making it fail on a platform
> > lacking the correct headers.
> > 
> > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Li Zefan <lizefan@huawei.com>
> 
> nice ;-)

OK, so this one may explain that problem when reusing the features dump
file; trying to apply this one and then the other...
 
> Acked-by: Jiri Olsa <jolsa@kernel.org>
> 
> thanks,
> jirka
> 
> > ---
> >  tools/build/feature/test-compile.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/tools/build/feature/test-compile.c b/tools/build/feature/test-compile.c
> > index 31dbf45..c54e655 100644
> > --- a/tools/build/feature/test-compile.c
> > +++ b/tools/build/feature/test-compile.c
> > @@ -1,4 +1,6 @@
> > +#include <stdio.h>
> >  int main(void)
> >  {
> > +	printf("Hello World!\n");
> >  	return 0;
> >  }
> > -- 
> > 1.8.3.4
> > 


* Re: [PATCH 28/54] perf record: Extract synthesize code to record__synthesize()
  2016-01-25  9:56 ` [PATCH 28/54] perf record: Extract synthesize code to record__synthesize() Wang Nan
@ 2016-01-29 20:37   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 79+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-29 20:37 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Brendan Gregg, Daniel Borkmann,
	David S. Miller, He Kuang, Jiri Olsa, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Will Deacon,
	linux-kernel

Em Mon, Jan 25, 2016 at 09:56:15AM +0000, Wang Nan escreveu:
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>  
> +static int record__synthesize(struct record *rec)
> +{
> +	struct perf_session *session = rec->session;
> +	struct machine *machine = &session->machines.host;
> +	struct perf_data_file *file = &rec->file;
> +	struct record_opts *opts = &rec->opts;
> +	struct perf_tool *tool = &rec->tool;
> +	int fd = perf_data_file__fd(file);
> +	int err = 0;
> +	static bool warned_kmaps = false, warned_modules = false;

snip

> +	err = perf_event__synthesize_kernel_mmap(tool, process_synthesized_event,
> +						 machine);
> +	if (err < 0 && !warned_kmaps) {

Please use WARN_ONCE; there are lots of examples in tools/perf and in
the kernel proper, from where this idiom was adopted. This way these
static variables will be auto-created.

- Arnaldo

> +		warned_kmaps = true;
> +		pr_err("Couldn't record kernel reference relocation symbol\n"
> +		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
> +		       "Check /proc/kallsyms permission or run as root.\n");
> +	}
> +
> +	err = perf_event__synthesize_modules(tool, process_synthesized_event,
> +					     machine);
> +	if (err < 0 && !warned_modules) {
> +		warned_modules = true;
> +		pr_err("Couldn't record kernel module information.\n"
> +		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
> +		       "Check /proc/modules permission or run as root.\n");
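
For readers unfamiliar with the idiom requested above, here is a minimal
sketch of how a warn-once macro can hide the static flag that the patch
declares by hand. WARN_ONCE_SKETCH, warn_count and synthesize_sketch() are
hypothetical names for illustration only; this is not the definition used
in tools/perf or the kernel, and it relies on GNU C statement expressions
(GCC or Clang).

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts emitted warnings so the behaviour can be observed from outside. */
static int warn_count;

/*
 * Hypothetical stand-in for WARN_ONCE (not the kernel's definition).
 * Every expansion of the macro gets its own hidden static flag, so the
 * caller no longer needs to declare warned_kmaps-style variables.
 * Evaluates to the truth value of the condition.
 */
#define WARN_ONCE_SKETCH(cond, ...) ({			\
	static bool warned_;				\
	bool ret_ = !!(cond);				\
	if (ret_ && !warned_) {				\
		warned_ = true;				\
		warn_count++;				\
		fprintf(stderr, __VA_ARGS__);		\
	}						\
	ret_;						\
})

/* Hypothetical caller: warns on the first failure only, returns err as is. */
static int synthesize_sketch(int err)
{
	WARN_ONCE_SKETCH(err < 0,
			 "Couldn't record kernel reference relocation symbol\n");
	return err;
}
```

Because the flag lives inside the macro expansion, two separate call sites
(kernel mmap and modules in the quoted patch) would each warn once
independently.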


* [tip:perf/core] perf test: Add libbpf relocation checker
  2016-01-25  9:55 ` [PATCH 01/54] perf test: Add libbpf relocation checker Wang Nan
  2016-01-26 14:58   ` Arnaldo Carvalho de Melo
@ 2016-02-03 10:13   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 79+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-03 10:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: masami.hiramatsu.pt, wangnan0, brendan.d.gregg, acme, peterz,
	will.deacon, lizefan, mingo, hekuang, daniel, hpa, ast, davem,
	jolsa, namhyung, linux-kernel, tglx

Commit-ID:  7b6982ce4b38ecc3f63be46beb7bd079aa290fd7
Gitweb:     http://git.kernel.org/tip/7b6982ce4b38ecc3f63be46beb7bd079aa290fd7
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 25 Jan 2016 09:55:48 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 26 Jan 2016 12:10:55 -0300

perf test: Add libbpf relocation checker

There's a bug in LLVM that makes it generate unneeded relocation
information. See [1] and [2]. Libbpf should check the target section of
a relocation symbol.

This patch adds a testcase which references a global variable (BPF
doesn't support global variables). Before fixing libbpf, the new test
case can be loaded into the kernel, with the global variable acting like
the first map, which is incorrect.

Result:

  # ~/perf test BPF
  37: Test BPF filter                                          :
  37.1: Test basic BPF filtering                               : Ok
  37.2: Test BPF prologue generation                           : Ok
  37.3: Test BPF relocation checker                            : FAILED!

  # ~/perf test -v BPF
  ...
  libbpf: loading object '[bpf_relocation_test]' from buffer
  libbpf: section .strtab, size 126, link 0, flags 0, type=3
  libbpf: section .text, size 0, link 0, flags 6, type=1
  libbpf: section .data, size 0, link 0, flags 3, type=1
  libbpf: section .bss, size 0, link 0, flags 3, type=8
  libbpf: section func=sys_write, size 104, link 0, flags 6, type=1
  libbpf: found program func=sys_write
  libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9
  libbpf: section maps, size 16, link 0, flags 3, type=1
  libbpf: maps in [bpf_relocation_test]: 16 bytes
  libbpf: section license, size 4, link 0, flags 3, type=1
  libbpf: license of [bpf_relocation_test] is GPL
  libbpf: section version, size 4, link 0, flags 3, type=1
  libbpf: kernel version of [bpf_relocation_test] is 40400
  libbpf: section .symtab, size 144, link 1, flags 0, type=2
  libbpf: map 0 is "my_table"
  libbpf: collecting relocating info for: 'func=sys_write'
  libbpf: relocation: insn_idx=7
  Success unexpectedly: libbpf error when dealing with relocation
  test child finished with -1
  ---- end ----
  Test BPF filter subtest 2: FAILED!

[1] https://llvm.org/bugs/show_bug.cgi?id=26243
[2] https://patchwork.ozlabs.org/patch/571385/

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.perf                           |  2 +-
 tools/perf/tests/.gitignore                        |  1 +
 tools/perf/tests/Build                             |  9 ++++++-
 ...ript-example.c => bpf-script-test-relocation.c} | 30 ++++++++++++----------
 tools/perf/tests/bpf.c                             | 26 +++++++++++++++----
 tools/perf/tests/llvm.c                            | 17 +++++++++---
 tools/perf/tests/llvm.h                            |  5 +++-
 7 files changed, 64 insertions(+), 26 deletions(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 5d34815..97ce869 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -618,7 +618,7 @@ clean: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean
 	$(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32
 	$(call QUIET_CLEAN, core-gen)   $(RM)  *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex* \
 		$(OUTPUT)util/intel-pt-decoder/inat-tables.c $(OUTPUT)fixdep \
-		$(OUTPUT)tests/llvm-src-{base,kbuild,prologue}.c
+		$(OUTPUT)tests/llvm-src-{base,kbuild,prologue,relocation}.c
 	$(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
 	$(python-clean)
 
diff --git a/tools/perf/tests/.gitignore b/tools/perf/tests/.gitignore
index bf016c4..8cc30e7 100644
--- a/tools/perf/tests/.gitignore
+++ b/tools/perf/tests/.gitignore
@@ -1,3 +1,4 @@
 llvm-src-base.c
 llvm-src-kbuild.c
 llvm-src-prologue.c
+llvm-src-relocation.c
diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 614899b..1ba628e 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -31,7 +31,7 @@ perf-y += sample-parsing.o
 perf-y += parse-no-sample-id-all.o
 perf-y += kmod-path.o
 perf-y += thread-map.o
-perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o
+perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o llvm-src-relocation.o
 perf-y += bpf.o
 perf-y += topology.o
 perf-y += cpumap.o
@@ -59,6 +59,13 @@ $(OUTPUT)tests/llvm-src-prologue.c: tests/bpf-script-test-prologue.c tests/Build
 	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
 	$(Q)echo ';' >> $@
 
+$(OUTPUT)tests/llvm-src-relocation.c: tests/bpf-script-test-relocation.c tests/Build
+	$(call rule_mkdir)
+	$(Q)echo '#include <tests/llvm.h>' > $@
+	$(Q)echo 'const char test_llvm__bpf_test_relocation[] =' >> $@
+	$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
+	$(Q)echo ';' >> $@
+
 ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64))
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 endif
diff --git a/tools/perf/tests/bpf-script-example.c b/tools/perf/tests/bpf-script-test-relocation.c
similarity index 69%
copy from tools/perf/tests/bpf-script-example.c
copy to tools/perf/tests/bpf-script-test-relocation.c
index 0ec9c2c..93af774 100644
--- a/tools/perf/tests/bpf-script-example.c
+++ b/tools/perf/tests/bpf-script-test-relocation.c
@@ -1,6 +1,6 @@
 /*
- * bpf-script-example.c
- * Test basic LLVM building
+ * bpf-script-test-relocation.c
+ * Test BPF loader checking relocation
  */
 #ifndef LINUX_VERSION_CODE
 # error Need LINUX_VERSION_CODE
@@ -24,25 +24,27 @@ struct bpf_map_def {
 };
 
 #define SEC(NAME) __attribute__((section(NAME), used))
-struct bpf_map_def SEC("maps") flip_table = {
+struct bpf_map_def SEC("maps") my_table = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(int),
 	.max_entries = 1,
 };
 
-SEC("func=sys_epoll_pwait")
-int bpf_func__sys_epoll_pwait(void *ctx)
+int this_is_a_global_val;
+
+SEC("func=sys_write")
+int bpf_func__sys_write(void *ctx)
 {
-	int ind =0;
-	int *flag = bpf_map_lookup_elem(&flip_table, &ind);
-	int new_flag;
-	if (!flag)
-		return 0;
-	/* flip flag and store back */
-	new_flag = !*flag;
-	bpf_map_update_elem(&flip_table, &ind, &new_flag, BPF_ANY);
-	return new_flag;
+	int key = 0;
+	int value = 0;
+
+	/*
+	 * Incorrect relocation. Should not allow this program be
+	 * loaded into kernel.
+	 */
+	bpf_map_update_elem(&this_is_a_global_val, &key, &value, 0);
+	return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 33689a0..952ca99 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -71,6 +71,15 @@ static struct {
 		(NR_ITERS + 1) / 4,
 	},
 #endif
+	{
+		LLVM_TESTCASE_BPF_RELOCATION,
+		"Test BPF relocation checker",
+		"[bpf_relocation_test]",
+		"fix 'perf test LLVM' first",
+		"libbpf error when dealing with relocation",
+		NULL,
+		0,
+	},
 };
 
 static int do_test(struct bpf_object *obj, int (*func)(void),
@@ -190,7 +199,7 @@ static int __test__bpf(int idx)
 
 	ret = test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz,
 				       bpf_testcase_table[idx].prog_id,
-				       true);
+				       true, NULL);
 	if (ret != TEST_OK || !obj_buf || !obj_buf_sz) {
 		pr_debug("Unable to get BPF object, %s\n",
 			 bpf_testcase_table[idx].msg_compile_fail);
@@ -202,14 +211,21 @@ static int __test__bpf(int idx)
 
 	obj = prepare_bpf(obj_buf, obj_buf_sz,
 			  bpf_testcase_table[idx].name);
-	if (!obj) {
+	if ((!!bpf_testcase_table[idx].target_func) != (!!obj)) {
+		if (!obj)
+			pr_debug("Fail to load BPF object: %s\n",
+				 bpf_testcase_table[idx].msg_load_fail);
+		else
+			pr_debug("Success unexpectedly: %s\n",
+				 bpf_testcase_table[idx].msg_load_fail);
 		ret = TEST_FAIL;
 		goto out;
 	}
 
-	ret = do_test(obj,
-		      bpf_testcase_table[idx].target_func,
-		      bpf_testcase_table[idx].expect_result);
+	if (obj)
+		ret = do_test(obj,
+			      bpf_testcase_table[idx].target_func,
+			      bpf_testcase_table[idx].expect_result);
 out:
 	bpf__clear();
 	return ret;
diff --git a/tools/perf/tests/llvm.c b/tools/perf/tests/llvm.c
index 06f45c1..70edcdf 100644
--- a/tools/perf/tests/llvm.c
+++ b/tools/perf/tests/llvm.c
@@ -35,6 +35,7 @@ static int test__bpf_parsing(void *obj_buf __maybe_unused,
 static struct {
 	const char *source;
 	const char *desc;
+	bool should_load_fail;
 } bpf_source_table[__LLVM_TESTCASE_MAX] = {
 	[LLVM_TESTCASE_BASE] = {
 		.source = test_llvm__bpf_base_prog,
@@ -48,14 +49,19 @@ static struct {
 		.source = test_llvm__bpf_test_prologue_prog,
 		.desc = "Compile source for BPF prologue generation test",
 	},
+	[LLVM_TESTCASE_BPF_RELOCATION] = {
+		.source = test_llvm__bpf_test_relocation,
+		.desc = "Compile source for BPF relocation test",
+		.should_load_fail = true,
+	},
 };
 
-
 int
 test_llvm__fetch_bpf_obj(void **p_obj_buf,
 			 size_t *p_obj_buf_sz,
 			 enum test_llvm__testcase idx,
-			 bool force)
+			 bool force,
+			 bool *should_load_fail)
 {
 	const char *source;
 	const char *desc;
@@ -68,6 +74,8 @@ test_llvm__fetch_bpf_obj(void **p_obj_buf,
 
 	source = bpf_source_table[idx].source;
 	desc = bpf_source_table[idx].desc;
+	if (should_load_fail)
+		*should_load_fail = bpf_source_table[idx].should_load_fail;
 
 	perf_config(perf_config_cb, NULL);
 
@@ -136,14 +144,15 @@ int test__llvm(int subtest)
 	int ret;
 	void *obj_buf = NULL;
 	size_t obj_buf_sz = 0;
+	bool should_load_fail = false;
 
 	if ((subtest < 0) || (subtest >= __LLVM_TESTCASE_MAX))
 		return TEST_FAIL;
 
 	ret = test_llvm__fetch_bpf_obj(&obj_buf, &obj_buf_sz,
-				       subtest, false);
+				       subtest, false, &should_load_fail);
 
-	if (ret == TEST_OK) {
+	if (ret == TEST_OK && !should_load_fail) {
 		ret = test__bpf_parsing(obj_buf, obj_buf_sz);
 		if (ret != TEST_OK) {
 			pr_debug("Failed to parse test case '%s'\n",
diff --git a/tools/perf/tests/llvm.h b/tools/perf/tests/llvm.h
index 5150b4d..0eaa604 100644
--- a/tools/perf/tests/llvm.h
+++ b/tools/perf/tests/llvm.h
@@ -7,14 +7,17 @@
 extern const char test_llvm__bpf_base_prog[];
 extern const char test_llvm__bpf_test_kbuild_prog[];
 extern const char test_llvm__bpf_test_prologue_prog[];
+extern const char test_llvm__bpf_test_relocation[];
 
 enum test_llvm__testcase {
 	LLVM_TESTCASE_BASE,
 	LLVM_TESTCASE_KBUILD,
 	LLVM_TESTCASE_BPF_PROLOGUE,
+	LLVM_TESTCASE_BPF_RELOCATION,
 	__LLVM_TESTCASE_MAX,
 };
 
 int test_llvm__fetch_bpf_obj(void **p_obj_buf, size_t *p_obj_buf_sz,
-			     enum test_llvm__testcase index, bool force);
+			     enum test_llvm__testcase index, bool force,
+			     bool *should_load_fail);
 #endif


* [tip:perf/core] perf bpf: Check relocation target section
  2016-01-25  9:55 ` [PATCH 02/54] perf bpf: Check relocation target section Wang Nan
@ 2016-02-03 10:14   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 79+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-03 10:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: masami.hiramatsu.pt, will.deacon, peterz, brendan.d.gregg, jolsa,
	lizefan, davem, namhyung, wangnan0, linux-kernel, daniel, acme,
	mingo, hpa, tglx, hekuang, ast

Commit-ID:  666810e86a3b7531cce892fbeda3b2f2322e1d72
Gitweb:     http://git.kernel.org/tip/666810e86a3b7531cce892fbeda3b2f2322e1d72
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 25 Jan 2016 09:55:49 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 26 Jan 2016 12:11:01 -0300

perf bpf: Check relocation target section

Libbpf should check the target section before doing relocation to ensure
the relocation is correct. Without this check, a bug in LLVM causes an
error; see [1]. Also, if an incorrect BPF script uses both a global
variable and a map, the global variable would be treated as a map and be
relocated without error.

This patch saves the index of the maps section into obj->efile and
compares the target section of each relocation symbol against it during
relocation.
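
The check described above can be sketched in isolation. This is only an
illustration under stated assumptions, not the libbpf code: struct
sketch_sym, sketch_check_relocs() and SKETCH_SHN_COMMON are hypothetical
names, and the symbol table is reduced to the one field the check needs.
The value 65522 that appears in the log further down is 0xfff2, which
the ELF specification reserves as SHN_COMMON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the st_shndx field of GElf_Sym. */
struct sketch_sym {
	uint16_t st_shndx;	/* index of the section the symbol lives in */
};

/* 0xfff2 (65522) is SHN_COMMON in the ELF specification. */
#define SKETCH_SHN_COMMON 0xfff2

/*
 * Return 0 if every relocation symbol lives in the maps section,
 * or -1 at the first symbol that points anywhere else.
 */
int sketch_check_relocs(const struct sketch_sym *syms, size_t nr_syms,
			int maps_shndx)
{
	size_t i;

	assert(syms != NULL || nr_syms == 0);

	/* maps_shndx stays -1 when the object has no "maps" section. */
	if (maps_shndx < 0)
		return nr_syms ? -1 : 0;

	for (i = 0; i < nr_syms; i++) {
		/* A symbol outside the maps section cannot be a map. */
		if (syms[i].st_shndx != maps_shndx)
			return -1;
	}
	return 0;
}
```

With a maps section at index 7, two symbols with st_shndx 7 pass, while
a symbol with st_shndx SKETCH_SHN_COMMON is rejected, which is the
situation the log below reports.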

The previous patch introduces a test case for this problem.  After this
patch:

  # ~/perf test BPF
  37: Test BPF filter                                          :
  37.1: Test basic BPF filtering                               : Ok
  37.2: Test BPF prologue generation                           : Ok
  37.3: Test BPF relocation checker                            : Ok

  # perf test -v BPF
  ...
  37.3: Test BPF relocation checker                            :
  ...
  libbpf: loading object '[bpf_relocation_test]' from buffer
  libbpf: section .strtab, size 126, link 0, flags 0, type=3
  libbpf: section .text, size 0, link 0, flags 6, type=1
  libbpf: section .data, size 0, link 0, flags 3, type=1
  libbpf: section .bss, size 0, link 0, flags 3, type=8
  libbpf: section func=sys_write, size 104, link 0, flags 6, type=1
  libbpf: found program func=sys_write
  libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9
  libbpf: section maps, size 16, link 0, flags 3, type=1
  libbpf: maps in [bpf_relocation_test]: 16 bytes
  libbpf: section license, size 4, link 0, flags 3, type=1
  libbpf: license of [bpf_relocation_test] is GPL
  libbpf: section version, size 4, link 0, flags 3, type=1
  libbpf: kernel version of [bpf_relocation_test] is 40400
  libbpf: section .symtab, size 144, link 1, flags 0, type=2
  libbpf: map 0 is "my_table"
  libbpf: collecting relocating info for: 'func=sys_write'
  libbpf: Program 'func=sys_write' contains non-map related relo data pointing to section 65522
  bpf: failed to load buffer
  Compile BPF program failed.
  test child finished with 0
  ---- end ----
  Test BPF filter subtest 2: Ok

[1] https://llvm.org/bugs/show_bug.cgi?id=26243

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/bpf/libbpf.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 8334a5a..7e543c3 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -201,6 +201,7 @@ struct bpf_object {
 			Elf_Data *data;
 		} *reloc;
 		int nr_reloc;
+		int maps_shndx;
 	} efile;
 	/*
 	 * All loaded bpf_object is linked in a list, which is
@@ -350,6 +351,7 @@ static struct bpf_object *bpf_object__new(const char *path,
 	 */
 	obj->efile.obj_buf = obj_buf;
 	obj->efile.obj_buf_sz = obj_buf_sz;
+	obj->efile.maps_shndx = -1;
 
 	obj->loaded = false;
 
@@ -529,12 +531,12 @@ bpf_object__init_maps(struct bpf_object *obj, void *data,
 }
 
 static int
-bpf_object__init_maps_name(struct bpf_object *obj, int maps_shndx)
+bpf_object__init_maps_name(struct bpf_object *obj)
 {
 	int i;
 	Elf_Data *symbols = obj->efile.symbols;
 
-	if (!symbols || maps_shndx < 0)
+	if (!symbols || obj->efile.maps_shndx < 0)
 		return -EINVAL;
 
 	for (i = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) {
@@ -544,7 +546,7 @@ bpf_object__init_maps_name(struct bpf_object *obj, int maps_shndx)
 
 		if (!gelf_getsym(symbols, i, &sym))
 			continue;
-		if (sym.st_shndx != maps_shndx)
+		if (sym.st_shndx != obj->efile.maps_shndx)
 			continue;
 
 		map_name = elf_strptr(obj->efile.elf,
@@ -572,7 +574,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 	Elf *elf = obj->efile.elf;
 	GElf_Ehdr *ep = &obj->efile.ehdr;
 	Elf_Scn *scn = NULL;
-	int idx = 0, err = 0, maps_shndx = -1;
+	int idx = 0, err = 0;
 
 	/* Elf is corrupted/truncated, avoid calling elf_strptr. */
 	if (!elf_rawdata(elf_getscn(elf, ep->e_shstrndx), NULL)) {
@@ -625,7 +627,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 		else if (strcmp(name, "maps") == 0) {
 			err = bpf_object__init_maps(obj, data->d_buf,
 						    data->d_size);
-			maps_shndx = idx;
+			obj->efile.maps_shndx = idx;
 		} else if (sh.sh_type == SHT_SYMTAB) {
 			if (obj->efile.symbols) {
 				pr_warning("bpf: multiple SYMTAB in %s\n",
@@ -674,8 +676,8 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 		pr_warning("Corrupted ELF file: index of strtab invalid\n");
 		return LIBBPF_ERRNO__FORMAT;
 	}
-	if (maps_shndx >= 0)
-		err = bpf_object__init_maps_name(obj, maps_shndx);
+	if (obj->efile.maps_shndx >= 0)
+		err = bpf_object__init_maps_name(obj);
 out:
 	return err;
 }
@@ -697,7 +699,8 @@ bpf_object__find_prog_by_idx(struct bpf_object *obj, int idx)
 static int
 bpf_program__collect_reloc(struct bpf_program *prog,
 			   size_t nr_maps, GElf_Shdr *shdr,
-			   Elf_Data *data, Elf_Data *symbols)
+			   Elf_Data *data, Elf_Data *symbols,
+			   int maps_shndx)
 {
 	int i, nrels;
 
@@ -724,9 +727,6 @@ bpf_program__collect_reloc(struct bpf_program *prog,
 			return -LIBBPF_ERRNO__FORMAT;
 		}
 
-		insn_idx = rel.r_offset / sizeof(struct bpf_insn);
-		pr_debug("relocation: insn_idx=%u\n", insn_idx);
-
 		if (!gelf_getsym(symbols,
 				 GELF_R_SYM(rel.r_info),
 				 &sym)) {
@@ -735,6 +735,15 @@ bpf_program__collect_reloc(struct bpf_program *prog,
 			return -LIBBPF_ERRNO__FORMAT;
 		}
 
+		if (sym.st_shndx != maps_shndx) {
+			pr_warning("Program '%s' contains non-map related relo data pointing to section %u\n",
+				   prog->section_name, sym.st_shndx);
+			return -LIBBPF_ERRNO__RELOC;
+		}
+
+		insn_idx = rel.r_offset / sizeof(struct bpf_insn);
+		pr_debug("relocation: insn_idx=%u\n", insn_idx);
+
 		if (insns[insn_idx].code != (BPF_LD | BPF_IMM | BPF_DW)) {
 			pr_warning("bpf: relocation: invalid relo for insns[%d].code 0x%x\n",
 				   insn_idx, insns[insn_idx].code);
@@ -863,7 +872,8 @@ static int bpf_object__collect_reloc(struct bpf_object *obj)
 
 		err = bpf_program__collect_reloc(prog, nr_maps,
 						 shdr, data,
-						 obj->efile.symbols);
+						 obj->efile.symbols,
+						 obj->efile.maps_shndx);
 		if (err)
 			return err;
 	}


* [tip:perf/core] tools build: Allow subprojects select all feature checkers
  2016-01-25  9:55 ` [PATCH 03/54] tools build: Allow subprojects select all feature checkers Wang Nan
@ 2016-02-03 10:14   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 79+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-03 10:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: brendan.d.gregg, masami.hiramatsu.pt, ast, davem, jolsa, daniel,
	wangnan0, peterz, tglx, linux-kernel, acme, hekuang, hpa,
	lizefan, will.deacon, namhyung, mingo

Commit-ID:  9fd4186ac19a4c8182dffc9b15dd288b50f09f76
Gitweb:     http://git.kernel.org/tip/9fd4186ac19a4c8182dffc9b15dd288b50f09f76
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 25 Jan 2016 09:55:50 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 26 Jan 2016 12:12:48 -0300

tools build: Allow subprojects select all feature checkers

Put the feature checkers that are not in the original FEATURE_TESTS into
a new list, and allow a subproject to select all feature checkers by
setting FEATURE_TESTS to 'all'.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1453715801-7732-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/build/Makefile.feature | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 02db3cd..674c47d 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -27,7 +27,7 @@ endef
 #   the rule that uses them - an example for that is the 'bionic'
 #   feature check. ]
 #
-FEATURE_TESTS ?=			\
+FEATURE_TESTS_BASIC :=			\
 	backtrace			\
 	dwarf				\
 	fortify-source			\
@@ -56,6 +56,25 @@ FEATURE_TESTS ?=			\
 	get_cpuid			\
 	bpf
 
+# FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
+# of all feature tests
+FEATURE_TESTS_EXTRA :=			\
+	bionic				\
+	compile-32			\
+	compile-x32			\
+	cplus-demangle			\
+	hello				\
+	libbabeltrace			\
+	liberty				\
+	liberty-z			\
+	libunwind-debug-frame
+
+FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)
+
+ifeq ($(FEATURE_TESTS),all)
+  FEATURE_TESTS := $(FEATURE_TESTS_BASIC) $(FEATURE_TESTS_EXTRA)
+endif
+
 FEATURE_DISPLAY ?=			\
 	dwarf				\
 	glibc				\


* [tip:perf/core] perf build: Select all feature checkers for feature-dump
  2016-01-25  9:55 ` [PATCH 04/54] perf build: Select all feature checkers for feature-dump Wang Nan
@ 2016-02-03 10:14   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 79+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-03 10:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: namhyung, will.deacon, peterz, jolsa, acme, davem,
	masami.hiramatsu.pt, mingo, daniel, wangnan0, brendan.d.gregg,
	tglx, ast, hekuang, linux-kernel, lizefan, hpa

Commit-ID:  c053a1506faee399cbc2105f2131bb5a5d99eedd
Gitweb:     http://git.kernel.org/tip/c053a1506faee399cbc2105f2131bb5a5d99eedd
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 25 Jan 2016 09:55:51 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 26 Jan 2016 13:53:10 -0300

perf build: Select all feature checkers for feature-dump

Set FEATURE_TESTS to 'all' so all possible feature checkers are
executed. Without this setting the output feature dump file misses some
features, for example, liberty. Select all checkers so we won't get an
incomplete feature dump file.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1453715801-7732-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.perf | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 97ce869..0ef3d97 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -165,7 +165,16 @@ ifeq ($(filter-out $(NON_CONFIG_TARGETS),$(MAKECMDGOALS)),)
 endif
 endif
 
+# Set FEATURE_TESTS to 'all' so all possible feature checkers are executed.
+# Without this setting the output feature dump file misses some features, for
+# example, liberty. Select all checkers so we won't get an incomplete feature
+# dump file.
 ifeq ($(config),1)
+ifdef MAKECMDGOALS
+ifeq ($(filter feature-dump,$(MAKECMDGOALS)),feature-dump)
+FEATURE_TESTS := all
+endif
+endif
 include config/Makefile
 endif
 


* [tip:perf/core] tools build: Check basic headers for test-compile feature checker
  2016-01-27 11:22     ` [PATCH] tools build: Check basic headers for test-compile feature checker Wang Nan
  2016-01-27 13:23       ` Jiri Olsa
@ 2016-02-03 10:15       ` tip-bot for Wang Nan
  1 sibling, 0 replies; 79+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-03 10:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, hpa, mingo, lizefan, tglx, acme, wangnan0, linux-kernel

Commit-ID:  cf9162c290447cdf6fca7b64dd6e2200dc52f03b
Gitweb:     http://git.kernel.org/tip/cf9162c290447cdf6fca7b64dd6e2200dc52f03b
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Wed, 27 Jan 2016 11:22:22 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 27 Jan 2016 11:59:32 -0300

tools build: Check basic headers for test-compile feature checker

An i386 binary can be linked correctly even without the correct headers,
which causes problems. For example:

 $ mv /tmp/oxygen_root/usr/include/gnu/stubs-32.h{,.bak}
 $ make tools/perf
 Auto-detecting system features:
 ...                         dwarf: [ on  ]
 [SNIP]
   GEN      common-cmds.h
   CC       perf-read-vdso32
 In file included from /tmp/oxygen_root/usr/include/features.h:388:0,
                  from /tmp/oxygen_root/usr/include/stdio.h:27,
                  from perf-read-vdso.c:1:
 /tmp/oxygen_root/usr/include/gnu/stubs.h:7:27: fatal error: gnu/stubs-32.h: No such file or directory
  # include <gnu/stubs-32.h>
                           ^
 compilation terminated.
 ...

In this patch we check not only the compiler and linker, but also the
basic headers in the test-compile test case, making it fail on a
platform that lacks the correct headers.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1453893742-20603-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/build/feature/test-compile.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/build/feature/test-compile.c b/tools/build/feature/test-compile.c
index 31dbf45..c54e655 100644
--- a/tools/build/feature/test-compile.c
+++ b/tools/build/feature/test-compile.c
@@ -1,4 +1,6 @@
+#include <stdio.h>
 int main(void)
 {
+	printf("Hello World!\n");
 	return 0;
 }


* [tip:perf/core] perf test: Check environment before start real BPF test
  2016-01-25  9:55 ` [PATCH 06/54] perf test: Check environment before start real BPF test Wang Nan
@ 2016-02-03 10:18   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 79+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-03 10:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: lizefan, wangnan0, ast, hpa, peterz, masami.hiramatsu.pt,
	namhyung, linux-kernel, daniel, mingo, acme, brendan.d.gregg,
	jolsa, hekuang, tglx, will.deacon

Commit-ID:  6a7d550e8b2eeb380ab85d9bc53571123b98345b
Gitweb:     http://git.kernel.org/tip/6a7d550e8b2eeb380ab85d9bc53571123b98345b
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 25 Jan 2016 09:55:53 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 29 Jan 2016 17:25:43 -0300

perf test: Check environment before start real BPF test

Copying perf to a system with an old kernel results in:

  # perf test bpf
  37: Test BPF filter                                          :
  37.1: Test basic BPF filtering                               : FAILED!
  37.2: Test BPF prologue generation                           : Skip

However, when the kernel doesn't support a test case, the test should
return 'Skip'. 'FAILED!' should be reserved for cases where the kernel
supports a feature that then fails to work as advertised.

This patch checks the environment before running the real test case.
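
The Skip-versus-FAILED policy above can be sketched as a small wrapper.
This is a minimal illustration, not the perf code: the sketch_* names
are hypothetical, and the numeric values of the result codes are an
assumption that only mirrors the TEST_OK, TEST_FAIL and TEST_SKIP names
used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed result codes, mirroring the TEST_* names used in the patch. */
enum sketch_result {
	SKETCH_TEST_OK   = 0,
	SKETCH_TEST_FAIL = -1,
	SKETCH_TEST_SKIP = -2,
};

typedef int (*sketch_fn)(void);

/*
 * Run probe() first. A non-zero probe result means the environment
 * cannot run the test at all, so report Skip; only a test that
 * actually ran may report a failure.
 */
int sketch_run_with_probe(sketch_fn probe, sketch_fn test)
{
	assert(probe != NULL && test != NULL);

	if (probe())
		return SKETCH_TEST_SKIP;

	return test() == 0 ? SKETCH_TEST_OK : SKETCH_TEST_FAIL;
}

/* Hypothetical stand-ins used to exercise the helper. */
int sketch_probe_ok(void)      { return 0; }
int sketch_probe_missing(void) { return -1; }
int sketch_test_pass(void)     { return 0; }
int sketch_test_broken(void)   { return -1; }
```

In the patch the probe role is played by check_env(), which tries to
load a trivial two-instruction BPF program; here the probe is just a
function pointer so the policy can be read on its own.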

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/bpf.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 952ca99..4aed5cb 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -1,7 +1,11 @@
 #include <stdio.h>
 #include <sys/epoll.h>
+#include <util/util.h>
 #include <util/bpf-loader.h>
 #include <util/evlist.h>
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <bpf/bpf.h>
 #include "tests.h"
 #include "llvm.h"
 #include "debug.h"
@@ -243,6 +247,36 @@ const char *test__bpf_subtest_get_desc(int i)
 	return bpf_testcase_table[i].desc;
 }
 
+static int check_env(void)
+{
+	int err;
+	unsigned int kver_int;
+	char license[] = "GPL";
+
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+	};
+
+	err = fetch_kernel_version(&kver_int, NULL, 0);
+	if (err) {
+		pr_debug("Unable to get kernel version\n");
+		return err;
+	}
+
+	err = bpf_load_program(BPF_PROG_TYPE_KPROBE, insns,
+			       sizeof(insns) / sizeof(insns[0]),
+			       license, kver_int, NULL, 0);
+	if (err < 0) {
+		pr_err("Missing basic BPF support, skip this test: %s\n",
+		       strerror(errno));
+		return err;
+	}
+	close(err);
+
+	return 0;
+}
+
 int test__bpf(int i)
 {
 	int err;
@@ -255,6 +289,9 @@ int test__bpf(int i)
 		return TEST_SKIP;
 	}
 
+	if (check_env())
+		return TEST_SKIP;
+
 	err = __test__bpf(i);
 	return err;
 }


* [tip:perf/core] perf test: Improve bp_signal
  2016-01-25  9:55 ` [PATCH 08/54] perf test: Improve bp_signal Wang Nan
@ 2016-02-03 10:18   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 79+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-03 10:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: brendan.d.gregg, peterz, jolsa, lizefan, masami.hiramatsu.pt,
	ast, tglx, hpa, hekuang, will.deacon, acme, daniel, wangnan0,
	linux-kernel, namhyung, mingo

Commit-ID:  8fd34e1cce180eb0c726e7ed88f7b70c11c38e21
Gitweb:     http://git.kernel.org/tip/8fd34e1cce180eb0c726e7ed88f7b70c11c38e21
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 25 Jan 2016 09:55:55 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 29 Jan 2016 17:28:46 -0300

perf test: Improve bp_signal

Will Deacon [1] had some questions about patch [2]. This patch improves
test__bp_signal so we can test:

 1. A watchpoint and a breakpoint that fire on the same instruction
 2. Nested signals

Test result:

 On x86_64 and ARM64 (results are similar with patch [2] on ARM64):

  # ./perf test -v signal
  17: Test breakpoint overflow signal handler                  :
  --- start ---
  test child forked, pid 10213
  count1 1, count2 3, count3 2, overflow 3, overflows_2 3
  test child finished with 0
  ---- end ----
  Test breakpoint overflow signal handler: Ok

So at least the two cases Will had doubts about are handled correctly.
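
The nested-signal half of the test can be sketched without perf events.
This is an illustration under stated assumptions, not the test itself:
it uses raise() in place of hardware breakpoints, SIGUSR1 and SIGUSR2 in
place of the SIGIO and SIGUSR1 pair used by the patch, and all sketch_*
names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t outer_hits;
static volatile sig_atomic_t inner_hits;
static volatile sig_atomic_t inner_seen_by_outer;

struct sketch_counts {
	int outer;
	int inner;
	int inner_seen_by_outer;
};

/* Inner handler: plays the role of sig_handler_2 in the patch. */
static void sketch_inner(int signum)
{
	(void)signum;
	inner_hits++;
}

/* Outer handler: raising SIGUSR2 here nests a second handler inside it. */
static void sketch_outer(int signum)
{
	(void)signum;
	raise(SIGUSR2);
	/* raise() returns only after sketch_inner has run. */
	inner_seen_by_outer = inner_hits;
	outer_hits++;
}

/* Deliver SIGUSR1 'rounds' times and report how often each handler ran. */
int sketch_run_nested(int rounds, struct sketch_counts *out)
{
	struct sigaction sa;
	int i;

	assert(out != NULL && rounds >= 0);

	outer_hits = 0;
	inner_hits = 0;
	inner_seen_by_outer = 0;

	memset(&sa, 0, sizeof(sa));
	/* Empty mask: SIGUSR2 is not blocked while the outer handler runs. */
	sigemptyset(&sa.sa_mask);

	sa.sa_handler = sketch_outer;
	if (sigaction(SIGUSR1, &sa, NULL) < 0)
		return -1;

	sa.sa_handler = sketch_inner;
	if (sigaction(SIGUSR2, &sa, NULL) < 0)
		return -1;

	for (i = 0; i < rounds; i++)
		raise(SIGUSR1);

	out->outer = outer_hits;
	out->inner = inner_hits;
	out->inner_seen_by_outer = inner_seen_by_outer;
	return 0;
}
```

Calling sketch_run_nested(3, &counts) corresponds to the three SIGIO
deliveries in the test: each outer handler invocation is interrupted by
one inner invocation, so both counters advance together, in the same way
that overflows and overflows_2 both end at 3 in the output above.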

[1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
[2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-9-git-send-email-wangnan0@huawei.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/bp_signal.c | 140 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 118 insertions(+), 22 deletions(-)

diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index fb80c9e..1d1bb48 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -29,14 +29,59 @@
 
 static int fd1;
 static int fd2;
+static int fd3;
 static int overflows;
+static int overflows_2;
+
+volatile long the_var;
+
+
+/*
+ * Use ASM to ensure watchpoint and breakpoint can be triggered
+ * at one instruction.
+ */
+#if defined (__x86_64__)
+extern void __test_function(volatile long *ptr);
+asm (
+	".globl __test_function\n"
+	"__test_function:\n"
+	"incq (%rdi)\n"
+	"ret\n");
+#elif defined (__aarch64__)
+extern void __test_function(volatile long *ptr);
+asm (
+	".globl __test_function\n"
+	"__test_function:\n"
+	"str x30, [x0]\n"
+	"ret\n");
+
+#else
+static void __test_function(volatile long *ptr)
+{
+	*ptr = 0x1234;
+}
+#endif
 
 __attribute__ ((noinline))
 static int test_function(void)
 {
+	__test_function(&the_var);
+	the_var++;
 	return time(NULL);
 }
 
+static void sig_handler_2(int signum __maybe_unused,
+			  siginfo_t *oh __maybe_unused,
+			  void *uc __maybe_unused)
+{
+	overflows_2++;
+	if (overflows_2 > 10) {
+		ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
+		ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+		ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
+	}
+}
+
 static void sig_handler(int signum __maybe_unused,
 			siginfo_t *oh __maybe_unused,
 			void *uc __maybe_unused)
@@ -54,10 +99,11 @@ static void sig_handler(int signum __maybe_unused,
 		 */
 		ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
 		ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+		ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
 	}
 }
 
-static int bp_event(void *fn, int setup_signal)
+static int __event(bool is_x, void *addr, int signal)
 {
 	struct perf_event_attr pe;
 	int fd;
@@ -67,8 +113,8 @@ static int bp_event(void *fn, int setup_signal)
 	pe.size = sizeof(struct perf_event_attr);
 
 	pe.config = 0;
-	pe.bp_type = HW_BREAKPOINT_X;
-	pe.bp_addr = (unsigned long) fn;
+	pe.bp_type = is_x ? HW_BREAKPOINT_X : HW_BREAKPOINT_W;
+	pe.bp_addr = (unsigned long) addr;
 	pe.bp_len = sizeof(long);
 
 	pe.sample_period = 1;
@@ -86,17 +132,25 @@ static int bp_event(void *fn, int setup_signal)
 		return TEST_FAIL;
 	}
 
-	if (setup_signal) {
-		fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
-		fcntl(fd, F_SETSIG, SIGIO);
-		fcntl(fd, F_SETOWN, getpid());
-	}
+	fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
+	fcntl(fd, F_SETSIG, signal);
+	fcntl(fd, F_SETOWN, getpid());
 
 	ioctl(fd, PERF_EVENT_IOC_RESET, 0);
 
 	return fd;
 }
 
+static int bp_event(void *addr, int signal)
+{
+	return __event(true, addr, signal);
+}
+
+static int wp_event(void *addr, int signal)
+{
+	return __event(false, addr, signal);
+}
+
 static long long bp_count(int fd)
 {
 	long long count;
@@ -114,7 +168,7 @@ static long long bp_count(int fd)
 int test__bp_signal(int subtest __maybe_unused)
 {
 	struct sigaction sa;
-	long long count1, count2;
+	long long count1, count2, count3;
 
 	/* setup SIGIO signal handler */
 	memset(&sa, 0, sizeof(struct sigaction));
@@ -126,21 +180,52 @@ int test__bp_signal(int subtest __maybe_unused)
 		return TEST_FAIL;
 	}
 
+	sa.sa_sigaction = (void *) sig_handler_2;
+	if (sigaction(SIGUSR1, &sa, NULL) < 0) {
+		pr_debug("failed setting up signal handler 2\n");
+		return TEST_FAIL;
+	}
+
 	/*
 	 * We create following events:
 	 *
-	 * fd1 - breakpoint event on test_function with SIGIO
+	 * fd1 - breakpoint event on __test_function with SIGIO
 	 *       signal configured. We should get signal
 	 *       notification each time the breakpoint is hit
 	 *
-	 * fd2 - breakpoint event on sig_handler without SIGIO
+	 * fd2 - breakpoint event on sig_handler with SIGUSR1
+	 *       configured. We should get SIGUSR1 each time when
+	 *       breakpoint is hit
+	 *
+	 * fd3 - watchpoint event on __test_function with SIGIO
 	 *       configured.
 	 *
 	 * Following processing should happen:
-	 *   - execute test_function
-	 *   - fd1 event breakpoint hit -> count1 == 1
-	 *   - SIGIO is delivered       -> overflows == 1
-	 *   - fd2 event breakpoint hit -> count2 == 1
+	 *   Exec:               Action:                       Result:
+	 *   incq (%rdi)       - fd1 event breakpoint hit   -> count1 == 1
+	 *                     - SIGIO is delivered
+	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 1
+	 *                     - SIGUSR1 is delivered
+	 *   sig_handler_2                                  -> overflows_2 == 1  (nested signal)
+	 *   sys_rt_sigreturn  - return from sig_handler_2
+	 *   overflows++                                    -> overflows = 1
+	 *   sys_rt_sigreturn  - return from sig_handler
+	 *   incq (%rdi)       - fd3 event watchpoint hit   -> count3 == 1       (wp and bp in one insn)
+	 *                     - SIGIO is delivered
+	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 2
+	 *                     - SIGUSR1 is delivered
+	 *   sig_handler_2                                  -> overflows_2 == 2  (nested signal)
+	 *   sys_rt_sigreturn  - return from sig_handler_2
+	 *   overflows++                                    -> overflows = 2
+	 *   sys_rt_sigreturn  - return from sig_handler
+	 *   the_var++         - fd3 event watchpoint hit   -> count3 == 2       (standalone watchpoint)
+	 *                     - SIGIO is delivered
+	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 3
+	 *                     - SIGUSR1 is delivered
+	 *   sig_handler_2                                  -> overflows_2 == 3  (nested signal)
+	 *   sys_rt_sigreturn  - return from sig_handler_2
+	 *   overflows++                                    -> overflows == 3
+	 *   sys_rt_sigreturn  - return from sig_handler
 	 *
 	 * The test case check following error conditions:
 	 * - we get stuck in signal handler because of debug
@@ -152,11 +237,13 @@ int test__bp_signal(int subtest __maybe_unused)
 	 *
 	 */
 
-	fd1 = bp_event(test_function, 1);
-	fd2 = bp_event(sig_handler, 0);
+	fd1 = bp_event(__test_function, SIGIO);
+	fd2 = bp_event(sig_handler, SIGUSR1);
+	fd3 = wp_event((void *)&the_var, SIGIO);
 
 	ioctl(fd1, PERF_EVENT_IOC_ENABLE, 0);
 	ioctl(fd2, PERF_EVENT_IOC_ENABLE, 0);
+	ioctl(fd3, PERF_EVENT_IOC_ENABLE, 0);
 
 	/*
 	 * Kick off the test by trigering 'fd1'
@@ -166,15 +253,18 @@ int test__bp_signal(int subtest __maybe_unused)
 
 	ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
 	ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+	ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
 
 	count1 = bp_count(fd1);
 	count2 = bp_count(fd2);
+	count3 = bp_count(fd3);
 
 	close(fd1);
 	close(fd2);
+	close(fd3);
 
-	pr_debug("count1 %lld, count2 %lld, overflow %d\n",
-		 count1, count2, overflows);
+	pr_debug("count1 %lld, count2 %lld, count3 %lld, overflow %d, overflows_2 %d\n",
+		 count1, count2, count3, overflows, overflows_2);
 
 	if (count1 != 1) {
 		if (count1 == 11)
@@ -183,12 +273,18 @@ int test__bp_signal(int subtest __maybe_unused)
 			pr_debug("failed: wrong count for bp1%lld\n", count1);
 	}
 
-	if (overflows != 1)
+	if (overflows != 3)
 		pr_debug("failed: wrong overflow hit\n");
 
-	if (count2 != 1)
+	if (overflows_2 != 3)
+		pr_debug("failed: wrong overflow_2 hit\n");
+
+	if (count2 != 3)
 		pr_debug("failed: wrong count for bp2\n");
 
-	return count1 == 1 && overflows == 1 && count2 == 1 ?
+	if (count3 != 2)
+		pr_debug("failed: wrong count for bp3\n");
+
+	return count1 == 1 && overflows == 3 && count2 == 3 && overflows_2 == 3 && count3 == 2 ?
 		TEST_OK : TEST_FAIL;
 }


* [tip:perf/core] perf tools: Move timestamp creation to util
  2016-01-25  9:56 ` [PATCH 26/54] perf tools: Move timestamp creation to util Wang Nan
@ 2016-02-03 10:18   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 79+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-03 10:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: ast, will.deacon, tglx, daniel, hpa, hekuang, mingo, namhyung,
	masami.hiramatsu.pt, jolsa, peterz, brendan.d.gregg, wangnan0,
	acme, linux-kernel, lizefan

Commit-ID:  37b20151efe002a4a43532d3791d11d39d080248
Gitweb:     http://git.kernel.org/tip/37b20151efe002a4a43532d3791d11d39d080248
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 25 Jan 2016 09:56:13 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 29 Jan 2016 17:30:06 -0300

perf tools: Move timestamp creation to util

Timestamp generation becomes a publicly available helper. It will be
used by 'perf record' to split its output into separate files named by
time.

For example:

 perf.data.2015122620363710
 perf.data.2015122620364092
 perf.data.2015122620365423
 ...
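
A standalone sketch of how such names could be built is shown here. It
is not the perf implementation: sketch_timestamp() follows the format
used by fetch_current_timestamp() in the diff below but uses snprintf()
instead of perf's scnprintf(), and sketch_output_name() is a
hypothetical helper added only to show the resulting file name.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/* Write "YYYYMMDDhhmmss" plus two digits of 1/100 s into buf. */
int sketch_timestamp(char *buf, size_t sz)
{
	struct timeval tv;
	struct tm tm;
	char dt[32];
	int n;

	assert(buf != NULL);

	if (gettimeofday(&tv, NULL) || !localtime_r(&tv.tv_sec, &tm))
		return -1;

	if (!strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", &tm))
		return -1;

	/* tv_usec / 10000 is in 0..99, so this appends exactly two digits. */
	n = snprintf(buf, sz, "%s%02u", dt, (unsigned)tv.tv_usec / 10000);
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}

/* Hypothetical helper: build "<base>.<timestamp>" into out. */
int sketch_output_name(const char *base, char *out, size_t sz)
{
	char ts[32];
	int n;

	assert(base != NULL && out != NULL);

	if (sketch_timestamp(ts, sizeof(ts)))
		return -1;

	n = snprintf(out, sz, "%s.%s", base, ts);
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}
```

The timestamp is 16 characters long, which matches the suffixes in the
example names above. Unlike the helper in the diff, this sketch reports
truncation as an error instead of silently shortening the string.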

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-27-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-buildid-cache.c | 14 +-------------
 tools/perf/util/util.c             | 17 +++++++++++++++++
 tools/perf/util/util.h             |  1 +
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index d93bff7..632efc6 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -38,19 +38,7 @@ static int build_id_cache__kcore_buildid(const char *proc_dir, char *sbuildid)
 
 static int build_id_cache__kcore_dir(char *dir, size_t sz)
 {
-	struct timeval tv;
-	struct tm tm;
-	char dt[32];
-
-	if (gettimeofday(&tv, NULL) || !localtime_r(&tv.tv_sec, &tm))
-		return -1;
-
-	if (!strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", &tm))
-		return -1;
-
-	scnprintf(dir, sz, "%s%02u", dt, (unsigned)tv.tv_usec / 10000);
-
-	return 0;
+	return fetch_current_timestamp(dir, sz);
 }
 
 static bool same_kallsyms_reloc(const char *from_dir, char *to_dir)
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 7a2da7e..b9e2843 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -701,3 +701,20 @@ bool is_regular_file(const char *file)
 
 	return S_ISREG(st.st_mode);
 }
+
+int fetch_current_timestamp(char *buf, size_t sz)
+{
+	struct timeval tv;
+	struct tm tm;
+	char dt[32];
+
+	if (gettimeofday(&tv, NULL) || !localtime_r(&tv.tv_sec, &tm))
+		return -1;
+
+	if (!strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", &tm))
+		return -1;
+
+	scnprintf(buf, sz, "%s%02u", dt, (unsigned)tv.tv_usec / 10000);
+
+	return 0;
+}
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 61650f0..a861581 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -344,5 +344,6 @@ int fetch_kernel_version(unsigned int *puint,
 
 const char *perf_tip(const char *dirpath);
 bool is_regular_file(const char *file);
+int fetch_current_timestamp(char *buf, size_t sz);
 
 #endif /* GIT_COMPAT_UTIL_H */

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [tip:perf/core] perf record: Use OPT_BOOLEAN_SET for buildid cache related options
  2016-01-25  9:56 ` [PATCH 32/54] perf record: Use OPT_BOOLEAN_SET for buildid cache related options Wang Nan
@ 2016-02-03 10:19   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 79+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-03 10:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: ast, will.deacon, lizefan, namhyung, linux-kernel, hpa, acme,
	wangnan0, peterz, hekuang, brendan.d.gregg, mingo,
	masami.hiramatsu.pt, tglx, jolsa, daniel

Commit-ID:  d2db9a98c3058a45780f7fcd0cc8584858cf6b29
Gitweb:     http://git.kernel.org/tip/d2db9a98c3058a45780f7fcd0cc8584858cf6b29
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 25 Jan 2016 09:56:19 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 29 Jan 2016 17:39:07 -0300

perf record: Use OPT_BOOLEAN_SET for buildid cache related options

With this patch 'perf record' knows whether the buildid cache was enabled
deliberately (via --no-no-buildid-cache), so that the buildid cache can be
turned off by default in some situations.

Output switching support needs this feature to turn off buildid cache
by default.
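
The idea can be sketched outside of perf as a boolean option that carries
a second "was it given explicitly" flag. This is not perf's parse-options
code; struct bool_opt, parse_bool_opt() and apply_default() are
hypothetical names for this illustration:

```c
#include <stdbool.h>
#include <string.h>

/* The option value plus a flag recording whether the user set it explicitly. */
struct bool_opt {
	bool value;
	bool set;
};

/* Accept "--<name>" (true) or "--no-<name>" (false); return true on a match. */
static bool parse_bool_opt(const char *arg, const char *name, struct bool_opt *opt)
{
	if (strncmp(arg, "--", 2) != 0)
		return false;
	arg += 2;

	if (strcmp(arg, name) == 0)
		opt->value = true;
	else if (strncmp(arg, "no-", 3) == 0 && strcmp(arg + 3, name) == 0)
		opt->value = false;
	else
		return false;

	opt->set = true;
	return true;
}

/* A feature may change the default, but never overrides an explicit choice. */
static void apply_default(struct bool_opt *opt, bool dflt)
{
	if (!opt->set)
		opt->value = dflt;
}
```

With name "no-buildid-cache", the argument "--no-no-buildid-cache" leaves
value false and set true, so a later apply_default(&opt, true) keeps the
user's choice.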

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-33-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 319712a..0ee0d5c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -49,7 +49,9 @@ struct record {
 	const char		*progname;
 	int			realtime_prio;
 	bool			no_buildid;
+	bool			no_buildid_set;
 	bool			no_buildid_cache;
+	bool			no_buildid_cache_set;
 	bool			buildid_all;
 	unsigned long long	samples;
 };
@@ -1097,10 +1099,12 @@ struct option __record_options[] = {
 	OPT_BOOLEAN('P', "period", &record.opts.period, "Record the sample period"),
 	OPT_BOOLEAN('n', "no-samples", &record.opts.no_samples,
 		    "don't sample"),
-	OPT_BOOLEAN('N', "no-buildid-cache", &record.no_buildid_cache,
-		    "do not update the buildid cache"),
-	OPT_BOOLEAN('B', "no-buildid", &record.no_buildid,
-		    "do not collect buildids in perf.data"),
+	OPT_BOOLEAN_SET('N', "no-buildid-cache", &record.no_buildid_cache,
+			&record.no_buildid_cache_set,
+			"do not update the buildid cache"),
+	OPT_BOOLEAN_SET('B', "no-buildid", &record.no_buildid,
+			&record.no_buildid_set,
+			"do not collect buildids in perf.data"),
 	OPT_CALLBACK('G', "cgroup", &record.evlist, "name",
 		     "monitor event in cgroup name only",
 		     parse_cgroups),

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH 09/54] perf tools: Add API to config maps in bpf object
  2016-01-25  9:55 ` [PATCH 09/54] perf tools: Add API to config maps in bpf object Wang Nan
@ 2016-02-03 23:29   ` Arnaldo Carvalho de Melo
  2016-02-04 12:59     ` Wangnan (F)
  0 siblings, 1 reply; 79+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-03 23:29 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Brendan Gregg, Daniel Borkmann,
	David S. Miller, He Kuang, Jiri Olsa, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Will Deacon,
	linux-kernel

Em Mon, Jan 25, 2016 at 09:55:56AM +0000, Wang Nan escreveu:
> bpf__config_obj() is introduced as a core API to config BPF object
> after loading. One configuration option of maps is introduced. After
> this patch BPF object can accept configuration like:
> 
>  maps:my_map.value=1234
> 
> (maps.my_map.value looks pretty. However, there's a small but hard
> to fix problem related to flex's greedy matching. Please see [1].
> ':' is chosen to avoid it in a simpler way.)
> 
> This patch is more complex than the work it really does because it
> has to allow for extension. In designing BPF map configuration, the
> following things should be considered:
> 
>  1. Array indices selection: perf should allow user setting different
>     value to different slots in an array, with syntax like:
>     maps:my_map.value[0,3...6]=1234;
> 
>  2. A map can be config by different config terms, each for a part
>     of it. For example, set each slot to pid of a thread;
> 
>  3. Type of value: integer is not the only valid value type. Perf
>     event can also be put into a map after commit 35578d7984003097af2b1e3
>     (bpf: Implement function bpf_perf_event_read() that get the selected
>     hardware PMU conuter);
> 
>  4. For hash table, it is possible to use string or other as key;
> 
>  5. It is possible that map configuration is unable to be setup
>     during parsing. Perf event is an example.
> 
> Therefore, this patch does following:
> 
>  1. Instead of updating map element during parsing, this patch stores
>     map config options in 'struct bpf_map_priv'. Following patches
>     would apply those configs at proper time;
> 
>  2. Link map operations to a list so a map can have multiple config
>     terms attached, so different parts can be configured separately;
> 
>  3. Make 'struct bpf_map_priv' extensible so following patches can
>     add new types of keys and operations;
> 
>  4. Use bpf_config_map_funcs array to support more maps config options.
> 
> Since the patch changing the event parser to parse BPF object config is
> relatively large, I put it in another commit. The code in this patch
> can be tested after applying the next patch.
> 
> [1] http://lkml.kernel.org/g/564ED621.4050500@huawei.com
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/util/bpf-loader.c | 266 +++++++++++++++++++++++++++++++++++++++++++
>  tools/perf/util/bpf-loader.h |  38 +++++++
>  2 files changed, 304 insertions(+)
> 
> diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
> index 540a7ef..7d361aa 100644
> --- a/tools/perf/util/bpf-loader.c
> +++ b/tools/perf/util/bpf-loader.c
> @@ -739,6 +739,251 @@ int bpf__foreach_tev(struct bpf_object *obj,
>  	return 0;
>  }
>  
> +enum bpf_map_op_type {
> +	BPF_MAP_OP_SET_VALUE,
> +};
> +
> +enum bpf_map_key_type {
> +	BPF_MAP_KEY_ALL,
> +};
> +
> +struct bpf_map_op {
> +	struct list_head list;
> +	enum bpf_map_op_type op_type;
> +	enum bpf_map_key_type key_type;
> +	union {
> +		u64 value;
> +	} v;
> +};
> +
> +struct bpf_map_priv {
> +	struct list_head ops_list;
> +};
> +
> +static void
> +bpf_map_op__free(struct bpf_map_op *op)
> +{
> +	struct list_head *list = &op->list;
> +	/*
> +	 * bpf_map_op__free() needs to consider following cases:
> +	 *   1. When the op is created but not linked to any list:
> +	 *      impossible. This only happen in bpf_map_op__alloc()
> +	 *      and it would be freed directly;
> +	 *   2. Normal case, when the op is linked to a list;
> +	 *   3. After the op has already be removed.
> +	 * Thanks to list.h, if it has removed by list_del() then
> +	 * list->{next,prev} should have been set to LIST_POISON{1,2}.
> +	 */
> +	if ((list->next != LIST_POISON1) && (list->prev != LIST_POISON2))

Humm, this seems to rely on a debugging feature (setting something to a
trap value), i.e. list poisoning. Shouldn't we establish that removal needs
to be done via list_del_init()? Then we would just check it with
list_empty(), which would be just like that bug we fixed recently wrt
thread__put(). With list_del_init() a repeated removal is not problematic:

 		list_del_init(&op->list);
 		list_del_init(&op->list);

And after:

		list_del_init(&op->list);

if you wanted for some reason to check if it was unlinked, this would do
the trick:

		if (!list_empty(&op->list)) /* Is op in a list? */
			list_del_init(&op->list);

static void bpf_map_op__free(struct bpf_map_op *op)
{
	list_del(&op->list); /* Make sure it is removed */
	free(op);
}

If we make sure that all list removal is done with list_del_init().

But then, this "make sure it is removed" looks strange, this should be
done only if it isn't linked, no? Perhaps use refcounts here?
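
A minimal, self-contained sketch of the list_del_init()/list_empty() idiom
suggested above. The list helpers are simplified stand-ins written for this
example (not the kernel's list.h), and struct op, op__new() and op__delete()
are hypothetical names:

```c
#include <stdbool.h>
#include <stdlib.h>

struct list_head {
	struct list_head *next, *prev;
};

/* An initialised head (or unlinked entry) points at itself. */
static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and re-initialise, so a second call on the same entry is harmless. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);
}

struct op {
	struct list_head list;
	int value;
};

static struct op *op__new(int value)
{
	struct op *op = calloc(1, sizeof(*op));

	if (op) {
		INIT_LIST_HEAD(&op->list);	/* starts out unlinked */
		op->value = value;
	}
	return op;
}

/* No poison values needed: list_empty() tells whether op is still linked. */
static void op__delete(struct op *op)
{
	if (!op)
		return;
	if (!list_empty(&op->list))
		list_del_init(&op->list);
	free(op);
}
```

The check in op__delete() only works if every removal goes through
list_del_init() and every entry is initialised when it is created.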


> +		list_del(list);
> +	free(op);


I.e. this function could be rewritten as:

> +}
> +
> +static void
> +bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
> +		    void *_priv)
> +{
> +	struct bpf_map_priv *priv = _priv;
> +	struct bpf_map_op *pos, *n;
> +
> +	list_for_each_entry_safe(pos, n, &priv->ops_list, list)
> +		bpf_map_op__free(pos);


I.e. here you would remove the thing and then call the delete()
operation for bpf_map_op, otherwise that delete().

Also normally this would be called bpf_map_priv__purge(), i.e. remove
entries and delete them, used in tools in:

[acme@jouet linux]$ find tools/ -name "*.c" | xargs grep __purge
tools/perf/builtin-buildid-cache.c:static int build_id_cache__purge_path(const char *pathname)
tools/perf/builtin-buildid-cache.c:				if (build_id_cache__purge_path(pos->s)) {
tools/perf/util/evlist.c:static void perf_evlist__purge(struct perf_evlist *evlist)
tools/perf/util/evlist.c:	perf_evlist__purge(evlist);
tools/perf/util/map.c:static void __maps__purge(struct maps *maps)
tools/perf/util/map.c:	__maps__purge(maps);
tools/perf/util/annotate.c:void disasm__purge(struct list_head *head)
tools/perf/util/annotate.c:	disasm__purge(&symbol__annotation(sym)->src->source);
tools/perf/util/machine.c:static void dsos__purge(struct dsos *dsos)
tools/perf/util/machine.c:	dsos__purge(dsos);
[acme@jouet linux]$

And in the kernel proper in:

[acme@jouet linux]$ find . -name "*.c" | xargs grep [a-z]_purge  | wc -l
1009

Most notable examples:

/**
 *      __skb_queue_purge - empty a list
 *      @list: list to empty
 *
 *      Delete all buffers on an &sk_buff list. Each buffer is removed from
 *      the list and one reference dropped. This function does not take the
 *      list lock and the caller must hold the relevant locks to use it.
 */
static inline void __skb_queue_purge(struct sk_buff_head *list)
{
        struct sk_buff *skb;
        while ((skb = __skb_dequeue(list)) != NULL)
                kfree_skb(skb);
}

/**
 *      skb_queue_purge - empty a list
 *      @list: list to empty
 *
 *      Delete all buffers on an &sk_buff list. Each buffer is removed from
 *      the list and one reference dropped. This function takes the list
 *      lock and is atomic with respect to other list locking functions.
 */
void skb_queue_purge(struct sk_buff_head *list)
{
        struct sk_buff *skb;
        while ((skb = skb_dequeue(list)) != NULL)
                kfree_skb(skb);
}

Where the delete() operation is called kfree_skb() and notice that it is called
only after the object (skb) is unlinked from whatever lists it sits on.
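
The same "unlink first, then delete" shape can be sketched in a few lines
of standalone C. The queue below is a simplified singly linked stand-in,
and struct op, op_queue__dequeue(), op_queue__purge() and op__delete() are
hypothetical names, not existing tools/perf or kernel functions:

```c
#include <stdlib.h>

struct op {
	struct op *next;
	int value;
};

struct op_queue {
	struct op *head, *tail;
	unsigned int len;
};

static void op_queue__enqueue(struct op_queue *q, struct op *op)
{
	op->next = NULL;
	if (q->tail)
		q->tail->next = op;
	else
		q->head = op;
	q->tail = op;
	q->len++;
}

/* Unlink and return the first entry, or NULL when the queue is empty. */
static struct op *op_queue__dequeue(struct op_queue *q)
{
	struct op *op = q->head;

	if (!op)
		return NULL;
	q->head = op->next;
	if (!q->head)
		q->tail = NULL;
	op->next = NULL;
	q->len--;
	return op;
}

/* The delete() operation: only ever sees entries that are already unlinked. */
static void op__delete(struct op *op)
{
	free(op);
}

/* Purge: remove every entry from the queue, then delete it. */
static void op_queue__purge(struct op_queue *q)
{
	struct op *op;

	while ((op = op_queue__dequeue(q)) != NULL)
		op__delete(op);
}
```

Because the destructor never touches the container, it does not need to
guess whether the object is still linked.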

> +	free(priv);
> +}
> +
> +static struct bpf_map_op *
> +bpf_map_op__alloc(struct bpf_map *map)

I'd name it bpf_map_op__new(), for consistency with other tools/perf/ code, but
wouldn't fight too much about using both __alloc() and __new() for constructors
while __free() and __delete() for destructors :-\

> +{
> +	struct bpf_map_op *op;
> +	struct bpf_map_priv *priv;
> +	const char *map_name;
> +	int err;
> +
> +	map_name = bpf_map__get_name(map);
> +	err = bpf_map__get_private(map, (void **)&priv);
> +	if (err) {
> +		pr_debug("Failed to get private from map %s\n", map_name);
> +		return ERR_PTR(err);
> +	}
> +
> +	if (!priv) {
> +		priv = zalloc(sizeof(*priv));
> +		if (!priv) {
> +			pr_debug("No enough memory to alloc map private\n");
> +			return ERR_PTR(-ENOMEM);
> +		}
> +		INIT_LIST_HEAD(&priv->ops_list);
> +
> +		if (bpf_map__set_private(map, priv, bpf_map_priv__clear)) {
> +			free(priv);
> +			return ERR_PTR(-BPF_LOADER_ERRNO__INTERNAL);
> +		}
> +	}

Can't this bpf_map specific stuff be done on the caller? I.e. it looks like a
layering violation,  i.e. the method is called "bpf_map_op__alloc", this is
something that is related to a bpf_map_op instance, but in the end it allocates
a new instance of a bpf_map_op _and_ adds it to the bpf_map passed as a parameter.

I would expect it to be like:

	op = bpf_map_op__new(); // i.e.: op = bpf_map_op__alloc();
	bpf_map__add_op(map, op);

And bpf_map__add_op() would do the map->priv allocation if needed, which would
be natural, as bpf_map__ functions touch bpf_map internals.
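
A rough standalone sketch of that split, assuming simplified stand-in types
(struct map, struct map_priv and struct map_op are hypothetical and do not
match the libbpf or bpf-loader.c definitions):

```c
#include <errno.h>
#include <stdlib.h>

struct map_op {
	struct map_op *next;
	unsigned long long value;
};

struct map_priv {
	struct map_op *first;
	struct map_op **tailp;	/* where the next op gets linked */
};

struct map {
	const char *name;
	struct map_priv *priv;
};

/* Constructor: allocates and initialises an op, knows nothing about maps. */
static struct map_op *map_op__new(unsigned long long value)
{
	struct map_op *op = calloc(1, sizeof(*op));

	if (op)
		op->value = value;
	return op;
}

/* Map-side method: allocates map->priv on first use, then links the op. */
static int map__add_op(struct map *map, struct map_op *op)
{
	if (!map->priv) {
		struct map_priv *priv = calloc(1, sizeof(*priv));

		if (!priv)
			return -ENOMEM;
		priv->tailp = &priv->first;
		map->priv = priv;
	}

	*map->priv->tailp = op;
	map->priv->tailp = &op->next;
	return 0;
}
```

The caller then reads as two steps: create the op, then hand it to the map,
and only the map__ function touches the map's private area.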

> +
> +	op = zalloc(sizeof(*op));
> +	if (!op) {
> +		pr_debug("Failed to alloc bpf_map_op\n");
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	op->key_type = BPF_MAP_KEY_ALL;
> +	list_add_tail(&op->list, &priv->ops_list);
> +	return op;
> +}
> +
> +static int
> +bpf__obj_config_map_array_value(struct bpf_map *map,
> +				struct parse_events_term *term)

This should be:

  bpf_map__

Ditto for other functions below that operate on struct bpf_map.


> +{
> +	struct bpf_map_def def;
> +	struct bpf_map_op *op;
> +	const char *map_name;
> +	int err;
> +
> +	map_name = bpf_map__get_name(map);
> +
> +	err = bpf_map__get_def(map, &def);
> +	if (err) {
> +		pr_debug("Unable to get map definition from '%s'\n",
> +			 map_name);
> +		return -BPF_LOADER_ERRNO__INTERNAL;
> +	}
> +
> +	if (def.type != BPF_MAP_TYPE_ARRAY) {
> +		pr_debug("Map %s type is not BPF_MAP_TYPE_ARRAY\n",
> +			 map_name);
> +		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
> +	}
> +	if (def.key_size < sizeof(unsigned int)) {
> +		pr_debug("Map %s has incorrect key size\n", map_name);
> +		return -BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE;
> +	}
> +	switch (def.value_size) {
> +	case 1:
> +	case 2:
> +	case 4:
> +	case 8:
> +		break;
> +	default:
> +		pr_debug("Map %s has incorrect value size\n", map_name);
> +		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
> +	}
> +
> +	op = bpf_map_op__alloc(map);
> +	if (IS_ERR(op))
> +		return PTR_ERR(op);
> +	op->op_type = BPF_MAP_OP_SET_VALUE;
> +	op->v.value = term->val.num;
> +	return 0;
> +}
> +
> +static int
> +bpf__obj_config_map_value(struct bpf_map *map,
> +			  struct parse_events_term *term,
> +			  struct perf_evlist *evlist __maybe_unused)
> +{
> +	if (!term->err_val) {
> +		pr_debug("Config value not set\n");
> +		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
> +	}
> +
> +	if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM)
> +		return bpf__obj_config_map_array_value(map, term);
> +
> +	pr_debug("ERROR: wrong value type\n");
> +	return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
> +}
> +
> +struct bpf_obj_config_map_func {
> +	const char *config_opt;
> +	int (*config_func)(struct bpf_map *, struct parse_events_term *,
> +			   struct perf_evlist *);
> +};
> +
> +struct bpf_obj_config_map_func bpf_obj_config_map_funcs[] = {
> +	{"value", bpf__obj_config_map_value},
> +};
> +
> +static int
> +bpf__obj_config_map(struct bpf_object *obj,
> +		    struct parse_events_term *term,
> +		    struct perf_evlist *evlist,
> +		    int *key_scan_pos)
> +{
> +	/* key is "maps:<mapname>.<config opt>" */
> +	char *map_name = strdup(term->config + sizeof("maps:") - 1);
> +	struct bpf_map *map;
> +	int err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
> +	char *map_opt;
> +	size_t i;
> +
> +	if (!map_name)
> +		return -ENOMEM;
> +
> +	map_opt = strchr(map_name, '.');
> +	if (!map_opt) {
> +		pr_debug("ERROR: Invalid map config: %s\n", map_name);
> +		goto out;
> +	}
> +
> +	*map_opt++ = '\0';
> +	if (*map_opt == '\0') {
> +		pr_debug("ERROR: Invalid map option: %s\n", term->config);
> +		goto out;
> +	}
> +
> +	map = bpf_object__get_map_by_name(obj, map_name);
> +	if (!map) {
> +		pr_debug("ERROR: Map %s is not exist\n", map_name);
> +		err = -BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST;
> +		goto out;
> +	}
> +
> +	*key_scan_pos += map_opt - map_name;
> +	for (i = 0; i < ARRAY_SIZE(bpf_obj_config_map_funcs); i++) {
> +		struct bpf_obj_config_map_func *func =
> +				&bpf_obj_config_map_funcs[i];
> +
> +		if (strcmp(map_opt, func->config_opt) == 0) {
> +			err = func->config_func(map, term, evlist);
> +			goto out;
> +		}
> +	}
> +
> +	pr_debug("ERROR: invalid config option '%s' for maps\n",
> +		 map_opt);
> +	err = -BPF_LOADER_ERRNO__OBJCONF_MAP_OPT;
> +out:
> +	free(map_name);
> +	if (!err)
> +		key_scan_pos += strlen(map_opt);
> +	return err;
> +}
> +
> +int bpf__config_obj(struct bpf_object *obj,
> +		    struct parse_events_term *term,
> +		    struct perf_evlist *evlist,
> +		    int *error_pos)
> +{
> +	int key_scan_pos = 0;
> +	int err;
> +
> +	if (!obj || !term || !term->config)
> +		return -EINVAL;
> +
> +	if (!prefixcmp(term->config, "maps:")) {
> +		key_scan_pos = sizeof("maps:") - 1;
> +		err = bpf__obj_config_map(obj, term, evlist, &key_scan_pos);
> +		goto out;
> +	}
> +	err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
> +out:
> +	if (error_pos)
> +		*error_pos = key_scan_pos;
> +	return err;
> +
> +}
> +
>  #define ERRNO_OFFSET(e)		((e) - __BPF_LOADER_ERRNO__START)
>  #define ERRCODE_OFFSET(c)	ERRNO_OFFSET(BPF_LOADER_ERRNO__##c)
>  #define NR_ERRNO	(__BPF_LOADER_ERRNO__END - __BPF_LOADER_ERRNO__START)
> @@ -753,6 +998,14 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
>  	[ERRCODE_OFFSET(PROLOGUE)]	= "Failed to generate prologue",
>  	[ERRCODE_OFFSET(PROLOGUE2BIG)]	= "Prologue too big for program",
>  	[ERRCODE_OFFSET(PROLOGUEOOB)]	= "Offset out of bound for prologue",
> +	[ERRCODE_OFFSET(OBJCONF_OPT)]	= "Invalid object config option",
> +	[ERRCODE_OFFSET(OBJCONF_CONF)]	= "Config value not set (lost '=')",
> +	[ERRCODE_OFFSET(OBJCONF_MAP_OPT)]	= "Invalid object maps config option",
> +	[ERRCODE_OFFSET(OBJCONF_MAP_NOTEXIST)]	= "Target map not exist",
> +	[ERRCODE_OFFSET(OBJCONF_MAP_VALUE)]	= "Incorrect value type for map",
> +	[ERRCODE_OFFSET(OBJCONF_MAP_TYPE)]	= "Incorrect map type",
> +	[ERRCODE_OFFSET(OBJCONF_MAP_KEYSIZE)]	= "Incorrect map key size",
> +	[ERRCODE_OFFSET(OBJCONF_MAP_VALUESIZE)]	= "Incorrect map value size",
>  };
>  
>  static int
> @@ -872,3 +1125,16 @@ int bpf__strerror_load(struct bpf_object *obj,
>  	bpf__strerror_end(buf, size);
>  	return 0;
>  }
> +
> +int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
> +			     struct parse_events_term *term __maybe_unused,
> +			     struct perf_evlist *evlist __maybe_unused,
> +			     int *error_pos __maybe_unused, int err,
> +			     char *buf, size_t size)
> +{
> +	bpf__strerror_head(err, buf, size);
> +	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,
> +			    "Can't use this config term to this type of map");
> +	bpf__strerror_end(buf, size);
> +	return 0;
> +}
> diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
> index 6fdc045..2464db9 100644
> --- a/tools/perf/util/bpf-loader.h
> +++ b/tools/perf/util/bpf-loader.h
> @@ -10,6 +10,7 @@
>  #include <string.h>
>  #include <bpf/libbpf.h>
>  #include "probe-event.h"
> +#include "evlist.h"
>  #include "debug.h"
>  
>  enum bpf_loader_errno {
> @@ -24,10 +25,19 @@ enum bpf_loader_errno {
>  	BPF_LOADER_ERRNO__PROLOGUE,	/* Failed to generate prologue */
>  	BPF_LOADER_ERRNO__PROLOGUE2BIG,	/* Prologue too big for program */
>  	BPF_LOADER_ERRNO__PROLOGUEOOB,	/* Offset out of bound for prologue */
> +	BPF_LOADER_ERRNO__OBJCONF_OPT,	/* Invalid object config option */
> +	BPF_LOADER_ERRNO__OBJCONF_CONF,	/* Config value not set (lost '=')) */
> +	BPF_LOADER_ERRNO__OBJCONF_MAP_OPT,	/* Invalid object maps config option */
> +	BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST,	/* Target map not exist */
> +	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE,	/* Incorrect value type for map */
> +	BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,	/* Incorrect map type */
> +	BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE,	/* Incorrect map key size */
> +	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE,/* Incorrect map value size */
>  	__BPF_LOADER_ERRNO__END,
>  };
>  
>  struct bpf_object;
> +struct parse_events_term;
>  #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
>  
>  typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
> @@ -53,6 +63,14 @@ int bpf__strerror_load(struct bpf_object *obj, int err,
>  		       char *buf, size_t size);
>  int bpf__foreach_tev(struct bpf_object *obj,
>  		     bpf_prog_iter_callback_t func, void *arg);
> +
> +int bpf__config_obj(struct bpf_object *obj, struct parse_events_term *term,
> +		    struct perf_evlist *evlist, int *error_pos);
> +int bpf__strerror_config_obj(struct bpf_object *obj,
> +			     struct parse_events_term *term,
> +			     struct perf_evlist *evlist,
> +			     int *error_pos, int err, char *buf,
> +			     size_t size);
>  #else
>  static inline struct bpf_object *
>  bpf__prepare_load(const char *filename __maybe_unused,
> @@ -84,6 +102,15 @@ bpf__foreach_tev(struct bpf_object *obj __maybe_unused,
>  }
>  
>  static inline int
> +bpf__config_obj(struct bpf_object *obj __maybe_unused,
> +		struct parse_events_term *term __maybe_unused,
> +		struct perf_evlist *evlist __maybe_unused,
> +		int *error_pos __maybe_unused)
> +{
> +	return 0;
> +}
> +
> +static inline int
>  __bpf_strerror(char *buf, size_t size)
>  {
>  	if (!size)
> @@ -118,5 +145,16 @@ static inline int bpf__strerror_load(struct bpf_object *obj __maybe_unused,
>  {
>  	return __bpf_strerror(buf, size);
>  }
> +
> +static inline int
> +bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
> +			 struct parse_events_term *term __maybe_unused,
> +			 struct perf_evlist *evlist __maybe_unused,
> +			 int *error_pos __maybe_unused,
> +			 int err __maybe_unused,
> +			 char *buf, size_t size)
> +{
> +	return __bpf_strerror(buf, size);
> +}
>  #endif
>  #endif
> -- 
> 1.8.3.4

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 13/54] perf tools: Support perf event alias name
  2016-01-25  9:56 ` [PATCH 13/54] perf tools: Support perf event alias name Wang Nan
@ 2016-02-03 23:35   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 79+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-03 23:35 UTC (permalink / raw)
  To: Wang Nan, Jiri Olsa
  Cc: Alexei Starovoitov, Brendan Gregg, Daniel Borkmann,
	David S. Miller, He Kuang, Jiri Olsa, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Will Deacon,
	linux-kernel

Em Mon, Jan 25, 2016 at 09:56:00AM +0000, Wang Nan escreveu:
> From: He Kuang <hekuang@huawei.com>
> 
> This patch is useful when trying to pass a perf event to BPF map.
> Before this patch we are unable to pass an event with config term to
> BPF maps. For example:
> 
>  # perf record -a -e cycles/no-inherit,period=0x7fffffffffffffff/ \
>                   -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/no-inherit,period=0x7fffffffffffffff//' ls /
>  event syntax error: '..ps:pmu_map.event=cycles/'
>                                    \___ Event not found for map setting
> 
> Because those '/' and ',' confuse the parser.
> 
> This patch adds new bison rules for specifying an alias name for a perf
> event, which allows the cmdline to refer to a previously defined perf
> event through its name. With this patch the user can give an alias name
> to a perf event, and the above goal can be achieved using:
> 
>  # perf record -a -e cyc=cycles/no-inherit,period=0x7fffffffffffffff/ \
>                   -e './test_bpf_map_2.c/maps:pmu_map.event=cyc/' ls /

In another thread Jiri suggested this as a way to get an alias
associated with some event:

 -----------------------------------------------------------------
On Tue, Feb 02, 2016 at 05:24:16PM +0100, Andreas Hollmann wrote:
> Jiri, how do you handle raw counters with this python stat__*
> callback? What is the name of the callback?

you can use 'name' term like:
  perf stat -e cycles,"cpu/config=0x6530160,name=krava/"

and use following callback in your script:
  def stat__krava(cpu, thread, time, val, ena, run):

 -----------------------------------------------------------------

Can't we go with that existing syntax? Jiri?

That would be:

  # perf record -a -e cycles/no-inherit,period=0x7fffffffffffffff,name=cyc/ \
                   -e './test_bpf_map_2.c/maps:pmu_map.event=cyc/' ls /

Your way is shorter, but we already have this name=foo method :-\
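
Whichever syntax is used to attach the name, resolving a map setting such
as maps:pmu_map.event=cyc comes down to a lookup by name among the events
defined earlier on the command line. A minimal sketch, assuming a
hypothetical struct named_event and find_event_by_name() that are not part
of perf:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record of an event and the name given to it on the cmdline. */
struct named_event {
	const char *name;		/* NULL when the event has no name */
	unsigned long long config;
};

/* Return the first event whose name matches, or NULL if there is none. */
static const struct named_event *
find_event_by_name(const struct named_event *evs, size_t nr, const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (evs[i].name && strcmp(evs[i].name, name) == 0)
			return &evs[i];
	}
	return NULL;
}
```

A NULL result corresponds to the "Event not found for map setting" case
quoted above.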

- Arnaldo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 09/54] perf tools: Add API to config maps in bpf object
  2016-02-03 23:29   ` Arnaldo Carvalho de Melo
@ 2016-02-04 12:59     ` Wangnan (F)
  0 siblings, 0 replies; 79+ messages in thread
From: Wangnan (F) @ 2016-02-04 12:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Alexei Starovoitov, Brendan Gregg, Daniel Borkmann,
	David S. Miller, He Kuang, Jiri Olsa, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Will Deacon,
	linux-kernel



On 2016/2/4 7:29, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 25, 2016 at 09:55:56AM +0000, Wang Nan escreveu:

[SNIP]

>> +
>> +static void
>> +bpf_map_op__free(struct bpf_map_op *op)
>> +{
>> +	struct list_head *list = &op->list;
>> +	/*
>> +	 * bpf_map_op__free() needs to consider following cases:
>> +	 *   1. When the op is created but not linked to any list:
>> +	 *      impossible. This only happen in bpf_map_op__alloc()
>> +	 *      and it would be freed directly;
>> +	 *   2. Normal case, when the op is linked to a list;
>> +	 *   3. After the op has already be removed.
>> +	 * Thanks to list.h, if it has removed by list_del() then
>> +	 * list->{next,prev} should have been set to LIST_POISON{1,2}.
>> +	 */
>> +	if ((list->next != LIST_POISON1) && (list->prev != LIST_POISON2))
> Humm, this seems to rely on a debugging feature (setting something to a
> trap value), i.e. list poisoning. Shouldn't we establish that removal needs
> to be done via list_del_init()? Then we would just check it with
> list_empty(), which would be just like that bug we fixed recently wrt
> thread__put(). With list_del_init() a repeated removal is not problematic:
>
>   		list_del_init(&op->list);
>   		list_del_init(&op->list);
>
> And after:
>
> 		list_del_init(&op->list);
>
> if you wanted for some reason to check if it was unlinked, this would do
> the trick:
>
> 		if (!list_empty(&op->list)) /* Is op in a list? */
> 			list_del_init(&op->list);
>
> static void bpf_map_op__free(struct bpf_map_op *op)
> {
> 	list_del(&op->list); /* Make sure it is removed */
> 	free(op);
> }
>
> If we make sure that all list removal is done with list_del_init().
>
> But then, this "make sure it is removed" looks strange, this should be
> done only if it isn't linked, no? Perhaps use refcounts here?
>
>
>> +		list_del(list);
>> +	free(op);
>
> I.e. this function could be rewritten as:
>
>> +}
>> +
>> +static void
>> +bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
>> +		    void *_priv)
>> +{
>> +	struct bpf_map_priv *priv = _priv;
>> +	struct bpf_map_op *pos, *n;
>> +
>> +	list_for_each_entry_safe(pos, n, &priv->ops_list, list)
>> +		bpf_map_op__free(pos);
>
> I.e. here you would remove the thing and then call the delete()
> operation for bpf_map_op, otherwise that delete().
>
> Also normally this would be called bpf_map_priv__purge(), i.e. remove
> entries and delete them, used in tools in:

The name of bpf_map_priv__clear comes from bpf_map_clear_priv_t. It is
passed to bpf_map__set_private as a callback, so naming it
bpf_map_priv__purge() would be confusing.
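
For readers unfamiliar with the pattern, here is a standalone sketch of a
"private data plus clear callback" setter. The types and functions are
simplified stand-ins with hypothetical names, and releasing the previous
private data through the registered callback is an assumption of this
sketch rather than a statement about libbpf:

```c
#include <stdlib.h>

struct map;

/* Callback type used to release private data attached to a map. */
typedef void (*map_clear_priv_t)(struct map *map, void *priv);

struct map {
	void *priv;
	map_clear_priv_t clear_priv;
};

/* Attach new private data; the old data is released via its own callback. */
static void map__set_private(struct map *map, void *priv,
			     map_clear_priv_t clear_priv)
{
	if (map->priv && map->clear_priv)
		map->clear_priv(map, map->priv);

	map->priv = priv;
	map->clear_priv = clear_priv;
}

static int nr_cleared;	/* counts callback invocations, for illustration */

/* Named after the callback type it implements, hence "clear". */
static void my_priv__clear(struct map *map, void *priv)
{
	(void)map;	/* unused in this sketch */
	free(priv);
	nr_cleared++;
}
```

The callback's name follows the type it is registered as, which is the
reason given above for keeping "clear" in it.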

I'll try another way to make things clear. Please see next version.

Thank you.

^ permalink raw reply	[flat|nested] 79+ messages in thread

end of thread, other threads:[~2016-02-04 13:01 UTC | newest]

Thread overview: 79+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-25  9:55 [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
2016-01-25  9:55 ` [PATCH 01/54] perf test: Add libbpf relocation checker Wang Nan
2016-01-26 14:58   ` Arnaldo Carvalho de Melo
2016-01-26 15:07     ` Arnaldo Carvalho de Melo
2016-02-03 10:13   ` [tip:perf/core] " tip-bot for Wang Nan
2016-01-25  9:55 ` [PATCH 02/54] perf bpf: Check relocation target section Wang Nan
2016-02-03 10:14   ` [tip:perf/core] " tip-bot for Wang Nan
2016-01-25  9:55 ` [PATCH 03/54] tools build: Allow subprojects select all feature checkers Wang Nan
2016-02-03 10:14   ` [tip:perf/core] " tip-bot for Wang Nan
2016-01-25  9:55 ` [PATCH 04/54] perf build: Select all feature checkers for feature-dump Wang Nan
2016-02-03 10:14   ` [tip:perf/core] " tip-bot for Wang Nan
2016-01-25  9:55 ` [PATCH 05/54] perf build: Use feature dump file for build-test Wang Nan
2016-01-26 16:59   ` Arnaldo Carvalho de Melo
2016-01-27  2:36     ` Wangnan (F)
2016-01-27 13:54       ` Arnaldo Carvalho de Melo
2016-01-27 11:22     ` [PATCH] tools build: Check basic headers for test-compile feature checker Wang Nan
2016-01-27 13:23       ` Jiri Olsa
2016-01-27 13:55         ` Arnaldo Carvalho de Melo
2016-02-03 10:15       ` [tip:perf/core] " tip-bot for Wang Nan
2016-01-25  9:55 ` [PATCH 06/54] perf test: Check environment before start real BPF test Wang Nan
2016-02-03 10:18   ` [tip:perf/core] " tip-bot for Wang Nan
2016-01-25  9:55 ` [PATCH 07/54] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
2016-01-25  9:55 ` [PATCH 08/54] perf test: Improve bp_signal Wang Nan
2016-02-03 10:18   ` [tip:perf/core] " tip-bot for Wang Nan
2016-01-25  9:55 ` [PATCH 09/54] perf tools: Add API to config maps in bpf object Wang Nan
2016-02-03 23:29   ` Arnaldo Carvalho de Melo
2016-02-04 12:59     ` Wangnan (F)
2016-01-25  9:55 ` [PATCH 10/54] perf tools: Enable BPF object configure syntax Wang Nan
2016-01-25  9:55 ` [PATCH 11/54] perf record: Apply config to BPF objects before recording Wang Nan
2016-01-25  9:55 ` [PATCH 12/54] perf tools: Enable passing event to BPF object Wang Nan
2016-01-25  9:56 ` [PATCH 13/54] perf tools: Support perf event alias name Wang Nan
2016-02-03 23:35   ` Arnaldo Carvalho de Melo
2016-01-25  9:56 ` [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately Wang Nan
2016-01-25  9:56 ` [PATCH 15/54] perf tools: Enable indices setting syntax for BPF maps Wang Nan
2016-01-25  9:56 ` [PATCH 16/54] perf tools: Introduce bpf-output event Wang Nan
2016-01-25  9:56 ` [PATCH 17/54] perf data: Support converting data from bpf_perf_event_output() Wang Nan
2016-01-25  9:56 ` [PATCH 18/54] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
2016-01-25  9:56 ` [PATCH 19/54] perf core: Set event's default overflow_handler Wang Nan
2016-01-25  9:56 ` [PATCH 20/54] perf core: Prepare writing into ring buffer from end Wang Nan
2016-01-25  9:56 ` [PATCH 21/54] perf core: Add backward attribute to perf event Wang Nan
2016-01-25  9:56 ` [PATCH 22/54] perf core: Reduce perf event output overhead by new overflow handler Wang Nan
2016-01-25  9:56 ` [PATCH 23/54] perf tools: Introduce API to pause ring buffer Wang Nan
2016-01-25  9:56 ` [PATCH 24/54] perf tools: Only validate is_pos for tracking evsels Wang Nan
2016-01-25  9:56 ` [PATCH 25/54] perf tools: Print write_backward value in perf_event_attr__fprintf Wang Nan
2016-01-25  9:56 ` [PATCH 26/54] perf tools: Move timestamp creation to util Wang Nan
2016-02-03 10:18   ` [tip:perf/core] " tip-bot for Wang Nan
2016-01-25  9:56 ` [PATCH 27/54] perf tools: Make ordered_events reusable Wang Nan
2016-01-25  9:56 ` [PATCH 28/54] perf record: Extract synthesize code to record__synthesize() Wang Nan
2016-01-29 20:37   ` Arnaldo Carvalho de Melo
2016-01-25  9:56 ` [PATCH 29/54] perf tools: Add perf_data_file__switch() helper Wang Nan
2016-01-25  9:56 ` [PATCH 30/54] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
2016-01-25  9:56 ` [PATCH 31/54] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
2016-01-25  9:56 ` [PATCH 32/54] perf record: Use OPT_BOOLEAN_SET for buildid cache related options Wang Nan
2016-02-03 10:19   ` [tip:perf/core] " tip-bot for Wang Nan
2016-01-25  9:56 ` [PATCH 33/54] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
2016-01-25  9:56 ` [PATCH 34/54] perf record: Split output into multiple files via '--switch-output' Wang Nan
2016-01-25  9:56 ` [PATCH 35/54] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
2016-01-25  9:56 ` [PATCH 36/54] perf record: Disable buildid cache options by default in switch output mode Wang Nan
2016-01-25  9:56 ` [PATCH 37/54] perf record: Re-synthesize tracking events after output switching Wang Nan
2016-01-25  9:56 ` [PATCH 38/54] perf record: Generate tracking events for process forked by perf Wang Nan
2016-01-25  9:56 ` [PATCH 39/54] perf record: Ensure return non-zero rc when mmap fail Wang Nan
2016-01-25  9:56 ` [PATCH 40/54] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
2016-01-25  9:56 ` [PATCH 41/54] perf tools: Add evlist channel helpers Wang Nan
2016-01-25  9:56 ` [PATCH 42/54] perf tools: Automatically add new channel according to evlist Wang Nan
2016-01-25  9:56 ` [PATCH 43/54] perf tools: Operate multiple channels Wang Nan
2016-01-25  9:56 ` [PATCH 44/54] perf tools: Squash overwrite setting into channel Wang Nan
2016-01-25  9:56 ` [PATCH 45/54] perf record: Don't read from and poll overwrite channel Wang Nan
2016-01-25  9:56 ` [PATCH 46/54] perf record: Don't poll on " Wang Nan
2016-01-25  9:56 ` [PATCH 47/54] perf tools: Detect avalibility of write_backward Wang Nan
2016-01-25  9:56 ` [PATCH 48/54] perf tools: Enable overwrite settings Wang Nan
2016-01-25  9:56 ` [PATCH 49/54] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
2016-01-25  9:56 ` [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading Wang Nan
2016-01-26  8:25   ` Wangnan (F)
2016-01-25  9:56 ` [PATCH 51/54] perf record: Rename variable to make code clear Wang Nan
2016-01-25  9:56 ` [PATCH 52/54] perf record: Read from backward ring buffer Wang Nan
2016-01-25  9:56 ` [PATCH 53/54] perf record: Allow generate tracking events at the end of output Wang Nan
2016-01-25  9:56 ` [PATCH 54/54] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
2016-01-26  9:11 ` [offlist] Re: [GIT PULL 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wangnan (F)
2016-01-26 14:11   ` Arnaldo Carvalho de Melo
