* [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support
@ 2016-02-05 14:01 Wang Nan
  2016-02-05 14:01 ` [PATCH 01/54] perf tools: Fix dangling pointers in parse_events__free_terms Wang Nan
                   ` (53 more replies)
  0 siblings, 54 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Hi Arnaldo,

The following changes since commit 89fee59b504f86925894fcc9ba79d5c933842f93:

  perf tools: handle spaces in file names obtained from /proc/pid/maps (2016-02-05 09:39:56 -0300)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/pi3orama/linux.git tags/perf-core-for-acme

for you to fetch changes up to f9c0effb2643da8f7712936df46444e260b87dc1:

  perf tools: Don't warn about out of order event if write_backward is used (2016-02-05 13:44:29 +0000)

----------------------------------------------------------------
perf improvements:

 - Code cleanup, based on Arnaldo's suggestions.

 - Remove '-e evt=cycles' syntax. Enforce existing 'cycles/name=evt/'
   syntax.

 - Forbid passing improper config terms to 'perf stat', e.g.
   'perf stat -e cycles/no-inherit/ ...'

Signed-off-by: Wang Nan <wangnan0@huawei.com>

----------------------------------------------------------------
Wang Nan (54):
      perf tools: Fix dangling pointers in parse_events__free_terms
      perf tools: Fix symbols searching for offline module in buildid-cache
      perf tools: Record text offset in dso to calculate objdump address
      perf tools: Adjust symbol for shared objects
      perf data: Fix releasing event_class
      perf tools: Add API to config maps in bpf object
      perf tools: Enable BPF object configure syntax
      perf record: Apply config to BPF objects before recording
      perf tools: Enable passing event to BPF object
      perf stat: Forbid user passing improper config terms
      perf tools: Rename and move pmu_event_name to get_config_name
      perf tools: Enable config raw and numeric events
      perf tools: Enable config and setting names for legacy cache events
      perf tools: Support setting different slots in a BPF map separately
      perf tools: Enable indices setting syntax for BPF maps
      perf tools: Pass tracepoint options to BPF script
      perf tools: Introduce bpf-output event
      perf data: Support converting data from bpf_perf_event_output()
      perf core: Introduce new ioctl options to pause and resume ring buffer
      perf core: Set event's default overflow_handler
      perf core: Prepare writing into ring buffer from end
      perf core: Add backward attribute to perf event
      perf core: Reduce perf event output overhead by new overflow handler
      perf tools: Only validate is_pos for tracking evsels
      perf tools: Print write_backward value in perf_event_attr__fprintf
      perf tools: Make ordered_events reusable
      perf record: Extract synthesize code to record__synthesize()
      perf tools: Add perf_data_file__switch() helper
      perf record: Turns auxtrace_snapshot_enable into 3 states
      perf record: Introduce record__finish_output() to finish a perf.data
      perf record: Add '--timestamp-filename' option to append timestamp to output filename
      perf record: Split output into multiple files via '--switch-output'
      perf record: Force enable --timestamp-filename when --switch-output is provided
      perf record: Disable buildid cache options by default in switch output mode
      perf record: Re-synthesize tracking events after output switching
      perf record: Generate tracking events for process forked by perf
      perf record: Ensure return non-zero rc when mmap fail
      perf record: Prevent reading invalid data in record__mmap_read
      perf tools: Add evlist channel helpers
      perf tools: Automatically add new channel according to evlist
      perf tools: Operate multiple channels
      perf tools: Squash overwrite setting into channel
      perf record: Don't read from and poll overwrite channel
      perf record: Don't poll on overwrite channel
      perf tools: Detect avalibility of write_backward
      perf tools: Enable overwrite settings
      perf tools: Set write_backward attribut bit for overwrite events
      perf tools: Record fd into perf_mmap
      perf tools: Add API to pause a channel
      perf record: Toggle overwrite ring buffer for reading
      perf record: Rename variable to make code clear
      perf record: Read from backward ring buffer
      perf record: Allow generate tracking events at the end of output
      perf tools: Don't warn about out of order event if write_backward is used

 include/linux/perf_event.h        |  22 +-
 include/uapi/linux/perf_event.h   |   4 +-
 kernel/events/core.c              |  73 +++-
 kernel/events/internal.h          |  11 +
 kernel/events/ring_buffer.c       |  63 +++-
 tools/perf/builtin-record.c       | 598 ++++++++++++++++++++++++++-----
 tools/perf/builtin-stat.c         |   1 +
 tools/perf/perf.h                 |   2 +
 tools/perf/tests/bpf.c            |   2 +-
 tools/perf/util/bpf-loader.c      | 719 ++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h      |  59 ++++
 tools/perf/util/build-id.c        |  44 +++
 tools/perf/util/build-id.h        |   1 +
 tools/perf/util/data-convert-bt.c | 130 ++++++-
 tools/perf/util/data.c            |  36 ++
 tools/perf/util/data.h            |  11 +-
 tools/perf/util/dso.h             |   1 +
 tools/perf/util/evlist.c          | 355 ++++++++++++++++---
 tools/perf/util/evlist.h          |  70 +++-
 tools/perf/util/evsel.c           |  23 ++
 tools/perf/util/evsel.h           |  11 +
 tools/perf/util/map.c             |  14 +
 tools/perf/util/ordered-events.c  |   5 +
 tools/perf/util/parse-events.c    | 267 ++++++++++++--
 tools/perf/util/parse-events.h    |  28 +-
 tools/perf/util/parse-events.l    |  18 +-
 tools/perf/util/parse-events.y    | 134 ++++++-
 tools/perf/util/record.c          |  11 +
 tools/perf/util/session.c         |  22 +-
 tools/perf/util/symbol-elf.c      |  25 +-
 tools/perf/util/symbol.c          |   4 +
 31 files changed, 2532 insertions(+), 232 deletions(-)

-- 
1.8.3.4


* [PATCH 01/54] perf tools: Fix dangling pointers in parse_events__free_terms
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-16  7:53   ` [tip:perf/core] perf tools: Unlink entries from terms list tip-bot for Wang Nan
  2016-02-16  7:54   ` [tip:perf/core] perf tools: Free the terms list_head in parse_events__free_terms() tip-bot for Wang Nan
  2016-02-05 14:01 ` [PATCH 02/54] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
                   ` (52 subsequent siblings)
  53 siblings, 2 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

We expect parse_events__free_terms() to destroy the full config term
list. However, the current code frees the terms without unlinking
them and never frees the list head, which is left pointing to an
invalidated linked list.
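
As an illustration only, the teardown pattern can be sketched in
standalone C. The list helpers below are hypothetical stand-ins for
the kernel-style list.h macros perf uses, and the functions return a
count purely so the sketch is easy to check; the real function is void.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel-style list helpers used by perf. */
struct list_node {
	struct list_node *prev, *next;
};

struct term {
	struct list_node list;	/* kept first so a node pointer can be cast back */
	int value;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_unlink(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

/* Unlink and free every entry so the head is left empty, not dangling. */
static int terms_purge(struct list_node *head)
{
	int freed = 0;

	assert(head != NULL);
	while (head->next != head) {
		struct term *t = (struct term *)head->next;

		list_unlink(&t->list);
		free(t);
		freed++;
	}
	return freed;
}

/* Full teardown of a heap-allocated head; a NULL head is tolerated. */
static int terms_free(struct list_node *head)
{
	int freed;

	if (!head)
		return 0;
	freed = terms_purge(head);
	free(head);
	return freed;
}
```

Unlinking each entry before freeing it means the head never points at
freed memory, even if the caller keeps using the head afterwards.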

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 813d9b2..e8b2d85 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2072,8 +2072,15 @@ void parse_events__free_terms(struct list_head *terms)
 {
 	struct parse_events_term *term, *h;
 
-	list_for_each_entry_safe(term, h, terms, list)
+	if (!terms)
+		return;
+
+	list_for_each_entry_safe(term, h, terms, list) {
+		list_del(&term->list);
 		free(term);
+	}
+
+	free(terms);
 }
 
 void parse_events_evlist_error(struct parse_events_evlist *data,
-- 
1.8.3.4


* [PATCH 02/54] perf tools: Fix symbols searching for offline module in buildid-cache
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
  2016-02-05 14:01 ` [PATCH 01/54] perf tools: Fix dangling pointers in parse_events__free_terms Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-16  7:52   ` [tip:perf/core] perf symbols: Fix symbols searching for " tip-bot for Wang Nan
  2016-02-05 14:01 ` [PATCH 03/54] perf tools: Record text offset in dso to calculate objdump address Wang Nan
                   ` (51 subsequent siblings)
  53 siblings, 1 reply; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Before this patch, if a sample is triggered inside an offline module
(a module not in /lib/modules/`uname -r`/), 'perf report' is unable to
get the correct symbol even if the module is in the buildid-cache.
For example:

 # rm -rf ~/.debug/
 # perf buildid-cache -a ./mymodule.ko
 # perf probe -m ./mymodule.ko -a get_mymodule_val
 Added new event:
   probe:get_mymodule_val (on get_mymodule_val in mymodule)

 You can now use it in all perf tools, such as:

 	perf record -e probe:get_mymodule_val -aR sleep 1

 # perf record -e probe:get_mymodule_val cat /proc/mymodule
 mymodule:3
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]

 # perf report --stdio
 [SNIP]
 #
 # Overhead  Command  Shared Object     Symbol
 # ........  .......  ................  ......................
 #
    100.00%  cat      [mymodule]        [k] 0x0000000000000001

 # perf report -vvvv --stdio
 dso__load_sym: adjusting symbol: st_value: 0 sh_addr: 0 sh_offset: 0x70
 symbol__new: get_mymodule_val 0x70-0x8a
 [SNIP]

This is caused by dso__load() -> dso__load_sym(). In dso__load(), kmod
is true only when the dso is a regular kernel module. All files loaded
from the buildid-cache are treated as user programs, so the following
dso__load_sym() sets map->pgoff incorrectly.

This patch gives kernel modules in the buildid-cache a chance to adjust
the value of kmod. After dso__load() gets the type of the symbol
source, if it is the buildid-cache, check the last 3 chars of the
original filename against '.ko', and set kmod if the file is a kernel
module.
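
A minimal sketch of the '.ko' suffix check, assuming the link target
has the '<original path>/<build-id>' form shown in the comment in the
diff below. The helper name link_target_is_kmod() is hypothetical and
is not part of perf.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true when the path component just before the final '/'
 * ends in ".ko", e.g. ".../nf_nat_ipv4.ko/<build-id>".
 */
static bool link_target_is_kmod(const char *target)
{
	const char *slash;

	assert(target != NULL);
	slash = strrchr(target, '/');
	/* Need at least three characters in front of the last slash. */
	if (!slash || slash - target < 3)
		return false;
	return strncmp(slash - 3, ".ko", 3) == 0;
}
```

The length check comes first so that the comparison never reads
before the start of the string.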

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
---
 tools/perf/util/build-id.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/build-id.h |  1 +
 tools/perf/util/symbol.c   |  4 ++++
 3 files changed, 49 insertions(+)

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index b28100e..f1479ee 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -166,6 +166,50 @@ char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size)
 	return build_id__filename(build_id_hex, bf, size);
 }
 
+bool dso__build_id_is_kmod(const struct dso *dso, char *bf, size_t size)
+{
+	char *id_name, *ch;
+	struct stat sb;
+
+	id_name = dso__build_id_filename(dso, bf, size);
+	if (!id_name)
+		goto err;
+	if (access(id_name, F_OK))
+		goto err;
+	if (lstat(id_name, &sb) == -1)
+		goto err;
+	if ((size_t)sb.st_size > size - 1)
+		goto err;
+	if (readlink(id_name, bf, size - 1) < 0)
+		goto err;
+
+	bf[sb.st_size] = '\0';
+
+	/*
+	 * link should be:
+	 * ../../lib/modules/4.4.0-rc4/kernel/net/ipv4/netfilter/nf_nat_ipv4.ko/a09fe3eb3147dafa4e3b31dbd6257e4d696bdc92
+	 */
+	ch = strrchr(bf, '/');
+	if (!ch)
+		goto err;
+	if (ch - 3 < bf)
+		goto err;
+
+	return strncmp(".ko", ch - 3, 3) == 0;
+err:
+	/*
+	 * If dso__build_id_filename work, get id_name again,
+	 * because id_name points to bf and is broken.
+	 */
+	if (id_name)
+		id_name = dso__build_id_filename(dso, bf, size);
+	pr_err("Invalid build id: %s\n", id_name ? :
+					 dso->long_name ? :
+					 dso->short_name ? :
+					 "[unknown]");
+	return false;
+}
+
 #define dsos__for_each_with_build_id(pos, head)	\
 	list_for_each_entry(pos, head, node)	\
 		if (!pos->has_build_id)		\
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 27a14a8..64af3e2 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -16,6 +16,7 @@ int sysfs__sprintf_build_id(const char *root_dir, char *sbuild_id);
 int filename__sprintf_build_id(const char *pathname, char *sbuild_id);
 
 char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size);
+bool dso__build_id_is_kmod(const struct dso *dso, char *bf, size_t size);
 
 int build_id__mark_dso_hit(struct perf_tool *tool, union perf_event *event,
 			   struct perf_sample *sample, struct perf_evsel *evsel,
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 90cedfa..e7588dc 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1529,6 +1529,10 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 	if (!runtime_ss && syms_ss)
 		runtime_ss = syms_ss;
 
+	if (syms_ss && syms_ss->type == DSO_BINARY_TYPE__BUILD_ID_CACHE)
+		if (dso__build_id_is_kmod(dso, name, PATH_MAX))
+			kmod = true;
+
 	if (syms_ss)
 		ret = dso__load_sym(dso, map, syms_ss, runtime_ss, filter, kmod);
 	else
-- 
1.8.3.4


* [PATCH 03/54] perf tools: Record text offset in dso to calculate objdump address
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
  2016-02-05 14:01 ` [PATCH 01/54] perf tools: Fix dangling pointers in parse_events__free_terms Wang Nan
  2016-02-05 14:01 ` [PATCH 02/54] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 04/54] perf tools: Adjust symbol for shared objects Wang Nan
                   ` (50 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

In this patch, the offset of the '.text' section is stored in the dso
and used to recalculate the address passed to objdump.

In most cases, executable code is in the '.text' section, so the
adjustment made to a symbol in dso__load_sym() (using
sym.st_value -= shdr.sh_addr - shdr.sh_offset) should be equal to
'sym.st_value -= dso->text_offset'. Therefore, adding text_offset back
yields the objdump address from the symbol address (rip). However, this
is not true for the kernel and kernel modules, since there can be
multiple executable sections with different offsets. Exclude the kernel
for this reason.

After this patch, even if dso->adjust_symbols is set to true for shared
objects, map__rip_2objdump() and map__objdump_2mem() return the
correct result, so the behavior of perf annotate won't change.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/dso.h        |  1 +
 tools/perf/util/map.c        | 14 ++++++++++++++
 tools/perf/util/symbol-elf.c | 12 ++++++------
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 45ec4d0..ef3dbc9 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -162,6 +162,7 @@ struct dso {
 	u8		 loaded;
 	u8		 rel;
 	u8		 build_id[BUILD_ID_SIZE];
+	u64		 text_offset;
 	const char	 *short_name;
 	const char	 *long_name;
 	u16		 long_name_len;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 171b6d1..02c3186 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -431,6 +431,13 @@ u64 map__rip_2objdump(struct map *map, u64 rip)
 	if (map->dso->rel)
 		return rip - map->pgoff;
 
+	/*
+	 * kernel modules also have DSO_TYPE_USER in dso->kernel,
+	 * but all kernel modules are ET_REL, so won't get here.
+	 */
+	if (map->dso->kernel == DSO_TYPE_USER)
+		return rip + map->dso->text_offset;
+
 	return map->unmap_ip(map, rip) - map->reloc;
 }
 
@@ -454,6 +461,13 @@ u64 map__objdump_2mem(struct map *map, u64 ip)
 	if (map->dso->rel)
 		return map->unmap_ip(map, ip + map->pgoff);
 
+	/*
+	 * kernel modules also have DSO_TYPE_USER in dso->kernel,
+	 * but all kernel modules are ET_REL, so won't get here.
+	 */
+	if (map->dso->kernel == DSO_TYPE_USER)
+		return map->unmap_ip(map, ip - map->dso->text_offset);
+
 	return ip + map->reloc;
 }
 
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 562b8eb..5227186 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -792,6 +792,7 @@ int dso__load_sym(struct dso *dso, struct map *map,
 	uint32_t idx;
 	GElf_Ehdr ehdr;
 	GElf_Shdr shdr;
+	GElf_Shdr tshdr;
 	Elf_Data *syms, *opddata = NULL;
 	GElf_Sym sym;
 	Elf_Scn *sec, *sec_strndx;
@@ -831,6 +832,9 @@ int dso__load_sym(struct dso *dso, struct map *map,
 	sec = syms_ss->symtab;
 	shdr = syms_ss->symshdr;
 
+	if (elf_section_by_name(elf, &ehdr, &tshdr, ".text", NULL))
+		dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
+
 	if (runtime_ss->opdsec)
 		opddata = elf_rawdata(runtime_ss->opdsec, NULL);
 
@@ -879,12 +883,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
 	 * Handle any relocation of vdso necessary because older kernels
 	 * attempted to prelink vdso to its virtual address.
 	 */
-	if (dso__is_vdso(dso)) {
-		GElf_Shdr tshdr;
-
-		if (elf_section_by_name(elf, &ehdr, &tshdr, ".text", NULL))
-			map->reloc = map->start - tshdr.sh_addr + tshdr.sh_offset;
-	}
+	if (dso__is_vdso(dso))
+		map->reloc = map->start - dso->text_offset;
 
 	dso->adjust_symbols = runtime_ss->adjust_symbols || ref_reloc(kmap);
 	/*
-- 
1.8.3.4


* [PATCH 04/54] perf tools: Adjust symbol for shared objects
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (2 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 03/54] perf tools: Record text offset in dso to calculate objdump address Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 05/54] perf data: Fix releasing event_class Wang Nan
                   ` (49 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

He Kuang reported in [1] that perf fails to get the correct symbol on
the Android platform. The problem can be reproduced on a normal x86_64
platform. The reproduction steps are described in detail at the end of
this commit message.

The cause of this problem is missing symbol adjustment for normal
shared objects. In most cases it works correctly, but when the
'.text' section has a different 'address' and 'offset' the result is
wrong. I checked all shared objects on my working platform; only wine
dll objects and debug objects (in .debug) have this problem. However,
it is common on Android. For example:

 $ readelf -S ./libsurfaceflinger.so | grep \.text
   [10] .text             PROGBITS         0000000000029030  00012030

This patch enables symbol adjustment for dynamic objects so the symbol
address obtained from elfutils is adjusted correctly.

Now nearly all types of ELF file should adjust symbols, so make
ss->adjust_symbols default to true.

Steps to reproduce the problem:

 $ cat ./Makefile
PWD := $(shell pwd)
LDFLAGS += "-Wl,-rpath=$(PWD)"
CFLAGS += -g
main: main.c libbuggy.so
libbuggy.so: buggy.c
	gcc -g -shared -fPIC -Wl,-Ttext-segment=0x200000 $< -o $@
clean:
	rm -rf main libbuggy.so *.o

 $ cat ./buggy.c
 int fib(int x)
 {
     return (x == 0) ? 1 : (x == 1) ? 1 : fib(x - 1) + fib(x - 2);
 }

 $ cat ./main.c
 #include <stdio.h>

 extern int fib(int x);
 int main()
 {
     int i;

     for (i = 0; i < 40; i++)
         printf("%d\n", fib(i));
     return 0;
 }

 $ make
 $ perf record ./main
 ...
 $ perf report --stdio
 # Overhead  Command  Shared Object      Symbol
 # ........  .......  .................  ...............................
 #
     14.97%  main     libbuggy.so        [.] 0x000000000000066c
      8.68%  main     libbuggy.so        [.] 0x00000000000006aa
      8.52%  main     libbuggy.so        [.] fib@plt
      7.95%  main     libbuggy.so        [.] 0x0000000000000664
      5.94%  main     libbuggy.so        [.] 0x00000000000006a9
      5.35%  main     libbuggy.so        [.] 0x0000000000000678
 ...

The correct result should be (after this patch):

 # Overhead  Command  Shared Object      Symbol
 # ........  .......  .................  ...............................
 #
     91.47%  main     libbuggy.so        [.] fib
      8.52%  main     libbuggy.so        [.] fib@plt
      0.00%  main     [kernel.kallsyms]  [k] kmem_cache_free

[1] http://lkml.kernel.org/g/1452567507-54013-1-git-send-email-hekuang@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/symbol-elf.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 5227186..6a9a53a 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -708,17 +708,10 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
 	if (ss->opdshdr.sh_type != SHT_PROGBITS)
 		ss->opdsec = NULL;
 
-	if (dso->kernel == DSO_TYPE_USER) {
-		GElf_Shdr shdr;
-		ss->adjust_symbols = (ehdr.e_type == ET_EXEC ||
-				ehdr.e_type == ET_REL ||
-				dso__is_vdso(dso) ||
-				elf_section_by_name(elf, &ehdr, &shdr,
-						     ".gnu.prelink_undo",
-						     NULL) != NULL);
-	} else {
+	if (dso->kernel == DSO_TYPE_USER)
+		ss->adjust_symbols = true;
+	else
 		ss->adjust_symbols = elf__needs_adjust_symbols(ehdr);
-	}
 
 	ss->name   = strdup(name);
 	if (!ss->name) {
-- 
1.8.3.4


* [PATCH 05/54] perf data: Fix releasing event_class
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (3 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 04/54] perf tools: Adjust symbol for shared objects Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
       [not found]   ` <20160211220413.GF32168@kernel.org>
  2016-02-16  7:55   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-02-05 14:01 ` [PATCH 06/54] perf tools: Add API to config maps in bpf object Wang Nan
                   ` (48 subsequent siblings)
  53 siblings, 2 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

A new patch to libbabeltrace [1] reveals an object leak in perf data
CTF support: perf code never releases the event_class which is
allocated in add_event() and stored in the evsel's private field.

If libbabeltrace has the above patch applied, the leaked event_class
prevents the writer from being destroyed and flushing its metadata.
For example:

 $ ./perf record ls
 Lowering default frequency rate to 500.
 Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
 perf.data
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data (12 samples) ]
 $ ./perf data convert --to-ctf ./out.ctf
 [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
 [ perf data convert: Converted and wrote 0.000 MB (12 samples) ]
 $ cat ./out.ctf/metadata
 $ ls -l  ./out.ctf/metadata
 -rw-r----- 1 w00229757 mm 0 Jan 27 10:49 ./out.ctf/metadata

The correct result should be:
 ...
 $ cat ./out.ctf/metadata
 /* CTF 1.8 */

 trace {
 [SNIP]

 $ ls -l  ./out.ctf/metadata
 -rw-r----- 1 w00229757 mm 2446 Jan 27 10:52 ./out.ctf/metadata

The full story is:

 Patch [1] of babeltrace redesigns the reference counting scheme. In
 that patch:

  * writer <- trace (bt_ctf_writer_create)
  * trace <- stream_class (bt_ctf_trace_add_stream_class)
  * stream_class <- event_class (bt_ctf_stream_class_add_event_class)
  ('<-' means 'is a parent of')

  Holding an event_class increases the reference count of the
  corresponding 'writer' through the parent chain. Perf expects the
  'writer' to be released (so metadata is flushed) through
  bt_ctf_writer_put() in ctf_writer__cleanup(). However, since it
  never releases the event_class, the reference count of the 'writer'
  is never reduced, so bt_ctf_writer_put() does not release the
  writer.

 Before this CTF patch, !(writer <- trace). Even if the event_class
 leaks, the writer can still be released.
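
The effect of the parent chain can be illustrated with a toy
reference-counting model. This is a sketch under simplifying
assumptions (each child holds exactly one reference on its parent);
'struct object' and its helpers are hypothetical and are not the
libbabeltrace API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy object; not the libbabeltrace API. */
struct object {
	int refs;
	int released;		/* set once refs drops to zero */
	struct object *parent;	/* a child holds one reference on its parent */
};

static void object_get(struct object *o)
{
	o->refs++;
}

/* Drop one reference; the last put releases the object and its hold on the parent. */
static void object_put(struct object *o)
{
	assert(o->refs > 0);
	if (--o->refs > 0)
		return;
	o->released = 1;
	if (o->parent)
		object_put(o->parent);
}

/* Attach child to parent: the child pins the parent until it is released. */
static void object_set_parent(struct object *child, struct object *parent)
{
	child->parent = parent;
	object_get(parent);
}
```

In this model, as long as one event_class reference is leaked, putting
the writer only drops its count to one; the writer is released only
after the event_class is put as well.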

[1] https://github.com/efficios/babeltrace/commit/e6a8e8e4744633807083a077ff9f101eb97d9801

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data-convert-bt.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 34cd1e4..b722e57 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -858,6 +858,23 @@ static int setup_events(struct ctf_writer *cw, struct perf_session *session)
 	return 0;
 }
 
+static void cleanup_events(struct perf_session *session)
+{
+	struct perf_evlist *evlist = session->evlist;
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		struct evsel_priv *priv;
+
+		priv = evsel->priv;
+		bt_ctf_event_class_put(priv->event_class);
+		zfree(&evsel->priv);
+	}
+
+	perf_evlist__delete(evlist);
+	session->evlist = NULL;
+}
+
 static int setup_streams(struct ctf_writer *cw, struct perf_session *session)
 {
 	struct ctf_stream **stream;
@@ -1171,6 +1188,7 @@ int bt_convert__perf2ctf(const char *input, const char *path, bool force)
 		(double) c.events_size / 1024.0 / 1024.0,
 		c.events_count);
 
+	cleanup_events(session);
 	perf_session__delete(session);
 	ctf_writer__cleanup(cw);
 
-- 
1.8.3.4


* [PATCH 06/54] perf tools: Add API to config maps in bpf object
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (4 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 05/54] perf data: Fix releasing event_class Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 07/54] perf tools: Enable BPF object configure syntax Wang Nan
                   ` (47 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

bpf__config_obj() is introduced as a core API to configure a BPF object
after loading. One configuration option for maps is introduced. After
this patch a BPF object can accept configuration like:

 maps:my_map.value=1234

(maps.my_map.value looks prettier. However, there's a small but
hard-to-fix problem related to flex's greedy matching; please see [1].
':' is chosen to avoid it in a simpler way.)
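
To make the shape of such a term concrete, here is a minimal sketch
that splits it with sscanf(). This is a substitution made for
illustration: perf parses event terms with its flex/bison grammar, and
'struct map_config' and parse_map_term() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct map_config {
	char map[64];			/* e.g. "my_map" */
	char option[64];		/* e.g. "value" */
	unsigned long long value;	/* e.g. 1234 */
};

/* Return 0 on success, -1 if the term is not maps:<map>.<option>=<number>. */
static int parse_map_term(const char *term, struct map_config *cfg)
{
	assert(term != NULL && cfg != NULL);
	memset(cfg, 0, sizeof(*cfg));
	if (sscanf(term, "maps:%63[^.].%63[^=]=%llu",
		   cfg->map, cfg->option, &cfg->value) != 3)
		return -1;
	return 0;
}
```

Calling parse_map_term("maps:my_map.value=1234", &cfg) fills in the
map name, the option and the number; a term without the 'maps:' prefix
is rejected.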

This patch is more complex than the work it actually does because it
is designed for extension. In designing BPF map configuration, the
following things should be considered:

 1. Array index selection: perf should allow the user to set different
    values for different slots in an array, with syntax like:
    maps:my_map.value[0,3...6]=1234;

 2. A map can be configured by different config terms, each for a
    part of it. For example, set each slot to the pid of a thread;

 3. Type of value: integer is not the only valid value type. A perf
    event can also be put into a map after commit 35578d7984003097af2b1e3
    (bpf: Implement function bpf_perf_event_read() that get the selected
    hardware PMU conuter);

 4. For hash tables, it is possible to use a string or another type
    as the key;

 5. It is possible that a map configuration cannot be set up during
    parsing. A perf event is an example.

Therefore, this patch does the following:

 1. Instead of updating map elements during parsing, this patch stores
    map config options in 'struct bpf_map_priv'. Following patches
    will apply those configs at the proper time;

 2. Link map operations into a list so a map can have multiple config
    terms attached, so different parts can be configured separately;

 3. Make 'struct bpf_map_priv' extensible so following patches can
    add new types of keys and operations;

 4. Use the bpf_obj_config__map_funcs array to support more map config options.
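
The "record now, apply later" idea in points 1 and 2 can be sketched
as below. It is a simplified illustration only: 'struct map_op',
'struct map_priv' and both helpers are hypothetical, the map is
modelled as a plain array, and every op sets all slots.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical simplified types; perf's real structures differ. */
struct map_op {
	struct map_op *next;
	unsigned long long value;	/* value to store in every slot */
};

struct map_priv {
	struct map_op *head, *tail;	/* ops kept in the order they were parsed */
};

/* Parsing time: remember the request instead of touching the map. */
static int map_priv_add_op(struct map_priv *priv, unsigned long long value)
{
	struct map_op *op = calloc(1, sizeof(*op));

	if (!op)
		return -1;
	op->value = value;
	if (priv->tail)
		priv->tail->next = op;
	else
		priv->head = op;
	priv->tail = op;
	return 0;
}

/* Later, once the map exists: replay every op in order, then free the list. */
static void map_priv_apply(struct map_priv *priv,
			   unsigned long long *slots, size_t nr_slots)
{
	struct map_op *op = priv->head;

	assert(slots != NULL || nr_slots == 0);
	while (op) {
		struct map_op *next = op->next;

		for (size_t i = 0; i < nr_slots; i++)
			slots[i] = op->value;
		free(op);
		op = next;
	}
	priv->head = priv->tail = NULL;
}
```

Keeping the ops in a list means several config terms can target the
same map, and a later term simply overrides an earlier one when the
list is replayed.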

Since the patch changing the event parser to parse BPF object config
is relatively large, I put it in another commit. The code in this patch
can be tested after applying the next patch.

[1] http://lkml.kernel.org/g/564ED621.4050500@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c | 277 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h |  38 ++++++
 2 files changed, 315 insertions(+)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 540a7ef..91678f4 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -739,6 +739,262 @@ int bpf__foreach_tev(struct bpf_object *obj,
 	return 0;
 }
 
+enum bpf_map_op_type {
+	BPF_MAP_OP_SET_VALUE,
+};
+
+enum bpf_map_key_type {
+	BPF_MAP_KEY_ALL,
+};
+
+struct bpf_map_op {
+	struct list_head list;
+	enum bpf_map_op_type op_type;
+	enum bpf_map_key_type key_type;
+	union {
+		u64 value;
+	} v;
+};
+
+struct bpf_map_priv {
+	struct list_head ops_list;
+};
+
+static void
+bpf_map_op__delete(struct bpf_map_op *op)
+{
+	if (!list_empty(&op->list))
+		list_del(&op->list);
+	free(op);
+}
+
+static void
+bpf_map_priv__purge(struct bpf_map_priv *priv)
+{
+	struct bpf_map_op *pos, *n;
+
+	list_for_each_entry_safe(pos, n, &priv->ops_list, list) {
+		list_del_init(&pos->list);
+		bpf_map_op__delete(pos);
+	}
+}
+
+static void
+bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
+		    void *_priv)
+{
+	struct bpf_map_priv *priv = _priv;
+
+	bpf_map_priv__purge(priv);
+	free(priv);
+}
+
+static struct bpf_map_op *
+bpf_map_op__new(void)
+{
+	struct bpf_map_op *op;
+
+	op = zalloc(sizeof(*op));
+	if (!op) {
+		pr_debug("Failed to alloc bpf_map_op\n");
+		return ERR_PTR(-ENOMEM);
+	}
+	INIT_LIST_HEAD(&op->list);
+
+	op->key_type = BPF_MAP_KEY_ALL;
+	return op;
+}
+
+static int
+bpf_map__add_op(struct bpf_map *map, struct bpf_map_op *op)
+{
+	struct bpf_map_priv *priv;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+	err = bpf_map__get_private(map, (void **)&priv);
+	if (err) {
+		pr_debug("Failed to get private from map %s\n", map_name);
+		return err;
+	}
+
+	if (!priv) {
+		priv = zalloc(sizeof(*priv));
+		if (!priv) {
+			pr_debug("No enough memory to alloc map private\n");
+			return -ENOMEM;
+		}
+		INIT_LIST_HEAD(&priv->ops_list);
+
+		if (bpf_map__set_private(map, priv, bpf_map_priv__clear)) {
+			free(priv);
+			return -BPF_LOADER_ERRNO__INTERNAL;
+		}
+	}
+
+	list_add_tail(&op->list, &priv->ops_list);
+	return 0;
+}
+
+static int
+__bpf_map__config_value(struct bpf_map *map,
+			struct parse_events_term *term)
+{
+	struct bpf_map_def def;
+	struct bpf_map_op *op;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("Unable to get map definition from '%s'\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	if (def.type != BPF_MAP_TYPE_ARRAY) {
+		pr_debug("Map %s type is not BPF_MAP_TYPE_ARRAY\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+	}
+	if (def.key_size < sizeof(unsigned int)) {
+		pr_debug("Map %s has incorrect key size\n", map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE;
+	}
+	switch (def.value_size) {
+	case 1:
+	case 2:
+	case 4:
+	case 8:
+		break;
+	default:
+		pr_debug("Map %s has incorrect value size\n", map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
+	}
+
+	op = bpf_map_op__new();
+	if (IS_ERR(op))
+		return PTR_ERR(op);
+	op->op_type = BPF_MAP_OP_SET_VALUE;
+	op->v.value = term->val.num;
+
+	err = bpf_map__add_op(map, op);
+	if (err)
+		bpf_map_op__delete(op);
+	return err;
+}
+
+static int
+bpf_map__config_value(struct bpf_map *map,
+		      struct parse_events_term *term,
+		      struct perf_evlist *evlist __maybe_unused)
+{
+	if (!term->err_val) {
+		pr_debug("Config value not set\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
+	}
+
+	if (term->type_val != PARSE_EVENTS__TERM_TYPE_NUM) {
+		pr_debug("ERROR: wrong value type\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
+	}
+
+	return __bpf_map__config_value(map, term);
+}
+
+struct bpf_obj_config__map_func {
+	const char *config_opt;
+	int (*config_func)(struct bpf_map *, struct parse_events_term *,
+			   struct perf_evlist *);
+};
+
+struct bpf_obj_config__map_func bpf_obj_config__map_funcs[] = {
+	{"value", bpf_map__config_value},
+};
+
+static int
+bpf__obj_config_map(struct bpf_object *obj,
+		    struct parse_events_term *term,
+		    struct perf_evlist *evlist,
+		    int *key_scan_pos)
+{
+	/* key is "maps:<mapname>.<config opt>" */
+	char *map_name = strdup(term->config + sizeof("maps:") - 1);
+	struct bpf_map *map;
+	int err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
+	char *map_opt;
+	size_t i;
+
+	if (!map_name)
+		return -ENOMEM;
+
+	map_opt = strchr(map_name, '.');
+	if (!map_opt) {
+		pr_debug("ERROR: Invalid map config: %s\n", map_name);
+		goto out;
+	}
+
+	*map_opt++ = '\0';
+	if (*map_opt == '\0') {
+		pr_debug("ERROR: Invalid map option: %s\n", term->config);
+		goto out;
+	}
+
+	map = bpf_object__get_map_by_name(obj, map_name);
+	if (!map) {
+		pr_debug("ERROR: Map %s is not exist\n", map_name);
+		err = -BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST;
+		goto out;
+	}
+
+	*key_scan_pos += map_opt - map_name;
+	for (i = 0; i < ARRAY_SIZE(bpf_obj_config__map_funcs); i++) {
+		struct bpf_obj_config__map_func *func =
+				&bpf_obj_config__map_funcs[i];
+
+		if (strcmp(map_opt, func->config_opt) == 0) {
+			err = func->config_func(map, term, evlist);
+			goto out;
+		}
+	}
+
+	pr_debug("ERROR: invalid config option '%s' for maps\n",
+		 map_opt);
+	err = -BPF_LOADER_ERRNO__OBJCONF_MAP_OPT;
+out:
+	if (!err)
+		*key_scan_pos += strlen(map_opt);
+	free(map_name);
+	return err;
+}
+
+int bpf__config_obj(struct bpf_object *obj,
+		    struct parse_events_term *term,
+		    struct perf_evlist *evlist,
+		    int *error_pos)
+{
+	int key_scan_pos = 0;
+	int err;
+
+	if (!obj || !term || !term->config)
+		return -EINVAL;
+
+	if (!prefixcmp(term->config, "maps:")) {
+		key_scan_pos = sizeof("maps:") - 1;
+		err = bpf__obj_config_map(obj, term, evlist, &key_scan_pos);
+		goto out;
+	}
+	err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
+out:
+	if (error_pos)
+		*error_pos = key_scan_pos;
+	return err;
+
+}
+
 #define ERRNO_OFFSET(e)		((e) - __BPF_LOADER_ERRNO__START)
 #define ERRCODE_OFFSET(c)	ERRNO_OFFSET(BPF_LOADER_ERRNO__##c)
 #define NR_ERRNO	(__BPF_LOADER_ERRNO__END - __BPF_LOADER_ERRNO__START)
@@ -753,6 +1009,14 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(PROLOGUE)]	= "Failed to generate prologue",
 	[ERRCODE_OFFSET(PROLOGUE2BIG)]	= "Prologue too big for program",
 	[ERRCODE_OFFSET(PROLOGUEOOB)]	= "Offset out of bound for prologue",
+	[ERRCODE_OFFSET(OBJCONF_OPT)]	= "Invalid object config option",
+	[ERRCODE_OFFSET(OBJCONF_CONF)]	= "Config value not set (lost '=')",
+	[ERRCODE_OFFSET(OBJCONF_MAP_OPT)]	= "Invalid object maps config option",
+	[ERRCODE_OFFSET(OBJCONF_MAP_NOTEXIST)]	= "Target map not exist",
+	[ERRCODE_OFFSET(OBJCONF_MAP_VALUE)]	= "Incorrect value type for map",
+	[ERRCODE_OFFSET(OBJCONF_MAP_TYPE)]	= "Incorrect map type",
+	[ERRCODE_OFFSET(OBJCONF_MAP_KEYSIZE)]	= "Incorrect map key size",
+	[ERRCODE_OFFSET(OBJCONF_MAP_VALUESIZE)]	= "Incorrect map value size",
 };
 
 static int
@@ -872,3 +1136,16 @@ int bpf__strerror_load(struct bpf_object *obj,
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
+			     struct parse_events_term *term __maybe_unused,
+			     struct perf_evlist *evlist __maybe_unused,
+			     int *error_pos __maybe_unused, int err,
+			     char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,
+			    "Can't use this config term to this type of map");
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 6fdc045..2464db9 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -10,6 +10,7 @@
 #include <string.h>
 #include <bpf/libbpf.h>
 #include "probe-event.h"
+#include "evlist.h"
 #include "debug.h"
 
 enum bpf_loader_errno {
@@ -24,10 +25,19 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__PROLOGUE,	/* Failed to generate prologue */
 	BPF_LOADER_ERRNO__PROLOGUE2BIG,	/* Prologue too big for program */
 	BPF_LOADER_ERRNO__PROLOGUEOOB,	/* Offset out of bound for prologue */
+	BPF_LOADER_ERRNO__OBJCONF_OPT,	/* Invalid object config option */
+	BPF_LOADER_ERRNO__OBJCONF_CONF,	/* Config value not set (lost '=') */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_OPT,	/* Invalid object maps config option */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST,	/* Target map not exist */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE,	/* Incorrect value type for map */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,	/* Incorrect map type */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE,	/* Incorrect map key size */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE,/* Incorrect map value size */
 	__BPF_LOADER_ERRNO__END,
 };
 
 struct bpf_object;
+struct parse_events_term;
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
 typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
@@ -53,6 +63,14 @@ int bpf__strerror_load(struct bpf_object *obj, int err,
 		       char *buf, size_t size);
 int bpf__foreach_tev(struct bpf_object *obj,
 		     bpf_prog_iter_callback_t func, void *arg);
+
+int bpf__config_obj(struct bpf_object *obj, struct parse_events_term *term,
+		    struct perf_evlist *evlist, int *error_pos);
+int bpf__strerror_config_obj(struct bpf_object *obj,
+			     struct parse_events_term *term,
+			     struct perf_evlist *evlist,
+			     int *error_pos, int err, char *buf,
+			     size_t size);
 #else
 static inline struct bpf_object *
 bpf__prepare_load(const char *filename __maybe_unused,
@@ -84,6 +102,15 @@ bpf__foreach_tev(struct bpf_object *obj __maybe_unused,
 }
 
 static inline int
+bpf__config_obj(struct bpf_object *obj __maybe_unused,
+		struct parse_events_term *term __maybe_unused,
+		struct perf_evlist *evlist __maybe_unused,
+		int *error_pos __maybe_unused)
+{
+	return 0;
+}
+
+static inline int
 __bpf_strerror(char *buf, size_t size)
 {
 	if (!size)
@@ -118,5 +145,16 @@ static inline int bpf__strerror_load(struct bpf_object *obj __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int
+bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
+			 struct parse_events_term *term __maybe_unused,
+			 struct perf_evlist *evlist __maybe_unused,
+			 int *error_pos __maybe_unused,
+			 int err __maybe_unused,
+			 char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
1.8.3.4


* [PATCH 07/54] perf tools: Enable BPF object configure syntax
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (5 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 06/54] perf tools: Add API to config maps in bpf object Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-12 14:09   ` Jiri Olsa
  2016-02-05 14:01 ` [PATCH 08/54] perf record: Apply config to BPF objects before recording Wang Nan
                   ` (46 subsequent siblings)
  53 siblings, 1 reply; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch adds the final step for BPF map configuration. A new syntax
is added to the parser so that the user can configure BPF objects
through config terms enclosed in '/' '/'.

After this patch, the following syntax is available:

 # perf record -e ./test_bpf_map_1.c/maps:channel.value=10/ ...

It takes effect after applying the following commits.
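
How a key of the form "maps:<mapname>.<option>" breaks down can be
sketched as below. This is a standalone illustration, not the parser
code from this series: split_map_config_key() is a hypothetical helper,
and unlike bpf__obj_config_map() it copies into caller-provided buffers
and also rejects an empty map name.

```c
#include <string.h>

/*
 * Split "maps:<mapname>.<option>" into its two parts.
 * Returns 0 on success, -1 if the key does not have that shape
 * or does not fit into the caller's buffers.
 */
int split_map_config_key(const char *key,
			 char *map_name, size_t name_len,
			 char *option, size_t opt_len)
{
	static const char prefix[] = "maps:";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *name, *dot;
	size_t n;

	if (strncmp(key, prefix, prefix_len) != 0)
		return -1;	/* e.g. "xmaps:channel.value" */

	name = key + prefix_len;
	dot = strchr(name, '.');
	if (!dot || dot == name || dot[1] == '\0')
		return -1;	/* no '.', empty map name or empty option */

	n = (size_t)(dot - name);
	if (n >= name_len || strlen(dot + 1) >= opt_len)
		return -1;	/* buffers too small */

	memcpy(map_name, name, n);
	map_name[n] = '\0';
	strcpy(option, dot + 1);
	return 0;
}
```

For "maps:channel.value" this yields the map name "channel" and the
option "value"; the "=10" part is not in the key because the event
parser delivers it separately as the term's value.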

Test result:

 # cat ./test_bpf_map_1.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
     (void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
     .type = BPF_MAP_TYPE_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = 1,
 };
 SEC("func=sys_nanosleep")
 int func(void *ctx)
 {
     int key = 0;
     char fmt[] = "%d\n";
     int *pval = map_lookup_elem(&channel, &key);
     if (!pval)
         return 0;
     trace_printk(fmt, sizeof(fmt), *pval);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 - Normal case:
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=10/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]

 - Error case:

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value/' usleep 10
 event syntax error: '..ps:channel.value/'
                                   \___ Config value not set (lost '=')
 Hint:	Valid config term:
      	maps:[<arraymap>].value=[value]
	     	(add -v to see detail)
	Run 'perf list' for a list of valid events

 Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event <event>   event selector. use 'perf list' to list available events

 # ./perf record -e './test_bpf_map_1.c/xmaps:channel.value=10/' usleep 10
 event syntax error: '..pf_map_1.c/xmaps:channel.value=10/'
                                   \___ Invalid object config option
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:xchannel.value=10/' usleep 10
 event syntax error: '..p_1.c/maps:xchannel.value=10/'
                                   \___ Target map not exist
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:channel.xvalue=10/' usleep 10
 event syntax error: '..ps:channel.xvalue=10/'
                                   \___ Invalid object maps config option
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=x10/' usleep 10
 event syntax error: '..nnel.value=x10/'
                                   \___ Incorrect value type for map
 [SNIP]

 After changing the map type from BPF_MAP_TYPE_ARRAY to '1' (BPF_MAP_TYPE_HASH):

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=10/' usleep 10
 event syntax error: '..ps:channel.value=10/'
                                   \___ Can't use this config term to this type of map

 Hint:	Valid config term:
     	maps:[<arraymap>].value=[value]
     	(add -v to see detail)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 55 +++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/parse-events.h |  3 ++-
 tools/perf/util/parse-events.l |  2 +-
 tools/perf/util/parse-events.y | 21 +++++++++++++---
 4 files changed, 72 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index e8b2d85..dcbd004 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -631,17 +631,63 @@ errout:
 	return err;
 }
 
+static int
+parse_events_config_bpf(struct parse_events_evlist *data,
+			struct bpf_object *obj,
+			struct list_head *head_config)
+{
+	struct parse_events_term *term;
+	int error_pos;
+
+	if (!head_config || list_empty(head_config))
+		return 0;
+
+	list_for_each_entry(term, head_config, list) {
+		char errbuf[BUFSIZ];
+		int err;
+
+		if (term->type_term != PARSE_EVENTS__TERM_TYPE_USER) {
+			snprintf(errbuf, sizeof(errbuf),
+				 "Invalid config term for BPF object");
+			errbuf[BUFSIZ - 1] = '\0';
+
+			data->error->idx = term->err_term;
+			data->error->str = strdup(errbuf);
+			return -EINVAL;
+		}
+
+		err = bpf__config_obj(obj, term, NULL, &error_pos);
+		if (err) {
+			bpf__strerror_config_obj(obj, term, NULL,
+						 &error_pos, err, errbuf,
+						 sizeof(errbuf));
+			data->error->help = strdup(
+"Hint:\tValid config term:\n"
+"     \tmaps:[<arraymap>].value=[value]\n"
+"     \t(add -v to see detail)");
+			data->error->str = strdup(errbuf);
+			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
+				data->error->idx = term->err_val;
+			else
+				data->error->idx = term->err_term + error_pos;
+			return err;
+		}
+	}
+	return 0;
+}
+
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
-			  bool source)
+			  bool source,
+			  struct list_head *head_config)
 {
 	struct bpf_object *obj;
+	int err;
 
 	obj = bpf__prepare_load(bpf_file_name, source);
 	if (IS_ERR(obj)) {
 		char errbuf[BUFSIZ];
-		int err;
 
 		err = PTR_ERR(obj);
 
@@ -659,7 +705,10 @@ int parse_events_load_bpf(struct parse_events_evlist *data,
 		return err;
 	}
 
-	return parse_events_load_bpf_obj(data, list, obj);
+	err = parse_events_load_bpf_obj(data, list, obj);
+	if (err)
+		return err;
+	return parse_events_config_bpf(data, obj, head_config);
 }
 
 static int
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index f1a6db1..84694f3 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -126,7 +126,8 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
-			  bool source);
+			  bool source,
+			  struct list_head *head_config);
 /* Provide this function for perf test */
 struct bpf_object;
 int parse_events_load_bpf_obj(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 58c5831..4387728 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -122,7 +122,7 @@ num_dec		[0-9]+
 num_hex		0x[a-fA-F0-9]+
 num_raw_hex	[a-fA-F0-9]+
 name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
-name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.]*
+name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
 /* If you add a modifier you need to update check_modifier() */
 modifier_event	[ukhpPGHSDI]+
 modifier_bp	[rwx]{1,3}
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index ad37996..3e0b563 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -64,6 +64,7 @@ static inc_group_count(struct list_head *list,
 %type <str> PE_PMU_EVENT_PRE PE_PMU_EVENT_SUF PE_KERNEL_PMU_EVENT
 %type <num> value_sym
 %type <head> event_config
+%type <head> opt_event_config
 %type <term> event_term
 %type <head> event_pmu
 %type <head> event_legacy_symbol
@@ -455,27 +456,39 @@ PE_RAW
 }
 
 event_bpf_file:
-PE_BPF_OBJECT
+PE_BPF_OBJECT opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_load_bpf(data, list, $1, false));
+	ABORT_ON(parse_events_load_bpf(data, list, $1, false, $2));
+	parse_events__free_terms($2);
 	$$ = list;
 }
 |
-PE_BPF_SOURCE
+PE_BPF_SOURCE opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_load_bpf(data, list, $1, true));
+	ABORT_ON(parse_events_load_bpf(data, list, $1, true, $2));
+	parse_events__free_terms($2);
 	$$ = list;
 }
 
+opt_event_config:
+'/' event_config '/'
+{
+	$$ = $2;
+}
+|
+{
+	$$ = NULL;
+}
+
 start_terms: event_config
 {
 	struct parse_events_terms *data = _data;
-- 
1.8.3.4


* [PATCH 08/54] perf record: Apply config to BPF objects before recording
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (6 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 07/54] perf tools: Enable BPF object configure syntax Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-12 20:55   ` Arnaldo Carvalho de Melo
  2016-02-05 14:01 ` [PATCH 09/54] perf tools: Enable passing event to BPF object Wang Nan
                   ` (45 subsequent siblings)
  53 siblings, 1 reply; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

bpf__apply_obj_config() is introduced as the core API to apply object
config options to all BPF objects. This patch also does the real work
of setting values for BPF_MAP_TYPE_ARRAY maps by inserting the value
stored in the map's private field into the BPF map.

This patch is required because we are not always able to set all
BPF config during parsing. A further patch will put events created
by perf into BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, and those events do
not exist until perf_evsel__open().

bpf_map_config_foreach_key() is introduced to iterate over each key
that needs to be configured. This function will be extended to support
more map types and different key settings.

In perf record, before recording starts, call bpf__apply_obj_config()
to turn on all BPF config options.
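
The "set every key" step can be sketched without a real BPF map. The
helper below is hypothetical (fill_array_all() is not part of this
patch) and writes into a plain memory buffer instead of calling
bpf_map_update_elem(); it only illustrates why the stored u64 is
narrowed to the map's value_size before being written.

```c
#include <stdint.h>
#include <string.h>

/*
 * Store 'val' into every slot of an in-memory array whose elements
 * are 'val_size' bytes wide. Returns 0 on success, -1 for a value
 * size other than 1, 2, 4 or 8.
 */
int fill_array_all(void *slots, unsigned int max_entries,
		   size_t val_size, uint64_t val)
{
	unsigned char *slot = slots;
	unsigned int i;

	if (val_size != 1 && val_size != 2 && val_size != 4 && val_size != 8)
		return -1;

	for (i = 0; i < max_entries; i++, slot += val_size) {
		switch (val_size) {
		case 1: {
			uint8_t v = (uint8_t)val;
			memcpy(slot, &v, sizeof(v));
			break;
		}
		case 2: {
			uint16_t v = (uint16_t)val;
			memcpy(slot, &v, sizeof(v));
			break;
		}
		case 4: {
			uint32_t v = (uint32_t)val;
			memcpy(slot, &v, sizeof(v));
			break;
		}
		default:	/* 8 bytes: copy the u64 as is */
			memcpy(slot, &val, sizeof(val));
			break;
		}
	}
	return 0;
}
```

Converting to a variable of the exact width, rather than copying the
first val_size bytes of the u64, keeps the result independent of byte
order: on a big-endian machine the leading bytes of a u64 are the most
significant ones, not the ones a narrower value needs.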

Test result:

 # cat ./test_bpf_map_1.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
     (void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
     .type = BPF_MAP_TYPE_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = 1,
 };
 SEC("func=sys_nanosleep")
 int func(void *ctx)
 {
     int key = 0;
     char fmt[] = "%d\n";
     int *pval = map_lookup_elem(&channel, &key);
     if (!pval)
         return 0;
     trace_printk(fmt, sizeof(fmt), *pval);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=11/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 1/1   #P:8
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            usleep-18593 [007] d... 2394714.395539: : 11
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=101/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 1/1   #P:8
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            usleep-18593 [007] d... 2394714.395539: : 11
            usleep-19000 [006] d... 2394831.057840: : 101

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c  |  11 +++
 tools/perf/util/bpf-loader.c | 184 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h |  15 ++++
 3 files changed, 210 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0ee0d5c..caa8235 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -32,6 +32,7 @@
 #include "util/parse-branch-options.h"
 #include "util/parse-regs-options.h"
 #include "util/llvm-utils.h"
+#include "util/bpf-loader.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -536,6 +537,16 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		goto out_child;
 	}
 
+	err = bpf__apply_obj_config();
+	if (err) {
+		char errbuf[BUFSIZ];
+
+		bpf__strerror_apply_obj_config(err, errbuf, sizeof(errbuf));
+		pr_err("ERROR: Apply config to BPF failed: %s\n",
+			 errbuf);
+		goto out_child;
+	}
+
 	/*
 	 * Normally perf_session__new would do this, but it doesn't have the
 	 * evlist.
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 91678f4..d3d451b2 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -7,6 +7,7 @@
 
 #include <linux/bpf.h>
 #include <bpf/libbpf.h>
+#include <bpf/bpf.h>
 #include <linux/err.h>
 #include <linux/string.h>
 #include "perf.h"
@@ -995,6 +996,182 @@ out:
 
 }
 
+typedef int (*map_config_func_t)(const char *name, int map_fd,
+				 struct bpf_map_def *pdef,
+				 struct bpf_map_op *op,
+				 void *pkey, void *arg);
+
+static int
+foreach_key_array_all(map_config_func_t func,
+		      void *arg, const char *name,
+		      int map_fd, struct bpf_map_def *pdef,
+		      struct bpf_map_op *op)
+{
+	unsigned int i;
+	int err;
+
+	for (i = 0; i < pdef->max_entries; i++) {
+		err = func(name, map_fd, pdef, op, &i, arg);
+		if (err) {
+			pr_debug("ERROR: failed to insert value to %s[%u]\n",
+				 name, i);
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int
+bpf_map_config_foreach_key(struct bpf_map *map,
+			   map_config_func_t func,
+			   void *arg)
+{
+	int err, map_fd;
+	const char *name;
+	struct bpf_map_op *op;
+	struct bpf_map_def def;
+	struct bpf_map_priv *priv;
+
+	name = bpf_map__get_name(map);
+
+	err = bpf_map__get_private(map, (void **)&priv);
+	if (err) {
+		pr_debug("ERROR: failed to get private from map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	if (!priv || list_empty(&priv->ops_list)) {
+		pr_debug("INFO: nothing to config for map %s\n", name);
+		return 0;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("ERROR: failed to get definition from map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	map_fd = bpf_map__get_fd(map);
+	if (map_fd < 0) {
+		pr_debug("ERROR: failed to get fd from map %s\n", name);
+		return map_fd;
+	}
+
+	list_for_each_entry(op, &priv->ops_list, list) {
+		switch (def.type) {
+		case BPF_MAP_TYPE_ARRAY:
+			switch (op->key_type) {
+			case BPF_MAP_KEY_ALL:
+				err = foreach_key_array_all(func, arg, name,
+							    map_fd, &def, op);
+				if (err)
+					return err;
+				break;
+			default:
+				pr_debug("ERROR: keytype for map '%s' invalid\n",
+					 name);
+				return -BPF_LOADER_ERRNO__INTERNAL;
+			}
+			break;
+		default:
+			pr_debug("ERROR: type of '%s' incorrect\n", name);
+			return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+		}
+	}
+
+	return 0;
+}
+
+static int
+apply_config_value_for_key(int map_fd, void *pkey,
+			   size_t val_size, u64 val)
+{
+	int err = 0;
+
+	switch (val_size) {
+	case 1: {
+		u8 _val = (u8)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 2: {
+		u16 _val = (u16)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 4: {
+		u32 _val = (u32)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 8: {
+		err = bpf_map_update_elem(map_fd, pkey, &val, BPF_ANY);
+		break;
+	}
+	default:
+		pr_debug("ERROR: invalid value size\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
+	}
+	if (err && errno)
+		err = -errno;
+	return err;
+}
+
+static int
+apply_obj_config_map_for_key(const char *name, int map_fd,
+			     struct bpf_map_def *pdef __maybe_unused,
+			     struct bpf_map_op *op,
+			     void *pkey, void *arg __maybe_unused)
+{
+	int err;
+
+	switch (op->op_type) {
+	case BPF_MAP_OP_SET_VALUE:
+		err = apply_config_value_for_key(map_fd, pkey,
+						 pdef->value_size,
+						 op->v.value);
+		break;
+	default:
+		pr_debug("ERROR: unknown value type for '%s'\n", name);
+		err = -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	return err;
+}
+
+static int
+apply_obj_config_map(struct bpf_map *map)
+{
+	return bpf_map_config_foreach_key(map,
+					  apply_obj_config_map_for_key,
+					  NULL);
+}
+
+static int
+apply_obj_config_object(struct bpf_object *obj)
+{
+	struct bpf_map *map;
+	int err;
+
+	bpf_map__for_each(map, obj) {
+		err = apply_obj_config_map(map);
+		if (err)
+			return err;
+	}
+	return 0;
+}
+
+int bpf__apply_obj_config(void)
+{
+	struct bpf_object *obj, *tmp;
+	int err;
+
+	bpf_object__for_each_safe(obj, tmp) {
+		err = apply_obj_config_object(obj);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 #define ERRNO_OFFSET(e)		((e) - __BPF_LOADER_ERRNO__START)
 #define ERRCODE_OFFSET(c)	ERRNO_OFFSET(BPF_LOADER_ERRNO__##c)
 #define NR_ERRNO	(__BPF_LOADER_ERRNO__END - __BPF_LOADER_ERRNO__START)
@@ -1149,3 +1326,10 @@ int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 2464db9..db3c34c 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -71,6 +71,8 @@ int bpf__strerror_config_obj(struct bpf_object *obj,
 			     struct perf_evlist *evlist,
 			     int *error_pos, int err, char *buf,
 			     size_t size);
+int bpf__apply_obj_config(void);
+int bpf__strerror_apply_obj_config(int err, char *buf, size_t size);
 #else
 static inline struct bpf_object *
 bpf__prepare_load(const char *filename __maybe_unused,
@@ -111,6 +113,12 @@ bpf__config_obj(struct bpf_object *obj __maybe_unused,
 }
 
 static inline int
+bpf__apply_obj_config(void)
+{
+	return 0;
+}
+
+static inline int
 __bpf_strerror(char *buf, size_t size)
 {
 	if (!size)
@@ -156,5 +164,12 @@ bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int
+bpf__strerror_apply_obj_config(int err __maybe_unused,
+			       char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
1.8.3.4


* [PATCH 09/54] perf tools: Enable passing event to BPF object
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (7 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 08/54] perf record: Apply config to BPF objects before recording Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-12 14:05   ` Jiri Olsa
  2016-02-05 14:01 ` [PATCH 10/54] perf stat: Forbid user passing improper config terms Wang Nan
                   ` (44 subsequent siblings)
  53 siblings, 1 reply; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Add a new config term syntax to the parser so that users can pass
predefined perf events into BPF objects.

After this patch, BPF programs for perf are finally able to utilize
bpf_perf_event_read() introduced in commit 35578d7984003097af2b1e3
(bpf: Implement function bpf_perf_event_read() that get the selected
hardware PMU conuter).

Test result:

 # cat ./test_bpf_map_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
     (void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_read)(struct bpf_map_def *, int) =
     (void *)BPF_FUNC_perf_event_read;

 struct bpf_map_def SEC("maps") pmu_map = {
     .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = __NR_CPUS__,
 };
 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
     unsigned long long val;
     char fmt[] = "sys_write:        pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }

 SEC("func_write_return=sys_write%return")
 int func_write_return(void *ctx)
 {
     unsigned long long val = 0;
     char fmt[] = "sys_write_return: pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

Normal case:
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 [SNIP]
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.013 MB perf.data (7 samples) ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-17066 [000] d... 938449.863301: : sys_write:        pmu=1157327
               ls-17066 [000] dN.. 938449.863342: : sys_write_return: pmu=1225218
               ls-17066 [000] d... 938449.863349: : sys_write:        pmu=1241922
               ls-17066 [000] dN.. 938449.863369: : sys_write_return: pmu=1267445

Normal case (system wide):
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' -a
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.811 MB perf.data (120 samples) ]

 # cat /sys/kernel/debug/tracing/trace | grep -v '18446744073709551594' | grep -v perf | head -n 20
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            gmain-30828 [002] d... 2740551.068992: : sys_write:        pmu=84373
            gmain-30828 [002] d... 2740551.068992: : sys_write_return: pmu=87696
            gmain-30828 [002] d... 2740551.068996: : sys_write:        pmu=100658
            gmain-30828 [002] d... 2740551.068997: : sys_write_return: pmu=102572

Error case 1:

 # ./perf record -e './test_bpf_map_2.c' ls /
 [SNIP]
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.014 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-17115 [007] d... 2724279.665625: : sys_write:        pmu=18446744073709551614
               ls-17115 [007] dN.. 2724279.665651: : sys_write_return: pmu=18446744073709551614
               ls-17115 [007] d... 2724279.665658: : sys_write:        pmu=18446744073709551614
               ls-17115 [007] dN.. 2724279.665677: : sys_write_return: pmu=18446744073709551614

 (18446744073709551614 is 0xfffffffffffffffe (-2))
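
The conversion noted above can be checked with a small stand-alone
sketch. It is not part of the patch; format_as_u64() is a hypothetical
helper that only mimics what the "%llu" format in the test program does
to a negative return value:

```c
#include <stdio.h>

/*
 * Hypothetical helper: print a signed return value the way "%llu"
 * would, i.e. reinterpreted as an unsigned 64-bit number.
 */
static int format_as_u64(long long ret, char *buf, size_t size)
{
	return snprintf(buf, size, "%llu", (unsigned long long)ret);
}
```

Passing -2 yields the value seen in the trace above, and passing -22
yields 18446744073709551594, the value filtered out with 'grep -v' in
the system wide case.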

Error case 2:
 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=evt/' -a
 event syntax error: '..ps:pmu_map.event=evt/'
                                   \___ Event not found for map setting

 Hint:	Valid config terms:
      	maps:[<arraymap>].value=[value]
      	maps:[<eventmap>].event=[event]
 [SNIP]

Error case 3:
 # ls /proc/2348/task/
 2348  2505  2506  2507  2508
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' -p 2348
 ERROR: Apply config to BPF failed: Cannot set event to BPF maps in multi-thread tracing

Error case 4:
 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i to turn off inherit)

Error case 5:
 # ./perf record -i -e raw_syscalls:sys_enter -e './test_bpf_map_2.c/maps:pmu_map.event=raw_syscalls:sys_enter/' ls
 ERROR: Apply config to BPF failed: Can only put raw, hardware and BPF output event into a BPF map
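
Error cases 3 to 5 correspond to three checks that are applied in order
before an event is put into the map. A minimal sketch of that order,
using simplified stand-in types (struct evt, enum check_result and the
EVT_*/SW_* constants are hypothetical, not the perf definitions):

```c
#include <stdbool.h>

/* Simplified, hypothetical stand-ins for the perf event attributes. */
enum evt_type { EVT_HARDWARE, EVT_SOFTWARE, EVT_TRACEPOINT, EVT_RAW };
enum sw_config { SW_CPU_CLOCK, SW_BPF_OUTPUT };
enum check_result {
	CHECK_OK,
	CHECK_ERR_DIM,		/* error case 3: more than one thread */
	CHECK_ERR_INHERIT,	/* error case 4: inherit is set */
	CHECK_ERR_TYPE,		/* error case 5: unsupported event type */
};

struct evt {
	enum evt_type type;
	int config;
	bool inherit;
	int nr_threads;
};

/* Apply the checks in the same order as the patch does. */
static enum check_result check_event_for_map(const struct evt *e)
{
	if (e->nr_threads != 1)
		return CHECK_ERR_DIM;
	if (e->inherit)
		return CHECK_ERR_INHERIT;
	if (e->type == EVT_RAW || e->type == EVT_HARDWARE)
		return CHECK_OK;
	if (e->type == EVT_SOFTWARE && e->config == SW_BPF_OUTPUT)
		return CHECK_OK;
	return CHECK_ERR_TYPE;
}
```

Because the dimension check comes first, a multi-thread target is
reported as error case 3 even if the event also has inherit set.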

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 163 +++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/bpf-loader.h   |   5 ++
 tools/perf/util/evlist.c       |  16 ++++
 tools/perf/util/evlist.h       |   3 +
 tools/perf/util/parse-events.c |  15 ++--
 tools/perf/util/parse-events.h |   1 +
 6 files changed, 190 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index d3d451b2..412cd51 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -742,6 +742,7 @@ int bpf__foreach_tev(struct bpf_object *obj,
 
 enum bpf_map_op_type {
 	BPF_MAP_OP_SET_VALUE,
+	BPF_MAP_OP_SET_EVSEL,
 };
 
 enum bpf_map_key_type {
@@ -754,6 +755,7 @@ struct bpf_map_op {
 	enum bpf_map_key_type key_type;
 	union {
 		u64 value;
+		struct perf_evsel *evsel;
 	} v;
 };
 
@@ -838,6 +840,24 @@ bpf_map__add_op(struct bpf_map *map, struct bpf_map_op *op)
 	return 0;
 }
 
+static struct bpf_map_op *
+bpf_map__add_newop(struct bpf_map *map)
+{
+	struct bpf_map_op *op;
+	int err;
+
+	op = bpf_map_op__new();
+	if (IS_ERR(op))
+		return op;
+
+	err = bpf_map__add_op(map, op);
+	if (err) {
+		bpf_map_op__delete(op);
+		return ERR_PTR(err);
+	}
+	return op;
+}
+
 static int
 __bpf_map__config_value(struct bpf_map *map,
 			struct parse_events_term *term)
@@ -876,16 +896,12 @@ __bpf_map__config_value(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
 	}
 
-	op = bpf_map_op__new();
+	op = bpf_map__add_newop(map);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 	op->op_type = BPF_MAP_OP_SET_VALUE;
 	op->v.value = term->val.num;
-
-	err = bpf_map__add_op(map, op);
-	if (err)
-		bpf_map_op__delete(op);
-	return err;
+	return 0;
 }
 
 static int
@@ -899,13 +915,75 @@ bpf_map__config_value(struct bpf_map *map,
 	}
 
 	if (!term->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
-		pr_debug("ERROR: wrong value type\n");
+		pr_debug("ERROR: wrong value type for 'value'\n");
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
 	}
 
 	return __bpf_map__config_value(map, term);
 }
 
+static int
+__bpf_map__config_event(struct bpf_map *map,
+			struct parse_events_term *term,
+			struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	struct bpf_map_def def;
+	struct bpf_map_op *op;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+	evsel = perf_evlist__find_evsel_by_str(evlist, term->val.str);
+	if (!evsel) {
+		pr_debug("Event (for '%s') '%s' doesn't exist\n",
+			 map_name, term->val.str);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_NOEVT;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("Unable to get map definition from '%s'\n",
+			 map_name);
+		return err;
+	}
+
+	/*
+	 * No need to check key_size and value_size:
+	 * kernel has already checked them.
+	 */
+	if (def.type != BPF_MAP_TYPE_PERF_EVENT_ARRAY) {
+		pr_debug("Map %s type is not BPF_MAP_TYPE_PERF_EVENT_ARRAY\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+	}
+
+	op = bpf_map__add_newop(map);
+	if (IS_ERR(op))
+		return PTR_ERR(op);
+	op->op_type = BPF_MAP_OP_SET_EVSEL;
+	op->v.evsel = evsel;
+	return 0;
+}
+
+static int
+bpf_map__config_event(struct bpf_map *map,
+		      struct parse_events_term *term,
+		      struct perf_evlist *evlist)
+{
+	if (!term->err_val) {
+		pr_debug("Config value not set\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
+	}
+
+	if (term->type_val != PARSE_EVENTS__TERM_TYPE_STR) {
+		pr_debug("ERROR: wrong value type for 'event'\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
+	}
+
+	return __bpf_map__config_event(map, term, evlist);
+}
+
 struct bpf_obj_config__map_func {
 	const char *config_opt;
 	int (*config_func)(struct bpf_map *, struct parse_events_term *,
@@ -914,6 +992,7 @@ struct bpf_obj_config__map_func {
 
 struct bpf_obj_config__map_func bpf_obj_config__map_funcs[] = {
 	{"value", bpf_map__config_value},
+	{"event", bpf_map__config_event},
 };
 
 static int
@@ -1058,6 +1137,7 @@ bpf_map_config_foreach_key(struct bpf_map *map,
 	list_for_each_entry(op, &priv->ops_list, list) {
 		switch (def.type) {
 		case BPF_MAP_TYPE_ARRAY:
+		case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
 			switch (op->key_type) {
 			case BPF_MAP_KEY_ALL:
 				err = foreach_key_array_all(func, arg, name,
@@ -1116,6 +1196,60 @@ apply_config_value_for_key(int map_fd, void *pkey,
 }
 
 static int
+apply_config_evsel_for_key(const char *name, int map_fd, void *pkey,
+			   struct perf_evsel *evsel)
+{
+	struct xyarray *xy = evsel->fd;
+	struct perf_event_attr *attr;
+	unsigned int key, events;
+	bool check_pass = false;
+	int *evt_fd;
+	int err;
+
+	if (!xy) {
+		pr_debug("ERROR: evsel not ready for map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	if (xy->row_size / xy->entry_size != 1) {
+		pr_debug("ERROR: Dimension of target event is incorrect for map %s\n",
+			 name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM;
+	}
+
+	attr = &evsel->attr;
+	if (attr->inherit) {
+		pr_debug("ERROR: Can't put inherit event into map %s\n", name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH;
+	}
+
+	if (attr->type == PERF_TYPE_RAW)
+		check_pass = true;
+	if (attr->type == PERF_TYPE_HARDWARE)
+		check_pass = true;
+	if (attr->type == PERF_TYPE_SOFTWARE &&
+			attr->config == PERF_COUNT_SW_BPF_OUTPUT)
+		check_pass = true;
+	if (!check_pass) {
+		pr_debug("ERROR: Event type is wrong for map %s\n", name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE;
+	}
+
+	events = xy->entries / (xy->row_size / xy->entry_size);
+	key = *((unsigned int *)pkey);
+	if (key >= events) {
+		pr_debug("ERROR: there is no event %d for map %s\n",
+			 key, name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_MAPSIZE;
+	}
+	evt_fd = xyarray__entry(xy, key, 0);
+	err = bpf_map_update_elem(map_fd, pkey, evt_fd, BPF_ANY);
+	if (err && errno)
+		err = -errno;
+	return err;
+}
+
+static int
 apply_obj_config_map_for_key(const char *name, int map_fd,
 			     struct bpf_map_def *pdef __maybe_unused,
 			     struct bpf_map_op *op,
@@ -1129,6 +1263,10 @@ apply_obj_config_map_for_key(const char *name, int map_fd,
 						 pdef->value_size,
 						 op->v.value);
 		break;
+	case BPF_MAP_OP_SET_EVSEL:
+		err = apply_config_evsel_for_key(name, map_fd, pkey,
+						 op->v.evsel);
+		break;
 	default:
 		pr_debug("ERROR: unknown value type for '%s'\n", name);
 		err = -BPF_LOADER_ERRNO__INTERNAL;
@@ -1194,6 +1332,11 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(OBJCONF_MAP_TYPE)]	= "Incorrect map type",
 	[ERRCODE_OFFSET(OBJCONF_MAP_KEYSIZE)]	= "Incorrect map key size",
 	[ERRCODE_OFFSET(OBJCONF_MAP_VALUESIZE)]	= "Incorrect map value size",
+	[ERRCODE_OFFSET(OBJCONF_MAP_NOEVT)]	= "Event not found for map setting",
+	[ERRCODE_OFFSET(OBJCONF_MAP_MAPSIZE)]	= "Invalid map size for event setting",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTDIM)]	= "Event dimension too large",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTINH)]	= "Doesn't support inherit event",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTTYPE)]	= "Wrong event type for map",
 };
 
 static int
@@ -1330,6 +1473,12 @@ int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
 {
 	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,
+			    "Cannot set event to BPF maps in multi-thread tracing");
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,
+			    "%s (Hint: use -i to turn off inherit)", emsg);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,
+			    "Can only put raw, hardware and BPF output event into a BPF map");
 	bpf__strerror_end(buf, size);
 	return 0;
 }
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index db3c34c..c9ce792 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -33,6 +33,11 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,	/* Incorrect map type */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE,	/* Incorrect map key size */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE,/* Incorrect map value size */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_NOEVT,	/* Event not found for map setting */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_MAPSIZE,	/* Invalid map size for event setting */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,	/* Event dimension too large */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,	/* Doesn't support inherit event */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,	/* Wrong event type for map */
 	__BPF_LOADER_ERRNO__END,
 };
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d81f13d..9b56390 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1723,3 +1723,19 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 
 	tracking_evsel->tracking = true;
 }
+
+struct perf_evsel *
+perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
+			       const char *str)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (!evsel->name)
+			continue;
+		if (strcmp(str, evsel->name) == 0)
+			return evsel;
+	}
+
+	return NULL;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7c4d9a2..a0d1522 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -294,4 +294,7 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 				     struct perf_evsel *tracking_evsel);
 
 void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr);
+
+struct perf_evsel *
+perf_evlist__find_evsel_by_str(struct perf_evlist *evlist, const char *str);
 #endif /* __PERF_EVLIST_H */
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index dcbd004..8e08990 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -656,14 +656,16 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 			return -EINVAL;
 		}
 
-		err = bpf__config_obj(obj, term, NULL, &error_pos);
+		err = bpf__config_obj(obj, term, data->evlist, &error_pos);
 		if (err) {
-			bpf__strerror_config_obj(obj, term, NULL,
+			bpf__strerror_config_obj(obj, term, data->evlist,
 						 &error_pos, err, errbuf,
 						 sizeof(errbuf));
 			data->error->help = strdup(
-"Hint:\tValid config term:\n"
+"Hint:\tValid config terms:\n"
 "     \tmaps:[<arraymap>].value=[value]\n"
+"     \tmaps:[<eventmap>].event=[event]\n"
+"\n"
 "     \t(add -v to see detail)");
 			data->error->str = strdup(errbuf);
 			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
@@ -1444,9 +1446,10 @@ int parse_events(struct perf_evlist *evlist, const char *str,
 		 struct parse_events_error *err)
 {
 	struct parse_events_evlist data = {
-		.list  = LIST_HEAD_INIT(data.list),
-		.idx   = evlist->nr_entries,
-		.error = err,
+		.list   = LIST_HEAD_INIT(data.list),
+		.idx    = evlist->nr_entries,
+		.error  = err,
+		.evlist = evlist,
 	};
 	int ret;
 
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 84694f3..2a2b172 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -98,6 +98,7 @@ struct parse_events_evlist {
 	int			   idx;
 	int			   nr_groups;
 	struct parse_events_error *error;
+	struct perf_evlist	  *evlist;
 };
 
 struct parse_events_terms {
-- 
1.8.3.4


* [PATCH 10/54] perf stat: Forbid user passing improper config terms
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (8 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 09/54] perf tools: Enable passing event to BPF object Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-12 13:49   ` Jiri Olsa
  2016-02-05 14:01 ` [PATCH 11/54] perf tools: Rename and move pmu_event_name to get_config_name Wang Nan
                   ` (43 subsequent siblings)
  53 siblings, 1 reply; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

'perf stat' accepts some config terms but doesn't apply them. For
example:

 # perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
 # ls
 # exit

 Performance counter stats for 'bash':

         266258061      instructions/no-inherit/
         266258061      instructions/inherit/

       1.402183915 seconds time elapsed

The result is confusing, because the user may expect the first
'instructions' event to exclude the 'ls' command.

This patch forbids most of those config terms for 'perf stat'.

Result:

 # ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
 event syntax error: 'instructions/no-inherit/'
                      \___ Don't use record mode only config terms
 ...

We can add the blocked config terms back once 'perf stat' really supports them.
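
The mechanism is a file-scope function pointer that defaults to the
permissive handler and is switched once, before option parsing. A
stand-alone sketch of the idea, with hypothetical simplified names
(enum term_type, handle_base() and friends are not the perf symbols):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's term types. */
enum term_type { TERM_CONFIG, TERM_NAME, TERM_NO_INHERIT, TERM_CALLGRAPH };

typedef int term_handler_t(enum term_type type, const char **errstr);

/* Permissive handler: accepts every term. */
static int handle_base(enum term_type type, const char **errstr)
{
	(void)type;
	(void)errstr;
	return 0;
}

/* Restricted handler: forwards an allow-list, rejects the rest. */
static int handle_limited(enum term_type type, const char **errstr)
{
	switch (type) {
	case TERM_CONFIG:
	case TERM_NAME:
		return handle_base(type, errstr);
	default:
		*errstr = "Don't use record mode only config terms";
		return -EINVAL;
	}
}

/* Default is the permissive handler, as for 'perf record'. */
static term_handler_t *handle_common = handle_base;

/* Called once by the front end that wants the restricted set. */
static void shrink_config_terms(void)
{
	handle_common = handle_limited;
}

static int check_term(enum term_type type, const char **errstr)
{
	return handle_common(type, errstr);
}
```

The switch is global and one way, which is enough here because each
perf subcommand runs in its own process.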

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-stat.c      |  1 +
 tools/perf/util/parse-events.c | 42 +++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/parse-events.h |  1 +
 3 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 038e877..bee8fcc 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1825,6 +1825,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (evsel_list == NULL)
 		return -ENOMEM;
 
+	parse_events__shrink_config_terms();
 	argc = parse_options_subcommand(argc, argv, stat_options, stat_subcommands,
 					(const char **) stat_usage,
 					PARSE_OPT_STOP_AT_NON_OPTION);
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 8e08990..962a364 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -40,6 +40,26 @@ static struct perf_pmu_event_symbol *perf_pmu_events_list;
  */
 static int perf_pmu_events_list_num;
 
+typedef int config_term_func_t(struct perf_event_attr *attr,
+			       struct parse_events_term *term,
+			       struct parse_events_error *err);
+static int config_term_base(struct perf_event_attr *attr,
+			    struct parse_events_term *term,
+			    struct parse_events_error *err);
+static int config_term_limited(struct perf_event_attr *attr,
+			       struct parse_events_term *term,
+			       struct parse_events_error *err);
+static config_term_func_t *config_term_common = &config_term_base;
+static int config_attr(struct perf_event_attr *attr,
+		       struct list_head *head,
+		       struct parse_events_error *err,
+		       config_term_func_t config_term);
+
+void parse_events__shrink_config_terms(void)
+{
+	config_term_common = &config_term_limited;
+}
+
 struct event_symbol event_symbols_hw[PERF_COUNT_HW_MAX] = {
 	[PERF_COUNT_HW_CPU_CYCLES] = {
 		.symbol = "cpu-cycles",
@@ -801,9 +821,9 @@ typedef int config_term_func_t(struct perf_event_attr *attr,
 			       struct parse_events_term *term,
 			       struct parse_events_error *err);
 
-static int config_term_common(struct perf_event_attr *attr,
-			      struct parse_events_term *term,
-			      struct parse_events_error *err)
+static int config_term_base(struct perf_event_attr *attr,
+			    struct parse_events_term *term,
+			    struct parse_events_error *err)
 {
 #define CHECK_TYPE_VAL(type)						   \
 do {									   \
@@ -870,6 +890,22 @@ do {									   \
 #undef CHECK_TYPE_VAL
 }
 
+static int config_term_limited(struct perf_event_attr *attr,
+			       struct parse_events_term *term,
+			       struct parse_events_error *err)
+{
+	switch (term->type_term) {
+	case PARSE_EVENTS__TERM_TYPE_CONFIG:
+	case PARSE_EVENTS__TERM_TYPE_CONFIG1:
+	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
+	case PARSE_EVENTS__TERM_TYPE_NAME:
+		return config_term_base(attr, term, err);
+	default:
+		err->str = strdup("Don't use record mode only config terms");
+		return -EINVAL;
+	}
+}
+
 static int config_term_pmu(struct perf_event_attr *attr,
 			   struct parse_events_term *term,
 			   struct parse_events_error *err)
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 2a2b172..fc61a63 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -105,6 +105,7 @@ struct parse_events_terms {
 	struct list_head *terms;
 };
 
+void parse_events__shrink_config_terms(void);
 int parse_events__is_hardcoded_term(struct parse_events_term *term);
 int parse_events_term__num(struct parse_events_term **term,
 			   int type_term, char *config, u64 num,
-- 
1.8.3.4


* [PATCH 11/54] perf tools: Rename and move pmu_event_name to get_config_name
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (9 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 10/54] perf stat: Forbid user passing improper config terms Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 12/54] perf tools: Enable config raw and numeric events Wang Nan
                   ` (42 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

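Following commits will make more event types obey the /name=newname/
option, so this patch renames pmu_event_name() to get_config_name() and
moves it up so that it can serve as a generic helper.

The new get_config_name() also accepts a NULL list head, so callers do
not need to check for it.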
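
A stand-alone sketch of the lookup, using a plain singly linked list
instead of the kernel's list_head (struct term and config_name() are
hypothetical simplifications, not the perf definitions):

```c
#include <stddef.h>

enum term_type { TERM_CONFIG, TERM_NAME };

struct term {
	enum term_type type;
	const char *str;
	struct term *next;
};

/*
 * Return the string of the first name term, or NULL if the list is
 * empty (head == NULL) or contains no name term.
 */
static const char *config_name(const struct term *head)
{
	const struct term *t;

	for (t = head; t; t = t->next)
		if (t->type == TERM_NAME)
			return t->str;
	return NULL;
}
```

With this shape a caller can pass whatever term list the grammar
produced, including none at all, and fall back to a default name when
NULL comes back.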

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 962a364..4b15ece 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -299,7 +299,24 @@ const char *event_type(int type)
 	return "unknown";
 }
 
+static int parse_events__is_name_term(struct parse_events_term *term)
+{
+	return term->type_term == PARSE_EVENTS__TERM_TYPE_NAME;
+}
 
+static char *get_config_name(struct list_head *head_terms)
+{
+	struct parse_events_term *term;
+
+	if (!head_terms)
+		return NULL;
+
+	list_for_each_entry(term, head_terms, list)
+		if (parse_events__is_name_term(term))
+			return term->val.str;
+
+	return NULL;
+}
 
 static struct perf_evsel *
 __add_event(struct list_head *list, int *idx,
@@ -1051,22 +1068,6 @@ int parse_events_add_numeric(struct parse_events_evlist *data,
 	return add_event(list, &data->idx, &attr, NULL, &config_terms);
 }
 
-static int parse_events__is_name_term(struct parse_events_term *term)
-{
-	return term->type_term == PARSE_EVENTS__TERM_TYPE_NAME;
-}
-
-static char *pmu_event_name(struct list_head *head_terms)
-{
-	struct parse_events_term *term;
-
-	list_for_each_entry(term, head_terms, list)
-		if (parse_events__is_name_term(term))
-			return term->val.str;
-
-	return NULL;
-}
-
 int parse_events_add_pmu(struct parse_events_evlist *data,
 			 struct list_head *list, char *name,
 			 struct list_head *head_config)
@@ -1111,7 +1112,7 @@ int parse_events_add_pmu(struct parse_events_evlist *data,
 		return -EINVAL;
 
 	evsel = __add_event(list, &data->idx, &attr,
-			    pmu_event_name(head_config), pmu->cpus,
+			    get_config_name(head_config), pmu->cpus,
 			    &config_terms);
 	if (evsel) {
 		evsel->unit = info.unit;
-- 
1.8.3.4


* [PATCH 12/54] perf tools: Enable config raw and numeric events
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (10 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 11/54] perf tools: Rename and move pmu_event_name to get_config_name Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-12 13:52   ` Jiri Olsa
                     ` (2 more replies)
  2016-02-05 14:01 ` [PATCH 13/54] perf tools: Enable config and setting names for legacy cache events Wang Nan
                   ` (41 subsequent siblings)
  53 siblings, 3 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch allows setting config terms for raw and numeric events.
For example:

 # perf stat -e cycles/name=cyc/ ls
 ...
 1821108      cyc
 ...

 # perf stat -e r6530160/name=event/ ls
 ...
 1103195      event
 ...

 # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
 ...
 # perf report --stdio
 ...
 # Samples: 124  of event 'cycles'
 46.61%     0.00%  swapper        [kernel.vmlinux]            [k] cpu_startup_entry
 41.26%     0.00%  swapper        [kernel.vmlinux]            [k] start_secondary
 ...
 # Samples: 91  of event 'evtx'
 ...
 93.76%     0.00%  swapper      [kernel.vmlinux]            [k] cpu_startup_entry
         |
         ---cpu_startup_entry
            |
            |--66.63%--call_cpuidle
            |          cpuidle_enter
            |          |
 ...

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c |  3 ++-
 tools/perf/util/parse-events.y | 10 ++++++----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4b15ece..e341b52 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1065,7 +1065,8 @@ int parse_events_add_numeric(struct parse_events_evlist *data,
 			return -ENOMEM;
 	}
 
-	return add_event(list, &data->idx, &attr, NULL, &config_terms);
+	return add_event(list, &data->idx, &attr,
+			 get_config_name(head_config), &config_terms);
 }
 
 int parse_events_add_pmu(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 3e0b563..77de5bb 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -434,24 +434,26 @@ PE_NAME ':' PE_NAME
 }
 
 event_legacy_numeric:
-PE_VALUE ':' PE_VALUE
+PE_VALUE ':' PE_VALUE opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, NULL));
+	ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, $4));
+	parse_events__free_terms($4);
 	$$ = list;
 }
 
 event_legacy_raw:
-PE_RAW
+PE_RAW opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, NULL));
+	ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, $2));
+	parse_events__free_terms($2);
 	$$ = list;
 }
 
-- 
1.8.3.4


* [PATCH 13/54] perf tools: Enable config and setting names for legacy cache events
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (11 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 12/54] perf tools: Enable config raw and numeric events Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately Wang Nan
                   ` (40 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch allows setting config terms for legacy cache events.
For example:

 # perf stat -e L1-icache-misses/name=valA/ -e branches/name=valB/ ls
 ...
  Performance counter stats for 'ls':

              11299      valA
             451605      valB

        0.000779091 seconds time elapsed

 # perf record -e cache-misses/name=inh/ -e cache-misses/name=noinh,no-inherit/ bash
 # ls
 # exit
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.023 MB perf.data (131 samples) ]
 # perf report --stdio | grep -B 1 'Event count'
 # Samples: 105  of event 'inh'
 # Event count (approx.): 109118
 --
 # Samples: 26  of event 'noinh'
 # Event count (approx.): 48302

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 19 ++++++++++++++++---
 tools/perf/util/parse-events.h |  4 +++-
 tools/perf/util/parse-events.y | 18 ++++++++++++------
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index e341b52..817a0cc 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -371,10 +371,13 @@ static int parse_aliases(char *str, const char *names[][PERF_EVSEL__MAX_ALIASES]
 }
 
 int parse_events_add_cache(struct list_head *list, int *idx,
-			   char *type, char *op_result1, char *op_result2)
+			   char *type, char *op_result1, char *op_result2,
+			   struct parse_events_error *error,
+			   struct list_head *head_config)
 {
 	struct perf_event_attr attr;
-	char name[MAX_NAME_LEN];
+	LIST_HEAD(config_terms);
+	char name[MAX_NAME_LEN], *config_name;
 	int cache_type = -1, cache_op = -1, cache_result = -1;
 	char *op_result[2] = { op_result1, op_result2 };
 	int i, n;
@@ -388,6 +391,7 @@ int parse_events_add_cache(struct list_head *list, int *idx,
 	if (cache_type == -1)
 		return -EINVAL;
 
+	config_name = get_config_name(head_config);
 	n = snprintf(name, MAX_NAME_LEN, "%s", type);
 
 	for (i = 0; (i < 2) && (op_result[i]); i++) {
@@ -428,7 +432,16 @@ int parse_events_add_cache(struct list_head *list, int *idx,
 	memset(&attr, 0, sizeof(attr));
 	attr.config = cache_type | (cache_op << 8) | (cache_result << 16);
 	attr.type = PERF_TYPE_HW_CACHE;
-	return add_event(list, idx, &attr, name, NULL);
+
+	if (head_config) {
+		if (config_attr(&attr, head_config, error,
+				config_term_common))
+			return -EINVAL;
+
+		if (get_config_terms(head_config, &config_terms))
+			return -ENOMEM;
+	}
+	return add_event(list, idx, &attr, config_name ? : name, &config_terms);
 }
 
 static void tracepoint_error(struct parse_events_error *e, int err,
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index fc61a63..a490424 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -140,7 +140,9 @@ int parse_events_add_numeric(struct parse_events_evlist *data,
 			     u32 type, u64 config,
 			     struct list_head *head_config);
 int parse_events_add_cache(struct list_head *list, int *idx,
-			   char *type, char *op_result1, char *op_result2);
+			   char *type, char *op_result1, char *op_result2,
+			   struct parse_events_error *error,
+			   struct list_head *head_config);
 int parse_events_add_breakpoint(struct list_head *list, int *idx,
 				void *ptr, char *type, u64 len);
 int parse_events_add_pmu(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 77de5bb..2541b32 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -303,33 +303,39 @@ value_sym sep_slash_dc
 }
 
 event_legacy_cache:
-PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT
+PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT opt_event_config
 {
 	struct parse_events_evlist *data = _data;
+	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, $5));
+	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, $5, error, $6));
+	parse_events__free_terms($6);
 	$$ = list;
 }
 |
-PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT
+PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT opt_event_config
 {
 	struct parse_events_evlist *data = _data;
+	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, NULL));
+	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, NULL, error, $4));
+	parse_events__free_terms($4);
 	$$ = list;
 }
 |
-PE_NAME_CACHE_TYPE
+PE_NAME_CACHE_TYPE opt_event_config
 {
 	struct parse_events_evlist *data = _data;
+	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, NULL, NULL));
+	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, NULL, NULL, error, $2));
+	parse_events__free_terms($2);
 	$$ = list;
 }
 
-- 
1.8.3.4


* [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (12 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 13/54] perf tools: Enable config and setting names for legacy cache events Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-12 14:23   ` Jiri Olsa
  2016-02-05 14:01 ` [PATCH 15/54] perf tools: Enable indices setting syntax for BPF maps Wang Nan
                   ` (39 subsequent siblings)
  53 siblings, 1 reply; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch introduces basic facilities to support configuring different
slots in a BPF map one by one.

array.nr_ranges and array.ranges are introduced into 'struct
parse_events_term', where ranges is an array of index ranges (start,
length) to be configured by this config term, and nr_ranges is the
size of that array. The array is passed to 'struct bpf_map_priv'.
To indicate the new type of configuration, BPF_MAP_KEY_RANGES is
added as a new key type. bpf_map_config_foreach_key() is extended to
iterate over those indices instead of all possible keys.
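
A minimal standalone sketch of the (start, length) idea follows. It is not
the code from this patch: struct index_range, ranges_fit(), for_each_index()
and mark_index() are hypothetical names, and the bounds check is written in
an overflow-safe form instead of the 'start + length - 1' expression used in
the diff.

```c
#include <stddef.h>

/* One contiguous run of map indices: start, start + 1, ... */
struct index_range {
	unsigned int start;
	size_t length;
};

/* Return 0 if every range fits in a map with max_entries slots, else -1. */
static int ranges_fit(const struct index_range *r, size_t nr,
		      unsigned int max_entries)
{
	for (size_t i = 0; i < nr; i++) {
		if (r[i].length == 0 || r[i].start >= max_entries ||
		    r[i].length > (size_t)(max_entries - r[i].start))
			return -1;
	}
	return 0;
}

typedef int (*index_fn)(unsigned int idx, void *arg);

/* Call fn once per index covered by the ranges; stop at the first error. */
static int for_each_index(const struct index_range *r, size_t nr,
			  index_fn fn, void *arg)
{
	for (size_t i = 0; i < nr; i++) {
		for (size_t j = 0; j < r[i].length; j++) {
			int err = fn(r[i].start + (unsigned int)j, arg);

			if (err)
				return err;
		}
	}
	return 0;
}

/* Example callback: flag each visited index in a caller-supplied byte array. */
static int mark_index(unsigned int idx, void *arg)
{
	((unsigned char *)arg)[idx] = 1;
	return 0;
}
```

For the ranges {0, 1} and {3, 3}, for_each_index() visits indices 0, 3, 4
and 5, which is the set a term covering 0 and 3 through 5 would configure.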

The code in this commit will be enabled by the following commit, which
enables the indices syntax for array configuration.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 128 ++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/bpf-loader.h   |   1 +
 tools/perf/util/parse-events.c |  30 ++++++++++
 tools/perf/util/parse-events.h |  12 ++++
 4 files changed, 162 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 412cd51..0cc8334 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -17,6 +17,7 @@
 #include "llvm-utils.h"
 #include "probe-event.h"
 #include "probe-finder.h" // for MAX_PROBES
+#include "parse-events.h"
 #include "llvm-utils.h"
 
 #define DEFINE_PRINT_FN(name, level) \
@@ -747,6 +748,7 @@ enum bpf_map_op_type {
 
 enum bpf_map_key_type {
 	BPF_MAP_KEY_ALL,
+	BPF_MAP_KEY_RANGES,
 };
 
 struct bpf_map_op {
@@ -754,6 +756,9 @@ struct bpf_map_op {
 	enum bpf_map_op_type op_type;
 	enum bpf_map_key_type key_type;
 	union {
+		struct parse_events_array array;
+	} k;
+	union {
 		u64 value;
 		struct perf_evsel *evsel;
 	} v;
@@ -768,6 +773,8 @@ bpf_map_op__delete(struct bpf_map_op *op)
 {
 	if (!list_empty(&op->list))
 		list_del(&op->list);
+	if (op->key_type == BPF_MAP_KEY_RANGES)
+		parse_events__clear_array(&op->k.array);
 	free(op);
 }
 
@@ -792,10 +799,33 @@ bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
 	free(priv);
 }
 
+static int
+bpf_map_op_setkey(struct bpf_map_op *op, struct parse_events_term *term)
+{
+	op->key_type = BPF_MAP_KEY_ALL;
+	if (!term)
+		return 0;
+
+	if (term->array.nr_ranges) {
+		size_t memsz = term->array.nr_ranges *
+				sizeof(op->k.array.ranges[0]);
+
+		op->k.array.ranges = memdup(term->array.ranges, memsz);
+		if (!op->k.array.ranges) {
+			pr_debug("No enough memory to alloc indices for map\n");
+			return -ENOMEM;
+		}
+		op->key_type = BPF_MAP_KEY_RANGES;
+		op->k.array.nr_ranges = term->array.nr_ranges;
+	}
+	return 0;
+}
+
 static struct bpf_map_op *
-bpf_map_op__new(void)
+bpf_map_op__new(struct parse_events_term *term)
 {
 	struct bpf_map_op *op;
+	int err;
 
 	op = zalloc(sizeof(*op));
 	if (!op) {
@@ -804,7 +834,11 @@ bpf_map_op__new(void)
 	}
 	INIT_LIST_HEAD(&op->list);
 
-	op->key_type = BPF_MAP_KEY_ALL;
+	err = bpf_map_op_setkey(op, term);
+	if (err) {
+		free(op);
+		return ERR_PTR(err);
+	}
 	return op;
 }
 
@@ -841,12 +875,12 @@ bpf_map__add_op(struct bpf_map *map, struct bpf_map_op *op)
 }
 
 static struct bpf_map_op *
-bpf_map__add_newop(struct bpf_map *map)
+bpf_map__add_newop(struct bpf_map *map, struct parse_events_term *term)
 {
 	struct bpf_map_op *op;
 	int err;
 
-	op = bpf_map_op__new();
+	op = bpf_map_op__new(term);
 	if (IS_ERR(op))
 		return op;
 
@@ -896,7 +930,7 @@ __bpf_map__config_value(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
 	}
 
-	op = bpf_map__add_newop(map);
+	op = bpf_map__add_newop(map, term);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 	op->op_type = BPF_MAP_OP_SET_VALUE;
@@ -958,7 +992,7 @@ __bpf_map__config_event(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
 	}
 
-	op = bpf_map__add_newop(map);
+	op = bpf_map__add_newop(map, term);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 	op->op_type = BPF_MAP_OP_SET_EVSEL;
@@ -996,6 +1030,44 @@ struct bpf_obj_config__map_func bpf_obj_config__map_funcs[] = {
 };
 
 static int
+config_map_indices_range_check(struct parse_events_term *term,
+			       struct bpf_map *map,
+			       const char *map_name)
+{
+	struct parse_events_array *array = &term->array;
+	struct bpf_map_def def;
+	unsigned int i;
+	int err;
+
+	if (!array->nr_ranges)
+		return 0;
+	if (!array->ranges) {
+		pr_debug("ERROR: map %s: array->nr_ranges is %d but range array is NULL\n",
+			 map_name, (int)array->nr_ranges);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("ERROR: Unable to get map definition from '%s'\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	for (i = 0; i < array->nr_ranges; i++) {
+		unsigned int start = array->ranges[i].start;
+		size_t length = array->ranges[i].length;
+		unsigned int idx = start + length - 1;
+
+		if (idx >= def.max_entries) {
+			pr_debug("ERROR: index %d too large\n", idx);
+			return -BPF_LOADER_ERRNO__OBJCONF_MAP_IDX2BIG;
+		}
+	}
+	return 0;
+}
+
+static int
 bpf__obj_config_map(struct bpf_object *obj,
 		    struct parse_events_term *term,
 		    struct perf_evlist *evlist,
@@ -1030,7 +1102,12 @@ bpf__obj_config_map(struct bpf_object *obj,
 		goto out;
 	}
 
-	*key_scan_pos += map_opt - map_name;
+	*key_scan_pos += strlen(map_opt);
+	err = config_map_indices_range_check(term, map, map_name);
+	if (err)
+		goto out;
+	*key_scan_pos -= strlen(map_opt);
+
 	for (i = 0; i < ARRAY_SIZE(bpf_obj_config__map_funcs); i++) {
 		struct bpf_obj_config__map_func *func =
 				&bpf_obj_config__map_funcs[i];
@@ -1101,6 +1178,33 @@ foreach_key_array_all(map_config_func_t func,
 }
 
 static int
+foreach_key_array_ranges(map_config_func_t func, void *arg,
+			 const char *name, int map_fd,
+			 struct bpf_map_def *pdef,
+			 struct bpf_map_op *op)
+{
+	unsigned int i, j;
+	int err;
+
+	for (i = 0; i < op->k.array.nr_ranges; i++) {
+		unsigned int start = op->k.array.ranges[i].start;
+		size_t length = op->k.array.ranges[i].length;
+
+		for (j = 0; j < length; j++) {
+			unsigned int idx = start + j;
+
+			err = func(name, map_fd, pdef, op, &idx, arg);
+			if (err) {
+				pr_debug("ERROR: failed to insert value to %s[%u]\n",
+					 name, idx);
+				return err;
+			}
+		}
+	}
+	return 0;
+}
+
+static int
 bpf_map_config_foreach_key(struct bpf_map *map,
 			   map_config_func_t func,
 			   void *arg)
@@ -1142,14 +1246,19 @@ bpf_map_config_foreach_key(struct bpf_map *map,
 			case BPF_MAP_KEY_ALL:
 				err = foreach_key_array_all(func, arg, name,
 							    map_fd, &def, op);
-				if (err)
-					return err;
+				break;
+			case BPF_MAP_KEY_RANGES:
+				err = foreach_key_array_ranges(func, arg, name,
+							       map_fd, &def,
+							       op);
 				break;
 			default:
 				pr_debug("ERROR: keytype for map '%s' invalid\n",
 					 name);
 				return -BPF_LOADER_ERRNO__INTERNAL;
 			}
+			if (err)
+				return err;
 			break;
 		default:
 			pr_debug("ERROR: type of '%s' incorrect\n", name);
@@ -1337,6 +1446,7 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTDIM)]	= "Event dimension too large",
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTINH)]	= "Doesn't support inherit event",
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTTYPE)]	= "Wrong event type for map",
+	[ERRCODE_OFFSET(OBJCONF_MAP_IDX2BIG)]	= "Index too large",
 };
 
 static int
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index c9ce792..30ee519 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -38,6 +38,7 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,	/* Event dimension too large */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,	/* Doesn't support inherit event */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,	/* Wrong event type for map */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_IDX2BIG,	/* Index too large */
 	__BPF_LOADER_ERRNO__END,
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 817a0cc..62c7ec1 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2180,12 +2180,42 @@ void parse_events__free_terms(struct list_head *terms)
 
 	list_for_each_entry_safe(term, h, terms, list) {
 		list_del(&term->list);
+		if (term->array.nr_ranges)
+			free(term->array.ranges);
 		free(term);
 	}
 
 	free(terms);
 }
 
+int parse_events__merge_arrays(struct parse_events_array *dest,
+			       struct parse_events_array *another)
+{
+	struct parse_events_array new;
+
+	if (!dest || !another)
+		return -EINVAL;
+
+	new.nr_ranges = dest->nr_ranges + another->nr_ranges;
+	new.ranges = malloc(sizeof(new.ranges[0]) * new.nr_ranges);
+	if (!new.ranges)
+		return -ENOMEM;
+
+	memcpy(&new.ranges[0], dest->ranges,
+	       sizeof(new.ranges[0]) * dest->nr_ranges);
+	memcpy(&new.ranges[dest->nr_ranges], another->ranges,
+	       sizeof(new.ranges[0]) * another->nr_ranges);
+	free(dest->ranges);
+	free(another->ranges);
+	*dest = new;
+	return 0;
+}
+
+void parse_events__clear_array(struct parse_events_array *a)
+{
+	free(a->ranges);
+}
+
 void parse_events_evlist_error(struct parse_events_evlist *data,
 			       int idx, const char *str)
 {
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index a490424..f715818 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -71,8 +71,17 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_INHERIT
 };
 
+struct parse_events_array {
+	size_t nr_ranges;
+	struct {
+		unsigned int start;
+		size_t length;
+	} *ranges;
+};
+
 struct parse_events_term {
 	char *config;
+	struct parse_events_array array;
 	union {
 		char *str;
 		u64  num;
@@ -118,6 +127,9 @@ int parse_events_term__sym_hw(struct parse_events_term **term,
 int parse_events_term__clone(struct parse_events_term **new,
 			     struct parse_events_term *term);
 void parse_events__free_terms(struct list_head *terms);
+int parse_events__merge_arrays(struct parse_events_array *dest,
+			       struct parse_events_array *another);
+void parse_events__clear_array(struct parse_events_array *a);
 int parse_events__modifier_event(struct list_head *list, char *str, bool add);
 int parse_events__modifier_group(struct list_head *list, char *event_mod);
 int parse_events_name(struct list_head *list, char *name);
-- 
1.8.3.4


* [PATCH 15/54] perf tools: Enable indices setting syntax for BPF maps
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (13 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 16/54] perf tools: Pass tracepoint options to BPF script Wang Nan
                   ` (38 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch introduces a new syntax to the perf event parser:

 # perf record -e './test_bpf_map_3.c/maps:channel.value[0,1,2,3...5]=101/' usleep 2

By utilizing the basic facilities in bpf-loader.c which allow setting
different slots in a BPF map separately, the newly introduced syntax
allows perf to control specific elements in a BPF map.
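
To make the index syntax concrete, here is a small hand-written sketch that
turns the text between the brackets (for example "0,1,2,3...5") into
(start, length) pairs. The patch itself does this with flex and bison rules;
parse_indices() and struct index_range are hypothetical, the sketch accepts
decimal numbers only, and it does not handle the special [all] form.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct index_range {
	unsigned int start;
	size_t length;
};

/*
 * Parse "N" and "N...M" items separated by commas into out[].
 * Returns the number of ranges written, or -1 on a syntax error,
 * a descending range, or when out[] is too small.
 */
static int parse_indices(const char *s, struct index_range *out,
			 size_t max_out)
{
	size_t n = 0;

	for (;;) {
		char *end;
		unsigned long first, last;

		if (*s < '0' || *s > '9')
			return -1;
		first = strtoul(s, &end, 10);
		last = first;
		s = end;

		if (strncmp(s, "...", 3) == 0) {
			s += 3;
			if (*s < '0' || *s > '9')
				return -1;
			last = strtoul(s, &end, 10);
			s = end;
			if (last < first)
				return -1;
		}

		if (n == max_out)
			return -1;
		out[n].start = (unsigned int)first;
		out[n].length = (size_t)(last - first) + 1;
		n++;

		if (*s == '\0')
			return (int)n;
		if (*s != ',')
			return -1;
		s++;
	}
}
```

Under these assumptions "0,1,2,3...5" yields four ranges, the last being
start 3 with length 3, and "10...1000" yields one range of length 991, which
is the case rejected as 'Index too large' for a 100-entry map.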

Test result:

 # cat ./test_bpf_map_3.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
 	(void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(unsigned char),
 	.max_entries = 100,
 };
 SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
 int func(void *ctx, int err, long nsec)
 {
 	char fmt[] = "%ld\n";
 	long usec = nsec * 0x10624dd3 >> 38; // nsec / 1000
 	int key = (int)usec;
 	unsigned char *pval = map_lookup_elem(&channel, &key);

 	if (!pval)
 		return 0;
 	trace_printk(fmt, sizeof(fmt), (unsigned char)*pval);
 	return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/
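
The test program divides by 1000 with a multiply and shift
(nsec * 0x10624dd3 >> 38). The sketch below checks that arithmetic in plain
userspace C for the tv_nsec range 0..999999999. The helper names
nsec_to_usec() and count_mismatches() are hypothetical and not part of the
patch.

```c
#include <stdint.h>

/* Same arithmetic as the BPF test program: multiply, then shift right by 38. */
static uint64_t nsec_to_usec(uint64_t nsec)
{
	return (nsec * 0x10624dd3ULL) >> 38;
}

/* Count sampled values below limit where the shortcut differs from n / 1000. */
static uint64_t count_mismatches(uint64_t limit, uint64_t step)
{
	uint64_t bad = 0;

	if (step == 0)
		step = 1;
	for (uint64_t n = 0; n < limit; n += step)
		if (nsec_to_usec(n) != n / 1000)
			bad++;
	return bad;
}
```

0x10624dd3 is 2^38 / 1000 rounded up, so the relative error is small enough
that the result matches integer division for every value a tv_nsec field can
hold. 'usleep 2' therefore arrives as 2000 ns and selects key 2.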

Normal case:
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0,1,2,3...5]=101/' usleep 2
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0...9,20...29]=102,maps:channel.value[10...19]=103/' usleep 3
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0...9,20...29]=102,maps:channel.value[10...19]=103/' usleep 15
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
           usleep-655   [006] d... 2745434.122814: : 102
           usleep-904   [006] d... 2745439.916264: : 103
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[all]=104/' usleep 99
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
           usleep-655   [006] d... 2745434.122814: : 102
           usleep-904   [006] d... 2745439.916264: : 103
           usleep-1537  [003] d... 2745538.053737: : 104

Error case:
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[10...1000]=104/' usleep 99
 event syntax error: '..annel.value[10...1000]=104/'
                                   \___ Index too large
 Hint:	Valid config terms:
      	maps:[<arraymap>].value<indices>=[value]
      	maps:[<eventmap>].event<indices>=[event]

      	where <indices> is something like [0,3...5] or [all]
      	(add -v to see detail)
 Run 'perf list' for a list of valid events

  Usage: perf record [<options>] [<command>]
     or: perf record [<options>] -- <command> [<options>]

     -e, --event <event>   event selector. use 'perf list' to list available events

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c |  5 ++-
 tools/perf/util/parse-events.l | 13 ++++++-
 tools/perf/util/parse-events.y | 85 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 62c7ec1..e5ea572 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -713,9 +713,10 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 						 sizeof(errbuf));
 			data->error->help = strdup(
 "Hint:\tValid config terms:\n"
-"     \tmaps:[<arraymap>].value=[value]\n"
-"     \tmaps:[<eventmap>].event=[event]\n"
+"     \tmaps:[<arraymap>].value<indices>=[value]\n"
+"     \tmaps:[<eventmap>].event<indices>=[event]\n"
 "\n"
+"     \twhere <indices> is something like [0,3...5] or [all]\n"
 "     \t(add -v to see detail)");
 			data->error->str = strdup(errbuf);
 			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 4387728..8bb3437 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -9,8 +9,8 @@
 %{
 #include <errno.h>
 #include "../perf.h"
-#include "parse-events-bison.h"
 #include "parse-events.h"
+#include "parse-events-bison.h"
 
 char *parse_events_get_text(yyscan_t yyscanner);
 YYSTYPE *parse_events_get_lval(yyscan_t yyscanner);
@@ -111,6 +111,7 @@ do {							\
 %x mem
 %s config
 %x event
+%x array
 
 group		[^,{}/]*[{][^}]*[}][^,{}/]*
 event_pmu	[^,{}/]+[/][^/]*[/][^,{}/]*
@@ -176,6 +177,14 @@ modifier_bp	[rwx]{1,3}
 
 }
 
+<array>{
+"]"			{ BEGIN(config); return ']'; }
+{num_dec}		{ return value(yyscanner, 10); }
+{num_hex}		{ return value(yyscanner, 16); }
+,			{ return ','; }
+"\.\.\."		{ return PE_ARRAY_RANGE; }
+}
+
 <config>{
 	/*
 	 * Please update parse_events_formats_error_string any time
@@ -196,6 +205,8 @@ no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
+\[all\]			{ return PE_ARRAY_ALL; }
+"["			{ BEGIN(array); return '['; }
 }
 
 <mem>{
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 2541b32..527d72a 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -48,6 +48,7 @@ static inc_group_count(struct list_head *list,
 %token PE_PREFIX_MEM PE_PREFIX_RAW PE_PREFIX_GROUP
 %token PE_ERROR
 %token PE_PMU_EVENT_PRE PE_PMU_EVENT_SUF PE_KERNEL_PMU_EVENT
+%token PE_ARRAY_ALL PE_ARRAY_RANGE
 %type <num> PE_VALUE
 %type <num> PE_VALUE_SYM_HW
 %type <num> PE_VALUE_SYM_SW
@@ -83,6 +84,9 @@ static inc_group_count(struct list_head *list,
 %type <head> group_def
 %type <head> group
 %type <head> groups
+%type <array> array
+%type <array> array_term
+%type <array> array_terms
 
 %union
 {
@@ -94,6 +98,7 @@ static inc_group_count(struct list_head *list,
 		char *sys;
 		char *event;
 	} tracepoint_name;
+	struct parse_events_array array;
 }
 %%
 
@@ -594,6 +599,86 @@ PE_TERM
 	ABORT_ON(parse_events_term__num(&term, (int)$1, NULL, 1, &@1, NULL));
 	$$ = term;
 }
+|
+PE_NAME array '=' PE_NAME
+{
+	struct parse_events_term *term;
+	int i;
+
+	ABORT_ON(parse_events_term__str(&term, PARSE_EVENTS__TERM_TYPE_USER,
+					$1, $4, &@1, &@4));
+
+	term->array = $2;
+	$$ = term;
+}
+|
+PE_NAME array '=' PE_VALUE
+{
+	struct parse_events_term *term;
+
+	ABORT_ON(parse_events_term__num(&term, PARSE_EVENTS__TERM_TYPE_USER,
+					$1, $4, &@1, &@4));
+	term->array = $2;
+	$$ = term;
+}
+
+array:
+'[' array_terms ']'
+{
+	$$ = $2;
+}
+|
+PE_ARRAY_ALL
+{
+	$$.nr_ranges = 0;
+	$$.ranges = NULL;
+}
+
+array_terms:
+array_terms ',' array_term
+{
+	struct parse_events_array new_array;
+
+	new_array.nr_ranges = $1.nr_ranges + $3.nr_ranges;
+	new_array.ranges = malloc(sizeof(new_array.ranges[0]) *
+				  new_array.nr_ranges);
+	ABORT_ON(!new_array.ranges);
+	memcpy(&new_array.ranges[0], $1.ranges,
+	       $1.nr_ranges * sizeof(new_array.ranges[0]));
+	memcpy(&new_array.ranges[$1.nr_ranges], $3.ranges,
+	       $3.nr_ranges * sizeof(new_array.ranges[0]));
+	free($1.ranges);
+	free($3.ranges);
+	$$ = new_array;
+}
+|
+array_term
+
+array_term:
+PE_VALUE
+{
+	struct parse_events_array array;
+
+	array.nr_ranges = 1;
+	array.ranges = malloc(sizeof(array.ranges[0]));
+	ABORT_ON(!array.ranges);
+	array.ranges[0].start = $1;
+	array.ranges[0].length = 1;
+	$$ = array;
+}
+|
+PE_VALUE PE_ARRAY_RANGE PE_VALUE
+{
+	struct parse_events_array array;
+
+	ABORT_ON($3 < $1);
+	array.nr_ranges = 1;
+	array.ranges = malloc(sizeof(array.ranges[0]));
+	ABORT_ON(!array.ranges);
+	array.ranges[0].start = $1;
+	array.ranges[0].length = $3 - $1 + 1;
+	$$ = array;
+}
 
 sep_dc: ':' |
 
-- 
1.8.3.4


* [PATCH 16/54] perf tools: Pass tracepoint options to BPF script
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (14 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 15/54] perf tools: Enable indices setting syntax for BPF maps Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 17/54] perf tools: Introduce bpf-output event Wang Nan
                   ` (37 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Users can pass options to tracepoints defined in the BPF script.
For example:

 # perf record -e ./test.c/no-inherit/ bash
 # dd if=/dev/zero of=/dev/null count=10000
 # exit
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.022 MB perf.data (139 samples) ]

test.c:

 #define SEC(NAME) __attribute__((section(NAME), used))
 SEC("func=sys_read")
 int bpf_func__sys_read(void *ctx)
 {
     return 1;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;

no-inherit is applied to the kprobe event defined in test.c.
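
The diff below separates event config terms (such as no-inherit) from BPF
object config terms (such as maps:...) before loading the object. A
simplified sketch of that split, using plain arrays instead of kernel-style
lists, might look like this. struct term, enum term_kind and split_terms()
are hypothetical stand-ins, not the perf data structures.

```c
#include <stddef.h>

/* Hardcoded terms are known to the parser; user terms are free-form. */
enum term_kind { TERM_HARDCODED, TERM_USER };

struct term {
	const char *text;
	enum term_kind kind;
};

/*
 * Partition in[] while preserving order: hardcoded terms go to evt[]
 * (applied to every event in the script), user terms go to obj[]
 * (applied to the BPF object). Both outputs must hold at least n entries.
 */
static void split_terms(const struct term *in, size_t n,
			const struct term **evt, size_t *nr_evt,
			const struct term **obj, size_t *nr_obj)
{
	*nr_evt = 0;
	*nr_obj = 0;
	for (size_t i = 0; i < n; i++) {
		if (in[i].kind == TERM_USER)
			obj[(*nr_obj)++] = &in[i];
		else
			evt[(*nr_evt)++] = &in[i];
	}
}
```

For 'call-graph=fp,maps:array.value[0]=1,no-inherit' this sketch would put
call-graph=fp and no-inherit on the event side and the maps term on the
object side.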

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/tests/bpf.c         |  2 +-
 tools/perf/util/parse-events.c | 56 +++++++++++++++++++++++++++++++++++++-----
 tools/perf/util/parse-events.h |  3 ++-
 3 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 4aed5cb..199501c 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -112,7 +112,7 @@ static int do_test(struct bpf_object *obj, int (*func)(void),
 	parse_evlist.error = &parse_error;
 	INIT_LIST_HEAD(&parse_evlist.list);
 
-	err = parse_events_load_bpf_obj(&parse_evlist, &parse_evlist.list, obj);
+	err = parse_events_load_bpf_obj(&parse_evlist, &parse_evlist.list, obj, NULL);
 	if (err || list_empty(&parse_evlist.list)) {
 		pr_debug("Failed to add events selected by BPF\n");
 		return TEST_FAIL;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index e5ea572..8e0f401 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -590,6 +590,7 @@ static int add_tracepoint_multi_sys(struct list_head *list, int *idx,
 struct __add_bpf_event_param {
 	struct parse_events_evlist *data;
 	struct list_head *list;
+	struct list_head *head_config;
 };
 
 static int add_bpf_event(struct probe_trace_event *tev, int fd,
@@ -606,7 +607,8 @@ static int add_bpf_event(struct probe_trace_event *tev, int fd,
 		 tev->group, tev->event, fd);
 
 	err = parse_events_add_tracepoint(&new_evsels, &evlist->idx, tev->group,
-					  tev->event, evlist->error, NULL);
+					  tev->event, evlist->error,
+					  param->head_config);
 	if (err) {
 		struct perf_evsel *evsel, *tmp;
 
@@ -631,11 +633,12 @@ static int add_bpf_event(struct probe_trace_event *tev, int fd,
 
 int parse_events_load_bpf_obj(struct parse_events_evlist *data,
 			      struct list_head *list,
-			      struct bpf_object *obj)
+			      struct bpf_object *obj,
+			      struct list_head *head_config)
 {
 	int err;
 	char errbuf[BUFSIZ];
-	struct __add_bpf_event_param param = {data, list};
+	struct __add_bpf_event_param param = {data, list, head_config};
 	static bool registered_unprobe_atexit = false;
 
 	if (IS_ERR(obj) || !obj) {
@@ -729,14 +732,47 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 	return 0;
 }
 
+/*
+ * Split config terms:
+ * perf record -e bpf.c/call-graph=fp,maps:array.value[0]=1/ ...
+ *  'call-graph=fp' is 'evt config', should be applied to each
+ *  events in bpf.c.
+ * 'maps:array.value[0]=1' is 'obj config', should be processed
+ * with parse_events_config_bpf.
+ *
+ * Move object config terms from the first list to obj_head_config.
+ */
+static void
+split_bpf_config_terms(struct list_head *evt_head_config,
+		       struct list_head *obj_head_config)
+{
+	struct parse_events_term *term, *temp;
+
+	/*
+	 * Currently, all possible user config terms
+	 * belong to the bpf object. parse_events__is_hardcoded_term()
+	 * happens to be a good flag.
+	 *
+	 * See parse_events_config_bpf() and
+	 * config_term_tracepoint().
+	 */
+	list_for_each_entry_safe(term, temp, evt_head_config, list)
+		if (!parse_events__is_hardcoded_term(term))
+			list_move_tail(&term->list, obj_head_config);
+}
+
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
 			  bool source,
 			  struct list_head *head_config)
 {
-	struct bpf_object *obj;
 	int err;
+	struct bpf_object *obj;
+	LIST_HEAD(obj_head_config);
+
+	if (head_config)
+		split_bpf_config_terms(head_config, &obj_head_config);
 
 	obj = bpf__prepare_load(bpf_file_name, source);
 	if (IS_ERR(obj)) {
@@ -758,10 +794,18 @@ int parse_events_load_bpf(struct parse_events_evlist *data,
 		return err;
 	}
 
-	err = parse_events_load_bpf_obj(data, list, obj);
+	err = parse_events_load_bpf_obj(data, list, obj, head_config);
 	if (err)
 		return err;
-	return parse_events_config_bpf(data, obj, head_config);
+	err = parse_events_config_bpf(data, obj, &obj_head_config);
+
+	/*
+	 * Caller doesn't know anything about obj_head_config,
+	 * so combine them together again before returning.
+	 */
+	if (head_config)
+		list_splice_tail(&obj_head_config, head_config);
+	return err;
 }
 
 static int
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index f715818..ad1f78f 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -146,7 +146,8 @@ int parse_events_load_bpf(struct parse_events_evlist *data,
 struct bpf_object;
 int parse_events_load_bpf_obj(struct parse_events_evlist *data,
 			      struct list_head *list,
-			      struct bpf_object *obj);
+			      struct bpf_object *obj,
+			      struct list_head *head_config);
 int parse_events_add_numeric(struct parse_events_evlist *data,
 			     struct list_head *list,
 			     u32 type, u64 config,
-- 
1.8.3.4


* [PATCH 17/54] perf tools: Introduce bpf-output event
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (15 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 16/54] perf tools: Pass tracepoint options to BPF script Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 18/54] perf data: Support converting data from bpf_perf_event_output() Wang Nan
                   ` (36 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Commit a43eec304259a6c637f4014a6d4767159b6a3aa3 (bpf: introduce
bpf_perf_event_output() helper) added a helper that lets a BPF program
output data to the perf ring buffer through a new type of perf event,
PERF_COUNT_SW_BPF_OUTPUT. This patch enables perf to create perf
events of that type. Perf users can now use the following command line
to receive output data from BPF programs:

 # ./perf record -a -e bpf-output/no-inherit,name=evt/ \
                    -e ./test_bpf_output.c/maps:channel.event=evt/ ls /
 # ./perf script
            perf  1560 [004] 347747.086295:                       evt:  ffffffff811fd201 sys_write ...
            perf  1560 [004] 347747.086300:                       evt:  ffffffff811fd201 sys_write ...
            perf  1560 [004] 347747.086315:                       evt:  ffffffff811fd201 sys_write ...
            ...
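
For readers who open the event directly instead of going through the
perf command line, the attribute setup implied by this patch might look
like the sketch below. bpf_output_attr_init() and attr_is_bpf_output()
are hypothetical names, not part of perf; the field values mirror what
perf_evsel__new_idx() and perf_evsel__is_bpf_output() do in the diff.

```c
#include <linux/perf_event.h>
#include <string.h>

/* Hypothetical helper: prepare an attr for the bpf-output software event. */
static void bpf_output_attr_init(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_BPF_OUTPUT;
	/* Same adjustments this patch applies in perf_evsel__new_idx(). */
	attr->sample_type |= PERF_SAMPLE_RAW;
	attr->sample_period = 1;
}

/* Same predicate as perf_evsel__is_bpf_output(), applied to a bare attr. */
static int attr_is_bpf_output(const struct perf_event_attr *attr)
{
	return attr->type == PERF_TYPE_SOFTWARE &&
	       attr->config == PERF_COUNT_SW_BPF_OUTPUT;
}
```

The resulting attr would then be passed to perf_event_open() once per
CPU and the returned fds stored in the BPF_MAP_TYPE_PERF_EVENT_ARRAY
map; that part is omitted here.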

Test result:
 # cat ./test_bpf_output.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };

 #define SEC(NAME) __attribute__((section(NAME), used))
 static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
 };

 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
 	struct {
 		u64 ktime;
 		int cpuid;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output: %d\n";

 	output_data.cpuid = get_smp_processor_id();
 	output_data.ktime = ktime_get_ns();
 	int err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				    &output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data), err);
 	return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************ END ***************************/

 # ./perf record -a -e bpf-output/no-inherit,name=evt/ \
                    -e ./test_bpf_output.c/maps:channel.event=evt/ ls /
 # ./perf script | grep ls
          ls  2242 [003] 347851.557563:                       evt:  ffffffff811fd201 sys_write ...
          ls  2242 [003] 347851.557571:                       evt:  ffffffff811fd201 sys_write ...

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 5 ++---
 tools/perf/util/evsel.c        | 5 +++++
 tools/perf/util/evsel.h        | 8 ++++++++
 tools/perf/util/parse-events.l | 1 +
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 0cc8334..73950f3 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -1332,13 +1332,12 @@ apply_config_evsel_for_key(const char *name, int map_fd, void *pkey,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel))
+		check_pass = true;
 	if (attr->type == PERF_TYPE_RAW)
 		check_pass = true;
 	if (attr->type == PERF_TYPE_HARDWARE)
 		check_pass = true;
-	if (attr->type == PERF_TYPE_SOFTWARE &&
-			attr->config == PERF_COUNT_SW_BPF_OUTPUT)
-		check_pass = true;
 	if (!check_pass) {
 		pr_debug("ERROR: Event type is wrong for map %s\n", name);
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4678086..6cb4c05 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -225,6 +225,11 @@ struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
 	if (evsel != NULL)
 		perf_evsel__init(evsel, attr, idx);
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		evsel->attr.sample_type |= PERF_SAMPLE_RAW;
+		evsel->attr.sample_period = 1;
+	}
+
 	return evsel;
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8e75434..efad78f 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -364,6 +364,14 @@ static inline bool perf_evsel__is_function_event(struct perf_evsel *evsel)
 #undef FUNCTION_EVENT
 }
 
+static inline bool perf_evsel__is_bpf_output(struct perf_evsel *evsel)
+{
+	struct perf_event_attr *attr = &evsel->attr;
+
+	return (attr->config == PERF_COUNT_SW_BPF_OUTPUT) &&
+		(attr->type == PERF_TYPE_SOFTWARE);
+}
+
 struct perf_attr_details {
 	bool freq;
 	bool verbose;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 8bb3437..27d567f 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -249,6 +249,7 @@ cpu-migrations|migrations			{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COU
 alignment-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
 emulation-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
 dummy						{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
+bpf-output					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
 
 	/*
 	 * We have to handle the kernel PMU event cycles-ct/cycles-t/mem-loads/mem-stores separately.
-- 
1.8.3.4

* [PATCH 18/54] perf data: Support converting data from bpf_perf_event_output()
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (16 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 17/54] perf tools: Introduce bpf-output event Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 19/54] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
                   ` (35 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

bpf_perf_event_output() outputs data through sample->raw_data. This
patch adds support for converting that data into CTF. A Python script
can then be used to process the output data from BPF programs.

Test result:

 # cat ./test_bpf_output_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };
 #define SEC(NAME) __attribute__((section(NAME), used))
 static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
 };

 static inline int __attribute__((always_inline))
 func(void *ctx, int type)
 {
 	struct {
 		u64 ktime;
 		int type;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output\n";
 	int err;

 	output_data.type = type;
 	output_data.ktime = ktime_get_ns();
 	err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				&output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data));
 	return 0;
 }
 SEC("func_begin=sys_nanosleep")
 int func_begin(void *ctx) {return func(ctx, 1);}
 SEC("func_end=sys_nanosleep%return")
 int func_end(void *ctx) { return func(ctx, 2);}
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 # ./perf record -e bpf-output/no-inherit,name=evt/ \
                 -e ./test_bpf_output_2.c/maps:channel.event=evt/ \
                 usleep 100000
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]

 # ./perf script
          usleep 14942 92503.198504: evt:  ffffffff810e0ba1 sys_nanosleep (/lib/modules/4.3.0....
          usleep 14942 92503.298562: evt:  ffffffff810585e9 kretprobe_trampoline_holder (/lib....

 # ./perf data convert --to-ctf ./out.ctf
 [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
 [ perf data convert: Converted and wrote 0.000 MB (2 samples) ]

 # babeltrace ./out.ctf
 [01:41:43.198504134] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E0BA1, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x32C0C07B, [1] = 0x5421, [2] = 0x1 ] }
 [01:41:43.298562257] (+0.100058123) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810585E9, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x38B77FAA, [1] = 0x5421, [2] = 0x2 ] }

 # cat ./test_bpf_output_2.py
 from babeltrace import TraceCollection
 tc = TraceCollection()
 tc.add_trace('./out.ctf', 'ctf')
 d = {1:[], 2:[]}
 for event in tc.events:
     if not event.name.startswith('evt'):
         continue
     raw_data = event['raw_data']
     (time, type) = ((raw_data[0] + (raw_data[1] << 32)), raw_data[2])
     d[type].append(time)
 print(list(map(lambda i: d[2][i] - d[1][i], range(len(d[1])))));

 # python3 ./test_bpf_output_2.py
 [100056879]
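
The same decoding can be sketched in C. The converter stores raw_data
as a sequence of u32 words, so the packed { u64 ktime; int type; }
record from the BPF program arrives as three words. decode_raw() and
struct bpf_record are hypothetical names for illustration, and the
word order assumes a little-endian host, as in the Python script above.

```c
#include <stdint.h>

/* Hypothetical mirror of the packed output_data struct in the BPF program. */
struct bpf_record {
	uint64_t ktime;
	int32_t type;
};

/*
 * Rebuild one record from three u32 words of raw_data.
 * Assumes little-endian layout: word 0 is the low half of ktime,
 * word 1 the high half, word 2 the type.
 */
static struct bpf_record decode_raw(const uint32_t *raw)
{
	struct bpf_record r;

	r.ktime = (uint64_t)raw[0] | ((uint64_t)raw[1] << 32);
	r.type = (int32_t)raw[2];
	return r;
}
```

Feeding it the two raw_data arrays from the babeltrace output above
and subtracting the ktime values gives 100056879 ns, the same value
the Python script prints.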

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data-convert-bt.c | 112 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 111 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index b722e57..70f462d 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -352,6 +352,84 @@ static int add_tracepoint_values(struct ctf_writer *cw,
 	return ret;
 }
 
+static int
+add_bpf_output_values(struct bt_ctf_event_class *event_class,
+		      struct bt_ctf_event *event,
+		      struct perf_sample *sample)
+{
+	struct bt_ctf_field_type *len_type, *seq_type;
+	struct bt_ctf_field *len_field, *seq_field;
+	unsigned int raw_size = sample->raw_size;
+	unsigned int nr_elements = raw_size / sizeof(u32);
+	unsigned int i;
+	int ret;
+
+	if (nr_elements * sizeof(u32) != raw_size)
+		pr_warning("Incorrect raw_size (%u) in bpf output event, skip %zu bytes\n",
+			   raw_size, raw_size - nr_elements * sizeof(u32));
+
+	len_type = bt_ctf_event_class_get_field_by_name(event_class, "raw_len");
+	len_field = bt_ctf_field_create(len_type);
+	if (!len_field) {
+		pr_err("failed to create 'raw_len' for bpf output event\n");
+		ret = -1;
+		goto put_len_type;
+	}
+
+	ret = bt_ctf_field_unsigned_integer_set_value(len_field, nr_elements);
+	if (ret) {
+		pr_err("failed to set field value for raw_len\n");
+		goto put_len_field;
+	}
+	ret = bt_ctf_event_set_payload(event, "raw_len", len_field);
+	if (ret) {
+		pr_err("failed to set payload to raw_len\n");
+		goto put_len_field;
+	}
+
+	seq_type = bt_ctf_event_class_get_field_by_name(event_class, "raw_data");
+	seq_field = bt_ctf_field_create(seq_type);
+	if (!seq_field) {
+		pr_err("failed to create 'raw_data' for bpf output event\n");
+		ret = -1;
+		goto put_seq_type;
+	}
+
+	ret = bt_ctf_field_sequence_set_length(seq_field, len_field);
+	if (ret) {
+		pr_err("failed to set length of 'raw_data'\n");
+		goto put_seq_field;
+	}
+
+	for (i = 0; i < nr_elements; i++) {
+		struct bt_ctf_field *elem_field =
+			bt_ctf_field_sequence_get_field(seq_field, i);
+
+		ret = bt_ctf_field_unsigned_integer_set_value(elem_field,
+				((u32 *)(sample->raw_data))[i]);
+
+		bt_ctf_field_put(elem_field);
+		if (ret) {
+			pr_err("failed to set raw_data[%d]\n", i);
+			goto put_seq_field;
+		}
+	}
+
+	ret = bt_ctf_event_set_payload(event, "raw_data", seq_field);
+	if (ret)
+		pr_err("failed to set payload for raw_data\n");
+
+put_seq_field:
+	bt_ctf_field_put(seq_field);
+put_seq_type:
+	bt_ctf_field_type_put(seq_type);
+put_len_field:
+	bt_ctf_field_put(len_field);
+put_len_type:
+	bt_ctf_field_type_put(len_type);
+	return ret;
+}
+
 static int add_generic_values(struct ctf_writer *cw,
 			      struct bt_ctf_event *event,
 			      struct perf_evsel *evsel,
@@ -597,6 +675,12 @@ static int process_sample_event(struct perf_tool *tool,
 			return -1;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		ret = add_bpf_output_values(event_class, event, sample);
+		if (ret)
+			return -1;
+	}
+
 	cs = ctf_stream(cw, get_sample_cpu(cw, sample, evsel));
 	if (cs) {
 		if (is_flush_needed(cs))
@@ -744,6 +828,25 @@ static int add_tracepoint_types(struct ctf_writer *cw,
 	return ret;
 }
 
+static int add_bpf_output_types(struct ctf_writer *cw,
+				struct bt_ctf_event_class *class)
+{
+	struct bt_ctf_field_type *len_type = cw->data.u32;
+	struct bt_ctf_field_type *seq_base_type = cw->data.u32_hex;
+	struct bt_ctf_field_type *seq_type;
+	int ret;
+
+	ret = bt_ctf_event_class_add_field(class, len_type, "raw_len");
+	if (ret)
+		return ret;
+
+	seq_type = bt_ctf_field_type_sequence_create(seq_base_type, "raw_len");
+	if (!seq_type)
+		return -1;
+
+	return bt_ctf_event_class_add_field(class, seq_type, "raw_data");
+}
+
 static int add_generic_types(struct ctf_writer *cw, struct perf_evsel *evsel,
 			     struct bt_ctf_event_class *event_class)
 {
@@ -755,7 +858,8 @@ static int add_generic_types(struct ctf_writer *cw, struct perf_evsel *evsel,
 	 *                              ctf event header
 	 *   PERF_SAMPLE_READ         - TODO
 	 *   PERF_SAMPLE_CALLCHAIN    - TODO
-	 *   PERF_SAMPLE_RAW          - tracepoint fields are handled separately
+	 *   PERF_SAMPLE_RAW          - tracepoint fields and BPF output
+	 *                              are handled separately
 	 *   PERF_SAMPLE_BRANCH_STACK - TODO
 	 *   PERF_SAMPLE_REGS_USER    - TODO
 	 *   PERF_SAMPLE_STACK_USER   - TODO
@@ -824,6 +928,12 @@ static int add_event(struct ctf_writer *cw, struct perf_evsel *evsel)
 			goto err;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		ret = add_bpf_output_types(cw, event_class);
+		if (ret)
+			goto err;
+	}
+
 	ret = bt_ctf_stream_class_add_event_class(cw->stream_class, event_class);
 	if (ret) {
 		pr("Failed to add event class into stream.\n");
-- 
1.8.3.4

* [PATCH 19/54] perf core: Introduce new ioctl options to pause and resume ring buffer
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (17 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 18/54] perf data: Support converting data from bpf_perf_event_output() Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 20/54] perf core: Set event's default overflow_handler Wang Nan
                   ` (34 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Add a new ioctl() to pause/resume ring-buffer output.

In some situations we want to read from the ring buffer only after we
have ensured that nothing can write to it during the read. Without
this patch we have to turn off all events attached to the ring buffer
to achieve this.

This patch prepares for overwrite ring buffer support. Following
commits will introduce new methods for reading from an overwrite ring
buffer. Before reading, the caller must ensure the ring buffer is
frozen; otherwise the read is unreliable.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 13 +++++++++++++
 kernel/events/internal.h        | 11 +++++++++++
 kernel/events/ring_buffer.c     |  7 ++++++-
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 1afe962..a3c1903 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -401,6 +401,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
 #define PERF_EVENT_IOC_ID		_IOR('$', 7, __u64 *)
 #define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, __u32)
+#define PERF_EVENT_IOC_PAUSE_OUTPUT	_IOW('$', 9, __u32)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5946460..3e6a3ad 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4231,6 +4231,19 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 	case PERF_EVENT_IOC_SET_BPF:
 		return perf_event_set_bpf_prog(event, arg);
 
+	case PERF_EVENT_IOC_PAUSE_OUTPUT: {
+		struct ring_buffer *rb;
+
+		rcu_read_lock();
+		rb = rcu_dereference(event->rb);
+		if (!rb) {
+			rcu_read_unlock();
+			return -EINVAL;
+		}
+		rb_toggle_paused(rb, !!arg);
+		rcu_read_unlock();
+		return 0;
+	}
 	default:
 		return -ENOTTY;
 	}
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 2bbad9c..6a93d1b 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -18,6 +18,7 @@ struct ring_buffer {
 #endif
 	int				nr_pages;	/* nr of data pages  */
 	int				overwrite;	/* can overwrite itself */
+	int				paused;		/* writes into ring buffer are blocked */
 
 	atomic_t			poll;		/* POLL_ for wakeups */
 
@@ -65,6 +66,16 @@ static inline void rb_free_rcu(struct rcu_head *rcu_head)
 	rb_free(rb);
 }
 
+static inline void
+rb_toggle_paused(struct ring_buffer *rb,
+		 bool pause)
+{
+	if (!pause && rb->nr_pages)
+		rb->paused = 0;
+	else
+		rb->paused = 1;
+}
+
 extern struct ring_buffer *
 rb_alloc(int nr_pages, long watermark, int cpu, int flags);
 extern void perf_event_wakeup(struct perf_event *event);
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 1faad2c..22e1a47 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -125,8 +125,11 @@ int perf_output_begin(struct perf_output_handle *handle,
 	if (unlikely(!rb))
 		goto out;
 
-	if (unlikely(!rb->nr_pages))
+	if (unlikely(rb->paused)) {
+		if (rb->nr_pages)
+			local_inc(&rb->lost);
 		goto out;
+	}
 
 	handle->rb    = rb;
 	handle->event = event;
@@ -244,6 +247,8 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
 	INIT_LIST_HEAD(&rb->event_list);
 	spin_lock_init(&rb->event_lock);
 	init_irq_work(&rb->irq_work, rb_irq_work);
+
+	rb->paused = rb->nr_pages ? 0 : 1;
 }
 
 static void ring_buffer_put_async(struct ring_buffer *rb)
-- 
1.8.3.4

* [PATCH 20/54] perf core: Set event's default overflow_handler
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (18 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 19/54] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 21/54] perf core: Prepare writing into ring buffer from end Wang Nan
                   ` (33 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Set a default event->overflow_handler in perf_event_alloc() so that
__perf_event_overflow() no longer needs to check event->overflow_handler.
Following commits can give a different default overflow_handler.

No extra overhead is introduced into the hot path, because the original
code already had to read this handler from memory. A conditional branch
is avoided, so we actually remove some instructions.
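
The pattern can be illustrated outside the kernel with a plain
function pointer. This is a simplified model, not the kernel code:
struct event, event_init(), event_overflow() and the hit counters are
hypothetical names chosen for the example.

```c
#include <stddef.h>

struct event;
typedef void (*overflow_fn)(struct event *ev);

/* Simplified stand-in for struct perf_event. */
struct event {
	overflow_fn overflow_handler;
	int default_hits;
	int custom_hits;
};

/* Plays the role of perf_event_output() as the default handler. */
static void default_output(struct event *ev)
{
	ev->default_hits++;
}

/* Plays the role of a caller-supplied overflow handler. */
static void custom_output(struct event *ev)
{
	ev->custom_hits++;
}

/* Resolve the handler once at allocation time, never leaving it NULL. */
static void event_init(struct event *ev, overflow_fn handler)
{
	ev->default_hits = 0;
	ev->custom_hits = 0;
	ev->overflow_handler = handler ? handler : default_output;
}

/* Hot path: an unconditional indirect call, no NULL check. */
static void event_overflow(struct event *ev)
{
	ev->overflow_handler(ev);
}
```

The NULL check moves from every overflow to a single test at
allocation time, which is the trade described above.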

Initial idea comes from Peter at [1].

[1] http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 kernel/events/core.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3e6a3ad..77a6475 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6392,10 +6392,7 @@ static int __perf_event_overflow(struct perf_event *event,
 		irq_work_queue(&event->pending);
 	}
 
-	if (event->overflow_handler)
-		event->overflow_handler(event, data, regs);
-	else
-		perf_event_output(event, data, regs);
+	event->overflow_handler(event, data, regs);
 
 	if (*perf_event_fasync(event) && event->pending_kill) {
 		event->pending_wakeup = 1;
@@ -7869,8 +7866,13 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		context = parent_event->overflow_handler_context;
 	}
 
-	event->overflow_handler	= overflow_handler;
-	event->overflow_handler_context = context;
+	if (overflow_handler) {
+		event->overflow_handler	= overflow_handler;
+		event->overflow_handler_context = context;
+	} else {
+		event->overflow_handler = perf_event_output;
+		event->overflow_handler_context = NULL;
+	}
 
 	perf_event__state_init(event);
 
-- 
1.8.3.4

* [PATCH 21/54] perf core: Prepare writing into ring buffer from end
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (19 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 20/54] perf core: Set event's default overflow_handler Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 22/54] perf core: Add backward attribute to perf event Wang Nan
                   ` (32 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Convert perf_output_begin() to __perf_output_begin() and make the latter
function able to write records from the end of the ring buffer.
Following commits will utilize the 'backward' flag.

This patch doesn't introduce any extra performance overhead since we
use always_inline.
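
The head arithmetic can be modelled in userspace as below. This is a
single-threaded sketch under simplifying assumptions: it leaves out
the local_cmpxchg() retry loop and the free-space check, and
struct rb_model, rb_reserve() and DATA_SIZE are hypothetical names.

```c
#include <stdbool.h>

#define DATA_SIZE 4096UL	/* assumed power-of-two data area size */

/* Simplified stand-in for the head counter in struct ring_buffer. */
struct rb_model {
	unsigned long head;
};

/*
 * Reserve 'size' bytes and return the position inside the data area
 * where the record starts. Forward writes start at the old head;
 * backward writes move head down first and start at the new head.
 */
static unsigned long rb_reserve(struct rb_model *rb, unsigned long size,
				bool backward)
{
	unsigned long offset = rb->head;

	if (!backward) {
		rb->head += size;
	} else {
		rb->head -= size;	/* unsigned wrap-around is intended */
		offset = rb->head;
	}
	return offset & (DATA_SIZE - 1);
}
```

Starting from head == 0, a backward reservation of 16 bytes lands at
position 4080, so the newest record always sits at the lowest address
written so far, wrapping to the end of the data area.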

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 kernel/events/ring_buffer.c | 42 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 22e1a47..37c11c6 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -102,8 +102,21 @@ out:
 	preempt_enable();
 }
 
-int perf_output_begin(struct perf_output_handle *handle,
-		      struct perf_event *event, unsigned int size)
+static bool __always_inline
+ring_buffer_has_space(unsigned long head, unsigned long tail,
+		      unsigned long data_size, unsigned int size,
+		      bool backward)
+{
+	if (!backward)
+		return CIRC_SPACE(head, tail, data_size) >= size;
+	else
+		return CIRC_SPACE(tail, head, data_size) >= size;
+}
+
+static int __always_inline
+__perf_output_begin(struct perf_output_handle *handle,
+		    struct perf_event *event, unsigned int size,
+		    bool backward)
 {
 	struct ring_buffer *rb;
 	unsigned long tail, offset, head;
@@ -146,9 +159,12 @@ int perf_output_begin(struct perf_output_handle *handle,
 	do {
 		tail = READ_ONCE(rb->user_page->data_tail);
 		offset = head = local_read(&rb->head);
-		if (!rb->overwrite &&
-		    unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))
-			goto fail;
+		if (!rb->overwrite) {
+			if (unlikely(!ring_buffer_has_space(head, tail,
+							    perf_data_size(rb),
+							    size, backward)))
+				goto fail;
+		}
 
 		/*
 		 * The above forms a control dependency barrier separating the
@@ -162,9 +178,17 @@ int perf_output_begin(struct perf_output_handle *handle,
 		 * See perf_output_put_handle().
 		 */
 
-		head += size;
+		if (!backward)
+			head += size;
+		else
+			head -= size;
 	} while (local_cmpxchg(&rb->head, offset, head) != offset);
 
+	if (backward) {
+		offset = head;
+		head = (u64)(-head);
+	}
+
 	/*
 	 * We rely on the implied barrier() by local_cmpxchg() to ensure
 	 * none of the data stores below can be lifted up by the compiler.
@@ -206,6 +230,12 @@ out:
 	return -ENOSPC;
 }
 
+int perf_output_begin(struct perf_output_handle *handle,
+		      struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, false);
+}
+
 unsigned int perf_output_copy(struct perf_output_handle *handle,
 		      const void *buf, unsigned int len)
 {
-- 
1.8.3.4

* [PATCH 22/54] perf core: Add backward attribute to perf event
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (20 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 21/54] perf core: Prepare writing into ring buffer from end Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 23/54] perf core: Reduce perf event output overhead by new overflow handler Wang Nan
                   ` (31 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

In perf_event_attr a new bit 'write_backward' is appended to indicate
that this event should write to the ring buffer from its end to its
beginning.

In perf_output_begin(), prepare the ring buffer according to this bit.

This patch introduces a small overhead into perf_output_begin():
an extra memory read and a conditional branch. A later patch removes
this overhead by using a custom output handler.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/linux/perf_event.h      | 5 +++++
 include/uapi/linux/perf_event.h | 3 ++-
 kernel/events/core.c            | 7 +++++++
 kernel/events/ring_buffer.c     | 2 ++
 4 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index b35a61a..0ce1015 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1029,6 +1029,11 @@ static inline bool has_aux(struct perf_event *event)
 	return event->pmu->setup_aux;
 }
 
+static inline bool is_write_backward(struct perf_event *event)
+{
+	return !!event->attr.write_backward;
+}
+
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
 extern void perf_output_end(struct perf_output_handle *handle);
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index a3c1903..43fc8d2 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -340,7 +340,8 @@ struct perf_event_attr {
 				comm_exec      :  1, /* flag comm events that are due to an exec */
 				use_clockid    :  1, /* use @clockid for time fields */
 				context_switch :  1, /* context switch data */
-				__reserved_1   : 37;
+				write_backward :  1, /* Write ring buffer from end to beginning */
+				__reserved_1   : 36;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 77a6475..b12de05 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8102,6 +8102,13 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 		goto out;
 
 	/*
+	 * Either writing ring buffer from beginning or from end.
+	 * Mixing is not allowed.
+	 */
+	if (is_write_backward(output_event) != is_write_backward(event))
+		goto out;
+
+	/*
 	 * If both events generate aux data, they must be on the same PMU
 	 */
 	if (has_aux(event) && has_aux(output_event) &&
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 37c11c6..80b1fa7 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -233,6 +233,8 @@ out:
 int perf_output_begin(struct perf_output_handle *handle,
 		      struct perf_event *event, unsigned int size)
 {
+	if (unlikely(is_write_backward(event)))
+		return __perf_output_begin(handle, event, size, true);
 	return __perf_output_begin(handle, event, size, false);
 }
 
-- 
1.8.3.4


* [PATCH 23/54] perf core: Reduce perf event output overhead by new overflow handler
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (21 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 22/54] perf core: Add backward attribute to perf event Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 24/54] perf tools: Only validate is_pos for tracking evsels Wang Nan
                   ` (30 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Create onward- and backward-specific overflow handlers and choose one
according to the event's write_backward setting when the event is
allocated, so normal sampling events no longer need to check that
setting on every output.

This is the last patch of the backward writing patchset. After this
patch, no extra overhead is added to the fast path of sampling
output.
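
The idea can be illustrated outside the kernel with a minimal sketch.
All names below (toy_event, toy_output_onward, ...) are hypothetical
stand-ins and not the kernel's structures. The direction is looked at
once, when the event is set up, and the per-sample path only makes an
indirect call:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_event; not the kernel's layout. */
struct toy_event {
	bool write_backward;
	long head;	/* toy ring-buffer position */
	void (*overflow_handler)(struct toy_event *event, long size);
};

/* 'backward' is a constant at each call site, so the branch can fold away. */
static inline void toy_output(struct toy_event *event, long size, bool backward)
{
	if (backward)
		event->head -= size;
	else
		event->head += size;
}

static void toy_output_onward(struct toy_event *event, long size)
{
	toy_output(event, size, false);
}

static void toy_output_backward(struct toy_event *event, long size)
{
	toy_output(event, size, true);
}

/* Setup path: the only place that inspects write_backward. */
static void toy_event_init(struct toy_event *event, bool write_backward)
{
	event->write_backward = write_backward;
	event->head = 0;
	event->overflow_handler = write_backward ? toy_output_backward
						 : toy_output_onward;
	assert(event->overflow_handler != NULL);
}
```

A caller would run toy_event_init() once and then invoke
event->overflow_handler(event, size) for every sample, without testing
the direction again.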

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/linux/perf_event.h  | 17 +++++++++++++++--
 kernel/events/core.c        | 41 ++++++++++++++++++++++++++++++++++++-----
 kernel/events/ring_buffer.c | 12 ++++++++++++
 3 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0ce1015..e466cc6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -827,9 +827,15 @@ extern int perf_event_overflow(struct perf_event *event,
 				 struct perf_sample_data *data,
 				 struct pt_regs *regs);
 
+extern void perf_event_output_onward(struct perf_event *event,
+				     struct perf_sample_data *data,
+				     struct pt_regs *regs);
+extern void perf_event_output_backward(struct perf_event *event,
+				       struct perf_sample_data *data,
+				       struct pt_regs *regs);
 extern void perf_event_output(struct perf_event *event,
-				struct perf_sample_data *data,
-				struct pt_regs *regs);
+			      struct perf_sample_data *data,
+			      struct pt_regs *regs);
 
 extern void
 perf_event_header__init_id(struct perf_event_header *header,
@@ -1036,6 +1042,13 @@ static inline bool is_write_backward(struct perf_event *event)
 
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
+extern int perf_output_begin_onward(struct perf_output_handle *handle,
+				    struct perf_event *event,
+				    unsigned int size);
+extern int perf_output_begin_backward(struct perf_output_handle *handle,
+				      struct perf_event *event,
+				      unsigned int size);
+
 extern void perf_output_end(struct perf_output_handle *handle);
 extern unsigned int perf_output_copy(struct perf_output_handle *handle,
 			     const void *buf, unsigned int len);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index b12de05..da6be01 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5531,9 +5531,13 @@ void perf_prepare_sample(struct perf_event_header *header,
 	}
 }
 
-void perf_event_output(struct perf_event *event,
-			struct perf_sample_data *data,
-			struct pt_regs *regs)
+static void __always_inline
+__perf_event_output(struct perf_event *event,
+		    struct perf_sample_data *data,
+		    struct pt_regs *regs,
+		    int (*output_begin)(struct perf_output_handle *,
+					struct perf_event *,
+					unsigned int))
 {
 	struct perf_output_handle handle;
 	struct perf_event_header header;
@@ -5543,7 +5547,7 @@ void perf_event_output(struct perf_event *event,
 
 	perf_prepare_sample(&header, data, event, regs);
 
-	if (perf_output_begin(&handle, event, header.size))
+	if (output_begin(&handle, event, header.size))
 		goto exit;
 
 	perf_output_sample(&handle, &header, data, event);
@@ -5554,6 +5558,30 @@ exit:
 	rcu_read_unlock();
 }
 
+void
+perf_event_output_onward(struct perf_event *event,
+			 struct perf_sample_data *data,
+			 struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin_onward);
+}
+
+void
+perf_event_output_backward(struct perf_event *event,
+			   struct perf_sample_data *data,
+			   struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin_backward);
+}
+
+void
+perf_event_output(struct perf_event *event,
+		  struct perf_sample_data *data,
+		  struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin);
+}
+
 /*
  * read event_id
  */
@@ -7869,8 +7897,11 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	if (overflow_handler) {
 		event->overflow_handler	= overflow_handler;
 		event->overflow_handler_context = context;
+	} else if (is_write_backward(event)) {
+		event->overflow_handler = perf_event_output_backward;
+		event->overflow_handler_context = NULL;
 	} else {
-		event->overflow_handler = perf_event_output;
+		event->overflow_handler = perf_event_output_onward;
 		event->overflow_handler_context = NULL;
 	}
 
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 80b1fa7..7e30e012 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -230,6 +230,18 @@ out:
 	return -ENOSPC;
 }
 
+int perf_output_begin_onward(struct perf_output_handle *handle,
+			     struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, false);
+}
+
+int perf_output_begin_backward(struct perf_output_handle *handle,
+			       struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, true);
+}
+
 int perf_output_begin(struct perf_output_handle *handle,
 		      struct perf_event *event, unsigned int size)
 {
-- 
1.8.3.4


* [PATCH 24/54] perf tools: Only validate is_pos for tracking evsels
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (22 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 23/54] perf core: Reduce perf event output overhead by new overflow handler Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 25/54] perf tools: Print write_backward value in perf_event_attr__fprintf Wang Nan
                   ` (29 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

is_pos is only useful for tracking events (fork, mmap, exit, ...).
Perf collects those events through the evsel with 'tracking' set.
Therefore, there's no need to validate every evsel's is_pos against
evlist->is_pos.

This patch is required once perf supports PERF_SAMPLE_TAILSIZE.
Since there is an extra u64 at the end of this type of evsel's
samples, is_pos for an evsel with PERF_SAMPLE_TAILSIZE set differs
from that of other evsels.
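
As a rough illustration of the relaxed check, here is a minimal
sketch with a hypothetical, simplified evsel (toy_evsel and
toy_valid_sample_type are made-up names, not perf's API). id_pos must
still match for every evsel, while is_pos is compared only for the
tracking one:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified evsel: only the fields the check needs. */
struct toy_evsel {
	bool tracking;	/* collects fork/mmap/exit style events */
	int id_pos;
	int is_pos;
};

/* Returns true when the evsels agree with the list-wide positions. */
static bool toy_valid_sample_type(const struct toy_evsel *evsels, size_t n,
				  int list_id_pos, int list_is_pos)
{
	assert(evsels != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (evsels[i].id_pos != list_id_pos)
			return false;
		/* is_pos only matters for the tracking evsel. */
		if (evsels[i].tracking && evsels[i].is_pos != list_is_pos)
			return false;
	}
	return true;
}
```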

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9b56390..90a9820 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1274,8 +1274,15 @@ bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
 		return false;
 
 	evlist__for_each(evlist, pos) {
-		if (pos->id_pos != evlist->id_pos ||
-		    pos->is_pos != evlist->is_pos)
+		if (pos->id_pos != evlist->id_pos)
+			return false;
+		/*
+		 * Only tracking events need is_pos. Those events are
+		 * collected if evsel->tracking is set.
+		 * For other evsels, is_pos is unused,
+		 * so skip validating them.
+		 */
+		if (pos->tracking && pos->is_pos != evlist->is_pos)
 			return false;
 	}
 
-- 
1.8.3.4


* [PATCH 25/54] perf tools: Print write_backward value in perf_event_attr__fprintf
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (23 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 24/54] perf tools: Only validate is_pos for tracking evsels Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 26/54] perf tools: Make ordered_events reusable Wang Nan
                   ` (28 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Print the write_backward setting in perf_event_attr__fprintf().

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evsel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6cb4c05..60529e5 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1289,6 +1289,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(comm_exec, p_unsigned);
 	PRINT_ATTRf(use_clockid, p_unsigned);
 	PRINT_ATTRf(context_switch, p_unsigned);
+	PRINT_ATTRf(write_backward, p_unsigned);
 
 	PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned);
 	PRINT_ATTRf(bp_type, p_unsigned);
-- 
1.8.3.4


* [PATCH 26/54] perf tools: Make ordered_events reusable
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (24 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 25/54] perf tools: Print write_backward value in perf_event_attr__fprintf Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 27/54] perf record: Extract synthesize code to record__synthesize() Wang Nan
                   ` (27 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

ordered_events__free() leaves the linked lists and timestamps
uncleared, so an ordered_events cannot be reused after it is freed.
This is inconvenient once 'perf record' supports generating multiple
perf.data outputs and processing build-ids for each of them.

Call ordered_events__init() in ordered_events__free() so that
ordered_events can be reused.
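
The free-then-reinitialise pattern can be sketched in isolation as
follows. toy_queue and its helpers are hypothetical and much simpler
than struct ordered_events. The point is only that the callback is
saved before the memset and handed back to the init function:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int (*toy_deliver_t)(int value);

struct toy_node {
	struct toy_node *next;
	int value;
};

/* Hypothetical queue standing in for struct ordered_events. */
struct toy_queue {
	struct toy_node *head;
	unsigned long long last_flush;
	toy_deliver_t deliver;
};

static int toy_deliver_noop(int value)
{
	return value;
}

static void toy_queue_init(struct toy_queue *q, toy_deliver_t deliver)
{
	q->head = NULL;
	q->last_flush = 0;
	q->deliver = deliver;
}

static int toy_queue_push(struct toy_queue *q, int value)
{
	struct toy_node *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->value = value;
	node->next = q->head;
	q->head = node;
	return 0;
}

/* Free every node, then reinitialise so the queue can be used again. */
static void toy_queue_free(struct toy_queue *q)
{
	toy_deliver_t old_deliver = q->deliver;	/* must survive the memset */

	while (q->head) {
		struct toy_node *node = q->head;

		q->head = node->next;
		free(node);
	}
	memset(q, 0, sizeof(*q));
	toy_queue_init(q, old_deliver);
	assert(q->head == NULL);
}
```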

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/ordered-events.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
index b1b9e23..70c0dc8 100644
--- a/tools/perf/util/ordered-events.c
+++ b/tools/perf/util/ordered-events.c
@@ -299,6 +299,8 @@ void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t d
 
 void ordered_events__free(struct ordered_events *oe)
 {
+	ordered_events__deliver_t old_deliver = oe->deliver;
+
 	while (!list_empty(&oe->to_free)) {
 		struct ordered_event *event;
 
@@ -307,4 +309,7 @@ void ordered_events__free(struct ordered_events *oe)
 		free_dup_event(oe, event->event);
 		free(event);
 	}
+
+	memset(oe, '\0', sizeof(*oe));
+	ordered_events__init(oe, old_deliver);
 }
-- 
1.8.3.4


* [PATCH 27/54] perf record: Extract synthesize code to record__synthesize()
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (25 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 26/54] perf tools: Make ordered_events reusable Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 28/54] perf tools: Add perf_data_file__switch() helper Wang Nan
                   ` (26 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Create record__synthesize(). It can be used to create tracking events
for each perf.data once perf supports splitting output into multiple
files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 132 +++++++++++++++++++++++++-------------------
 1 file changed, 76 insertions(+), 56 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index caa8235..a9f001d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -485,6 +485,81 @@ static void workload_exec_failed_signal(int signo __maybe_unused,
 
 static void snapshot_sig_handler(int sig);
 
+static int record__synthesize(struct record *rec)
+{
+	struct perf_session *session = rec->session;
+	struct machine *machine = &session->machines.host;
+	struct perf_data_file *file = &rec->file;
+	struct record_opts *opts = &rec->opts;
+	struct perf_tool *tool = &rec->tool;
+	int fd = perf_data_file__fd(file);
+	int err = 0;
+	static bool warned_kmaps = false, warned_modules = false;
+
+	if (file->is_pipe) {
+		err = perf_event__synthesize_attrs(tool, session,
+						   process_synthesized_event);
+		if (err < 0) {
+			pr_err("Couldn't synthesize attrs.\n");
+			goto out;
+		}
+
+		if (have_tracepoints(&rec->evlist->entries)) {
+			/*
+			 * FIXME err <= 0 here actually means that
+			 * there were no tracepoints so its not really
+			 * an error, just that we don't need to
+			 * synthesize anything.  We really have to
+			 * return this more properly and also
+			 * propagate errors that now are calling die()
+			 */
+			err = perf_event__synthesize_tracing_data(tool,	fd, rec->evlist,
+								  process_synthesized_event);
+			if (err <= 0) {
+				pr_err("Couldn't record tracing data.\n");
+				goto out;
+			}
+			rec->bytes_written += err;
+		}
+	}
+
+	if (rec->opts.full_auxtrace) {
+		err = perf_event__synthesize_auxtrace_info(rec->itr, tool,
+					session, process_synthesized_event);
+		if (err)
+			goto out;
+	}
+
+	err = perf_event__synthesize_kernel_mmap(tool, process_synthesized_event,
+						 machine);
+	if (err < 0 && !warned_kmaps) {
+		warned_kmaps = true;
+		pr_err("Couldn't record kernel reference relocation symbol\n"
+		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
+		       "Check /proc/kallsyms permission or run as root.\n");
+	}
+
+	err = perf_event__synthesize_modules(tool, process_synthesized_event,
+					     machine);
+	if (err < 0 && !warned_modules) {
+		warned_modules = true;
+		pr_err("Couldn't record kernel module information.\n"
+		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
+		       "Check /proc/modules permission or run as root.\n");
+	}
+
+	if (perf_guest) {
+		machines__process_guests(&session->machines,
+					 perf_event__synthesize_guest_os, tool);
+	}
+
+	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
+					    process_synthesized_event, opts->sample_address,
+					    opts->proc_map_timeout);
+out:
+	return err;
+}
+
 static int __cmd_record(struct record *rec, int argc, const char **argv)
 {
 	int err;
@@ -579,63 +654,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	if (file->is_pipe) {
-		err = perf_event__synthesize_attrs(tool, session,
-						   process_synthesized_event);
-		if (err < 0) {
-			pr_err("Couldn't synthesize attrs.\n");
-			goto out_child;
-		}
-
-		if (have_tracepoints(&rec->evlist->entries)) {
-			/*
-			 * FIXME err <= 0 here actually means that
-			 * there were no tracepoints so its not really
-			 * an error, just that we don't need to
-			 * synthesize anything.  We really have to
-			 * return this more properly and also
-			 * propagate errors that now are calling die()
-			 */
-			err = perf_event__synthesize_tracing_data(tool,	fd, rec->evlist,
-								  process_synthesized_event);
-			if (err <= 0) {
-				pr_err("Couldn't record tracing data.\n");
-				goto out_child;
-			}
-			rec->bytes_written += err;
-		}
-	}
-
-	if (rec->opts.full_auxtrace) {
-		err = perf_event__synthesize_auxtrace_info(rec->itr, tool,
-					session, process_synthesized_event);
-		if (err)
-			goto out_delete_session;
-	}
-
-	err = perf_event__synthesize_kernel_mmap(tool, process_synthesized_event,
-						 machine);
-	if (err < 0)
-		pr_err("Couldn't record kernel reference relocation symbol\n"
-		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
-		       "Check /proc/kallsyms permission or run as root.\n");
-
-	err = perf_event__synthesize_modules(tool, process_synthesized_event,
-					     machine);
+	err = record__synthesize(rec);
 	if (err < 0)
-		pr_err("Couldn't record kernel module information.\n"
-		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
-		       "Check /proc/modules permission or run as root.\n");
-
-	if (perf_guest) {
-		machines__process_guests(&session->machines,
-					 perf_event__synthesize_guest_os, tool);
-	}
-
-	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
-					    process_synthesized_event, opts->sample_address,
-					    opts->proc_map_timeout);
-	if (err != 0)
 		goto out_child;
 
 	if (rec->realtime_prio) {
-- 
1.8.3.4


* [PATCH 28/54] perf tools: Add perf_data_file__switch() helper
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (26 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 27/54] perf record: Extract synthesize code to record__synthesize() Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 29/54] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
                   ` (25 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

perf_data_file__switch() renames the current output file, closes it,
then opens a new one to continue recording. It will be used by perf
record to split output into multiple perf.data files.
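
A stand-alone sketch of the rename-then-reopen step, using only POSIX
calls, might look like this. toy_switch_output() is a hypothetical
helper, not the function added by this patch. Unlike the patch, it
checks the result of rename() and has no at_exit mode:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Rename path to "path.postfix", close old_fd, then open a fresh,
 * empty file at path. Returns the new fd, or -1 on error.
 */
static int toy_switch_output(const char *path, const char *postfix, int old_fd)
{
	char new_path[4096];
	int n;

	assert(path != NULL && postfix != NULL);

	n = snprintf(new_path, sizeof(new_path), "%s.%s", path, postfix);
	if (n < 0 || (size_t)n >= sizeof(new_path))
		return -1;
	if (rename(path, new_path) != 0)
		return -1;
	close(old_fd);
	return open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
}
```

Data already written through old_fd stays in the renamed file, and
later writes go to the new fd.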

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data.c | 36 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/data.h | 11 ++++++++++-
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 1921942..bfded6a 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -136,3 +136,39 @@ ssize_t perf_data_file__write(struct perf_data_file *file,
 {
 	return writen(file->fd, buf, size);
 }
+
+int perf_data_file__switch(struct perf_data_file *file,
+			   const char *postfix,
+			   size_t pos, bool at_exit)
+{
+	char *new_filepath;
+	int ret;
+
+	if (check_pipe(file))
+		return -EINVAL;
+	if (perf_data_file__is_read(file))
+		return -EINVAL;
+
+	if (asprintf(&new_filepath, "%s.%s", file->path, postfix) < 0)
+		return -ENOMEM;
+
+	rename(file->path, new_filepath);
+
+	if (!at_exit) {
+		close(file->fd);
+		ret = perf_data_file__open(file);
+		if (ret < 0)
+			goto out;
+
+		if (lseek(file->fd, pos, SEEK_SET) == (off_t)-1) {
+			ret = -errno;
+			pr_debug("Failed to lseek to %zu: %s\n",
+				 pos, strerror(errno));
+			goto out;
+		}
+	}
+	ret = file->fd;
+out:
+	free(new_filepath);
+	return ret;
+}
diff --git a/tools/perf/util/data.h b/tools/perf/util/data.h
index 2b15d0c..ae510ce 100644
--- a/tools/perf/util/data.h
+++ b/tools/perf/util/data.h
@@ -46,5 +46,14 @@ int perf_data_file__open(struct perf_data_file *file);
 void perf_data_file__close(struct perf_data_file *file);
 ssize_t perf_data_file__write(struct perf_data_file *file,
 			      void *buf, size_t size);
-
+/*
+ * If at_exit is set, only rename the current perf.data to
+ * perf.data.<postfix> and keep writing to the original file.
+ * Set at_exit when flushing the last output.
+ *
+ * Return value is the fd of the new output.
+ */
+int perf_data_file__switch(struct perf_data_file *file,
+			   const char *postfix,
+			   size_t pos, bool at_exit);
 #endif /* __PERF_DATA_H */
-- 
1.8.3.4


* [PATCH 29/54] perf record: Turns auxtrace_snapshot_enable into 3 states
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (27 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 28/54] perf tools: Add perf_data_file__switch() helper Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 30/54] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
                   ` (24 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

auxtrace_snapshot_enabled has only two states (0/1). Turn it into a
three-state enum so the SIGUSR2 handler can safely do other work
without triggering an auxtrace snapshot.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 59 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a9f001d..2b45f7f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -123,7 +123,43 @@ out:
 static volatile int done;
 static volatile int signr = -1;
 static volatile int child_finished;
-static volatile int auxtrace_snapshot_enabled;
+
+static volatile enum {
+	AUXTRACE_SNAPSHOT_OFF = -1,
+	AUXTRACE_SNAPSHOT_DISABLED = 0,
+	AUXTRACE_SNAPSHOT_ENABLED = 1,
+} auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_OFF;
+
+static inline void
+auxtrace_snapshot_on(void)
+{
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_DISABLED;
+}
+
+static inline void
+auxtrace_snapshot_enable(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return;
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_ENABLED;
+}
+
+static inline void
+auxtrace_snapshot_disable(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return;
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_DISABLED;
+}
+
+static inline bool
+auxtrace_snapshot_is_enabled(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return false;
+	return auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_ENABLED;
+}
+
 static volatile int auxtrace_snapshot_err;
 static volatile int auxtrace_record__snapshot_started;
 
@@ -247,7 +283,7 @@ static void record__read_auxtrace_snapshot(struct record *rec)
 	} else {
 		auxtrace_snapshot_err = auxtrace_record__snapshot_finish(rec->itr);
 		if (!auxtrace_snapshot_err)
-			auxtrace_snapshot_enabled = 1;
+			auxtrace_snapshot_enable();
 	}
 }
 
@@ -580,10 +616,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
-	if (rec->opts.auxtrace_snapshot_mode)
+
+	if (rec->opts.auxtrace_snapshot_mode) {
 		signal(SIGUSR2, snapshot_sig_handler);
-	else
+		auxtrace_snapshot_on();
+	} else {
 		signal(SIGUSR2, SIG_IGN);
+	}
 
 	session = perf_session__new(file, false, tool);
 	if (session == NULL) {
@@ -709,12 +748,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		perf_evlist__enable(rec->evlist);
 	}
 
-	auxtrace_snapshot_enabled = 1;
+	auxtrace_snapshot_enable();
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
 		if (record__mmap_read_all(rec) < 0) {
-			auxtrace_snapshot_enabled = 0;
+			auxtrace_snapshot_disable();
 			err = -1;
 			goto out_child;
 		}
@@ -752,12 +791,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		 * disable events in this case.
 		 */
 		if (done && !disabled && !target__none(&opts->target)) {
-			auxtrace_snapshot_enabled = 0;
+			auxtrace_snapshot_disable();
 			perf_evlist__disable(rec->evlist);
 			disabled = true;
 		}
 	}
-	auxtrace_snapshot_enabled = 0;
+	auxtrace_snapshot_disable();
 
 	if (forks && workload_exec_errno) {
 		char msg[STRERR_BUFSIZE];
@@ -1319,9 +1358,9 @@ out_symbol_exit:
 
 static void snapshot_sig_handler(int sig __maybe_unused)
 {
-	if (!auxtrace_snapshot_enabled)
+	if (!auxtrace_snapshot_is_enabled())
 		return;
-	auxtrace_snapshot_enabled = 0;
+	auxtrace_snapshot_disable();
 	auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
 	auxtrace_record__snapshot_started = 1;
 }
-- 
1.8.3.4


* [PATCH 30/54] perf record: Introduce record__finish_output() to finish a perf.data
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (28 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 29/54] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 31/54] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
                   ` (23 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Move the code for finalizing 'perf.data' to record__finish_output().
It will be used by the following commits to split output into
multiple files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 37 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 2b45f7f..c8d9c0b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -503,6 +503,29 @@ static void record__init_features(struct record *rec)
 	perf_header__clear_feat(&session->header, HEADER_STAT);
 }
 
+static void
+record__finish_output(struct record *rec)
+{
+	struct perf_data_file *file = &rec->file;
+	int fd = perf_data_file__fd(file);
+
+	if (file->is_pipe)
+		return;
+
+	rec->session->header.data_size += rec->bytes_written;
+	file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
+
+	if (!rec->no_buildid) {
+		process_buildids(rec);
+
+		if (rec->buildid_all)
+			dsos__hit_all(rec->session);
+	}
+	perf_session__write_header(rec->session, rec->evlist, fd, true);
+
+	return;
+}
+
 static volatile int workload_exec_errno;
 
 /*
@@ -830,18 +853,8 @@ out_child:
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
-	if (!err && !file->is_pipe) {
-		rec->session->header.data_size += rec->bytes_written;
-		file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
-
-		if (!rec->no_buildid) {
-			process_buildids(rec);
-
-			if (rec->buildid_all)
-				dsos__hit_all(rec->session);
-		}
-		perf_session__write_header(rec->session, rec->evlist, fd, true);
-	}
+	if (!err)
+		record__finish_output(rec);
 
 	if (!err && !quiet) {
 		char samples[128];
-- 
1.8.3.4


* [PATCH 31/54] perf record: Add '--timestamp-filename' option to append timestamp to output filename
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (29 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 30/54] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 32/54] perf record: Split output into multiple files via '--switch-output' Wang Nan
                   ` (22 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This option appends the current timestamp to the output file name. For example:

 # perf record -a --timestamp-filename
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2015122622265847 ]
 [ perf record: Captured and wrote 0.742 MB perf.data (90 samples) ]
 # ls
 perf.data.2015122622265847

Once 'perf record' supports generating multiple output files, the
timestamp will help identify each of them.
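
The file name suffix can be sketched as follows. This is only an
illustration, not the perf code: the patch calls fetch_current_timestamp(),
whose body is not part of this diff, so format_timestamp() and
timestamped_path() are hypothetical names, and the layout (14 date/time
digits followed by two centisecond digits) is an assumption inferred from
the 16-character example above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

/* Hypothetical helper: write "YYYYMMDDHHMMSScc" (cc = centiseconds). */
static int format_timestamp(const struct timeval *tv, char *buf, size_t sz)
{
	struct tm tm;
	char date[15];		/* 14 digits + NUL */
	time_t sec = tv->tv_sec;
	int n;

	if (!localtime_r(&sec, &tm))
		return -1;
	if (!strftime(date, sizeof(date), "%Y%m%d%H%M%S", &tm))
		return -1;
	n = snprintf(buf, sz, "%s%02u", date,
		     (unsigned int)(tv->tv_usec / 10000));
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}

/* Hypothetical helper: build "<path>.<timestamp>". */
static int timestamped_path(const char *path, const struct timeval *tv,
			    char *out, size_t sz)
{
	char ts[17];		/* 16 digits + NUL */
	int n;

	if (format_timestamp(tv, ts, sizeof(ts)))
		return -1;
	n = snprintf(out, sz, "%s.%s", path, ts);
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}
```

A caller would fill a struct timeval with gettimeofday() and pass the
configured output path; both helpers return -1 when the result does not fit
the buffer.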

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 47 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index c8d9c0b..a561599 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -54,6 +54,7 @@ struct record {
 	bool			no_buildid_cache;
 	bool			no_buildid_cache_set;
 	bool			buildid_all;
+	bool			timestamp_filename;
 	unsigned long long	samples;
 };
 
@@ -526,6 +527,37 @@ record__finish_output(struct record *rec)
 	return;
 }
 
+static int
+record__switch_output(struct record *rec, bool at_exit)
+{
+	struct perf_data_file *file = &rec->file;
+	int fd, err;
+
+	/* Same Size:      "2015122520103046"*/
+	char timestamp[] = "InvalidTimestamp";
+
+	rec->samples = 0;
+	record__finish_output(rec);
+	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
+	if (err) {
+		pr_err("Failed to get current timestamp\n");
+		return -EINVAL;
+	}
+
+	fd = perf_data_file__switch(file, timestamp,
+				    rec->session->header.data_offset,
+				    at_exit);
+	if (fd >= 0 && !at_exit) {
+		rec->bytes_written = 0;
+		rec->session->header.data_size = 0;
+	}
+
+	if (!quiet)
+		fprintf(stderr, "[ perf record: Dump %s.%s ]\n",
+			file->path, timestamp);
+	return fd;
+}
+
 static volatile int workload_exec_errno;
 
 /*
@@ -853,8 +885,17 @@ out_child:
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
-	if (!err)
-		record__finish_output(rec);
+	if (!err) {
+		if (!rec->timestamp_filename) {
+			record__finish_output(rec);
+		} else {
+			fd = record__switch_output(rec, true);
+			if (fd < 0) {
+				status = fd;
+				goto out_delete_session;
+			}
+		}
+	}
 
 	if (!err && !quiet) {
 		char samples[128];
@@ -1231,6 +1272,8 @@ struct option __record_options[] = {
 		   "file", "vmlinux pathname"),
 	OPT_BOOLEAN(0, "buildid-all", &record.buildid_all,
 		    "Record build-id of all DSOs regardless of hits"),
+	OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
+		    "append timestamp to output filename"),
 	OPT_END()
 };
 
-- 
1.8.3.4

* [PATCH 32/54] perf record: Split output into multiple files via '--switch-output'
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (30 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 31/54] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 33/54] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
                   ` (21 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Allow 'perf record' to split its output into multiple files.

For example:

 # ~/perf record -a --timestamp-filename --switch-output &
 [1] 10763
 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622314468 ]

 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622314762 ]

 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622315171 ]

 # fg
 perf record -a --timestamp-filename --switch-output
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2015122622315513 ]
 [ perf record: Captured and wrote 0.014 MB perf.data (296 samples) ]

 # ls -l
 total 920
 -rw------- 1 root root 797692 Dec 26 22:31 perf.data.2015122622314468
 -rw------- 1 root root  59960 Dec 26 22:31 perf.data.2015122622314762
 -rw------- 1 root root  59912 Dec 26 22:31 perf.data.2015122622315171
 -rw------- 1 root root  19220 Dec 26 22:31 perf.data.2015122622315513
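
The signal handling used here follows a common pattern: the handler only
sets a flag, and the main loop performs the actual switch. A minimal
standalone sketch of that pattern is shown below. It is not the code in
this patch: the function names are hypothetical, and it uses sigaction()
and sig_atomic_t where the patch uses signal() and a volatile int.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t switch_requested;

static void on_sigusr2(int sig)
{
	(void)sig;
	switch_requested = 1;	/* only set a flag; do the work later */
}

/* Install the SIGUSR2 handler. Returns 0 on success, -1 on failure. */
static int install_switch_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigusr2;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR2, &sa, NULL);
}

/* Called from the main loop: returns 1 once per pending request. */
static int consume_switch_request(void)
{
	if (!switch_requested)
		return 0;
	switch_requested = 0;
	return 1;
}
```

Keeping the handler this small avoids calling functions that are not
async-signal-safe; the file switch itself runs in normal context.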

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a561599..4e03a20 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -55,6 +55,7 @@ struct record {
 	bool			no_buildid_cache_set;
 	bool			buildid_all;
 	bool			timestamp_filename;
+	bool			switch_output;
 	unsigned long long	samples;
 };
 
@@ -163,6 +164,7 @@ auxtrace_snapshot_is_enabled(void)
 
 static volatile int auxtrace_snapshot_err;
 static volatile int auxtrace_record__snapshot_started;
+static volatile int switch_output_started;
 
 static void sig_handler(int sig)
 {
@@ -672,7 +674,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
-	if (rec->opts.auxtrace_snapshot_mode) {
+	if (rec->opts.auxtrace_snapshot_mode || rec->switch_output) {
 		signal(SIGUSR2, snapshot_sig_handler);
 		auxtrace_snapshot_on();
 	} else {
@@ -824,9 +826,25 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			}
 		}
 
+		if (switch_output_started) {
+			switch_output_started = 0;
+
+			if (!quiet)
+				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
+					waking);
+			waking = 0;
+			fd = record__switch_output(rec, false);
+			if (fd < 0) {
+				pr_err("Failed to switch to new file\n");
+				err = fd;
+				goto out_child;
+			}
+		}
+
 		if (hits == rec->samples) {
 			if (done || draining)
 				break;
+
 			err = perf_evlist__poll(rec->evlist, -1);
 			/*
 			 * Propagate error, only if there's any. Ignore positive
@@ -1274,6 +1292,8 @@ struct option __record_options[] = {
 		    "Record build-id of all DSOs regardless of hits"),
 	OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
 		    "append timestamp to output filename"),
+	OPT_BOOLEAN(0, "switch-output", &record.switch_output,
+		    "Switch output when receive SIGUSR2"),
 	OPT_END()
 };
 
@@ -1414,9 +1434,11 @@ out_symbol_exit:
 
 static void snapshot_sig_handler(int sig __maybe_unused)
 {
-	if (!auxtrace_snapshot_is_enabled())
-		return;
-	auxtrace_snapshot_disable();
-	auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
-	auxtrace_record__snapshot_started = 1;
+	if (auxtrace_snapshot_is_enabled()) {
+		auxtrace_snapshot_disable();
+		auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
+		auxtrace_record__snapshot_started = 1;
+	}
+
+	switch_output_started = 1;
 }
-- 
1.8.3.4

* [PATCH 33/54] perf record: Force enable --timestamp-filename when --switch-output is provided
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (31 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 32/54] perf record: Split output into multiple files via '--switch-output' Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:01 ` [PATCH 34/54] perf record: Disable buildid cache options by default in switch output mode Wang Nan
                   ` (20 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Without this patch, the last output file doesn't have a timestamp appended
if --timestamp-filename is not explicitly provided. For example:

 # perf record -a --switch-output &
 [1] 11224
 # kill -s SIGUSR2 11224
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622372823 ]

 # fg
 perf record -a --switch-output
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.027 MB perf.data (540 samples) ]

 # ls -l
 total 836
 -rw------- 1 root root  33256 Dec 26 22:37 perf.data   <---- *Odd*
 -rw------- 1 root root 817156 Dec 26 22:37 perf.data.2015122622372823

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4e03a20..dcb6ae3 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1349,6 +1349,9 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		return -EINVAL;
 	}
 
+	if (rec->switch_output)
+		rec->timestamp_filename = true;
+
 	if (!rec->itr) {
 		rec->itr = auxtrace_record__init(rec->evlist, &err);
 		if (err)
-- 
1.8.3.4

* [PATCH 34/54] perf record: Disable buildid cache options by default in switch output mode
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (32 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 33/54] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
@ 2016-02-05 14:01 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 35/54] perf record: Re-synthesize tracking events after output switching Wang Nan
                   ` (19 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Build-id cache processing is expensive: perf reads all events in the
output perf.data, opens ELF files to read their build-ids, then copies
them into the ~/.debug directory. In switch output mode this keeps perf
from receiving perf events for too long.

Enable no-buildid and no-buildid-cache by default if --switch-output
is provided. Users can still pass --no-no-buildid to explicitly enable
build-ids in this case.
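
The resulting decision can be written as a small pure function. The sketch
below mirrors the condition in the diff; struct buildid_opts and
should_disable_buildid_cache() are hypothetical names used only for
illustration, not part of perf.

```c
#include <stdbool.h>

/* Hypothetical container for the option state consulted by the patch. */
struct buildid_opts {
	bool no_buildid;
	bool no_buildid_set;		/* user passed --[no-]no-buildid */
	bool no_buildid_cache;
	bool no_buildid_cache_set;	/* user passed --[no-]no-buildid-cache */
	bool switch_output;
};

static bool should_disable_buildid_cache(const struct buildid_opts *o)
{
	/* Explicitly disabled: same behaviour as before the patch. */
	if (o->no_buildid_cache || o->no_buildid)
		return true;
	/* Outside switch output mode the default stays enabled. */
	if (!o->switch_output)
		return false;
	/* In switch output mode, stay enabled only on explicit request. */
	if (o->no_buildid_set && !o->no_buildid)
		return false;
	if (o->no_buildid_cache_set && !o->no_buildid_cache)
		return false;
	return true;
}
```

For instance, with only switch_output set the function returns true, while
additionally setting no_buildid_set (with no_buildid left false, as after
--no-no-buildid) makes it return false.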

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dcb6ae3..238234e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1377,8 +1377,36 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 "If some relocation was applied (e.g. kexec) symbols may be misresolved\n"
 "even with a suitable vmlinux or kallsyms file.\n\n");
 
-	if (rec->no_buildid_cache || rec->no_buildid)
+	if (rec->no_buildid_cache || rec->no_buildid) {
 		disable_buildid_cache();
+	} else if (rec->switch_output) {
+		/*
+		 * In 'perf record --switch-output', disable buildid
+		 * generation by default to reduce data file switching
+		 * overhead. Still generate buildid if they are required
+		 * explicitly using
+		 *
+		 *  perf record --switch-output --no-no-buildid \
+		 *              --no-no-buildid-cache
+		 *
+		 * Following code equals to:
+		 *
+		 * if ((rec->no_buildid || !rec->no_buildid_set) &&
+		 *     (rec->no_buildid_cache || !rec->no_buildid_cache_set))
+		 *         disable_buildid_cache();
+		 */
+		bool disable = true;
+
+		if (rec->no_buildid_set && !rec->no_buildid)
+			disable = false;
+		if (rec->no_buildid_cache_set && !rec->no_buildid_cache)
+			disable = false;
+		if (disable) {
+			rec->no_buildid = true;
+			rec->no_buildid_cache = true;
+			disable_buildid_cache();
+		}
+	}
 
 	if (rec->evlist->nr_entries == 0 &&
 	    perf_evlist__add_default(rec->evlist) < 0) {
-- 
1.8.3.4

* [PATCH 35/54] perf record: Re-synthesize tracking events after output switching
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (33 preceding siblings ...)
  2016-02-05 14:01 ` [PATCH 34/54] perf record: Disable buildid cache options by default in switch output mode Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 36/54] perf record: Generate tracking events for process forked by perf Wang Nan
                   ` (18 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Tracking events describe kernel and threads. They are generated by
reading /proc/kallsyms, /proc/*/maps and /proc/*/task/* during
initialization of 'perf record', serialized into event sequences and put
at the head of 'perf.data'. In case of output switching, each output
file should contain those events.

This patch calls record__synthesize() during output switching, so the
event sequences described above can be collected again.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 238234e..de51134 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -529,6 +529,8 @@ record__finish_output(struct record *rec)
 	return;
 }
 
+static int record__synthesize(struct record *rec);
+
 static int
 record__switch_output(struct record *rec, bool at_exit)
 {
@@ -557,6 +559,15 @@ record__switch_output(struct record *rec, bool at_exit)
 	if (!quiet)
 		fprintf(stderr, "[ perf record: Dump %s.%s ]\n",
 			file->path, timestamp);
+
+	/* Reinit machine */
+	if (!at_exit) {
+		machines__exit(&rec->session->machines);
+		machines__init(&rec->session->machines);
+		perf_session__create_kernel_maps(rec->session);
+		perf_session__set_id_hdr_size(rec->session);
+		record__synthesize(rec);
+	}
 	return fd;
 }
 
-- 
1.8.3.4

* [PATCH 36/54] perf record: Generate tracking events for process forked by perf
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (34 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 35/54] perf record: Re-synthesize tracking events after output switching Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 37/54] perf record: Ensure return non-zero rc when mmap fail Wang Nan
                   ` (17 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

With 'perf record --switch-output' without -a, record__synthesize() in
record__switch_output() won't generate tracking events because there's
no thread_map in the evlist. As a result, the newly created perf.data
doesn't contain map and comm information.

This patch creates a fake thread_map and directly calls
perf_event__synthesize_thread_map() for those events.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index de51134..e6a8b31 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -567,6 +567,23 @@ record__switch_output(struct record *rec, bool at_exit)
 		perf_session__create_kernel_maps(rec->session);
 		perf_session__set_id_hdr_size(rec->session);
 		record__synthesize(rec);
+
+		if (target__none(&rec->opts.target)) {
+			struct {
+				struct thread_map map;
+				struct thread_map_data map_data;
+			} thread_map;
+
+			thread_map.map.nr = 1;
+			thread_map.map.map[0].pid = rec->evlist->workload.pid;
+			thread_map.map.map[0].comm = NULL;
+			perf_event__synthesize_thread_map(&rec->tool,
+					&thread_map.map,
+					process_synthesized_event,
+					&rec->session->machines.host,
+					rec->opts.sample_address,
+					rec->opts.proc_map_timeout);
+		}
 	}
 	return fd;
 }
-- 
1.8.3.4

* [PATCH 37/54] perf record: Ensure return non-zero rc when mmap fail
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (35 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 36/54] perf record: Generate tracking events for process forked by perf Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 38/54] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
                   ` (16 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

perf_evlist__mmap_ex() can fail without setting errno (for example,
when a condition check fails while every system call succeeds).
If this happens, record__open() incorrectly returns 0. Forcing rc to a
non-zero value is a quick way to avoid this problem; otherwise we would
have to follow every possible code path in perf_evlist__mmap_ex() to
make sure at least one system call sets errno before an error is returned.
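
The fallback amounts to the following helper. This is a sketch only;
error_code_from_errno() is a hypothetical name, and the patch open-codes
the same check inside record__open().

```c
#include <errno.h>

/*
 * Map a saved errno value to a negative return code, never returning 0:
 * a failure that left errno untouched is reported as -EINVAL.
 */
static int error_code_from_errno(int saved_errno)
{
	return saved_errno ? -saved_errno : -EINVAL;
}
```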

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e6a8b31..9265948 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -362,7 +362,10 @@ try_again:
 		} else {
 			pr_err("failed to mmap with %d (%s)\n", errno,
 				strerror_r(errno, msg, sizeof(msg)));
-			rc = -errno;
+			if (errno)
+				rc = -errno;
+			else
+				rc = -EINVAL;
 		}
 		goto out;
 	}
-- 
1.8.3.4

* [PATCH 38/54] perf record: Prevent reading invalid data in record__mmap_read
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (36 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 37/54] perf record: Ensure return non-zero rc when mmap fail Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 39/54] perf tools: Add evlist channel helpers Wang Nan
                   ` (15 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

When record__mmap_read() is asked to read more data than the ring
buffer can hold, drop that data to avoid accessing invalid memory.

This can happen when reading from an overwritable ring buffer, which
should be avoided. Check for it anyway, for robustness.
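
The check itself is plain unsigned arithmetic on the head and tail
positions. A standalone sketch, assuming (as the diff does) that mask + 1
is the ring buffer size in bytes; span_fits_ring() is a hypothetical name:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * head and old are free-running byte positions, so head - old is the
 * amount of unread data even after the counters wrap around.
 */
static bool span_fits_ring(uint64_t head, uint64_t old, uint64_t mask)
{
	uint64_t size = head - old;

	return size <= mask + 1;
}
```

When this returns false the reader has fallen behind by more than one full
buffer, so the patch skips the data by moving md->prev to head instead of
copying it.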

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 9265948..0a4f3ec 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -37,6 +37,7 @@
 #include <unistd.h>
 #include <sched.h>
 #include <sys/mman.h>
+#include <asm/bug.h>
 
 
 struct record {
@@ -95,6 +96,13 @@ static int record__mmap_read(struct record *rec, int idx)
 	rec->samples++;
 
 	size = head - old;
+	if (size > (unsigned long)(md->mask) + 1) {
+		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
+
+		md->prev = head;
+		perf_evlist__mmap_consume(rec->evlist, idx);
+		return 0;
+	}
 
 	if ((old & md->mask) + size != (head & md->mask)) {
 		buf = &data[old & md->mask];
-- 
1.8.3.4

* [PATCH 39/54] perf tools: Add evlist channel helpers
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (37 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 38/54] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 40/54] perf tools: Automatically add new channel according to evlist Wang Nan
                   ` (14 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This commit introduces several helpers to support the concept of
channels. A channel holds a group of evsels that are configured
differently from those in other channels. Channels will be used for
overwritable evsels, which allow perf to record some events continuously
while capturing a snapshot of other events when something happens.
Tracking events (mmap, mmap2, fork, exit ...) are another set of events
worth putting into a separate channel.

Channels are represented by an array of channel flags. Each channel
contains evlist->nr_mmaps mmaps. Channels are configured before
perf_evlist__mmap_ex() is called. That function allocates the nr_mmaps
mmaps of every channel together as one big array.
perf_evlist__channel_idx() converts between an index in the big array
and a (channel, index) pair. For API functions which accept idx, _ex()
versions are introduced that select an mmap from a given channel.
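
The index arithmetic can be sketched in isolation as below. The names are
hypothetical and the sketch is not the perf implementation. In particular,
perf_evlist__idx_channel() is not shown in this diff, so deriving the
channel as flat / nr_mmaps is an assumption, and SKETCH_NR_CHANNELS is set
to 2 only to make the example non-trivial (the patch defines
PERF_EVLIST__NR_CHANNELS as 1).

```c
#include <errno.h>

#define SKETCH_NR_CHANNELS 2	/* hypothetical; the patch uses 1 */

/* (channel, idx) -> position in the flat mmap array, or -errno. */
static int channel_to_flat(int nr_mmaps, int channel, int idx)
{
	if (channel < 0 || idx < 0)
		return -EINVAL;
	if (channel >= SKETCH_NR_CHANNELS || idx >= nr_mmaps)
		return -E2BIG;
	return nr_mmaps * channel + idx;
}

/* Flat position -> (channel, idx); assumes channel = flat / nr_mmaps. */
static int flat_to_channel(int nr_mmaps, int flat, int *channel, int *idx)
{
	if (flat < 0 || nr_mmaps <= 0)
		return -EINVAL;
	if (flat / nr_mmaps >= SKETCH_NR_CHANNELS)
		return -E2BIG;
	*channel = flat / nr_mmaps;
	*idx = flat % nr_mmaps;
	return 0;
}
```

With nr_mmaps = 4, channel 1 and idx 2 map to flat position 6, and
position 6 maps back to channel 1, idx 2.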

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |   6 ++
 tools/perf/util/evlist.c    | 132 ++++++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/evlist.h    |  58 +++++++++++++++++++
 3 files changed, 190 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0a4f3ec..2d9e6c6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -356,6 +356,12 @@ try_again:
 		goto out;
 	}
 
+	perf_evlist__channel_reset(evlist);
+	rc = perf_evlist__channel_add(evlist, 0, true);
+	if (rc < 0)
+		goto out;
+	rc = 0;
+
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 90a9820..36f8c66 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -679,14 +679,51 @@ static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
 	return NULL;
 }
 
-union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx)
+{
+	int channel = *p_channel;
+	int _idx = *p_idx;
+
+	if (_idx < 0)
+		return -EINVAL;
+	/*
+	 * Negative channel means caller explicitly use real index.
+	 */
+	if (channel < 0) {
+		channel = perf_evlist__idx_channel(evlist, _idx);
+		_idx = _idx % evlist->nr_mmaps;
+	}
+	if (channel < 0)
+		return channel;
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	if (_idx >= evlist->nr_mmaps)
+		return -E2BIG;
+
+	*p_channel = channel;
+	*p_idx = evlist->nr_mmaps * channel + _idx;
+	return 0;
+}
+
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx)
 {
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
 	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head;
-	u64 old = md->prev;
-	unsigned char *data = md->base + page_size;
+	u64 old;
+	unsigned char *data;
 	union perf_event *event = NULL;
 
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return NULL;
+	}
+	old = md->prev;
+	data = md->base + page_size;
+
 	/*
 	 * Check if event was unmapped due to a POLLHUP/POLLERR.
 	 */
@@ -748,6 +785,11 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 	return event;
 }
 
+union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+{
+	return perf_evlist__mmap_read_ex(evlist, -1, idx);
+}
+
 static bool perf_mmap__empty(struct perf_mmap *md)
 {
 	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
@@ -766,10 +808,18 @@ static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 		__perf_evlist__munmap(evlist, idx);
 }
 
-void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx)
 {
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
 	struct perf_mmap *md = &evlist->mmap[idx];
 
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return;
+	}
+
 	if (!evlist->overwrite) {
 		u64 old = md->prev;
 
@@ -780,6 +830,11 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 		perf_evlist__mmap_put(evlist, idx);
 }
 
+void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+{
+	perf_evlist__mmap_consume_ex(evlist, -1, idx);
+}
+
 int __weak auxtrace_mmap__mmap(struct auxtrace_mmap *mm __maybe_unused,
 			       struct auxtrace_mmap_params *mp __maybe_unused,
 			       void *userpg __maybe_unused,
@@ -825,7 +880,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 	if (evlist->mmap == NULL)
 		return;
 
-	for (i = 0; i < evlist->nr_mmaps; i++)
+	for (i = 0; i < perf_evlist__mmap_nr(evlist); i++)
 		__perf_evlist__munmap(evlist, i);
 
 	zfree(&evlist->mmap);
@@ -833,10 +888,17 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
+	int total_mmaps;
+
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
-	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
+
+	total_mmaps = perf_evlist__mmap_nr(evlist);
+	if (!total_mmaps)
+		return -EINVAL;
+
+	evlist->mmap = zalloc(total_mmaps * sizeof(struct perf_mmap));
 	return evlist->mmap != NULL ? 0 : -ENOMEM;
 }
 
@@ -1137,6 +1199,12 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	int err;
+
+	perf_evlist__channel_reset(evlist);
+	err = perf_evlist__channel_add(evlist, 0, true);
+	if (err < 0)
+		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
@@ -1746,3 +1814,55 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 
 	return NULL;
 }
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist)
+{
+	int i;
+
+	for (i = PERF_EVLIST__NR_CHANNELS - 1; i >= 0; i--) {
+		unsigned long flags = evlist->channel_flags[i];
+
+		if (flags & PERF_EVLIST__CHANNEL_ENABLED)
+			return i + 1;
+	}
+	return 0;
+}
+
+int perf_evlist__mmap_nr(struct perf_evlist *evlist)
+{
+	return evlist->nr_mmaps * perf_evlist__channel_nr(evlist);
+}
+
+void perf_evlist__channel_reset(struct perf_evlist *evlist)
+{
+	int i;
+
+	BUG_ON(evlist->mmap);
+
+	for (i = 0; i < PERF_EVLIST__NR_CHANNELS; i++)
+		evlist->channel_flags[i] = 0;
+}
+
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default)
+{
+	int n = perf_evlist__channel_nr(evlist);
+	unsigned long *flags = evlist->channel_flags;
+
+	BUG_ON(evlist->mmap);
+
+	if (n >= PERF_EVLIST__NR_CHANNELS) {
+		pr_debug("ERROR: too many channels. Increase PERF_EVLIST__NR_CHANNELS\n");
+		return -ENOSPC;
+	}
+
+	if (is_default) {
+		memmove(&flags[1], &flags[0],
+			sizeof(evlist->channel_flags) -
+			sizeof(evlist->channel_flags[0]));
+		n = 0;
+	}
+	flags[n] = flag | PERF_EVLIST__CHANNEL_ENABLED;
+	return n;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a0d1522..1812652 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,6 +20,11 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
+#define PERF_EVLIST__NR_CHANNELS	1
+enum perf_evlist_mmap_flag {
+	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+};
+
 /**
  * struct perf_mmap - perf's ring buffer mmap details
  *
@@ -52,6 +57,7 @@ struct perf_evlist {
 		pid_t	pid;
 	} workload;
 	struct fdarray	 pollfd;
+	unsigned long channel_flags[PERF_EVLIST__NR_CHANNELS];
 	struct perf_mmap *mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
@@ -116,9 +122,61 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx);
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
 
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx);
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
+int perf_evlist__mmap_nr(struct perf_evlist *evlist);
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist);
+void perf_evlist__channel_reset(struct perf_evlist *evlist);
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default);
+
+static inline bool
+__perf_evlist__channel_check(struct perf_evlist *evlist, int channel,
+			     enum perf_evlist_mmap_flag bits)
+{
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return false;
+
+	return (evlist->channel_flags[channel] & bits) ? true : false;
+}
+#define perf_evlist__channel_check(e, c, b) \
+		__perf_evlist__channel_check(e, c, PERF_EVLIST__CHANNEL_##b)
+
+static inline bool
+perf_evlist__channel_is_enabled(struct perf_evlist *evlist, int channel)
+{
+	return perf_evlist__channel_check(evlist, channel, ENABLED);
+}
+
+static inline int
+perf_evlist__idx_channel(struct perf_evlist *evlist, int idx)
+{
+	int channel = idx / evlist->nr_mmaps;
+
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	return channel;
+}
+
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx);
+
+static inline struct perf_mmap *
+perf_evlist__get_mmap(struct perf_evlist *evlist,
+		      int channel, int idx)
+{
+	if (perf_evlist__channel_idx(evlist, &channel, &idx))
+		return NULL;
+
+	return &evlist->mmap[idx];
+}
 
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
-- 
1.8.3.4


* [PATCH 40/54] perf tools: Automatically add new channel according to evlist
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (38 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 39/54] perf tools: Add evlist channel helpers Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 41/54] perf tools: Operate multiple channels Wang Nan
                   ` (13 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

perf_evlist__channel_find() can be used to find a proper channel based
on the properties of an evsel. If the channel doesn't exist, it can
create a new one for it. After this patch there's no need to create the
default channel explicitly.
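
The find-or-add lookup described above can be sketched as a standalone
model. This is only an illustration under stated assumptions: struct
channel_table, channel_nr() and channel_find() are hypothetical
stand-ins for the evlist fields and helpers, and the error value is
simplified to -1.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins; names and sizes are assumptions, not perf's. */
#define NR_CHANNELS	2
#define CHANNEL_ENABLED	1UL

struct channel_table {
	unsigned long flags[NR_CHANNELS];	/* 0 means the slot is unused */
};

/* Number of channels in use: index of the highest enabled slot, plus one. */
static int channel_nr(const struct channel_table *t)
{
	int i;

	for (i = NR_CHANNELS - 1; i >= 0; i--)
		if (t->flags[i] & CHANNEL_ENABLED)
			return i + 1;
	return 0;
}

/*
 * Return the index of the channel whose flags match exactly. If there is
 * no match and add_new is set, append a new channel. Returns -1 when no
 * channel matches and none can be added.
 */
static int channel_find(struct channel_table *t, unsigned long want, int add_new)
{
	int i, n;

	assert(t != NULL);
	n = channel_nr(t);

	want |= CHANNEL_ENABLED;
	for (i = 0; i < n; i++)
		if (t->flags[i] == want)
			return i;

	if (!add_new || n >= NR_CHANNELS)
		return -1;

	t->flags[n] = want;
	return n;
}
```

Looking up the same flags twice yields the same channel index, so
running the lookup once per evsel creates each distinct channel only
once.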

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  5 -----
 tools/perf/util/evlist.c    | 47 ++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 2d9e6c6..30a3c5c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -357,11 +357,6 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	rc = perf_evlist__channel_add(evlist, 0, true);
-	if (rc < 0)
-		goto out;
-	rc = 0;
-
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 36f8c66..4ed07ad 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -943,6 +943,43 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
+static unsigned long
+perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
+{
+	return 0;
+}
+
+static int
+perf_evlist__channel_find(struct perf_evlist *evlist,
+			  struct perf_evsel *evsel,
+			  bool add_new)
+{
+	unsigned long flag = perf_evlist__channel_for_evsel(evsel);
+	int i;
+
+	flag |= PERF_EVLIST__CHANNEL_ENABLED;
+	for (i = 0; i < perf_evlist__channel_nr(evlist); i++)
+		if (evlist->channel_flags[i] == flag)
+			return i;
+	if (add_new)
+		return perf_evlist__channel_add(evlist, flag, false);
+	return -ENOENT;
+}
+
+static int
+perf_evlist__channel_complete(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	int err;
+
+	evlist__for_each(evlist, evsel) {
+		err = perf_evlist__channel_find(evlist, evsel, true);
+		if (err < 0)
+			return err;
+	}
+	return 0;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
@@ -1162,6 +1199,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 bool overwrite, unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite)
 {
+	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
@@ -1169,6 +1207,10 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
+	err = perf_evlist__channel_complete(evlist);
+	if (err)
+		return err;
+
 	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
 		return -ENOMEM;
 
@@ -1199,12 +1241,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
-	int err;
-
 	perf_evlist__channel_reset(evlist);
-	err = perf_evlist__channel_add(evlist, 0, true);
-	if (err < 0)
-		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
-- 
1.8.3.4


* [PATCH 41/54] perf tools: Operate multiple channels
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (39 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 40/54] perf tools: Automatically add new channel according to evlist Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 42/54] perf tools: Squash overwrite setting into channel Wang Nan
                   ` (12 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Before this patch perf operates on only the first channel. Make perf
mmap and read from multiple channels.
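
A minimal sketch of the index layout this relies on. The body of
perf_evlist__channel_idx() is not part of this patch, so the mapping
below is an assumption inferred from perf_evlist__idx_channel()
(idx / nr_mmaps) and perf_evlist__mmap_nr() (nr_mmaps * channel count).
The function names channel_to_flat() and flat_to_channel() are
hypothetical.

```c
#include <assert.h>

/*
 * Assumed layout: all mmaps of channel 0 come first, then all mmaps of
 * channel 1, and so on. Returns the flat index into the mmap array, or
 * -1 if channel or idx is out of range.
 */
static int channel_to_flat(int nr_mmaps, int nr_channels, int channel, int idx)
{
	assert(nr_mmaps > 0);

	if (channel < 0 || channel >= nr_channels)
		return -1;
	if (idx < 0 || idx >= nr_mmaps)
		return -1;
	return channel * nr_mmaps + idx;
}

/* Inverse direction: which channel does a flat mmap index belong to? */
static int flat_to_channel(int nr_mmaps, int nr_channels, int flat)
{
	int channel;

	assert(nr_mmaps > 0);

	if (flat < 0)
		return -1;
	channel = flat / nr_mmaps;
	return channel < nr_channels ? channel : -1;
}
```

With 4 per-cpu mmaps and 2 channels, cpu 3 on channel 1 would land at
flat index 7 under this assumption, which is why loops over the mmap
array have to run up to perf_evlist__mmap_nr() rather than nr_mmaps.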

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  3 ++-
 tools/perf/util/evlist.c    | 55 ++++++++++++++++++++++++++++++++++-----------
 tools/perf/util/evlist.h    |  2 +-
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 30a3c5c..a471ca6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -466,8 +466,9 @@ static int record__mmap_read_all(struct record *rec)
 	u64 bytes_written = rec->bytes_written;
 	int i;
 	int rc = 0;
+	int total_mmaps = perf_evlist__mmap_nr(rec->evlist);
 
-	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
+	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
 		if (rec->evlist->mmap[i].base) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 4ed07ad..ea573fc 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -873,6 +873,21 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
 }
 
+static void
+__perf_evlist__munmap_channels(struct perf_evlist *evlist, int _idx)
+{
+	int _ch;
+
+	for (_ch = 0; _ch < perf_evlist__channel_nr(evlist); _ch++) {
+		int err, idx = _idx, ch = _ch;
+
+		err = perf_evlist__channel_idx(evlist, &ch, &idx);
+		if (err < 0)
+			continue;
+		__perf_evlist__munmap(evlist, idx);
+	}
+}
+
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	int i;
@@ -980,26 +995,38 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
+static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *outputs)
 {
 	struct perf_evsel *evsel;
 
 	evlist__for_each(evlist, evsel) {
-		int fd;
+		int fd, channel, idx, err;
+
+		channel = perf_evlist__channel_find(evlist, evsel, false);
+		if (channel < 0) {
+			pr_err("ERROR: unable to find suitable channel for %s\n",
+			       evsel->name);
+			return -1;
+		}
+
+		idx = _idx;
+		err = perf_evlist__channel_idx(evlist, &channel, &idx);
+		if (err < 0)
+			return err;
 
 		if (evsel->system_wide && thread)
 			continue;
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
-			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+		if (outputs[channel] == -1) {
+			outputs[channel] = fd;
+			if (__perf_evlist__mmap(evlist, idx, mp, outputs[channel]) < 0)
 				return -1;
 		} else {
-			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
+			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, outputs[channel]) != 0)
 				return -1;
 
 			perf_evlist__mmap_get(evlist, idx);
@@ -1039,14 +1066,15 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, outputs))
 				goto out_unmap;
 		}
 	}
@@ -1055,7 +1083,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 
 out_unmap:
 	for (cpu = 0; cpu < nr_cpus; cpu++)
-		__perf_evlist__munmap(evlist, cpu);
+		__perf_evlist__munmap_channels(evlist, cpu);
 	return -1;
 }
 
@@ -1067,13 +1095,14 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						outputs))
 			goto out_unmap;
 	}
 
@@ -1081,7 +1110,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 
 out_unmap:
 	for (thread = 0; thread < nr_threads; thread++)
-		__perf_evlist__munmap(evlist, thread);
+		__perf_evlist__munmap_channels(evlist, thread);
 	return -1;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1812652..b652587 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,7 +20,7 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
-#define PERF_EVLIST__NR_CHANNELS	1
+#define PERF_EVLIST__NR_CHANNELS	2
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 };
-- 
1.8.3.4


* [PATCH 42/54] perf tools: Squash overwrite setting into channel
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (40 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 41/54] perf tools: Operate multiple channels Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 43/54] perf record: Don't read from and poll overwrite channel Wang Nan
                   ` (11 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Make 'overwrite' a channel configuration rather than an evlist global
option. With this setting an evlist can have two channels: a normal
channel and an overwritable channel.
perf_evlist__channel_for_evsel() ensures that events with the 'overwrite'
configuration are placed in the overwritable channel.
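
As a rough illustration, the per-channel mmap protection can be modelled
as below. The CHANNEL_* constants mirror the values of the
PERF_EVLIST__CHANNEL_* enum in this patch, but channel_mmap_prot() is a
hypothetical helper, not a function added here.

```c
#include <assert.h>
#include <sys/mman.h>

/* Same bit values as the enum in this patch; the short names are ours. */
#define CHANNEL_ENABLED	1UL
#define CHANNEL_RDONLY	2UL

/*
 * A normal channel is mapped read/write so that the reader can publish
 * its consumption point (the tail) back to the kernel. An overwritable
 * channel is mapped read-only.
 */
static int channel_mmap_prot(unsigned long channel_flags)
{
	int prot = PROT_READ;

	assert(channel_flags & CHANNEL_ENABLED);

	if (!(channel_flags & CHANNEL_RDONLY))
		prot |= PROT_WRITE;
	return prot;
}
```

Deriving the protection from the channel flags is what allows the
'prot' member of struct mmap_params to go away.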

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  2 +-
 tools/perf/util/evlist.c    | 42 +++++++++++++++++++++++++++---------------
 tools/perf/util/evlist.h    |  5 ++---
 tools/perf/util/evsel.h     |  1 +
 4 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a471ca6..53bfe55 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -357,7 +357,7 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
+	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
 		if (errno == EPERM) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ea573fc..ee4b486 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -731,7 +731,7 @@ union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
 		return NULL;
 
 	head = perf_mmap__read_head(md);
-	if (evlist->overwrite) {
+	if (perf_evlist__channel_check(evlist, channel, RDONLY)) {
 		/*
 		 * If we're further behind than half the buffer, there's a chance
 		 * the writer will bite our tail and mess up the samples under us.
@@ -820,7 +820,7 @@ void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
 		return;
 	}
 
-	if (!evlist->overwrite) {
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY)) {
 		u64 old = md->prev;
 
 		perf_mmap__write_tail(md, old);
@@ -918,7 +918,6 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 }
 
 struct mmap_params {
-	int prot;
 	int mask;
 	struct auxtrace_mmap_params auxtrace_mp;
 };
@@ -926,6 +925,15 @@ struct mmap_params {
 static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 			       struct mmap_params *mp, int fd)
 {
+	int channel = perf_evlist__idx_channel(evlist, idx);
+	int prot = PROT_READ;
+
+	if (channel < 0)
+		return -1;
+
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY))
+		prot |= PROT_WRITE;
+
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
 	 * make sure we don't prevent tools from consuming every last event in
@@ -942,7 +950,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	atomic_set(&evlist->mmap[idx].refcnt, 2);
 	evlist->mmap[idx].prev = 0;
 	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
+	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, prot,
 				      MAP_SHARED, fd, 0);
 	if (evlist->mmap[idx].base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
@@ -959,9 +967,13 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 }
 
 static unsigned long
-perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
+perf_evlist__channel_for_evsel(struct perf_evsel *evsel)
 {
-	return 0;
+	unsigned long flag = 0;
+
+	if (evsel->overwrite)
+		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	return flag;
 }
 
 static int
@@ -1211,11 +1223,10 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * perf_evlist__mmap_ex - Create mmaps to receive events.
  * @evlist: list of events
  * @pages: map length in pages
- * @overwrite: overwrite older events?
  * @auxtrace_pages - auxtrace map length in pages
  * @auxtrace_overwrite - overwrite older auxtrace data?
  *
- * If @overwrite is %false the user needs to signal event consumption using
+ * For writable channel, the user needs to signal event consumption using
  * perf_mmap__write_tail().  Using perf_evlist__mmap_read() does this
  * automatically.
  *
@@ -1225,16 +1236,13 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * Return: %0 on success, negative error code otherwise.
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite)
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite)
 {
 	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
-	struct mmap_params mp = {
-		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
-	};
+	struct mmap_params mp;
 
 	err = perf_evlist__channel_complete(evlist);
 	if (err)
@@ -1246,7 +1254,6 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
@@ -1270,8 +1277,13 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	struct perf_evsel *evsel;
+
 	perf_evlist__channel_reset(evlist);
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	evlist__for_each(evlist, evsel)
+		evsel->overwrite = overwrite;
+
+	return perf_evlist__mmap_ex(evlist, pages, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b652587..21a8b85 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -23,6 +23,7 @@ struct record_opts;
 #define PERF_EVLIST__NR_CHANNELS	2
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+	PERF_EVLIST__CHANNEL_RDONLY	= 2,
 };
 
 /**
@@ -45,7 +46,6 @@ struct perf_evlist {
 	int		 nr_entries;
 	int		 nr_groups;
 	int		 nr_mmaps;
-	bool		 overwrite;
 	bool		 enabled;
 	bool		 has_user_cpus;
 	size_t		 mmap_len;
@@ -203,8 +203,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 				  int unset);
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite);
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite);
 void perf_evlist__munmap(struct perf_evlist *evlist);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index efad78f..03c70e5 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -114,6 +114,7 @@ struct perf_evsel {
 	bool			tracking;
 	bool			per_pkg;
 	bool			precise_max;
+	bool			overwrite;
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
-- 
1.8.3.4


* [PATCH 43/54] perf record: Don't read from and poll overwrite channel
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (41 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 42/54] perf tools: Squash overwrite setting into channel Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 44/54] perf record: Don't poll on " Wang Nan
                   ` (10 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Reading from an overwritable ring buffer is unreliable. Introduce
record__mmap_should_read() and prevent reading from the overwrite ring
buffer in 'perf record'. The rule in record__mmap_should_read() will
be changed when perf supports reading from a backward-writing ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 53bfe55..503eee9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -461,6 +461,19 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static bool record__mmap_should_read(struct record *rec, int idx)
+{
+	int channel = -1;
+
+	if (!rec->evlist->mmap[idx].base)
+		return false;
+	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
+		return false;
+	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int record__mmap_read_all(struct record *rec)
 {
 	u64 bytes_written = rec->bytes_written;
@@ -471,7 +484,7 @@ static int record__mmap_read_all(struct record *rec)
 	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
-		if (rec->evlist->mmap[i].base) {
+		if (record__mmap_should_read(rec, i)) {
 			if (record__mmap_read(rec, i) != 0) {
 				rc = -1;
 				goto out;
-- 
1.8.3.4


* [PATCH 44/54] perf record: Don't poll on overwrite channel
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (42 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 43/54] perf record: Don't read from and poll overwrite channel Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 45/54] perf tools: Detect avalibility of write_backward Wang Nan
                   ` (9 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

There's no need to receive events from an overwrite ring buffer. Instead,
perf should let them run in the background until something happens. This
patch makes perf ignore events from an overwrite ring buffer, except for
POLLERR and POLLHUP.
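
The resulting poll mask can be sketched as follows. This is a simplified
model: channel_poll_events() is a hypothetical helper, and it leaves out
the system_wide case, which the patch handles separately.

```c
#include <assert.h>
#include <poll.h>

/* Same bit values as the channel flag enum; the short names are ours. */
#define CHANNEL_ENABLED	1UL
#define CHANNEL_RDONLY	2UL

/*
 * Normal channels wake the poller when data arrives (POLLIN).
 * Overwritable channels do not, but error and hang-up conditions are
 * still requested for every channel.
 */
static short channel_poll_events(unsigned long channel_flags)
{
	short events = POLLIN;

	assert(channel_flags & CHANNEL_ENABLED);

	if (channel_flags & CHANNEL_RDONLY)
		events = 0;
	return (short)(events | POLLERR | POLLHUP);
}
```

Keeping the fd in the poll set even with POLLIN cleared means a
hang-up on an overwrite event is still noticed.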

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ee4b486..1ff57ef 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -461,9 +461,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
 {
-	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
 	 * Save the idx so that when we filter out fds POLLHUP'ed we can
 	 * close the associated evlist->mmap[] entry.
@@ -479,7 +479,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1);
+	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
@@ -1007,6 +1007,18 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
+static bool
+perf_evlist__should_poll(struct perf_evlist *evlist,
+			 struct perf_evsel *evsel,
+			 int channel)
+{
+	if (evsel->system_wide)
+		return false;
+	if (perf_evlist__channel_check(evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *outputs)
@@ -1015,6 +1027,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 
 	evlist__for_each(evlist, evsel) {
 		int fd, channel, idx, err;
+		short revent = POLLIN;
 
 		channel = perf_evlist__channel_find(evlist, evsel, false);
 		if (channel < 0) {
@@ -1044,6 +1057,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 			perf_evlist__mmap_get(evlist, idx);
 		}
 
+		if (!perf_evlist__should_poll(evlist, evsel, channel))
+			revent = 0;
 		/*
 		 * The system_wide flag causes a selected event to be opened
 		 * always without a pid.  Consequently it will never get a
@@ -1052,7 +1067,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
-- 
1.8.3.4


* [PATCH 45/54] perf tools: Detect avalibility of write_backward
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (43 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 44/54] perf record: Don't poll on " Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 46/54] perf tools: Enable overwrite settings Wang Nan
                   ` (8 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Detect availability of write_backward and save the result into
record_opts. With write_backward, the start pointer of a ring
buffer mapped read-only can be found reliably.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/perf.h        |  1 +
 tools/perf/util/record.c | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 90129ac..00c25b1 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -71,6 +71,7 @@ struct record_opts {
 	bool	     sample_transaction;
 	unsigned     initial_delay;
 	bool         use_clockid;
+	bool	     has_write_backward;
 	clockid_t    clockid;
 	unsigned int proc_map_timeout;
 };
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 0467367..d01f155 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -85,6 +85,11 @@ static void perf_probe_comm_exec(struct perf_evsel *evsel)
 	evsel->attr.comm_exec = 1;
 }
 
+static void perf_probe_write_backward(struct perf_evsel *evsel)
+{
+	evsel->attr.write_backward = 1;
+}
+
 static void perf_probe_context_switch(struct perf_evsel *evsel)
 {
 	evsel->attr.context_switch = 1;
@@ -105,6 +110,11 @@ bool perf_can_record_switch_events(void)
 	return perf_probe_api(perf_probe_context_switch);
 }
 
+static bool perf_can_write_backward(void)
+{
+	return perf_probe_api(perf_probe_write_backward);
+}
+
 bool perf_can_record_cpu_wide(void)
 {
 	struct perf_event_attr attr = {
@@ -235,6 +245,7 @@ static int record_opts__config_freq(struct record_opts *opts)
 
 int record_opts__config(struct record_opts *opts)
 {
+	opts->has_write_backward = perf_can_write_backward();
 	return record_opts__config_freq(opts);
 }
 
-- 
1.8.3.4


* [PATCH 46/54] perf tools: Enable overwrite settings
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (44 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 45/54] perf tools: Detect avalibility of write_backward Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 47/54] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
                   ` (7 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch adds the following option and config terms:

 # perf record --overwrite ...

   Sets all events to overwrite mode by default.

 # perf record --event cycles/overwrite/ ...
 # perf record --event cycles/no-overwrite/ ...

   Sets a specific event to overwrite or no-overwrite mode.
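
As a rough illustration of how the global option and the per-event
terms might combine, the sketch below resolves the effective setting
for one event. It is not the perf implementation: resolve_overwrite(),
enum term_kind and the TERM_* names are hypothetical stand-ins, and
term_val is assumed to be the numeric value attached to the term (1
when the bare term is given).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the parser's term types. */
enum term_kind { TERM_NONE, TERM_OVERWRITE, TERM_NOOVERWRITE };

/*
 * Start from the global --overwrite option, then let a per-event
 * term override it. A no-overwrite term inverts its value.
 */
bool resolve_overwrite(bool global_overwrite, enum term_kind kind, int term_val)
{
	switch (kind) {
	case TERM_OVERWRITE:
		return term_val != 0;
	case TERM_NOOVERWRITE:
		return term_val == 0;
	default:
		return global_overwrite;
	}
}

/* Usage: --overwrite alone, cycles/overwrite/, cycles/no-overwrite/. */
void resolve_overwrite_example(void)
{
	assert(resolve_overwrite(true, TERM_NONE, 0));
	assert(resolve_overwrite(false, TERM_OVERWRITE, 1));
	assert(!resolve_overwrite(true, TERM_NOOVERWRITE, 1));
}
```

The per-event term wins over the global option in this sketch because
the config terms are applied after the global defaults.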

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c    |  1 +
 tools/perf/perf.h              |  1 +
 tools/perf/util/evsel.c        |  4 ++++
 tools/perf/util/evsel.h        |  2 ++
 tools/perf/util/parse-events.c | 14 ++++++++++++++
 tools/perf/util/parse-events.h |  4 +++-
 tools/perf/util/parse-events.l |  2 ++
 7 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 503eee9..f416296 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1271,6 +1271,7 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 		     "number of mmap data pages and AUX area tracing mmap pages",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 00c25b1..ea7f6f5 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -58,6 +58,7 @@ struct record_opts {
 	bool	     full_auxtrace;
 	bool	     auxtrace_snapshot_mode;
 	bool	     record_switch_events;
+	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 60529e5..211e27d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -670,6 +670,9 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			 */
 			attr->inherit = term->val.inherit ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+			evsel->overwrite = term->val.overwrite ? 1 : 0;
+			break;
 		default:
 			break;
 		}
@@ -745,6 +748,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
+	evsel->overwrite    = opts->overwrite;
 
 	perf_evsel__set_sample_bit(evsel, IP);
 	perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 03c70e5..aa976f9 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -44,6 +44,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
 	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_INHERIT,
+	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -57,6 +58,7 @@ struct perf_evsel_config_term {
 		char	*callgraph;
 		u64	stack_user;
 		bool	inherit;
+		bool	overwrite;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 8e0f401..b809c3b 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -951,6 +951,12 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
@@ -1004,6 +1010,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
 	case PARSE_EVENTS__TERM_TYPE_INHERIT:
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -1073,6 +1081,12 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 			ADD_CONFIG_TERM(INHERIT, inherit, term->val.num ? 0 : 1);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index ad1f78f..425750d 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -68,7 +68,9 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_CALLGRAPH,
 	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
-	PARSE_EVENTS__TERM_TYPE_INHERIT
+	PARSE_EVENTS__TERM_TYPE_INHERIT,
+	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
+	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 };
 
 struct parse_events_array {
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 27d567f..2ef6f96 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -202,6 +202,8 @@ call-graph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
 stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
 inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
+overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
+no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 47/54] perf tools: Set write_backward attribut bit for overwrite events
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (45 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 46/54] perf tools: Enable overwrite settings Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 48/54] perf tools: Record fd into perf_mmap Wang Nan
                   ` (6 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

The write_backward attribute makes the kernel fill the ring buffer
from its end, which makes reading from an overwrite ring buffer
possible.

This patch sets this attribute if evsel->overwrite is selected
explicitly by the user.

Overwrite and write_backward are still controlled separately for legacy
read-only mmap users (most of them are in perf/tests).
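
The selection rule can be sketched in isolation as follows. This is a
simplified model, not the perf code: struct evsel_sketch and both
function names are hypothetical, and only the two flag values are
taken from the patch (PERF_EVLIST__CHANNEL_RDONLY and
PERF_EVLIST__CHANNEL_BACKWARD).

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror PERF_EVLIST__CHANNEL_RDONLY and _BACKWARD in the patch. */
enum {
	CHANNEL_RDONLY   = 2,
	CHANNEL_BACKWARD = 4,
};

/* Hypothetical reduction of an evsel to the two bits that matter here. */
struct evsel_sketch {
	bool overwrite;
	bool write_backward;
};

/*
 * Request write_backward only for overwrite events, and only when the
 * kernel was probed to support it. Returns -1 when an overwrite event
 * could not be read back.
 */
int select_write_backward(struct evsel_sketch *evsel, bool has_write_backward)
{
	if (!evsel->overwrite)
		return 0;
	if (!has_write_backward)
		return -1;
	evsel->write_backward = true;
	return 0;
}

/* The two properties stay independent when the channel is chosen. */
int channel_flags(const struct evsel_sketch *evsel)
{
	int flags = 0;

	if (evsel->overwrite)
		flags |= CHANNEL_RDONLY;
	if (evsel->write_backward)
		flags |= CHANNEL_BACKWARD;
	return flags;
}

void channel_flags_example(void)
{
	struct evsel_sketch evsel = { .overwrite = true, .write_backward = false };

	assert(select_write_backward(&evsel, true) == 0);
	assert(channel_flags(&evsel) == (CHANNEL_RDONLY | CHANNEL_BACKWARD));
}
```

Keeping the two bits separate is what lets a legacy read-only mapping
exist without write_backward.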

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  7 +++++++
 tools/perf/util/evlist.c    |  2 ++
 tools/perf/util/evlist.h    |  1 +
 tools/perf/util/evsel.c     | 13 +++++++++++++
 4 files changed, 23 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f416296..09aa4ee 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -332,6 +332,13 @@ static int record__open(struct record *rec)
 	perf_evlist__config(evlist, opts);
 
 	evlist__for_each(evlist, pos) {
+		if (pos->overwrite) {
+			if (!pos->attr.write_backward) {
+				ui__warning("Unable to read from overwrite ring buffer\n\n");
+				rc = -ENOSYS;
+				goto out;
+			}
+		}
 try_again:
 		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
 			if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1ff57ef..340307c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -973,6 +973,8 @@ perf_evlist__channel_for_evsel(struct perf_evsel *evsel)
 
 	if (evsel->overwrite)
 		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	if (evsel->attr.write_backward)
+		flag |= PERF_EVLIST__CHANNEL_BACKWARD;
 	return flag;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 21a8b85..321224c 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -24,6 +24,7 @@ struct record_opts;
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 	PERF_EVLIST__CHANNEL_RDONLY	= 2,
+	PERF_EVLIST__CHANNEL_BACKWARD	= 4,
 };
 
 /**
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 211e27d..14309d8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -678,6 +678,19 @@ static void apply_config_terms(struct perf_evsel *evsel,
 		}
 	}
 
+	/*
+	 * Set backward after config term processing because it is
+	 * possible to set overwrite globally, without config
+	 * terms.
+	 */
+	if (evsel->overwrite) {
+		if (opts->has_write_backward)
+			attr->write_backward = 1;
+		else
+			pr_err("Reading from overwrite event %s is not supported\n",
+			       evsel->name);
+	}
+
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
 	if ((callgraph_buf != NULL) || (dump_size > 0)) {
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 48/54] perf tools: Record fd into perf_mmap
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (46 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 47/54] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 49/54] perf tools: Add API to pause a channel Wang Nan
                   ` (5 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Add an fd field to perf_mmap so perf can look up the fd that backs an
mmap. This will be used to toggle overwrite ring buffers.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 15 +++++++++++++--
 tools/perf/util/evlist.h |  1 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 340307c..c543024 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -868,6 +868,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	if (evlist->mmap[idx].base != NULL) {
 		munmap(evlist->mmap[idx].base, evlist->mmap_len);
 		evlist->mmap[idx].base = NULL;
+		evlist->mmap[idx].fd = -1;
 		atomic_set(&evlist->mmap[idx].refcnt, 0);
 	}
 	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
@@ -903,7 +904,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
-	int total_mmaps;
+	int total_mmaps, i;
 
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
@@ -914,7 +915,12 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 		return -EINVAL;
 
 	evlist->mmap = zalloc(total_mmaps * sizeof(struct perf_mmap));
-	return evlist->mmap != NULL ? 0 : -ENOMEM;
+	if (!evlist->mmap)
+		return -ENOMEM;
+
+	for (i = 0; i < total_mmaps; i++)
+		evlist->mmap[i].fd = -1;
+	return 0;
 }
 
 struct mmap_params {
@@ -934,6 +940,10 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	if (!perf_evlist__channel_check(evlist, channel, RDONLY))
 		prot |= PROT_WRITE;
 
+	if (evlist->mmap[idx].fd >= 0) {
+		pr_err("idx %d already mapped\n", idx);
+		return -1;
+	}
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
 	 * make sure we don't prevent tools from consuming every last event in
@@ -958,6 +968,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 		evlist->mmap[idx].base = NULL;
 		return -1;
 	}
+	evlist->mmap[idx].fd = fd;
 
 	if (auxtrace_mmap__mmap(&evlist->mmap[idx].auxtrace_mmap,
 				&mp->auxtrace_mp, evlist->mmap[idx].base, fd))
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 321224c..bc6d787 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -35,6 +35,7 @@ enum perf_evlist_mmap_flag {
 struct perf_mmap {
 	void		 *base;
 	int		 mask;
+	int		 fd;
 	atomic_t	 refcnt;
 	u64		 prev;
 	struct auxtrace_mmap auxtrace_mmap;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 49/54] perf tools: Add API to pause a channel
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (47 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 48/54] perf tools: Record fd into perf_mmap Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading Wang Nan
                   ` (4 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Introduce perf_evlist__channel_toggle_paused() to pause or resume a
channel in an evlist, using the PERF_EVENT_IOC_PAUSE_OUTPUT ioctl.
The following commits use perf_evlist__channel_toggle_paused() to
ensure the overwrite ring buffer is paused before reading.
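
The per-channel walk can be modelled without a kernel as shown below.
This is a sketch under assumptions: the flat fds array, the pause_fn
callback (standing in for the PERF_EVENT_IOC_PAUSE_OUTPUT ioctl) and
count_pause() are hypothetical; only the index computation
channel * nr_mmaps + i and the error codes follow the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for ioctl(fd, PERF_EVENT_IOC_PAUSE_OUTPUT, pause). */
typedef int (*pause_fn)(int fd, bool pause);

/*
 * fds holds nr_channels * nr_mmaps descriptors, channel by channel;
 * -1 marks a slot that is not mapped.
 */
int channel_toggle_paused(const int *fds, int nr_channels, int nr_mmaps,
			  int channel, bool pause, pause_fn do_pause)
{
	int i;

	if (channel < 0 || channel >= nr_channels)
		return -E2BIG;
	if (fds == NULL)
		return -EFAULT;
	for (i = 0; i < nr_mmaps; i++) {
		int fd = fds[channel * nr_mmaps + i];
		int err;

		if (fd < 0)
			continue;	/* slot never mapped */
		err = do_pause(fd, pause);
		if (err)
			return err;
	}
	return 0;
}

/* Fake backend that only counts calls, for the example below. */
static int nr_pause_calls;

static int count_pause(int fd, bool pause)
{
	(void)fd;
	(void)pause;
	nr_pause_calls++;
	return 0;
}

void channel_toggle_paused_example(void)
{
	/* Two channels, three mmaps each; one slot in channel 1 is unmapped. */
	const int fds[] = { 3, 4, 5, 6, -1, 8 };

	nr_pause_calls = 0;
	assert(channel_toggle_paused(fds, 2, 3, 1, true, count_pause) == 0);
	assert(nr_pause_calls == 2);
	assert(channel_toggle_paused(fds, 2, 3, 2, true, count_pause) == -E2BIG);
}
```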

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 28 ++++++++++++++++++++++++++++
 tools/perf/util/evlist.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c543024..38e1c3a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -706,6 +706,34 @@ int perf_evlist__channel_idx(struct perf_evlist *evlist,
 	return 0;
 }
 
+int perf_evlist__channel_toggle_paused(struct perf_evlist *evlist,
+				       int channel, bool pause)
+{
+	int i;
+
+	if (channel >= perf_evlist__channel_nr(evlist))
+		return -E2BIG;
+	if (!evlist->mmap)
+		return -EFAULT;
+	for (i = 0; i < evlist->nr_mmaps; i++) {
+		int n = channel * evlist->nr_mmaps + i;
+		int fd = evlist->mmap[n].fd;
+		int err;
+
+		if (fd < 0)
+			continue;
+		err = ioctl(fd, PERF_EVENT_IOC_PAUSE_OUTPUT,
+			    pause ? 1 : 0);
+		if (err) {
+			err = (errno == 0 ? -EINVAL : -errno);
+			pr_err("Unable to pause output on %d: %s\n",
+			       fd, strerror(-err));
+			return err;
+		}
+	}
+	return 0;
+}
+
 union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
 					    int channel, int idx)
 {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bc6d787..c1831a9 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -180,6 +180,8 @@ perf_evlist__get_mmap(struct perf_evlist *evlist,
 	return &evlist->mmap[idx];
 }
 
+int perf_evlist__channel_toggle_paused(struct perf_evlist *evlist,
+				       int channel, bool pause);
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (48 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 49/54] perf tools: Add API to pause a channel Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 51/54] perf record: Rename variable to make code clear Wang Nan
                   ` (3 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Reading from an overwrite ring buffer is unreliable while it is still
being written. perf_evlist__channel_toggle_paused() should be called
before reading from it.

Toggle rec->overwrite_evt_state after receiving 'done' or a switch
output request.
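
The state handling can be read as a small transition function. The
sketch below restates it in standalone form; the shortened names
(EVT_*, ACT_*, overwrite_transition) are hypothetical, while the
transitions themselves follow record__toggle_overwrite_evsels() in
this patch.

```c
#include <assert.h>

enum evt_state { EVT_RUNNING, EVT_DATA_PENDING, EVT_EMPTY };
enum evt_action { ACT_NONE, ACT_PAUSE, ACT_RESUME };

/*
 * Move *cur towards next and report what has to be done to the
 * overwrite channels. EMPTY never goes back to DATA_PENDING: a paused
 * buffer that was already drained has nothing new to read.
 */
enum evt_action overwrite_transition(enum evt_state *cur, enum evt_state next)
{
	enum evt_action action = ACT_NONE;

	switch (*cur) {
	case EVT_RUNNING:
		if (next != EVT_RUNNING)
			action = ACT_PAUSE;
		break;
	case EVT_DATA_PENDING:
		if (next == EVT_RUNNING)
			action = ACT_RESUME;
		break;
	case EVT_EMPTY:
		if (next == EVT_RUNNING)
			action = ACT_RESUME;
		else if (next == EVT_DATA_PENDING)
			next = EVT_EMPTY;
		break;
	}
	*cur = next;
	return action;
}

void overwrite_transition_example(void)
{
	enum evt_state st = EVT_RUNNING;

	/* SIGUSR2 arrives: pause, read, mark drained, then resume. */
	assert(overwrite_transition(&st, EVT_DATA_PENDING) == ACT_PAUSE);
	assert(overwrite_transition(&st, EVT_EMPTY) == ACT_NONE);
	assert(overwrite_transition(&st, EVT_DATA_PENDING) == ACT_NONE);
	assert(st == EVT_EMPTY);
	assert(overwrite_transition(&st, EVT_RUNNING) == ACT_RESUME);
	assert(st == EVT_RUNNING);
}
```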

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 79 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 09aa4ee..4d89543 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -39,6 +39,11 @@
 #include <sys/mman.h>
 #include <asm/bug.h>
 
+enum overwrite_evt_state {
+	OVERWRITE_EVT_RUNNING,
+	OVERWRITE_EVT_DATA_PENDING,
+	OVERWRITE_EVT_EMPTY,
+};
 
 struct record {
 	struct perf_tool	tool;
@@ -57,6 +62,7 @@ struct record {
 	bool			buildid_all;
 	bool			timestamp_filename;
 	bool			switch_output;
+	enum overwrite_evt_state overwrite_evt_state;
 	unsigned long long	samples;
 };
 
@@ -388,6 +394,7 @@ try_again:
 
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
+	rec->overwrite_evt_state = OVERWRITE_EVT_RUNNING;
 out:
 	return rc;
 }
@@ -468,6 +475,52 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static void
+record__toggle_overwrite_evsels(struct record *rec,
+				enum overwrite_evt_state state)
+{
+	struct perf_evlist *evlist = rec->evlist;
+	enum overwrite_evt_state old_state = rec->overwrite_evt_state;
+	enum action {
+		NONE,
+		PAUSE,
+		RESUME,
+	} action = NONE;
+	int ch, nr_channels;
+
+	switch (old_state) {
+	case OVERWRITE_EVT_RUNNING:
+		if (state != OVERWRITE_EVT_RUNNING)
+			action = PAUSE;
+		break;
+	case OVERWRITE_EVT_DATA_PENDING:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		break;
+	case OVERWRITE_EVT_EMPTY:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		if (state == OVERWRITE_EVT_DATA_PENDING)
+			state = OVERWRITE_EVT_EMPTY;
+		break;
+	default:
+		WARN_ONCE(1, "Shouldn't get there\n");
+	}
+
+	rec->overwrite_evt_state = state;
+
+	if (action == NONE)
+		return;
+
+	nr_channels = perf_evlist__channel_nr(evlist);
+	for (ch = 0; ch < nr_channels; ch++) {
+		if (!perf_evlist__channel_check(evlist, ch, RDONLY))
+			continue;
+		perf_evlist__channel_toggle_paused(evlist, ch,
+						   action == PAUSE);
+	}
+}
+
 static bool record__mmap_should_read(struct record *rec, int idx)
 {
 	int channel = -1;
@@ -512,6 +565,8 @@ static int record__mmap_read_all(struct record *rec)
 	if (bytes_written != rec->bytes_written)
 		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
 
+	if (rec->overwrite_evt_state == OVERWRITE_EVT_DATA_PENDING)
+		record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_EMPTY);
 out:
 	return rc;
 }
@@ -870,6 +925,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
+		/*
+		 * rec->overwrite_evt_state can be OVERWRITE_EVT_EMPTY
+		 * here: when done == true and hits != rec->samples
+		 * after the previous read.
+		 *
+		 * record__toggle_overwrite_evsels() ensures we never
+		 * convert OVERWRITE_EVT_EMPTY to OVERWRITE_EVT_DATA_PENDING.
+		 */
+		if (switch_output_started || done || draining)
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_DATA_PENDING);
+
 		if (record__mmap_read_all(rec) < 0) {
 			auxtrace_snapshot_disable();
 			err = -1;
@@ -888,7 +954,20 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 
 		if (switch_output_started) {
+			/*
+			 * SIGUSR2 was raised during or after record__mmap_read_all().
+			 * Continue so the buffers are read again.
+			 */
+			if (rec->overwrite_evt_state == OVERWRITE_EVT_RUNNING)
+				continue;
+
 			switch_output_started = 0;
+			/*
+			 * Reenable events in overwrite ring buffer after
+			 * record__mmap_read_all(): we should have collected
+			 * data from it.
+			 */
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_RUNNING);
 
 			if (!quiet)
 				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 51/54] perf record: Rename variable to make code clear
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (49 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 52/54] perf record: Read from backward ring buffer Wang Nan
                   ` (2 subsequent siblings)
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

record__mmap_read() writes data from the ring buffer into perf.data.
'head' is maintained by the kernel and points to the last written
record. 'old' is maintained by perf and points to the record read in
the previous round. record__mmap_read() saves the data between 'old'
and 'head' to perf.data. These variable names are hard to follow. In
addition, when dealing with a backward-writing ring buffer, the
md->prev pointer should point to 'head' instead of the last byte read.

Add 'start' and 'end' pointers to make the code clearer, and set
md->prev to 'head' instead of the moved 'old' pointer. This patch
doesn't change behavior, since:

    buf = &data[old & md->mask];
    size = head - old;
    old += size;     <--- Here, old == head
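
For readers unfamiliar with the two-chunk copy, here is a standalone
sketch of reading the range [start, end) from a power-of-two ring
buffer. rb_copy_range() is a hypothetical helper that copies into a
caller-supplied buffer instead of calling record__write(), and it
tests for a wrap with '>' rather than the '!=' comparison used in
record__mmap_read(); the bytes produced should be the same.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the bytes in [start, end) out of a power-of-two ring buffer of
 * mask + 1 bytes. Positions are free-running counters, so they are
 * masked only when indexing. Returns the number of bytes copied, or 0
 * when the range is empty or larger than the buffer.
 */
size_t rb_copy_range(const unsigned char *data, uint64_t mask,
		     uint64_t start, uint64_t end, unsigned char *out)
{
	uint64_t size = end - start;
	size_t copied = 0;

	if (size == 0 || size > mask + 1)
		return 0;

	if ((start & mask) + size > mask + 1) {
		/* The range wraps: first copy up to the end of the buffer. */
		size_t chunk = (size_t)(mask + 1 - (start & mask));

		memcpy(out, &data[start & mask], chunk);
		copied += chunk;
		start += chunk;
	}

	/* Copy the remainder; afterwards start would equal end. */
	memcpy(out + copied, &data[start & mask], (size_t)(end - start));
	copied += (size_t)(end - start);
	return copied;
}

void rb_copy_range_example(void)
{
	const unsigned char ring[8] = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H' };
	unsigned char out[8];

	/* Positions 6..9 map to offsets 6, 7, 0, 1. */
	assert(rb_copy_range(ring, 7, 6, 10, out) == 4);
	assert(memcmp(out, "GHAB", 4) == 0);
}
```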

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4d89543..b4f0c61 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -91,17 +91,18 @@ static int record__mmap_read(struct record *rec, int idx)
 	struct perf_mmap *md = &rec->evlist->mmap[idx];
 	u64 head = perf_mmap__read_head(md);
 	u64 old = md->prev;
+	u64 end = head, start = old;
 	unsigned char *data = md->base + page_size;
 	unsigned long size;
 	void *buf;
 	int rc = 0;
 
-	if (old == head)
+	if (start == end)
 		return 0;
 
 	rec->samples++;
 
-	size = head - old;
+	size = end - start;
 	if (size > (unsigned long)(md->mask) + 1) {
 		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
@@ -110,10 +111,10 @@ static int record__mmap_read(struct record *rec, int idx)
 		return 0;
 	}
 
-	if ((old & md->mask) + size != (head & md->mask)) {
-		buf = &data[old & md->mask];
-		size = md->mask + 1 - (old & md->mask);
-		old += size;
+	if ((start & md->mask) + size != (end & md->mask)) {
+		buf = &data[start & md->mask];
+		size = md->mask + 1 - (start & md->mask);
+		start += size;
 
 		if (record__write(rec, buf, size) < 0) {
 			rc = -1;
@@ -121,16 +122,16 @@ static int record__mmap_read(struct record *rec, int idx)
 		}
 	}
 
-	buf = &data[old & md->mask];
-	size = head - old;
-	old += size;
+	buf = &data[start & md->mask];
+	size = end - start;
+	start += size;
 
 	if (record__write(rec, buf, size) < 0) {
 		rc = -1;
 		goto out;
 	}
 
-	md->prev = old;
+	md->prev = head;
 	perf_evlist__mmap_consume(rec->evlist, idx);
 out:
 	return rc;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 52/54] perf record: Read from backward ring buffer
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (50 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 51/54] perf record: Rename variable to make code clear Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 53/54] perf record: Allow generate tracking events at the end of output Wang Nan
  2016-02-05 14:02 ` [PATCH 54/54] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Introduce rb_find_range() to find the start and end positions in a
backward ring buffer.
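
The walk can be tried out on a simulated buffer. The sketch below is
a simplified model, not the perf code: struct hdr redeclares the
layout of struct perf_event_header, write_backward() is a hypothetical
writer that stores headers only, and record sizes are assumed to be
multiples of 8 so that a header never straddles the wrap point.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same layout as struct perf_event_header, redeclared so this builds alone. */
struct hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Simulated backward writer: move head down, then store the record header. */
void write_backward(unsigned char *buf, uint64_t mask, uint64_t *head, uint16_t size)
{
	struct hdr h = { .type = 0, .misc = 0, .size = size };

	*head -= size;
	memcpy(buf + (*head & mask), &h, sizeof(h));
}

/* Walk from the newest record (at head) towards older ones. */
void backward_find_range(const unsigned char *buf, uint64_t mask, uint64_t head,
			 uint64_t *start, uint64_t *end)
{
	uint64_t pos = head;
	uint16_t last_size = 0;

	*start = head;
	for (;;) {
		struct hdr h;

		if (pos - head >= mask + 1) {
			/* Went all the way around: drop a partly overwritten record. */
			if (pos - head > mask + 1)
				pos -= last_size;
			break;
		}
		memcpy(&h, buf + (pos & mask), sizeof(h));
		if (h.size == 0)
			break;	/* reached space that was never written */
		last_size = h.size;
		pos += h.size;
	}
	*end = pos;
}

void backward_find_range_example(void)
{
	unsigned char buf[64] = { 0 };
	uint64_t head = 0, start, end;
	int i;

	/* 48 bytes written into 64: everything is still readable. */
	write_backward(buf, 63, &head, 16);
	write_backward(buf, 63, &head, 24);
	write_backward(buf, 63, &head, 8);
	backward_find_range(buf, 63, head, &start, &end);
	assert(start == head && end - start == 48);

	/* 96 bytes written into a fresh buffer: only the newest two records survive. */
	memset(buf, 0, sizeof(buf));
	head = 0;
	for (i = 0; i < 4; i++)
		write_backward(buf, 63, &head, 24);
	backward_find_range(buf, 63, head, &start, &end);
	assert(end - start == 48);
}
```

Because the newest record always sits at 'head', the reader needs no
separate tail pointer; it only has to stop after at most one buffer
length.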

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 69 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 67 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b4f0c61..f9ce659 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -86,6 +86,61 @@ static int process_synthesized_event(struct perf_tool *tool,
 	return record__write(rec, event, event->header.size);
 }
 
+static int
+backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
+{
+	struct perf_event_header *pheader;
+	u64 evt_head = head;
+	int size = mask + 1;
+
+	pr_debug2("backward_rb_find_range: buf=%p, head=%"PRIx64"\n", buf, head);
+	pheader = (struct perf_event_header *)(buf + (head & mask));
+	*start = head;
+	while (true) {
+		if (evt_head - head >= (unsigned int)size) {
+			pr_debug("Finished reading backward ring buffer: rewind\n");
+			if (evt_head - head > (unsigned int)size)
+				evt_head -= pheader->size;
+			*end = evt_head;
+			return 0;
+		}
+
+		pheader = (struct perf_event_header *)(buf + (evt_head & mask));
+
+		if (pheader->size == 0) {
+			pr_debug("Finished reading backward ring buffer: get start\n");
+			*end = evt_head;
+			return 0;
+		}
+
+		evt_head += pheader->size;
+		pr_debug3("move evt_head: %"PRIx64"\n", evt_head);
+	}
+	WARN_ONCE(1, "Shouldn't get here\n");
+	return -1;
+}
+
+static int
+rb_find_range(struct perf_evlist *evlist, int idx,
+	      void *data, int mask, u64 head, u64 old,
+	      u64 *start, u64 *end)
+{
+	int channel;
+
+	channel = perf_evlist__idx_channel(evlist, idx);
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY)) {
+		*start = old;
+		*end = head;
+		return 0;
+	}
+
+	if (perf_evlist__channel_check(evlist, channel, BACKWARD))
+		return backward_rb_find_range(data, mask, head, start, end);
+
+	WARN_ONCE(1, "Unable to find start position from a read-only ring buffer\n");
+	return -1;
+}
+
 static int record__mmap_read(struct record *rec, int idx)
 {
 	struct perf_mmap *md = &rec->evlist->mmap[idx];
@@ -97,6 +152,10 @@ static int record__mmap_read(struct record *rec, int idx)
 	void *buf;
 	int rc = 0;
 
+	if (rb_find_range(rec->evlist, idx, data, md->mask, head,
+			  old, &start, &end))
+		return -1;
+
 	if (start == end)
 		return 0;
 
@@ -530,8 +589,14 @@ static bool record__mmap_should_read(struct record *rec, int idx)
 		return false;
 	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
 		return false;
-	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
-		return false;
+	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY)) {
+		if (rec->overwrite_evt_state != OVERWRITE_EVT_DATA_PENDING)
+			return false;
+		if (perf_evlist__channel_check(rec->evlist, channel, BACKWARD))
+			return true;
+		else
+			return false;
+	}
 	return true;
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 53/54] perf record: Allow generate tracking events at the end of output
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (51 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 52/54] perf record: Read from backward ring buffer Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  2016-02-05 14:02 ` [PATCH 54/54] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Before this patch, tracking events are generated from information in
/proc before any samples are written. With the introduction of
overwrite evsels in perf record this becomes inconvenient: 'perf record'
can now run as a daemon for several hours and capture only the last
snapshot when it receives SIGUSR2. The tracking events generated at
the head of the output 'perf.data' are then too old, while most of the
tracking events seen while 'perf record' was running are dropped.

This patch generates tracking events at the end of the output, so the
resulting event series better reflects the state of the system when
SIGUSR2 is received.
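
For illustration only (this is not code from the patch), the placement
rule described above can be sketched as a small predicate. The enum and
function names are hypothetical; only the tail_tracking flag comes from
the patch itself.

```c
#include <stdbool.h>

/* Hypothetical points in a recording session; not names used by perf. */
enum synth_point {
	SYNTH_AT_START,	/* before any sample is written to the output */
	SYNTH_AT_END,	/* just before the output is finished or switched */
};

/*
 * Return true if tracking events should be synthesized at this point.
 * Without tail tracking they go at the head of the output; with it,
 * they are deferred to the tail.
 */
static bool should_synthesize(bool tail_tracking, enum synth_point point)
{
	if (tail_tracking)
		return point == SYNTH_AT_END;
	return point == SYNTH_AT_START;
}
```

With tail tracking enabled, exactly one of the two points synthesizes
events, so the output never carries both a stale head copy and a fresh
tail copy.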

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 62 +++++++++++++++++++++++++++++++--------------
 1 file changed, 43 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f9ce659..8221512 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -63,6 +63,7 @@ struct record {
 	bool			timestamp_filename;
 	bool			switch_output;
 	enum overwrite_evt_state overwrite_evt_state;
+	bool			tail_tracking;
 	unsigned long long	samples;
 };
 
@@ -685,6 +686,26 @@ record__finish_output(struct record *rec)
 
 static int record__synthesize(struct record *rec);
 
+static void record__synthesize_target(struct record *rec)
+{
+	if (target__none(&rec->opts.target)) {
+		struct {
+			struct thread_map map;
+			struct thread_map_data map_data;
+		} thread_map;
+
+		thread_map.map.nr = 1;
+		thread_map.map.map[0].pid = rec->evlist->workload.pid;
+		thread_map.map.map[0].comm = NULL;
+		perf_event__synthesize_thread_map(&rec->tool,
+				&thread_map.map,
+				process_synthesized_event,
+				&rec->session->machines.host,
+				rec->opts.sample_address,
+				rec->opts.proc_map_timeout);
+	}
+}
+
 static int
 record__switch_output(struct record *rec, bool at_exit)
 {
@@ -694,6 +715,11 @@ record__switch_output(struct record *rec, bool at_exit)
 	/* Same Size:      "2015122520103046"*/
 	char timestamp[] = "InvalidTimestamp";
 
+	if (rec->tail_tracking) {
+		record__synthesize(rec);
+		record__synthesize_target(rec);
+	}
+
 	rec->samples = 0;
 	record__finish_output(rec);
 	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
@@ -720,23 +746,10 @@ record__switch_output(struct record *rec, bool at_exit)
 		machines__init(&rec->session->machines);
 		perf_session__create_kernel_maps(rec->session);
 		perf_session__set_id_hdr_size(rec->session);
-		record__synthesize(rec);
 
-		if (target__none(&rec->opts.target)) {
-			struct {
-				struct thread_map map;
-				struct thread_map_data map_data;
-			} thread_map;
-
-			thread_map.map.nr = 1;
-			thread_map.map.map[0].pid = rec->evlist->workload.pid;
-			thread_map.map.map[0].comm = NULL;
-			perf_event__synthesize_thread_map(&rec->tool,
-					&thread_map.map,
-					process_synthesized_event,
-					&rec->session->machines.host,
-					rec->opts.sample_address,
-					rec->opts.proc_map_timeout);
+		if (!rec->tail_tracking) {
+			record__synthesize(rec);
+			record__synthesize_target(rec);
 		}
 	}
 	return fd;
@@ -932,9 +945,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	err = record__synthesize(rec);
-	if (err < 0)
-		goto out_child;
+	if (!rec->tail_tracking) {
+		err = record__synthesize(rec);
+		if (err < 0)
+			goto out_child;
+	}
 
 	if (rec->realtime_prio) {
 		struct sched_param param;
@@ -1075,6 +1090,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			disabled = true;
 		}
 	}
+
+	if (rec->tail_tracking) {
+		err = record__synthesize(rec);
+		if (err < 0)
+			goto out_child;
+	}
+
 	auxtrace_snapshot_disable();
 
 	if (forks && workload_exec_errno) {
@@ -1501,6 +1523,8 @@ struct option __record_options[] = {
 		    "append timestamp to output filename"),
 	OPT_BOOLEAN(0, "switch-output", &record.switch_output,
 		    "Switch output when receive SIGUSR2"),
+	OPT_BOOLEAN(0, "tail-tracking", &record.tail_tracking,
+		    "Generate tracking events at the end of output"),
 	OPT_END()
 };
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [PATCH 54/54] perf tools: Don't warn about out of order event if write_backward is used
  2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (52 preceding siblings ...)
  2016-02-05 14:02 ` [PATCH 53/54] perf record: Allow generate tracking events at the end of output Wang Nan
@ 2016-02-05 14:02 ` Wang Nan
  53 siblings, 0 replies; 75+ messages in thread
From: Wang Nan @ 2016-02-05 14:02 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

If the write_backward attribute is set, records are written into the
kernel ring buffer from end to beginning, but read from beginning to
end, so the timestamps of the records are in reverse order. To avoid
the 'XX out of order events recorded' warning in that case, suppress
the warning if write_backward is selected by at least one event.
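
As a rough sketch of the rule above (not the patch itself), the decision
can be written as a standalone predicate. The struct below is a
simplified stand-in assumed for illustration; perf keeps this flag in
the event's attr.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a per-event attribute; not perf's own struct. */
struct event_attr {
	bool write_backward;
};

/*
 * Warn about out of order events only when no event uses write_backward
 * and at least one unordered event was actually counted.
 */
static bool should_warn_order(const struct event_attr *attrs, size_t nr,
			      unsigned int nr_unordered)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (attrs[i].write_backward)
			return false;
	}
	return nr_unordered != 0;
}
```

Note that a single backward-writing event silences the warning for the
whole session, including events that use a normal forward ring buffer.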

Result:

Before this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
 [ perf record: Woken up 5 times to write data ]
 Warning:
 40 out of order events recorded.
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

After this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
 [ perf record: Woken up 5 times to write data ]
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/session.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 40b7a0d..132c6ab 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1516,10 +1516,27 @@ int perf_session__register_idle_thread(struct perf_session *session)
 	return err;
 }
 
+static void
+perf_session__warn_order(const struct perf_session *session)
+{
+	const struct ordered_events *oe = &session->ordered_events;
+	struct perf_evsel *evsel;
+	bool should_warn = true;
+
+	evlist__for_each(session->evlist, evsel) {
+		if (evsel->attr.write_backward)
+			should_warn = false;
+	}
+
+	if (!should_warn)
+		return;
+	if (oe->nr_unordered_events != 0)
+		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+}
+
 static void perf_session__warn_about_errors(const struct perf_session *session)
 {
 	const struct events_stats *stats = &session->evlist->stats;
-	const struct ordered_events *oe = &session->ordered_events;
 
 	if (session->tool->lost == perf_event__process_lost &&
 	    stats->nr_events[PERF_RECORD_LOST] != 0) {
@@ -1576,8 +1593,7 @@ static void perf_session__warn_about_errors(const struct perf_session *session)
 			    stats->nr_unprocessable_samples);
 	}
 
-	if (oe->nr_unordered_events != 0)
-		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+	perf_session__warn_order(session);
 
 	events_stats__auxtrace_error_warn(stats);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH 05/54] perf data: Fix releasing event_class
       [not found]   ` <20160211220413.GF32168@kernel.org>
@ 2016-02-12 12:19     ` Jiri Olsa
  2016-02-12 15:42       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 12:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Wang Nan, Alexei Starovoitov, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Thu, Feb 11, 2016 at 07:04:13PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, Feb 05, 2016 at 02:01:30PM +0000, Wang Nan escreveu:
> > A new patch of libbabeltrace [1] reveals a object leak problem in
> > perf data CTF support: perf code never release event_class which is
> > allocated in add_event() and stored in evsel's private field.
> > 
> > If libbabeltrace has the above patch applied, leaking event_class
> > prevent writer being destroied and flushing metadata. For example:
> 
> Ok, so if the user has an older version of this libbabeltrace, what
> happens? 

it's standard cleanup that should be there even for old version,

IIUC the problem is that new version of libbabeltrace started
to check on refcounts on some exit function and gets crazy
if there's a mess/leak

IIRC I've already acked this one

jirka

> 
> Would he/she gets some warning about the requirement of a later version?
> 
> - Arnaldo
>  
> >  $ ./perf record ls
> >  Lowering default frequency rate to 500.
> >  Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
> >  perf.data
> >  [ perf record: Woken up 1 times to write data ]
> >  [ perf record: Captured and wrote 0.012 MB perf.data (12 samples) ]
> >  $ ./perf data convert --to-ctf ./out.ctf
> >  [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
> >  [ perf data convert: Converted and wrote 0.000 MB (12 samples) ]
> >  $ cat ./out.ctf/metadata
> >  $ ls -l  ./out.ctf/metadata
> >  -rw-r----- 1 w00229757 mm 0 Jan 27 10:49 ./out.ctf/metadata
> > 
> > The correct result should be:
> >  ...
> >  $ cat ./out.ctf/metadata
> >  /* CTF 1.8 */
> > 
> >  trace {
> >  [SNIP]
> > 
> >  $ ls -l  ./out.ctf/metadata
> >  -rw-r----- 1 w00229757 mm 2446 Jan 27 10:52 ./out.ctf/metadata
> > 
> > The full story is:
> > 
> >  Patch [1] of babeltrace redesign reference counting scheme. In that
> >  patch:
> > 
> >   * writer <- trace (bt_ctf_writer_create)
> >   * trace <- stream_class (bt_ctf_trace_add_stream_class)
> >   * stream_class <- event_class (bt_ctf_stream_class_add_event_class)
> >   ('<-' means 'is a parent of')
> > 
> >   Holding of event_class causes reference count of corresponding
> >   'writer' increases through parent chain. Perf expect 'writer' is
> >   released (so metadata is flushed) through bt_ctf_writer_put() in
> >   ctf_writer__cleanup(). However, since it never release event_class,
> >   the reference of 'writer' won't be reduced, so bt_ctf_writer_put()
> >   won't lead releasing of writer.
> > 
> >  Before this CTF patch, !(writer <- trace). Even event_class leak,
> >  writer is able to be released.
> > 
> > [1] https://github.com/efficios/babeltrace/commit/e6a8e8e4744633807083a077ff9f101eb97d9801
> > 
> > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Jérémie Galarneau <jeremie.galarneau@efficios.com>
> > Cc: Zefan Li <lizefan@huawei.com>
> > Cc: pi3orama@163.com
> > ---
> >  tools/perf/util/data-convert-bt.c | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> > 
> > diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
> > index 34cd1e4..b722e57 100644
> > --- a/tools/perf/util/data-convert-bt.c
> > +++ b/tools/perf/util/data-convert-bt.c
> > @@ -858,6 +858,23 @@ static int setup_events(struct ctf_writer *cw, struct perf_session *session)
> >  	return 0;
> >  }
> >  
> > +static void cleanup_events(struct perf_session *session)
> > +{
> > +	struct perf_evlist *evlist = session->evlist;
> > +	struct perf_evsel *evsel;
> > +
> > +	evlist__for_each(evlist, evsel) {
> > +		struct evsel_priv *priv;
> > +
> > +		priv = evsel->priv;
> > +		bt_ctf_event_class_put(priv->event_class);
> > +		zfree(&evsel->priv);
> > +	}
> > +
> > +	perf_evlist__delete(evlist);
> > +	session->evlist = NULL;
> > +}
> > +
> >  static int setup_streams(struct ctf_writer *cw, struct perf_session *session)
> >  {
> >  	struct ctf_stream **stream;
> > @@ -1171,6 +1188,7 @@ int bt_convert__perf2ctf(const char *input, const char *path, bool force)
> >  		(double) c.events_size / 1024.0 / 1024.0,
> >  		c.events_count);
> >  
> > +	cleanup_events(session);
> >  	perf_session__delete(session);
> >  	ctf_writer__cleanup(cw);
> >  
> > -- 
> > 1.8.3.4

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 10/54] perf stat: Forbid user passing improper config terms
  2016-02-05 14:01 ` [PATCH 10/54] perf stat: Forbid user passing improper config terms Wang Nan
@ 2016-02-12 13:49   ` Jiri Olsa
  2016-02-12 15:45     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 13:49 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 05, 2016 at 02:01:35PM +0000, Wang Nan wrote:
> 'perf stat' accepts some config terms but doesn't apply them. For
> example:
> 
>  # perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
>  # ls
>  # exit
> 
>  Performance counter stats for 'bash':
> 
>          266258061      instructions/no-inherit/
>          266258061      instructions/inherit/

hum, but we support no-/inherit in stat, it'd be better to
implement this one for stat IMO


> 
>        1.402183915 seconds time elapsed
> 
> The result is confusing, because user may expect the first
> 'instructions' event exclude the 'ls' command.
> 
> This patch forbit most of those config terms for 'perf stat'.
> 
> Result:
> 
>  # ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
>  event syntax error: 'instructions/no-inherit/'
>                       \___ Don't use record mode only config terms

and there's bunch of others which are sampling related:
  PARSE_EVENTS__TERM_TYPE_SAMPLE_*
  PERF_EVSEL__CONFIG_TERM_CALLGRAPH
  PERF_EVSEL__CONFIG_TERM_STACK_USER
  ...

probably all of the terms from get_config_terms, apart from the
'inherit' ones, should end up with the error message, which could
be more user friendly, like:

  - Can't use stack-size term in stat event.

thanks,
jirka

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/54] perf tools: Enable config raw and numeric events
  2016-02-05 14:01 ` [PATCH 12/54] perf tools: Enable config raw and numeric events Wang Nan
@ 2016-02-12 13:52   ` Jiri Olsa
  2016-02-12 13:56     ` pi3orama
  2016-02-12 13:56     ` Jiri Olsa
  2016-02-12 14:10   ` Jiri Olsa
  2016-02-12 14:12   ` Jiri Olsa
  2 siblings, 2 replies; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 13:52 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 05, 2016 at 02:01:37PM +0000, Wang Nan wrote:
> This patch allows setting config terms for raw and numeric events.
> For example:
> 
>  # perf stat -e cycles/name=cyc/ ls
>  ...
>  1821108      cyc
>  ...
> 
>  # perf stat -e r6530160/name=event/ ls
>  ...
>  1103195      event
>  ...
> 
>  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
>  ...
>  # perf report --stdio
>  ...
>  # Samples: 124  of event 'cycles'
>  46.61%     0.00%  swapper        [kernel.vmlinux]            [k] cpu_startup_entry
>  41.26%     0.00%  swapper        [kernel.vmlinux]            [k] start_secondary
>  ...
>  # Samples: 91  of event 'evtx'
>  ...
>  93.76%     0.00%  swapper      [kernel.vmlinux]            [k] cpu_startup_entry
>          |
>          ---cpu_startup_entry
>             |
>             |--66.63%--call_cpuidle
>             |          cpuidle_enter
>             |          |


got compile error:


[jolsa@krava perf]$ make JOBS=1
  BUILD:   Doing 'make -j1' parallel build
  BISON    util/parse-events-bison.c
util/parse-events.y:436.23-38: error: symbol opt_event_config is used, but is not defined as a token and has no rules
 PE_VALUE ':' PE_VALUE opt_event_config
                       ^^^^^^^^^^^^^^^^
util/parse-events.y:442.68-69: error: $4 of ‘event_legacy_numeric’ has no declared type
        ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, $4));
                                                                    ^^
util/parse-events.y:443.34-35: error: $4 of ‘event_legacy_numeric’ has no declared type
        parse_events__free_terms($4);
                                  ^^
util/parse-events.y:454.74-75: error: $2 of ‘event_legacy_raw’ has no declared type
        ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, $2));
                                                                          ^^
util/parse-events.y:455.34-35: error: $2 of ‘event_legacy_raw’ has no declared type
        parse_events__free_terms($2);
                                  ^^
util/Build:122: recipe for target 'util/parse-events-bison.c' failed
make[3]: *** [util/parse-events-bison.c] Error 1
/home/jolsa/kernel/linux-perf/tools/build/Makefile.build:116: recipe for target 'util' failed
make[2]: *** [util] Error 2
Makefile.perf:434: recipe for target 'libperf-in.o' failed
make[1]: *** [libperf-in.o] Error 2
Makefile:68: recipe for target 'all' failed
make: *** [all] Error 2


jirka

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/54] perf tools: Enable config raw and numeric events
  2016-02-12 13:52   ` Jiri Olsa
@ 2016-02-12 13:56     ` pi3orama
  2016-02-12 13:56     ` Jiri Olsa
  1 sibling, 0 replies; 75+ messages in thread
From: pi3orama @ 2016-02-12 13:56 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Wang Nan, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, linux-kernel



Sent from my iPhone

> On Feb 12, 2016, at 9:52 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
>> On Fri, Feb 05, 2016 at 02:01:37PM +0000, Wang Nan wrote:
>> This patch allows setting config terms for raw and numeric events.
>> For example:
>> 
>> # perf stat -e cycles/name=cyc/ ls
>> ...
>> 1821108      cyc
>> ...
>> 
>> # perf stat -e r6530160/name=event/ ls
>> ...
>> 1103195      event
>> ...
>> 
>> # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
>> ...
>> # perf report --stdio
>> ...
>> # Samples: 124  of event 'cycles'
>> 46.61%     0.00%  swapper        [kernel.vmlinux]            [k] cpu_startup_entry
>> 41.26%     0.00%  swapper        [kernel.vmlinux]            [k] start_secondary
>> ...
>> # Samples: 91  of event 'evtx'
>> ...
>> 93.76%     0.00%  swapper      [kernel.vmlinux]            [k] cpu_startup_entry
>>         |
>>         ---cpu_startup_entry
>>            |
>>            |--66.63%--call_cpuidle
>>            |          cpuidle_enter
>>            |          |
> 
> 
> got compile error:
> 

Have you cleaned the .c files generated
from .y and .l? Most compile errors
related to yacc and lex go away after
removing them with make clean or by hand.

Thank you.

> 
> [jolsa@krava perf]$ make JOBS=1
>  BUILD:   Doing 'make -j1' parallel build
>  BISON    util/parse-events-bison.c
> util/parse-events.y:436.23-38: error: symbol opt_event_config is used, but is not defined as a token and has no rules
> PE_VALUE ':' PE_VALUE opt_event_config
>                       ^^^^^^^^^^^^^^^^
> util/parse-events.y:442.68-69: error: $4 of ‘event_legacy_numeric’ has no declared type
>        ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, $4));
>                                                                    ^^
> util/parse-events.y:443.34-35: error: $4 of ‘event_legacy_numeric’ has no declared type
>        parse_events__free_terms($4);
>                                  ^^
> util/parse-events.y:454.74-75: error: $2 of ‘event_legacy_raw’ has no declared type
>        ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, $2));
>                                                                          ^^
> util/parse-events.y:455.34-35: error: $2 of ‘event_legacy_raw’ has no declared type
>        parse_events__free_terms($2);
>                                  ^^
> util/Build:122: recipe for target 'util/parse-events-bison.c' failed
> make[3]: *** [util/parse-events-bison.c] Error 1
> /home/jolsa/kernel/linux-perf/tools/build/Makefile.build:116: recipe for target 'util' failed
> make[2]: *** [util] Error 2
> Makefile.perf:434: recipe for target 'libperf-in.o' failed
> make[1]: *** [libperf-in.o] Error 2
> Makefile:68: recipe for target 'all' failed
> make: *** [all] Error 2
> 
> 
> jirka

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/54] perf tools: Enable config raw and numeric events
  2016-02-12 13:52   ` Jiri Olsa
  2016-02-12 13:56     ` pi3orama
@ 2016-02-12 13:56     ` Jiri Olsa
  1 sibling, 0 replies; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 13:56 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 12, 2016 at 02:52:45PM +0100, Jiri Olsa wrote:

SNIP

> 
> 
> got compile error:
> 
> 
> [jolsa@krava perf]$ make JOBS=1
>   BUILD:   Doing 'make -j1' parallel build
>   BISON    util/parse-events-bison.c
> util/parse-events.y:436.23-38: error: symbol opt_event_config is used, but is not defined as a token and has no rules
>  PE_VALUE ':' PE_VALUE opt_event_config
>                        ^^^^^^^^^^^^^^^^
> util/parse-events.y:442.68-69: error: $4 of ‘event_legacy_numeric’ has no declared type
>         ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, $4));
>                                                                     ^^
> util/parse-events.y:443.34-35: error: $4 of ‘event_legacy_numeric’ has no declared type
>         parse_events__free_terms($4);
>                                   ^^
> util/parse-events.y:454.74-75: error: $2 of ‘event_legacy_raw’ has no declared type
>         ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, $2));
>                                                                           ^^
> util/parse-events.y:455.34-35: error: $2 of ‘event_legacy_raw’ has no declared type
>         parse_events__free_terms($2);
>                                   ^^
> util/Build:122: recipe for target 'util/parse-events-bison.c' failed
> make[3]: *** [util/parse-events-bison.c] Error 1
> /home/jolsa/kernel/linux-perf/tools/build/Makefile.build:116: recipe for target 'util' failed
> make[2]: *** [util] Error 2
> Makefile.perf:434: recipe for target 'libperf-in.o' failed
> make[1]: *** [libperf-in.o] Error 2
> Makefile:68: recipe for target 'all' failed
> make: *** [all] Error 2

ugh missed the patch 7.. sry for noise

jirka

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 09/54] perf tools: Enable passing event to BPF object
  2016-02-05 14:01 ` [PATCH 09/54] perf tools: Enable passing event to BPF object Wang Nan
@ 2016-02-12 14:05   ` Jiri Olsa
  0 siblings, 0 replies; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 14:05 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 05, 2016 at 02:01:34PM +0000, Wang Nan wrote:

SNIP

> +
> +	op = bpf_map__add_newop(map);
> +	if (IS_ERR(op))
> +		return PTR_ERR(op);
> +	op->op_type = BPF_MAP_OP_SET_EVSEL;
> +	op->v.evsel = evsel;
> +	return 0;
> +}
> +
> +static int
> +bpf_map__config_event(struct bpf_map *map,
> +		      struct parse_events_term *term,
> +		      struct perf_evlist *evlist)
> +{
> +	if (!term->err_val) {
> +		pr_debug("Config value not set\n");
> +		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
> +	}
> +
> +	if (!term->type_val == PARSE_EVENTS__TERM_TYPE_STR) {

this failed to compile due to

  CC       util/bpf-loader.o
util/bpf-loader.c: In function ‘bpf_map__config_event’:
util/bpf-loader.c:1013:22: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses]
  if (!term->type_val == PARSE_EVENTS__TERM_TYPE_STR) {
                      ^
cc1: all warnings being treated as errors
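
To spell out what the compiler is complaining about: '!' binds tighter
than '==', so '!x == y' means '(!x) == y', not '!(x == y)'. A minimal
sketch follows; the enum and its values are made up so that the two
readings give different results, and are not perf's definitions.

```c
#include <stdbool.h>

/* Hypothetical value types; the numeric values are chosen for the demo. */
enum term_type { TERM_TYPE_NUM = 1, TERM_TYPE_STR = 2 };

/* Intended check: true when the value is not a string. */
static bool is_not_str(enum term_type t)
{
	return t != TERM_TYPE_STR;
}

/* What "!t == TERM_TYPE_STR" actually computes, written out step by step. */
static bool buggy_is_not_str(enum term_type t)
{
	int negated = !t;	/* 0 for any non-zero t, 1 for zero */

	return negated == (int)TERM_TYPE_STR;
}
```

With these values the buggy form is false for every input, so the
error path it guards would never be taken.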


jirka

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 07/54] perf tools: Enable BPF object configure syntax
  2016-02-05 14:01 ` [PATCH 07/54] perf tools: Enable BPF object configure syntax Wang Nan
@ 2016-02-12 14:09   ` Jiri Olsa
  2016-02-18  6:17     ` Wangnan (F)
  0 siblings, 1 reply; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 14:09 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 05, 2016 at 02:01:32PM +0000, Wang Nan wrote:

SNIP

>  }
>  |
> -PE_BPF_SOURCE
> +PE_BPF_SOURCE opt_event_config
>  {
>  	struct parse_events_evlist *data = _data;
>  	struct list_head *list;
>  
>  	ALLOC_LIST(list);
> -	ABORT_ON(parse_events_load_bpf(data, list, $1, true));
> +	ABORT_ON(parse_events_load_bpf(data, list, $1, true, $2));
> +	parse_events__free_terms($2);
>  	$$ = list;
>  }
>  
> +opt_event_config:
> +'/' event_config '/'
> +{
> +	$$ = $2;
> +}
> +|
> +{
> +	$$ = NULL;
> +}

can't judge the bpf part, but for the parser part:

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/54] perf tools: Enable config raw and numeric events
  2016-02-05 14:01 ` [PATCH 12/54] perf tools: Enable config raw and numeric events Wang Nan
  2016-02-12 13:52   ` Jiri Olsa
@ 2016-02-12 14:10   ` Jiri Olsa
  2016-02-12 14:12   ` Jiri Olsa
  2 siblings, 0 replies; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 14:10 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 05, 2016 at 02:01:37PM +0000, Wang Nan wrote:
> This patch allows setting config terms for raw and numeric events.
> For example:
> 
>  # perf stat -e cycles/name=cyc/ ls
>  ...
>  1821108      cyc
>  ...
> 
>  # perf stat -e r6530160/name=event/ ls
>  ...
>  1103195      event
>  ...
> 
>  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
>  ...
>  # perf report --stdio
>  ...
>  # Samples: 124  of event 'cycles'
>  46.61%     0.00%  swapper        [kernel.vmlinux]            [k] cpu_startup_entry
>  41.26%     0.00%  swapper        [kernel.vmlinux]            [k] start_secondary
>  ...
>  # Samples: 91  of event 'evtx'
>  ...
>  93.76%     0.00%  swapper      [kernel.vmlinux]            [k] cpu_startup_entry
>          |
>          ---cpu_startup_entry
>             |
>             |--66.63%--call_cpuidle
>             |          cpuidle_enter
>             |          |
>  ...
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/util/parse-events.c |  3 ++-
>  tools/perf/util/parse-events.y | 10 ++++++----
>  2 files changed, 8 insertions(+), 5 deletions(-)

please add new tests into parse-events tests, however for this one:

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 12/54] perf tools: Enable config raw and numeric events
  2016-02-05 14:01 ` [PATCH 12/54] perf tools: Enable config raw and numeric events Wang Nan
  2016-02-12 13:52   ` Jiri Olsa
  2016-02-12 14:10   ` Jiri Olsa
@ 2016-02-12 14:12   ` Jiri Olsa
  2 siblings, 0 replies; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 14:12 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 05, 2016 at 02:01:37PM +0000, Wang Nan wrote:
> This patch allows setting config terms for raw and numeric events.
> For example:
> 
>  # perf stat -e cycles/name=cyc/ ls
>  ...
>  1821108      cyc
>  ...
> 
>  # perf stat -e r6530160/name=event/ ls
>  ...
>  1103195      event
>  ...
> 
>  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
>  ...
>  # perf report --stdio
>  ...
>  # Samples: 124  of event 'cycles'
>  46.61%     0.00%  swapper        [kernel.vmlinux]            [k] cpu_startup_entry
>  41.26%     0.00%  swapper        [kernel.vmlinux]            [k] start_secondary
>  ...
>  # Samples: 91  of event 'evtx'
>  ...
>  93.76%     0.00%  swapper      [kernel.vmlinux]            [k] cpu_startup_entry
>          |
>          ---cpu_startup_entry
>             |
>             |--66.63%--call_cpuidle
>             |          cpuidle_enter
>             |          |
>  ...
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/util/parse-events.c |  3 ++-
>  tools/perf/util/parse-events.y | 10 ++++++----
>  2 files changed, 8 insertions(+), 5 deletions(-)
> 

please add new tests into parse-events tests ;-)

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately
  2016-02-05 14:01 ` [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately Wang Nan
@ 2016-02-12 14:23   ` Jiri Olsa
  2016-02-12 14:34     ` pi3orama
  0 siblings, 1 reply; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 14:23 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 05, 2016 at 02:01:39PM +0000, Wang Nan wrote:

SNIP

>  
> +int parse_events__merge_arrays(struct parse_events_array *dest,
> +			       struct parse_events_array *another)
> +{
> +	struct parse_events_array new;
> +
> +	if (!dest || !another)
> +		return -EINVAL;
> +
> +	new.nr_ranges = dest->nr_ranges + another->nr_ranges;
> +	new.ranges = malloc(sizeof(new.ranges[0]) * new.nr_ranges);
> +	if (!new.ranges)
> +		return -ENOMEM;
> +
> +	memcpy(&new.ranges[0], dest->ranges,
> +	       sizeof(new.ranges[0]) * dest->nr_ranges);
> +	memcpy(&new.ranges[dest->nr_ranges], another->ranges,
> +	       sizeof(new.ranges[0]) * another->nr_ranges);
> +	free(dest->ranges);
> +	free(another->ranges);
> +	*dest = new;
> +	return 0;
> +}

is there a user for this function in this patchset? can't find it..

I recall I've already seen it in earlier versions, but can't find it now ;-)

jirka

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately
  2016-02-12 14:23   ` Jiri Olsa
@ 2016-02-12 14:34     ` pi3orama
  0 siblings, 0 replies; 75+ messages in thread
From: pi3orama @ 2016-02-12 14:34 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Wang Nan, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, linux-kernel



Sent from my iPhone

> On Feb 12, 2016, at 10:23 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Fri, Feb 05, 2016 at 02:01:39PM +0000, Wang Nan wrote:
> 
> SNIP
> 
>> 
>> +int parse_events__merge_arrays(struct parse_events_array *dest,
>> +                   struct parse_events_array *another)
>> +{
>> +    struct parse_events_array new;
>> +
>> +    if (!dest || !another)
>> +        return -EINVAL;
>> +
>> +    new.nr_ranges = dest->nr_ranges + another->nr_ranges;
>> +    new.ranges = malloc(sizeof(new.ranges[0]) * new.nr_ranges);
>> +    if (!new.ranges)
>> +        return -ENOMEM;
>> +
>> +    memcpy(&new.ranges[0], dest->ranges,
>> +           sizeof(new.ranges[0]) * dest->nr_ranges);
>> +    memcpy(&new.ranges[dest->nr_ranges], another->ranges,
>> +           sizeof(new.ranges[0]) * another->nr_ranges);
>> +    free(dest->ranges);
>> +    free(another->ranges);
>> +    *dest = new;
>> +    return 0;
>> +}
> 
> is there a user for this function in this patchset? can't find it..
> 
> I recall I've already seen it in earlier versions, but can't find it now ;-)
> 

Sorry, I should have removed this function. It
was designed to be used in patch 15, but
I found that using it required more code,
so patch 15 does the merging inline instead,
which leaves this function unused.

Thank you.

> jirka
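
For readers following the thread, the merge being discussed can be sketched as a standalone helper. This is only an illustration under stated assumptions: 'struct range', 'struct range_array' and range_array__merge() are hypothetical stand-ins, not the perf types or the function from the patch, and the overflow check is an addition that the quoted code does not have.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the perf types; field names are assumptions. */
struct range {
	unsigned int start;
	size_t length;
};

struct range_array {
	size_t nr_ranges;
	struct range *ranges;
};

/*
 * Append the ranges of 'another' to 'dest'.
 * On success 'dest' owns the merged buffer and 'another' holds no ranges.
 * On failure neither array is modified.
 */
int range_array__merge(struct range_array *dest, struct range_array *another)
{
	const size_t max_nr = SIZE_MAX / sizeof(struct range);
	struct range *merged;
	size_t nr;

	if (!dest || !another)
		return -EINVAL;

	/* Nothing to append: leave both arrays untouched. */
	if (!another->nr_ranges)
		return 0;

	/* Reject element counts whose byte size would overflow size_t. */
	if (dest->nr_ranges > max_nr ||
	    another->nr_ranges > max_nr - dest->nr_ranges)
		return -ENOMEM;

	nr = dest->nr_ranges + another->nr_ranges;
	merged = malloc(nr * sizeof(*merged));
	if (!merged)
		return -ENOMEM;

	if (dest->nr_ranges)
		memcpy(merged, dest->ranges,
		       dest->nr_ranges * sizeof(*merged));
	memcpy(merged + dest->nr_ranges, another->ranges,
	       another->nr_ranges * sizeof(*merged));

	/* Both old buffers are consumed; reset 'another' so it cannot dangle. */
	free(dest->ranges);
	free(another->ranges);
	dest->ranges = merged;
	dest->nr_ranges = nr;
	another->ranges = NULL;
	another->nr_ranges = 0;
	return 0;
}
```

Resetting 'another' after the merge is a design choice of this sketch: the caller can then free or reuse it without risking a double free.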

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 05/54] perf data: Fix releasing event_class
  2016-02-12 12:19     ` Jiri Olsa
@ 2016-02-12 15:42       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 75+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-12 15:42 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Wang Nan, Alexei Starovoitov, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 12, 2016 at 01:19:42PM +0100, Jiri Olsa wrote:
> On Thu, Feb 11, 2016 at 07:04:13PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Fri, Feb 05, 2016 at 02:01:30PM +0000, Wang Nan wrote:
> > > A new patch of libbabeltrace [1] reveals an object leak problem in
> > > perf data CTF support: perf code never releases the event_class which is
> > > allocated in add_event() and stored in evsel's private field.
> > > 
> > > If libbabeltrace has the above patch applied, leaking event_class
> > > prevents the writer from being destroyed and flushing metadata. For example:
> > 
> > Ok, so if the user has an older version of this libbabeltrace, what
> > happens? 
> 
> it's standard cleanup that should be there even for old version,
> 
> IIUC the problem is that new version of libbabeltrace started
> to check on refcounts on some exit function and gets crazy
> if there's a mess/leak
> 
> IIRC I've already acked this one

Ok, Wang, when you notice such Acked-by tags, please collect them, i.e.
add them to new versions of your patchkits. That helps me process them,
since I then know there was already discussion/acknowledgement from people
directly involved in the affected code.

thanks,

- Arnaldo
 
> jirka
> 
> > 
> > Would he/she gets some warning about the requirement of a later version?
> > 
> > - Arnaldo
> >  
> > >  $ ./perf record ls
> > >  Lowering default frequency rate to 500.
> > >  Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
> > >  perf.data
> > >  [ perf record: Woken up 1 times to write data ]
> > >  [ perf record: Captured and wrote 0.012 MB perf.data (12 samples) ]
> > >  $ ./perf data convert --to-ctf ./out.ctf
> > >  [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
> > >  [ perf data convert: Converted and wrote 0.000 MB (12 samples) ]
> > >  $ cat ./out.ctf/metadata
> > >  $ ls -l  ./out.ctf/metadata
> > >  -rw-r----- 1 w00229757 mm 0 Jan 27 10:49 ./out.ctf/metadata
> > > 
> > > The correct result should be:
> > >  ...
> > >  $ cat ./out.ctf/metadata
> > >  /* CTF 1.8 */
> > > 
> > >  trace {
> > >  [SNIP]
> > > 
> > >  $ ls -l  ./out.ctf/metadata
> > >  -rw-r----- 1 w00229757 mm 2446 Jan 27 10:52 ./out.ctf/metadata
> > > 
> > > The full story is:
> > > 
> > >  Patch [1] of babeltrace redesigns its reference counting scheme. In that
> > >  patch:
> > > 
> > >   * writer <- trace (bt_ctf_writer_create)
> > >   * trace <- stream_class (bt_ctf_trace_add_stream_class)
> > >   * stream_class <- event_class (bt_ctf_stream_class_add_event_class)
> > >   ('<-' means 'is a parent of')
> > > 
> > >   Holding event_class causes the reference count of the corresponding
> > >   'writer' to increase through the parent chain. Perf expects 'writer' to
> > >   be released (so metadata is flushed) through bt_ctf_writer_put() in
> > >   ctf_writer__cleanup(). However, since it never releases event_class,
> > >   the reference on 'writer' won't be dropped, so bt_ctf_writer_put()
> > >   won't lead to the release of the writer.
> > > 
> > >  Before this CTF patch, !(writer <- trace). Even with event_class leaking,
> > >  the writer ends up being released.
> > > 
> > > [1] https://github.com/efficios/babeltrace/commit/e6a8e8e4744633807083a077ff9f101eb97d9801
> > > 
> > > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > Cc: Jiri Olsa <jolsa@kernel.org>
> > > Cc: Jérémie Galarneau <jeremie.galarneau@efficios.com>
> > > Cc: Zefan Li <lizefan@huawei.com>
> > > Cc: pi3orama@163.com
> > > ---
> > >  tools/perf/util/data-convert-bt.c | 18 ++++++++++++++++++
> > >  1 file changed, 18 insertions(+)
> > > 
> > > diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
> > > index 34cd1e4..b722e57 100644
> > > --- a/tools/perf/util/data-convert-bt.c
> > > +++ b/tools/perf/util/data-convert-bt.c
> > > @@ -858,6 +858,23 @@ static int setup_events(struct ctf_writer *cw, struct perf_session *session)
> > >  	return 0;
> > >  }
> > >  
> > > +static void cleanup_events(struct perf_session *session)
> > > +{
> > > +	struct perf_evlist *evlist = session->evlist;
> > > +	struct perf_evsel *evsel;
> > > +
> > > +	evlist__for_each(evlist, evsel) {
> > > +		struct evsel_priv *priv;
> > > +
> > > +		priv = evsel->priv;
> > > +		bt_ctf_event_class_put(priv->event_class);
> > > +		zfree(&evsel->priv);
> > > +	}
> > > +
> > > +	perf_evlist__delete(evlist);
> > > +	session->evlist = NULL;
> > > +}
> > > +
> > >  static int setup_streams(struct ctf_writer *cw, struct perf_session *session)
> > >  {
> > >  	struct ctf_stream **stream;
> > > @@ -1171,6 +1188,7 @@ int bt_convert__perf2ctf(const char *input, const char *path, bool force)
> > >  		(double) c.events_size / 1024.0 / 1024.0,
> > >  		c.events_count);
> > >  
> > > +	cleanup_events(session);
> > >  	perf_session__delete(session);
> > >  	ctf_writer__cleanup(cw);
> > >  
> > > -- 
> > > 1.8.3.4

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 10/54] perf stat: Forbid user passing improper config terms
  2016-02-12 13:49   ` Jiri Olsa
@ 2016-02-12 15:45     ` Arnaldo Carvalho de Melo
  2016-02-12 15:50       ` Jiri Olsa
  0 siblings, 1 reply; 75+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-12 15:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Wang Nan, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Brendan Gregg, Adrian Hunter, Cody P Schafer, David S. Miller,
	He Kuang, Jérémie Galarneau, Jiri Olsa, Kirill Smelkov,
	Li Zefan, Masami Hiramatsu, Namhyung Kim, Peter Zijlstra,
	pi3orama, linux-kernel

On Fri, Feb 12, 2016 at 02:49:08PM +0100, Jiri Olsa wrote:
> On Fri, Feb 05, 2016 at 02:01:35PM +0000, Wang Nan wrote:
> > 'perf stat' accepts some config terms but doesn't apply them. For
> > example:
> > 
> >  # perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
> >  # ls
> >  # exit
> > 
> >  Performance counter stats for 'bash':
> > 
> >          266258061      instructions/no-inherit/
> >          266258061      instructions/inherit/
> 
> hum, but we support no-/inherit in stat, it'd be better to
> implement this one for stat IMO
> 
> 
> > 
> >        1.402183915 seconds time elapsed
> > 
> > > The result is confusing, because the user may expect the first
> > > 'instructions' event to exclude the 'ls' command.
> > > 
> > > This patch forbids most of those config terms for 'perf stat'.
> > 
> > Result:
> > 
> >  # ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
> >  event syntax error: 'instructions/no-inherit/'
> >                       \___ Don't use record mode only config terms
> 
> and there's a bunch of others which are sampling related:
>   PARSE_EVENTS__TERM_TYPE_SAMPLE_*
>   PERF_EVSEL__CONFIG_TERM_CALLGRAPH
>   PERF_EVSEL__CONFIG_TERM_STACK_USER
>   ...
> 
> probably all from get_config_terms apart from the 'inherit' ones,
> which should end up with the error message, which could be more
> user friendly like:
> 
>   - Can't use stack-size term in stat event.

'stat' event? If this is the term you want to use maybe:

    'stack-size' is not usable in 'perf stat'.

But perhaps it would be better as:

    'stack-size' can only be used when sampling.

Or even more verbose:

    The 'stack-size' can only be configured when sampling, not when just
    counting events.

No?

- Arnaldo

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 10/54] perf stat: Forbid user passing improper config terms
  2016-02-12 15:45     ` Arnaldo Carvalho de Melo
@ 2016-02-12 15:50       ` Jiri Olsa
  0 siblings, 0 replies; 75+ messages in thread
From: Jiri Olsa @ 2016-02-12 15:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Wang Nan, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Brendan Gregg, Adrian Hunter, Cody P Schafer, David S. Miller,
	He Kuang, Jérémie Galarneau, Jiri Olsa, Kirill Smelkov,
	Li Zefan, Masami Hiramatsu, Namhyung Kim, Peter Zijlstra,
	pi3orama, linux-kernel

On Fri, Feb 12, 2016 at 01:45:17PM -0200, Arnaldo Carvalho de Melo wrote:
> On Fri, Feb 12, 2016 at 02:49:08PM +0100, Jiri Olsa wrote:
> > On Fri, Feb 05, 2016 at 02:01:35PM +0000, Wang Nan wrote:
> > > 'perf stat' accepts some config terms but doesn't apply them. For
> > > example:
> > > 
> > >  # perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
> > >  # ls
> > >  # exit
> > > 
> > >  Performance counter stats for 'bash':
> > > 
> > >          266258061      instructions/no-inherit/
> > >          266258061      instructions/inherit/
> > 
> > hum, but we support no-/inherit in stat, it'd be better to
> > implement this one for stat IMO
> > 
> > 
> > > 
> > >        1.402183915 seconds time elapsed
> > > 
> > > > The result is confusing, because the user may expect the first
> > > > 'instructions' event to exclude the 'ls' command.
> > > > 
> > > > This patch forbids most of those config terms for 'perf stat'.
> > > 
> > > Result:
> > > 
> > >  # ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
> > >  event syntax error: 'instructions/no-inherit/'
> > >                       \___ Don't use record mode only config terms
> > 
> > and there's a bunch of others which are sampling related:
> >   PARSE_EVENTS__TERM_TYPE_SAMPLE_*
> >   PERF_EVSEL__CONFIG_TERM_CALLGRAPH
> >   PERF_EVSEL__CONFIG_TERM_STACK_USER
> >   ...
> > 
> > probably all from get_config_terms apart from the 'inherit' ones,
> > which should end up with the error message, which could be more
> > user friendly like:
> > 
> >   - Can't use stack-size term in stat event.
> 
> 'stat' event? If this is the term you want to use maybe:
> 
>     'stack-size' is not usable in 'perf stat'.

ok I vote for this one then ^^^ ;-)

jirka
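
A check that produces the message agreed on above could look roughly like the following. This is a sketch under assumptions: 'enum term_type', 'term_descs' and check_term_for_stat() are hypothetical and do not correspond to the perf parser's data structures. The inherit terms are left usable because, as noted earlier in the thread, 'perf stat' supports them.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical subset of config terms; not the perf enum. */
enum term_type {
	TERM_NAME,
	TERM_SAMPLE_PERIOD,
	TERM_CALLGRAPH,
	TERM_STACK_SIZE,
	TERM_INHERIT,
	TERM_NOINHERIT,
	TERM_MAX,
};

struct term_desc {
	const char *name;
	bool sampling_only;	/* meaningful only when sampling */
};

static const struct term_desc term_descs[TERM_MAX] = {
	[TERM_NAME]          = { "name",       false },
	[TERM_SAMPLE_PERIOD] = { "period",     true  },
	[TERM_CALLGRAPH]     = { "call-graph", true  },
	[TERM_STACK_SIZE]    = { "stack-size", true  },
	[TERM_INHERIT]       = { "inherit",    false },
	[TERM_NOINHERIT]     = { "no-inherit", false },
};

/*
 * Return 0 if the term may be used when only counting events.
 * Otherwise write a user-facing message into 'err' and return -EINVAL.
 */
int check_term_for_stat(enum term_type type, char *err, size_t errsz)
{
	const struct term_desc *desc;

	if ((unsigned int)type >= TERM_MAX)
		return -EINVAL;

	desc = &term_descs[type];
	if (!desc->sampling_only)
		return 0;

	snprintf(err, errsz, "'%s' is not usable in 'perf stat'.", desc->name);
	return -EINVAL;
}
```

Keeping the term name in a table means the message names the offending term instead of a generic "record mode only" hint.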

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH 08/54] perf record: Apply config to BPF objects before recording
  2016-02-05 14:01 ` [PATCH 08/54] perf record: Apply config to BPF objects before recording Wang Nan
@ 2016-02-12 20:55   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 75+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-12 20:55 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo, Brendan Gregg,
	Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel

On Fri, Feb 05, 2016 at 02:01:33PM +0000, Wang Nan wrote:
> bpf__apply_obj_config() is introduced as the core API to apply object
> config options to all BPF objects. This patch also does the real work
> for setting values for BPF_MAP_TYPE_PERF_ARRAY maps by inserting value
> stored in map's private field into the BPF map.

Ok, I have everything up to here in my perf/core branch. I will perform
tests and see if I can push to Ingo so that we make progress. Please take
a look at Jiri's comments for the next patches.

- Arnaldo

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [tip:perf/core] perf symbols: Fix symbols searching for module in buildid-cache
  2016-02-05 14:01 ` [PATCH 02/54] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
@ 2016-02-16  7:52   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 75+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-16  7:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: ast, masami.hiramatsu.pt, linux-kernel, namhyung, lizefan, kirr,
	jolsa, jeremie.galarneau, wangnan0, mingo, brendan.d.gregg, acme,
	peterz, adrian.hunter, dev, tglx, hekuang, hpa

Commit-ID:  e7ee404757609067c8f261d90251f1e96459c535
Gitweb:     http://git.kernel.org/tip/e7ee404757609067c8f261d90251f1e96459c535
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 5 Feb 2016 14:01:27 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 12 Feb 2016 10:54:47 -0300

perf symbols: Fix symbols searching for module in buildid-cache

Before this patch, if a sample is triggered inside a module not in
/lib/modules/`uname -r`/, even if the module is in buildid-cache, 'perf
report' will still be unable to find the correct symbol.  For example:

  # rm -rf ~/.debug/
  # perf buildid-cache -a ./mymodule.ko
  # perf probe -m ./mymodule.ko -a get_mymodule_val
  Added new event:
    probe:get_mymodule_val (on get_mymodule_val in mymodule)

  You can now use it in all perf tools, such as:

 	perf record -e probe:get_mymodule_val -aR sleep 1

  # perf record -e probe:get_mymodule_val cat /proc/mymodule
  mymodule:3
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]

  # perf report --stdio
  [SNIP]
  #
  # Overhead  Command  Shared Object     Symbol
  # ........  .......  ................  ......................
  #
    100.00%  cat      [mymodule]        [k] 0x0000000000000001

  # perf report -vvvv --stdio
  dso__load_sym: adjusting symbol: st_value: 0 sh_addr: 0 sh_offset: 0x70
  symbol__new: get_mymodule_val 0x70-0x8a
  [SNIP]

This is caused by dso__load() -> dso__load_sym(). In dso__load(), kmod
is true only when the file is found in one of a few well-known
directories, so all files loaded from the buildid-cache are treated as
user programs. The subsequent dso__load_sym() then sets map->pgoff
incorrectly.

This patch gives kernel modules in the buildid-cache a chance to adjust
the value of kmod. After dso__load() gets the type of symbols, if it is
buildid, check the last 3 chars of the original filename against '.ko',
and adjust the value of kmod if the file is a kernel module.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/build-id.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/build-id.h |  1 +
 tools/perf/util/symbol.c   |  4 ++++
 3 files changed, 49 insertions(+)

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index b28100e..f1479ee 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -166,6 +166,50 @@ char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size)
 	return build_id__filename(build_id_hex, bf, size);
 }
 
+bool dso__build_id_is_kmod(const struct dso *dso, char *bf, size_t size)
+{
+	char *id_name, *ch;
+	struct stat sb;
+
+	id_name = dso__build_id_filename(dso, bf, size);
+	if (!id_name)
+		goto err;
+	if (access(id_name, F_OK))
+		goto err;
+	if (lstat(id_name, &sb) == -1)
+		goto err;
+	if ((size_t)sb.st_size > size - 1)
+		goto err;
+	if (readlink(id_name, bf, size - 1) < 0)
+		goto err;
+
+	bf[sb.st_size] = '\0';
+
+	/*
+	 * link should be:
+	 * ../../lib/modules/4.4.0-rc4/kernel/net/ipv4/netfilter/nf_nat_ipv4.ko/a09fe3eb3147dafa4e3b31dbd6257e4d696bdc92
+	 */
+	ch = strrchr(bf, '/');
+	if (!ch)
+		goto err;
+	if (ch - 3 < bf)
+		goto err;
+
+	return strncmp(".ko", ch - 3, 3) == 0;
+err:
+	/*
+	 * If dso__build_id_filename work, get id_name again,
+	 * because id_name points to bf and is broken.
+	 */
+	if (id_name)
+		id_name = dso__build_id_filename(dso, bf, size);
+	pr_err("Invalid build id: %s\n", id_name ? :
+					 dso->long_name ? :
+					 dso->short_name ? :
+					 "[unknown]");
+	return false;
+}
+
 #define dsos__for_each_with_build_id(pos, head)	\
 	list_for_each_entry(pos, head, node)	\
 		if (!pos->has_build_id)		\
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 27a14a8..64af3e2 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -16,6 +16,7 @@ int sysfs__sprintf_build_id(const char *root_dir, char *sbuild_id);
 int filename__sprintf_build_id(const char *pathname, char *sbuild_id);
 
 char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size);
+bool dso__build_id_is_kmod(const struct dso *dso, char *bf, size_t size);
 
 int build_id__mark_dso_hit(struct perf_tool *tool, union perf_event *event,
 			   struct perf_sample *sample, struct perf_evsel *evsel,
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 90cedfa..e7588dc 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1529,6 +1529,10 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 	if (!runtime_ss && syms_ss)
 		runtime_ss = syms_ss;
 
+	if (syms_ss && syms_ss->type == DSO_BINARY_TYPE__BUILD_ID_CACHE)
+		if (dso__build_id_is_kmod(dso, name, PATH_MAX))
+			kmod = true;
+
 	if (syms_ss)
 		ret = dso__load_sym(dso, map, syms_ss, runtime_ss, filter, kmod);
 	else
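
The '.ko' test on the buildid-cache link target can be shown in isolation. This is a minimal sketch, assuming the link target has the layout given in the comment in the patch above; link_target_is_kmod() is a hypothetical name, and the readlink() and error reporting steps are left out.

```c
#include <stdbool.h>
#include <string.h>

/*
 * The buildid-cache link target ends in "<original file name>/<build id>".
 * Return true if the component before the final '/' ends in ".ko".
 */
bool link_target_is_kmod(const char *target)
{
	const char *slash = strrchr(target, '/');

	/* Need a '/' with at least three characters in front of it. */
	if (!slash || slash - target < 3)
		return false;

	return strncmp(slash - 3, ".ko", 3) == 0;
}
```

For the example path in the patch comment, the last '/' sits right after 'nf_nat_ipv4.ko', so the three characters before it are '.ko' and the function returns true.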

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [tip:perf/core] perf tools: Unlink entries from terms list
  2016-02-05 14:01 ` [PATCH 01/54] perf tools: Fix dangling pointers in parse_events__free_terms Wang Nan
@ 2016-02-16  7:53   ` tip-bot for Wang Nan
  2016-02-16  7:54   ` [tip:perf/core] perf tools: Free the terms list_head in parse_events__free_terms() tip-bot for Wang Nan
  1 sibling, 0 replies; 75+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-16  7:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: masami.hiramatsu.pt, ast, hpa, tglx, jolsa, lizefan, mingo,
	hekuang, acme, wangnan0, namhyung, linux-kernel

Commit-ID:  a8adfceb389a0045e06af22517fa3326797b160a
Gitweb:     http://git.kernel.org/tip/a8adfceb389a0045e06af22517fa3326797b160a
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 12 Feb 2016 16:31:23 -0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 12 Feb 2016 16:51:15 -0300

perf tools: Unlink entries from terms list

We were just freeing them; it is better to also unlink them and init their
nodes, so that bugs show up faster if we keep dangling references to them.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch, use list_del_init() instead of list_del() ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 813d9b2..133c8d2 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2072,8 +2072,10 @@ void parse_events__free_terms(struct list_head *terms)
 {
 	struct parse_events_term *term, *h;
 
-	list_for_each_entry_safe(term, h, terms, list)
+	list_for_each_entry_safe(term, h, terms, list) {
+		list_del_init(&term->list);
 		free(term);
+	}
 }
 
 void parse_events_evlist_error(struct parse_events_evlist *data,

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [tip:perf/core] perf tools: Free the terms list_head in parse_events__free_terms()
  2016-02-05 14:01 ` [PATCH 01/54] perf tools: Fix dangling pointers in parse_events__free_terms Wang Nan
  2016-02-16  7:53   ` [tip:perf/core] perf tools: Unlink entries from terms list tip-bot for Wang Nan
@ 2016-02-16  7:54   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 75+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-16  7:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hekuang, hpa, acme, linux-kernel, lizefan, mingo, tglx, wangnan0,
	namhyung, ast, masami.hiramatsu.pt, jolsa

Commit-ID:  d20a5f2b277b2f46548fb60f2bb95ad9a601d3fe
Gitweb:     http://git.kernel.org/tip/d20a5f2b277b2f46548fb60f2bb95ad9a601d3fe
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 12 Feb 2016 17:01:17 -0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 12 Feb 2016 17:01:17 -0300

perf tools: Free the terms list_head in parse_events__free_terms()

Fix a leak: code calling parse_events__free_terms() expects it to free
the list_head too.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 668afdc..d1b49ec 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2081,6 +2081,7 @@ void parse_events_terms__purge(struct list_head *terms)
 void parse_events__free_terms(struct list_head *terms)
 {
 	parse_events_terms__purge(terms);
+	free(terms);
 }
 
 void parse_events_evlist_error(struct parse_events_evlist *data,
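
The split between purging the entries and freeing the list head, as done by the two commits above, can be sketched outside the kernel tree. Assumptions: the small list helpers below are a userspace re-implementation for illustration rather than the kernel's list.h, and 'struct term', terms_purge(), terms_free() and terms_new() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list; a stand-in for list_head. */
struct list_node {
	struct list_node *prev, *next;
};

void list_init(struct list_node *n)
{
	n->prev = n->next = n;
}

bool list_empty(const struct list_node *head)
{
	return head->next == head;
}

void list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the node and make it point at itself, like list_del_init(). */
void list_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct term {
	int value;
	struct list_node list;
};

#define term_of(node) \
	((struct term *)((char *)(node) - offsetof(struct term, list)))

/* Unlink and free every term; the head stays valid and becomes empty. */
void terms_purge(struct list_node *head)
{
	struct list_node *pos = head->next, *next;

	while (pos != head) {
		next = pos->next;	/* saved before pos is freed */
		list_del_init(pos);
		free(term_of(pos));
		pos = next;
	}
}

/* For heap-allocated heads: purge the entries, then free the head itself. */
void terms_free(struct list_node *head)
{
	terms_purge(head);
	free(head);
}

/* Build a heap-allocated list with n terms, or return NULL on failure. */
struct list_node *terms_new(int n)
{
	struct list_node *head = malloc(sizeof(*head));
	int i;

	if (!head)
		return NULL;
	list_init(head);

	for (i = 0; i < n; i++) {
		struct term *t = malloc(sizeof(*t));

		if (!t) {
			terms_free(head);
			return NULL;
		}
		t->value = i;
		list_add_tail(&t->list, head);
	}
	return head;
}
```

Unlinking each entry before freeing it keeps the head consistent at every step, so the head can be reused after a purge, and callers that allocated the head on the heap use terms_free() to release it as well.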

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [tip:perf/core] perf data: Fix releasing event_class
  2016-02-05 14:01 ` [PATCH 05/54] perf data: Fix releasing event_class Wang Nan
       [not found]   ` <20160211220413.GF32168@kernel.org>
@ 2016-02-16  7:55   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 75+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, adrian.hunter, jeremie.galarneau, jolsa, mingo,
	hpa, hekuang, namhyung, brendan.d.gregg, tglx, kirr, lizefan,
	ast, masami.hiramatsu.pt, dev, acme, wangnan0, peterz

Commit-ID:  5141d7350d3d8a12f1f76b1015b937f14d2b97e2
Gitweb:     http://git.kernel.org/tip/5141d7350d3d8a12f1f76b1015b937f14d2b97e2
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 5 Feb 2016 14:01:30 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 12 Feb 2016 17:27:48 -0300

perf data: Fix releasing event_class

A new patch in libbabeltrace [1] reveals an object leak problem in
'perf data' CTF support: perf code never releases the event_class
which is allocated in add_event() and stored in evsel's private field.

If libbabeltrace has the above patch applied, leaking event_class
prevents the writer from being destroyed and flushing metadata. For
example:

  $ perf record ls
  perf.data
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data (12 samples) ]
  $ perf data convert --to-ctf ./out.ctf
  [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
  [ perf data convert: Converted and wrote 0.000 MB (12 samples) ]
  $ cat ./out.ctf/metadata
  $ ls -l  ./out.ctf/metadata
  -rw-r----- 1 w00229757 mm 0 Jan 27 10:49 ./out.ctf/metadata

The correct result should be:
  ...
  $ cat ./out.ctf/metadata
  /* CTF 1.8 */

  trace {
  [SNIP]

  $ ls -l  ./out.ctf/metadata
  -rw-r----- 1 w00229757 mm 2446 Jan 27 10:52 ./out.ctf/metadata

The full story is:

Patch [1] of babeltrace redesigns its reference counting scheme. In that
patch:

 * writer <- trace (bt_ctf_writer_create)
 * trace <- stream_class (bt_ctf_trace_add_stream_class)
 * stream_class <- event_class (bt_ctf_stream_class_add_event_class)
 ('<-' means 'is a parent of')

Holding of event_class causes reference count of corresponding 'writer'
to increase through parent chain. Perf expects that 'writer' is released
(so metadata is flushed) through bt_ctf_writer_put() in
ctf_writer__cleanup(). However, since it never releases event_class, the
reference of 'writer' won't be dropped, so bt_ctf_writer_put() won't
lead to the release of writer.

Before this CTF patch, !(writer <- trace). Even with event_class leaking,
the writer ends up being released.

[1] https://github.com/efficios/babeltrace/commit/e6a8e8e4744633807083a077ff9f101eb97d9801

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/data-convert-bt.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 34cd1e4..b722e57 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -858,6 +858,23 @@ static int setup_events(struct ctf_writer *cw, struct perf_session *session)
 	return 0;
 }
 
+static void cleanup_events(struct perf_session *session)
+{
+	struct perf_evlist *evlist = session->evlist;
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		struct evsel_priv *priv;
+
+		priv = evsel->priv;
+		bt_ctf_event_class_put(priv->event_class);
+		zfree(&evsel->priv);
+	}
+
+	perf_evlist__delete(evlist);
+	session->evlist = NULL;
+}
+
 static int setup_streams(struct ctf_writer *cw, struct perf_session *session)
 {
 	struct ctf_stream **stream;
@@ -1171,6 +1188,7 @@ int bt_convert__perf2ctf(const char *input, const char *path, bool force)
 		(double) c.events_size / 1024.0 / 1024.0,
 		c.events_count);
 
+	cleanup_events(session);
 	perf_session__delete(session);
 	ctf_writer__cleanup(cw);
 

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH 07/54] perf tools: Enable BPF object configure syntax
  2016-02-12 14:09   ` Jiri Olsa
@ 2016-02-18  6:17     ` Wangnan (F)
  0 siblings, 0 replies; 75+ messages in thread
From: Wangnan (F) @ 2016-02-18  6:17 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg, Adrian Hunter,
	Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	linux-kernel



On 2016/2/12 22:09, Jiri Olsa wrote:
> On Fri, Feb 05, 2016 at 02:01:32PM +0000, Wang Nan wrote:
>
> SNIP
>
>>   }
>>   |
>> -PE_BPF_SOURCE
>> +PE_BPF_SOURCE opt_event_config
>>   {
>>   	struct parse_events_evlist *data = _data;
>>   	struct list_head *list;
>>   
>>   	ALLOC_LIST(list);
>> -	ABORT_ON(parse_events_load_bpf(data, list, $1, true));
>> +	ABORT_ON(parse_events_load_bpf(data, list, $1, true, $2));
>> +	parse_events__free_terms($2);
>>   	$$ = list;
>>   }
>>   
>> +opt_event_config:
>> +'/' event_config '/'
>> +{
>> +	$$ = $2;
>> +}
>> +|
>> +{
>> +	$$ = NULL;
>> +}
> can't judge the bpf part, but for the parser part:
>
> Acked-by: Jiri Olsa <jolsa@kernel.org>

You have already acked this patch before :)

Thank you.

^ permalink raw reply	[flat|nested] 75+ messages in thread

end of thread, other threads:[~2016-02-18  6:19 UTC | newest]

Thread overview: 75+ messages
2016-02-05 14:01 [PATCH 00/54] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
2016-02-05 14:01 ` [PATCH 01/54] perf tools: Fix dangling pointers in parse_events__free_terms Wang Nan
2016-02-16  7:53   ` [tip:perf/core] perf tools: Unlink entries from terms list tip-bot for Wang Nan
2016-02-16  7:54   ` [tip:perf/core] perf tools: Free the terms list_head in parse_events__free_terms() tip-bot for Wang Nan
2016-02-05 14:01 ` [PATCH 02/54] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
2016-02-16  7:52   ` [tip:perf/core] perf symbols: Fix symbols searching for " tip-bot for Wang Nan
2016-02-05 14:01 ` [PATCH 03/54] perf tools: Record text offset in dso to calculate objdump address Wang Nan
2016-02-05 14:01 ` [PATCH 04/54] perf tools: Adjust symbol for shared objects Wang Nan
2016-02-05 14:01 ` [PATCH 05/54] perf data: Fix releasing event_class Wang Nan
     [not found]   ` <20160211220413.GF32168@kernel.org>
2016-02-12 12:19     ` Jiri Olsa
2016-02-12 15:42       ` Arnaldo Carvalho de Melo
2016-02-16  7:55   ` [tip:perf/core] " tip-bot for Wang Nan
2016-02-05 14:01 ` [PATCH 06/54] perf tools: Add API to config maps in bpf object Wang Nan
2016-02-05 14:01 ` [PATCH 07/54] perf tools: Enable BPF object configure syntax Wang Nan
2016-02-12 14:09   ` Jiri Olsa
2016-02-18  6:17     ` Wangnan (F)
2016-02-05 14:01 ` [PATCH 08/54] perf record: Apply config to BPF objects before recording Wang Nan
2016-02-12 20:55   ` Arnaldo Carvalho de Melo
2016-02-05 14:01 ` [PATCH 09/54] perf tools: Enable passing event to BPF object Wang Nan
2016-02-12 14:05   ` Jiri Olsa
2016-02-05 14:01 ` [PATCH 10/54] perf stat: Forbid user passing improper config terms Wang Nan
2016-02-12 13:49   ` Jiri Olsa
2016-02-12 15:45     ` Arnaldo Carvalho de Melo
2016-02-12 15:50       ` Jiri Olsa
2016-02-05 14:01 ` [PATCH 11/54] perf tools: Rename and move pmu_event_name to get_config_name Wang Nan
2016-02-05 14:01 ` [PATCH 12/54] perf tools: Enable config raw and numeric events Wang Nan
2016-02-12 13:52   ` Jiri Olsa
2016-02-12 13:56     ` pi3orama
2016-02-12 13:56     ` Jiri Olsa
2016-02-12 14:10   ` Jiri Olsa
2016-02-12 14:12   ` Jiri Olsa
2016-02-05 14:01 ` [PATCH 13/54] perf tools: Enable config and setting names for legacy cache events Wang Nan
2016-02-05 14:01 ` [PATCH 14/54] perf tools: Support setting different slots in a BPF map separately Wang Nan
2016-02-12 14:23   ` Jiri Olsa
2016-02-12 14:34     ` pi3orama
2016-02-05 14:01 ` [PATCH 15/54] perf tools: Enable indices setting syntax for BPF maps Wang Nan
2016-02-05 14:01 ` [PATCH 16/54] perf tools: Pass tracepoint options to BPF script Wang Nan
2016-02-05 14:01 ` [PATCH 17/54] perf tools: Introduce bpf-output event Wang Nan
2016-02-05 14:01 ` [PATCH 18/54] perf data: Support converting data from bpf_perf_event_output() Wang Nan
2016-02-05 14:01 ` [PATCH 19/54] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
2016-02-05 14:01 ` [PATCH 20/54] perf core: Set event's default overflow_handler Wang Nan
2016-02-05 14:01 ` [PATCH 21/54] perf core: Prepare writing into ring buffer from end Wang Nan
2016-02-05 14:01 ` [PATCH 22/54] perf core: Add backward attribute to perf event Wang Nan
2016-02-05 14:01 ` [PATCH 23/54] perf core: Reduce perf event output overhead by new overflow handler Wang Nan
2016-02-05 14:01 ` [PATCH 24/54] perf tools: Only validate is_pos for tracking evsels Wang Nan
2016-02-05 14:01 ` [PATCH 25/54] perf tools: Print write_backward value in perf_event_attr__fprintf Wang Nan
2016-02-05 14:01 ` [PATCH 26/54] perf tools: Make ordered_events reusable Wang Nan
2016-02-05 14:01 ` [PATCH 27/54] perf record: Extract synthesize code to record__synthesize() Wang Nan
2016-02-05 14:01 ` [PATCH 28/54] perf tools: Add perf_data_file__switch() helper Wang Nan
2016-02-05 14:01 ` [PATCH 29/54] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
2016-02-05 14:01 ` [PATCH 30/54] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
2016-02-05 14:01 ` [PATCH 31/54] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
2016-02-05 14:01 ` [PATCH 32/54] perf record: Split output into multiple files via '--switch-output' Wang Nan
2016-02-05 14:01 ` [PATCH 33/54] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
2016-02-05 14:01 ` [PATCH 34/54] perf record: Disable buildid cache options by default in switch output mode Wang Nan
2016-02-05 14:02 ` [PATCH 35/54] perf record: Re-synthesize tracking events after output switching Wang Nan
2016-02-05 14:02 ` [PATCH 36/54] perf record: Generate tracking events for process forked by perf Wang Nan
2016-02-05 14:02 ` [PATCH 37/54] perf record: Ensure return non-zero rc when mmap fail Wang Nan
2016-02-05 14:02 ` [PATCH 38/54] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
2016-02-05 14:02 ` [PATCH 39/54] perf tools: Add evlist channel helpers Wang Nan
2016-02-05 14:02 ` [PATCH 40/54] perf tools: Automatically add new channel according to evlist Wang Nan
2016-02-05 14:02 ` [PATCH 41/54] perf tools: Operate multiple channels Wang Nan
2016-02-05 14:02 ` [PATCH 42/54] perf tools: Squash overwrite setting into channel Wang Nan
2016-02-05 14:02 ` [PATCH 43/54] perf record: Don't read from and poll overwrite channel Wang Nan
2016-02-05 14:02 ` [PATCH 44/54] perf record: Don't poll on " Wang Nan
2016-02-05 14:02 ` [PATCH 45/54] perf tools: Detect avalibility of write_backward Wang Nan
2016-02-05 14:02 ` [PATCH 46/54] perf tools: Enable overwrite settings Wang Nan
2016-02-05 14:02 ` [PATCH 47/54] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
2016-02-05 14:02 ` [PATCH 48/54] perf tools: Record fd into perf_mmap Wang Nan
2016-02-05 14:02 ` [PATCH 49/54] perf tools: Add API to pause a channel Wang Nan
2016-02-05 14:02 ` [PATCH 50/54] perf record: Toggle overwrite ring buffer for reading Wang Nan
2016-02-05 14:02 ` [PATCH 51/54] perf record: Rename variable to make code clear Wang Nan
2016-02-05 14:02 ` [PATCH 52/54] perf record: Read from backward ring buffer Wang Nan
2016-02-05 14:02 ` [PATCH 53/54] perf record: Allow generate tracking events at the end of output Wang Nan
2016-02-05 14:02 ` [PATCH 54/54] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
