linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support
@ 2016-02-19 11:43 Wang Nan
  2016-02-19 11:43 ` [PATCH 01/55] perf tools: Record text offset in dso to calculate objdump address Wang Nan
                   ` (54 more replies)
  0 siblings, 55 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Hi Arnaldo,

This is this week's perf BPF and overwrite ring buffer patch set.

Please note that this patch set is based on your perf/core, not
tmp.perf/bpf_map, because there is a bug similar to what Jiri pointed
out in commit 5c78fe3 in the bpf_map branch.

A bug related to newest libbabeltrace is found and fixed. Similar to
commit 5141d73, this fix is safe to old version of libbabeltrace.

The following changes since commit 1a3dcca461e81fbee1e588b2e6e1eae0b2b911d3:

  perf stat: Handled scaled == -1 case for counters (2016-02-18 13:50:49 -0300)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/pi3orama/linux.git tags/perf-core-for-acme

for you to fetch changes up to 6b8e82587380be165876e423b9d46f41cb5b4020:

  perf tools: Don't warn about out of order event if write_backward is used (2016-02-19 11:26:57 +0000)

----------------------------------------------------------------
perf improvements:

 - Bug fix: byte order of the first BPF output sample incorrect when
   converting to CTF.

 - Fix compiling warning Jiri found.

 - Improve error messages.

Signed-off-by: Wang Nan <wangnan0@huawei.com>

----------------------------------------------------------------
Wang Nan (55):
      perf tools: Record text offset in dso to calculate objdump address
      perf tools: Adjust symbol for shared objects
      perf tools: Rename bpf_prog_priv__clear() to clear_prog_priv()
      perf tools: Fix checking asprintf return value
      perf tools: Add API to config maps in bpf object
      perf tools: Enable BPF object configure syntax
      perf record: Apply config to BPF objects before recording
      perf tools: Enable passing event to BPF object
      perf tools: Create config_term_names array
      perf stat: Forbid user passing improper config terms
      perf tools: Rename and move pmu_event_name to get_config_name
      perf tools: Enable config raw and numeric events
      perf tools: Enable config and setting names for legacy cache events
      perf tools: Support setting different slots in a BPF map separately
      perf tools: Enable indices setting syntax for BPF maps
      perf tools: Pass tracepoint options to BPF script
      perf tools: Introduce bpf-output event
      perf data: Support converting data from bpf_perf_event_output()
      perf data: Explicitly set byte order for integer types
      perf core: Introduce new ioctl options to pause and resume ring buffer
      perf core: Set event's default overflow_handler
      perf core: Prepare writing into ring buffer from end
      perf core: Add backward attribute to perf event
      perf core: Reduce perf event output overhead by new overflow handler
      perf tools: Only validate is_pos for tracking evsels
      perf tools: Print write_backward value in perf_event_attr__fprintf
      perf tools: Make ordered_events reusable
      perf record: Extract synthesize code to record__synthesize()
      perf tools: Add perf_data_file__switch() helper
      perf record: Turns auxtrace_snapshot_enable into 3 states
      perf record: Introduce record__finish_output() to finish a perf.data
      perf record: Add '--timestamp-filename' option to append timestamp to output filename
      perf record: Split output into multiple files via '--switch-output'
      perf record: Force enable --timestamp-filename when --switch-output is provided
      perf record: Disable buildid cache options by default in switch output mode
      perf record: Re-synthesize tracking events after output switching
      perf record: Generate tracking events for process forked by perf
      perf record: Ensure return non-zero rc when mmap fail
      perf record: Prevent reading invalid data in record__mmap_read
      perf tools: Add evlist channel helpers
      perf tools: Automatically add new channel according to evlist
      perf tools: Operate multiple channels
      perf tools: Squash overwrite setting into channel
      perf record: Don't read from and poll overwrite channel
      perf record: Don't poll on overwrite channel
      perf tools: Detect avalibility of write_backward
      perf tools: Enable overwrite settings
      perf tools: Set write_backward attribut bit for overwrite events
      perf tools: Record fd into perf_mmap
      perf tools: Add API to pause a channel
      perf record: Toggle overwrite ring buffer for reading
      perf record: Rename variable to make code clear
      perf record: Read from backward ring buffer
      perf record: Allow generate tracking events at the end of output
      perf tools: Don't warn about out of order event if write_backward is used

 include/linux/perf_event.h        |  22 +-
 include/uapi/linux/perf_event.h   |   4 +-
 kernel/events/core.c              |  73 +++-
 kernel/events/internal.h          |  11 +
 kernel/events/ring_buffer.c       |  63 +++-
 tools/perf/builtin-record.c       | 598 ++++++++++++++++++++++++++-----
 tools/perf/builtin-stat.c         |   1 +
 tools/perf/perf.h                 |   2 +
 tools/perf/tests/bpf.c            |   2 +-
 tools/perf/tests/parse-events.c   |  52 +++
 tools/perf/util/bpf-loader.c      | 724 +++++++++++++++++++++++++++++++++++++-
 tools/perf/util/bpf-loader.h      |  59 ++++
 tools/perf/util/data-convert-bt.c | 118 ++++++-
 tools/perf/util/data.c            |  36 ++
 tools/perf/util/data.h            |  11 +-
 tools/perf/util/dso.h             |   1 +
 tools/perf/util/evlist.c          | 355 +++++++++++++++++--
 tools/perf/util/evlist.h          |  70 +++-
 tools/perf/util/evsel.c           |  23 ++
 tools/perf/util/evsel.h           |  11 +
 tools/perf/util/map.c             |  14 +
 tools/perf/util/ordered-events.c  |   5 +
 tools/perf/util/parse-events.c    | 309 ++++++++++++++--
 tools/perf/util/parse-events.h    |  27 +-
 tools/perf/util/parse-events.l    |  21 +-
 tools/perf/util/parse-events.y    | 134 ++++++-
 tools/perf/util/record.c          |  11 +
 tools/perf/util/session.c         |  22 +-
 tools/perf/util/symbol-elf.c      |  25 +-
 29 files changed, 2565 insertions(+), 239 deletions(-)

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 01/55] perf tools: Record text offset in dso to calculate objdump address
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-19 11:43 ` [PATCH 02/55] perf tools: Adjust symbol for shared objects Wang Nan
                   ` (53 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

In this patch, the offset of '.text' section is stored into dso
and used here to re-calculate address to objdump.

In most of the cases, executable code is in '.text' section, so the
adjustment made to a symbol in dso__load_sym (using
sym.st_value -= shdr.sh_addr - shdr.sh_offset) should equal to
'sym.st_value -= dso->text_offset'. Therefore, adding text_offset back
get objdump address from symbol address (rip). However, it is not true
for kernel and kernel module since there could be multiple executable
sections with different offset. Exclude kernel for this reason.

After this patch, even dso->adjust_symbols is set to true for shared
objects, map__rip_2objdump() and map__objdump_2mem() would return
correct result, so perf behavior of annotate won't be changed.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/dso.h        |  1 +
 tools/perf/util/map.c        | 14 ++++++++++++++
 tools/perf/util/symbol-elf.c | 12 ++++++------
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 45ec4d0..ef3dbc9 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -162,6 +162,7 @@ struct dso {
 	u8		 loaded;
 	u8		 rel;
 	u8		 build_id[BUILD_ID_SIZE];
+	u64		 text_offset;
 	const char	 *short_name;
 	const char	 *long_name;
 	u16		 long_name_len;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 171b6d1..02c3186 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -431,6 +431,13 @@ u64 map__rip_2objdump(struct map *map, u64 rip)
 	if (map->dso->rel)
 		return rip - map->pgoff;
 
+	/*
+	 * kernel modules also have DSO_TYPE_USER in dso->kernel,
+	 * but all kernel modules are ET_REL, so won't get here.
+	 */
+	if (map->dso->kernel == DSO_TYPE_USER)
+		return rip + map->dso->text_offset;
+
 	return map->unmap_ip(map, rip) - map->reloc;
 }
 
@@ -454,6 +461,13 @@ u64 map__objdump_2mem(struct map *map, u64 ip)
 	if (map->dso->rel)
 		return map->unmap_ip(map, ip + map->pgoff);
 
+	/*
+	 * kernel modules also have DSO_TYPE_USER in dso->kernel,
+	 * but all kernel modules are ET_REL, so won't get here.
+	 */
+	if (map->dso->kernel == DSO_TYPE_USER)
+		return map->unmap_ip(map, ip - map->dso->text_offset);
+
 	return ip + map->reloc;
 }
 
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index b1dd68f..bc229a7 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -793,6 +793,7 @@ int dso__load_sym(struct dso *dso, struct map *map,
 	uint32_t idx;
 	GElf_Ehdr ehdr;
 	GElf_Shdr shdr;
+	GElf_Shdr tshdr;
 	Elf_Data *syms, *opddata = NULL;
 	GElf_Sym sym;
 	Elf_Scn *sec, *sec_strndx;
@@ -832,6 +833,9 @@ int dso__load_sym(struct dso *dso, struct map *map,
 	sec = syms_ss->symtab;
 	shdr = syms_ss->symshdr;
 
+	if (elf_section_by_name(elf, &ehdr, &tshdr, ".text", NULL))
+		dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
+
 	if (runtime_ss->opdsec)
 		opddata = elf_rawdata(runtime_ss->opdsec, NULL);
 
@@ -880,12 +884,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
 	 * Handle any relocation of vdso necessary because older kernels
 	 * attempted to prelink vdso to its virtual address.
 	 */
-	if (dso__is_vdso(dso)) {
-		GElf_Shdr tshdr;
-
-		if (elf_section_by_name(elf, &ehdr, &tshdr, ".text", NULL))
-			map->reloc = map->start - tshdr.sh_addr + tshdr.sh_offset;
-	}
+	if (dso__is_vdso(dso))
+		map->reloc = map->start - dso->text_offset;
 
 	dso->adjust_symbols = runtime_ss->adjust_symbols || ref_reloc(kmap);
 	/*
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 02/55] perf tools: Adjust symbol for shared objects
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
  2016-02-19 11:43 ` [PATCH 01/55] perf tools: Record text offset in dso to calculate objdump address Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-19 11:43 ` [PATCH 03/55] perf tools: Rename bpf_prog_priv__clear() to clear_prog_priv() Wang Nan
                   ` (52 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

He Kuang reported a problem that perf fails to get correct symbol on
Android platform in [1]. The problem can be reproduced on normal x86_64
platform. I will describe the reproducing steps in detail at the end of
commit message.

The reason of this problem is the missing of symbol adjustment for normal
shared objects. In most of the cases it is okay skipping adjustment. However,
the result is wrong when '.text' section have different 'address' and 'offset'.
I checked all shared objects in my working platform, only wine dll objects and
debug objects (in .debug) have this problem. However, it is common on Android.
For example:

 $ readelf -S ./libsurfaceflinger.so | grep \.text
   [10] .text             PROGBITS         0000000000029030  00012030

This patch enables symbol adjustment for dynamic objects so the symbol
address got from elfutils would be adjusted correctly.

Now nearly all type of ELF file should adjust symbols. Makes
ss->adjust_symbols default to true.

Steps to reproduce the problem:

 $ cat ./Makefile
PWD := $(shell pwd)
LDFLAGS += "-Wl,-rpath=$(PWD)"
CFLAGS += -g
main: main.c libbuggy.so
libbuggy.so: buggy.c
	gcc -g -shared -fPIC -Wl,-Ttext-segment=0x200000 $< -o $@
clean:
	rm -rf main libbuggy.so *.o

 $ cat ./buggy.c
 int fib(int x)
 {
     return (x == 0) ? 1 : (x == 1) ? 1 : fib(x - 1) + fib(x - 2);
 }

 $ cat ./main.c
 #include <stdio.h>

 extern int fib(int x);
 int main()
 {
     int i;

     for (i = 0; i < 40; i++)
         printf("%d\n", fib(i));
     return 0;
 }

 $ make
 $ perf record ./main
 ...
 $ perf report --stdio
 # Overhead  Command  Shared Object      Symbol
 # ........  .......  .................  ...............................
 #
     14.97%  main     libbuggy.so        [.] 0x000000000000066c
      8.68%  main     libbuggy.so        [.] 0x00000000000006aa
      8.52%  main     libbuggy.so        [.] fib@plt
      7.95%  main     libbuggy.so        [.] 0x0000000000000664
      5.94%  main     libbuggy.so        [.] 0x00000000000006a9
      5.35%  main     libbuggy.so        [.] 0x0000000000000678
 ...

The correct result should be (after this patch):

 # Overhead  Command  Shared Object      Symbol
 # ........  .......  .................  ...............................
 #
     91.47%  main     libbuggy.so        [.] fib
      8.52%  main     libbuggy.so        [.] fib@plt
      0.00%  main     [kernel.kallsyms]  [k] kmem_cache_free

[1] http://lkml.kernel.org/g/1452567507-54013-1-git-send-email-hekuang@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/symbol-elf.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index bc229a7..3f9d679 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -709,17 +709,10 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
 	if (ss->opdshdr.sh_type != SHT_PROGBITS)
 		ss->opdsec = NULL;
 
-	if (dso->kernel == DSO_TYPE_USER) {
-		GElf_Shdr shdr;
-		ss->adjust_symbols = (ehdr.e_type == ET_EXEC ||
-				ehdr.e_type == ET_REL ||
-				dso__is_vdso(dso) ||
-				elf_section_by_name(elf, &ehdr, &shdr,
-						     ".gnu.prelink_undo",
-						     NULL) != NULL);
-	} else {
+	if (dso->kernel == DSO_TYPE_USER)
+		ss->adjust_symbols = true;
+	else
 		ss->adjust_symbols = elf__needs_adjust_symbols(ehdr);
-	}
 
 	ss->name   = strdup(name);
 	if (!ss->name) {
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 03/55] perf tools: Rename bpf_prog_priv__clear() to clear_prog_priv()
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
  2016-02-19 11:43 ` [PATCH 01/55] perf tools: Record text offset in dso to calculate objdump address Wang Nan
  2016-02-19 11:43 ` [PATCH 02/55] perf tools: Adjust symbol for shared objects Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-20 11:36   ` [tip:perf/core] perf bpf: " tip-bot for Wang Nan
  2016-02-19 11:43 ` [PATCH 04/55] perf tools: Fix checking asprintf return value Wang Nan
                   ` (51 subsequent siblings)
  54 siblings, 1 reply; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

The name of bpf_prog_priv__clear() doesn't follow perf's naming
convention. bpf_prog_priv__delete() seems to be a better name. However,
bpf_prog_priv__delete() should be a method of 'struct bpf_prog_priv',
but its first parameter is 'struct bpf_program'.

It is callback from libbpf to clear priv structures when destroying
a bpf program. It is actually a method of bpf_program (libbpf object),
but bpf_program__ functions should be provided by libbpf.

This patch removes the prefix of that function.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 540a7ef..0bdccf4 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -108,8 +108,8 @@ void bpf__clear(void)
 }
 
 static void
-bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
-		     void *_priv)
+clear_prog_priv(struct bpf_program *prog __maybe_unused,
+		void *_priv)
 {
 	struct bpf_prog_priv *priv = _priv;
 
@@ -337,7 +337,7 @@ config_bpf_program(struct bpf_program *prog)
 	}
 	pr_debug("bpf: config '%s' is ok\n", config_str);
 
-	err = bpf_program__set_private(prog, priv, bpf_prog_priv__clear);
+	err = bpf_program__set_private(prog, priv, clear_prog_priv);
 	if (err) {
 		pr_debug("Failed to set priv for program '%s'\n", config_str);
 		goto errout;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 04/55] perf tools: Fix checking asprintf return value
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (2 preceding siblings ...)
  2016-02-19 11:43 ` [PATCH 03/55] perf tools: Rename bpf_prog_priv__clear() to clear_prog_priv() Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-20 11:36   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-02-19 11:43 ` [PATCH 05/55] perf tools: Add API to config maps in bpf object Wang Nan
                   ` (50 subsequent siblings)
  54 siblings, 1 reply; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

According to man pages, asprintf returns -1 when failure. This patch
fixes two incorrect return value checker.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Fixes: ffeb883e5662 ("perf tools: Show proper error message for wrong terms of hw/sw events")
Cc: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index e5583fd..72524c7 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2110,11 +2110,11 @@ char *parse_events_formats_error_string(char *additional_terms)
 
 	/* valid terms */
 	if (additional_terms) {
-		if (!asprintf(&str, "valid terms: %s,%s",
-			      additional_terms, static_terms))
+		if (asprintf(&str, "valid terms: %s,%s",
+			     additional_terms, static_terms) < 0)
 			goto fail;
 	} else {
-		if (!asprintf(&str, "valid terms: %s", static_terms))
+		if (asprintf(&str, "valid terms: %s", static_terms) < 0)
 			goto fail;
 	}
 	return str;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 05/55] perf tools: Add API to config maps in bpf object
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (3 preceding siblings ...)
  2016-02-19 11:43 ` [PATCH 04/55] perf tools: Fix checking asprintf return value Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-19 11:43 ` [PATCH 06/55] perf tools: Enable BPF object configure syntax Wang Nan
                   ` (49 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

bpf__config_obj() is introduced as a core API to config BPF object
after loading. One configuration option of maps is introduced. After
this patch BPF object can accept configuration like:

 maps:my_map.value=1234

(maps.my_map.value looks pretty. However, there's a small but hard
to fixed problem related to flex's greedy matching. Please see [1].
Choose ':' to avoid it in a simpler way.)

This patch is more complex than the work it really does because the
consideration of extension. In designing of BPF map configuration,
following things should be considered:

 1. Array indices selection: perf should allow user setting different
    value to different slots in an array, with syntax like:
    maps:my_map.value[0,3...6]=1234;

 2. A map can be config by different config terms, each for a part
    of it. For example, set each slot to pid of a thread;

 3. Type of value: integer is not the only valid value type. Perf
    event can also be put into a map after commit 35578d7984003097af2b1e3
    (bpf: Implement function bpf_perf_event_read() that get the selected
    hardware PMU conuter);

 4. For hash table, it is possible to use string or other as key;

 5. It is possible that map configuration is unable to be setup
    during parsing. Perf event is an example.

Therefore, this patch does following:

 1. Instead of updating map element during parsing, this patch stores
    map config options in 'struct bpf_map_priv'. Following patches
    would apply those configs at proper time;

 2. Link map operations to a list so a map can have multiple config
    terms attached, so different parts can be configured separately;

 3. Make 'struct bpf_map_priv' extensible so following patches can
    add new types of keys and operations;

 4. Use bpf_obj_config__map_funcs array to support more maps config options.

Since the patch changing event parser to parse BPF object config is
relative large, I put in another commit. Code in this patch
could be tested after applying next patch.

[1] http://lkml.kernel.org/g/564ED621.4050500@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c | 276 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h |  38 ++++++
 2 files changed, 314 insertions(+)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 0bdccf4..d33da89 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -739,6 +739,261 @@ int bpf__foreach_tev(struct bpf_object *obj,
 	return 0;
 }
 
+enum bpf_map_op_type {
+	BPF_MAP_OP_SET_VALUE,
+};
+
+enum bpf_map_key_type {
+	BPF_MAP_KEY_ALL,
+};
+
+struct bpf_map_op {
+	struct list_head list;
+	enum bpf_map_op_type op_type;
+	enum bpf_map_key_type key_type;
+	union {
+		u64 value;
+	} v;
+};
+
+struct bpf_map_priv {
+	struct list_head ops_list;
+};
+
+static void
+bpf_map_op__delete(struct bpf_map_op *op)
+{
+	if (!list_empty(&op->list))
+		list_del(&op->list);
+	free(op);
+}
+
+static void
+bpf_map_priv__delete(struct bpf_map_priv *priv)
+{
+	struct bpf_map_op *pos, *n;
+
+	list_for_each_entry_safe(pos, n, &priv->ops_list, list) {
+		list_del_init(&pos->list);
+		bpf_map_op__delete(pos);
+	}
+	free(priv);
+}
+
+static void
+clear_map_priv(struct bpf_map *map __maybe_unused, void *_priv)
+{
+	struct bpf_map_priv *priv = _priv;
+
+	bpf_map_priv__delete(priv);
+}
+
+static struct bpf_map_op *
+bpf_map_op__new(void)
+{
+	struct bpf_map_op *op;
+
+	op = zalloc(sizeof(*op));
+	if (!op) {
+		pr_debug("Failed to alloc bpf_map_op\n");
+		return ERR_PTR(-ENOMEM);
+	}
+	INIT_LIST_HEAD(&op->list);
+
+	op->key_type = BPF_MAP_KEY_ALL;
+	return op;
+}
+
+static int
+bpf_map__add_op(struct bpf_map *map, struct bpf_map_op *op)
+{
+	struct bpf_map_priv *priv;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+	err = bpf_map__get_private(map, (void **)&priv);
+	if (err) {
+		pr_debug("Failed to get private from map %s\n", map_name);
+		return err;
+	}
+
+	if (!priv) {
+		priv = zalloc(sizeof(*priv));
+		if (!priv) {
+			pr_debug("No enough memory to alloc map private\n");
+			return -ENOMEM;
+		}
+		INIT_LIST_HEAD(&priv->ops_list);
+
+		if (bpf_map__set_private(map, priv, clear_map_priv)) {
+			free(priv);
+			return -BPF_LOADER_ERRNO__INTERNAL;
+		}
+	}
+
+	list_add_tail(&op->list, &priv->ops_list);
+	return 0;
+}
+
+static int
+__bpf_map__config_value(struct bpf_map *map,
+			struct parse_events_term *term)
+{
+	struct bpf_map_def def;
+	struct bpf_map_op *op;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("Unable to get map definition from '%s'\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	if (def.type != BPF_MAP_TYPE_ARRAY) {
+		pr_debug("Map %s type is not BPF_MAP_TYPE_ARRAY\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+	}
+	if (def.key_size < sizeof(unsigned int)) {
+		pr_debug("Map %s has incorrect key size\n", map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE;
+	}
+	switch (def.value_size) {
+	case 1:
+	case 2:
+	case 4:
+	case 8:
+		break;
+	default:
+		pr_debug("Map %s has incorrect value size\n", map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
+	}
+
+	op = bpf_map_op__new();
+	if (IS_ERR(op))
+		return PTR_ERR(op);
+	op->op_type = BPF_MAP_OP_SET_VALUE;
+	op->v.value = term->val.num;
+
+	err = bpf_map__add_op(map, op);
+	if (err)
+		bpf_map_op__delete(op);
+	return err;
+}
+
+static int
+bpf_map__config_value(struct bpf_map *map,
+		      struct parse_events_term *term,
+		      struct perf_evlist *evlist __maybe_unused)
+{
+	if (!term->err_val) {
+		pr_debug("Config value not set\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
+	}
+
+	if (term->type_val != PARSE_EVENTS__TERM_TYPE_NUM) {
+		pr_debug("ERROR: wrong value type\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
+	}
+
+	return __bpf_map__config_value(map, term);
+}
+
+struct bpf_obj_config__map_func {
+	const char *config_opt;
+	int (*config_func)(struct bpf_map *, struct parse_events_term *,
+			   struct perf_evlist *);
+};
+
+struct bpf_obj_config__map_func bpf_obj_config__map_funcs[] = {
+	{"value", bpf_map__config_value},
+};
+
+static int
+bpf__obj_config_map(struct bpf_object *obj,
+		    struct parse_events_term *term,
+		    struct perf_evlist *evlist,
+		    int *key_scan_pos)
+{
+	/* key is "maps:<mapname>.<config opt>" */
+	char *map_name = strdup(term->config + sizeof("maps:") - 1);
+	struct bpf_map *map;
+	int err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
+	char *map_opt;
+	size_t i;
+
+	if (!map_name)
+		return -ENOMEM;
+
+	map_opt = strchr(map_name, '.');
+	if (!map_opt) {
+		pr_debug("ERROR: Invalid map config: %s\n", map_name);
+		goto out;
+	}
+
+	*map_opt++ = '\0';
+	if (*map_opt == '\0') {
+		pr_debug("ERROR: Invalid map option: %s\n", term->config);
+		goto out;
+	}
+
+	map = bpf_object__get_map_by_name(obj, map_name);
+	if (!map) {
+		pr_debug("ERROR: Map %s is not exist\n", map_name);
+		err = -BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST;
+		goto out;
+	}
+
+	*key_scan_pos += map_opt - map_name;
+	for (i = 0; i < ARRAY_SIZE(bpf_obj_config__map_funcs); i++) {
+		struct bpf_obj_config__map_func *func =
+				&bpf_obj_config__map_funcs[i];
+
+		if (strcmp(map_opt, func->config_opt) == 0) {
+			err = func->config_func(map, term, evlist);
+			goto out;
+		}
+	}
+
+	pr_debug("ERROR: invalid config option '%s' for maps\n",
+		 map_opt);
+	err = -BPF_LOADER_ERRNO__OBJCONF_MAP_OPT;
+out:
+	free(map_name);
+	if (!err)
+		key_scan_pos += strlen(map_opt);
+	return err;
+}
+
+int bpf__config_obj(struct bpf_object *obj,
+		    struct parse_events_term *term,
+		    struct perf_evlist *evlist,
+		    int *error_pos)
+{
+	int key_scan_pos = 0;
+	int err;
+
+	if (!obj || !term || !term->config)
+		return -EINVAL;
+
+	if (!prefixcmp(term->config, "maps:")) {
+		key_scan_pos = sizeof("maps:") - 1;
+		err = bpf__obj_config_map(obj, term, evlist, &key_scan_pos);
+		goto out;
+	}
+	err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
+out:
+	if (error_pos)
+		*error_pos = key_scan_pos;
+	return err;
+
+}
+
 #define ERRNO_OFFSET(e)		((e) - __BPF_LOADER_ERRNO__START)
 #define ERRCODE_OFFSET(c)	ERRNO_OFFSET(BPF_LOADER_ERRNO__##c)
 #define NR_ERRNO	(__BPF_LOADER_ERRNO__END - __BPF_LOADER_ERRNO__START)
@@ -753,6 +1008,14 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(PROLOGUE)]	= "Failed to generate prologue",
 	[ERRCODE_OFFSET(PROLOGUE2BIG)]	= "Prologue too big for program",
 	[ERRCODE_OFFSET(PROLOGUEOOB)]	= "Offset out of bound for prologue",
+	[ERRCODE_OFFSET(OBJCONF_OPT)]	= "Invalid object config option",
+	[ERRCODE_OFFSET(OBJCONF_CONF)]	= "Config value not set (lost '=')",
+	[ERRCODE_OFFSET(OBJCONF_MAP_OPT)]	= "Invalid object maps config option",
+	[ERRCODE_OFFSET(OBJCONF_MAP_NOTEXIST)]	= "Target map not exist",
+	[ERRCODE_OFFSET(OBJCONF_MAP_VALUE)]	= "Incorrect value type for map",
+	[ERRCODE_OFFSET(OBJCONF_MAP_TYPE)]	= "Incorrect map type",
+	[ERRCODE_OFFSET(OBJCONF_MAP_KEYSIZE)]	= "Incorrect map key size",
+	[ERRCODE_OFFSET(OBJCONF_MAP_VALUESIZE)]	= "Incorrect map value size",
 };
 
 static int
@@ -872,3 +1135,16 @@ int bpf__strerror_load(struct bpf_object *obj,
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
+			     struct parse_events_term *term __maybe_unused,
+			     struct perf_evlist *evlist __maybe_unused,
+			     int *error_pos __maybe_unused, int err,
+			     char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,
+			    "Can't use this config term to this type of map");
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 6fdc045..2464db9 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -10,6 +10,7 @@
 #include <string.h>
 #include <bpf/libbpf.h>
 #include "probe-event.h"
+#include "evlist.h"
 #include "debug.h"
 
 enum bpf_loader_errno {
@@ -24,10 +25,19 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__PROLOGUE,	/* Failed to generate prologue */
 	BPF_LOADER_ERRNO__PROLOGUE2BIG,	/* Prologue too big for program */
 	BPF_LOADER_ERRNO__PROLOGUEOOB,	/* Offset out of bound for prologue */
+	BPF_LOADER_ERRNO__OBJCONF_OPT,	/* Invalid object config option */
+	BPF_LOADER_ERRNO__OBJCONF_CONF,	/* Config value not set (lost '=')) */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_OPT,	/* Invalid object maps config option */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST,	/* Target map not exist */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE,	/* Incorrect value type for map */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,	/* Incorrect map type */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE,	/* Incorrect map key size */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE,/* Incorrect map value size */
 	__BPF_LOADER_ERRNO__END,
 };
 
 struct bpf_object;
+struct parse_events_term;
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
 typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
@@ -53,6 +63,14 @@ int bpf__strerror_load(struct bpf_object *obj, int err,
 		       char *buf, size_t size);
 int bpf__foreach_tev(struct bpf_object *obj,
 		     bpf_prog_iter_callback_t func, void *arg);
+
+int bpf__config_obj(struct bpf_object *obj, struct parse_events_term *term,
+		    struct perf_evlist *evlist, int *error_pos);
+int bpf__strerror_config_obj(struct bpf_object *obj,
+			     struct parse_events_term *term,
+			     struct perf_evlist *evlist,
+			     int *error_pos, int err, char *buf,
+			     size_t size);
 #else
 static inline struct bpf_object *
 bpf__prepare_load(const char *filename __maybe_unused,
@@ -84,6 +102,15 @@ bpf__foreach_tev(struct bpf_object *obj __maybe_unused,
 }
 
 static inline int
+bpf__config_obj(struct bpf_object *obj __maybe_unused,
+		struct parse_events_term *term __maybe_unused,
+		struct perf_evlist *evlist __maybe_unused,
+		int *error_pos __maybe_unused)
+{
+	return 0;
+}
+
+static inline int
 __bpf_strerror(char *buf, size_t size)
 {
 	if (!size)
@@ -118,5 +145,16 @@ static inline int bpf__strerror_load(struct bpf_object *obj __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int
+bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
+			 struct parse_events_term *term __maybe_unused,
+			 struct perf_evlist *evlist __maybe_unused,
+			 int *error_pos __maybe_unused,
+			 int err __maybe_unused,
+			 char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 06/55] perf tools: Enable BPF object configure syntax
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (4 preceding siblings ...)
  2016-02-19 11:43 ` [PATCH 05/55] perf tools: Add API to config maps in bpf object Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-19 11:43 ` [PATCH 07/55] perf record: Apply config to BPF objects before recording Wang Nan
                   ` (48 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch adds the final step for BPF map configuration. A new syntax
is appended into parser so user can config BPF objects through '/' '/'
enclosed config terms.

After this patch, following syntax is available:

 # perf record -e ./test_bpf_map_1.c/maps:channel.value=10/ ...

It would takes effect after appling following commits.

Test result:

 # cat ./test_bpf_map_1.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
     (void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
     .type = BPF_MAP_TYPE_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = 1,
 };
 SEC("func=sys_nanosleep")
 int func(void *ctx)
 {
     int key = 0;
     char fmt[] = "%d\n";
     int *pval = map_lookup_elem(&channel, &key);
     if (!pval)
         return 0;
     trace_printk(fmt, sizeof(fmt), *pval);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 - Normal case:
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=10/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]

 - Error case:

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value/' usleep 10
 event syntax error: '..ps:channel:value/'
                                   \___ Config value not set (lost '=')
 Hint:	Valid config term:
      	maps:[<arraymap>]:value=[value]
	     	(add -v to see detail)
	Run 'perf list' for a list of valid events

 Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event <event>   event selector. use 'perf list' to list available events

 # ./perf record -e './test_bpf_map_1.c/xmaps:channel.value=10/' usleep 10
 event syntax error: '..pf_map_1.c/xmaps:channel.value=10/'
                                   \___ Invalid object config option
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:xchannel.value=10/' usleep 10
 event syntax error: '..p_1.c/maps:xchannel.value=10/'
                                   \___ Target map not exist
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:channel.xvalue=10/' usleep 10
 event syntax error: '..ps:channel.xvalue=10/'
                                   \___ Invalid object maps config option
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=x10/' usleep 10
 event syntax error: '..nnel.value=x10/'
                                   \___ Incorrect value type for map
 [SNIP]

 Change BPF_MAP_TYPE_ARRAY to '1' in test_bpf_map_1.c:

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=10/' usleep 10
 event syntax error: '..ps:channel.value=10/'
                                   \___ Can't use this config term to this type of map

 Hint:	Valid config term:
     	maps:[<arraymap>].value=[value]
     	(add -v to see detail)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
[for parser part]
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 55 +++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/parse-events.h |  3 ++-
 tools/perf/util/parse-events.l |  2 +-
 tools/perf/util/parse-events.y | 21 +++++++++++++---
 4 files changed, 72 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 72524c7..261690a 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -631,17 +631,63 @@ errout:
 	return err;
 }
 
+static int
+parse_events_config_bpf(struct parse_events_evlist *data,
+			struct bpf_object *obj,
+			struct list_head *head_config)
+{
+	struct parse_events_term *term;
+	int error_pos;
+
+	if (!head_config || list_empty(head_config))
+		return 0;
+
+	list_for_each_entry(term, head_config, list) {
+		char errbuf[BUFSIZ];
+		int err;
+
+		if (term->type_term != PARSE_EVENTS__TERM_TYPE_USER) {
+			snprintf(errbuf, sizeof(errbuf),
+				 "Invalid config term for BPF object");
+			errbuf[BUFSIZ - 1] = '\0';
+
+			data->error->idx = term->err_term;
+			data->error->str = strdup(errbuf);
+			return -EINVAL;
+		}
+
+		err = bpf__config_obj(obj, term, NULL, &error_pos);
+		if (err) {
+			bpf__strerror_config_obj(obj, term, NULL,
+						 &error_pos, err, errbuf,
+						 sizeof(errbuf));
+			data->error->help = strdup(
+"Hint:\tValid config term:\n"
+"     \tmaps:[<arraymap>].value=[value]\n"
+"     \t(add -v to see detail)");
+			data->error->str = strdup(errbuf);
+			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
+				data->error->idx = term->err_val;
+			else
+				data->error->idx = term->err_term + error_pos;
+			return err;
+		}
+	}
+	return 0;
+}
+
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
-			  bool source)
+			  bool source,
+			  struct list_head *head_config)
 {
 	struct bpf_object *obj;
+	int err;
 
 	obj = bpf__prepare_load(bpf_file_name, source);
 	if (IS_ERR(obj)) {
 		char errbuf[BUFSIZ];
-		int err;
 
 		err = PTR_ERR(obj);
 
@@ -659,7 +705,10 @@ int parse_events_load_bpf(struct parse_events_evlist *data,
 		return err;
 	}
 
-	return parse_events_load_bpf_obj(data, list, obj);
+	err = parse_events_load_bpf_obj(data, list, obj);
+	if (err)
+		return err;
+	return parse_events_config_bpf(data, obj, head_config);
 }
 
 static int
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 53628bf..0117328 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -127,7 +127,8 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
-			  bool source);
+			  bool source,
+			  struct list_head *head_config);
 /* Provide this function for perf test */
 struct bpf_object;
 int parse_events_load_bpf_obj(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 58c5831..4387728 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -122,7 +122,7 @@ num_dec		[0-9]+
 num_hex		0x[a-fA-F0-9]+
 num_raw_hex	[a-fA-F0-9]+
 name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
-name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.]*
+name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
 /* If you add a modifier you need to update check_modifier() */
 modifier_event	[ukhpPGHSDI]+
 modifier_bp	[rwx]{1,3}
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index c0eac88..448c1ec 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -64,6 +64,7 @@ static inc_group_count(struct list_head *list,
 %type <str> PE_PMU_EVENT_PRE PE_PMU_EVENT_SUF PE_KERNEL_PMU_EVENT
 %type <num> value_sym
 %type <head> event_config
+%type <head> opt_event_config
 %type <term> event_term
 %type <head> event_pmu
 %type <head> event_legacy_symbol
@@ -455,27 +456,39 @@ PE_RAW
 }
 
 event_bpf_file:
-PE_BPF_OBJECT
+PE_BPF_OBJECT opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_load_bpf(data, list, $1, false));
+	ABORT_ON(parse_events_load_bpf(data, list, $1, false, $2));
+	parse_events_terms__delete($2);
 	$$ = list;
 }
 |
-PE_BPF_SOURCE
+PE_BPF_SOURCE opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_load_bpf(data, list, $1, true));
+	ABORT_ON(parse_events_load_bpf(data, list, $1, true, $2));
+	parse_events_terms__delete($2);
 	$$ = list;
 }
 
+opt_event_config:
+'/' event_config '/'
+{
+	$$ = $2;
+}
+|
+{
+	$$ = NULL;
+}
+
 start_terms: event_config
 {
 	struct parse_events_terms *data = _data;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 07/55] perf record: Apply config to BPF objects before recording
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (5 preceding siblings ...)
  2016-02-19 11:43 ` [PATCH 06/55] perf tools: Enable BPF object configure syntax Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-19 11:43 ` [PATCH 08/55] perf tools: Enable passing event to BPF object Wang Nan
                   ` (47 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

bpf__apply_obj_config() is introduced as the core API to apply object
config options to all BPF objects. This patch also does the real work
for setting values for BPF_MAP_TYPE_PERF_ARRAY maps by inserting value
stored in map's private field into the BPF map.

This patch is required because we are not always able to set all
BPF config during parsing. Further patch will set events created
by perf to BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, which is not exist
until perf_evsel__open().

bpf_map_foreach_key() is introduced to iterate over each key
needs to be configured. This function would be extended to support
more map types and different key settings.

In perf record, before start recording, call bpf__apply_config() to
turn on all BPF config options.

Test result:

 # cat ./test_bpf_map_1.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
     (void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
     .type = BPF_MAP_TYPE_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = 1,
 };
 SEC("func=sys_nanosleep")
 int func(void *ctx)
 {
     int key = 0;
     char fmt[] = "%d\n";
     int *pval = map_lookup_elem(&channel, &key);
     if (!pval)
         return 0;
     trace_printk(fmt, sizeof(fmt), *pval);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=11/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 1/1   #P:8
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            usleep-18593 [007] d... 2394714.395539: : 11
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=101/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 1/1   #P:8
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            usleep-18593 [007] d... 2394714.395539: : 11
            usleep-19000 [006] d... 2394831.057840: : 101

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c  |  11 +++
 tools/perf/util/bpf-loader.c | 184 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h |  15 ++++
 3 files changed, 210 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index cf3a28d..7d11162 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -32,6 +32,7 @@
 #include "util/parse-branch-options.h"
 #include "util/parse-regs-options.h"
 #include "util/llvm-utils.h"
+#include "util/bpf-loader.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -536,6 +537,16 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		goto out_child;
 	}
 
+	err = bpf__apply_obj_config();
+	if (err) {
+		char errbuf[BUFSIZ];
+
+		bpf__strerror_apply_obj_config(err, errbuf, sizeof(errbuf));
+		pr_err("ERROR: Apply config to BPF failed: %s\n",
+			 errbuf);
+		goto out_child;
+	}
+
 	/*
 	 * Normally perf_session__new would do this, but it doesn't have the
 	 * evlist.
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index d33da89..c762616 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -7,6 +7,7 @@
 
 #include <linux/bpf.h>
 #include <bpf/libbpf.h>
+#include <bpf/bpf.h>
 #include <linux/err.h>
 #include <linux/string.h>
 #include "perf.h"
@@ -994,6 +995,182 @@ out:
 
 }
 
+typedef int (*map_config_func_t)(const char *name, int map_fd,
+				 struct bpf_map_def *pdef,
+				 struct bpf_map_op *op,
+				 void *pkey, void *arg);
+
+static int
+foreach_key_array_all(map_config_func_t func,
+		      void *arg, const char *name,
+		      int map_fd, struct bpf_map_def *pdef,
+		      struct bpf_map_op *op)
+{
+	unsigned int i;
+	int err;
+
+	for (i = 0; i < pdef->max_entries; i++) {
+		err = func(name, map_fd, pdef, op, &i, arg);
+		if (err) {
+			pr_debug("ERROR: failed to insert value to %s[%u]\n",
+				 name, i);
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int
+bpf_map_config_foreach_key(struct bpf_map *map,
+			   map_config_func_t func,
+			   void *arg)
+{
+	int err, map_fd;
+	const char *name;
+	struct bpf_map_op *op;
+	struct bpf_map_def def;
+	struct bpf_map_priv *priv;
+
+	name = bpf_map__get_name(map);
+
+	err = bpf_map__get_private(map, (void **)&priv);
+	if (err) {
+		pr_debug("ERROR: failed to get private from map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	if (!priv || list_empty(&priv->ops_list)) {
+		pr_debug("INFO: nothing to config for map %s\n", name);
+		return 0;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("ERROR: failed to get definition from map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	map_fd = bpf_map__get_fd(map);
+	if (map_fd < 0) {
+		pr_debug("ERROR: failed to get fd from map %s\n", name);
+		return map_fd;
+	}
+
+	list_for_each_entry(op, &priv->ops_list, list) {
+		switch (def.type) {
+		case BPF_MAP_TYPE_ARRAY:
+			switch (op->key_type) {
+			case BPF_MAP_KEY_ALL:
+				err = foreach_key_array_all(func, arg, name,
+							    map_fd, &def, op);
+				if (err)
+					return err;
+				break;
+			default:
+				pr_debug("ERROR: keytype for map '%s' invalid\n",
+					 name);
+				return -BPF_LOADER_ERRNO__INTERNAL;
+			}
+			break;
+		default:
+			pr_debug("ERROR: type of '%s' incorrect\n", name);
+			return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+		}
+	}
+
+	return 0;
+}
+
+static int
+apply_config_value_for_key(int map_fd, void *pkey,
+			   size_t val_size, u64 val)
+{
+	int err = 0;
+
+	switch (val_size) {
+	case 1: {
+		u8 _val = (u8)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 2: {
+		u16 _val = (u16)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 4: {
+		u32 _val = (u32)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 8: {
+		err = bpf_map_update_elem(map_fd, pkey, &val, BPF_ANY);
+		break;
+	}
+	default:
+		pr_debug("ERROR: invalid value size\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
+	}
+	if (err && errno)
+		err = -errno;
+	return err;
+}
+
+static int
+apply_obj_config_map_for_key(const char *name, int map_fd,
+			     struct bpf_map_def *pdef __maybe_unused,
+			     struct bpf_map_op *op,
+			     void *pkey, void *arg __maybe_unused)
+{
+	int err;
+
+	switch (op->op_type) {
+	case BPF_MAP_OP_SET_VALUE:
+		err = apply_config_value_for_key(map_fd, pkey,
+						 pdef->value_size,
+						 op->v.value);
+		break;
+	default:
+		pr_debug("ERROR: unknown value type for '%s'\n", name);
+		err = -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	return err;
+}
+
+static int
+apply_obj_config_map(struct bpf_map *map)
+{
+	return bpf_map_config_foreach_key(map,
+					  apply_obj_config_map_for_key,
+					  NULL);
+}
+
+static int
+apply_obj_config_object(struct bpf_object *obj)
+{
+	struct bpf_map *map;
+	int err;
+
+	bpf_map__for_each(map, obj) {
+		err = apply_obj_config_map(map);
+		if (err)
+			return err;
+	}
+	return 0;
+}
+
+int bpf__apply_obj_config(void)
+{
+	struct bpf_object *obj, *tmp;
+	int err;
+
+	bpf_object__for_each_safe(obj, tmp) {
+		err = apply_obj_config_object(obj);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 #define ERRNO_OFFSET(e)		((e) - __BPF_LOADER_ERRNO__START)
 #define ERRCODE_OFFSET(c)	ERRNO_OFFSET(BPF_LOADER_ERRNO__##c)
 #define NR_ERRNO	(__BPF_LOADER_ERRNO__END - __BPF_LOADER_ERRNO__START)
@@ -1148,3 +1325,10 @@ int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 2464db9..db3c34c 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -71,6 +71,8 @@ int bpf__strerror_config_obj(struct bpf_object *obj,
 			     struct perf_evlist *evlist,
 			     int *error_pos, int err, char *buf,
 			     size_t size);
+int bpf__apply_obj_config(void);
+int bpf__strerror_apply_obj_config(int err, char *buf, size_t size);
 #else
 static inline struct bpf_object *
 bpf__prepare_load(const char *filename __maybe_unused,
@@ -111,6 +113,12 @@ bpf__config_obj(struct bpf_object *obj __maybe_unused,
 }
 
 static inline int
+bpf__apply_obj_config(void)
+{
+	return 0;
+}
+
+static inline int
 __bpf_strerror(char *buf, size_t size)
 {
 	if (!size)
@@ -156,5 +164,12 @@ bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int
+bpf__strerror_apply_obj_config(int err __maybe_unused,
+			       char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 08/55] perf tools: Enable passing event to BPF object
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (6 preceding siblings ...)
  2016-02-19 11:43 ` [PATCH 07/55] perf record: Apply config to BPF objects before recording Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-19 11:43 ` [PATCH 09/55] perf tools: Create config_term_names array Wang Nan
                   ` (46 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

A new syntax is appended into parser so user can pass predefined perf
events into BPF objects.

After this patch, BPF programs for perf are finally able to utilize
bpf_perf_event_read() introduced in commit 35578d7984003097af2b1e3
(bpf: Implement function bpf_perf_event_read() that get the selected
hardware PMU conuter).

Test result:

 # cat ./test_bpf_map_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
     (void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_read)(struct bpf_map_def *, int) =
     (void *)BPF_FUNC_perf_event_read;

 struct bpf_map_def SEC("maps") pmu_map = {
     .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = __NR_CPUS__,
 };
 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
     unsigned long long val;
     char fmt[] = "sys_write:        pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }

 SEC("func_write_return=sys_write%return")
 int func_write_return(void *ctx)
 {
     unsigned long long val = 0;
     char fmt[] = "sys_write_return: pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

Normal case:
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 [SNIP]
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.013 MB perf.data (7 samples) ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-17066 [000] d... 938449.863301: : sys_write:        pmu=1157327
               ls-17066 [000] dN.. 938449.863342: : sys_write_return: pmu=1225218
               ls-17066 [000] d... 938449.863349: : sys_write:        pmu=1241922
               ls-17066 [000] dN.. 938449.863369: : sys_write_return: pmu=1267445

Normal case (system wide):
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' -a
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.811 MB perf.data (120 samples) ]

 # cat /sys/kernel/debug/tracing/trace | grep -v '18446744073709551594' | grep -v perf | head -n 20
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            gmain-30828 [002] d... 2740551.068992: : sys_write:        pmu=84373
            gmain-30828 [002] d... 2740551.068992: : sys_write_return: pmu=87696
            gmain-30828 [002] d... 2740551.068996: : sys_write:        pmu=100658
            gmain-30828 [002] d... 2740551.068997: : sys_write_return: pmu=102572

Error case 1:

 # ./perf record -e './test_bpf_map_2.c' ls /
 [SNIP]
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.014 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-17115 [007] d... 2724279.665625: : sys_write:        pmu=18446744073709551614
               ls-17115 [007] dN.. 2724279.665651: : sys_write_return: pmu=18446744073709551614
               ls-17115 [007] d... 2724279.665658: : sys_write:        pmu=18446744073709551614
               ls-17115 [007] dN.. 2724279.665677: : sys_write_return: pmu=18446744073709551614

 (18446744073709551614 is 0xfffffffffffffffe (-2))

Error case 2:
 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=evt/' -a
 event syntax error: '..ps:pmu_map.event=evt/'
                                   \___ Event not found for map setting

 Hint:	Valid config terms:
      	maps:[<arraymap>].value=[value]
      	maps:[<eventmap>].event=[event]
 [SNIP]

Error case 3:
 # ls /proc/2348/task/
 2348  2505  2506  2507  2508
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' -p 2348
 ERROR: Apply config to BPF failed: Cannot set event to BPF maps in multi-thread tracing

Error case 4:
 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i to turn off inherit)

Error case 5:
 # ./perf record -i -e raw_syscalls:sys_enter -e './test_bpf_map_2.c/maps:pmu_map.event=raw_syscalls:sys_enter/' ls
 ERROR: Apply config to BPF failed: Can only put raw, hardware and BPF output event into a BPF map

Error case 6:
 # ./perf record -i -e './test_bpf_map_2.c/maps:pmu_map.event=123/' ls /
 event syntax error: '.._map.event=123/'
                                   \___ Incorrect value type for map
 [SNIP]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 163 +++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/bpf-loader.h   |   5 ++
 tools/perf/util/evlist.c       |  16 ++++
 tools/perf/util/evlist.h       |   3 +
 tools/perf/util/parse-events.c |  15 ++--
 tools/perf/util/parse-events.h |   1 +
 6 files changed, 190 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index c762616..e7520c2 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -742,6 +742,7 @@ int bpf__foreach_tev(struct bpf_object *obj,
 
 enum bpf_map_op_type {
 	BPF_MAP_OP_SET_VALUE,
+	BPF_MAP_OP_SET_EVSEL,
 };
 
 enum bpf_map_key_type {
@@ -754,6 +755,7 @@ struct bpf_map_op {
 	enum bpf_map_key_type key_type;
 	union {
 		u64 value;
+		struct perf_evsel *evsel;
 	} v;
 };
 
@@ -837,6 +839,24 @@ bpf_map__add_op(struct bpf_map *map, struct bpf_map_op *op)
 	return 0;
 }
 
+static struct bpf_map_op *
+bpf_map__add_newop(struct bpf_map *map)
+{
+	struct bpf_map_op *op;
+	int err;
+
+	op = bpf_map_op__new();
+	if (IS_ERR(op))
+		return op;
+
+	err = bpf_map__add_op(map, op);
+	if (err) {
+		bpf_map_op__delete(op);
+		return ERR_PTR(err);
+	}
+	return op;
+}
+
 static int
 __bpf_map__config_value(struct bpf_map *map,
 			struct parse_events_term *term)
@@ -875,16 +895,12 @@ __bpf_map__config_value(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
 	}
 
-	op = bpf_map_op__new();
+	op = bpf_map__add_newop(map);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 	op->op_type = BPF_MAP_OP_SET_VALUE;
 	op->v.value = term->val.num;
-
-	err = bpf_map__add_op(map, op);
-	if (err)
-		bpf_map_op__delete(op);
-	return err;
+	return 0;
 }
 
 static int
@@ -898,13 +914,75 @@ bpf_map__config_value(struct bpf_map *map,
 	}
 
 	if (term->type_val != PARSE_EVENTS__TERM_TYPE_NUM) {
-		pr_debug("ERROR: wrong value type\n");
+		pr_debug("ERROR: wrong value type for 'value'\n");
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
 	}
 
 	return __bpf_map__config_value(map, term);
 }
 
+static int
+__bpf_map__config_event(struct bpf_map *map,
+			struct parse_events_term *term,
+			struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	struct bpf_map_def def;
+	struct bpf_map_op *op;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+	evsel = perf_evlist__find_evsel_by_str(evlist, term->val.str);
+	if (!evsel) {
+		pr_debug("Event (for '%s') '%s' doesn't exist\n",
+			 map_name, term->val.str);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_NOEVT;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("Unable to get map definition from '%s'\n",
+			 map_name);
+		return err;
+	}
+
+	/*
+	 * No need to check key_size and value_size:
+	 * kernel has already checked them.
+	 */
+	if (def.type != BPF_MAP_TYPE_PERF_EVENT_ARRAY) {
+		pr_debug("Map %s type is not BPF_MAP_TYPE_PERF_EVENT_ARRAY\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+	}
+
+	op = bpf_map__add_newop(map);
+	if (IS_ERR(op))
+		return PTR_ERR(op);
+	op->op_type = BPF_MAP_OP_SET_EVSEL;
+	op->v.evsel = evsel;
+	return 0;
+}
+
+static int
+bpf_map__config_event(struct bpf_map *map,
+		      struct parse_events_term *term,
+		      struct perf_evlist *evlist)
+{
+	if (!term->err_val) {
+		pr_debug("Config value not set\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
+	}
+
+	if (term->type_val != PARSE_EVENTS__TERM_TYPE_STR) {
+		pr_debug("ERROR: wrong value type for 'event'\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
+	}
+
+	return __bpf_map__config_event(map, term, evlist);
+}
+
 struct bpf_obj_config__map_func {
 	const char *config_opt;
 	int (*config_func)(struct bpf_map *, struct parse_events_term *,
@@ -913,6 +991,7 @@ struct bpf_obj_config__map_func {
 
 struct bpf_obj_config__map_func bpf_obj_config__map_funcs[] = {
 	{"value", bpf_map__config_value},
+	{"event", bpf_map__config_event},
 };
 
 static int
@@ -1057,6 +1136,7 @@ bpf_map_config_foreach_key(struct bpf_map *map,
 	list_for_each_entry(op, &priv->ops_list, list) {
 		switch (def.type) {
 		case BPF_MAP_TYPE_ARRAY:
+		case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
 			switch (op->key_type) {
 			case BPF_MAP_KEY_ALL:
 				err = foreach_key_array_all(func, arg, name,
@@ -1115,6 +1195,60 @@ apply_config_value_for_key(int map_fd, void *pkey,
 }
 
 static int
+apply_config_evsel_for_key(const char *name, int map_fd, void *pkey,
+			   struct perf_evsel *evsel)
+{
+	struct xyarray *xy = evsel->fd;
+	struct perf_event_attr *attr;
+	unsigned int key, events;
+	bool check_pass = false;
+	int *evt_fd;
+	int err;
+
+	if (!xy) {
+		pr_debug("ERROR: evsel not ready for map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	if (xy->row_size / xy->entry_size != 1) {
+		pr_debug("ERROR: Dimension of target event is incorrect for map %s\n",
+			 name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM;
+	}
+
+	attr = &evsel->attr;
+	if (attr->inherit) {
+		pr_debug("ERROR: Can't put inherit event into map %s\n", name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH;
+	}
+
+	if (attr->type == PERF_TYPE_RAW)
+		check_pass = true;
+	if (attr->type == PERF_TYPE_HARDWARE)
+		check_pass = true;
+	if (attr->type == PERF_TYPE_SOFTWARE &&
+			attr->config == PERF_COUNT_SW_BPF_OUTPUT)
+		check_pass = true;
+	if (!check_pass) {
+		pr_debug("ERROR: Event type is wrong for map %s\n", name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE;
+	}
+
+	events = xy->entries / (xy->row_size / xy->entry_size);
+	key = *((unsigned int *)pkey);
+	if (key >= events) {
+		pr_debug("ERROR: there is no event %d for map %s\n",
+			 key, name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_MAPSIZE;
+	}
+	evt_fd = xyarray__entry(xy, key, 0);
+	err = bpf_map_update_elem(map_fd, pkey, evt_fd, BPF_ANY);
+	if (err && errno)
+		err = -errno;
+	return err;
+}
+
+static int
 apply_obj_config_map_for_key(const char *name, int map_fd,
 			     struct bpf_map_def *pdef __maybe_unused,
 			     struct bpf_map_op *op,
@@ -1128,6 +1262,10 @@ apply_obj_config_map_for_key(const char *name, int map_fd,
 						 pdef->value_size,
 						 op->v.value);
 		break;
+	case BPF_MAP_OP_SET_EVSEL:
+		err = apply_config_evsel_for_key(name, map_fd, pkey,
+						 op->v.evsel);
+		break;
 	default:
 		pr_debug("ERROR: unknown value type for '%s'\n", name);
 		err = -BPF_LOADER_ERRNO__INTERNAL;
@@ -1193,6 +1331,11 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(OBJCONF_MAP_TYPE)]	= "Incorrect map type",
 	[ERRCODE_OFFSET(OBJCONF_MAP_KEYSIZE)]	= "Incorrect map key size",
 	[ERRCODE_OFFSET(OBJCONF_MAP_VALUESIZE)]	= "Incorrect map value size",
+	[ERRCODE_OFFSET(OBJCONF_MAP_NOEVT)]	= "Event not found for map setting",
+	[ERRCODE_OFFSET(OBJCONF_MAP_MAPSIZE)]	= "Invalid map size for event setting",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTDIM)]	= "Event dimension too large",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTINH)]	= "Doesn't support inherit event",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTTYPE)]	= "Wrong event type for map",
 };
 
 static int
@@ -1329,6 +1472,12 @@ int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
 {
 	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,
+			    "Cannot set event to BPF maps in multi-thread tracing");
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,
+			    "%s (Hint: use -i to turn off inherit)", emsg);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,
+			    "Can only put raw, hardware and BPF output event into a BPF map");
 	bpf__strerror_end(buf, size);
 	return 0;
 }
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index db3c34c..c9ce792 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -33,6 +33,11 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,	/* Incorrect map type */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE,	/* Incorrect map key size */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE,/* Incorrect map value size */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_NOEVT,	/* Event not found for map setting */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_MAPSIZE,	/* Invalid map size for event setting */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,	/* Event dimension too large */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,	/* Doesn't support inherit event */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,	/* Wrong event type for map */
 	__BPF_LOADER_ERRNO__END,
 };
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 0f57716..c42e196 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1741,3 +1741,19 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 
 	tracking_evsel->tracking = true;
 }
+
+struct perf_evsel *
+perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
+			       const char *str)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (!evsel->name)
+			continue;
+		if (strcmp(str, evsel->name) == 0)
+			return evsel;
+	}
+
+	return NULL;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7c4d9a2..a0d1522 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -294,4 +294,7 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 				     struct perf_evsel *tracking_evsel);
 
 void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr);
+
+struct perf_evsel *
+perf_evlist__find_evsel_by_str(struct perf_evlist *evlist, const char *str);
 #endif /* __PERF_EVLIST_H */
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 261690a..ed50421 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -656,14 +656,16 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 			return -EINVAL;
 		}
 
-		err = bpf__config_obj(obj, term, NULL, &error_pos);
+		err = bpf__config_obj(obj, term, data->evlist, &error_pos);
 		if (err) {
-			bpf__strerror_config_obj(obj, term, NULL,
+			bpf__strerror_config_obj(obj, term, data->evlist,
 						 &error_pos, err, errbuf,
 						 sizeof(errbuf));
 			data->error->help = strdup(
-"Hint:\tValid config term:\n"
+"Hint:\tValid config terms:\n"
 "     \tmaps:[<arraymap>].value=[value]\n"
+"     \tmaps:[<eventmap>].event=[event]\n"
+"\n"
 "     \t(add -v to see detail)");
 			data->error->str = strdup(errbuf);
 			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
@@ -1443,9 +1445,10 @@ int parse_events(struct perf_evlist *evlist, const char *str,
 		 struct parse_events_error *err)
 {
 	struct parse_events_evlist data = {
-		.list  = LIST_HEAD_INIT(data.list),
-		.idx   = evlist->nr_entries,
-		.error = err,
+		.list   = LIST_HEAD_INIT(data.list),
+		.idx    = evlist->nr_entries,
+		.error  = err,
+		.evlist = evlist,
 	};
 	int ret;
 
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 0117328..ed0208f 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -98,6 +98,7 @@ struct parse_events_evlist {
 	int			   idx;
 	int			   nr_groups;
 	struct parse_events_error *error;
+	struct perf_evlist	  *evlist;
 };
 
 struct parse_events_terms {
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 09/55] perf tools: Create config_term_names array
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (7 preceding siblings ...)
  2016-02-19 11:43 ` [PATCH 08/55] perf tools: Enable passing event to BPF object Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-20 11:36   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-02-19 11:43 ` [PATCH 10/55] perf stat: Forbid user passing improper config terms Wang Nan
                   ` (45 subsequent siblings)
  54 siblings, 1 reply; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

config_term_names is introduced for furture commits which can retrive
config name through config term.

Utilize this array for parse_events_formats_error_string() so the
missing '{,no-}inherit' terms are added.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 51 +++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/parse-events.h |  3 ++-
 tools/perf/util/parse-events.l |  3 +--
 3 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index ed50421..3fe0d11 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -797,6 +797,25 @@ static int check_type_val(struct parse_events_term *term,
 	return -EINVAL;
 }
 
+/*
+ * Update according to parse-events.l
+ */
+static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
+	[PARSE_EVENTS__TERM_TYPE_USER]			= "<sysfs term>",
+	[PARSE_EVENTS__TERM_TYPE_CONFIG]		= "config",
+	[PARSE_EVENTS__TERM_TYPE_CONFIG1]		= "config1",
+	[PARSE_EVENTS__TERM_TYPE_CONFIG2]		= "config2",
+	[PARSE_EVENTS__TERM_TYPE_NAME]			= "name",
+	[PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD]		= "period",
+	[PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ]		= "freq",
+	[PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE]	= "branch_type",
+	[PARSE_EVENTS__TERM_TYPE_TIME]			= "time",
+	[PARSE_EVENTS__TERM_TYPE_CALLGRAPH]		= "call-graph",
+	[PARSE_EVENTS__TERM_TYPE_STACKSIZE]		= "stack-size",
+	[PARSE_EVENTS__TERM_TYPE_NOINHERIT]		= "no-inherit",
+	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
+};
+
 typedef int config_term_func_t(struct perf_event_attr *attr,
 			       struct parse_events_term *term,
 			       struct parse_events_error *err);
@@ -2149,6 +2168,31 @@ void parse_events_evlist_error(struct parse_events_evlist *data,
 	WARN_ONCE(!err->str, "WARNING: failed to allocate error string");
 }
 
+static void config_terms_list(char *buf, size_t buf_sz)
+{
+	int i;
+	bool first = true;
+
+	buf[0] = '\0';
+	for (i = 0; i < __PARSE_EVENTS__TERM_TYPE_NR; i++) {
+		const char *name = config_term_names[i];
+
+		if (!name)
+			continue;
+		if (name[0] == '<')
+			continue;
+
+		if (strlen(buf) + strlen(name) + 2 >= buf_sz)
+			return;
+
+		if (!first)
+			strcat(buf, ",");
+		else
+			first = false;
+		strcat(buf, name);
+	}
+}
+
 /*
  * Return string contains valid config terms of an event.
  * @additional_terms: For terms such as PMU sysfs terms.
@@ -2156,10 +2200,11 @@ void parse_events_evlist_error(struct parse_events_evlist *data,
 char *parse_events_formats_error_string(char *additional_terms)
 {
 	char *str;
-	static const char *static_terms = "config,config1,config2,name,"
-					  "period,freq,branch_type,time,"
-					  "call-graph,stack-size\n";
+	/* "branch_type" is the longest name */
+	char static_terms[__PARSE_EVENTS__TERM_TYPE_NR *
+			  (sizeof("branch_type") - 1)];
 
+	config_terms_list(static_terms, sizeof(static_terms));
 	/* valid terms */
 	if (additional_terms) {
 		if (asprintf(&str, "valid terms: %s,%s",
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index ed0208f..2196db9 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -68,7 +68,8 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_CALLGRAPH,
 	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
-	PARSE_EVENTS__TERM_TYPE_INHERIT
+	PARSE_EVENTS__TERM_TYPE_INHERIT,
+	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
 struct parse_events_term {
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 4387728..0cc6b84 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -178,8 +178,7 @@ modifier_bp	[rwx]{1,3}
 
 <config>{
 	/*
-	 * Please update parse_events_formats_error_string any time
-	 * new static term is added.
+	 * Please update config_term_names when new static term is added.
 	 */
 config			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG); }
 config1			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG1); }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 10/55] perf stat: Forbid user passing improper config terms
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (8 preceding siblings ...)
  2016-02-19 11:43 ` [PATCH 09/55] perf tools: Create config_term_names array Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
       [not found]   ` <20160219140333.GB16141@kernel.org>
  2016-02-20 11:37   ` [tip:perf/core] perf stat: Bail out on unsupported event config modifiers tip-bot for Wang Nan
  2016-02-19 11:43 ` [PATCH 11/55] perf tools: Rename and move pmu_event_name to get_config_name Wang Nan
                   ` (44 subsequent siblings)
  54 siblings, 2 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

'perf stat' accepts some config terms but doesn't apply them. For
example:

 # perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
 # ls
 # exit

 Performance counter stats for 'bash':

         266258061      instructions/no-inherit/
         266258061      instructions/inherit/

       1.402183915 seconds time elapsed

The result is confusing, because user may expect the first
'instructions' event exclude the 'ls' command.

This patch forbid most of these config terms for 'perf stat'.

Result:

 # ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
 event syntax error: 'instructions/no-inherit/'
                      \___ 'no-inherit' is not usable in 'perf stat'
 ...

We can add blocked config terms back when 'perf stat' really supports them.

This patch also removes unavailable config term from error message:

 # ./perf stat -e 'instructions/badterm/' ls
 event syntax error: 'instructions/badterm/'
                                   \___ unknown term

 valid terms: config,config1,config2,name

 # ./perf stat -e 'cpu/badterm/' ls
 event syntax error: 'cpu/badterm/'
                          \___ unknown term

 valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-stat.c      |  1 +
 tools/perf/util/parse-events.c | 48 ++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/parse-events.h |  1 +
 3 files changed, 50 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 86289df..8c0bc0f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1831,6 +1831,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (evsel_list == NULL)
 		return -ENOMEM;
 
+	parse_events__shrink_config_terms();
 	argc = parse_options_subcommand(argc, argv, stat_options, stat_subcommands,
 					(const char **) stat_usage,
 					PARSE_OPT_STOP_AT_NON_OPTION);
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 3fe0d11..5b8a13b 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -816,6 +816,41 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
 };
 
+static bool config_term_shrinked;
+
+static bool
+config_term_avail(int term_type, struct parse_events_error *err)
+{
+	if (term_type < 0 || term_type >= __PARSE_EVENTS__TERM_TYPE_NR) {
+		err->str = strdup("Invalid term_type");
+		return false;
+	}
+	if (!config_term_shrinked)
+		return true;
+
+	switch (term_type) {
+	case PARSE_EVENTS__TERM_TYPE_CONFIG:
+	case PARSE_EVENTS__TERM_TYPE_CONFIG1:
+	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
+	case PARSE_EVENTS__TERM_TYPE_NAME:
+		return true;
+	default:
+		if (!err)
+			return false;
+
+		/* term_type is validated so indexing is safe */
+		if (asprintf(&err->str, "'%s' is not usable in 'perf stat'",
+			     config_term_names[term_type]) < 0)
+			err->str = NULL;
+		return false;
+	}
+}
+
+void parse_events__shrink_config_terms(void)
+{
+	config_term_shrinked = true;
+}
+
 typedef int config_term_func_t(struct perf_event_attr *attr,
 			       struct parse_events_term *term,
 			       struct parse_events_error *err);
@@ -885,6 +920,17 @@ do {									   \
 		return -EINVAL;
 	}
 
+	/*
+	 * Check term availbility after basic checking so
+	 * PARSE_EVENTS__TERM_TYPE_USER can be found and filtered.
+	 *
+	 * If check availbility at the entry of this function,
+	 * user will see "'<sysfs term>' is not usable in 'perf stat'"
+	 * if an invalid config term is provided for legacy events
+	 * (for example, instructions/badterm/...), which is confusing.
+	 */
+	if (!config_term_avail(term->type_term, err))
+		return -EINVAL;
 	return 0;
 #undef CHECK_TYPE_VAL
 }
@@ -2177,6 +2223,8 @@ static void config_terms_list(char *buf, size_t buf_sz)
 	for (i = 0; i < __PARSE_EVENTS__TERM_TYPE_NR; i++) {
 		const char *name = config_term_names[i];
 
+		if (!config_term_avail(i, NULL))
+			continue;
 		if (!name)
 			continue;
 		if (name[0] == '<')
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 2196db9..b6afc57 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -106,6 +106,7 @@ struct parse_events_terms {
 	struct list_head *terms;
 };
 
+void parse_events__shrink_config_terms(void);
 int parse_events__is_hardcoded_term(struct parse_events_term *term);
 int parse_events_term__num(struct parse_events_term **term,
 			   int type_term, char *config, u64 num,
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 11/55] perf tools: Rename and move pmu_event_name to get_config_name
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (9 preceding siblings ...)
  2016-02-19 11:43 ` [PATCH 10/55] perf stat: Forbid user passing improper config terms Wang Nan
@ 2016-02-19 11:43 ` Wang Nan
  2016-02-20 11:37   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-02-19 11:44 ` [PATCH 12/55] perf tools: Enable config raw and numeric events Wang Nan
                   ` (43 subsequent siblings)
  54 siblings, 1 reply; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Following commits will make more events obey /name=newname/ options.
This patch makes pmu_event_name() a generic helper.

Makes new get_config_name() accept NULL input to make life easier.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 5b8a13b..87911ae 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -279,7 +279,24 @@ const char *event_type(int type)
 	return "unknown";
 }
 
+static int parse_events__is_name_term(struct parse_events_term *term)
+{
+	return term->type_term == PARSE_EVENTS__TERM_TYPE_NAME;
+}
 
+static char *get_config_name(struct list_head *head_terms)
+{
+	struct parse_events_term *term;
+
+	if (!head_terms)
+		return NULL;
+
+	list_for_each_entry(term, head_terms, list)
+		if (parse_events__is_name_term(term))
+			return term->val.str;
+
+	return NULL;
+}
 
 static struct perf_evsel *
 __add_event(struct list_head *list, int *idx,
@@ -1080,22 +1097,6 @@ int parse_events_add_numeric(struct parse_events_evlist *data,
 	return add_event(list, &data->idx, &attr, NULL, &config_terms);
 }
 
-static int parse_events__is_name_term(struct parse_events_term *term)
-{
-	return term->type_term == PARSE_EVENTS__TERM_TYPE_NAME;
-}
-
-static char *pmu_event_name(struct list_head *head_terms)
-{
-	struct parse_events_term *term;
-
-	list_for_each_entry(term, head_terms, list)
-		if (parse_events__is_name_term(term))
-			return term->val.str;
-
-	return NULL;
-}
-
 int parse_events_add_pmu(struct parse_events_evlist *data,
 			 struct list_head *list, char *name,
 			 struct list_head *head_config)
@@ -1140,7 +1141,7 @@ int parse_events_add_pmu(struct parse_events_evlist *data,
 		return -EINVAL;
 
 	evsel = __add_event(list, &data->idx, &attr,
-			    pmu_event_name(head_config), pmu->cpus,
+			    get_config_name(head_config), pmu->cpus,
 			    &config_terms);
 	if (evsel) {
 		evsel->unit = info.unit;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 12/55] perf tools: Enable config raw and numeric events
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (10 preceding siblings ...)
  2016-02-19 11:43 ` [PATCH 11/55] perf tools: Rename and move pmu_event_name to get_config_name Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
       [not found]   ` <20160219141405.GC16141@kernel.org>
  2016-02-20 11:38   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-02-19 11:44 ` [PATCH 13/55] perf tools: Enable config and setting names for legacy cache events Wang Nan
                   ` (42 subsequent siblings)
  54 siblings, 2 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch allows setting config terms for raw and numeric events.
For example:

 # perf stat -e cycles/name=cyc/ ls
 ...
 1821108      cyc
 ...

 # perf stat -e r6530160/name=event/ ls
 ...
 1103195      event
 ...

 # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
 ...
 # perf report --stdio
 ...
 # Samples: 124  of event 'cycles'
 46.61%     0.00%  swapper        [kernel.vmlinux]            [k] cpu_startup_entry
 41.26%     0.00%  swapper        [kernel.vmlinux]            [k] start_secondary
 ...
 # Samples: 91  of event 'evtx'
 ...
 93.76%     0.00%  swapper      [kernel.vmlinux]            [k] cpu_startup_entry
         |
         ---cpu_startup_entry
            |
            |--66.63%--call_cpuidle
            |          cpuidle_enter
            |          |
 ...

3 test cases are introduced to test config terms for symbol, raw and
numeric events.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/tests/parse-events.c | 40 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/parse-events.c  |  3 ++-
 tools/perf/util/parse-events.y  | 10 ++++++----
 3 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 6648274..15e2d05 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -1271,6 +1271,31 @@ static int test__checkevent_precise_max_modifier(struct perf_evlist *evlist)
 	return 0;
 }
 
+static int test__checkevent_config_symbol(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+	TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "insn") == 0);
+	return 0;
+}
+
+static int test__checkevent_config_raw(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+	TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "rawpmu") == 0);
+	return 0;
+}
+
+static int test__checkevent_config_num(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+	TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "numpmu") == 0);
+	return 0;
+}
+
+
 static int count_tracepoints(void)
 {
 	struct dirent *events_ent;
@@ -1579,6 +1604,21 @@ static struct evlist_test test__events[] = {
 		.check = test__checkevent_precise_max_modifier,
 		.id    = 47,
 	},
+	{
+		.name  = "instructions/name=insn/",
+		.check = test__checkevent_config_symbol,
+		.id    = 48,
+	},
+	{
+		.name  = "r1234/name=rawpmu/",
+		.check = test__checkevent_config_raw,
+		.id    = 49,
+	},
+	{
+		.name  = "4:0x6530160/name=numpmu/",
+		.check = test__checkevent_config_num,
+		.id    = 50,
+	},
 };
 
 static struct evlist_test test__events_pmu[] = {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 87911ae..68a3871 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1094,7 +1094,8 @@ int parse_events_add_numeric(struct parse_events_evlist *data,
 			return -ENOMEM;
 	}
 
-	return add_event(list, &data->idx, &attr, NULL, &config_terms);
+	return add_event(list, &data->idx, &attr,
+			 get_config_name(head_config), &config_terms);
 }
 
 int parse_events_add_pmu(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 448c1ec..7ce8633 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -434,24 +434,26 @@ PE_NAME ':' PE_NAME
 }
 
 event_legacy_numeric:
-PE_VALUE ':' PE_VALUE
+PE_VALUE ':' PE_VALUE opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, NULL));
+	ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, $4));
+	parse_events_terms__delete($4);
 	$$ = list;
 }
 
 event_legacy_raw:
-PE_RAW
+PE_RAW opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, NULL));
+	ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, $2));
+	parse_events_terms__delete($2);
 	$$ = list;
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 13/55] perf tools: Enable config and setting names for legacy cache events
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (11 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 12/55] perf tools: Enable config raw and numeric events Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-20 11:38   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-02-19 11:44 ` [PATCH 14/55] perf tools: Support setting different slots in a BPF map separately Wang Nan
                   ` (41 subsequent siblings)
  54 siblings, 1 reply; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch allows setting config terms for legacy cache events.
For example:

 # perf stat -e L1-icache-misses/name=valA/ -e branches/name=valB/ ls
 ...
  Performance counter stats for 'ls':

              11299      valA
             451605      valB

        0.000779091 seconds time elapsed

 # perf record -e cache-misses/name=inh/ -e cache-misses/name=noinh,no-inherit/ bash
 # ls
 # exit
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.023 MB perf.data (131 samples) ]
 # perf report --stdio | grep -B 1 'Event count'
 # Samples: 105  of event 'inh'
 # Event count (approx.): 109118
 --
 # Samples: 26  of event 'noinh'
 # Event count (approx.): 48302

A test case is introduced to test this feature.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/tests/parse-events.c | 12 ++++++++++++
 tools/perf/util/parse-events.c  | 30 +++++++++++++++++++++++++++---
 tools/perf/util/parse-events.h  |  4 +++-
 tools/perf/util/parse-events.y  | 18 ++++++++++++------
 4 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 15e2d05..7865f68 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -1295,6 +1295,13 @@ static int test__checkevent_config_num(struct perf_evlist *evlist)
 	return 0;
 }
 
+static int test__checkevent_config_cache(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+	TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "cachepmu") == 0);
+	return 0;
+}
 
 static int count_tracepoints(void)
 {
@@ -1619,6 +1626,11 @@ static struct evlist_test test__events[] = {
 		.check = test__checkevent_config_num,
 		.id    = 50,
 	},
+	{
+		.name  = "L1-dcache-misses/name=cachepmu/",
+		.check = test__checkevent_config_cache,
+		.id    = 51,
+	},
 };
 
 static struct evlist_test test__events_pmu[] = {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 68a3871..be3da08 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -350,11 +350,25 @@ static int parse_aliases(char *str, const char *names[][PERF_EVSEL__MAX_ALIASES]
 	return -1;
 }
 
+typedef int config_term_func_t(struct perf_event_attr *attr,
+			       struct parse_events_term *term,
+			       struct parse_events_error *err);
+static int config_term_common(struct perf_event_attr *attr,
+			      struct parse_events_term *term,
+			      struct parse_events_error *err);
+static int config_attr(struct perf_event_attr *attr,
+		       struct list_head *head,
+		       struct parse_events_error *err,
+		       config_term_func_t config_term);
+
 int parse_events_add_cache(struct list_head *list, int *idx,
-			   char *type, char *op_result1, char *op_result2)
+			   char *type, char *op_result1, char *op_result2,
+			   struct parse_events_error *error,
+			   struct list_head *head_config)
 {
 	struct perf_event_attr attr;
-	char name[MAX_NAME_LEN];
+	LIST_HEAD(config_terms);
+	char name[MAX_NAME_LEN], *config_name;
 	int cache_type = -1, cache_op = -1, cache_result = -1;
 	char *op_result[2] = { op_result1, op_result2 };
 	int i, n;
@@ -368,6 +382,7 @@ int parse_events_add_cache(struct list_head *list, int *idx,
 	if (cache_type == -1)
 		return -EINVAL;
 
+	config_name = get_config_name(head_config);
 	n = snprintf(name, MAX_NAME_LEN, "%s", type);
 
 	for (i = 0; (i < 2) && (op_result[i]); i++) {
@@ -408,7 +423,16 @@ int parse_events_add_cache(struct list_head *list, int *idx,
 	memset(&attr, 0, sizeof(attr));
 	attr.config = cache_type | (cache_op << 8) | (cache_result << 16);
 	attr.type = PERF_TYPE_HW_CACHE;
-	return add_event(list, idx, &attr, name, NULL);
+
+	if (head_config) {
+		if (config_attr(&attr, head_config, error,
+				config_term_common))
+			return -EINVAL;
+
+		if (get_config_terms(head_config, &config_terms))
+			return -ENOMEM;
+	}
+	return add_event(list, idx, &attr, config_name ? : name, &config_terms);
 }
 
 static void tracepoint_error(struct parse_events_error *e, int err,
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index b6afc57..e036969 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -142,7 +142,9 @@ int parse_events_add_numeric(struct parse_events_evlist *data,
 			     u32 type, u64 config,
 			     struct list_head *head_config);
 int parse_events_add_cache(struct list_head *list, int *idx,
-			   char *type, char *op_result1, char *op_result2);
+			   char *type, char *op_result1, char *op_result2,
+			   struct parse_events_error *error,
+			   struct list_head *head_config);
 int parse_events_add_breakpoint(struct list_head *list, int *idx,
 				void *ptr, char *type, u64 len);
 int parse_events_add_pmu(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 7ce8633..359ef6a 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -303,33 +303,39 @@ value_sym sep_slash_dc
 }
 
 event_legacy_cache:
-PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT
+PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT opt_event_config
 {
 	struct parse_events_evlist *data = _data;
+	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, $5));
+	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, $5, error, $6));
+	parse_events_terms__delete($6);
 	$$ = list;
 }
 |
-PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT
+PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT opt_event_config
 {
 	struct parse_events_evlist *data = _data;
+	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, NULL));
+	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, NULL, error, $4));
+	parse_events_terms__delete($4);
 	$$ = list;
 }
 |
-PE_NAME_CACHE_TYPE
+PE_NAME_CACHE_TYPE opt_event_config
 {
 	struct parse_events_evlist *data = _data;
+	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, NULL, NULL));
+	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, NULL, NULL, error, $2));
+	parse_events_terms__delete($2);
 	$$ = list;
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 14/55] perf tools: Support setting different slots in a BPF map separately
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (12 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 13/55] perf tools: Enable config and setting names for legacy cache events Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 15/55] perf tools: Enable indices setting syntax for BPF maps Wang Nan
                   ` (40 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch introduces basic facilities to support config different
slots in a BPF map one by one.

array.nr_ranges and array.ranges are introduced into 'struct
parse_events_term', where ranges is an array of indices range (start,
length) which will be configured by this config term. nr_ranges
is the size of the array. The array is passed to 'struct bpf_map_priv'.
To indicate the new type of configuration, BPF_MAP_KEY_RANGES is
added as a new key type. bpf_map_config_foreach_key() is extended to
iterate over those indices instead of all possible keys.

Code in this commit will be enabled by following commit which enables
the indices syntax for array configuration.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 128 ++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/bpf-loader.h   |   1 +
 tools/perf/util/parse-events.c |   7 +++
 tools/perf/util/parse-events.h |  10 ++++
 4 files changed, 137 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index e7520c2..b6526b0 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -17,6 +17,7 @@
 #include "llvm-utils.h"
 #include "probe-event.h"
 #include "probe-finder.h" // for MAX_PROBES
+#include "parse-events.h"
 #include "llvm-utils.h"
 
 #define DEFINE_PRINT_FN(name, level) \
@@ -747,6 +748,7 @@ enum bpf_map_op_type {
 
 enum bpf_map_key_type {
 	BPF_MAP_KEY_ALL,
+	BPF_MAP_KEY_RANGES,
 };
 
 struct bpf_map_op {
@@ -754,6 +756,9 @@ struct bpf_map_op {
 	enum bpf_map_op_type op_type;
 	enum bpf_map_key_type key_type;
 	union {
+		struct parse_events_array array;
+	} k;
+	union {
 		u64 value;
 		struct perf_evsel *evsel;
 	} v;
@@ -768,6 +773,8 @@ bpf_map_op__delete(struct bpf_map_op *op)
 {
 	if (!list_empty(&op->list))
 		list_del(&op->list);
+	if (op->key_type == BPF_MAP_KEY_RANGES)
+		parse_events__clear_array(&op->k.array);
 	free(op);
 }
 
@@ -791,10 +798,33 @@ clear_map_priv(struct bpf_map *map __maybe_unused, void *_priv)
 	bpf_map_priv__delete(priv);
 }
 
+static int
+bpf_map_op_setkey(struct bpf_map_op *op, struct parse_events_term *term)
+{
+	op->key_type = BPF_MAP_KEY_ALL;
+	if (!term)
+		return 0;
+
+	if (term->array.nr_ranges) {
+		size_t memsz = term->array.nr_ranges *
+				sizeof(op->k.array.ranges[0]);
+
+		op->k.array.ranges = memdup(term->array.ranges, memsz);
+		if (!op->k.array.ranges) {
+			pr_debug("No enough memory to alloc indices for map\n");
+			return -ENOMEM;
+		}
+		op->key_type = BPF_MAP_KEY_RANGES;
+		op->k.array.nr_ranges = term->array.nr_ranges;
+	}
+	return 0;
+}
+
 static struct bpf_map_op *
-bpf_map_op__new(void)
+bpf_map_op__new(struct parse_events_term *term)
 {
 	struct bpf_map_op *op;
+	int err;
 
 	op = zalloc(sizeof(*op));
 	if (!op) {
@@ -803,7 +833,11 @@ bpf_map_op__new(void)
 	}
 	INIT_LIST_HEAD(&op->list);
 
-	op->key_type = BPF_MAP_KEY_ALL;
+	err = bpf_map_op_setkey(op, term);
+	if (err) {
+		free(op);
+		return ERR_PTR(err);
+	}
 	return op;
 }
 
@@ -840,12 +874,12 @@ bpf_map__add_op(struct bpf_map *map, struct bpf_map_op *op)
 }
 
 static struct bpf_map_op *
-bpf_map__add_newop(struct bpf_map *map)
+bpf_map__add_newop(struct bpf_map *map, struct parse_events_term *term)
 {
 	struct bpf_map_op *op;
 	int err;
 
-	op = bpf_map_op__new();
+	op = bpf_map_op__new(term);
 	if (IS_ERR(op))
 		return op;
 
@@ -895,7 +929,7 @@ __bpf_map__config_value(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
 	}
 
-	op = bpf_map__add_newop(map);
+	op = bpf_map__add_newop(map, term);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 	op->op_type = BPF_MAP_OP_SET_VALUE;
@@ -957,7 +991,7 @@ __bpf_map__config_event(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
 	}
 
-	op = bpf_map__add_newop(map);
+	op = bpf_map__add_newop(map, term);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 	op->op_type = BPF_MAP_OP_SET_EVSEL;
@@ -995,6 +1029,44 @@ struct bpf_obj_config__map_func bpf_obj_config__map_funcs[] = {
 };
 
 static int
+config_map_indices_range_check(struct parse_events_term *term,
+			       struct bpf_map *map,
+			       const char *map_name)
+{
+	struct parse_events_array *array = &term->array;
+	struct bpf_map_def def;
+	unsigned int i;
+	int err;
+
+	if (!array->nr_ranges)
+		return 0;
+	if (!array->ranges) {
+		pr_debug("ERROR: map %s: array->nr_ranges is %d but range array is NULL\n",
+			 map_name, (int)array->nr_ranges);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("ERROR: Unable to get map definition from '%s'\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	for (i = 0; i < array->nr_ranges; i++) {
+		unsigned int start = array->ranges[i].start;
+		size_t length = array->ranges[i].length;
+		unsigned int idx = start + length - 1;
+
+		if (idx >= def.max_entries) {
+			pr_debug("ERROR: index %d too large\n", idx);
+			return -BPF_LOADER_ERRNO__OBJCONF_MAP_IDX2BIG;
+		}
+	}
+	return 0;
+}
+
+static int
 bpf__obj_config_map(struct bpf_object *obj,
 		    struct parse_events_term *term,
 		    struct perf_evlist *evlist,
@@ -1029,7 +1101,12 @@ bpf__obj_config_map(struct bpf_object *obj,
 		goto out;
 	}
 
-	*key_scan_pos += map_opt - map_name;
+	*key_scan_pos += strlen(map_opt);
+	err = config_map_indices_range_check(term, map, map_name);
+	if (err)
+		goto out;
+	*key_scan_pos -= strlen(map_opt);
+
 	for (i = 0; i < ARRAY_SIZE(bpf_obj_config__map_funcs); i++) {
 		struct bpf_obj_config__map_func *func =
 				&bpf_obj_config__map_funcs[i];
@@ -1100,6 +1177,33 @@ foreach_key_array_all(map_config_func_t func,
 }
 
 static int
+foreach_key_array_ranges(map_config_func_t func, void *arg,
+			 const char *name, int map_fd,
+			 struct bpf_map_def *pdef,
+			 struct bpf_map_op *op)
+{
+	unsigned int i, j;
+	int err;
+
+	for (i = 0; i < op->k.array.nr_ranges; i++) {
+		unsigned int start = op->k.array.ranges[i].start;
+		size_t length = op->k.array.ranges[i].length;
+
+		for (j = 0; j < length; j++) {
+			unsigned int idx = start + j;
+
+			err = func(name, map_fd, pdef, op, &idx, arg);
+			if (err) {
+				pr_debug("ERROR: failed to insert value to %s[%u]\n",
+					 name, idx);
+				return err;
+			}
+		}
+	}
+	return 0;
+}
+
+static int
 bpf_map_config_foreach_key(struct bpf_map *map,
 			   map_config_func_t func,
 			   void *arg)
@@ -1141,14 +1245,19 @@ bpf_map_config_foreach_key(struct bpf_map *map,
 			case BPF_MAP_KEY_ALL:
 				err = foreach_key_array_all(func, arg, name,
 							    map_fd, &def, op);
-				if (err)
-					return err;
+				break;
+			case BPF_MAP_KEY_RANGES:
+				err = foreach_key_array_ranges(func, arg, name,
+							       map_fd, &def,
+							       op);
 				break;
 			default:
 				pr_debug("ERROR: keytype for map '%s' invalid\n",
 					 name);
 				return -BPF_LOADER_ERRNO__INTERNAL;
 			}
+			if (err)
+				return err;
 			break;
 		default:
 			pr_debug("ERROR: type of '%s' incorrect\n", name);
@@ -1336,6 +1445,7 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTDIM)]	= "Event dimension too large",
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTINH)]	= "Doesn't support inherit event",
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTTYPE)]	= "Wrong event type for map",
+	[ERRCODE_OFFSET(OBJCONF_MAP_IDX2BIG)]	= "Index too large",
 };
 
 static int
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index c9ce792..30ee519 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -38,6 +38,7 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,	/* Event dimension too large */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,	/* Doesn't support inherit event */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,	/* Wrong event type for map */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_IDX2BIG,	/* Index too large */
 	__BPF_LOADER_ERRNO__END,
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index be3da08..dfb3d9d 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2215,6 +2215,8 @@ void parse_events_terms__purge(struct list_head *terms)
 	struct parse_events_term *term, *h;
 
 	list_for_each_entry_safe(term, h, terms, list) {
+		if (term->array.nr_ranges)
+			free(term->array.ranges);
 		list_del_init(&term->list);
 		free(term);
 	}
@@ -2228,6 +2230,11 @@ void parse_events_terms__delete(struct list_head *terms)
 	free(terms);
 }
 
+void parse_events__clear_array(struct parse_events_array *a)
+{
+	free(a->ranges);
+}
+
 void parse_events_evlist_error(struct parse_events_evlist *data,
 			       int idx, const char *str)
 {
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index e036969..e445622 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -72,8 +72,17 @@ enum {
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
+struct parse_events_array {
+	size_t nr_ranges;
+	struct {
+		unsigned int start;
+		size_t length;
+	} *ranges;
+};
+
 struct parse_events_term {
 	char *config;
+	struct parse_events_array array;
 	union {
 		char *str;
 		u64  num;
@@ -120,6 +129,7 @@ int parse_events_term__clone(struct parse_events_term **new,
 			     struct parse_events_term *term);
 void parse_events_terms__delete(struct list_head *terms);
 void parse_events_terms__purge(struct list_head *terms);
+void parse_events__clear_array(struct parse_events_array *a);
 int parse_events__modifier_event(struct list_head *list, char *str, bool add);
 int parse_events__modifier_group(struct list_head *list, char *event_mod);
 int parse_events_name(struct list_head *list, char *name);
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 15/55] perf tools: Enable indices setting syntax for BPF maps
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (13 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 14/55] perf tools: Support setting different slots in a BPF map separately Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 16/55] perf tools: Pass tracepoint options to BPF script Wang Nan
                   ` (39 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch introduces a new syntax to perf event parser:

 # perf record -e './test_bpf_map_3.c/maps:channel.value[0,1,2,3...5]=101/' usleep 2

By utilizing the basic facilities in bpf-loader.c which allow setting
different slots in a BPF map separately, the newly introduced syntax
allows perf to control specific elements in a BPF map.

Test result:

 # cat ./test_bpf_map_3.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
 	(void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(unsigned char),
 	.max_entries = 100,
 };
 SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
 int func(void *ctx, int err, long nsec)
 {
 	char fmt[] = "%ld\n";
 	long usec = nsec * 0x10624dd3 >> 38; // nsec / 1000
 	int key = (int)usec;
 	unsigned char *pval = map_lookup_elem(&channel, &key);

 	if (!pval)
 		return 0;
 	trace_printk(fmt, sizeof(fmt), (unsigned char)*pval);
 	return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

Normal case:
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0,1,2,3...5]=101/' usleep 2
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0...9,20...29]=102,maps:channel.value[10...19]=103/' usleep 3
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0...9,20...29]=102,maps:channel.value[10...19]=103/' usleep 15
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
           usleep-655   [006] d... 2745434.122814: : 102
           usleep-904   [006] d... 2745439.916264: : 103
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[all]=104/' usleep 99
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
           usleep-655   [006] d... 2745434.122814: : 102
           usleep-904   [006] d... 2745439.916264: : 103
           usleep-1537  [003] d... 2745538.053737: : 104

Error case:
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[10...1000]=104/' usleep 99
 event syntax error: '..annel.value[10...1000]=104/'
                                   \___ Index too large
 Hint:	Valid config terms:
      	maps:[<arraymap>].value<indices>=[value]
      	maps:[<eventmap>].event<indices>=[event]

      	where <indices> is something like [0,3...5] or [all]
      	(add -v to see detail)
 Run 'perf list' for a list of valid events

  Usage: perf record [<options>] [<command>]
     or: perf record [<options>] -- <command> [<options>]

     -e, --event <event>   event selector. use 'perf list' to list available events

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c |  5 ++-
 tools/perf/util/parse-events.l | 13 ++++++-
 tools/perf/util/parse-events.y | 85 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index dfb3d9d..a9734f8 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -704,9 +704,10 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 						 sizeof(errbuf));
 			data->error->help = strdup(
 "Hint:\tValid config terms:\n"
-"     \tmaps:[<arraymap>].value=[value]\n"
-"     \tmaps:[<eventmap>].event=[event]\n"
+"     \tmaps:[<arraymap>].value<indices>=[value]\n"
+"     \tmaps:[<eventmap>].event<indices>=[event]\n"
 "\n"
+"     \twhere <indices> is something like [0,3...5] or [all]\n"
 "     \t(add -v to see detail)");
 			data->error->str = strdup(errbuf);
 			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 0cc6b84..fb85d03 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -9,8 +9,8 @@
 %{
 #include <errno.h>
 #include "../perf.h"
-#include "parse-events-bison.h"
 #include "parse-events.h"
+#include "parse-events-bison.h"
 
 char *parse_events_get_text(yyscan_t yyscanner);
 YYSTYPE *parse_events_get_lval(yyscan_t yyscanner);
@@ -111,6 +111,7 @@ do {							\
 %x mem
 %s config
 %x event
+%x array
 
 group		[^,{}/]*[{][^}]*[}][^,{}/]*
 event_pmu	[^,{}/]+[/][^/]*[/][^,{}/]*
@@ -176,6 +177,14 @@ modifier_bp	[rwx]{1,3}
 
 }
 
+<array>{
+"]"			{ BEGIN(config); return ']'; }
+{num_dec}		{ return value(yyscanner, 10); }
+{num_hex}		{ return value(yyscanner, 16); }
+,			{ return ','; }
+"\.\.\."		{ return PE_ARRAY_RANGE; }
+}
+
 <config>{
 	/*
 	 * Please update config_term_names when new static term is added.
@@ -195,6 +204,8 @@ no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
+\[all\]			{ return PE_ARRAY_ALL; }
+"["			{ BEGIN(array); return '['; }
 }
 
 <mem>{
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 359ef6a..4992fb9 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -48,6 +48,7 @@ static inc_group_count(struct list_head *list,
 %token PE_PREFIX_MEM PE_PREFIX_RAW PE_PREFIX_GROUP
 %token PE_ERROR
 %token PE_PMU_EVENT_PRE PE_PMU_EVENT_SUF PE_KERNEL_PMU_EVENT
+%token PE_ARRAY_ALL PE_ARRAY_RANGE
 %type <num> PE_VALUE
 %type <num> PE_VALUE_SYM_HW
 %type <num> PE_VALUE_SYM_SW
@@ -83,6 +84,9 @@ static inc_group_count(struct list_head *list,
 %type <head> group_def
 %type <head> group
 %type <head> groups
+%type <array> array
+%type <array> array_term
+%type <array> array_terms
 
 %union
 {
@@ -94,6 +98,7 @@ static inc_group_count(struct list_head *list,
 		char *sys;
 		char *event;
 	} tracepoint_name;
+	struct parse_events_array array;
 }
 %%
 
@@ -594,6 +599,86 @@ PE_TERM
 	ABORT_ON(parse_events_term__num(&term, (int)$1, NULL, 1, &@1, NULL));
 	$$ = term;
 }
+|
+PE_NAME array '=' PE_NAME
+{
+	struct parse_events_term *term;
+	int i;
+
+	ABORT_ON(parse_events_term__str(&term, PARSE_EVENTS__TERM_TYPE_USER,
+					$1, $4, &@1, &@4));
+
+	term->array = $2;
+	$$ = term;
+}
+|
+PE_NAME array '=' PE_VALUE
+{
+	struct parse_events_term *term;
+
+	ABORT_ON(parse_events_term__num(&term, PARSE_EVENTS__TERM_TYPE_USER,
+					$1, $4, &@1, &@4));
+	term->array = $2;
+	$$ = term;
+}
+
+array:
+'[' array_terms ']'
+{
+	$$ = $2;
+}
+|
+PE_ARRAY_ALL
+{
+	$$.nr_ranges = 0;
+	$$.ranges = NULL;
+}
+
+array_terms:
+array_terms ',' array_term
+{
+	struct parse_events_array new_array;
+
+	new_array.nr_ranges = $1.nr_ranges + $3.nr_ranges;
+	new_array.ranges = malloc(sizeof(new_array.ranges[0]) *
+				  new_array.nr_ranges);
+	ABORT_ON(!new_array.ranges);
+	memcpy(&new_array.ranges[0], $1.ranges,
+	       $1.nr_ranges * sizeof(new_array.ranges[0]));
+	memcpy(&new_array.ranges[$1.nr_ranges], $3.ranges,
+	       $3.nr_ranges * sizeof(new_array.ranges[0]));
+	free($1.ranges);
+	free($3.ranges);
+	$$ = new_array;
+}
+|
+array_term
+
+array_term:
+PE_VALUE
+{
+	struct parse_events_array array;
+
+	array.nr_ranges = 1;
+	array.ranges = malloc(sizeof(array.ranges[0]));
+	ABORT_ON(!array.ranges);
+	array.ranges[0].start = $1;
+	array.ranges[0].length = 1;
+	$$ = array;
+}
+|
+PE_VALUE PE_ARRAY_RANGE PE_VALUE
+{
+	struct parse_events_array array;
+
+	ABORT_ON($3 < $1);
+	array.nr_ranges = 1;
+	array.ranges = malloc(sizeof(array.ranges[0]));
+	ABORT_ON(!array.ranges);
+	array.ranges[0].start = $1;
+	array.ranges[0].length = $3 - $1 + 1;
+	$$ = array;
+}
 
 sep_dc: ':' |
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 16/55] perf tools: Pass tracepoint options to BPF script
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (14 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 15/55] perf tools: Enable indices setting syntax for BPF maps Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 17/55] perf tools: Introduce bpf-output event Wang Nan
                   ` (38 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Users can pass options to tracepoints defined in the BPF script.
For example:

 # perf record -e ./test.c/no-inherit/ bash
 # dd if=/dev/zero of=/dev/null count=10000
 # exit
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.022 MB perf.data (139 samples) ]

 (no-inherit works, only sys_read issued by bash is captured, at least
  10000 sys_read issued by dd is skipped.)

test.c:

 #define SEC(NAME) __attribute__((section(NAME), used))
 SEC("func=sys_read")
 int bpf_func__sys_read(void *ctx)
 {
     return 1;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;

no-inherit is applied to the kprobe event defined in test.c.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/tests/bpf.c         |  2 +-
 tools/perf/util/parse-events.c | 56 +++++++++++++++++++++++++++++++++++++-----
 tools/perf/util/parse-events.h |  3 ++-
 3 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 4aed5cb..199501c 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -112,7 +112,7 @@ static int do_test(struct bpf_object *obj, int (*func)(void),
 	parse_evlist.error = &parse_error;
 	INIT_LIST_HEAD(&parse_evlist.list);
 
-	err = parse_events_load_bpf_obj(&parse_evlist, &parse_evlist.list, obj);
+	err = parse_events_load_bpf_obj(&parse_evlist, &parse_evlist.list, obj, NULL);
 	if (err || list_empty(&parse_evlist.list)) {
 		pr_debug("Failed to add events selected by BPF\n");
 		return TEST_FAIL;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index a9734f8..5e958b9 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -581,6 +581,7 @@ static int add_tracepoint_multi_sys(struct list_head *list, int *idx,
 struct __add_bpf_event_param {
 	struct parse_events_evlist *data;
 	struct list_head *list;
+	struct list_head *head_config;
 };
 
 static int add_bpf_event(struct probe_trace_event *tev, int fd,
@@ -597,7 +598,8 @@ static int add_bpf_event(struct probe_trace_event *tev, int fd,
 		 tev->group, tev->event, fd);
 
 	err = parse_events_add_tracepoint(&new_evsels, &evlist->idx, tev->group,
-					  tev->event, evlist->error, NULL);
+					  tev->event, evlist->error,
+					  param->head_config);
 	if (err) {
 		struct perf_evsel *evsel, *tmp;
 
@@ -622,11 +624,12 @@ static int add_bpf_event(struct probe_trace_event *tev, int fd,
 
 int parse_events_load_bpf_obj(struct parse_events_evlist *data,
 			      struct list_head *list,
-			      struct bpf_object *obj)
+			      struct bpf_object *obj,
+			      struct list_head *head_config)
 {
 	int err;
 	char errbuf[BUFSIZ];
-	struct __add_bpf_event_param param = {data, list};
+	struct __add_bpf_event_param param = {data, list, head_config};
 	static bool registered_unprobe_atexit = false;
 
 	if (IS_ERR(obj) || !obj) {
@@ -720,14 +723,47 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 	return 0;
 }
 
+/*
+ * Split config terms:
+ * perf record -e bpf.c/call-graph=fp,maps:array.value[0]=1/ ...
+ *  'call-graph=fp' is 'evt config', should be applied to each
+ *  events in bpf.c.
+ * 'maps:array.value[0]=1' is 'obj config', should be processed
+ * with parse_events_config_bpf.
+ *
+ * Move object config terms from the first list to obj_head_config.
+ */
+static void
+split_bpf_config_terms(struct list_head *evt_head_config,
+		       struct list_head *obj_head_config)
+{
+	struct parse_events_term *term, *temp;
+
+	/*
+	 * Currectly, all possible user config term
+	 * belong to bpf object. parse_events__is_hardcoded_term()
+	 * happends to be a good flag.
+	 *
+	 * See parse_events_config_bpf() and
+	 * config_term_tracepoint().
+	 */
+	list_for_each_entry_safe(term, temp, evt_head_config, list)
+		if (!parse_events__is_hardcoded_term(term))
+			list_move_tail(&term->list, obj_head_config);
+}
+
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
 			  bool source,
 			  struct list_head *head_config)
 {
-	struct bpf_object *obj;
 	int err;
+	struct bpf_object *obj;
+	LIST_HEAD(obj_head_config);
+
+	if (head_config)
+		split_bpf_config_terms(head_config, &obj_head_config);
 
 	obj = bpf__prepare_load(bpf_file_name, source);
 	if (IS_ERR(obj)) {
@@ -749,10 +785,18 @@ int parse_events_load_bpf(struct parse_events_evlist *data,
 		return err;
 	}
 
-	err = parse_events_load_bpf_obj(data, list, obj);
+	err = parse_events_load_bpf_obj(data, list, obj, head_config);
 	if (err)
 		return err;
-	return parse_events_config_bpf(data, obj, head_config);
+	err = parse_events_config_bpf(data, obj, &obj_head_config);
+
+	/*
+	 * Caller doesn't know anything about obj_head_config,
+	 * so combine them together again before returnning.
+	 */
+	if (head_config)
+		list_splice_tail(&obj_head_config, head_config);
+	return err;
 }
 
 static int
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index e445622..67e4930 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -146,7 +146,8 @@ int parse_events_load_bpf(struct parse_events_evlist *data,
 struct bpf_object;
 int parse_events_load_bpf_obj(struct parse_events_evlist *data,
 			      struct list_head *list,
-			      struct bpf_object *obj);
+			      struct bpf_object *obj,
+			      struct list_head *head_config);
 int parse_events_add_numeric(struct parse_events_evlist *data,
 			     struct list_head *list,
 			     u32 type, u64 config,
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 17/55] perf tools: Introduce bpf-output event
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (15 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 16/55] perf tools: Pass tracepoint options to BPF script Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 18/55] perf data: Support converting data from bpf_perf_event_output() Wang Nan
                   ` (37 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Commit a43eec304259a6c637f4014a6d4767159b6a3aa3 (bpf: introduce
bpf_perf_event_output() helper) add a helper to enable BPF program
output data to perf ring buffer through a new type of perf event
PERF_COUNT_SW_BPF_OUTPUT. This patch enable perf to create perf
event of that type. Now perf user can use following cmdline to
receive output data from BPF programs:

 # ./perf record -a -e bpf-output/no-inherit,name=evt/ \
                    -e ./test_bpf_output.c/maps:channel.event=evt/ ls /
 # ./perf script
            perf  1560 [004] 347747.086295:                       evt:  ffffffff811fd201 sys_write ...
            perf  1560 [004] 347747.086300:                       evt:  ffffffff811fd201 sys_write ...
            perf  1560 [004] 347747.086315:                       evt:  ffffffff811fd201 sys_write ...
            ...

Test result:
 # cat ./test_bpf_output.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };

 #define SEC(NAME) __attribute__((section(NAME), used))
 static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
 };

 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
 	struct {
 		u64 ktime;
 		int cpuid;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output: %d\n";

 	output_data.cpuid = get_smp_processor_id();
 	output_data.ktime = ktime_get_ns();
 	int err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				    &output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data), err);
 	return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************ END ***************************/

 # ./perf record -a -e bpf-output/no-inherit,name=evt/ \
                    -e ./test_bpf_output.c/maps:channel.event=evt/ ls /
 # ./perf script | grep ls
          ls  2242 [003] 347851.557563:                       evt:  ffffffff811fd201 sys_write ...
          ls  2242 [003] 347851.557571:                       evt:  ffffffff811fd201 sys_write ...

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 5 ++---
 tools/perf/util/evsel.c        | 5 +++++
 tools/perf/util/evsel.h        | 8 ++++++++
 tools/perf/util/parse-events.l | 1 +
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index b6526b0..94f5b89 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -1331,13 +1331,12 @@ apply_config_evsel_for_key(const char *name, int map_fd, void *pkey,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel))
+		check_pass = true;
 	if (attr->type == PERF_TYPE_RAW)
 		check_pass = true;
 	if (attr->type == PERF_TYPE_HARDWARE)
 		check_pass = true;
-	if (attr->type == PERF_TYPE_SOFTWARE &&
-			attr->config == PERF_COUNT_SW_BPF_OUTPUT)
-		check_pass = true;
 	if (!check_pass) {
 		pr_debug("ERROR: Event type is wrong for map %s\n", name);
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6ae20d0..0902fe4 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -225,6 +225,11 @@ struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
 	if (evsel != NULL)
 		perf_evsel__init(evsel, attr, idx);
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		evsel->attr.sample_type |= PERF_SAMPLE_RAW;
+		evsel->attr.sample_period = 1;
+	}
+
 	return evsel;
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8e75434..efad78f 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -364,6 +364,14 @@ static inline bool perf_evsel__is_function_event(struct perf_evsel *evsel)
 #undef FUNCTION_EVENT
 }
 
+static inline bool perf_evsel__is_bpf_output(struct perf_evsel *evsel)
+{
+	struct perf_event_attr *attr = &evsel->attr;
+
+	return (attr->config == PERF_COUNT_SW_BPF_OUTPUT) &&
+		(attr->type == PERF_TYPE_SOFTWARE);
+}
+
 struct perf_attr_details {
 	bool freq;
 	bool verbose;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index fb85d03..1477fbc 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -248,6 +248,7 @@ cpu-migrations|migrations			{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COU
 alignment-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
 emulation-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
 dummy						{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
+bpf-output					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
 
 	/*
 	 * We have to handle the kernel PMU event cycles-ct/cycles-t/mem-loads/mem-stores separately.
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 18/55] perf data: Support converting data from bpf_perf_event_output()
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (16 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 17/55] perf tools: Introduce bpf-output event Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 19/55] perf data: Explicitly set byte order for integer types Wang Nan
                   ` (36 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

bpf_perf_event_output() outputs data through sample->raw_data. This
patch adds support to convert those data into CTF. A python script
then can be used to process output data from BPF programs.

Test result:

 # cat ./test_bpf_output_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };
 #define SEC(NAME) __attribute__((section(NAME), used))
 static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
 };

 static inline int __attribute__((always_inline))
 func(void *ctx, int type)
 {
 	struct {
 		u64 ktime;
 		int type;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output\n";
 	int err;

 	output_data.type = type;
 	output_data.ktime = ktime_get_ns();
 	err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				&output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data));
 	return 0;
 }
 SEC("func_begin=sys_nanosleep")
 int func_begin(void *ctx) {return func(ctx, 1);}
 SEC("func_end=sys_nanosleep%return")
 int func_end(void *ctx) { return func(ctx, 2);}
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 # ./perf record -e bpf-output/no-inherit,name=evt/ \
                 -e ./test_bpf_output_2.c/maps:channel.event=evt/ \
                 usleep 100000
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]

 # ./perf script
          usleep 14942 92503.198504: evt:  ffffffff810e0ba1 sys_nanosleep (/lib/modules/4.3.0....
          usleep 14942 92503.298562: evt:  ffffffff810585e9 kretprobe_trampoline_holder (/lib....

 # ./perf data convert --to-ctf ./out.ctf
 [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
 [ perf data convert: Converted and wrote 0.000 MB (2 samples) ]

 # babeltrace ./out.ctf
 [01:41:43.198504134] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E0BA1, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x32C0C07B, [1] = 0x5421, [2] = 0x1 ] }
 [01:41:43.298562257] (+0.100058123) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810585E9, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x38B77FAA, [1] = 0x5421, [2] = 0x2 ] }

 # cat ./test_bpf_output_2.py
 from babeltrace import TraceCollection
 tc = TraceCollection()
 tc.add_trace('./out.ctf', 'ctf')
 d = {1:[], 2:[]}
 for event in tc.events:
     if not event.name.startswith('evt'):
         continue
     raw_data = event['raw_data']
     (time, type) = ((raw_data[0] + (raw_data[1] << 32)), raw_data[2])
     d[type].append(time)
 print(list(map(lambda i: d[2][i] - d[1][i], range(len(d[1])))));

 # python3 ./test_bpf_output_2.py
 [100056879]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data-convert-bt.c | 112 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 111 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index b722e57..70f462d 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -352,6 +352,84 @@ static int add_tracepoint_values(struct ctf_writer *cw,
 	return ret;
 }
 
+static int
+add_bpf_output_values(struct bt_ctf_event_class *event_class,
+		      struct bt_ctf_event *event,
+		      struct perf_sample *sample)
+{
+	struct bt_ctf_field_type *len_type, *seq_type;
+	struct bt_ctf_field *len_field, *seq_field;
+	unsigned int raw_size = sample->raw_size;
+	unsigned int nr_elements = raw_size / sizeof(u32);
+	unsigned int i;
+	int ret;
+
+	if (nr_elements * sizeof(u32) != raw_size)
+		pr_warning("Incorrect raw_size (%u) in bpf output event, skip %lu bytes\n",
+			   raw_size, nr_elements * sizeof(u32) - raw_size);
+
+	len_type = bt_ctf_event_class_get_field_by_name(event_class, "raw_len");
+	len_field = bt_ctf_field_create(len_type);
+	if (!len_field) {
+		pr_err("failed to create 'raw_len' for bpf output event\n");
+		ret = -1;
+		goto put_len_type;
+	}
+
+	ret = bt_ctf_field_unsigned_integer_set_value(len_field, nr_elements);
+	if (ret) {
+		pr_err("failed to set field value for raw_len\n");
+		goto put_len_field;
+	}
+	ret = bt_ctf_event_set_payload(event, "raw_len", len_field);
+	if (ret) {
+		pr_err("failed to set payload to raw_len\n");
+		goto put_len_field;
+	}
+
+	seq_type = bt_ctf_event_class_get_field_by_name(event_class, "raw_data");
+	seq_field = bt_ctf_field_create(seq_type);
+	if (!seq_field) {
+		pr_err("failed to create 'raw_data' for bpf output event\n");
+		ret = -1;
+		goto put_seq_type;
+	}
+
+	ret = bt_ctf_field_sequence_set_length(seq_field, len_field);
+	if (ret) {
+		pr_err("failed to set length of 'raw_data'\n");
+		goto put_seq_field;
+	}
+
+	for (i = 0; i < nr_elements; i++) {
+		struct bt_ctf_field *elem_field =
+			bt_ctf_field_sequence_get_field(seq_field, i);
+
+		ret = bt_ctf_field_unsigned_integer_set_value(elem_field,
+				((u32 *)(sample->raw_data))[i]);
+
+		bt_ctf_field_put(elem_field);
+		if (ret) {
+			pr_err("failed to set raw_data[%d]\n", i);
+			goto put_seq_field;
+		}
+	}
+
+	ret = bt_ctf_event_set_payload(event, "raw_data", seq_field);
+	if (ret)
+		pr_err("failed to set payload for raw_data\n");
+
+put_seq_field:
+	bt_ctf_field_put(seq_field);
+put_seq_type:
+	bt_ctf_field_type_put(seq_type);
+put_len_field:
+	bt_ctf_field_put(len_field);
+put_len_type:
+	bt_ctf_field_type_put(len_type);
+	return ret;
+}
+
 static int add_generic_values(struct ctf_writer *cw,
 			      struct bt_ctf_event *event,
 			      struct perf_evsel *evsel,
@@ -597,6 +675,12 @@ static int process_sample_event(struct perf_tool *tool,
 			return -1;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		ret = add_bpf_output_values(event_class, event, sample);
+		if (ret)
+			return -1;
+	}
+
 	cs = ctf_stream(cw, get_sample_cpu(cw, sample, evsel));
 	if (cs) {
 		if (is_flush_needed(cs))
@@ -744,6 +828,25 @@ static int add_tracepoint_types(struct ctf_writer *cw,
 	return ret;
 }
 
+static int add_bpf_output_types(struct ctf_writer *cw,
+				struct bt_ctf_event_class *class)
+{
+	struct bt_ctf_field_type *len_type = cw->data.u32;
+	struct bt_ctf_field_type *seq_base_type = cw->data.u32_hex;
+	struct bt_ctf_field_type *seq_type;
+	int ret;
+
+	ret = bt_ctf_event_class_add_field(class, len_type, "raw_len");
+	if (ret)
+		return ret;
+
+	seq_type = bt_ctf_field_type_sequence_create(seq_base_type, "raw_len");
+	if (!seq_type)
+		return -1;
+
+	return bt_ctf_event_class_add_field(class, seq_type, "raw_data");
+}
+
 static int add_generic_types(struct ctf_writer *cw, struct perf_evsel *evsel,
 			     struct bt_ctf_event_class *event_class)
 {
@@ -755,7 +858,8 @@ static int add_generic_types(struct ctf_writer *cw, struct perf_evsel *evsel,
 	 *                              ctf event header
 	 *   PERF_SAMPLE_READ         - TODO
 	 *   PERF_SAMPLE_CALLCHAIN    - TODO
-	 *   PERF_SAMPLE_RAW          - tracepoint fields are handled separately
+	 *   PERF_SAMPLE_RAW          - tracepoint fields and BPF output
+	 *                              are handled separately
 	 *   PERF_SAMPLE_BRANCH_STACK - TODO
 	 *   PERF_SAMPLE_REGS_USER    - TODO
 	 *   PERF_SAMPLE_STACK_USER   - TODO
@@ -824,6 +928,12 @@ static int add_event(struct ctf_writer *cw, struct perf_evsel *evsel)
 			goto err;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		ret = add_bpf_output_types(cw, event_class);
+		if (ret)
+			goto err;
+	}
+
 	ret = bt_ctf_stream_class_add_event_class(cw->stream_class, event_class);
 	if (ret) {
 		pr("Failed to add event class into stream.\n");
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 19/55] perf data: Explicitly set byte order for integer types
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (17 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 18/55] perf data: Support converting data from bpf_perf_event_output() Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 20/55] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
                   ` (35 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

After babeltrace commit 5cec03e402aa ("ir: copy variants and
sequences when setting a field path"), 'perf data convert' get
incorrect result if there's bpf output data. For example:

 # perf data convert --to-ctf ./out.ctf
 # babeltrace ./out.ctf
 [10:44:31.186045346] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E7DD1, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0xC028E32F, [1] = 0x815D0100, [2] = 0x1000000 ] }
 [10:44:31.286101003] (+0.100055657) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF8105B609, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0x35D9F1EB, [1] = 0x15D81, [2] = 0x2 ] }

The expected result of the first sample should be:

 raw_data = [ [0] = 0x2FE328C0, [1] = 0x15D81, [2] = 0x1 ] }

however, 'perf data convert' output big endian value to resuling CTF
file.

The reason is a internal change (or a bug?) of babeltrace.

Before this patch, at the first add_bpf_output_values(), byte order of
all integer type is uncertain (is 0, neither 1234 (le) nor 4321 (be)).
It would be fixed by:

perf_evlist__deliver_sample
 -> process_sample_event
   -> ctf_stream
      ...
      ->bt_ctf_trace_add_stream_class
        ->bt_ctf_field_type_structure_set_byte_order
          ->bt_ctf_field_type_integer_set_byte_order

during creating the stream.

However, the babeltrace commit mentioned above duplicates types in
sequence to prevent potential conflict in following call stack and
link the newly allocated type into the 'raw_data' sequence:

perf_evlist__deliver_sample
 -> process_sample_event
   -> ctf_stream
      ...
      -> bt_ctf_trace_add_stream_class
        -> bt_ctf_stream_class_resolve_types
           ...
           -> bt_ctf_field_type_sequence_copy
             ->bt_ctf_field_type_integer_copy

This happens before byte order setting, so only the newly allocated
type is initialized, the byte order of original type perf choose to
create the first raw_data is still uncertain.

Byte order in CTF output is not related to byte order in perf.data.
Setting it to anything other than BT_CTF_BYTE_ORDER_NATIVE solves this
problem (only BT_CTF_BYTE_ORDER_NATIVE needs to be fixed). To reduce
behavior changing, set byte order according to compiling options.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data-convert-bt.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 70f462d..3f723f4a 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -1080,6 +1080,12 @@ static struct bt_ctf_field_type *create_int_type(int size, bool sign, bool hex)
 	    bt_ctf_field_type_integer_set_base(type, BT_CTF_INTEGER_BASE_HEXADECIMAL))
 		goto err;
 
+#if __BYTE_ORDER == __BIG_ENDIAN
+	bt_ctf_field_type_set_byte_order(type, BT_CTF_BYTE_ORDER_BIG_ENDIAN);
+#else
+	bt_ctf_field_type_set_byte_order(type, BT_CTF_BYTE_ORDER_LITTLE_ENDIAN);
+#endif
+
 	pr2("Created type: INTEGER %d-bit %ssigned %s\n",
 	    size, sign ? "un" : "", hex ? "hex" : "");
 	return type;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 20/55] perf core: Introduce new ioctl options to pause and resume ring buffer
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (18 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 19/55] perf data: Explicitly set byte order for integer types Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 21/55] perf core: Set event's default overflow_handler Wang Nan
                   ` (34 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Add new ioctl() to pause/resume ring-buffer output.

In some situations we want to read from ring buffer only when we
ensure nothing can write to the ring buffer during reading. Without
this patch we have to turn off all events attached to this ring buffer
to achieve this.

This patch is for supporting overwrite ring buffer. Following
commits will introduce new methods support reading from overwrite ring
buffer. Before reading caller must ensure the ring buffer is frozen, or
the reading is unreliable.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 13 +++++++++++++
 kernel/events/internal.h        | 11 +++++++++++
 kernel/events/ring_buffer.c     |  7 ++++++-
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 1afe962..a3c1903 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -401,6 +401,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
 #define PERF_EVENT_IOC_ID		_IOR('$', 7, __u64 *)
 #define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, __u32)
+#define PERF_EVENT_IOC_PAUSE_OUTPUT	_IOW('$', 9, __u32)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 94c47e3..a7075ae 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4231,6 +4231,19 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 	case PERF_EVENT_IOC_SET_BPF:
 		return perf_event_set_bpf_prog(event, arg);
 
+	case PERF_EVENT_IOC_PAUSE_OUTPUT: {
+		struct ring_buffer *rb;
+
+		rcu_read_lock();
+		rb = rcu_dereference(event->rb);
+		if (!event->rb) {
+			rcu_read_unlock();
+			return -EINVAL;
+		}
+		rb_toggle_paused(rb, !!arg);
+		rcu_read_unlock();
+		return 0;
+	}
 	default:
 		return -ENOTTY;
 	}
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 2bbad9c..6a93d1b 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -18,6 +18,7 @@ struct ring_buffer {
 #endif
 	int				nr_pages;	/* nr of data pages  */
 	int				overwrite;	/* can overwrite itself */
+	int				paused;		/* can write into ring buffer */
 
 	atomic_t			poll;		/* POLL_ for wakeups */
 
@@ -65,6 +66,16 @@ static inline void rb_free_rcu(struct rcu_head *rcu_head)
 	rb_free(rb);
 }
 
+static inline void
+rb_toggle_paused(struct ring_buffer *rb,
+		 bool pause)
+{
+	if (!pause && rb->nr_pages)
+		rb->paused = 0;
+	else
+		rb->paused = 1;
+}
+
 extern struct ring_buffer *
 rb_alloc(int nr_pages, long watermark, int cpu, int flags);
 extern void perf_event_wakeup(struct perf_event *event);
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 1faad2c..22e1a47 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -125,8 +125,11 @@ int perf_output_begin(struct perf_output_handle *handle,
 	if (unlikely(!rb))
 		goto out;
 
-	if (unlikely(!rb->nr_pages))
+	if (unlikely(rb->paused)) {
+		if (rb->nr_pages)
+			local_inc(&rb->lost);
 		goto out;
+	}
 
 	handle->rb    = rb;
 	handle->event = event;
@@ -244,6 +247,8 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
 	INIT_LIST_HEAD(&rb->event_list);
 	spin_lock_init(&rb->event_lock);
 	init_irq_work(&rb->irq_work, rb_irq_work);
+
+	rb->paused = rb->nr_pages ? 0 : 1;
 }
 
 static void ring_buffer_put_async(struct ring_buffer *rb)
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 21/55] perf core: Set event's default overflow_handler
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (19 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 20/55] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 22/55] perf core: Prepare writing into ring buffer from end Wang Nan
                   ` (33 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Set a default event->overflow_handler in perf_event_alloc() so don't
need checking event->overflow_handler in __perf_event_overflow().
Following commits can give a different default overflow_handler.

No extra performance introduced into hot path because in the original
code we still need reading this handler from memory. A conditional branch
is avoided so actually we remove some instructions.

Initial idea comes from Peter at [1].

[1] http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 kernel/events/core.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a7075ae..ae34061 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6392,10 +6392,7 @@ static int __perf_event_overflow(struct perf_event *event,
 		irq_work_queue(&event->pending);
 	}
 
-	if (event->overflow_handler)
-		event->overflow_handler(event, data, regs);
-	else
-		perf_event_output(event, data, regs);
+	event->overflow_handler(event, data, regs);
 
 	if (*perf_event_fasync(event) && event->pending_kill) {
 		event->pending_wakeup = 1;
@@ -7868,8 +7865,13 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		context = parent_event->overflow_handler_context;
 	}
 
-	event->overflow_handler	= overflow_handler;
-	event->overflow_handler_context = context;
+	if (overflow_handler) {
+		event->overflow_handler	= overflow_handler;
+		event->overflow_handler_context = context;
+	} else {
+		event->overflow_handler = perf_event_output;
+		event->overflow_handler_context = NULL;
+	}
 
 	perf_event__state_init(event);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 22/55] perf core: Prepare writing into ring buffer from end
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (20 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 21/55] perf core: Set event's default overflow_handler Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 23/55] perf core: Add backward attribute to perf event Wang Nan
                   ` (32 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Convert perf_output_begin to __perf_output_begin and make the later
function able to write records from the end of the ring buffer.
Following commits will utilize the 'backward' flag.

This patch doesn't introduce any extra performance overhead since we
use always_inline.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 kernel/events/ring_buffer.c | 42 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 22e1a47..37c11c6 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -102,8 +102,21 @@ out:
 	preempt_enable();
 }
 
-int perf_output_begin(struct perf_output_handle *handle,
-		      struct perf_event *event, unsigned int size)
+static bool __always_inline
+ring_buffer_has_space(unsigned long head, unsigned long tail,
+		      unsigned long data_size, unsigned int size,
+		      bool backward)
+{
+	if (!backward)
+		return CIRC_SPACE(head, tail, data_size) >= size;
+	else
+		return CIRC_SPACE(tail, head, data_size) >= size;
+}
+
+static int __always_inline
+__perf_output_begin(struct perf_output_handle *handle,
+		    struct perf_event *event, unsigned int size,
+		    bool backward)
 {
 	struct ring_buffer *rb;
 	unsigned long tail, offset, head;
@@ -146,9 +159,12 @@ int perf_output_begin(struct perf_output_handle *handle,
 	do {
 		tail = READ_ONCE(rb->user_page->data_tail);
 		offset = head = local_read(&rb->head);
-		if (!rb->overwrite &&
-		    unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))
-			goto fail;
+		if (!rb->overwrite) {
+			if (unlikely(!ring_buffer_has_space(head, tail,
+							    perf_data_size(rb),
+							    size, backward)))
+				goto fail;
+		}
 
 		/*
 		 * The above forms a control dependency barrier separating the
@@ -162,9 +178,17 @@ int perf_output_begin(struct perf_output_handle *handle,
 		 * See perf_output_put_handle().
 		 */
 
-		head += size;
+		if (!backward)
+			head += size;
+		else
+			head -= size;
 	} while (local_cmpxchg(&rb->head, offset, head) != offset);
 
+	if (backward) {
+		offset = head;
+		head = (u64)(-head);
+	}
+
 	/*
 	 * We rely on the implied barrier() by local_cmpxchg() to ensure
 	 * none of the data stores below can be lifted up by the compiler.
@@ -206,6 +230,12 @@ out:
 	return -ENOSPC;
 }
 
+int perf_output_begin(struct perf_output_handle *handle,
+		      struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, false);
+}
+
 unsigned int perf_output_copy(struct perf_output_handle *handle,
 		      const void *buf, unsigned int len)
 {
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 23/55] perf core: Add backward attribute to perf event
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (21 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 22/55] perf core: Prepare writing into ring buffer from end Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 24/55] perf core: Reduce perf event output overhead by new overflow handler Wang Nan
                   ` (31 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

In perf_event_attr a new bit 'write_backward' is appended to indicate
this event should write ring buffer from its end to beginning.

In perf_output_begin(), prepare ring buffer according this bit.

This patch introduces small overhead into perf_output_begin():
an extra memory read and a conditional branch. Further patch can remove
this overhead by using custom output handler.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/linux/perf_event.h      | 5 +++++
 include/uapi/linux/perf_event.h | 3 ++-
 kernel/events/core.c            | 7 +++++++
 kernel/events/ring_buffer.c     | 2 ++
 4 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index b35a61a..0ce1015 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1029,6 +1029,11 @@ static inline bool has_aux(struct perf_event *event)
 	return event->pmu->setup_aux;
 }
 
+static inline bool is_write_backward(struct perf_event *event)
+{
+	return !!event->attr.write_backward;
+}
+
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
 extern void perf_output_end(struct perf_output_handle *handle);
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index a3c1903..43fc8d2 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -340,7 +340,8 @@ struct perf_event_attr {
 				comm_exec      :  1, /* flag comm events that are due to an exec */
 				use_clockid    :  1, /* use @clockid for time fields */
 				context_switch :  1, /* context switch data */
-				__reserved_1   : 37;
+				write_backward :  1, /* Write ring buffer from end to beginning */
+				__reserved_1   : 36;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index ae34061..9353154 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8101,6 +8101,13 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 		goto out;
 
 	/*
+	 * Either writing ring buffer from beginning or from end.
+	 * Mixing is not allowed.
+	 */
+	if (is_write_backward(output_event) != is_write_backward(event))
+		goto out;
+
+	/*
 	 * If both events generate aux data, they must be on the same PMU
 	 */
 	if (has_aux(event) && has_aux(output_event) &&
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 37c11c6..80b1fa7 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -233,6 +233,8 @@ out:
 int perf_output_begin(struct perf_output_handle *handle,
 		      struct perf_event *event, unsigned int size)
 {
+	if (unlikely(is_write_backward(event)))
+		return __perf_output_begin(handle, event, size, true);
 	return __perf_output_begin(handle, event, size, false);
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 24/55] perf core: Reduce perf event output overhead by new overflow handler
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (22 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 23/55] perf core: Add backward attribute to perf event Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 25/55] perf tools: Only validate is_pos for tracking evsels Wang Nan
                   ` (30 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

By creating onward and backward specific overflow handlers and setting
them according to event's backward setting, normal sampling events
don't need checking backward setting of an event any more.

This is the last patch of backward writing patchset. After this patch,
there's no extra overhead introduced to the fast path of sampling
output.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/linux/perf_event.h  | 17 +++++++++++++++--
 kernel/events/core.c        | 41 ++++++++++++++++++++++++++++++++++++-----
 kernel/events/ring_buffer.c | 12 ++++++++++++
 3 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0ce1015..e466cc6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -827,9 +827,15 @@ extern int perf_event_overflow(struct perf_event *event,
 				 struct perf_sample_data *data,
 				 struct pt_regs *regs);
 
+extern void perf_event_output_onward(struct perf_event *event,
+				     struct perf_sample_data *data,
+				     struct pt_regs *regs);
+extern void perf_event_output_backward(struct perf_event *event,
+				       struct perf_sample_data *data,
+				       struct pt_regs *regs);
 extern void perf_event_output(struct perf_event *event,
-				struct perf_sample_data *data,
-				struct pt_regs *regs);
+			      struct perf_sample_data *data,
+			      struct pt_regs *regs);
 
 extern void
 perf_event_header__init_id(struct perf_event_header *header,
@@ -1036,6 +1042,13 @@ static inline bool is_write_backward(struct perf_event *event)
 
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
+extern int perf_output_begin_onward(struct perf_output_handle *handle,
+				    struct perf_event *event,
+				    unsigned int size);
+extern int perf_output_begin_backward(struct perf_output_handle *handle,
+				      struct perf_event *event,
+				      unsigned int size);
+
 extern void perf_output_end(struct perf_output_handle *handle);
 extern unsigned int perf_output_copy(struct perf_output_handle *handle,
 			     const void *buf, unsigned int len);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9353154..ce70f54 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5531,9 +5531,13 @@ void perf_prepare_sample(struct perf_event_header *header,
 	}
 }
 
-void perf_event_output(struct perf_event *event,
-			struct perf_sample_data *data,
-			struct pt_regs *regs)
+static void __always_inline
+__perf_event_output(struct perf_event *event,
+		    struct perf_sample_data *data,
+		    struct pt_regs *regs,
+		    int (*output_begin)(struct perf_output_handle *,
+					struct perf_event *,
+					unsigned int))
 {
 	struct perf_output_handle handle;
 	struct perf_event_header header;
@@ -5543,7 +5547,7 @@ void perf_event_output(struct perf_event *event,
 
 	perf_prepare_sample(&header, data, event, regs);
 
-	if (perf_output_begin(&handle, event, header.size))
+	if (output_begin(&handle, event, header.size))
 		goto exit;
 
 	perf_output_sample(&handle, &header, data, event);
@@ -5554,6 +5558,30 @@ exit:
 	rcu_read_unlock();
 }
 
+void
+perf_event_output_onward(struct perf_event *event,
+			 struct perf_sample_data *data,
+			 struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin_onward);
+}
+
+void
+perf_event_output_backward(struct perf_event *event,
+			   struct perf_sample_data *data,
+			   struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin_backward);
+}
+
+void
+perf_event_output(struct perf_event *event,
+		  struct perf_sample_data *data,
+		  struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin);
+}
+
 /*
  * read event_id
  */
@@ -7868,8 +7896,11 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	if (overflow_handler) {
 		event->overflow_handler	= overflow_handler;
 		event->overflow_handler_context = context;
+	} else if (is_write_backward(event)){
+		event->overflow_handler = perf_event_output_backward;
+		event->overflow_handler_context = NULL;
 	} else {
-		event->overflow_handler = perf_event_output;
+		event->overflow_handler = perf_event_output_onward;
 		event->overflow_handler_context = NULL;
 	}
 
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 80b1fa7..7e30e012 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -230,6 +230,18 @@ out:
 	return -ENOSPC;
 }
 
+int perf_output_begin_onward(struct perf_output_handle *handle,
+			     struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, false);
+}
+
+int perf_output_begin_backward(struct perf_output_handle *handle,
+			       struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, true);
+}
+
 int perf_output_begin(struct perf_output_handle *handle,
 		      struct perf_event *event, unsigned int size)
 {
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 25/55] perf tools: Only validate is_pos for tracking evsels
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (23 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 24/55] perf core: Reduce perf event output overhead by new overflow handler Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 26/55] perf tools: Print write_backward value in perf_event_attr__fprintf Wang Nan
                   ` (29 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

is_pos only useful for tracking events (fork, mmap, exit, ...).
Perf collects those events through evsel with 'tracking' set.
Therefore, there's no need to validate every is_pos against
evlist->is_pos.

This patch is required after perf support PERF_SAMPLE_TAILSIZE.
Since there an extra u64 at the end of this type of evsels, is_pos
for evsel with PERF_SAMPLE_TAILSIZE setting is different from other
evsels.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c42e196..fef465a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1274,8 +1274,15 @@ bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
 		return false;
 
 	evlist__for_each(evlist, pos) {
-		if (pos->id_pos != evlist->id_pos ||
-		    pos->is_pos != evlist->is_pos)
+		if (pos->id_pos != evlist->id_pos)
+			return false;
+		/*
+		 * Only tracking events needs is_pos. Those events are
+		 * collected if evsel->tracking is selected.
+		 * For other evsel, is_pos is useless for other evsels,
+		 * so skip validating them.
+		 */
+		if (pos->tracking && pos->is_pos != evlist->is_pos)
 			return false;
 	}
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 26/55] perf tools: Print write_backward value in perf_event_attr__fprintf
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (24 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 25/55] perf tools: Only validate is_pos for tracking evsels Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 27/55] perf tools: Make ordered_events reusable Wang Nan
                   ` (28 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Print write_backward setting when printing perf evsel.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evsel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0902fe4..510afa4 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1299,6 +1299,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(comm_exec, p_unsigned);
 	PRINT_ATTRf(use_clockid, p_unsigned);
 	PRINT_ATTRf(context_switch, p_unsigned);
+	PRINT_ATTRf(write_backward, p_unsigned);
 
 	PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned);
 	PRINT_ATTRf(bp_type, p_unsigned);
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 27/55] perf tools: Make ordered_events reusable
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (25 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 26/55] perf tools: Print write_backward value in perf_event_attr__fprintf Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 28/55] perf record: Extract synthesize code to record__synthesize() Wang Nan
                   ` (27 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

ordered_events__free() leaves linked lists and timestamps not cleared,
so unable to be reused after ordered_events__free(). Which is inconvenient
after 'perf record' supports generating multiple perf.data output and
process build-ids for each of them.

Calls ordered_events__init() in ordered_events__free() so ordered_events
can be reused.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/ordered-events.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
index b1b9e23..70c0dc8 100644
--- a/tools/perf/util/ordered-events.c
+++ b/tools/perf/util/ordered-events.c
@@ -299,6 +299,8 @@ void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t d
 
 void ordered_events__free(struct ordered_events *oe)
 {
+	ordered_events__deliver_t old_deliver = oe->deliver;
+
 	while (!list_empty(&oe->to_free)) {
 		struct ordered_event *event;
 
@@ -307,4 +309,7 @@ void ordered_events__free(struct ordered_events *oe)
 		free_dup_event(oe, event->event);
 		free(event);
 	}
+
+	memset(oe, '\0', sizeof(*oe));
+	ordered_events__init(oe, old_deliver);
 }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 28/55] perf record: Extract synthesize code to record__synthesize()
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (26 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 27/55] perf tools: Make ordered_events reusable Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 29/55] perf tools: Add perf_data_file__switch() helper Wang Nan
                   ` (26 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Create record__synthesize(). It can be used to create tracking events
for each perf.data after perf supporting splitting into multiple
outputs.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 132 +++++++++++++++++++++++++-------------------
 1 file changed, 76 insertions(+), 56 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7d11162..4633c0a 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -485,6 +485,81 @@ static void workload_exec_failed_signal(int signo __maybe_unused,
 
 static void snapshot_sig_handler(int sig);
 
+static int record__synthesize(struct record *rec)
+{
+	struct perf_session *session = rec->session;
+	struct machine *machine = &session->machines.host;
+	struct perf_data_file *file = &rec->file;
+	struct record_opts *opts = &rec->opts;
+	struct perf_tool *tool = &rec->tool;
+	int fd = perf_data_file__fd(file);
+	int err = 0;
+	static bool warned_kmaps = false, warned_modules = false;
+
+	if (file->is_pipe) {
+		err = perf_event__synthesize_attrs(tool, session,
+						   process_synthesized_event);
+		if (err < 0) {
+			pr_err("Couldn't synthesize attrs.\n");
+			goto out;
+		}
+
+		if (have_tracepoints(&rec->evlist->entries)) {
+			/*
+			 * FIXME err <= 0 here actually means that
+			 * there were no tracepoints so its not really
+			 * an error, just that we don't need to
+			 * synthesize anything.  We really have to
+			 * return this more properly and also
+			 * propagate errors that now are calling die()
+			 */
+			err = perf_event__synthesize_tracing_data(tool,	fd, rec->evlist,
+								  process_synthesized_event);
+			if (err <= 0) {
+				pr_err("Couldn't record tracing data.\n");
+				goto out;
+			}
+			rec->bytes_written += err;
+		}
+	}
+
+	if (rec->opts.full_auxtrace) {
+		err = perf_event__synthesize_auxtrace_info(rec->itr, tool,
+					session, process_synthesized_event);
+		if (err)
+			goto out;
+	}
+
+	err = perf_event__synthesize_kernel_mmap(tool, process_synthesized_event,
+						 machine);
+	if (err < 0 && !warned_kmaps) {
+		warned_kmaps = true;
+		pr_err("Couldn't record kernel reference relocation symbol\n"
+		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
+		       "Check /proc/kallsyms permission or run as root.\n");
+	}
+
+	err = perf_event__synthesize_modules(tool, process_synthesized_event,
+					     machine);
+	if (err < 0 && !warned_modules) {
+		warned_modules = true;
+		pr_err("Couldn't record kernel module information.\n"
+		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
+		       "Check /proc/modules permission or run as root.\n");
+	}
+
+	if (perf_guest) {
+		machines__process_guests(&session->machines,
+					 perf_event__synthesize_guest_os, tool);
+	}
+
+	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
+					    process_synthesized_event, opts->sample_address,
+					    opts->proc_map_timeout);
+out:
+	return err;
+}
+
 static int __cmd_record(struct record *rec, int argc, const char **argv)
 {
 	int err;
@@ -579,63 +654,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	if (file->is_pipe) {
-		err = perf_event__synthesize_attrs(tool, session,
-						   process_synthesized_event);
-		if (err < 0) {
-			pr_err("Couldn't synthesize attrs.\n");
-			goto out_child;
-		}
-
-		if (have_tracepoints(&rec->evlist->entries)) {
-			/*
-			 * FIXME err <= 0 here actually means that
-			 * there were no tracepoints so its not really
-			 * an error, just that we don't need to
-			 * synthesize anything.  We really have to
-			 * return this more properly and also
-			 * propagate errors that now are calling die()
-			 */
-			err = perf_event__synthesize_tracing_data(tool,	fd, rec->evlist,
-								  process_synthesized_event);
-			if (err <= 0) {
-				pr_err("Couldn't record tracing data.\n");
-				goto out_child;
-			}
-			rec->bytes_written += err;
-		}
-	}
-
-	if (rec->opts.full_auxtrace) {
-		err = perf_event__synthesize_auxtrace_info(rec->itr, tool,
-					session, process_synthesized_event);
-		if (err)
-			goto out_delete_session;
-	}
-
-	err = perf_event__synthesize_kernel_mmap(tool, process_synthesized_event,
-						 machine);
-	if (err < 0)
-		pr_err("Couldn't record kernel reference relocation symbol\n"
-		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
-		       "Check /proc/kallsyms permission or run as root.\n");
-
-	err = perf_event__synthesize_modules(tool, process_synthesized_event,
-					     machine);
+	err = record__synthesize(rec);
 	if (err < 0)
-		pr_err("Couldn't record kernel module information.\n"
-		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
-		       "Check /proc/modules permission or run as root.\n");
-
-	if (perf_guest) {
-		machines__process_guests(&session->machines,
-					 perf_event__synthesize_guest_os, tool);
-	}
-
-	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
-					    process_synthesized_event, opts->sample_address,
-					    opts->proc_map_timeout);
-	if (err != 0)
 		goto out_child;
 
 	if (rec->realtime_prio) {
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 29/55] perf tools: Add perf_data_file__switch() helper
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (27 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 28/55] perf record: Extract synthesize code to record__synthesize() Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 30/55] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
                   ` (25 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

perf_data_file__switch() closes current output file, renames it, then
open a new one to continue record. It will be used by perf record
to split output into multiple perf.data files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data.c | 36 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/data.h | 11 ++++++++++-
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 1921942..bfded6a 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -136,3 +136,39 @@ ssize_t perf_data_file__write(struct perf_data_file *file,
 {
 	return writen(file->fd, buf, size);
 }
+
+int perf_data_file__switch(struct perf_data_file *file,
+			   const char *postfix,
+			   size_t pos, bool at_exit)
+{
+	char *new_filepath;
+	int ret;
+
+	if (check_pipe(file))
+		return -EINVAL;
+	if (perf_data_file__is_read(file))
+		return -EINVAL;
+
+	if (asprintf(&new_filepath, "%s.%s", file->path, postfix) < 0)
+		return -ENOMEM;
+
+	rename(file->path, new_filepath);
+
+	if (!at_exit) {
+		close(file->fd);
+		ret = perf_data_file__open(file);
+		if (ret < 0)
+			goto out;
+
+		if (lseek(file->fd, pos, SEEK_SET) == (off_t)-1) {
+			ret = -errno;
+			pr_debug("Failed to lseek to %zu: %s",
+				 pos, strerror(errno));
+			goto out;
+		}
+	}
+	ret = file->fd;
+out:
+	free(new_filepath);
+	return ret;
+}
diff --git a/tools/perf/util/data.h b/tools/perf/util/data.h
index 2b15d0c..ae510ce 100644
--- a/tools/perf/util/data.h
+++ b/tools/perf/util/data.h
@@ -46,5 +46,14 @@ int perf_data_file__open(struct perf_data_file *file);
 void perf_data_file__close(struct perf_data_file *file);
 ssize_t perf_data_file__write(struct perf_data_file *file,
 			      void *buf, size_t size);
-
+/*
+ * If at_exit is set, only rename current perf.data to
+ * perf.data.<postfix>, continue write on original file.
+ * Set at_exit when flushing the last output.
+ *
+ * Return value is fd of new output.
+ */
+int perf_data_file__switch(struct perf_data_file *file,
+			   const char *postfix,
+			   size_t pos, bool at_exit);
 #endif /* __PERF_DATA_H */
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 30/55] perf record: Turns auxtrace_snapshot_enable into 3 states
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (28 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 29/55] perf tools: Add perf_data_file__switch() helper Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 31/55] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
                   ` (24 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

auxtrace_snapshot_enable has only two states (0/1). Turns it into a
triple states enum so SIGUSR2 handler can safely do other works without
triggering auxtrace snapshot.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 59 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4633c0a..8caace3 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -123,7 +123,43 @@ out:
 static volatile int done;
 static volatile int signr = -1;
 static volatile int child_finished;
-static volatile int auxtrace_snapshot_enabled;
+
+static volatile enum {
+	AUXTRACE_SNAPSHOT_OFF = -1,
+	AUXTRACE_SNAPSHOT_DISABLED = 0,
+	AUXTRACE_SNAPSHOT_ENABLED = 1,
+} auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_OFF;
+
+static inline void
+auxtrace_snapshot_on(void)
+{
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_DISABLED;
+}
+
+static inline void
+auxtrace_snapshot_enable(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return;
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_ENABLED;
+}
+
+static inline void
+auxtrace_snapshot_disable(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return;
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_DISABLED;
+}
+
+static inline bool
+auxtrace_snapshot_is_enabled(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return false;
+	return auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_ENABLED;
+}
+
 static volatile int auxtrace_snapshot_err;
 static volatile int auxtrace_record__snapshot_started;
 
@@ -247,7 +283,7 @@ static void record__read_auxtrace_snapshot(struct record *rec)
 	} else {
 		auxtrace_snapshot_err = auxtrace_record__snapshot_finish(rec->itr);
 		if (!auxtrace_snapshot_err)
-			auxtrace_snapshot_enabled = 1;
+			auxtrace_snapshot_enable();
 	}
 }
 
@@ -580,10 +616,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
-	if (rec->opts.auxtrace_snapshot_mode)
+
+	if (rec->opts.auxtrace_snapshot_mode) {
 		signal(SIGUSR2, snapshot_sig_handler);
-	else
+		auxtrace_snapshot_on();
+	} else {
 		signal(SIGUSR2, SIG_IGN);
+	}
 
 	session = perf_session__new(file, false, tool);
 	if (session == NULL) {
@@ -709,12 +748,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		perf_evlist__enable(rec->evlist);
 	}
 
-	auxtrace_snapshot_enabled = 1;
+	auxtrace_snapshot_enable();
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
 		if (record__mmap_read_all(rec) < 0) {
-			auxtrace_snapshot_enabled = 0;
+			auxtrace_snapshot_disable();
 			err = -1;
 			goto out_child;
 		}
@@ -752,12 +791,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		 * disable events in this case.
 		 */
 		if (done && !disabled && !target__none(&opts->target)) {
-			auxtrace_snapshot_enabled = 0;
+			auxtrace_snapshot_disable();
 			perf_evlist__disable(rec->evlist);
 			disabled = true;
 		}
 	}
-	auxtrace_snapshot_enabled = 0;
+	auxtrace_snapshot_disable();
 
 	if (forks && workload_exec_errno) {
 		char msg[STRERR_BUFSIZE];
@@ -1325,9 +1364,9 @@ out_symbol_exit:
 
 static void snapshot_sig_handler(int sig __maybe_unused)
 {
-	if (!auxtrace_snapshot_enabled)
+	if (!auxtrace_snapshot_is_enabled())
 		return;
-	auxtrace_snapshot_enabled = 0;
+	auxtrace_snapshot_disable();
 	auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
 	auxtrace_record__snapshot_started = 1;
 }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 31/55] perf record: Introduce record__finish_output() to finish a perf.data
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (29 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 30/55] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 32/55] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
                   ` (23 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Move code for finalizing 'perf.data' to record__finish_output(). It
will be used by following commits to split output to multiple files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 37 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8caace3..2411c37 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -503,6 +503,29 @@ static void record__init_features(struct record *rec)
 	perf_header__clear_feat(&session->header, HEADER_STAT);
 }
 
+static void
+record__finish_output(struct record *rec)
+{
+	struct perf_data_file *file = &rec->file;
+	int fd = perf_data_file__fd(file);
+
+	if (file->is_pipe)
+		return;
+
+	rec->session->header.data_size += rec->bytes_written;
+	file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
+
+	if (!rec->no_buildid) {
+		process_buildids(rec);
+
+		if (rec->buildid_all)
+			dsos__hit_all(rec->session);
+	}
+	perf_session__write_header(rec->session, rec->evlist, fd, true);
+
+	return;
+}
+
 static volatile int workload_exec_errno;
 
 /*
@@ -830,18 +853,8 @@ out_child:
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
-	if (!err && !file->is_pipe) {
-		rec->session->header.data_size += rec->bytes_written;
-		file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
-
-		if (!rec->no_buildid) {
-			process_buildids(rec);
-
-			if (rec->buildid_all)
-				dsos__hit_all(rec->session);
-		}
-		perf_session__write_header(rec->session, rec->evlist, fd, true);
-	}
+	if (!err)
+		record__finish_output(rec);
 
 	if (!err && !quiet) {
 		char samples[128];
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 32/55] perf record: Add '--timestamp-filename' option to append timestamp to output filename
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (30 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 31/55] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 33/55] perf record: Split output into multiple files via '--switch-output' Wang Nan
                   ` (22 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This options append current timestamp to output. For example:

 # perf record -a --timestamp-filename
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2015122622265847 ]
 [ perf record: Captured and wrote 0.742 MB perf.data (90 samples) ]
 # ls
 perf.data.201512262226584

After 'perf record' support generating multiple output files, timestamp
would be useful to identify each of them.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 47 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 2411c37..e5714b6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -54,6 +54,7 @@ struct record {
 	bool			no_buildid_cache;
 	bool			no_buildid_cache_set;
 	bool			buildid_all;
+	bool			timestamp_filename;
 	unsigned long long	samples;
 };
 
@@ -526,6 +527,37 @@ record__finish_output(struct record *rec)
 	return;
 }
 
+static int
+record__switch_output(struct record *rec, bool at_exit)
+{
+	struct perf_data_file *file = &rec->file;
+	int fd, err;
+
+	/* Same Size:      "2015122520103046"*/
+	char timestamp[] = "InvalidTimestamp";
+
+	rec->samples = 0;
+	record__finish_output(rec);
+	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
+	if (err) {
+		pr_err("Failed to get current timestamp\n");
+		return -EINVAL;
+	}
+
+	fd = perf_data_file__switch(file, timestamp,
+				    rec->session->header.data_offset,
+				    at_exit);
+	if (fd >= 0 && !at_exit) {
+		rec->bytes_written = 0;
+		rec->session->header.data_size = 0;
+	}
+
+	if (!quiet)
+		fprintf(stderr, "[ perf record: Dump %s.%s ]\n",
+			file->path, timestamp);
+	return fd;
+}
+
 static volatile int workload_exec_errno;
 
 /*
@@ -853,8 +885,17 @@ out_child:
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
-	if (!err)
-		record__finish_output(rec);
+	if (!err) {
+		if (!rec->timestamp_filename) {
+			record__finish_output(rec);
+		} else {
+			fd = record__switch_output(rec, true);
+			if (fd < 0) {
+				status = fd;
+				goto out_delete_session;
+			}
+		}
+	}
 
 	if (!err && !quiet) {
 		char samples[128];
@@ -1237,6 +1278,8 @@ struct option __record_options[] = {
 		   "file", "vmlinux pathname"),
 	OPT_BOOLEAN(0, "buildid-all", &record.buildid_all,
 		    "Record build-id of all DSOs regardless of hits"),
+	OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
+		    "append timestamp to output filename"),
 	OPT_END()
 };
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 33/55] perf record: Split output into multiple files via '--switch-output'
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (31 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 32/55] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 34/55] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
                   ` (21 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Allow 'perf record' splits its output into multiple files.

For example:

 # ~/perf record -a --timestamp-filename --switch-output &
 [1] 10763
 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622314468 ]

 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622314762 ]

 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 #[ perf record: Dump perf.data.2015122622315171 ]

 # fg
 perf record -a --timestamp-filename --switch-output
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2015122622315513 ]
 [ perf record: Captured and wrote 0.014 MB perf.data (296 samples) ]

 # ls -l
 total 920
 -rw------- 1 root root 797692 Dec 26 22:31 perf.data.2015122622314468
 -rw------- 1 root root  59960 Dec 26 22:31 perf.data.2015122622314762
 -rw------- 1 root root  59912 Dec 26 22:31 perf.data.2015122622315171
 -rw------- 1 root root  19220 Dec 26 22:31 perf.data.2015122622315513

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e5714b6..15f9576 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -55,6 +55,7 @@ struct record {
 	bool			no_buildid_cache_set;
 	bool			buildid_all;
 	bool			timestamp_filename;
+	bool			switch_output;
 	unsigned long long	samples;
 };
 
@@ -163,6 +164,7 @@ auxtrace_snapshot_is_enabled(void)
 
 static volatile int auxtrace_snapshot_err;
 static volatile int auxtrace_record__snapshot_started;
+static volatile int switch_output_started;
 
 static void sig_handler(int sig)
 {
@@ -672,7 +674,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
-	if (rec->opts.auxtrace_snapshot_mode) {
+	if (rec->opts.auxtrace_snapshot_mode || rec->switch_output) {
 		signal(SIGUSR2, snapshot_sig_handler);
 		auxtrace_snapshot_on();
 	} else {
@@ -824,9 +826,25 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			}
 		}
 
+		if (switch_output_started) {
+			switch_output_started = 0;
+
+			if (!quiet)
+				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
+					waking);
+			waking = 0;
+			fd = record__switch_output(rec, false);
+			if (fd < 0) {
+				pr_err("Failed to switch to new file\n");
+				err = fd;
+				goto out_child;
+			}
+		}
+
 		if (hits == rec->samples) {
 			if (done || draining)
 				break;
+
 			err = perf_evlist__poll(rec->evlist, -1);
 			/*
 			 * Propagate error, only if there's any. Ignore positive
@@ -1280,6 +1298,8 @@ struct option __record_options[] = {
 		    "Record build-id of all DSOs regardless of hits"),
 	OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
 		    "append timestamp to output filename"),
+	OPT_BOOLEAN(0, "switch-output", &record.switch_output,
+		    "Switch output when receive SIGUSR2"),
 	OPT_END()
 };
 
@@ -1420,9 +1440,11 @@ out_symbol_exit:
 
 static void snapshot_sig_handler(int sig __maybe_unused)
 {
-	if (!auxtrace_snapshot_is_enabled())
-		return;
-	auxtrace_snapshot_disable();
-	auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
-	auxtrace_record__snapshot_started = 1;
+	if (auxtrace_snapshot_is_enabled()) {
+		auxtrace_snapshot_disable();
+		auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
+		auxtrace_record__snapshot_started = 1;
+	}
+
+	switch_output_started = 1;
 }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 34/55] perf record: Force enable --timestamp-filename when --switch-output is provided
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (32 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 33/55] perf record: Split output into multiple files via '--switch-output' Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 35/55] perf record: Disable buildid cache options by default in switch output mode Wang Nan
                   ` (20 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Without this patch, the last output doesn't have timestamp appended if
--timestamp-filename is not explicitly provided. For example:

 # perf record -a --switch-output &
 [1] 11224
 # kill -s SIGUSR2 11224
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622372823 ]

 # fg
 perf record -a --switch-output
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.027 MB perf.data (540 samples) ]

 # ls -l
 total 836
 -rw------- 1 root root  33256 Dec 26 22:37 perf.data   <---- *Odd*
 -rw------- 1 root root 817156 Dec 26 22:37 perf.data.2015122622372823

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 15f9576..8987ce8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1355,6 +1355,9 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		return -EINVAL;
 	}
 
+	if (rec->switch_output)
+		rec->timestamp_filename = true;
+
 	if (!rec->itr) {
 		rec->itr = auxtrace_record__init(rec->evlist, &err);
 		if (err)
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 35/55] perf record: Disable buildid cache options by default in switch output mode
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (33 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 34/55] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 36/55] perf record: Re-synthesize tracking events after output switching Wang Nan
                   ` (19 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Cost of buildid cache processing is high: read all events in output
perf.data, open elf files to read buildid then copy them into
~/.debug directory. In switch output mode, causes perf stop receiving
from perf events for too long.

Enable no-buildid and no-buildid-cache by default if --switch-output
is provided. Still allow user use --no-no-buildid to explicitly enable
buildid in this case.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8987ce8..2839715 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1383,8 +1383,36 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 "If some relocation was applied (e.g. kexec) symbols may be misresolved\n"
 "even with a suitable vmlinux or kallsyms file.\n\n");
 
-	if (rec->no_buildid_cache || rec->no_buildid)
+	if (rec->no_buildid_cache || rec->no_buildid) {
 		disable_buildid_cache();
+	} else if (rec->switch_output) {
+		/*
+		 * In 'perf record --switch-output', disable buildid
+		 * generation by default to reduce data file switching
+		 * overhead. Still generate buildid if they are required
+		 * explicitly using
+		 *
+		 *  perf record --signal-trigger --no-no-buildid \
+		 *              --no-no-buildid-cache
+		 *
+		 * Following code equals to:
+		 *
+		 * if ((rec->no_buildid || !rec->no_buildid_set) &&
+		 *     (rec->no_buildid_cache || !rec->no_buildid_cache_set))
+		 *         disable_buildid_cache();
+		 */
+		bool disable = true;
+
+		if (rec->no_buildid_set && !rec->no_buildid)
+			disable = false;
+		if (rec->no_buildid_cache_set && !rec->no_buildid_cache)
+			disable = false;
+		if (disable) {
+			rec->no_buildid = true;
+			rec->no_buildid_cache = true;
+			disable_buildid_cache();
+		}
+	}
 
 	if (rec->evlist->nr_entries == 0 &&
 	    perf_evlist__add_default(rec->evlist) < 0) {
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 36/55] perf record: Re-synthesize tracking events after output switching
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (34 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 35/55] perf record: Disable buildid cache options by default in switch output mode Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 37/55] perf record: Generate tracking events for process forked by perf Wang Nan
                   ` (18 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Tracking events describe kernel and threads. They are generated by
reading /proc/kallsyms, /proc/*/maps and /proc/*/task/* during
initialization of 'perf record', serialized into event sequences and put
at the head of 'perf.data'. In case of output switching, each output
file should contain those events.

This patch calls record__synthesize() during output switching, so the
event sequences described above can be collected again.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 2839715..3a11102 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -529,6 +529,8 @@ record__finish_output(struct record *rec)
 	return;
 }
 
+static int record__synthesize(struct record *rec);
+
 static int
 record__switch_output(struct record *rec, bool at_exit)
 {
@@ -557,6 +559,15 @@ record__switch_output(struct record *rec, bool at_exit)
 	if (!quiet)
 		fprintf(stderr, "[ perf record: Dump %s.%s ]\n",
 			file->path, timestamp);
+
+	/* Reinit machine */
+	if (!at_exit) {
+		machines__exit(&rec->session->machines);
+		machines__init(&rec->session->machines);
+		perf_session__create_kernel_maps(rec->session);
+		perf_session__set_id_hdr_size(rec->session);
+		record__synthesize(rec);
+	}
 	return fd;
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 37/55] perf record: Generate tracking events for process forked by perf
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (35 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 36/55] perf record: Re-synthesize tracking events after output switching Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 38/55] perf record: Ensure return non-zero rc when mmap fail Wang Nan
                   ` (17 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

With 'perf record --switch-output' without -a, record__synthesize() in
record__switch_output() won't generate tracking events because there's
no thread_map in evlist. Which causes newly created perf.data doesn't
contain map and comm information.

This patch creates a fake thread_map and directly call
perf_event__synthesize_thread_map() for those events.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3a11102..7d4d8bf 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -567,6 +567,23 @@ record__switch_output(struct record *rec, bool at_exit)
 		perf_session__create_kernel_maps(rec->session);
 		perf_session__set_id_hdr_size(rec->session);
 		record__synthesize(rec);
+
+		if (target__none(&rec->opts.target)) {
+			struct {
+				struct thread_map map;
+				struct thread_map_data map_data;
+			} thread_map;
+
+			thread_map.map.nr = 1;
+			thread_map.map.map[0].pid = rec->evlist->workload.pid;
+			thread_map.map.map[0].comm = NULL;
+			perf_event__synthesize_thread_map(&rec->tool,
+					&thread_map.map,
+					process_synthesized_event,
+					&rec->session->machines.host,
+					rec->opts.sample_address,
+					rec->opts.proc_map_timeout);
+		}
 	}
 	return fd;
 }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 38/55] perf record: Ensure return non-zero rc when mmap fail
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (36 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 37/55] perf record: Generate tracking events for process forked by perf Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 39/55] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
                   ` (16 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

perf_evlist__mmap_ex() can fail without setting errno (for example,
fail in condition checking. In this case all syscall is success).
If this happen, record__open() incorrectly returns 0. Force setting
rc is a quick way to avoid this problem, or we have to follow all
possible code path in perf_evlist__mmap_ex() to make sure there's
at least one system call before returning an error.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7d4d8bf..310e290 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -362,7 +362,10 @@ try_again:
 		} else {
 			pr_err("failed to mmap with %d (%s)\n", errno,
 				strerror_r(errno, msg, sizeof(msg)));
-			rc = -errno;
+			if (errno)
+				rc = -errno;
+			else
+				rc = -EINVAL;
 		}
 		goto out;
 	}
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 39/55] perf record: Prevent reading invalid data in record__mmap_read
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (37 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 38/55] perf record: Ensure return non-zero rc when mmap fail Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 40/55] perf tools: Add evlist channel helpers Wang Nan
                   ` (15 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

When record__mmap_read() requires data more than the size of ring
buffer, drop those data to avoid accessing invalid memory.

This can happen when reading from overwritable ring buffer, which
should be avoided. However, check this for robustness.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 310e290..3a7de24 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -37,6 +37,7 @@
 #include <unistd.h>
 #include <sched.h>
 #include <sys/mman.h>
+#include <asm/bug.h>
 
 
 struct record {
@@ -95,6 +96,13 @@ static int record__mmap_read(struct record *rec, int idx)
 	rec->samples++;
 
 	size = head - old;
+	if (size > (unsigned long)(md->mask) + 1) {
+		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
+
+		md->prev = head;
+		perf_evlist__mmap_consume(rec->evlist, idx);
+		return 0;
+	}
 
 	if ((old & md->mask) + size != (head & md->mask)) {
 		buf = &data[old & md->mask];
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 40/55] perf tools: Add evlist channel helpers
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (38 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 39/55] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 41/55] perf tools: Automatically add new channel according to evlist Wang Nan
                   ` (14 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

In this commit sereval helpers are introduced to support the principle
of channel. Channels hold different groups of evsels which configured
differently. It will be used for overwritable evsels, which allows perf
record some events continuously while capture snapshot for other events
when something happen. Tracking events (mmap, mmap2, fork, exit ...)
are another possible events worth to be put into a separated channel.

Channels are represented by an array with channel flags. Each channel
contains evlist->nr_mmaps mmaps. Channels are configured before
perf_evlist__mmap_ex(). During that function nr_mmaps mmaps for each
channel are allocated together as a big array.
perf_evlist__channel_idx() converts index in the big array and the
channel number. For API functions which accept idx, _ex() versions are
introduced to accept selecting an mmap from a channel.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |   6 ++
 tools/perf/util/evlist.c    | 132 ++++++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/evlist.h    |  58 +++++++++++++++++++
 3 files changed, 190 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3a7de24..24c776c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -356,6 +356,12 @@ try_again:
 		goto out;
 	}
 
+	perf_evlist__channel_reset(evlist);
+	rc = perf_evlist__channel_add(evlist, 0, true);
+	if (rc < 0)
+		goto out;
+	rc = 0;
+
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index fef465a..a6b52fc 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -679,14 +679,51 @@ static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
 	return NULL;
 }
 
-union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx)
+{
+	int channel = *p_channel;
+	int _idx = *p_idx;
+
+	if (_idx < 0)
+		return -EINVAL;
+	/*
+	 * Negative channel means caller explicitly use real index.
+	 */
+	if (channel < 0) {
+		channel = perf_evlist__idx_channel(evlist, _idx);
+		_idx = _idx % evlist->nr_mmaps;
+	}
+	if (channel < 0)
+		return channel;
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	if (_idx >= evlist->nr_mmaps)
+		return -E2BIG;
+
+	*p_channel = channel;
+	*p_idx = evlist->nr_mmaps * channel + _idx;
+	return 0;
+}
+
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx)
 {
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
 	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head;
-	u64 old = md->prev;
-	unsigned char *data = md->base + page_size;
+	u64 old;
+	unsigned char *data;
 	union perf_event *event = NULL;
 
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return NULL;
+	}
+	old = md->prev;
+	data = md->base + page_size;
+
 	/*
 	 * Check if event was unmapped due to a POLLHUP/POLLERR.
 	 */
@@ -748,6 +785,11 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 	return event;
 }
 
+union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+{
+	return perf_evlist__mmap_read_ex(evlist, -1, idx);
+}
+
 static bool perf_mmap__empty(struct perf_mmap *md)
 {
 	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
@@ -766,10 +808,18 @@ static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 		__perf_evlist__munmap(evlist, idx);
 }
 
-void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx)
 {
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
 	struct perf_mmap *md = &evlist->mmap[idx];
 
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return;
+	}
+
 	if (!evlist->overwrite) {
 		u64 old = md->prev;
 
@@ -780,6 +830,11 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 		perf_evlist__mmap_put(evlist, idx);
 }
 
+void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+{
+	perf_evlist__mmap_consume_ex(evlist, -1, idx);
+}
+
 int __weak auxtrace_mmap__mmap(struct auxtrace_mmap *mm __maybe_unused,
 			       struct auxtrace_mmap_params *mp __maybe_unused,
 			       void *userpg __maybe_unused,
@@ -825,7 +880,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 	if (evlist->mmap == NULL)
 		return;
 
-	for (i = 0; i < evlist->nr_mmaps; i++)
+	for (i = 0; i < perf_evlist__mmap_nr(evlist); i++)
 		__perf_evlist__munmap(evlist, i);
 
 	zfree(&evlist->mmap);
@@ -833,10 +888,17 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
+	int total_mmaps;
+
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
-	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
+
+	total_mmaps = perf_evlist__mmap_nr(evlist);
+	if (!total_mmaps)
+		return -EINVAL;
+
+	evlist->mmap = zalloc(total_mmaps * sizeof(struct perf_mmap));
 	return evlist->mmap != NULL ? 0 : -ENOMEM;
 }
 
@@ -1137,6 +1199,12 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	int err;
+
+	perf_evlist__channel_reset(evlist);
+	err = perf_evlist__channel_add(evlist, 0, true);
+	if (err < 0)
+		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
@@ -1764,3 +1832,55 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 
 	return NULL;
 }
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist)
+{
+	int i;
+
+	for (i = PERF_EVLIST__NR_CHANNELS - 1; i >= 0; i--) {
+		unsigned long flags = evlist->channel_flags[i];
+
+		if (flags & PERF_EVLIST__CHANNEL_ENABLED)
+			return i + 1;
+	}
+	return 0;
+}
+
+int perf_evlist__mmap_nr(struct perf_evlist *evlist)
+{
+	return evlist->nr_mmaps * perf_evlist__channel_nr(evlist);
+}
+
+void perf_evlist__channel_reset(struct perf_evlist *evlist)
+{
+	int i;
+
+	BUG_ON(evlist->mmap);
+
+	for (i = 0; i < PERF_EVLIST__NR_CHANNELS; i++)
+		evlist->channel_flags[i] = 0;
+}
+
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default)
+{
+	int n = perf_evlist__channel_nr(evlist);
+	unsigned long *flags = evlist->channel_flags;
+
+	BUG_ON(evlist->mmap);
+
+	if (n >= PERF_EVLIST__NR_CHANNELS) {
+		pr_debug("ERROR: too many channels. Increase PERF_EVLIST__NR_CHANNELS\n");
+		return -ENOSPC;
+	}
+
+	if (is_default) {
+		memmove(&flags[1], &flags[0],
+			sizeof(evlist->channel_flags) -
+			sizeof(evlist->channel_flags[0]));
+		n = 0;
+	}
+	flags[n] = flag | PERF_EVLIST__CHANNEL_ENABLED;
+	return n;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a0d1522..1812652 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,6 +20,11 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
+#define PERF_EVLIST__NR_CHANNELS	1
+enum perf_evlist_mmap_flag {
+	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+};
+
 /**
  * struct perf_mmap - perf's ring buffer mmap details
  *
@@ -52,6 +57,7 @@ struct perf_evlist {
 		pid_t	pid;
 	} workload;
 	struct fdarray	 pollfd;
+	unsigned long channel_flags[PERF_EVLIST__NR_CHANNELS];
 	struct perf_mmap *mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
@@ -116,9 +122,61 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx);
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
 
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx);
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
+int perf_evlist__mmap_nr(struct perf_evlist *evlist);
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist);
+void perf_evlist__channel_reset(struct perf_evlist *evlist);
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default);
+
+static inline bool
+__perf_evlist__channel_check(struct perf_evlist *evlist, int channel,
+			     enum perf_evlist_mmap_flag bits)
+{
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return false;
+
+	return (evlist->channel_flags[channel] & bits) ? true : false;
+}
+#define perf_evlist__channel_check(e, c, b) \
+		__perf_evlist__channel_check(e, c, PERF_EVLIST__CHANNEL_##b)
+
+static inline bool
+perf_evlist__channel_is_enabled(struct perf_evlist *evlist, int channel)
+{
+	return perf_evlist__channel_check(evlist, channel, ENABLED);
+}
+
+static inline int
+perf_evlist__idx_channel(struct perf_evlist *evlist, int idx)
+{
+	int channel = idx / evlist->nr_mmaps;
+
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	return channel;
+}
+
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx);
+
+static inline struct perf_mmap *
+perf_evlist__get_mmap(struct perf_evlist *evlist,
+		      int channel, int idx)
+{
+	if (perf_evlist__channel_idx(evlist, &channel, &idx))
+		return NULL;
+
+	return &evlist->mmap[idx];
+}
 
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 41/55] perf tools: Automatically add new channel according to evlist
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (39 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 40/55] perf tools: Add evlist channel helpers Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 42/55] perf tools: Operate multiple channels Wang Nan
                   ` (13 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

perf_evlist__channel_find() can be used to find a proper channel based
on propreties of a evsel. If the channel doesn't exist, it can create
new one for it. After this patch there's no need to create default
channel explicitly.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  5 -----
 tools/perf/util/evlist.c    | 47 ++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 24c776c..cf8f67a 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -357,11 +357,6 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	rc = perf_evlist__channel_add(evlist, 0, true);
-	if (rc < 0)
-		goto out;
-	rc = 0;
-
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a6b52fc..d94f2c6 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -943,6 +943,43 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
+static unsigned long
+perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
+{
+	return 0;
+}
+
+static int
+perf_evlist__channel_find(struct perf_evlist *evlist,
+			  struct perf_evsel *evsel,
+			  bool add_new)
+{
+	unsigned long flag = perf_evlist__channel_for_evsel(evsel);
+	int i;
+
+	flag |= PERF_EVLIST__CHANNEL_ENABLED;
+	for (i = 0; i < perf_evlist__channel_nr(evlist); i++)
+		if (evlist->channel_flags[i] == flag)
+			return i;
+	if (add_new)
+		return perf_evlist__channel_add(evlist, flag, false);
+	return -ENOENT;
+}
+
+static int
+perf_evlist__channel_complete(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	int err;
+
+	evlist__for_each(evlist, evsel) {
+		err = perf_evlist__channel_find(evlist, evsel, true);
+		if (err < 0)
+			return err;
+	}
+	return 0;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
@@ -1162,6 +1199,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 bool overwrite, unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite)
 {
+	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
@@ -1169,6 +1207,10 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
+	err = perf_evlist__channel_complete(evlist);
+	if (err)
+		return err;
+
 	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
 		return -ENOMEM;
 
@@ -1199,12 +1241,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
-	int err;
-
 	perf_evlist__channel_reset(evlist);
-	err = perf_evlist__channel_add(evlist, 0, true);
-	if (err < 0)
-		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 42/55] perf tools: Operate multiple channels
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (40 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 41/55] perf tools: Automatically add new channel according to evlist Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 43/55] perf tools: Squash overwrite setting into channel Wang Nan
                   ` (12 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Before this patch perf operates on only the first channel. Make perf
mmap and read from multiple channels.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  3 ++-
 tools/perf/util/evlist.c    | 55 ++++++++++++++++++++++++++++++++++-----------
 tools/perf/util/evlist.h    |  2 +-
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index cf8f67a..a472950 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -466,8 +466,9 @@ static int record__mmap_read_all(struct record *rec)
 	u64 bytes_written = rec->bytes_written;
 	int i;
 	int rc = 0;
+	int total_mmaps = perf_evlist__mmap_nr(rec->evlist);
 
-	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
+	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
 		if (rec->evlist->mmap[i].base) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d94f2c6..16f061c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -873,6 +873,21 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
 }
 
+static void
+__perf_evlist__munmap_channels(struct perf_evlist *evlist, int _idx)
+{
+	int _ch;
+
+	for (_ch = 0; _ch < perf_evlist__channel_nr(evlist); _ch++) {
+		int err, idx = _idx, ch = _ch;
+
+		err = perf_evlist__channel_idx(evlist, &ch, &idx);
+		if (err < 0)
+			continue;
+		__perf_evlist__munmap(evlist, idx);
+	}
+}
+
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	int i;
@@ -980,26 +995,38 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
+static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *outputs)
 {
 	struct perf_evsel *evsel;
 
 	evlist__for_each(evlist, evsel) {
-		int fd;
+		int fd, channel, idx, err;
+
+		channel = perf_evlist__channel_find(evlist, evsel, false);
+		if (channel < 0) {
+			pr_err("ERROR: unable to find suitable channel for %s\n",
+			       evsel->name);
+			return -1;
+		}
+
+		idx = _idx;
+		err = perf_evlist__channel_idx(evlist, &channel, &idx);
+		if (err < 0)
+			return err;
 
 		if (evsel->system_wide && thread)
 			continue;
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
-			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+		if (outputs[channel] == -1) {
+			outputs[channel] = fd;
+			if (__perf_evlist__mmap(evlist, idx, mp, outputs[channel]) < 0)
 				return -1;
 		} else {
-			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
+			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, outputs[channel]) != 0)
 				return -1;
 
 			perf_evlist__mmap_get(evlist, idx);
@@ -1039,14 +1066,15 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, outputs))
 				goto out_unmap;
 		}
 	}
@@ -1055,7 +1083,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 
 out_unmap:
 	for (cpu = 0; cpu < nr_cpus; cpu++)
-		__perf_evlist__munmap(evlist, cpu);
+		__perf_evlist__munmap_channels(evlist, cpu);
 	return -1;
 }
 
@@ -1067,13 +1095,14 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						outputs))
 			goto out_unmap;
 	}
 
@@ -1081,7 +1110,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 
 out_unmap:
 	for (thread = 0; thread < nr_threads; thread++)
-		__perf_evlist__munmap(evlist, thread);
+		__perf_evlist__munmap_channels(evlist, thread);
 	return -1;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1812652..b652587 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,7 +20,7 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
-#define PERF_EVLIST__NR_CHANNELS	1
+#define PERF_EVLIST__NR_CHANNELS	2
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 };
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 43/55] perf tools: Squash overwrite setting into channel
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (41 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 42/55] perf tools: Operate multiple channels Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 44/55] perf record: Don't read from and poll overwrite channel Wang Nan
                   ` (11 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Make 'overwrite' a channel configuration other than a evlist global
option. With this setting an evlist can have two channels, one is
normal channel, another is overwritable channel.
perf_evlist__channel_for_evsel() ensures events with 'overwrite'
configuration inserted to overwritable channel.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  2 +-
 tools/perf/util/evlist.c    | 42 +++++++++++++++++++++++++++---------------
 tools/perf/util/evlist.h    |  5 ++---
 tools/perf/util/evsel.h     |  1 +
 4 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a472950..f5bc5bf 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -357,7 +357,7 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
+	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
 		if (errno == EPERM) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 16f061c..9175c83 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -731,7 +731,7 @@ union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
 		return NULL;
 
 	head = perf_mmap__read_head(md);
-	if (evlist->overwrite) {
+	if (perf_evlist__channel_check(evlist, channel, RDONLY)) {
 		/*
 		 * If we're further behind than half the buffer, there's a chance
 		 * the writer will bite our tail and mess up the samples under us.
@@ -820,7 +820,7 @@ void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
 		return;
 	}
 
-	if (!evlist->overwrite) {
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY)) {
 		u64 old = md->prev;
 
 		perf_mmap__write_tail(md, old);
@@ -918,7 +918,6 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 }
 
 struct mmap_params {
-	int prot;
 	int mask;
 	struct auxtrace_mmap_params auxtrace_mp;
 };
@@ -926,6 +925,15 @@ struct mmap_params {
 static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 			       struct mmap_params *mp, int fd)
 {
+	int channel = perf_evlist__idx_channel(evlist, idx);
+	int prot = PROT_READ;
+
+	if (channel < 0)
+		return -1;
+
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY))
+		prot |= PROT_WRITE;
+
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
 	 * make sure we don't prevent tools from consuming every last event in
@@ -942,7 +950,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	atomic_set(&evlist->mmap[idx].refcnt, 2);
 	evlist->mmap[idx].prev = 0;
 	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
+	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, prot,
 				      MAP_SHARED, fd, 0);
 	if (evlist->mmap[idx].base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
@@ -959,9 +967,13 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 }
 
 static unsigned long
-perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
+perf_evlist__channel_for_evsel(struct perf_evsel *evsel)
 {
-	return 0;
+	unsigned long flag = 0;
+
+	if (evsel->overwrite)
+		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	return flag;
 }
 
 static int
@@ -1211,11 +1223,10 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * perf_evlist__mmap_ex - Create mmaps to receive events.
  * @evlist: list of events
  * @pages: map length in pages
- * @overwrite: overwrite older events?
  * @auxtrace_pages - auxtrace map length in pages
  * @auxtrace_overwrite - overwrite older auxtrace data?
  *
- * If @overwrite is %false the user needs to signal event consumption using
+ * For writable channel, the user needs to signal event consumption using
  * perf_mmap__write_tail().  Using perf_evlist__mmap_read() does this
  * automatically.
  *
@@ -1225,16 +1236,13 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * Return: %0 on success, negative error code otherwise.
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite)
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite)
 {
 	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
-	struct mmap_params mp = {
-		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
-	};
+	struct mmap_params mp;
 
 	err = perf_evlist__channel_complete(evlist);
 	if (err)
@@ -1246,7 +1254,6 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
@@ -1270,8 +1277,13 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	struct perf_evsel *evsel;
+
 	perf_evlist__channel_reset(evlist);
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	evlist__for_each(evlist, evsel)
+		evsel->overwrite = overwrite;
+
+	return perf_evlist__mmap_ex(evlist, pages, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b652587..21a8b85 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -23,6 +23,7 @@ struct record_opts;
 #define PERF_EVLIST__NR_CHANNELS	2
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+	PERF_EVLIST__CHANNEL_RDONLY	= 2,
 };
 
 /**
@@ -45,7 +46,6 @@ struct perf_evlist {
 	int		 nr_entries;
 	int		 nr_groups;
 	int		 nr_mmaps;
-	bool		 overwrite;
 	bool		 enabled;
 	bool		 has_user_cpus;
 	size_t		 mmap_len;
@@ -203,8 +203,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 				  int unset);
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite);
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite);
 void perf_evlist__munmap(struct perf_evlist *evlist);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index efad78f..03c70e5 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -114,6 +114,7 @@ struct perf_evsel {
 	bool			tracking;
 	bool			per_pkg;
 	bool			precise_max;
+	bool			overwrite;
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 44/55] perf record: Don't read from and poll overwrite channel
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (42 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 43/55] perf tools: Squash overwrite setting into channel Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 45/55] perf record: Don't poll on " Wang Nan
                   ` (10 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Reading from overwritable ring buffer is unreliable. Introduce
record__mmap_should_read() and prevent reading from overwrite ring
buffer in 'perf record'. The rule in record__mmap_should_read() will
be changed when perf support reading from backward writing ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f5bc5bf..b27b3ff 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -461,6 +461,19 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static bool record__mmap_should_read(struct record *rec, int idx)
+{
+	int channel = -1;
+
+	if (!rec->evlist->mmap[idx].base)
+		return false;
+	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
+		return false;
+	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int record__mmap_read_all(struct record *rec)
 {
 	u64 bytes_written = rec->bytes_written;
@@ -471,7 +484,7 @@ static int record__mmap_read_all(struct record *rec)
 	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
-		if (rec->evlist->mmap[i].base) {
+		if (record__mmap_should_read(rec, i)) {
 			if (record__mmap_read(rec, i) != 0) {
 				rc = -1;
 				goto out;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 45/55] perf record: Don't poll on overwrite channel
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (43 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 44/55] perf record: Don't read from and poll overwrite channel Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 46/55] perf tools: Detect avalibility of write_backward Wang Nan
                   ` (9 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

There's no need to receive events from overwrite ring buffer. Instead,
perf should make them run background until something happen. This patch
makes events from overwrite ring buffer is ignored except POLLERR and
POLLHUP.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9175c83..7cf0435 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -461,9 +461,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
 {
-	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
 	 * Save the idx so that when we filter out fds POLLHUP'ed we can
 	 * close the associated evlist->mmap[] entry.
@@ -479,7 +479,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1);
+	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
@@ -1007,6 +1007,18 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
+static bool
+perf_evlist__should_poll(struct perf_evlist *evlist,
+			 struct perf_evsel *evsel,
+			 int channel)
+{
+	if (evsel->system_wide)
+		return false;
+	if (perf_evlist__channel_check(evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *outputs)
@@ -1015,6 +1027,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 
 	evlist__for_each(evlist, evsel) {
 		int fd, channel, idx, err;
+		short revent = POLLIN;
 
 		channel = perf_evlist__channel_find(evlist, evsel, false);
 		if (channel < 0) {
@@ -1044,6 +1057,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 			perf_evlist__mmap_get(evlist, idx);
 		}
 
+		if (!perf_evlist__should_poll(evlist, evsel, channel))
+			revent = 0;
 		/*
 		 * The system_wide flag causes a selected event to be opened
 		 * always without a pid.  Consequently it will never get a
@@ -1052,7 +1067,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 46/55] perf tools: Detect avalibility of write_backward
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (44 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 45/55] perf record: Don't poll on " Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 47/55] perf tools: Enable overwrite settings Wang Nan
                   ` (8 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Detect avalibility of write_backward and save the result into
record_opts. With write_backward the start pointer of a ring
buffer mapped read only can be found reliably.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/perf.h        |  1 +
 tools/perf/util/record.c | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 5381a01..198345e 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -73,6 +73,7 @@ struct record_opts {
 	bool	     sample_transaction;
 	unsigned     initial_delay;
 	bool         use_clockid;
+	bool	     has_write_backward;
 	clockid_t    clockid;
 	unsigned int proc_map_timeout;
 };
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 0467367..d01f155 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -85,6 +85,11 @@ static void perf_probe_comm_exec(struct perf_evsel *evsel)
 	evsel->attr.comm_exec = 1;
 }
 
+static void perf_probe_write_backward(struct perf_evsel *evsel)
+{
+	evsel->attr.write_backward = 1;
+}
+
 static void perf_probe_context_switch(struct perf_evsel *evsel)
 {
 	evsel->attr.context_switch = 1;
@@ -105,6 +110,11 @@ bool perf_can_record_switch_events(void)
 	return perf_probe_api(perf_probe_context_switch);
 }
 
+static bool perf_can_write_backward(void)
+{
+	return perf_probe_api(perf_probe_write_backward);
+}
+
 bool perf_can_record_cpu_wide(void)
 {
 	struct perf_event_attr attr = {
@@ -235,6 +245,7 @@ static int record_opts__config_freq(struct record_opts *opts)
 
 int record_opts__config(struct record_opts *opts)
 {
+	opts->has_write_backward = perf_can_write_backward();
 	return record_opts__config_freq(opts);
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 47/55] perf tools: Enable overwrite settings
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (45 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 46/55] perf tools: Detect avalibility of write_backward Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 48/55] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
                   ` (7 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

This patch allows following config terms and option:

 # perf record --overwrite ...

   Globally set following events to overwrite;

 # perf record --event cycles/overwrite/ ...
 # perf record --event cycles/no-overwrite/ ...

Set specific events to be overwrite or no-overwrite.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c    |  1 +
 tools/perf/perf.h              |  1 +
 tools/perf/util/evsel.c        |  4 ++++
 tools/perf/util/evsel.h        |  2 ++
 tools/perf/util/parse-events.c | 14 ++++++++++++++
 tools/perf/util/parse-events.h |  2 ++
 tools/perf/util/parse-events.l |  2 ++
 7 files changed, 26 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b27b3ff..d3f0435 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1271,6 +1271,7 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 		     "number of mmap data pages and AUX area tracing mmap pages",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 198345e..7a65a92 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -60,6 +60,7 @@ struct record_opts {
 	bool	     record_switch_events;
 	bool	     all_kernel;
 	bool	     all_user;
+	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 510afa4..10dfdd1 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -670,6 +670,9 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			 */
 			attr->inherit = term->val.inherit ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+			evsel->overwrite = term->val.overwrite ? 1 : 0;
+			break;
 		default:
 			break;
 		}
@@ -745,6 +748,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
+	evsel->overwrite    = opts->overwrite;
 
 	perf_evsel__set_sample_bit(evsel, IP);
 	perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 03c70e5..aa976f9 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -44,6 +44,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
 	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_INHERIT,
+	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -57,6 +58,7 @@ struct perf_evsel_config_term {
 		char	*callgraph;
 		u64	stack_user;
 		bool	inherit;
+		bool	overwrite;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 5e958b9..bf1eee1 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -996,6 +996,12 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
@@ -1044,6 +1050,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
 	case PARSE_EVENTS__TERM_TYPE_INHERIT:
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -1113,6 +1121,12 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 			ADD_CONFIG_TERM(INHERIT, inherit, term->val.num ? 0 : 1);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 67e4930..c7e6e51 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -69,6 +69,8 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
 	PARSE_EVENTS__TERM_TYPE_INHERIT,
+	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
+	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 1477fbc..cc4c426 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -201,6 +201,8 @@ call-graph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
 stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
 inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
+overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
+no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 48/55] perf tools: Set write_backward attribut bit for overwrite events
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (46 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 47/55] perf tools: Enable overwrite settings Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 49/55] perf tools: Record fd into perf_mmap Wang Nan
                   ` (6 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

write_backward attribute makes kernel filling ring buffer from the end
of it, makes reading from overwrite ring buffer possible.

This patch select this attribute if evsel->overwrite is selected
explicitly by user.

Overwrite and write_backward are still controled separatly for legacy
readonly mmap users (most of them are in perf/tests).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  7 +++++++
 tools/perf/util/evlist.c    |  2 ++
 tools/perf/util/evlist.h    |  1 +
 tools/perf/util/evsel.c     | 13 +++++++++++++
 4 files changed, 23 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d3f0435..888a8e8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -332,6 +332,13 @@ static int record__open(struct record *rec)
 	perf_evlist__config(evlist, opts);
 
 	evlist__for_each(evlist, pos) {
+		if (pos->overwrite) {
+			if (!pos->attr.write_backward) {
+				ui__warning("Unable to read from overwrite ring buffer\n\n");
+				rc = -ENOSYS;
+				goto out;
+			}
+		}
 try_again:
 		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
 			if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7cf0435..36dd305 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -973,6 +973,8 @@ perf_evlist__channel_for_evsel(struct perf_evsel *evsel)
 
 	if (evsel->overwrite)
 		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	if (evsel->attr.write_backward)
+		flag |= PERF_EVLIST__CHANNEL_BACKWARD;
 	return flag;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 21a8b85..321224c 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -24,6 +24,7 @@ struct record_opts;
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 	PERF_EVLIST__CHANNEL_RDONLY	= 2,
+	PERF_EVLIST__CHANNEL_BACKWARD	= 4,
 };
 
 /**
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 10dfdd1..0bbd5ef 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -678,6 +678,19 @@ static void apply_config_terms(struct perf_evsel *evsel,
 		}
 	}
 
+	/*
+	 * Set backward after config term processing because it is
+	 * possible to set overwrite globally, without config
+	 * terms.
+	 */
+	if (evsel->overwrite) {
+		if (opts->has_write_backward)
+			attr->write_backward = 1;
+		else
+			pr_err("Reading from overwrite event %s is not supported\n",
+			       evsel->name);
+	}
+
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
 	if ((callgraph_buf != NULL) || (dump_size > 0)) {
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 49/55] perf tools: Record fd into perf_mmap
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (47 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 48/55] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 50/55] perf tools: Add API to pause a channel Wang Nan
                   ` (5 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Add a fd field into perf_mmap so perf can backtrack the fd from mmap.
This feature will be used to toggle overwrite ring buffers.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 15 +++++++++++++--
 tools/perf/util/evlist.h |  1 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 36dd305..bd2393a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -868,6 +868,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	if (evlist->mmap[idx].base != NULL) {
 		munmap(evlist->mmap[idx].base, evlist->mmap_len);
 		evlist->mmap[idx].base = NULL;
+		evlist->mmap[idx].fd = -1;
 		atomic_set(&evlist->mmap[idx].refcnt, 0);
 	}
 	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
@@ -903,7 +904,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
-	int total_mmaps;
+	int total_mmaps, i;
 
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
@@ -914,7 +915,12 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 		return -EINVAL;
 
 	evlist->mmap = zalloc(total_mmaps * sizeof(struct perf_mmap));
-	return evlist->mmap != NULL ? 0 : -ENOMEM;
+	if (!evlist->mmap)
+		return -ENOMEM;
+
+	for (i = 0; i < total_mmaps; i++)
+		evlist->mmap[i].fd = -1;
+	return 0;
 }
 
 struct mmap_params {
@@ -934,6 +940,10 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	if (!perf_evlist__channel_check(evlist, channel, RDONLY))
 		prot |= PROT_WRITE;
 
+	if (evlist->mmap[idx].fd >= 0) {
+		pr_err("idx %d already mapped\n", idx);
+		return -1;
+	}
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
 	 * make sure we don't prevent tools from consuming every last event in
@@ -958,6 +968,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 		evlist->mmap[idx].base = NULL;
 		return -1;
 	}
+	evlist->mmap[idx].fd = fd;
 
 	if (auxtrace_mmap__mmap(&evlist->mmap[idx].auxtrace_mmap,
 				&mp->auxtrace_mp, evlist->mmap[idx].base, fd))
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 321224c..bc6d787 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -35,6 +35,7 @@ enum perf_evlist_mmap_flag {
 struct perf_mmap {
 	void		 *base;
 	int		 mask;
+	int		 fd;
 	atomic_t	 refcnt;
 	u64		 prev;
 	struct auxtrace_mmap auxtrace_mmap;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 50/55] perf tools: Add API to pause a channel
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (48 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 49/55] perf tools: Record fd into perf_mmap Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 51/55] perf record: Toggle overwrite ring buffer for reading Wang Nan
                   ` (4 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

perf_evlist__channel_toggle_paused() is introduced to pause/resume a
channel in an evlist. Utilize PERF_EVENT_IOC_PAUSE_OUTPUT ioctl.
Following commits use perf_evlist__channel_toggle_paused() to ensure
overwrite ring buffer is turned off before reading.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 28 ++++++++++++++++++++++++++++
 tools/perf/util/evlist.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index bd2393a..06c79c8 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -706,6 +706,34 @@ int perf_evlist__channel_idx(struct perf_evlist *evlist,
 	return 0;
 }
 
+int perf_evlist__channel_toggle_paused(struct perf_evlist *evlist,
+				       int channel, bool pause)
+{
+	int i;
+
+	if (channel >= perf_evlist__channel_nr(evlist))
+		return -E2BIG;
+	if (!evlist->mmap)
+		return -EFAULT;
+	for (i = 0; i < evlist->nr_mmaps; i++) {
+		int n = channel * evlist->nr_mmaps + i;
+		int fd = evlist->mmap[n].fd;
+		int err;
+
+		if (fd < 0)
+			continue;
+		err = ioctl(fd, PERF_EVENT_IOC_PAUSE_OUTPUT,
+			    pause ? 1 : 0);
+		if (err) {
+			err = (errno == 0 ? -EINVAL : -errno);
+			pr_err("Unable to pause output on %d: %s\n",
+			       fd, strerror(-err));
+			return err;
+		}
+	}
+	return 0;
+}
+
 union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
 					    int channel, int idx)
 {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bc6d787..c1831a9 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -180,6 +180,8 @@ perf_evlist__get_mmap(struct perf_evlist *evlist,
 	return &evlist->mmap[idx];
 }
 
+int perf_evlist__channel_toggle_paused(struct perf_evlist *evlist,
+				       int channel, bool pause);
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 51/55] perf record: Toggle overwrite ring buffer for reading
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (49 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 50/55] perf tools: Add API to pause a channel Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 52/55] perf record: Rename variable to make code clear Wang Nan
                   ` (3 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Reading from a overwrite ring buffer is unrelible.
perf_evlist__channel_toggle_paused() should be called before
reading from them.

Toggel overwrite_evt_paused director after receiving done or switch
output.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 79 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 888a8e8..e39475b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -39,6 +39,11 @@
 #include <sys/mman.h>
 #include <asm/bug.h>
 
+enum overwrite_evt_state {
+	OVERWRITE_EVT_RUNNING,
+	OVERWRITE_EVT_DATA_PENDING,
+	OVERWRITE_EVT_EMPTY,
+};
 
 struct record {
 	struct perf_tool	tool;
@@ -57,6 +62,7 @@ struct record {
 	bool			buildid_all;
 	bool			timestamp_filename;
 	bool			switch_output;
+	enum overwrite_evt_state overwrite_evt_state;
 	unsigned long long	samples;
 };
 
@@ -388,6 +394,7 @@ try_again:
 
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
+	rec->overwrite_evt_state = OVERWRITE_EVT_RUNNING;
 out:
 	return rc;
 }
@@ -468,6 +475,52 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static void
+record__toggle_overwrite_evsels(struct record *rec,
+				enum overwrite_evt_state state)
+{
+	struct perf_evlist *evlist = rec->evlist;
+	enum overwrite_evt_state old_state = rec->overwrite_evt_state;
+	enum action {
+		NONE,
+		PAUSE,
+		RESUME,
+	} action = NONE;
+	int ch, nr_channels;
+
+	switch (old_state) {
+	case OVERWRITE_EVT_RUNNING:
+		if (state != OVERWRITE_EVT_RUNNING)
+			action = PAUSE;
+		break;
+	case OVERWRITE_EVT_DATA_PENDING:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		break;
+	case OVERWRITE_EVT_EMPTY:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		if (state == OVERWRITE_EVT_DATA_PENDING)
+			state = OVERWRITE_EVT_EMPTY;
+		break;
+	default:
+		WARN_ONCE(1, "Shouldn't get there\n");
+	}
+
+	rec->overwrite_evt_state = state;
+
+	if (action == NONE)
+		return;
+
+	nr_channels = perf_evlist__channel_nr(evlist);
+	for (ch = 0; ch < nr_channels; ch++) {
+		if (!perf_evlist__channel_check(evlist, ch, RDONLY))
+			continue;
+		perf_evlist__channel_toggle_paused(evlist, ch,
+						   action == PAUSE);
+	}
+}
+
 static bool record__mmap_should_read(struct record *rec, int idx)
 {
 	int channel = -1;
@@ -512,6 +565,8 @@ static int record__mmap_read_all(struct record *rec)
 	if (bytes_written != rec->bytes_written)
 		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
 
+	if (rec->overwrite_evt_state == OVERWRITE_EVT_DATA_PENDING)
+		record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_EMPTY);
 out:
 	return rc;
 }
@@ -870,6 +925,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
+		/*
+		 * rec->overwrite_evt_state is possible to be
+		 * OVERWRITE_EVT_EMPTY here: when done == true and
+		 * hits != rec->samples after previous reading.
+		 *
+		 * record__toggle_overwrite_evsels ensure we never
+		 * convert OVERWRITE_EVT_EMPTY to OVERWRITE_EVT_DATA_PENDING.
+		 */
+		if (switch_output_started || done || draining)
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_DATA_PENDING);
+
 		if (record__mmap_read_all(rec) < 0) {
 			auxtrace_snapshot_disable();
 			err = -1;
@@ -888,7 +954,20 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 
 		if (switch_output_started) {
+			/*
+			 * SIGUSR2 raise after or during record__mmap_read_all().
+			 * continue to read again.
+			 */
+			if (rec->overwrite_evt_state == OVERWRITE_EVT_RUNNING)
+				continue;
+
 			switch_output_started = 0;
+			/*
+			 * Reenable events in overwrite ring buffer after
+			 * record__mmap_read_all(): we should have collected
+			 * data from it.
+			 */
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_RUNNING);
 
 			if (!quiet)
 				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 52/55] perf record: Rename variable to make code clear
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (50 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 51/55] perf record: Toggle overwrite ring buffer for reading Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 53/55] perf record: Read from backward ring buffer Wang Nan
                   ` (2 subsequent siblings)
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

record__mmap_read() write data from ring buffer into perf.data.
'head' is maintained by kernel, points to the last writtend record.
'old' is maintained by perf, points to the record read in previous
round. record__mmap_read() saves data from 'old' to 'head' to
perf.data. The naming of variables are not easy to read. In addition,
when dealing with backward writing ring buffer, the md->prev pointer
should point to 'head' instead of the last byte it got.

Add start and end pointer to make code clear and set md->prev to 'head'
instead of the moved 'old' pointer. This patch doesn't change
behavior since:

    buf = &data[old & md->mask];
    size = head - old;
    old += size;     <--- Here, old == head

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e39475b..b1f37f0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -91,17 +91,18 @@ static int record__mmap_read(struct record *rec, int idx)
 	struct perf_mmap *md = &rec->evlist->mmap[idx];
 	u64 head = perf_mmap__read_head(md);
 	u64 old = md->prev;
+	u64 end = head, start = old;
 	unsigned char *data = md->base + page_size;
 	unsigned long size;
 	void *buf;
 	int rc = 0;
 
-	if (old == head)
+	if (start == end)
 		return 0;
 
 	rec->samples++;
 
-	size = head - old;
+	size = end - start;
 	if (size > (unsigned long)(md->mask) + 1) {
 		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
@@ -110,10 +111,10 @@ static int record__mmap_read(struct record *rec, int idx)
 		return 0;
 	}
 
-	if ((old & md->mask) + size != (head & md->mask)) {
-		buf = &data[old & md->mask];
-		size = md->mask + 1 - (old & md->mask);
-		old += size;
+	if ((start & md->mask) + size != (end & md->mask)) {
+		buf = &data[start & md->mask];
+		size = md->mask + 1 - (start & md->mask);
+		start += size;
 
 		if (record__write(rec, buf, size) < 0) {
 			rc = -1;
@@ -121,16 +122,16 @@ static int record__mmap_read(struct record *rec, int idx)
 		}
 	}
 
-	buf = &data[old & md->mask];
-	size = head - old;
-	old += size;
+	buf = &data[start & md->mask];
+	size = end - start;
+	start += size;
 
 	if (record__write(rec, buf, size) < 0) {
 		rc = -1;
 		goto out;
 	}
 
-	md->prev = old;
+	md->prev = head;
 	perf_evlist__mmap_consume(rec->evlist, idx);
 out:
 	return rc;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 53/55] perf record: Read from backward ring buffer
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (51 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 52/55] perf record: Rename variable to make code clear Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 54/55] perf record: Allow generate tracking events at the end of output Wang Nan
  2016-02-19 11:44 ` [PATCH 55/55] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Introduce rb_find_range() to find start and end position from a backward
ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 69 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 67 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b1f37f0..82b49ce 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -86,6 +86,61 @@ static int process_synthesized_event(struct perf_tool *tool,
 	return record__write(rec, event, event->header.size);
 }
 
+static int
+backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
+{
+	struct perf_event_header *pheader;
+	u64 evt_head = head;
+	int size = mask + 1;
+
+	pr_debug2("backward_rb_find_range: buf=%p, head=%"PRIx64"\n", buf, head);
+	pheader = (struct perf_event_header *)(buf + (head & mask));
+	*start = head;
+	while (true) {
+		if (evt_head - head >= (unsigned int)size) {
+			pr_debug("Finshed reading backward ring buffer: rewind\n");
+			if (evt_head - head > (unsigned int)size)
+				evt_head -= pheader->size;
+			*end = evt_head;
+			return 0;
+		}
+
+		pheader = (struct perf_event_header *)(buf + (evt_head & mask));
+
+		if (pheader->size == 0) {
+			pr_debug("Finshed reading backward ring buffer: get start\n");
+			*end = evt_head;
+			return 0;
+		}
+
+		evt_head += pheader->size;
+		pr_debug3("move evt_head: %"PRIx64"\n", evt_head);
+	}
+	WARN_ONCE(1, "Shouldn't get here\n");
+	return -1;
+}
+
+static int
+rb_find_range(struct perf_evlist *evlist, int idx,
+	      void *data, int mask, u64 head, u64 old,
+	      u64 *start, u64 *end)
+{
+	int channel;
+
+	channel = perf_evlist__idx_channel(evlist, idx);
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY)) {
+		*start = old;
+		*end = head;
+		return 0;
+	}
+
+	if (perf_evlist__channel_check(evlist, channel, BACKWARD))
+		return backward_rb_find_range(data, mask, head, start, end);
+
+	WARN_ONCE(1, "Unable to find start position from a read-only ring buffer\n");
+	return -1;
+}
+
 static int record__mmap_read(struct record *rec, int idx)
 {
 	struct perf_mmap *md = &rec->evlist->mmap[idx];
@@ -97,6 +152,10 @@ static int record__mmap_read(struct record *rec, int idx)
 	void *buf;
 	int rc = 0;
 
+	if (rb_find_range(rec->evlist, idx, data, md->mask, head,
+			  old, &start, &end))
+		return -1;
+
 	if (start == end)
 		return 0;
 
@@ -530,8 +589,14 @@ static bool record__mmap_should_read(struct record *rec, int idx)
 		return false;
 	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
 		return false;
-	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
-		return false;
+	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY)) {
+		if (rec->overwrite_evt_state != OVERWRITE_EVT_DATA_PENDING)
+			return false;
+		if (perf_evlist__channel_check(rec->evlist, channel, BACKWARD))
+			return true;
+		else
+			return false;
+	}
 	return true;
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 54/55] perf record: Allow generate tracking events at the end of output
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (52 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 53/55] perf record: Read from backward ring buffer Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  2016-02-19 11:44 ` [PATCH 55/55] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

Before this patch tracking events are generated based on information in
/proc before all samples. However, with the introducing of overwrite
evsel in perf record, it becomes inconvenience: 'perf record' now can
executed as a daemon for sereval hours and only capture the last
snapshot when it receives SIGUSR2. The tracking events generated at
the head of output 'perf.data' becomes too old, but most of tracking
events during 'perf record' running are dropped.

This patch generates tracking events at the end of output. The output
events series would better reflecting status of system when SIGUSR2
received.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 62 +++++++++++++++++++++++++++++++--------------
 1 file changed, 43 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 82b49ce..81e2c3c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -63,6 +63,7 @@ struct record {
 	bool			timestamp_filename;
 	bool			switch_output;
 	enum overwrite_evt_state overwrite_evt_state;
+	bool			tail_tracking;
 	unsigned long long	samples;
 };
 
@@ -685,6 +686,26 @@ record__finish_output(struct record *rec)
 
 static int record__synthesize(struct record *rec);
 
+static void record__synthesize_target(struct record *rec)
+{
+	if (target__none(&rec->opts.target)) {
+		struct {
+			struct thread_map map;
+			struct thread_map_data map_data;
+		} thread_map;
+
+		thread_map.map.nr = 1;
+		thread_map.map.map[0].pid = rec->evlist->workload.pid;
+		thread_map.map.map[0].comm = NULL;
+		perf_event__synthesize_thread_map(&rec->tool,
+				&thread_map.map,
+				process_synthesized_event,
+				&rec->session->machines.host,
+				rec->opts.sample_address,
+				rec->opts.proc_map_timeout);
+	}
+}
+
 static int
 record__switch_output(struct record *rec, bool at_exit)
 {
@@ -694,6 +715,11 @@ record__switch_output(struct record *rec, bool at_exit)
 	/* Same Size:      "2015122520103046"*/
 	char timestamp[] = "InvalidTimestamp";
 
+	if (rec->tail_tracking) {
+		record__synthesize(rec);
+		record__synthesize_target(rec);
+	}
+
 	rec->samples = 0;
 	record__finish_output(rec);
 	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
@@ -720,23 +746,10 @@ record__switch_output(struct record *rec, bool at_exit)
 		machines__init(&rec->session->machines);
 		perf_session__create_kernel_maps(rec->session);
 		perf_session__set_id_hdr_size(rec->session);
-		record__synthesize(rec);
 
-		if (target__none(&rec->opts.target)) {
-			struct {
-				struct thread_map map;
-				struct thread_map_data map_data;
-			} thread_map;
-
-			thread_map.map.nr = 1;
-			thread_map.map.map[0].pid = rec->evlist->workload.pid;
-			thread_map.map.map[0].comm = NULL;
-			perf_event__synthesize_thread_map(&rec->tool,
-					&thread_map.map,
-					process_synthesized_event,
-					&rec->session->machines.host,
-					rec->opts.sample_address,
-					rec->opts.proc_map_timeout);
+		if (!rec->tail_tracking) {
+			record__synthesize(rec);
+			record__synthesize_target(rec);
 		}
 	}
 	return fd;
@@ -932,9 +945,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	err = record__synthesize(rec);
-	if (err < 0)
-		goto out_child;
+	if (!rec->tail_tracking) {
+		err = record__synthesize(rec);
+		if (err < 0)
+			goto out_child;
+	}
 
 	if (rec->realtime_prio) {
 		struct sched_param param;
@@ -1075,6 +1090,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			disabled = true;
 		}
 	}
+
+	if (rec->tail_tracking) {
+		err = record__synthesize(rec);
+		if (err < 0)
+			goto out_child;
+	}
+
 	auxtrace_snapshot_disable();
 
 	if (forks && workload_exec_errno) {
@@ -1507,6 +1529,8 @@ struct option __record_options[] = {
 		    "append timestamp to output filename"),
 	OPT_BOOLEAN(0, "switch-output", &record.switch_output,
 		    "Switch output when receive SIGUSR2"),
+	OPT_BOOLEAN(0, "tail-tracking", &record.tail_tracking,
+		    "Generate tracking events at the end of output"),
 	OPT_END()
 };
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 55/55] perf tools: Don't warn about out of order event if write_backward is used
  2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
                   ` (53 preceding siblings ...)
  2016-02-19 11:44 ` [PATCH 54/55] perf record: Allow generate tracking events at the end of output Wang Nan
@ 2016-02-19 11:44 ` Wang Nan
  54 siblings, 0 replies; 67+ messages in thread
From: Wang Nan @ 2016-02-19 11:44 UTC (permalink / raw)
  To: Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Brendan Gregg
  Cc: Adrian Hunter, Cody P Schafer, David S. Miller, He Kuang,
	Jérémie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, linux-kernel

If write_backward attribute is set, records are written into kernel
ring buffer from end to beginning, but read from beginning to end.
To avoid 'XX out of order events recorded' warning message (timestamps
of records is in reverse order when using write_backward), suppress the
warning message if write_backward is selected by at lease one event.

Result:

Before this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
 [ perf record: Woken up 5 times to write data ]
 Warning:
 40 out of order events recorded.
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

After this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
 [ perf record: Woken up 5 times to write data ]
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/session.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 40b7a0d..132c6ab 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1516,10 +1516,27 @@ int perf_session__register_idle_thread(struct perf_session *session)
 	return err;
 }
 
+static void
+perf_session__warn_order(const struct perf_session *session)
+{
+	const struct ordered_events *oe = &session->ordered_events;
+	struct perf_evsel *evsel;
+	bool should_warn = true;
+
+	evlist__for_each(session->evlist, evsel) {
+		if (evsel->attr.write_backward)
+			should_warn = false;
+	}
+
+	if (!should_warn)
+		return;
+	if (oe->nr_unordered_events != 0)
+		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+}
+
 static void perf_session__warn_about_errors(const struct perf_session *session)
 {
 	const struct events_stats *stats = &session->evlist->stats;
-	const struct ordered_events *oe = &session->ordered_events;
 
 	if (session->tool->lost == perf_event__process_lost &&
 	    stats->nr_events[PERF_RECORD_LOST] != 0) {
@@ -1576,8 +1593,7 @@ static void perf_session__warn_about_errors(const struct perf_session *session)
 			    stats->nr_unprocessable_samples);
 	}
 
-	if (oe->nr_unordered_events != 0)
-		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+	perf_session__warn_order(session);
 
 	events_stats__auxtrace_error_warn(stats);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH 10/55] perf stat: Forbid user passing improper config terms
       [not found]   ` <20160219140333.GB16141@kernel.org>
@ 2016-02-19 15:08     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 67+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-19 15:08 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Brendan Gregg, Adrian Hunter, Cody P Schafer,
	David S. Miller, He Kuang, Jérémie Galarneau,
	Jiri Olsa, Kirill Smelkov, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, linux-kernel

Em Fri, Feb 19, 2016 at 11:43:58AM +0000, Wang Nan escreveu:
> 'perf stat' accepts some config terms but doesn't apply them. For
> example:
> 
>  # perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
>  # ls
>  # exit

This seems to do the trick, but I wonder if we couldn't instead pass a
callback or instead allow tools to provide a list of supported terms.

I'll apply the patch as it improves the situation, i.e. before people
were left wondering why the modifiers were not producing any effect, now
they get a nice message, but I'd like to hear from Jiri what are his
thoughts, when he's back.

- Arnaldo
 
>  Performance counter stats for 'bash':
> 
>          266258061      instructions/no-inherit/
>          266258061      instructions/inherit/
> 
>        1.402183915 seconds time elapsed
> 
> The result is confusing, because user may expect the first
> 'instructions' event exclude the 'ls' command.
> 
> This patch forbid most of these config terms for 'perf stat'.
> 
> Result:
> 
>  # ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
>  event syntax error: 'instructions/no-inherit/'
>                       \___ 'no-inherit' is not usable in 'perf stat'
>  ...
> 
> We can add blocked config terms back when 'perf stat' really supports them.
> 
> This patch also removes unavailable config term from error message:
> 
>  # ./perf stat -e 'instructions/badterm/' ls
>  event syntax error: 'instructions/badterm/'
>                                    \___ unknown term
> 
>  valid terms: config,config1,config2,name
> 
>  # ./perf stat -e 'cpu/badterm/' ls
>  event syntax error: 'cpu/badterm/'
>                           \___ unknown term
> 
>  valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/builtin-stat.c      |  1 +
>  tools/perf/util/parse-events.c | 48 ++++++++++++++++++++++++++++++++++++++++++
>  tools/perf/util/parse-events.h |  1 +
>  3 files changed, 50 insertions(+)
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 86289df..8c0bc0f 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1831,6 +1831,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
>  	if (evsel_list == NULL)
>  		return -ENOMEM;
>  
> +	parse_events__shrink_config_terms();
>  	argc = parse_options_subcommand(argc, argv, stat_options, stat_subcommands,
>  					(const char **) stat_usage,
>  					PARSE_OPT_STOP_AT_NON_OPTION);
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 3fe0d11..5b8a13b 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -816,6 +816,41 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
>  	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
>  };
>  
> +static bool config_term_shrinked;
> +
> +static bool
> +config_term_avail(int term_type, struct parse_events_error *err)
> +{
> +	if (term_type < 0 || term_type >= __PARSE_EVENTS__TERM_TYPE_NR) {
> +		err->str = strdup("Invalid term_type");
> +		return false;
> +	}
> +	if (!config_term_shrinked)
> +		return true;
> +
> +	switch (term_type) {
> +	case PARSE_EVENTS__TERM_TYPE_CONFIG:
> +	case PARSE_EVENTS__TERM_TYPE_CONFIG1:
> +	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
> +	case PARSE_EVENTS__TERM_TYPE_NAME:
> +		return true;
> +	default:
> +		if (!err)
> +			return false;
> +
> +		/* term_type is validated so indexing is safe */
> +		if (asprintf(&err->str, "'%s' is not usable in 'perf stat'",
> +			     config_term_names[term_type]) < 0)
> +			err->str = NULL;
> +		return false;
> +	}
> +}
> +
> +void parse_events__shrink_config_terms(void)
> +{
> +	config_term_shrinked = true;
> +}
> +
>  typedef int config_term_func_t(struct perf_event_attr *attr,
>  			       struct parse_events_term *term,
>  			       struct parse_events_error *err);
> @@ -885,6 +920,17 @@ do {									   \
>  		return -EINVAL;
>  	}
>  
> +	/*
> +	 * Check term availbility after basic checking so
> +	 * PARSE_EVENTS__TERM_TYPE_USER can be found and filtered.
> +	 *
> +	 * If check availbility at the entry of this function,
> +	 * user will see "'<sysfs term>' is not usable in 'perf stat'"
> +	 * if an invalid config term is provided for legacy events
> +	 * (for example, instructions/badterm/...), which is confusing.
> +	 */
> +	if (!config_term_avail(term->type_term, err))
> +		return -EINVAL;
>  	return 0;
>  #undef CHECK_TYPE_VAL
>  }
> @@ -2177,6 +2223,8 @@ static void config_terms_list(char *buf, size_t buf_sz)
>  	for (i = 0; i < __PARSE_EVENTS__TERM_TYPE_NR; i++) {
>  		const char *name = config_term_names[i];
>  
> +		if (!config_term_avail(i, NULL))
> +			continue;
>  		if (!name)
>  			continue;
>  		if (name[0] == '<')
> diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
> index 2196db9..b6afc57 100644
> --- a/tools/perf/util/parse-events.h
> +++ b/tools/perf/util/parse-events.h
> @@ -106,6 +106,7 @@ struct parse_events_terms {
>  	struct list_head *terms;
>  };
>  
> +void parse_events__shrink_config_terms(void);
>  int parse_events__is_hardcoded_term(struct parse_events_term *term);
>  int parse_events_term__num(struct parse_events_term **term,
>  			   int type_term, char *config, u64 num,
> -- 
> 1.8.3.4

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 12/55] perf tools: Enable config raw and numeric events
       [not found]   ` <20160219141405.GC16141@kernel.org>
@ 2016-02-19 15:08     ` Arnaldo Carvalho de Melo
  2016-02-19 15:15       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 67+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-19 15:08 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Brendan Gregg, Adrian Hunter, Cody P Schafer,
	David S. Miller, He Kuang, Jérémie Galarneau,
	Jiri Olsa, Kirill Smelkov, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, linux-kernel

Em Fri, Feb 19, 2016 at 11:44:00AM +0000, Wang Nan escreveu:
> This patch allows setting config terms for raw and numeric events.
> For example:
> 
>  # perf stat -e cycles/name=cyc/ ls
>  ...
>  1821108      cyc
>  ...
> 
>  # perf stat -e r6530160/name=event/ ls
>  ...
>  1103195      event
>  ...
> 
>  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
>  ...
>  # perf report --stdio

Nice stuff, but I'm investigating now why I'm getting this:

  [acme@jouet linux]$ make O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/home/acme/git/linux/tools/perf'
    BUILD:   Doing 'make -j4' parallel build
    BISON    /tmp/build/perf/util/parse-events-bison.c
  util/parse-events.y:436.23-38: error: symbol opt_event_config is used, but is not defined as a token and has no rules
   PE_VALUE ':' PE_VALUE opt_event_config
                         ^^^^^^^^^^^^^^^^
  util/parse-events.y:442.68-69: error: $4 of ‘event_legacy_numeric’ has no declared type
   	ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, $4));
                                                                      ^^
  util/parse-events.y:443.36-37: error: $4 of ‘event_legacy_numeric’ has no declared type
   	parse_events_terms__delete($4);
                                      ^^
  util/parse-events.y:454.74-75: error: $2 of ‘event_legacy_raw’ has no declared type
   	ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, $2));
                                                                            ^^
  util/parse-events.y:455.36-37: error: $2 of ‘event_legacy_raw’ has no declared type
   	parse_events_terms__delete($2);
                                    ^^
  util/Build:122: recipe for target '/tmp/build/perf/util/parse-events-bison.c' failed
  make[3]: *** [/tmp/build/perf/util/parse-events-bison.c] Error 1
  /home/acme/git/linux/tools/build/Makefile.build:116: recipe for target 'util' failed
  make[2]: *** [util] Error 2
  Makefile.perf:434: recipe for target '/tmp/build/perf/libperf-in.o' failed
  make[1]: *** [/tmp/build/perf/libperf-in.o] Error 2
  make[1]: *** Waiting for unfinished jobs....
  Makefile:108: recipe for target 'install-bin' failed
  make: *** [install-bin] Error 2
  make: Leaving directory '/home/acme/git/linux/tools/perf'
  [acme@jouet linux]$ 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 12/55] perf tools: Enable config raw and numeric events
  2016-02-19 15:08     ` Arnaldo Carvalho de Melo
@ 2016-02-19 15:15       ` Arnaldo Carvalho de Melo
  2016-02-19 21:52         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 67+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-19 15:15 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Brendan Gregg, Adrian Hunter, Cody P Schafer,
	David S. Miller, He Kuang, Jérémie Galarneau,
	Jiri Olsa, Kirill Smelkov, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, linux-kernel

Em Fri, Feb 19, 2016 at 12:08:47PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Feb 19, 2016 at 11:44:00AM +0000, Wang Nan escreveu:
> > This patch allows setting config terms for raw and numeric events.
> > For example:
> > 
> >  # perf stat -e cycles/name=cyc/ ls
> >  ...
> >  1821108      cyc
> >  ...
> > 
> >  # perf stat -e r6530160/name=event/ ls
> >  ...
> >  1103195      event
> >  ...
> > 
> >  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
> >  ...
> >  # perf report --stdio
> 
> Nice stuff, but I'm investigating now why I'm getting this:
> 
>   [acme@jouet linux]$ make O=/tmp/build/perf -C tools/perf install-bin
>   make: Entering directory '/home/acme/git/linux/tools/perf'
>     BUILD:   Doing 'make -j4' parallel build
>     BISON    /tmp/build/perf/util/parse-events-bison.c
>   util/parse-events.y:436.23-38: error: symbol opt_event_config is used, but is not defined as a token and has no rules
>    PE_VALUE ':' PE_VALUE opt_event_config
>                          ^^^^^^^^^^^^^^^^

Ok, this is because I deferred the patch that introduces this
'opt_event_config' thing, that is buried in a BPF patch, when it
could've probably have stood out in a separate patch, trying that now.

- Arnaldo

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 12/55] perf tools: Enable config raw and numeric events
  2016-02-19 15:15       ` Arnaldo Carvalho de Melo
@ 2016-02-19 21:52         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 67+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-19 21:52 UTC (permalink / raw)
  To: Wang Nan
  Cc: Alexei Starovoitov, Brendan Gregg, Adrian Hunter, Cody P Schafer,
	David S. Miller, He Kuang, Jérémie Galarneau,
	Jiri Olsa, Kirill Smelkov, Li Zefan, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, linux-kernel

Em Fri, Feb 19, 2016 at 12:15:33PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Feb 19, 2016 at 12:08:47PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Fri, Feb 19, 2016 at 11:44:00AM +0000, Wang Nan escreveu:
> > > This patch allows setting config terms for raw and numeric events.
> > > For example:
> > > 
> > >  # perf stat -e cycles/name=cyc/ ls
> > >  ...
> > >  1821108      cyc
> > >  ...
> > > 
> > >  # perf stat -e r6530160/name=event/ ls
> > >  ...
> > >  1103195      event
> > >  ...
> > > 
> > >  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
> > >  ...
> > >  # perf report --stdio
> > 
> > Nice stuff, but I'm investigating now why I'm getting this:
> > 
> >   [acme@jouet linux]$ make O=/tmp/build/perf -C tools/perf install-bin
> >   make: Entering directory '/home/acme/git/linux/tools/perf'
> >     BUILD:   Doing 'make -j4' parallel build
> >     BISON    /tmp/build/perf/util/parse-events-bison.c
> >   util/parse-events.y:436.23-38: error: symbol opt_event_config is used, but is not defined as a token and has no rules
> >    PE_VALUE ':' PE_VALUE opt_event_config
> >                          ^^^^^^^^^^^^^^^^
> 
> Ok, this is because I deferred the patch that introduces this
> 'opt_event_config' thing, that is buried in a BPF patch, when it
> could've probably have stood out in a separate patch, trying that now.

Ok, so I added the patch below, that uses this opt_event_config thing to
simplify other places where the event config is optional, like
tracepoints, please see if this is ok, my bison skills are limited:

commit 095d8d6283cf8556432673c408cf85de4bbcd64d
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Fri Feb 19 18:45:12 2016 -0300

    perf tools: Introduce opt_event_config nonterminal
    
    To remove duplicated code that differs only in using the matching
    '/a,b,c/' part or NULL if no event configuration is done ('//' or no
    pair of slashes at all).
    
    Will be used by some new targets allowing the configuration of hardware
    events, etc.
    
    Lifted part of the 'opt_event_config' nonterminal from a patch by Wang
    Nan.
    
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
    Cc: Cody P Schafer <dev@codyps.com>
    Cc: He Kuang <hekuang@huawei.com>
    Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Kirill Smelkov <kirr@nexedi.com>
    Cc: Li Zefan <lizefan@huawei.com>
    Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Zefan Li <lizefan@huawei.com>
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/n/tip-e3xzpx9cqsmwnaguaxyw6r42@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index c0eac88ef474..ce68746bdc89 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -64,6 +64,7 @@ static inc_group_count(struct list_head *list,
 %type <str> PE_PMU_EVENT_PRE PE_PMU_EVENT_SUF PE_KERNEL_PMU_EVENT
 %type <num> value_sym
 %type <head> event_config
+%type <head> opt_event_config
 %type <term> event_term
 %type <head> event_pmu
 %type <head> event_legacy_symbol
@@ -222,16 +223,6 @@ PE_NAME '/' event_config '/'
 	$$ = list;
 }
 |
-PE_NAME '/' '/'
-{
-	struct parse_events_evlist *data = _data;
-	struct list_head *list;
-
-	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_pmu(data, list, $1, NULL));
-	$$ = list;
-}
-|
 PE_KERNEL_PMU_EVENT sep_dc
 {
 	struct parse_events_evlist *data = _data;
@@ -378,7 +369,7 @@ PE_PREFIX_MEM PE_VALUE sep_dc
 }
 
 event_legacy_tracepoint:
-tracepoint_name
+tracepoint_name opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct parse_events_error *error = data->error;
@@ -389,24 +380,7 @@ tracepoint_name
 		error->idx = @1.first_column;
 
 	if (parse_events_add_tracepoint(list, &data->idx, $1.sys, $1.event,
-					error, NULL))
-		return -1;
-
-	$$ = list;
-}
-|
-tracepoint_name '/' event_config '/'
-{
-	struct parse_events_evlist *data = _data;
-	struct parse_events_error *error = data->error;
-	struct list_head *list;
-
-	ALLOC_LIST(list);
-	if (error)
-		error->idx = @1.first_column;
-
-	if (parse_events_add_tracepoint(list, &data->idx, $1.sys, $1.event,
-					error, $3))
+					error, $2))
 		return -1;
 
 	$$ = list;
@@ -476,6 +450,21 @@ PE_BPF_SOURCE
 	$$ = list;
 }
 
+opt_event_config:
+'/' event_config '/'
+{
+	$$ = $2;
+}
+|
+'/' '/'
+{
+	$$ = NULL;
+}
+|
+{
+	$$ = NULL;
+}
+
 start_terms: event_config
 {
 	struct parse_events_terms *data = _data;

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [tip:perf/core] perf bpf: Rename bpf_prog_priv__clear() to clear_prog_priv()
  2016-02-19 11:43 ` [PATCH 03/55] perf tools: Rename bpf_prog_priv__clear() to clear_prog_priv() Wang Nan
@ 2016-02-20 11:36   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 67+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-20 11:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: namhyung, wangnan0, acme, masami.hiramatsu.pt, hekuang, jolsa,
	mingo, linux-kernel, kirr, lizefan, peterz, brendan.d.gregg,
	jeremie.galarneau, dev, tglx, hpa, ast, adrian.hunter

Commit-ID:  80cdce7666924158af63ba5071805a28800ebe4b
Gitweb:     http://git.kernel.org/tip/80cdce7666924158af63ba5071805a28800ebe4b
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 19 Feb 2016 11:43:51 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 19 Feb 2016 19:12:46 -0300

perf bpf: Rename bpf_prog_priv__clear() to clear_prog_priv()

The name of bpf_prog_priv__clear() doesn't follow perf's naming
convention. bpf_prog_priv__delete() seems to be a better name. However,
bpf_prog_priv__delete() should be a method of 'struct bpf_prog_priv',
but its first parameter is 'struct bpf_program'.

It is callback from libbpf to clear priv structures when destroying a
bpf program. It is actually a method of bpf_program (libbpf object), but
bpf_program__ functions should be provided by libbpf.

This patch removes the prefix of that function.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/bpf-loader.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 540a7ef..0bdccf4 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -108,8 +108,8 @@ void bpf__clear(void)
 }
 
 static void
-bpf_prog_priv__clear(struct bpf_program *prog __maybe_unused,
-		     void *_priv)
+clear_prog_priv(struct bpf_program *prog __maybe_unused,
+		void *_priv)
 {
 	struct bpf_prog_priv *priv = _priv;
 
@@ -337,7 +337,7 @@ config_bpf_program(struct bpf_program *prog)
 	}
 	pr_debug("bpf: config '%s' is ok\n", config_str);
 
-	err = bpf_program__set_private(prog, priv, bpf_prog_priv__clear);
+	err = bpf_program__set_private(prog, priv, clear_prog_priv);
 	if (err) {
 		pr_debug("Failed to set priv for program '%s'\n", config_str);
 		goto errout;

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [tip:perf/core] perf tools: Fix checking asprintf return value
  2016-02-19 11:43 ` [PATCH 04/55] perf tools: Fix checking asprintf return value Wang Nan
@ 2016-02-20 11:36   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 67+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-20 11:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, hekuang, tglx, masami.hiramatsu.pt, wangnan0, jolsa, ast,
	acme, dev, namhyung, jeremie.galarneau, adrian.hunter,
	linux-kernel, lizefan, brendan.d.gregg, hpa, peterz, kirr

Commit-ID:  26dee028d365fbc0e3326606a8520260b4462381
Gitweb:     http://git.kernel.org/tip/26dee028d365fbc0e3326606a8520260b4462381
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 19 Feb 2016 11:43:52 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 19 Feb 2016 19:12:47 -0300

perf tools: Fix checking asprintf return value

According to man pages, asprintf returns -1 when failure. This patch
fixes two incorrect return value checker.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: stable@vger.kernel.org # v4.4+
Fixes: ffeb883e5662 ("perf tools: Show proper error message for wrong terms of hw/sw events")
Link: http://lkml.kernel.org/r/1455882283-79592-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index e5583fd..72524c7 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2110,11 +2110,11 @@ char *parse_events_formats_error_string(char *additional_terms)
 
 	/* valid terms */
 	if (additional_terms) {
-		if (!asprintf(&str, "valid terms: %s,%s",
-			      additional_terms, static_terms))
+		if (asprintf(&str, "valid terms: %s,%s",
+			     additional_terms, static_terms) < 0)
 			goto fail;
 	} else {
-		if (!asprintf(&str, "valid terms: %s", static_terms))
+		if (asprintf(&str, "valid terms: %s", static_terms) < 0)
 			goto fail;
 	}
 	return str;

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [tip:perf/core] perf tools: Create config_term_names array
  2016-02-19 11:43 ` [PATCH 09/55] perf tools: Create config_term_names array Wang Nan
@ 2016-02-20 11:36   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 67+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-20 11:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: ast, hekuang, mingo, namhyung, hpa, linux-kernel, acme, lizefan,
	kirr, jeremie.galarneau, wangnan0, dev, tglx,
	masami.hiramatsu.pt, peterz, jolsa, adrian.hunter,
	brendan.d.gregg

Commit-ID:  17cb5f84b89fd39a143f1c899836f40420a6b799
Gitweb:     http://git.kernel.org/tip/17cb5f84b89fd39a143f1c899836f40420a6b799
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 19 Feb 2016 11:43:57 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 19 Feb 2016 19:12:48 -0300

perf tools: Create config_term_names array

config_term_names[] is introduced for future commits which will be able
to retrieve the config name through the config term.

Utilize this array in parse_events_formats_error_string() so the missing
'{,no-}inherit' terms are added.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.c | 51 +++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/parse-events.h |  3 ++-
 tools/perf/util/parse-events.l |  3 +--
 3 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 72524c7..fd085d5 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -746,6 +746,25 @@ static int check_type_val(struct parse_events_term *term,
 	return -EINVAL;
 }
 
+/*
+ * Update according to parse-events.l
+ */
+static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
+	[PARSE_EVENTS__TERM_TYPE_USER]			= "<sysfs term>",
+	[PARSE_EVENTS__TERM_TYPE_CONFIG]		= "config",
+	[PARSE_EVENTS__TERM_TYPE_CONFIG1]		= "config1",
+	[PARSE_EVENTS__TERM_TYPE_CONFIG2]		= "config2",
+	[PARSE_EVENTS__TERM_TYPE_NAME]			= "name",
+	[PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD]		= "period",
+	[PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ]		= "freq",
+	[PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE]	= "branch_type",
+	[PARSE_EVENTS__TERM_TYPE_TIME]			= "time",
+	[PARSE_EVENTS__TERM_TYPE_CALLGRAPH]		= "call-graph",
+	[PARSE_EVENTS__TERM_TYPE_STACKSIZE]		= "stack-size",
+	[PARSE_EVENTS__TERM_TYPE_NOINHERIT]		= "no-inherit",
+	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
+};
+
 typedef int config_term_func_t(struct perf_event_attr *attr,
 			       struct parse_events_term *term,
 			       struct parse_events_error *err);
@@ -2097,6 +2116,31 @@ void parse_events_evlist_error(struct parse_events_evlist *data,
 	WARN_ONCE(!err->str, "WARNING: failed to allocate error string");
 }
 
+static void config_terms_list(char *buf, size_t buf_sz)
+{
+	int i;
+	bool first = true;
+
+	buf[0] = '\0';
+	for (i = 0; i < __PARSE_EVENTS__TERM_TYPE_NR; i++) {
+		const char *name = config_term_names[i];
+
+		if (!name)
+			continue;
+		if (name[0] == '<')
+			continue;
+
+		if (strlen(buf) + strlen(name) + 2 >= buf_sz)
+			return;
+
+		if (!first)
+			strcat(buf, ",");
+		else
+			first = false;
+		strcat(buf, name);
+	}
+}
+
 /*
  * Return string contains valid config terms of an event.
  * @additional_terms: For terms such as PMU sysfs terms.
@@ -2104,10 +2148,11 @@ void parse_events_evlist_error(struct parse_events_evlist *data,
 char *parse_events_formats_error_string(char *additional_terms)
 {
 	char *str;
-	static const char *static_terms = "config,config1,config2,name,"
-					  "period,freq,branch_type,time,"
-					  "call-graph,stack-size\n";
+	/* "branch_type" is the longest name */
+	char static_terms[__PARSE_EVENTS__TERM_TYPE_NR *
+			  (sizeof("branch_type") - 1)];
 
+	config_terms_list(static_terms, sizeof(static_terms));
 	/* valid terms */
 	if (additional_terms) {
 		if (asprintf(&str, "valid terms: %s,%s",
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 53628bf..b50d50b 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -68,7 +68,8 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_CALLGRAPH,
 	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
-	PARSE_EVENTS__TERM_TYPE_INHERIT
+	PARSE_EVENTS__TERM_TYPE_INHERIT,
+	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
 struct parse_events_term {
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 58c5831..99486e6 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -178,8 +178,7 @@ modifier_bp	[rwx]{1,3}
 
 <config>{
 	/*
-	 * Please update parse_events_formats_error_string any time
-	 * new static term is added.
+	 * Please update config_term_names when new static term is added.
 	 */
 config			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG); }
 config1			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CONFIG1); }

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [tip:perf/core] perf stat: Bail out on unsupported event config modifiers
  2016-02-19 11:43 ` [PATCH 10/55] perf stat: Forbid user passing improper config terms Wang Nan
       [not found]   ` <20160219140333.GB16141@kernel.org>
@ 2016-02-20 11:37   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 67+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-20 11:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, linux-kernel, hekuang, tglx, wangnan0, brendan.d.gregg,
	ast, jolsa, adrian.hunter, masami.hiramatsu.pt, hpa, namhyung,
	acme, dev, jeremie.galarneau, lizefan, mingo, kirr

Commit-ID:  1669e509ea25e4e3e871d913d21b1cac4a96d1e8
Gitweb:     http://git.kernel.org/tip/1669e509ea25e4e3e871d913d21b1cac4a96d1e8
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 19 Feb 2016 11:43:58 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 19 Feb 2016 19:12:49 -0300

perf stat: Bail out on unsupported event config modifiers

'perf stat' accepts some config terms but doesn't apply them. For
example:

  # perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
  # ls
  # exit

  Performance counter stats for 'bash':

         266258061      instructions/no-inherit/
         266258061      instructions/inherit/

       1.402183915 seconds time elapsed

The result is confusing, because user may expect the first
'instructions' event exclude the 'ls' command.

This patch forbid most of these config terms for 'perf stat'.

Result:

  # ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
  event syntax error: 'instructions/no-inherit/'
                       \___ 'no-inherit' is not usable in 'perf stat'
  ...

We can add blocked config terms back when 'perf stat' really supports them.

This patch also removes unavailable config term from error message:

  # ./perf stat -e 'instructions/badterm/' ls
  event syntax error: 'instructions/badterm/'
                                    \___ unknown term

  valid terms: config,config1,config2,name

  # ./perf stat -e 'cpu/badterm/' ls
  event syntax error: 'cpu/badterm/'
                           \___ unknown term

  valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-11-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c      |  1 +
 tools/perf/util/parse-events.c | 48 ++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/parse-events.h |  1 +
 3 files changed, 50 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 86289df..8c0bc0f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1831,6 +1831,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (evsel_list == NULL)
 		return -ENOMEM;
 
+	parse_events__shrink_config_terms();
 	argc = parse_options_subcommand(argc, argv, stat_options, stat_subcommands,
 					(const char **) stat_usage,
 					PARSE_OPT_STOP_AT_NON_OPTION);
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index fd085d5..eb5df43 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -765,6 +765,41 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
 };
 
+static bool config_term_shrinked;
+
+static bool
+config_term_avail(int term_type, struct parse_events_error *err)
+{
+	if (term_type < 0 || term_type >= __PARSE_EVENTS__TERM_TYPE_NR) {
+		err->str = strdup("Invalid term_type");
+		return false;
+	}
+	if (!config_term_shrinked)
+		return true;
+
+	switch (term_type) {
+	case PARSE_EVENTS__TERM_TYPE_CONFIG:
+	case PARSE_EVENTS__TERM_TYPE_CONFIG1:
+	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
+	case PARSE_EVENTS__TERM_TYPE_NAME:
+		return true;
+	default:
+		if (!err)
+			return false;
+
+		/* term_type is validated so indexing is safe */
+		if (asprintf(&err->str, "'%s' is not usable in 'perf stat'",
+			     config_term_names[term_type]) < 0)
+			err->str = NULL;
+		return false;
+	}
+}
+
+void parse_events__shrink_config_terms(void)
+{
+	config_term_shrinked = true;
+}
+
 typedef int config_term_func_t(struct perf_event_attr *attr,
 			       struct parse_events_term *term,
 			       struct parse_events_error *err);
@@ -834,6 +869,17 @@ do {									   \
 		return -EINVAL;
 	}
 
+	/*
+	 * Check term availbility after basic checking so
+	 * PARSE_EVENTS__TERM_TYPE_USER can be found and filtered.
+	 *
+	 * If check availbility at the entry of this function,
+	 * user will see "'<sysfs term>' is not usable in 'perf stat'"
+	 * if an invalid config term is provided for legacy events
+	 * (for example, instructions/badterm/...), which is confusing.
+	 */
+	if (!config_term_avail(term->type_term, err))
+		return -EINVAL;
 	return 0;
 #undef CHECK_TYPE_VAL
 }
@@ -2125,6 +2171,8 @@ static void config_terms_list(char *buf, size_t buf_sz)
 	for (i = 0; i < __PARSE_EVENTS__TERM_TYPE_NR; i++) {
 		const char *name = config_term_names[i];
 
+		if (!config_term_avail(i, NULL))
+			continue;
 		if (!name)
 			continue;
 		if (name[0] == '<')
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index b50d50b..76151f9 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -105,6 +105,7 @@ struct parse_events_terms {
 	struct list_head *terms;
 };
 
+void parse_events__shrink_config_terms(void);
 int parse_events__is_hardcoded_term(struct parse_events_term *term);
 int parse_events_term__num(struct parse_events_term **term,
 			   int type_term, char *config, u64 num,

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [tip:perf/core] perf tools: Rename and move pmu_event_name to get_config_name
  2016-02-19 11:43 ` [PATCH 11/55] perf tools: Rename and move pmu_event_name to get_config_name Wang Nan
@ 2016-02-20 11:37   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 67+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-20 11:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: brendan.d.gregg, masami.hiramatsu.pt, lizefan, jolsa, tglx,
	wangnan0, linux-kernel, kirr, peterz, mingo, acme, ast, dev,
	jeremie.galarneau, namhyung, hekuang, adrian.hunter, hpa

Commit-ID:  e814fddde18fec43fa41a27ae94c09b54772697e
Gitweb:     http://git.kernel.org/tip/e814fddde18fec43fa41a27ae94c09b54772697e
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 19 Feb 2016 11:43:59 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 19 Feb 2016 19:12:50 -0300

perf tools: Rename and move pmu_event_name to get_config_name

Following commits will make more events obey /name=newname/ options.
This patch makes pmu_event_name() a generic helper.

Makes new get_config_name() accept NULL input to make life easier.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index eb5df43..3243e95 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -279,7 +279,24 @@ const char *event_type(int type)
 	return "unknown";
 }
 
+static int parse_events__is_name_term(struct parse_events_term *term)
+{
+	return term->type_term == PARSE_EVENTS__TERM_TYPE_NAME;
+}
 
+static char *get_config_name(struct list_head *head_terms)
+{
+	struct parse_events_term *term;
+
+	if (!head_terms)
+		return NULL;
+
+	list_for_each_entry(term, head_terms, list)
+		if (parse_events__is_name_term(term))
+			return term->val.str;
+
+	return NULL;
+}
 
 static struct perf_evsel *
 __add_event(struct list_head *list, int *idx,
@@ -1029,22 +1046,6 @@ int parse_events_add_numeric(struct parse_events_evlist *data,
 	return add_event(list, &data->idx, &attr, NULL, &config_terms);
 }
 
-static int parse_events__is_name_term(struct parse_events_term *term)
-{
-	return term->type_term == PARSE_EVENTS__TERM_TYPE_NAME;
-}
-
-static char *pmu_event_name(struct list_head *head_terms)
-{
-	struct parse_events_term *term;
-
-	list_for_each_entry(term, head_terms, list)
-		if (parse_events__is_name_term(term))
-			return term->val.str;
-
-	return NULL;
-}
-
 int parse_events_add_pmu(struct parse_events_evlist *data,
 			 struct list_head *list, char *name,
 			 struct list_head *head_config)
@@ -1089,7 +1090,7 @@ int parse_events_add_pmu(struct parse_events_evlist *data,
 		return -EINVAL;
 
 	evsel = __add_event(list, &data->idx, &attr,
-			    pmu_event_name(head_config), pmu->cpus,
+			    get_config_name(head_config), pmu->cpus,
 			    &config_terms);
 	if (evsel) {
 		evsel->unit = info.unit;

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [tip:perf/core] perf tools: Enable config raw and numeric events
  2016-02-19 11:44 ` [PATCH 12/55] perf tools: Enable config raw and numeric events Wang Nan
       [not found]   ` <20160219141405.GC16141@kernel.org>
@ 2016-02-20 11:38   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 67+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-20 11:38 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, dev, hpa, acme, namhyung, mingo, kirr, hekuang,
	masami.hiramatsu.pt, brendan.d.gregg, tglx, linux-kernel,
	adrian.hunter, peterz, lizefan, ast, wangnan0, jeremie.galarneau

Commit-ID:  10bf358a1b79fa1311eb05ee31f2cefdcad01741
Gitweb:     http://git.kernel.org/tip/10bf358a1b79fa1311eb05ee31f2cefdcad01741
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 19 Feb 2016 11:44:00 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 19 Feb 2016 19:12:50 -0300

perf tools: Enable config raw and numeric events

This patch allows setting config terms for raw and numeric events.
For example:

  # perf stat -e cycles/name=cyc/ ls
  ...
  1821108      cyc
  ...

  # perf stat -e r6530160/name=event/ ls
  ...
  1103195      event
  ...

  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
  ...
  # perf report --stdio
  ...
  # Samples: 124  of event 'cycles'
  46.61%     0.00%  swapper        [kernel.vmlinux]            [k] cpu_startup_entry
  41.26%     0.00%  swapper        [kernel.vmlinux]            [k] start_secondary
  ...
  # Samples: 91  of event 'evtx'
  ...
  93.76%     0.00%  swapper      [kernel.vmlinux]            [k] cpu_startup_entry
          |
          ---cpu_startup_entry
             |
             |--66.63%--call_cpuidle
             |          cpuidle_enter
             |          |
  ...

3 test cases are introduced to test config terms for symbol, raw and
numeric events.

Committer note:

Further testing shows that we can retrieve the event name using 'perf
evlist -v' and looking at the 'config' perf_event_attr field, i.e.:

  # perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.724 MB perf.data (2076 samples) ]
  # perf evlist
  cycles
  evtx
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
evtx: type: 4, size: 112, config: 0x6530160, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
  #

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-13-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/parse-events.c | 40 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/parse-events.c  |  3 ++-
 tools/perf/util/parse-events.y  | 10 ++++++----
 3 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 6648274..15e2d05 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -1271,6 +1271,31 @@ static int test__checkevent_precise_max_modifier(struct perf_evlist *evlist)
 	return 0;
 }
 
+static int test__checkevent_config_symbol(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+	TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "insn") == 0);
+	return 0;
+}
+
+static int test__checkevent_config_raw(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+	TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "rawpmu") == 0);
+	return 0;
+}
+
+static int test__checkevent_config_num(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+	TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "numpmu") == 0);
+	return 0;
+}
+
+
 static int count_tracepoints(void)
 {
 	struct dirent *events_ent;
@@ -1579,6 +1604,21 @@ static struct evlist_test test__events[] = {
 		.check = test__checkevent_precise_max_modifier,
 		.id    = 47,
 	},
+	{
+		.name  = "instructions/name=insn/",
+		.check = test__checkevent_config_symbol,
+		.id    = 48,
+	},
+	{
+		.name  = "r1234/name=rawpmu/",
+		.check = test__checkevent_config_raw,
+		.id    = 49,
+	},
+	{
+		.name  = "4:0x6530160/name=numpmu/",
+		.check = test__checkevent_config_num,
+		.id    = 50,
+	},
 };
 
 static struct evlist_test test__events_pmu[] = {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 3243e95..75576e1 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1043,7 +1043,8 @@ int parse_events_add_numeric(struct parse_events_evlist *data,
 			return -ENOMEM;
 	}
 
-	return add_event(list, &data->idx, &attr, NULL, &config_terms);
+	return add_event(list, &data->idx, &attr,
+			 get_config_name(head_config), &config_terms);
 }
 
 int parse_events_add_pmu(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index ce68746..82029f9 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -407,24 +407,26 @@ PE_NAME ':' PE_NAME
 }
 
 event_legacy_numeric:
-PE_VALUE ':' PE_VALUE
+PE_VALUE ':' PE_VALUE opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, NULL));
+	ABORT_ON(parse_events_add_numeric(data, list, (u32)$1, $3, $4));
+	parse_events_terms__delete($4);
 	$$ = list;
 }
 
 event_legacy_raw:
-PE_RAW
+PE_RAW opt_event_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, NULL));
+	ABORT_ON(parse_events_add_numeric(data, list, PERF_TYPE_RAW, $1, $2));
+	parse_events_terms__delete($2);
 	$$ = list;
 }
 

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [tip:perf/core] perf tools: Enable config and setting names for legacy cache events
  2016-02-19 11:44 ` [PATCH 13/55] perf tools: Enable config and setting names for legacy cache events Wang Nan
@ 2016-02-20 11:38   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 67+ messages in thread
From: tip-bot for Wang Nan @ 2016-02-20 11:38 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: wangnan0, kirr, brendan.d.gregg, adrian.hunter, jolsa, tglx,
	namhyung, dev, hpa, jeremie.galarneau, masami.hiramatsu.pt,
	peterz, linux-kernel, ast, hekuang, acme, mingo, lizefan

Commit-ID:  43d0b97817a41b274aaec0476e912dae3ae1f93d
Gitweb:     http://git.kernel.org/tip/43d0b97817a41b274aaec0476e912dae3ae1f93d
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Fri, 19 Feb 2016 11:44:01 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 19 Feb 2016 19:12:51 -0300

perf tools: Enable config and setting names for legacy cache events

This patch allows setting config terms for legacy cache events.
For example:

  # perf stat -e L1-icache-misses/name=valA/ -e branches/name=valB/ ls
  ...
   Performance counter stats for 'ls':

              11299      valA
             451605      valB

        0.000779091 seconds time elapsed

  # perf record -e cache-misses/name=inh/ -e cache-misses/name=noinh,no-inherit/ bash
  # ls
  # exit
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.023 MB perf.data (131 samples) ]
  # perf report --stdio | grep -B 1 'Event count'
  # Samples: 105  of event 'inh'
  # Event count (approx.): 109118
  --
  # Samples: 26  of event 'noinh'
  # Event count (approx.): 48302

A test case is introduced to test this feature.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-14-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/parse-events.c | 12 ++++++++++++
 tools/perf/util/parse-events.c  | 30 +++++++++++++++++++++++++++---
 tools/perf/util/parse-events.h  |  4 +++-
 tools/perf/util/parse-events.y  | 18 ++++++++++++------
 4 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 15e2d05..7865f68 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -1295,6 +1295,13 @@ static int test__checkevent_config_num(struct perf_evlist *evlist)
 	return 0;
 }
 
+static int test__checkevent_config_cache(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+	TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "cachepmu") == 0);
+	return 0;
+}
 
 static int count_tracepoints(void)
 {
@@ -1619,6 +1626,11 @@ static struct evlist_test test__events[] = {
 		.check = test__checkevent_config_num,
 		.id    = 50,
 	},
+	{
+		.name  = "L1-dcache-misses/name=cachepmu/",
+		.check = test__checkevent_config_cache,
+		.id    = 51,
+	},
 };
 
 static struct evlist_test test__events_pmu[] = {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 75576e1..2996aa4 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -350,11 +350,25 @@ static int parse_aliases(char *str, const char *names[][PERF_EVSEL__MAX_ALIASES]
 	return -1;
 }
 
+typedef int config_term_func_t(struct perf_event_attr *attr,
+			       struct parse_events_term *term,
+			       struct parse_events_error *err);
+static int config_term_common(struct perf_event_attr *attr,
+			      struct parse_events_term *term,
+			      struct parse_events_error *err);
+static int config_attr(struct perf_event_attr *attr,
+		       struct list_head *head,
+		       struct parse_events_error *err,
+		       config_term_func_t config_term);
+
 int parse_events_add_cache(struct list_head *list, int *idx,
-			   char *type, char *op_result1, char *op_result2)
+			   char *type, char *op_result1, char *op_result2,
+			   struct parse_events_error *error,
+			   struct list_head *head_config)
 {
 	struct perf_event_attr attr;
-	char name[MAX_NAME_LEN];
+	LIST_HEAD(config_terms);
+	char name[MAX_NAME_LEN], *config_name;
 	int cache_type = -1, cache_op = -1, cache_result = -1;
 	char *op_result[2] = { op_result1, op_result2 };
 	int i, n;
@@ -368,6 +382,7 @@ int parse_events_add_cache(struct list_head *list, int *idx,
 	if (cache_type == -1)
 		return -EINVAL;
 
+	config_name = get_config_name(head_config);
 	n = snprintf(name, MAX_NAME_LEN, "%s", type);
 
 	for (i = 0; (i < 2) && (op_result[i]); i++) {
@@ -408,7 +423,16 @@ int parse_events_add_cache(struct list_head *list, int *idx,
 	memset(&attr, 0, sizeof(attr));
 	attr.config = cache_type | (cache_op << 8) | (cache_result << 16);
 	attr.type = PERF_TYPE_HW_CACHE;
-	return add_event(list, idx, &attr, name, NULL);
+
+	if (head_config) {
+		if (config_attr(&attr, head_config, error,
+				config_term_common))
+			return -EINVAL;
+
+		if (get_config_terms(head_config, &config_terms))
+			return -ENOMEM;
+	}
+	return add_event(list, idx, &attr, config_name ? : name, &config_terms);
 }
 
 static void tracepoint_error(struct parse_events_error *e, int err,
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 76151f9..d5eb2af 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -140,7 +140,9 @@ int parse_events_add_numeric(struct parse_events_evlist *data,
 			     u32 type, u64 config,
 			     struct list_head *head_config);
 int parse_events_add_cache(struct list_head *list, int *idx,
-			   char *type, char *op_result1, char *op_result2);
+			   char *type, char *op_result1, char *op_result2,
+			   struct parse_events_error *error,
+			   struct list_head *head_config);
 int parse_events_add_breakpoint(struct list_head *list, int *idx,
 				void *ptr, char *type, u64 len);
 int parse_events_add_pmu(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 82029f9..6a2d006 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -293,33 +293,39 @@ value_sym sep_slash_dc
 }
 
 event_legacy_cache:
-PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT
+PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT opt_event_config
 {
 	struct parse_events_evlist *data = _data;
+	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, $5));
+	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, $5, error, $6));
+	parse_events_terms__delete($6);
 	$$ = list;
 }
 |
-PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT
+PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT opt_event_config
 {
 	struct parse_events_evlist *data = _data;
+	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, NULL));
+	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, $3, NULL, error, $4));
+	parse_events_terms__delete($4);
 	$$ = list;
 }
 |
-PE_NAME_CACHE_TYPE
+PE_NAME_CACHE_TYPE opt_event_config
 {
 	struct parse_events_evlist *data = _data;
+	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, NULL, NULL));
+	ABORT_ON(parse_events_add_cache(list, &data->idx, $1, NULL, NULL, error, $2));
+	parse_events_terms__delete($2);
 	$$ = list;
 }
 

^ permalink raw reply related	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2016-02-20 11:39 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-19 11:43 [PATCH 00/55] perf tools: Bugfix, BPF improvements and overwrite ring buffer support Wang Nan
2016-02-19 11:43 ` [PATCH 01/55] perf tools: Record text offset in dso to calculate objdump address Wang Nan
2016-02-19 11:43 ` [PATCH 02/55] perf tools: Adjust symbol for shared objects Wang Nan
2016-02-19 11:43 ` [PATCH 03/55] perf tools: Rename bpf_prog_priv__clear() to clear_prog_priv() Wang Nan
2016-02-20 11:36   ` [tip:perf/core] perf bpf: " tip-bot for Wang Nan
2016-02-19 11:43 ` [PATCH 04/55] perf tools: Fix checking asprintf return value Wang Nan
2016-02-20 11:36   ` [tip:perf/core] " tip-bot for Wang Nan
2016-02-19 11:43 ` [PATCH 05/55] perf tools: Add API to config maps in bpf object Wang Nan
2016-02-19 11:43 ` [PATCH 06/55] perf tools: Enable BPF object configure syntax Wang Nan
2016-02-19 11:43 ` [PATCH 07/55] perf record: Apply config to BPF objects before recording Wang Nan
2016-02-19 11:43 ` [PATCH 08/55] perf tools: Enable passing event to BPF object Wang Nan
2016-02-19 11:43 ` [PATCH 09/55] perf tools: Create config_term_names array Wang Nan
2016-02-20 11:36   ` [tip:perf/core] " tip-bot for Wang Nan
2016-02-19 11:43 ` [PATCH 10/55] perf stat: Forbid user passing improper config terms Wang Nan
     [not found]   ` <20160219140333.GB16141@kernel.org>
2016-02-19 15:08     ` Arnaldo Carvalho de Melo
2016-02-20 11:37   ` [tip:perf/core] perf stat: Bail out on unsupported event config modifiers tip-bot for Wang Nan
2016-02-19 11:43 ` [PATCH 11/55] perf tools: Rename and move pmu_event_name to get_config_name Wang Nan
2016-02-20 11:37   ` [tip:perf/core] " tip-bot for Wang Nan
2016-02-19 11:44 ` [PATCH 12/55] perf tools: Enable config raw and numeric events Wang Nan
     [not found]   ` <20160219141405.GC16141@kernel.org>
2016-02-19 15:08     ` Arnaldo Carvalho de Melo
2016-02-19 15:15       ` Arnaldo Carvalho de Melo
2016-02-19 21:52         ` Arnaldo Carvalho de Melo
2016-02-20 11:38   ` [tip:perf/core] " tip-bot for Wang Nan
2016-02-19 11:44 ` [PATCH 13/55] perf tools: Enable config and setting names for legacy cache events Wang Nan
2016-02-20 11:38   ` [tip:perf/core] " tip-bot for Wang Nan
2016-02-19 11:44 ` [PATCH 14/55] perf tools: Support setting different slots in a BPF map separately Wang Nan
2016-02-19 11:44 ` [PATCH 15/55] perf tools: Enable indices setting syntax for BPF maps Wang Nan
2016-02-19 11:44 ` [PATCH 16/55] perf tools: Pass tracepoint options to BPF script Wang Nan
2016-02-19 11:44 ` [PATCH 17/55] perf tools: Introduce bpf-output event Wang Nan
2016-02-19 11:44 ` [PATCH 18/55] perf data: Support converting data from bpf_perf_event_output() Wang Nan
2016-02-19 11:44 ` [PATCH 19/55] perf data: Explicitly set byte order for integer types Wang Nan
2016-02-19 11:44 ` [PATCH 20/55] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
2016-02-19 11:44 ` [PATCH 21/55] perf core: Set event's default overflow_handler Wang Nan
2016-02-19 11:44 ` [PATCH 22/55] perf core: Prepare writing into ring buffer from end Wang Nan
2016-02-19 11:44 ` [PATCH 23/55] perf core: Add backward attribute to perf event Wang Nan
2016-02-19 11:44 ` [PATCH 24/55] perf core: Reduce perf event output overhead by new overflow handler Wang Nan
2016-02-19 11:44 ` [PATCH 25/55] perf tools: Only validate is_pos for tracking evsels Wang Nan
2016-02-19 11:44 ` [PATCH 26/55] perf tools: Print write_backward value in perf_event_attr__fprintf Wang Nan
2016-02-19 11:44 ` [PATCH 27/55] perf tools: Make ordered_events reusable Wang Nan
2016-02-19 11:44 ` [PATCH 28/55] perf record: Extract synthesize code to record__synthesize() Wang Nan
2016-02-19 11:44 ` [PATCH 29/55] perf tools: Add perf_data_file__switch() helper Wang Nan
2016-02-19 11:44 ` [PATCH 30/55] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
2016-02-19 11:44 ` [PATCH 31/55] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
2016-02-19 11:44 ` [PATCH 32/55] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
2016-02-19 11:44 ` [PATCH 33/55] perf record: Split output into multiple files via '--switch-output' Wang Nan
2016-02-19 11:44 ` [PATCH 34/55] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
2016-02-19 11:44 ` [PATCH 35/55] perf record: Disable buildid cache options by default in switch output mode Wang Nan
2016-02-19 11:44 ` [PATCH 36/55] perf record: Re-synthesize tracking events after output switching Wang Nan
2016-02-19 11:44 ` [PATCH 37/55] perf record: Generate tracking events for process forked by perf Wang Nan
2016-02-19 11:44 ` [PATCH 38/55] perf record: Ensure return non-zero rc when mmap fail Wang Nan
2016-02-19 11:44 ` [PATCH 39/55] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
2016-02-19 11:44 ` [PATCH 40/55] perf tools: Add evlist channel helpers Wang Nan
2016-02-19 11:44 ` [PATCH 41/55] perf tools: Automatically add new channel according to evlist Wang Nan
2016-02-19 11:44 ` [PATCH 42/55] perf tools: Operate multiple channels Wang Nan
2016-02-19 11:44 ` [PATCH 43/55] perf tools: Squash overwrite setting into channel Wang Nan
2016-02-19 11:44 ` [PATCH 44/55] perf record: Don't read from and poll overwrite channel Wang Nan
2016-02-19 11:44 ` [PATCH 45/55] perf record: Don't poll on " Wang Nan
2016-02-19 11:44 ` [PATCH 46/55] perf tools: Detect avalibility of write_backward Wang Nan
2016-02-19 11:44 ` [PATCH 47/55] perf tools: Enable overwrite settings Wang Nan
2016-02-19 11:44 ` [PATCH 48/55] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
2016-02-19 11:44 ` [PATCH 49/55] perf tools: Record fd into perf_mmap Wang Nan
2016-02-19 11:44 ` [PATCH 50/55] perf tools: Add API to pause a channel Wang Nan
2016-02-19 11:44 ` [PATCH 51/55] perf record: Toggle overwrite ring buffer for reading Wang Nan
2016-02-19 11:44 ` [PATCH 52/55] perf record: Rename variable to make code clear Wang Nan
2016-02-19 11:44 ` [PATCH 53/55] perf record: Read from backward ring buffer Wang Nan
2016-02-19 11:44 ` [PATCH 54/55] perf record: Allow generate tracking events at the end of output Wang Nan
2016-02-19 11:44 ` [PATCH 55/55] perf tools: Don't warn about out of order event if write_backward is used Wang Nan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).