* [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode
@ 2016-01-11 13:47 Wang Nan
  2016-01-11 13:47 ` [PATCH 01/53] perf tools: Add -lutil in python lib list for broken python-config Wang Nan
                   ` (52 more replies)
  0 siblings, 53 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:47 UTC (permalink / raw)
  To: acme; +Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan

Hi Arnaldo,

   This patch set is based on today's perf/core. It contains 3 parts:

   1. Bugfixes from my local tree. Most of them are resends (patch 1 - 17).

   2. BPF related improvements. You should have seen them last
      year. Nearly no change (18 - 26).

   3. The most exciting feature I'd like to introduce to you and others:
      perf record overwrite mode support:

      This feature is based on a patch which has been discussed but not
      merged yet [1]. I also send it in this series as patch 27. In this
      patch, the kernel appends the size of an event at the end of the
      event data in the ring buffer, which enables us to read as much
      data as possible from an overwrite ring buffer, so it works like a
      flight recorder. Patches 28 - 53 add support for it. This is an
      example:

 # perf record -a -e cycles/overwrite/ \
	          -e raw_syscalls:sys_enter/overwrite/ \
	          -e raw_syscalls:sys_exit/overwrite/ \
	          -e sched:sched_switch/overwrite/ \
                  --switch-output --tail-tracking

 Then send 3 SIGUSR2 to 'perf' in another console:

 # kill -s SIGUSR2 `ps -e | grep 'pts.*perf' | awk '{print $1}'`

 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2016011205392208 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2016011205392597 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2016011205392906 ]
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2016011205393040 ]

 Here's the result:

 # ls -l ./perf.data.*
  -rw------- 1 root root 4284861 Jan 12 05:39 ./perf.data.2016011205392208
  -rw------- 1 root root 4578477 Jan 12 05:39 ./perf.data.2016011205392597
  -rw------- 1 root root 4602757 Jan 12 05:39 ./perf.data.2016011205392906
  -rw------- 1 root root 5655429 Jan 12 05:39 ./perf.data.2016011205393040

In each perf.data output, we get about 4M events before it receives
the signal.

This should be useful if we have an extra monitor that checks performance
metrics. When it finds something unusual, it can send a SIGUSR2 to
perf to collect data from around the time the bad thing happened.

My next step is to try triggering event dumping using eBPF. Then we can
trigger a perf.data output immediately after a system call takes too
long or when we detect the loss of a display update.

Patches 28 - 40 add a switch-output mode, which makes perf dump a new
perf.data when it receives a SIGUSR2.
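
The signal-driven switch described above can be sketched roughly as follows. This is only an illustration, not the code from the series: the names switch_requested, install_switch_handler(), take_switch_request() and timestamped_name() are hypothetical, and the trailing two digits of the file name are assumed to be hundredths of a second. The handler only sets a flag, and the main loop is expected to do the real work of finishing one output file and starting the next.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Set from the signal handler, consumed by the main loop. */
static volatile sig_atomic_t switch_requested;

static void on_sigusr2(int sig)
{
	(void)sig;
	switch_requested = 1;	/* only set a flag; no real work in the handler */
}

/* Install the SIGUSR2 handler; returns 0 on success, -1 on error. */
static int install_switch_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigusr2;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR2, &sa, NULL);
}

/* Returns 1 once for a pending request, 0 when nothing is pending. */
static int take_switch_request(void)
{
	if (!switch_requested)
		return 0;
	switch_requested = 0;
	return 1;
}

/*
 * Build "<base>.<YYYYmmddHHMMSS><cc>" in 'out' (hypothetical helper).
 * Returns 0 on success, -1 if the name does not fit.
 */
static int timestamped_name(char *out, size_t len, const char *base,
			    const struct tm *tm, int centisec)
{
	char stamp[32];
	int n;

	assert(out != NULL && base != NULL && tm != NULL);
	if (strftime(stamp, sizeof(stamp), "%Y%m%d%H%M%S", tm) == 0)
		return -1;
	n = snprintf(out, len, "%s.%s%02d", base, stamp, centisec);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A main loop would call take_switch_request() on each iteration and, when it returns 1, close the current output and open a new one named by timestamped_name().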

Patches 41 - 45 introduce a concept called 'channel', which allows perf to
collect data through more than one group of mmapped ring buffers with
different configurations.

Patches 46 - 53 are the core of flight record mode. Patch 51 does the
actual reading from the flight recorder ring buffer.
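
To illustrate why a trailing size makes backward reading possible, here is a minimal sketch. The record layout below is a simplified assumption (payload followed by a 64-bit total size) and is NOT the real perf ring-buffer ABI from patch 27; put_record() and walk_backward() are hypothetical names, and wrap-around of the ring buffer is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical record layout (not the real perf ABI):
 *   [payload bytes][uint64_t total_size]
 * total_size counts the payload plus the 8-byte trailer.
 */

/* Append one record at 'pos'; returns the new write position, or 0 if it does not fit. */
static size_t put_record(uint8_t *buf, size_t cap, size_t pos,
			 const void *payload, uint64_t len)
{
	uint64_t total = len + sizeof(uint64_t);

	assert(pos <= cap);
	if (total < len || total > cap - pos)
		return 0;
	memcpy(buf + pos, payload, len);
	memcpy(buf + pos + len, &total, sizeof(total));
	return pos + total;
}

/*
 * Walk backward from the write head. The size stored just before 'head'
 * tells us where the newest record starts, and so on toward older data.
 * starts[] receives the record start offsets, newest first.
 */
static size_t walk_backward(const uint8_t *buf, size_t head,
			    size_t *starts, size_t max)
{
	size_t n = 0;

	while (head >= sizeof(uint64_t) && n < max) {
		uint64_t total;

		memcpy(&total, buf + head - sizeof(total), sizeof(total));
		if (total < sizeof(total) || total > head)
			break;	/* overwritten or corrupted data: stop here */
		head -= total;
		starts[n++] = head;
	}
	return n;
}
```

Without the trailing size, a reader that only knows the current head has no way to find where the previous record begins, which is the problem with an overwrite buffer whose oldest data has been clobbered.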

[1] http://lkml.kernel.org/g/1452518653-1794-1-git-send-email-wangnan0@huawei.com

He Kuang (1):
  perf tools: Support perf event alias name

Jiri Olsa (1):
  perf tools: Add missing sources in perf's MANIFEST

Naveen N. Rao (1):
  perf: bpf: Fix build breakage due to libbpf

Wang Nan (50):
  perf tools: Add -lutil in python lib list for broken python-config
  perf tools: Fix phony build target for build-test
  perf tools: Set parallel making options build-test
  perf tools: Pass O option to Makefile.perf in build-test
  perf tools: Test correct path of perf in build-test
  perf tools: Fix PowerPC native building
  tools: Move Makefile.arch from perf/config to tools/scripts
  tools build: Add BPF feature check to test-all
  perf test: Fix false TEST_OK result for 'perf test hist'
  perf test: Reset err after using it hold errcode in hist testcases
  perf tools: Prevent calling machine__delete() on non-allocated machine
  perf test: Check environment before start real BPF test
  perf tools: Fix symbols searching for offline module in buildid-cache
  perf tools: Fix mmap2 event allocation in synthesize code
  perf test: Improve bp_signal
  perf tools: Add API to config maps in bpf object
  perf tools: Enable BPF object configure syntax
  perf record: Apply config to BPF objects before recording
  perf tools: Enable passing event to BPF object
  perf tools: Support setting different slots in a BPF map separately
  perf tools: Enable indices setting syntax for BPF maps
  perf tools: Introduce bpf-output event
  perf data: Support converting data from bpf_perf_event_output()
  perf/core: Put size of a sample at the end of it by
    PERF_SAMPLE_TAILSIZE
  perf tools: Move timestamp creation to util
  perf tools: Make ordered_events reusable
  perf record: Extract synthesize code to record__synthesize()
  perf tools: Add perf_data_file__switch() helper
  perf record: Turns auxtrace_snapshot_enable into 3 states
  perf record: Introduce record__finish_output() to finish a perf.data
  perf record: Use OPT_BOOLEAN_SET for buildid cache related options
  perf record: Add '--timestamp-filename' option to append timestamp to
    output filename
  perf record: Split output into multiple files via '--switch-output'
  perf record: Force enable --timestamp-filename when --switch-output is
    provided
  perf record: Disable buildid cache options by default in switch output
    mode
  perf record: Re-synthesize tracking events after output switching
  perf record: Generate tracking events for process forked by perf
  perf record: Ensure return non-zero rc when mmap fail
  perf record: Prevent reading invalid data in record__mmap_read
  perf tools: Add evlist channel helpers
  perf tools: Automatically add new channel according to evlist
  perf tools: Operate multiple channels
  perf tools: Squash overwrite setting into channel
  perf record: Don't read from and poll overwrite channel
  perf tools: Enable overwrite settings
  perf tools: Consider TAILSIZE bit when caclulate is_pos
  perf tools: Set tailsize attribut bit for overwrite events
  perf record: Read from tailsize ring buffer
  perf record: Toggle tailsize ring buffer for reading
  perf record: Allow generate tracking events at the end of output

 include/linux/perf_event.h                   |  17 +-
 include/uapi/linux/perf_event.h              |   3 +-
 kernel/events/core.c                         |  82 +++-
 kernel/events/ring_buffer.c                  |   7 +-
 tools/build/feature/test-all.c               |   5 +
 tools/build/feature/test-bpf.c               |  20 +-
 tools/lib/bpf/Makefile                       |  16 +-
 tools/lib/bpf/bpf.c                          |   4 +-
 tools/perf/MANIFEST                          |   2 +
 tools/perf/builtin-buildid-cache.c           |  14 +-
 tools/perf/builtin-record.c                  | 549 +++++++++++++++++----
 tools/perf/config/Makefile                   |   4 +-
 tools/perf/perf.h                            |   1 +
 tools/perf/tests/bp_signal.c                 | 140 +++++-
 tools/perf/tests/bpf.c                       |  37 ++
 tools/perf/tests/hists_common.c              |   5 -
 tools/perf/tests/hists_cumulate.c            |   1 +
 tools/perf/tests/hists_filter.c              |   1 +
 tools/perf/tests/hists_link.c                |   1 +
 tools/perf/tests/hists_output.c              |   1 +
 tools/perf/tests/make                        |  72 ++-
 tools/perf/tests/vmlinux-kallsyms.c          |   4 +-
 tools/perf/util/bpf-loader.c                 | 699 +++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h                 |  59 +++
 tools/perf/util/build-id.c                   |  44 ++
 tools/perf/util/build-id.h                   |   1 +
 tools/perf/util/data-convert-bt.c            | 112 ++++-
 tools/perf/util/data.c                       |  36 ++
 tools/perf/util/data.h                       |  11 +-
 tools/perf/util/event.c                      |  28 +-
 tools/perf/util/evlist.c                     | 307 ++++++++++--
 tools/perf/util/evlist.h                     |  67 ++-
 tools/perf/util/evsel.c                      |  42 +-
 tools/perf/util/evsel.h                      |  13 +
 tools/perf/util/machine.c                    |  13 +-
 tools/perf/util/machine.h                    |   3 +-
 tools/perf/util/ordered-events.c             |   9 +
 tools/perf/util/ordered-events.h             |   1 +
 tools/perf/util/parse-events.c               | 139 +++++-
 tools/perf/util/parse-events.h               |  24 +-
 tools/perf/util/parse-events.l               |  18 +-
 tools/perf/util/parse-events.y               | 123 ++++-
 tools/perf/util/session.c                    |   4 +-
 tools/perf/util/symbol.c                     |   4 +
 tools/perf/util/util.c                       |  17 +
 tools/perf/util/util.h                       |   1 +
 tools/{perf/config => scripts}/Makefile.arch |   0
 47 files changed, 2482 insertions(+), 279 deletions(-)
 rename tools/{perf/config => scripts}/Makefile.arch (100%)

-- 
1.8.3.4


* [PATCH 01/53] perf tools: Add -lutil in python lib list for broken python-config
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
@ 2016-01-11 13:47 ` Wang Nan
  2016-01-12  9:43   ` Jiri Olsa
  2016-01-12 10:09   ` [tip:perf/urgent] " tip-bot for Wang Nan
  2016-01-11 13:47 ` [PATCH 02/53] perf tools: Fix phony build target for build-test Wang Nan
                   ` (51 subsequent siblings)
  52 siblings, 2 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:47 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Jiri Olsa, Namhyung Kim

On some systems python-config is broken, which causes link failures like this:

 /usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_forkpty':
 /opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3816: undefined reference to `forkpty'
 /usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_openpty':
 /opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3756: undefined reference to `openpty'
 collect2: error: ld returned 1 exit status
make[1]: *** [/home/wangnan/kernel-hydrogen/tools/perf/out/perf] Error 1
make: *** [all] Error 2

 $ python-config --libs
 -lpthread -ldl -lpthread -lutil -lm -lpython2.7

In this case '-lutil' should be appended after -lpython2.7.

(I know we have --start-group and --end-group. I can see them in the
collect2 command line via strace. However, they don't help. It seems
I have a broken environment?)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/config/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 254d06e..0793c76 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -493,7 +493,7 @@ else
 
       PYTHON_EMBED_LDOPTS := $(shell $(PYTHON_CONFIG_SQ) --ldflags 2>/dev/null)
       PYTHON_EMBED_LDFLAGS := $(call strip-libs,$(PYTHON_EMBED_LDOPTS))
-      PYTHON_EMBED_LIBADD := $(call grep-libs,$(PYTHON_EMBED_LDOPTS))
+      PYTHON_EMBED_LIBADD := $(call grep-libs,$(PYTHON_EMBED_LDOPTS)) -lutil
       PYTHON_EMBED_CCOPTS := $(shell $(PYTHON_CONFIG_SQ) --cflags 2>/dev/null)
       FLAGS_PYTHON_EMBED := $(PYTHON_EMBED_CCOPTS) $(PYTHON_EMBED_LDOPTS)
 
-- 
1.8.3.4


* [PATCH 02/53] perf tools: Fix phony build target for build-test
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
  2016-01-11 13:47 ` [PATCH 01/53] perf tools: Add -lutil in python lib list for broken python-config Wang Nan
@ 2016-01-11 13:47 ` Wang Nan
  2016-01-12 10:09   ` [tip:perf/urgent] " tip-bot for Wang Nan
  2016-01-11 13:47 ` [PATCH 03/53] perf tools: Set parallel making options build-test Wang Nan
                   ` (50 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:47 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, Namhyung Kim

make_kernelsrc and make_kernelsrc_tools are skipped if a previous
build-test has been done, because 'make build-test' creates two files
with the same names. To avoid this, they should be included in the
.PHONY list.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/make | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index c1fbb8e..130be7c 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -280,5 +280,5 @@ all: $(run) $(run_O) tarpkg make_kernelsrc make_kernelsrc_tools
 out: $(run_O)
 	@echo OK
 
-.PHONY: all $(run) $(run_O) tarpkg clean
+.PHONY: all $(run) $(run_O) tarpkg clean make_kernelsrc make_kernelsrc_tools
 endif # ifndef MK
-- 
1.8.3.4


* [PATCH 03/53] perf tools: Set parallel making options build-test
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
  2016-01-11 13:47 ` [PATCH 01/53] perf tools: Add -lutil in python lib list for broken python-config Wang Nan
  2016-01-11 13:47 ` [PATCH 02/53] perf tools: Fix phony build target for build-test Wang Nan
@ 2016-01-11 13:47 ` Wang Nan
  2016-01-11 13:47 ` [PATCH 04/53] perf tools: Pass O option to Makefile.perf in build-test Wang Nan
                   ` (49 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:47 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, Namhyung Kim

'make build-test' is painfully time-consuming. In a full test,
all test cases are built twice, with tools/perf/Makefile and with
tools/perf/Makefile.perf. 'Makefile' automatically computes parallel
options for make, but 'Makefile.perf' does not, so all of its test
cases are built with one job. It is very slow.

This patch adds the '-j' option to Makefile.perf testing. It computes
parallel build options the way tools/perf/Makefile does, and passes
the '-j' option to the Makefile.perf test.
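
For readers who want the same computation outside of make, a rough C equivalent might look like the sketch below. The helper names jobs_from_cpus(), parallel_opt() and online_cpus() are hypothetical. The sketch falls back to one job for any non-positive count (sysconf() reports errors as -1), which is slightly broader than the 'cores := 1' fallback for a count of 0 in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Clamp a reported CPU count to at least one job. */
static long jobs_from_cpus(long reported)
{
	return reported > 0 ? reported : 1;
}

/* Format "-j<N>" into 'out'; returns 0 on success, -1 if 'out' is too small. */
static int parallel_opt(char *out, size_t len, long reported)
{
	int n;

	assert(out != NULL);
	n = snprintf(out, len, "-j%ld", jobs_from_cpus(reported));
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Number of CPUs currently online, the value 'getconf _NPROCESSORS_ONLN' prints. */
static long online_cpus(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}
```

A caller would pass online_cpus() to parallel_opt() and hand the resulting string to the child make invocation.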

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/make | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index 130be7c..bd9c61a 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -3,7 +3,7 @@ ifeq ($(MAKECMDGOALS),)
 # no target specified, trigger the whole suite
 all:
 	@echo "Testing Makefile";      $(MAKE) -sf tests/make MK=Makefile
-	@echo "Testing Makefile.perf"; $(MAKE) -sf tests/make MK=Makefile.perf
+	@echo "Testing Makefile.perf"; $(MAKE) -sf tests/make MK=Makefile.perf SET_PARALLEL=1
 else
 # run only specific test over 'Makefile'
 %:
@@ -12,6 +12,15 @@ endif
 else
 PERF := .
 
+PARALLEL_OPT=
+ifeq ($(SET_PARALLEL),1)
+  cores := $(shell (getconf _NPROCESSORS_ONLN || egrep -c '^processor|^CPU[0-9]' /proc/cpuinfo) 2>/dev/null)
+  ifeq ($(cores),0)
+    cores := 1
+  endif
+  PARALLEL_OPT="-j$(cores)"
+endif
+
 include config/Makefile.arch
 
 # FIXME looks like x86 is the only arch running tests ;-)
@@ -238,7 +247,7 @@ clean := @(cd $(PERF); make -s -f $(MK) clean >/dev/null)
 $(run):
 	$(call clean)
 	@TMP_DEST=$$(mktemp -d); \
-	cmd="cd $(PERF) && make -f $(MK) DESTDIR=$$TMP_DEST $($@)"; \
+	cmd="cd $(PERF) && make -f $(MK) $(PARALLEL_OPT) DESTDIR=$$TMP_DEST $($@)"; \
 	echo "- $@: $$cmd" && echo $$cmd > $@ && \
 	( eval $$cmd ) >> $@ 2>&1; \
 	echo "  test: $(call test,$@)" >> $@ 2>&1; \
@@ -249,7 +258,7 @@ $(run_O):
 	$(call clean)
 	@TMP_O=$$(mktemp -d); \
 	TMP_DEST=$$(mktemp -d); \
-	cmd="cd $(PERF) && make -f $(MK) O=$$TMP_O DESTDIR=$$TMP_DEST $($(patsubst %_O,%,$@))"; \
+	cmd="cd $(PERF) && make -f $(MK) $(PARALLEL_OPT) O=$$TMP_O DESTDIR=$$TMP_DEST $($(patsubst %_O,%,$@))"; \
 	echo "- $@: $$cmd" && echo $$cmd > $@ && \
 	( eval $$cmd ) >> $@ 2>&1 && \
 	echo "  test: $(call test_O,$@)" >> $@ 2>&1; \
@@ -263,15 +272,15 @@ tarpkg:
 	rm -f $@
 
 make_kernelsrc:
-	@echo "- make -C <kernelsrc> tools/perf"
+	@echo "- make -C <kernelsrc> $(PARALLEL_OPT) tools/perf"
 	$(call clean); \
-	(make -C ../.. tools/perf) > $@ 2>&1 && \
+	(make -C ../.. $(PARALLEL_OPT) tools/perf) > $@ 2>&1 && \
 	test -x perf && rm -f $@ || (cat $@ ; false)
 
 make_kernelsrc_tools:
-	@echo "- make -C <kernelsrc>/tools perf"
+	@echo "- make -C <kernelsrc>/tools $(PARALLEL_OPT) perf"
 	$(call clean); \
-	(make -C ../../tools perf) > $@ 2>&1 && \
+	(make -C ../../tools $(PARALLEL_OPT) perf) > $@ 2>&1 && \
 	test -x perf && rm -f $@ || (cat $@ ; false)
 
 all: $(run) $(run_O) tarpkg make_kernelsrc make_kernelsrc_tools
-- 
1.8.3.4


* [PATCH 04/53] perf tools: Pass O option to Makefile.perf in build-test
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (2 preceding siblings ...)
  2016-01-11 13:47 ` [PATCH 03/53] perf tools: Set parallel making options build-test Wang Nan
@ 2016-01-11 13:47 ` Wang Nan
  2016-01-11 13:47 ` [PATCH 05/53] perf tools: Test correct path of perf " Wang Nan
                   ` (48 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:47 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim

Unlike tools/perf/Makefile, tools/perf/Makefile.perf obeys the 'O'
option only when it is passed through the cmdline, because of code in
tools/scripts/Makefile.include:

 ifneq ($(O),)
 ifeq ($(origin O), command line)
 	...
 	ABSOLUTE_O := $(shell cd $(O) ; pwd)
 	OUTPUT := $(ABSOLUTE_O)/$(if $(subdir),$(subdir)/)
 endif
 endif

This patch passes 'O' to Makefile.perf through the cmdline explicitly
to make it follow the O variable during build-test.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/make | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index bd9c61a..a32615a3 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -3,7 +3,7 @@ ifeq ($(MAKECMDGOALS),)
 # no target specified, trigger the whole suite
 all:
 	@echo "Testing Makefile";      $(MAKE) -sf tests/make MK=Makefile
-	@echo "Testing Makefile.perf"; $(MAKE) -sf tests/make MK=Makefile.perf SET_PARALLEL=1
+	@echo "Testing Makefile.perf"; $(MAKE) -sf tests/make MK=Makefile.perf SET_PARALLEL=1 SET_O=1
 else
 # run only specific test over 'Makefile'
 %:
@@ -11,6 +11,14 @@ else
 endif
 else
 PERF := .
+O_OPT :=
+
+ifneq ($(O),)
+  FULL_O := $(shell readlink -f $(O) || echo $(O))
+  ifeq ($(SET_O),1)
+    O_OPT := 'O=$(FULL_O)'
+  endif
+endif
 
 PARALLEL_OPT=
 ifeq ($(SET_PARALLEL),1)
@@ -247,7 +255,7 @@ clean := @(cd $(PERF); make -s -f $(MK) clean >/dev/null)
 $(run):
 	$(call clean)
 	@TMP_DEST=$$(mktemp -d); \
-	cmd="cd $(PERF) && make -f $(MK) $(PARALLEL_OPT) DESTDIR=$$TMP_DEST $($@)"; \
+	cmd="cd $(PERF) && make -f $(MK) $(PARALLEL_OPT) $(O_OPT) DESTDIR=$$TMP_DEST $($@)"; \
 	echo "- $@: $$cmd" && echo $$cmd > $@ && \
 	( eval $$cmd ) >> $@ 2>&1; \
 	echo "  test: $(call test,$@)" >> $@ 2>&1; \
-- 
1.8.3.4


* [PATCH 05/53] perf tools: Test correct path of perf in build-test
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (3 preceding siblings ...)
  2016-01-11 13:47 ` [PATCH 04/53] perf tools: Pass O option to Makefile.perf in build-test Wang Nan
@ 2016-01-11 13:47 ` Wang Nan
  2016-01-11 15:24   ` Arnaldo Carvalho de Melo
  2016-01-11 13:47 ` [PATCH 06/53] perf tools: Fix PowerPC native building Wang Nan
                   ` (47 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:47 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim

If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
checks will fail because perf resides in a different directory. Fix this
by computing PERF_OUT according to 'O' and testing the correct output
files. For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
instead because the path is different from the others ($(O)/perf vs
$(O)/tools/perf).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/make | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index a32615a3..0f5afcb 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -11,10 +11,12 @@ else
 endif
 else
 PERF := .
+PERF_OUT := $(PERF)
 O_OPT :=
 
 ifneq ($(O),)
   FULL_O := $(shell readlink -f $(O) || echo $(O))
+  PERF_OUT := $(FULL_O)
   ifeq ($(SET_O),1)
     O_OPT := 'O=$(FULL_O)'
   endif
@@ -159,11 +161,11 @@ test_make_doc    := $(test_ok)
 test_make_help_O := $(test_ok)
 test_make_doc_O  := $(test_ok)
 
-test_make_python_perf_so := test -f $(PERF)/python/perf.so
+test_make_python_perf_so := test -f $(PERF_OUT)/python/perf.so
 
-test_make_perf_o           := test -f $(PERF)/perf.o
-test_make_util_map_o       := test -f $(PERF)/util/map.o
-test_make_util_pmu_bison_o := test -f $(PERF)/util/pmu-bison.o
+test_make_perf_o           := test -f $(PERF_OUT)/perf.o
+test_make_util_map_o       := test -f $(PERF_OUT)/util/map.o
+test_make_util_pmu_bison_o := test -f $(PERF_OUT)/util/pmu-bison.o
 
 define test_dest_files
   for file in $(1); do				\
@@ -230,7 +232,7 @@ test_make_perf_o_O            := test -f $$TMP_O/perf.o
 test_make_util_map_o_O        := test -f $$TMP_O/util/map.o
 test_make_util_pmu_bison_o_O := test -f $$TMP_O/util/pmu-bison.o
 
-test_default = test -x $(PERF)/perf
+test_default = test -x $(PERF_OUT)/perf
 test = $(if $(test_$1),$(test_$1),$(test_default))
 
 test_default_O = test -x $$TMP_O/perf
@@ -250,7 +252,7 @@ endif
 
 MAKEFLAGS := --no-print-directory
 
-clean := @(cd $(PERF); make -s -f $(MK) clean >/dev/null)
+clean := @(cd $(PERF); make -s -f $(MK) O=$(PERF_OUT) clean >/dev/null; make -s -f $(MK) clean >/dev/null)
 
 $(run):
 	$(call clean)
@@ -279,17 +281,22 @@ tarpkg:
 	( eval $$cmd ) >> $@ 2>&1 && \
 	rm -f $@
 
+KBUILD_OUTPUT_DIR := ../..
+ifneq ($(O),)
+  KBUILD_OUTPUT_DIR := $(O)
+endif
+
 make_kernelsrc:
 	@echo "- make -C <kernelsrc> $(PARALLEL_OPT) tools/perf"
 	$(call clean); \
 	(make -C ../.. $(PARALLEL_OPT) tools/perf) > $@ 2>&1 && \
-	test -x perf && rm -f $@ || (cat $@ ; false)
+	test -x $(KBUILD_OUTPUT_DIR)/tools/perf && rm -f $@ || (cat $@ ; false)
 
 make_kernelsrc_tools:
 	@echo "- make -C <kernelsrc>/tools $(PARALLEL_OPT) perf"
 	$(call clean); \
 	(make -C ../../tools $(PARALLEL_OPT) perf) > $@ 2>&1 && \
-	test -x perf && rm -f $@ || (cat $@ ; false)
+	test -x $(KBUILD_OUTPUT_DIR)/tools/perf && rm -f $@ || (cat $@ ; false)
 
 all: $(run) $(run_O) tarpkg make_kernelsrc make_kernelsrc_tools
 	@echo OK
-- 
1.8.3.4


* [PATCH 06/53] perf tools: Fix PowerPC native building
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (4 preceding siblings ...)
  2016-01-11 13:47 ` [PATCH 05/53] perf tools: Test correct path of perf " Wang Nan
@ 2016-01-11 13:47 ` Wang Nan
  2016-01-12 10:10   ` [tip:perf/urgent] " tip-bot for Wang Nan
  2016-01-11 13:47 ` [PATCH 07/53] tools: Move Makefile.arch from perf/config to tools/scripts Wang Nan
                   ` (46 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:47 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Sukadev Bhattiprolu

Check the BPF syscall number, and turn off libbpf building on platforms
that don't correctly support sys_bpf instead of blocking compilation.
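
The "turn off instead of blocking" idea can be sketched in plain C as below. This is only an illustration under assumptions: HAVE_BPF_SYSCALL, bpf_syscall_nr() and bpf_feature_status() are hypothetical names, and the real series makes this decision in the build system through the feature test rather than in a header.

```c
#include <assert.h>
#include <sys/syscall.h>	/* provides the __NR_* syscall numbers on Linux */

#ifdef __NR_bpf
# define HAVE_BPF_SYSCALL 1
#else
# define HAVE_BPF_SYSCALL 0	/* degrade gracefully instead of using '#error' */
#endif

/* Returns the bpf syscall number, or -1 when the headers do not provide one. */
static long bpf_syscall_nr(void)
{
#if HAVE_BPF_SYSCALL
	assert(__NR_bpf > 0);	/* syscall numbers are positive */
	return __NR_bpf;
#else
	return -1;
#endif
}

/* Short status string, in the spirit of a feature-detection report. */
static const char *bpf_feature_status(void)
{
	return HAVE_BPF_SYSCALL ? "on" : "OFF";
}
```

Code that depends on BPF can then be compiled out when HAVE_BPF_SYSCALL is 0, so an unsupported architecture loses the feature but still builds.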

Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 tools/build/feature/test-bpf.c | 20 +++++++++++++++++++-
 tools/lib/bpf/bpf.c            |  4 ++--
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/tools/build/feature/test-bpf.c b/tools/build/feature/test-bpf.c
index 062bac8..b389026 100644
--- a/tools/build/feature/test-bpf.c
+++ b/tools/build/feature/test-bpf.c
@@ -1,9 +1,23 @@
+#include <asm/unistd.h>
 #include <linux/bpf.h>
+#include <unistd.h>
+
+#ifndef __NR_bpf
+# if defined(__i386__)
+#  define __NR_bpf 357
+# elif defined(__x86_64__)
+#  define __NR_bpf 321
+# elif defined(__aarch64__)
+#  define __NR_bpf 280
+#  error __NR_bpf not defined. libbpf does not support your arch.
+# endif
+#endif
 
 int main(void)
 {
 	union bpf_attr attr;
 
+	/* Check fields in attr */
 	attr.prog_type = BPF_PROG_TYPE_KPROBE;
 	attr.insn_cnt = 0;
 	attr.insns = 0;
@@ -14,5 +28,9 @@ int main(void)
 	attr.kern_version = 0;
 
 	attr = attr;
-	return 0;
+	/*
+	 * Test existence of __NR_bpf and BPF_PROG_LOAD.
+	 * This call should fail if we run the testcase.
+	 */
+	return syscall(__NR_bpf, BPF_PROG_LOAD, attr, sizeof(attr));
 }
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 5bdc6ea..1f91cc9 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -14,8 +14,8 @@
 #include "bpf.h"
 
 /*
- * When building perf, unistd.h is override. Define __NR_bpf is
- * required to be defined.
+ * When building perf, unistd.h is overrided. __NR_bpf is
+ * required to be defined explicitly.
  */
 #ifndef __NR_bpf
 # if defined(__i386__)
-- 
1.8.3.4


* [PATCH 07/53] tools: Move Makefile.arch from perf/config to tools/scripts
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (5 preceding siblings ...)
  2016-01-11 13:47 ` [PATCH 06/53] perf tools: Fix PowerPC native building Wang Nan
@ 2016-01-11 13:47 ` Wang Nan
  2016-01-11 13:52   ` Wangnan (F)
  2016-01-12 10:10   ` [tip:perf/urgent] tools: Move Makefile.arch from perf/ config " tip-bot for Wang Nan
  2016-01-11 13:47 ` [PATCH 08/53] perf tools: Add missing sources in perf's MANIFEST Wang Nan
                   ` (45 subsequent siblings)
  52 siblings, 2 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:47 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Naveen N. Rao, Sukadev Bhattiprolu

After this patch, other directories can use this architecture detector
without directly including it from perf's directory. Libbpf will
use it to get the proper $(ARCH) so it can find the correct uapi include
directory.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@kernel.org>
[Add missing srctree definition in tests/make]
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 tools/perf/config/Makefile                   |  2 +-
 tools/perf/tests/make                        | 16 +++++++++++++++-
 tools/{perf/config => scripts}/Makefile.arch |  0
 3 files changed, 16 insertions(+), 2 deletions(-)
 rename tools/{perf/config => scripts}/Makefile.arch (100%)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 0793c76..7545ba60 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -17,7 +17,7 @@ detected_var = $(shell echo "$(1)=$($(1))" >> $(OUTPUT).config-detected)
 
 CFLAGS := $(EXTRA_CFLAGS) $(EXTRA_WARNINGS)
 
-include $(src-perf)/config/Makefile.arch
+include $(srctree)/tools/scripts/Makefile.arch
 
 $(call detected_var,ARCH)
 
diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index 0f5afcb..1e59ce8 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -1,3 +1,5 @@
+include ../scripts/Makefile.include
+
 ifndef MK
 ifeq ($(MAKECMDGOALS),)
 # no target specified, trigger the whole suite
@@ -31,7 +33,19 @@ ifeq ($(SET_PARALLEL),1)
   PARALLEL_OPT="-j$(cores)"
 endif
 
-include config/Makefile.arch
+# As per kernel Makefile, avoid funny character set dependencies
+unexport LC_ALL
+LC_COLLATE=C
+LC_NUMERIC=C
+export LC_COLLATE LC_NUMERIC
+
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(shell pwd)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+#$(info Determined 'srctree' to be $(srctree))
+endif
+
+include $(srctree)/tools/scripts/Makefile.arch
 
 # FIXME looks like x86 is the only arch running tests ;-)
 # we need some IS_(32/64) flag to make this generic
diff --git a/tools/perf/config/Makefile.arch b/tools/scripts/Makefile.arch
similarity index 100%
rename from tools/perf/config/Makefile.arch
rename to tools/scripts/Makefile.arch
-- 
1.8.3.4


* [PATCH 08/53] perf tools: Add missing sources in perf's MANIFEST
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (6 preceding siblings ...)
  2016-01-11 13:47 ` [PATCH 07/53] tools: Move Makefile.arch from perf/config to tools/scripts Wang Nan
@ 2016-01-11 13:47 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 09/53] perf: bpf: Fix build breakage due to libbpf Wang Nan
                   ` (44 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:47 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa,
	Jiri Olsa, Wang Nan

From: Jiri Olsa <jolsa@redhat.com>

Add the missing bitmap.[ch] sources to the MANIFEST file.

Link: http://lkml.kernel.org/n/tip-bkwplvnpk6s6a8zi1923dzuj@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 tools/perf/MANIFEST | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index ddf922f..2e1fa23 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -28,6 +28,7 @@ tools/lib/string.c
 tools/lib/symbol/kallsyms.c
 tools/lib/symbol/kallsyms.h
 tools/lib/find_bit.c
+tools/lib/bitmap.c
 tools/include/asm/atomic.h
 tools/include/asm/barrier.h
 tools/include/asm/bug.h
@@ -57,6 +58,7 @@ tools/include/linux/rbtree_augmented.h
 tools/include/linux/string.h
 tools/include/linux/types.h
 tools/include/linux/err.h
+tools/include/linux/bitmap.h
 include/asm-generic/bitops/arch_hweight.h
 include/asm-generic/bitops/const_hweight.h
 include/asm-generic/bitops/fls64.h
-- 
1.8.3.4


* [PATCH 09/53] perf: bpf: Fix build breakage due to libbpf
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (7 preceding siblings ...)
  2016-01-11 13:47 ` [PATCH 08/53] perf tools: Add missing sources in perf's MANIFEST Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-12 10:10   ` [tip:perf/urgent] perf " tip-bot for Naveen N. Rao
  2016-01-11 13:48 ` [PATCH 10/53] tools build: Add BPF feature check to test-all Wang Nan
                   ` (43 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Naveen N. Rao,
	Wang Nan, Sukadev Bhattiprolu

From: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>

perf build is currently (v4.4-rc5) broken on powerpc:

bpf.c:28:4: error: #error __NR_bpf not defined. libbpf does not support
your arch.
 #  error __NR_bpf not defined. libbpf does not support your arch.
    ^

Fix this by including tools/scripts/Makefile.arch for the proper
$ARCH macro. While at it, remove redundant LP64 macro definition.

Also, since libbpf now requires $(srctree), detect the path of
srctree the same way perf does.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
[Use tools/scripts/Makefile.arch]
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 tools/lib/bpf/Makefile | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
index 919b717..1d446a6 100644
--- a/tools/lib/bpf/Makefile
+++ b/tools/lib/bpf/Makefile
@@ -6,6 +6,12 @@ BPF_EXTRAVERSION = 1
 
 MAKEFLAGS += --no-print-directory
 
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(shell pwd)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+#$(info Determined 'srctree' to be $(srctree))
+endif
 
 # Makefiles suck: This macro sets a default value of $(2) for the
 # variable named by $(1), unless the variable has been set by
@@ -31,7 +37,8 @@ INSTALL = install
 DESTDIR ?=
 DESTDIR_SQ = '$(subst ','\'',$(DESTDIR))'
 
-LP64 := $(shell echo __LP64__ | ${CC} ${CFLAGS} -E -x c - | tail -n 1)
+include $(srctree)/tools/scripts/Makefile.arch
+
 ifeq ($(LP64), 1)
   libdir_relative = lib64
 else
@@ -57,13 +64,6 @@ ifndef VERBOSE
   VERBOSE = 0
 endif
 
-ifeq ($(srctree),)
-srctree := $(patsubst %/,%,$(dir $(shell pwd)))
-srctree := $(patsubst %/,%,$(dir $(srctree)))
-srctree := $(patsubst %/,%,$(dir $(srctree)))
-#$(info Determined 'srctree' to be $(srctree))
-endif
-
 FEATURE_USER = .libbpf
 FEATURE_TESTS = libelf libelf-getphdrnum libelf-mmap bpf
 FEATURE_DISPLAY = libelf bpf
-- 
1.8.3.4


* [PATCH 10/53] tools build: Add BPF feature check to test-all
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (8 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 09/53] perf: bpf: Fix build breakage due to libbpf Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-12 10:11   ` [tip:perf/urgent] " tip-bot for Wang Nan
  2016-01-11 13:48 ` [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist' Wang Nan
                   ` (42 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, Namhyung Kim

The existing test-all.c doesn't check BPF-related features. In an
environment with all other features enabled, BPF would be considered
enabled without a real feature check being done.

This patch adds test-bpf.c into test-all.c.
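
The renaming trick that test-all.c relies on can be sketched in a single
file. This is only a minimal illustration: the two inline definitions
stand in for '# include "test-foo.c"' style lines, and the names
main_test_foo, main_test_bar and run_all_feature_checks are hypothetical
and do not exist in tools/build.

```c
/*
 * Each feature test file defines its own main(). Renaming main with a
 * macro around the inclusion lets several of them share one
 * translation unit. The inline bodies below stand in for the included
 * files (hypothetical names, not real feature tests).
 */
#define main main_test_foo
int main(void)
{
	return 0;
}
#undef main

#define main main_test_bar
int main(void)
{
	return 0;
}
#undef main

/* Hypothetical aggregate entry point; test-all.c uses main() for this. */
int run_all_feature_checks(void)
{
	main_test_foo();
	main_test_bar();

	return 0;
}
```

If any renamed main() fails to compile or link, the whole aggregate
fails, which is why a feature left out of test-all.c is never really
checked on the fast path.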

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/build/feature/test-all.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index 33cf6f2..81025ca 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -125,6 +125,10 @@
 # include "test-get_cpuid.c"
 #undef main
 
+#define main main_test_bpf
+# include "test-bpf.c"
+#undef main
+
 int main(int argc, char *argv[])
 {
 	main_test_libpython();
@@ -153,6 +157,7 @@ int main(int argc, char *argv[])
 	main_test_pthread_attr_setaffinity_np();
 	main_test_lzma();
 	main_test_get_cpuid();
+	main_test_bpf();
 
 	return 0;
 }
-- 
1.8.3.4


* [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist'
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (9 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 10/53] tools build: Add BPF feature check to test-all Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 14:25   ` Sergei Shtylyov
  2016-01-12 10:11   ` [tip:perf/urgent] perf test: Fix false TEST_OK result for ' perf " tip-bot for Wang Nan
  2016-01-11 13:48 ` [PATCH 12/53] perf test: Reset err after using it hold errcode in hist testcases Wang Nan
                   ` (41 subsequent siblings)
  52 siblings, 2 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu

Commit 71d6de64feddd4b455555326fba2111b3006d9e0 ('perf test: Fix hist
testcases when kptr_restrict is on') solves a double free problem when
'perf test hist' calls setup_fake_machine(). However, the result is
still incorrect. For example:

 $ ./perf test -v 'filtering hist entries'
 25: Test filtering hist entries                              :
 --- start ---
 test child forked, pid 4186
 Cannot create kernel maps
 test child finished with 0
 ---- end ----
 Test filtering hist entries: Ok

In this case the body of the test is not executed at all, but the
result is 'Ok'.

Actually, in setup_fake_machine() there's no need to create real kernel
maps. What we want is the fake maps. This patch removes the
machine__create_kernel_maps() call in setup_fake_machine(), so it is no
longer affected by the kptr_restrict setting.

Test result:

 $ cat /proc/sys/kernel/kptr_restrict
 1
 $ ~/perf test -v hist
 15: Test matching and linking multiple hists                 :
 --- start ---
 test child forked, pid 24031
 test child finished with 0
 ---- end ----
 Test matching and linking multiple hists: Ok
 [SNIP]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
---
 tools/perf/tests/hists_common.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index bcfd081..071a8b5 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -87,11 +87,6 @@ struct machine *setup_fake_machine(struct machines *machines)
 		return NULL;
 	}
 
-	if (machine__create_kernel_maps(machine)) {
-		pr_debug("Cannot create kernel maps\n");
-		return NULL;
-	}
-
 	for (i = 0; i < ARRAY_SIZE(fake_threads); i++) {
 		struct thread *thread;
 
-- 
1.8.3.4


* [PATCH 12/53] perf test: Reset err after using it hold errcode in hist testcases
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (10 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist' Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-12 10:11   ` [tip:perf/urgent] " tip-bot for Wang Nan
  2016-01-11 13:48 ` [PATCH 13/53] perf tools: Prevent calling machine__delete() on non-allocated machine Wang Nan
                   ` (40 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu

All hists test cases forget to reset err after using it to hold an
error code. If an error occurs in setup_fake_machine(), they
incorrectly return TEST_OK.

This patch fixes it.
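
The pattern being fixed can be sketched as follows. This is a reduced
illustration, not the test code itself: fake_parse_events(),
fake_setup_machine() and run_hist_test() are hypothetical stand-ins,
and the enum values are assumed for the example.

```c
#include <stddef.h>

/* Assumed values, for illustration only. */
enum { TEST_OK = 0, TEST_FAIL = -1 };

/* Hypothetical stand-in for parse_events(): returns 0 on success. */
static int fake_parse_events(void)
{
	return 0;
}

/* Hypothetical stand-in for setup_fake_machine(): NULL on failure. */
static void *fake_setup_machine(int succeed)
{
	static int dummy;

	return succeed ? &dummy : NULL;
}

int run_hist_test(int setup_succeeds)
{
	int err;

	err = fake_parse_events();
	if (err)
		goto out;
	/* Without this reset, a later 'goto out' would return 0 (TEST_OK). */
	err = TEST_FAIL;

	if (!fake_setup_machine(setup_succeeds))
		goto out;

	err = TEST_OK;
out:
	return err;
}
```

Here run_hist_test(0) reports TEST_FAIL only because err is reset
after the successful parse step.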

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
---
 tools/perf/tests/hists_cumulate.c | 1 +
 tools/perf/tests/hists_filter.c   | 1 +
 tools/perf/tests/hists_link.c     | 1 +
 tools/perf/tests/hists_output.c   | 1 +
 4 files changed, 4 insertions(+)

diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index e360892..5e6a86e 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -706,6 +706,7 @@ int test__hists_cumulate(int subtest __maybe_unused)
 	err = parse_events(evlist, "cpu-clock", NULL);
 	if (err)
 		goto out;
+	err = TEST_FAIL;
 
 	machines__init(&machines);
 
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 2a784be..351a424 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -120,6 +120,7 @@ int test__hists_filter(int subtest __maybe_unused)
 	err = parse_events(evlist, "task-clock", NULL);
 	if (err)
 		goto out;
+	err = TEST_FAIL;
 
 	/* default sort order (comm,dso,sym) will be used */
 	if (setup_sorting(NULL) < 0)
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index c764d69..64b257d 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -293,6 +293,7 @@ int test__hists_link(int subtest __maybe_unused)
 	if (err)
 		goto out;
 
+	err = TEST_FAIL;
 	/* default sort order (comm,dso,sym) will be used */
 	if (setup_sorting(NULL) < 0)
 		goto out;
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index ebe6cd4..b231265 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -597,6 +597,7 @@ int test__hists_output(int subtest __maybe_unused)
 	err = parse_events(evlist, "cpu-clock", NULL);
 	if (err)
 		goto out;
+	err = TEST_FAIL;
 
 	machines__init(&machines);
 
-- 
1.8.3.4


* [PATCH 13/53] perf tools: Prevent calling machine__delete() on non-allocated machine
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (11 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 12/53] perf test: Reset err after using it hold errcode in hist testcases Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 15:42   ` Arnaldo Carvalho de Melo
  2016-01-11 13:48 ` [PATCH 14/53] perf test: Check environment before start real BPF test Wang Nan
                   ` (39 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

To prevent further commits from calling machine__delete() on a
non-allocated 'struct machine' (which would cause memory corruption),
this patch makes machine__init() record whether a machine structure is
dynamically allocated, and warns if machine__delete() is called on an
incorrect object.
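
The idea can be sketched with a reduced object. Everything below is
hypothetical ('struct obj', obj_init(), obj_new(), obj_delete()); the
real 'struct machine' has many more fields, and machine__delete()
returns void and warns through WARN_ONCE() rather than returning a
status.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical reduced object carrying only the ownership flag. */
struct obj {
	bool allocated;
};

/* The caller states whether the storage came from malloc(). */
void obj_init(struct obj *o, bool allocated)
{
	o->allocated = allocated;
}

struct obj *obj_new(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (o)
		obj_init(o, true);
	return o;
}

/* Returns 0 if the object was freed, -1 if it was not heap-allocated. */
int obj_delete(struct obj *o)
{
	if (!o->allocated) {
		fprintf(stderr,
			"WARNING: deleting a non-allocated object. Skip.\n");
		return -1;
	}
	free(o);
	return 0;
}
```

An object embedded in another structure or living on the stack is
initialized with 'false', so a mistaken delete is reported instead of
being passed to free().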

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/vmlinux-kallsyms.c |  4 ++--
 tools/perf/util/machine.c           | 13 ++++++++-----
 tools/perf/util/machine.h           |  3 ++-
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/tools/perf/tests/vmlinux-kallsyms.c b/tools/perf/tests/vmlinux-kallsyms.c
index f0bfc9e..441e93d 100644
--- a/tools/perf/tests/vmlinux-kallsyms.c
+++ b/tools/perf/tests/vmlinux-kallsyms.c
@@ -35,8 +35,8 @@ int test__vmlinux_matches_kallsyms(int subtest __maybe_unused)
 	 * Init the machines that will hold kernel, modules obtained from
 	 * both vmlinux + .ko files and from /proc/kallsyms split by modules.
 	 */
-	machine__init(&kallsyms, "", HOST_KERNEL_ID);
-	machine__init(&vmlinux, "", HOST_KERNEL_ID);
+	machine__init(&kallsyms, "", HOST_KERNEL_ID, false);
+	machine__init(&vmlinux, "", HOST_KERNEL_ID, false);
 
 	/*
 	 * Step 2:
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index ad79297..59a3c01 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1,3 +1,4 @@
+#include <asm/bug.h>
 #include "callchain.h"
 #include "debug.h"
 #include "event.h"
@@ -23,7 +24,7 @@ static void dsos__init(struct dsos *dsos)
 	pthread_rwlock_init(&dsos->lock, NULL);
 }
 
-int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
+int machine__init(struct machine *machine, const char *root_dir, pid_t pid, bool allocated)
 {
 	memset(machine, 0, sizeof(*machine));
 	map_groups__init(&machine->kmaps, machine);
@@ -65,6 +66,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 	}
 
 	machine->current_tid = NULL;
+	machine->allocated = allocated;
 
 	return 0;
 }
@@ -74,7 +76,7 @@ struct machine *machine__new_host(void)
 	struct machine *machine = malloc(sizeof(*machine));
 
 	if (machine != NULL) {
-		machine__init(machine, "", HOST_KERNEL_ID);
+		machine__init(machine, "", HOST_KERNEL_ID, true);
 
 		if (machine__create_kernel_maps(machine) < 0)
 			goto out_delete;
@@ -137,12 +139,13 @@ void machine__exit(struct machine *machine)
 void machine__delete(struct machine *machine)
 {
 	machine__exit(machine);
-	free(machine);
+	WARN_ONCE((machine->allocated ? free(machine), 0 : -1),
+		  "WARNING: deleting a non-allocated machine. Skip.\n");
 }
 
 void machines__init(struct machines *machines)
 {
-	machine__init(&machines->host, "", HOST_KERNEL_ID);
+	machine__init(&machines->host, "", HOST_KERNEL_ID, false);
 	machines->guests = RB_ROOT;
 	machines->symbol_filter = NULL;
 }
@@ -163,7 +166,7 @@ struct machine *machines__add(struct machines *machines, pid_t pid,
 	if (machine == NULL)
 		return NULL;
 
-	if (machine__init(machine, root_dir, pid) != 0) {
+	if (machine__init(machine, root_dir, pid, true) != 0) {
 		free(machine);
 		return NULL;
 	}
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 2c2b443..24dfd46 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -28,6 +28,7 @@ struct machine {
 	pid_t		  pid;
 	u16		  id_hdr_size;
 	bool		  comm_exec;
+	bool		  allocated;
 	char		  *root_dir;
 	struct rb_root	  threads;
 	pthread_rwlock_t  threads_lock;
@@ -131,7 +132,7 @@ void machines__set_symbol_filter(struct machines *machines,
 void machines__set_comm_exec(struct machines *machines, bool comm_exec);
 
 struct machine *machine__new_host(void);
-int machine__init(struct machine *machine, const char *root_dir, pid_t pid);
+int machine__init(struct machine *machine, const char *root_dir, pid_t pid, bool allocated);
 void machine__exit(struct machine *machine);
 void machine__delete_threads(struct machine *machine);
 void machine__delete(struct machine *machine);
-- 
1.8.3.4


* [PATCH 14/53] perf test: Check environment before start real BPF test
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (12 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 13/53] perf tools: Prevent calling machine__delete() on non-allocated machine Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 21:55   ` Arnaldo Carvalho de Melo
  2016-01-11 13:48 ` [PATCH 15/53] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
                   ` (38 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo

Copying perf to a system with an old kernel results in:

 # perf test bpf
 37: Test BPF filter                                          :
 37.1: Test basic BPF filtering                               : FAILED!
 37.2: Test BPF prologue generation                           : Skip

However, when the kernel doesn't support a test case, the test should
return 'Skip'. 'FAILED!' should be reserved for cases where the kernel
supports a feature that then fails to work as advertised.

This patch checks the environment before running the real test case.
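
The intended control flow can be sketched independently of BPF. The
names run_with_env_check(), probe_supported(), probe_missing(),
body_ok() and body_broken() are hypothetical, and the enum values are
assumed for the example. In the patch itself the probe is check_env(),
which tries to load a trivial two-instruction program.

```c
/* Assumed values, for illustration only. */
enum { TEST_OK = 0, TEST_FAIL = -1, TEST_SKIP = -2 };

typedef int (*probe_fn)(void);	/* 0 when the environment is usable */
typedef int (*body_fn)(void);	/* the real test, returns TEST_* */

/* Skip when the environment lacks the feature, otherwise run the test. */
int run_with_env_check(probe_fn probe, body_fn body)
{
	if (probe())
		return TEST_SKIP;

	return body();
}

/* Hypothetical probes and test bodies used to exercise the helper. */
int probe_supported(void) { return 0; }
int probe_missing(void)   { return -1; }
int body_ok(void)         { return TEST_OK; }
int body_broken(void)     { return TEST_FAIL; }
```

With this split, a missing feature yields 'Skip', while a failure is
only reported when the feature is present and the test body fails.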

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/bpf.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 33689a0..826b4b3 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -1,7 +1,11 @@
 #include <stdio.h>
 #include <sys/epoll.h>
+#include <util/util.h>
 #include <util/bpf-loader.h>
 #include <util/evlist.h>
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <bpf/bpf.h>
 #include "tests.h"
 #include "llvm.h"
 #include "debug.h"
@@ -227,6 +231,36 @@ const char *test__bpf_subtest_get_desc(int i)
 	return bpf_testcase_table[i].desc;
 }
 
+static int check_env(void)
+{
+	int err;
+	unsigned int kver_int;
+	char license[] = "GPL";
+
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+	};
+
+	err = fetch_kernel_version(&kver_int, NULL, 0);
+	if (err) {
+		pr_debug("Unable to get kernel version\n");
+		return err;
+	}
+
+	err = bpf_load_program(BPF_PROG_TYPE_KPROBE, insns,
+			       sizeof(insns) / sizeof(insns[0]),
+			       license, kver_int, NULL, 0);
+	if (err < 0) {
+		pr_err("Missing basic BPF support, skip this test: %s\n",
+		       strerror(errno));
+		return err;
+	}
+	close(err);
+
+	return 0;
+}
+
 int test__bpf(int i)
 {
 	int err;
@@ -239,6 +273,9 @@ int test__bpf(int i)
 		return TEST_SKIP;
 	}
 
+	if (check_env())
+		return TEST_SKIP;
+
 	err = __test__bpf(i);
 	return err;
 }
-- 
1.8.3.4


* [PATCH 15/53] perf tools: Fix symbols searching for offline module in buildid-cache
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (13 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 14/53] perf test: Check environment before start real BPF test Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 16/53] perf tools: Fix mmap2 event allocation in synthesize code Wang Nan
                   ` (37 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, Masami Hiramatsu, Namhyung Kim

Before this patch, if a sample is triggered inside an offline module
(a module not in /lib/modules/`uname -r`/), 'perf report' is unable to
get the correct symbol even if the module is in the buildid-cache.
For example:

 # rm -rf ~/.debug/
 # perf buildid-cache -a ./mymodule.ko
 # perf probe -m ./mymodule.ko -a get_mymodule_val
 Added new event:
   probe:get_mymodule_val (on get_mymodule_val in mymodule)

 You can now use it in all perf tools, such as:

 	perf record -e probe:get_mymodule_val -aR sleep 1

 # perf record -e probe:get_mymodule_val cat /proc/mymodule
 mymodule:3
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]

 # perf report --stdio
 [SNIP]
 #
 # Overhead  Command  Shared Object     Symbol
 # ........  .......  ................  ......................
 #
    100.00%  cat      [mymodule]        [k] 0x0000000000000001

 # perf report -vvvv --stdio
 dso__load_sym: adjusting symbol: st_value: 0 sh_addr: 0 sh_offset: 0x70
 symbol__new: get_mymodule_val 0x70-0x8a
 [SNIP]

This is caused by dso__load() -> dso__load_sym(). In dso__load(), kmod
is true only when the dso is a regular kernel module. All files loaded
from the buildid-cache are treated as user programs, so the following
dso__load_sym() sets map->pgoff incorrectly.

This patch gives kernel modules in the buildid-cache a chance to adjust
the value of kmod. After dso__load() gets the type of symbols, if it is
buildid, it checks the last 3 chars of the original filename against
'.ko' and adjusts the value of kmod if the file is a kernel module.
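
The suffix check on the link target can be sketched in isolation. The
helper name link_target_is_kmod() is hypothetical. It assumes the
target has the form '<original path>/<build id>', as in the
buildid-cache link layout quoted in the patch comment, and leaves out
the readlink() step.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the path component before the last '/' ends in ".ko".
 * 'target' is expected to look like "<original path>/<build id>".
 */
bool link_target_is_kmod(const char *target)
{
	const char *ch = strrchr(target, '/');

	if (!ch)
		return false;
	/* Need at least 3 characters before the final '/'. */
	if (ch - target < 3)
		return false;

	return strncmp(".ko", ch - 3, 3) == 0;
}
```

Only the three characters immediately before the final '/' are
compared, so a directory named '.ko' elsewhere in the path does not
match.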

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
---
 tools/perf/util/build-id.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/build-id.h |  1 +
 tools/perf/util/symbol.c   |  4 ++++
 3 files changed, 49 insertions(+)

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 6a7e273..6b18082 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -166,6 +166,50 @@ char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size)
 	return build_id__filename(build_id_hex, bf, size);
 }
 
+bool dso__build_id_is_kmod(const struct dso *dso, char *bf, size_t size)
+{
+	char *id_name, *ch;
+	struct stat sb;
+
+	id_name = dso__build_id_filename(dso, bf, size);
+	if (!id_name)
+		goto err;
+	if (access(id_name, F_OK))
+		goto err;
+	if (lstat(id_name, &sb) == -1)
+		goto err;
+	if ((size_t)sb.st_size > size - 1)
+		goto err;
+	if (readlink(id_name, bf, size - 1) < 0)
+		goto err;
+
+	bf[sb.st_size] = '\0';
+
+	/*
+	 * link should be:
+	 * ../../lib/modules/4.4.0-rc4/kernel/net/ipv4/netfilter/nf_nat_ipv4.ko/a09fe3eb3147dafa4e3b31dbd6257e4d696bdc92
+	 */
+	ch = strrchr(bf, '/');
+	if (!ch)
+		goto err;
+	if (ch - 3 < bf)
+		goto err;
+
+	return strncmp(".ko", ch - 3, 3) == 0;
+err:
+	/*
+	 * If dso__build_id_filename work, get id_name again,
+	 * because id_name points to bf and is broken.
+	 */
+	if (id_name)
+		id_name = dso__build_id_filename(dso, bf, size);
+	pr_err("Invalid build id: %s\n", id_name ? :
+					 dso->long_name ? :
+					 dso->short_name ? :
+					 "[unknown]");
+	return false;
+}
+
 #define dsos__for_each_with_build_id(pos, head)	\
 	list_for_each_entry(pos, head, node)	\
 		if (!pos->has_build_id)		\
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 27a14a8..64af3e2 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -16,6 +16,7 @@ int sysfs__sprintf_build_id(const char *root_dir, char *sbuild_id);
 int filename__sprintf_build_id(const char *pathname, char *sbuild_id);
 
 char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size);
+bool dso__build_id_is_kmod(const struct dso *dso, char *bf, size_t size);
 
 int build_id__mark_dso_hit(struct perf_tool *tool, union perf_event *event,
 			   struct perf_sample *sample, struct perf_evsel *evsel,
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 3b2de6e..d78d105 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1525,6 +1525,10 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 	if (!runtime_ss && syms_ss)
 		runtime_ss = syms_ss;
 
+	if (syms_ss && syms_ss->type == DSO_BINARY_TYPE__BUILD_ID_CACHE)
+		if (dso__build_id_is_kmod(dso, name, PATH_MAX))
+			kmod = true;
+
 	if (syms_ss)
 		ret = dso__load_sym(dso, map, syms_ss, runtime_ss, filter, kmod);
 	else
-- 
1.8.3.4


* [PATCH 16/53] perf tools: Fix mmap2 event allocation in synthesize code
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (14 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 15/53] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 21:03   ` Arnaldo Carvalho de Melo
  2016-01-11 13:48 ` [PATCH 17/53] perf test: Improve bp_signal Wang Nan
                   ` (36 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Masami Hiramatsu,
	Namhyung Kim

perf_event__synthesize_mmap_events() issues mmap2 events, but the
memory of that event is allocated using:

 mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);

If the path of the mmap source file is long (near PATH_MAX), random
crashes can happen. sizeof(mmap_event->mmap2) should be used instead.

Fix the two memory allocations and rename all mmap_event to mmap2_event
to make this clear.
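
Why the size matters can be sketched with a reduced union. The layouts
below are hypothetical and simplified (demo_mmap, demo_mmap2,
demo_event, DEMO_PATH_MAX); they are not the real perf event
structures. They only show that the member with extra fields before
the filename needs a larger allocation.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_PATH_MAX 4096	/* assumed value for the sketch */

/* Hypothetical, simplified layouts. */
struct demo_mmap {
	unsigned long long start, len, pgoff;
	char filename[DEMO_PATH_MAX];
};

struct demo_mmap2 {
	unsigned long long start, len, pgoff;
	unsigned int maj, min;
	unsigned long long ino, ino_generation;
	unsigned int prot, flags;
	char filename[DEMO_PATH_MAX];
};

union demo_event {
	struct demo_mmap mmap;
	struct demo_mmap2 mmap2;
};

/* Bytes that would be missing if an mmap2 event were sized as mmap. */
size_t bytes_missing_if_sized_as_mmap(void)
{
	return sizeof(struct demo_mmap2) - sizeof(struct demo_mmap);
}

/* Allocate with the size of the member that will actually be filled. */
union demo_event *alloc_mmap2_event(size_t id_hdr_size)
{
	union demo_event *ev;

	ev = malloc(sizeof(ev->mmap2) + id_hdr_size);
	return ev;
}
```

With the smaller size, a filename close to the maximum length would be
written past the end of the buffer, which matches the random crashes
described above.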

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/event.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index cd61bb1..cde8228 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -413,7 +413,7 @@ int perf_event__synthesize_modules(struct perf_tool *tool,
 }
 
 static int __event__synthesize_thread(union perf_event *comm_event,
-				      union perf_event *mmap_event,
+				      union perf_event *mmap2_event,
 				      union perf_event *fork_event,
 				      pid_t pid, int full,
 					  perf_event__handler_t process,
@@ -436,7 +436,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 		if (tgid == -1)
 			return -1;
 
-		return perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
+		return perf_event__synthesize_mmap_events(tool, mmap2_event, pid, tgid,
 							  process, machine, mmap_data,
 							  proc_map_timeout);
 	}
@@ -478,7 +478,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
 		rc = 0;
 		if (_pid == pid) {
 			/* process the parent's maps too */
-			rc = perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
+			rc = perf_event__synthesize_mmap_events(tool, mmap2_event, pid, tgid,
 						process, machine, mmap_data, proc_map_timeout);
 			if (rc)
 				break;
@@ -496,15 +496,15 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 				      bool mmap_data,
 				      unsigned int proc_map_timeout)
 {
-	union perf_event *comm_event, *mmap_event, *fork_event;
+	union perf_event *comm_event, *mmap2_event, *fork_event;
 	int err = -1, thread, j;
 
 	comm_event = malloc(sizeof(comm_event->comm) + machine->id_hdr_size);
 	if (comm_event == NULL)
 		goto out;
 
-	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
-	if (mmap_event == NULL)
+	mmap2_event = malloc(sizeof(mmap2_event->mmap2) + machine->id_hdr_size);
+	if (mmap2_event == NULL)
 		goto out_free_comm;
 
 	fork_event = malloc(sizeof(fork_event->fork) + machine->id_hdr_size);
@@ -513,7 +513,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 
 	err = 0;
 	for (thread = 0; thread < threads->nr; ++thread) {
-		if (__event__synthesize_thread(comm_event, mmap_event,
+		if (__event__synthesize_thread(comm_event, mmap2_event,
 					       fork_event,
 					       thread_map__pid(threads, thread), 0,
 					       process, tool, machine,
@@ -539,7 +539,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 
 			/* if not, generate events for it */
 			if (need_leader &&
-			    __event__synthesize_thread(comm_event, mmap_event,
+			    __event__synthesize_thread(comm_event, mmap2_event,
 						       fork_event,
 						       comm_event->comm.pid, 0,
 						       process, tool, machine,
@@ -551,7 +551,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 	}
 	free(fork_event);
 out_free_mmap:
-	free(mmap_event);
+	free(mmap2_event);
 out_free_comm:
 	free(comm_event);
 out:
@@ -567,7 +567,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
 	DIR *proc;
 	char proc_path[PATH_MAX];
 	struct dirent dirent, *next;
-	union perf_event *comm_event, *mmap_event, *fork_event;
+	union perf_event *comm_event, *mmap2_event, *fork_event;
 	int err = -1;
 
 	if (machine__is_default_guest(machine))
@@ -577,8 +577,8 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
 	if (comm_event == NULL)
 		goto out;
 
-	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
-	if (mmap_event == NULL)
+	mmap2_event = malloc(sizeof(mmap2_event->mmap2) + machine->id_hdr_size);
+	if (mmap2_event == NULL)
 		goto out_free_comm;
 
 	fork_event = malloc(sizeof(fork_event->fork) + machine->id_hdr_size);
@@ -601,7 +601,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
  		 * We may race with exiting thread, so don't stop just because
  		 * one thread couldn't be synthesized.
  		 */
-		__event__synthesize_thread(comm_event, mmap_event, fork_event, pid,
+		__event__synthesize_thread(comm_event, mmap2_event, fork_event, pid,
 					   1, process, tool, machine, mmap_data,
 					   proc_map_timeout);
 	}
@@ -611,7 +611,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
 out_free_fork:
 	free(fork_event);
 out_free_mmap:
-	free(mmap_event);
+	free(mmap2_event);
 out_free_comm:
 	free(comm_event);
 out:
-- 
1.8.3.4


* [PATCH 17/53] perf test: Improve bp_signal
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (15 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 16/53] perf tools: Fix mmap2 event allocation in synthesize code Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 21:37   ` Arnaldo Carvalho de Melo
  2016-01-11 13:48 ` [PATCH 18/53] perf tools: Add API to config maps in bpf object Wang Nan
                   ` (35 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Will Deacon, Jiri Olsa, Arnaldo Carvalho de Melo

Will Deacon [1] has some questions about patch [2]. This patch improves
test__bp_signal so we can test:

 1. A watchpoint and a breakpoint that fire on the same instruction
 2. Nested signals

Test result:

 On x86_64 and ARM64 (results are similar with patch [2] on ARM64):

 # ./perf test -v signal
 17: Test breakpoint overflow signal handler                  :
 --- start ---
 test child forked, pid 10213
 count1 1, count2 3, count3 2, overflow 3, overflows_2 3
 test child finished with 0
 ---- end ----
 Test breakpoint overflow signal handler: Ok

So at least the 2 cases Will doubted are handled correctly.

[1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
[2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/bp_signal.c | 140 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 118 insertions(+), 22 deletions(-)

diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index fb80c9e..1d1bb48 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -29,14 +29,59 @@
 
 static int fd1;
 static int fd2;
+static int fd3;
 static int overflows;
+static int overflows_2;
+
+volatile long the_var;
+
+
+/*
+ * Use ASM to ensure watchpoint and breakpoint can be triggered
+ * at one instruction.
+ */
+#if defined (__x86_64__)
+extern void __test_function(volatile long *ptr);
+asm (
+	".globl __test_function\n"
+	"__test_function:\n"
+	"incq (%rdi)\n"
+	"ret\n");
+#elif defined (__aarch64__)
+extern void __test_function(volatile long *ptr);
+asm (
+	".globl __test_function\n"
+	"__test_function:\n"
+	"str x30, [x0]\n"
+	"ret\n");
+
+#else
+static void __test_function(volatile long *ptr)
+{
+	*ptr = 0x1234;
+}
+#endif
 
 __attribute__ ((noinline))
 static int test_function(void)
 {
+	__test_function(&the_var);
+	the_var++;
 	return time(NULL);
 }
 
+static void sig_handler_2(int signum __maybe_unused,
+			  siginfo_t *oh __maybe_unused,
+			  void *uc __maybe_unused)
+{
+	overflows_2++;
+	if (overflows_2 > 10) {
+		ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
+		ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+		ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
+	}
+}
+
 static void sig_handler(int signum __maybe_unused,
 			siginfo_t *oh __maybe_unused,
 			void *uc __maybe_unused)
@@ -54,10 +99,11 @@ static void sig_handler(int signum __maybe_unused,
 		 */
 		ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
 		ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+		ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
 	}
 }
 
-static int bp_event(void *fn, int setup_signal)
+static int __event(bool is_x, void *addr, int signal)
 {
 	struct perf_event_attr pe;
 	int fd;
@@ -67,8 +113,8 @@ static int bp_event(void *fn, int setup_signal)
 	pe.size = sizeof(struct perf_event_attr);
 
 	pe.config = 0;
-	pe.bp_type = HW_BREAKPOINT_X;
-	pe.bp_addr = (unsigned long) fn;
+	pe.bp_type = is_x ? HW_BREAKPOINT_X : HW_BREAKPOINT_W;
+	pe.bp_addr = (unsigned long) addr;
 	pe.bp_len = sizeof(long);
 
 	pe.sample_period = 1;
@@ -86,17 +132,25 @@ static int bp_event(void *fn, int setup_signal)
 		return TEST_FAIL;
 	}
 
-	if (setup_signal) {
-		fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
-		fcntl(fd, F_SETSIG, SIGIO);
-		fcntl(fd, F_SETOWN, getpid());
-	}
+	fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
+	fcntl(fd, F_SETSIG, signal);
+	fcntl(fd, F_SETOWN, getpid());
 
 	ioctl(fd, PERF_EVENT_IOC_RESET, 0);
 
 	return fd;
 }
 
+static int bp_event(void *addr, int signal)
+{
+	return __event(true, addr, signal);
+}
+
+static int wp_event(void *addr, int signal)
+{
+	return __event(false, addr, signal);
+}
+
 static long long bp_count(int fd)
 {
 	long long count;
@@ -114,7 +168,7 @@ static long long bp_count(int fd)
 int test__bp_signal(int subtest __maybe_unused)
 {
 	struct sigaction sa;
-	long long count1, count2;
+	long long count1, count2, count3;
 
 	/* setup SIGIO signal handler */
 	memset(&sa, 0, sizeof(struct sigaction));
@@ -126,21 +180,52 @@ int test__bp_signal(int subtest __maybe_unused)
 		return TEST_FAIL;
 	}
 
+	sa.sa_sigaction = (void *) sig_handler_2;
+	if (sigaction(SIGUSR1, &sa, NULL) < 0) {
+		pr_debug("failed setting up signal handler 2\n");
+		return TEST_FAIL;
+	}
+
 	/*
 	 * We create following events:
 	 *
-	 * fd1 - breakpoint event on test_function with SIGIO
+	 * fd1 - breakpoint event on __test_function with SIGIO
 	 *       signal configured. We should get signal
 	 *       notification each time the breakpoint is hit
 	 *
-	 * fd2 - breakpoint event on sig_handler without SIGIO
+	 * fd2 - breakpoint event on sig_handler with SIGUSR1
+	 *       configured. We should get SIGUSR1 each time when
+	 *       breakpoint is hit
+	 *
+	 * fd3 - watchpoint event on the_var with SIGIO
 	 *       configured.
 	 *
 	 * Following processing should happen:
-	 *   - execute test_function
-	 *   - fd1 event breakpoint hit -> count1 == 1
-	 *   - SIGIO is delivered       -> overflows == 1
-	 *   - fd2 event breakpoint hit -> count2 == 1
+	 *   Exec:               Action:                       Result:
+	 *   incq (%rdi)       - fd1 event breakpoint hit   -> count1 == 1
+	 *                     - SIGIO is delivered
+	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 1
+	 *                     - SIGUSR1 is delivered
+	 *   sig_handler_2                                  -> overflows_2 == 1  (nested signal)
+	 *   sys_rt_sigreturn  - return from sig_handler_2
+	 *   overflows++                                    -> overflows == 1
+	 *   sys_rt_sigreturn  - return from sig_handler
+	 *   incq (%rdi)       - fd3 event watchpoint hit   -> count3 == 1       (wp and bp in one insn)
+	 *                     - SIGIO is delivered
+	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 2
+	 *                     - SIGUSR1 is delivered
+	 *   sig_handler_2                                  -> overflows_2 == 2  (nested signal)
+	 *   sys_rt_sigreturn  - return from sig_handler_2
+	 *   overflows++                                    -> overflows == 2
+	 *   sys_rt_sigreturn  - return from sig_handler
+	 *   the_var++         - fd3 event watchpoint hit   -> count3 == 2       (standalone watchpoint)
+	 *                     - SIGIO is delivered
+	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 3
+	 *                     - SIGUSR1 is delivered
+	 *   sig_handler_2                                  -> overflows_2 == 3  (nested signal)
+	 *   sys_rt_sigreturn  - return from sig_handler_2
+	 *   overflows++                                    -> overflows == 3
+	 *   sys_rt_sigreturn  - return from sig_handler
 	 *
 	 * The test case check following error conditions:
 	 * - we get stuck in signal handler because of debug
@@ -152,11 +237,13 @@ int test__bp_signal(int subtest __maybe_unused)
 	 *
 	 */
 
-	fd1 = bp_event(test_function, 1);
-	fd2 = bp_event(sig_handler, 0);
+	fd1 = bp_event(__test_function, SIGIO);
+	fd2 = bp_event(sig_handler, SIGUSR1);
+	fd3 = wp_event((void *)&the_var, SIGIO);
 
 	ioctl(fd1, PERF_EVENT_IOC_ENABLE, 0);
 	ioctl(fd2, PERF_EVENT_IOC_ENABLE, 0);
+	ioctl(fd3, PERF_EVENT_IOC_ENABLE, 0);
 
 	/*
 	 * Kick off the test by trigering 'fd1'
@@ -166,15 +253,18 @@ int test__bp_signal(int subtest __maybe_unused)
 
 	ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
 	ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+	ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
 
 	count1 = bp_count(fd1);
 	count2 = bp_count(fd2);
+	count3 = bp_count(fd3);
 
 	close(fd1);
 	close(fd2);
+	close(fd3);
 
-	pr_debug("count1 %lld, count2 %lld, overflow %d\n",
-		 count1, count2, overflows);
+	pr_debug("count1 %lld, count2 %lld, count3 %lld, overflow %d, overflows_2 %d\n",
+		 count1, count2, count3, overflows, overflows_2);
 
 	if (count1 != 1) {
 		if (count1 == 11)
@@ -183,12 +273,18 @@ int test__bp_signal(int subtest __maybe_unused)
 			pr_debug("failed: wrong count for bp1%lld\n", count1);
 	}
 
-	if (overflows != 1)
+	if (overflows != 3)
 		pr_debug("failed: wrong overflow hit\n");
 
-	if (count2 != 1)
+	if (overflows_2 != 3)
+		pr_debug("failed: wrong overflow_2 hit\n");
+
+	if (count2 != 3)
 		pr_debug("failed: wrong count for bp2\n");
 
-	return count1 == 1 && overflows == 1 && count2 == 1 ?
+	if (count3 != 2)
+		pr_debug("failed: wrong count for bp3\n");
+
+	return count1 == 1 && overflows == 3 && count2 == 3 && overflows_2 == 3 && count3 == 2 ?
 		TEST_OK : TEST_FAIL;
 }
-- 
1.8.3.4

* [PATCH 18/53] perf tools: Add API to config maps in bpf object
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (16 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 17/53] perf test: Improve bp_signal Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 19/53] perf tools: Enable BPF object configure syntax Wang Nan
                   ` (34 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim

bpf__config_obj() is introduced as the core API for configuring a BPF
object after it is loaded. This patch introduces one configuration option
for maps. After this patch, a BPF object accepts configuration like:

 maps:my_map.value=1234

(maps.my_map.value looks prettier. However, it runs into a small but
hard-to-fix problem related to flex's greedy matching; please see [1].
':' is chosen as a simpler way to avoid it.)
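
As a minimal sketch of how such a key decomposes, assuming the event
parser has already separated the "=1234" value from the key, the helper
below splits "maps:<mapname>.<option>" into its two parts. The name
split_map_key and the fixed-size output buffers are hypothetical and are
not what this patch uses:

```c
#include <string.h>

/*
 * Split "maps:<mapname>.<option>" into its two parts.
 * Returns 0 on success, -1 if the key is malformed or a buffer is too small.
 */
static int split_map_key(const char *key, char *map, size_t map_sz,
			 char *opt, size_t opt_sz)
{
	static const char prefix[] = "maps:";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *name, *dot;
	size_t name_len;

	if (strncmp(key, prefix, prefix_len) != 0)
		return -1;

	/* The map name runs from the end of the prefix to the first '.'. */
	name = key + prefix_len;
	dot = strchr(name, '.');
	if (!dot || dot == name || dot[1] == '\0')
		return -1;

	name_len = (size_t)(dot - name);
	if (name_len >= map_sz || strlen(dot + 1) >= opt_sz)
		return -1;

	memcpy(map, name, name_len);
	map[name_len] = '\0';
	strcpy(opt, dot + 1); /* length was checked against opt_sz above */
	return 0;
}
```

For "maps:my_map.value" this yields the map name "my_map" and the option
"value"; the option name is then what gets looked up in the table of
supported map config options.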

This patch is more complex than the work it actually does because it is
designed with extension in mind. In designing BPF map configuration, the
following things should be considered:

 1. Array index selection: perf should allow the user to set different
    values for different slots in an array, with syntax like:
    maps:my_map.value[0,3...6]=1234;

 2. A map can be configured by several config terms, each covering a
    part of it. For example, set each slot to the pid of a thread;

 3. Type of value: integer is not the only valid value type. A perf
    event can also be put into a map after commit 35578d7984003097af2b1e3
    (bpf: Implement function bpf_perf_event_read() that get the selected
    hardware PMU conuter);

 4. For a hash table, it is possible to use a string or another type as
    the key;

 5. A map configuration may not be able to be set up during parsing.
    A perf event is an example.

Therefore, this patch does the following:

 1. Instead of updating map elements during parsing, this patch stores
    map config options in 'struct bpf_map_priv'. Later patches apply
    those configs at the proper time;

 2. Link map operations into a list so a map can have multiple config
    terms attached, so different parts can be configured separately;

 3. Make 'struct bpf_map_priv' extensible so later patches can
    add new types of keys and operations;

 4. Use the bpf_obj_config_map_funcs array to support more map config
    options.
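
The "record now, apply later" design in points 1 and 2 can be sketched in
isolation. The example below is an illustration under simplifying
assumptions: it uses a hand-rolled singly linked list instead of list.h,
a plain uint64_t array instead of a BPF map, and hypothetical names
(map_op, map_priv, map_priv_add_set_value, map_priv_apply):

```c
#include <stdint.h>
#include <stdlib.h>

enum map_op_type { MAP_OP_SET_VALUE };

struct map_op {
	struct map_op *next;
	enum map_op_type type;
	uint64_t value;
};

struct map_priv {
	struct map_op *head;
	struct map_op **tail; /* points at the last 'next' slot */
};

static void map_priv_init(struct map_priv *priv)
{
	priv->head = NULL;
	priv->tail = &priv->head;
}

/* Record a "set value" request; nothing touches the map yet. */
static int map_priv_add_set_value(struct map_priv *priv, uint64_t value)
{
	struct map_op *op = calloc(1, sizeof(*op));

	if (!op)
		return -1;
	op->type = MAP_OP_SET_VALUE;
	op->value = value;
	*priv->tail = op;
	priv->tail = &op->next;
	return 0;
}

/* Later, replay the recorded ops in the order they were given. */
static void map_priv_apply(const struct map_priv *priv,
			   uint64_t *slots, size_t nr_slots)
{
	const struct map_op *op;
	size_t i;

	for (op = priv->head; op; op = op->next)
		if (op->type == MAP_OP_SET_VALUE)
			for (i = 0; i < nr_slots; i++)
				slots[i] = op->value;
}

/* Free every recorded op and return the list to its empty state. */
static void map_priv_clear(struct map_priv *priv)
{
	struct map_op *op = priv->head, *n;

	while (op) {
		n = op->next;
		free(op);
		op = n;
	}
	map_priv_init(priv);
}
```

Keeping the ops in insertion order means a later config term overrides an
earlier one for the slots they share, which is one plausible way to let
several terms each configure a part of the same map.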

Since the patch that changes the event parser to parse BPF object config
is relatively large, I put it in a separate commit. The code in this patch
can be tested after applying the next patch.

[1] http://lkml.kernel.org/g/564ED621.4050500@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c | 266 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h |  38 +++++++
 2 files changed, 304 insertions(+)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 540a7ef..7d361aa 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -739,6 +739,251 @@ int bpf__foreach_tev(struct bpf_object *obj,
 	return 0;
 }
 
+enum bpf_map_op_type {
+	BPF_MAP_OP_SET_VALUE,
+};
+
+enum bpf_map_key_type {
+	BPF_MAP_KEY_ALL,
+};
+
+struct bpf_map_op {
+	struct list_head list;
+	enum bpf_map_op_type op_type;
+	enum bpf_map_key_type key_type;
+	union {
+		u64 value;
+	} v;
+};
+
+struct bpf_map_priv {
+	struct list_head ops_list;
+};
+
+static void
+bpf_map_op__free(struct bpf_map_op *op)
+{
+	struct list_head *list = &op->list;
+	/*
+	 * bpf_map_op__free() needs to consider following cases:
+	 *   1. When the op is created but not linked to any list:
+	 *      impossible. This only happen in bpf_map_op__alloc()
+	 *      and it would be freed directly;
+	 *   2. Normal case, when the op is linked to a list;
+	 *   3. After the op has already been removed.
+	 * Thanks to list.h, if it has been removed by list_del() then
+	 * list->{next,prev} should have been set to LIST_POISON{1,2}.
+	 */
+	if ((list->next != LIST_POISON1) && (list->prev != LIST_POISON2))
+		list_del(list);
+	free(op);
+}
+
+static void
+bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
+		    void *_priv)
+{
+	struct bpf_map_priv *priv = _priv;
+	struct bpf_map_op *pos, *n;
+
+	list_for_each_entry_safe(pos, n, &priv->ops_list, list)
+		bpf_map_op__free(pos);
+	free(priv);
+}
+
+static struct bpf_map_op *
+bpf_map_op__alloc(struct bpf_map *map)
+{
+	struct bpf_map_op *op;
+	struct bpf_map_priv *priv;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+	err = bpf_map__get_private(map, (void **)&priv);
+	if (err) {
+		pr_debug("Failed to get private from map %s\n", map_name);
+		return ERR_PTR(err);
+	}
+
+	if (!priv) {
+		priv = zalloc(sizeof(*priv));
+		if (!priv) {
+			pr_debug("No enough memory to alloc map private\n");
+			return ERR_PTR(-ENOMEM);
+		}
+		INIT_LIST_HEAD(&priv->ops_list);
+
+		if (bpf_map__set_private(map, priv, bpf_map_priv__clear)) {
+			free(priv);
+			return ERR_PTR(-BPF_LOADER_ERRNO__INTERNAL);
+		}
+	}
+
+	op = zalloc(sizeof(*op));
+	if (!op) {
+		pr_debug("Failed to alloc bpf_map_op\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	op->key_type = BPF_MAP_KEY_ALL;
+	list_add_tail(&op->list, &priv->ops_list);
+	return op;
+}
+
+static int
+bpf__obj_config_map_array_value(struct bpf_map *map,
+				struct parse_events_term *term)
+{
+	struct bpf_map_def def;
+	struct bpf_map_op *op;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("Unable to get map definition from '%s'\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	if (def.type != BPF_MAP_TYPE_ARRAY) {
+		pr_debug("Map %s type is not BPF_MAP_TYPE_ARRAY\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+	}
+	if (def.key_size < sizeof(unsigned int)) {
+		pr_debug("Map %s has incorrect key size\n", map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE;
+	}
+	switch (def.value_size) {
+	case 1:
+	case 2:
+	case 4:
+	case 8:
+		break;
+	default:
+		pr_debug("Map %s has incorrect value size\n", map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
+	}
+
+	op = bpf_map_op__alloc(map);
+	if (IS_ERR(op))
+		return PTR_ERR(op);
+	op->op_type = BPF_MAP_OP_SET_VALUE;
+	op->v.value = term->val.num;
+	return 0;
+}
+
+static int
+bpf__obj_config_map_value(struct bpf_map *map,
+			  struct parse_events_term *term,
+			  struct perf_evlist *evlist __maybe_unused)
+{
+	if (!term->err_val) {
+		pr_debug("Config value not set\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
+	}
+
+	if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM)
+		return bpf__obj_config_map_array_value(map, term);
+
+	pr_debug("ERROR: wrong value type\n");
+	return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
+}
+
+struct bpf_obj_config_map_func {
+	const char *config_opt;
+	int (*config_func)(struct bpf_map *, struct parse_events_term *,
+			   struct perf_evlist *);
+};
+
+struct bpf_obj_config_map_func bpf_obj_config_map_funcs[] = {
+	{"value", bpf__obj_config_map_value},
+};
+
+static int
+bpf__obj_config_map(struct bpf_object *obj,
+		    struct parse_events_term *term,
+		    struct perf_evlist *evlist,
+		    int *key_scan_pos)
+{
+	/* key is "maps:<mapname>.<config opt>" */
+	char *map_name = strdup(term->config + sizeof("maps:") - 1);
+	struct bpf_map *map;
+	int err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
+	char *map_opt;
+	size_t i;
+
+	if (!map_name)
+		return -ENOMEM;
+
+	map_opt = strchr(map_name, '.');
+	if (!map_opt) {
+		pr_debug("ERROR: Invalid map config: %s\n", map_name);
+		goto out;
+	}
+
+	*map_opt++ = '\0';
+	if (*map_opt == '\0') {
+		pr_debug("ERROR: Invalid map option: %s\n", term->config);
+		goto out;
+	}
+
+	map = bpf_object__get_map_by_name(obj, map_name);
+	if (!map) {
+		pr_debug("ERROR: Map %s is not exist\n", map_name);
+		err = -BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST;
+		goto out;
+	}
+
+	*key_scan_pos += map_opt - map_name;
+	for (i = 0; i < ARRAY_SIZE(bpf_obj_config_map_funcs); i++) {
+		struct bpf_obj_config_map_func *func =
+				&bpf_obj_config_map_funcs[i];
+
+		if (strcmp(map_opt, func->config_opt) == 0) {
+			err = func->config_func(map, term, evlist);
+			goto out;
+		}
+	}
+
+	pr_debug("ERROR: invalid config option '%s' for maps\n",
+		 map_opt);
+	err = -BPF_LOADER_ERRNO__OBJCONF_MAP_OPT;
+out:
+	if (!err)
+		*key_scan_pos += strlen(map_opt);
+	free(map_name);
+	return err;
+}
+
+int bpf__config_obj(struct bpf_object *obj,
+		    struct parse_events_term *term,
+		    struct perf_evlist *evlist,
+		    int *error_pos)
+{
+	int key_scan_pos = 0;
+	int err;
+
+	if (!obj || !term || !term->config)
+		return -EINVAL;
+
+	if (!prefixcmp(term->config, "maps:")) {
+		key_scan_pos = sizeof("maps:") - 1;
+		err = bpf__obj_config_map(obj, term, evlist, &key_scan_pos);
+		goto out;
+	}
+	err = -BPF_LOADER_ERRNO__OBJCONF_OPT;
+out:
+	if (error_pos)
+		*error_pos = key_scan_pos;
+	return err;
+
+}
+
 #define ERRNO_OFFSET(e)		((e) - __BPF_LOADER_ERRNO__START)
 #define ERRCODE_OFFSET(c)	ERRNO_OFFSET(BPF_LOADER_ERRNO__##c)
 #define NR_ERRNO	(__BPF_LOADER_ERRNO__END - __BPF_LOADER_ERRNO__START)
@@ -753,6 +998,14 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(PROLOGUE)]	= "Failed to generate prologue",
 	[ERRCODE_OFFSET(PROLOGUE2BIG)]	= "Prologue too big for program",
 	[ERRCODE_OFFSET(PROLOGUEOOB)]	= "Offset out of bound for prologue",
+	[ERRCODE_OFFSET(OBJCONF_OPT)]	= "Invalid object config option",
+	[ERRCODE_OFFSET(OBJCONF_CONF)]	= "Config value not set (lost '=')",
+	[ERRCODE_OFFSET(OBJCONF_MAP_OPT)]	= "Invalid object maps config option",
+	[ERRCODE_OFFSET(OBJCONF_MAP_NOTEXIST)]	= "Target map not exist",
+	[ERRCODE_OFFSET(OBJCONF_MAP_VALUE)]	= "Incorrect value type for map",
+	[ERRCODE_OFFSET(OBJCONF_MAP_TYPE)]	= "Incorrect map type",
+	[ERRCODE_OFFSET(OBJCONF_MAP_KEYSIZE)]	= "Incorrect map key size",
+	[ERRCODE_OFFSET(OBJCONF_MAP_VALUESIZE)]	= "Incorrect map value size",
 };
 
 static int
@@ -872,3 +1125,16 @@ int bpf__strerror_load(struct bpf_object *obj,
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
+			     struct parse_events_term *term __maybe_unused,
+			     struct perf_evlist *evlist __maybe_unused,
+			     int *error_pos __maybe_unused, int err,
+			     char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,
+			    "Can't use this config term to this type of map");
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 6fdc045..2464db9 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -10,6 +10,7 @@
 #include <string.h>
 #include <bpf/libbpf.h>
 #include "probe-event.h"
+#include "evlist.h"
 #include "debug.h"
 
 enum bpf_loader_errno {
@@ -24,10 +25,19 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__PROLOGUE,	/* Failed to generate prologue */
 	BPF_LOADER_ERRNO__PROLOGUE2BIG,	/* Prologue too big for program */
 	BPF_LOADER_ERRNO__PROLOGUEOOB,	/* Offset out of bound for prologue */
+	BPF_LOADER_ERRNO__OBJCONF_OPT,	/* Invalid object config option */
+	BPF_LOADER_ERRNO__OBJCONF_CONF,	/* Config value not set (lost '=') */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_OPT,	/* Invalid object maps config option */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_NOTEXIST,	/* Target map not exist */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE,	/* Incorrect value type for map */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,	/* Incorrect map type */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE,	/* Incorrect map key size */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE,/* Incorrect map value size */
 	__BPF_LOADER_ERRNO__END,
 };
 
 struct bpf_object;
+struct parse_events_term;
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
 typedef int (*bpf_prog_iter_callback_t)(struct probe_trace_event *tev,
@@ -53,6 +63,14 @@ int bpf__strerror_load(struct bpf_object *obj, int err,
 		       char *buf, size_t size);
 int bpf__foreach_tev(struct bpf_object *obj,
 		     bpf_prog_iter_callback_t func, void *arg);
+
+int bpf__config_obj(struct bpf_object *obj, struct parse_events_term *term,
+		    struct perf_evlist *evlist, int *error_pos);
+int bpf__strerror_config_obj(struct bpf_object *obj,
+			     struct parse_events_term *term,
+			     struct perf_evlist *evlist,
+			     int *error_pos, int err, char *buf,
+			     size_t size);
 #else
 static inline struct bpf_object *
 bpf__prepare_load(const char *filename __maybe_unused,
@@ -84,6 +102,15 @@ bpf__foreach_tev(struct bpf_object *obj __maybe_unused,
 }
 
 static inline int
+bpf__config_obj(struct bpf_object *obj __maybe_unused,
+		struct parse_events_term *term __maybe_unused,
+		struct perf_evlist *evlist __maybe_unused,
+		int *error_pos __maybe_unused)
+{
+	return 0;
+}
+
+static inline int
 __bpf_strerror(char *buf, size_t size)
 {
 	if (!size)
@@ -118,5 +145,16 @@ static inline int bpf__strerror_load(struct bpf_object *obj __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int
+bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
+			 struct parse_events_term *term __maybe_unused,
+			 struct perf_evlist *evlist __maybe_unused,
+			 int *error_pos __maybe_unused,
+			 int err __maybe_unused,
+			 char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
1.8.3.4

* [PATCH 19/53] perf tools: Enable BPF object configure syntax
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (17 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 18/53] perf tools: Add API to config maps in bpf object Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 20/53] perf record: Apply config to BPF objects before recording Wang Nan
                   ` (33 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Namhyung Kim

This patch adds the final step for BPF map configuration. New syntax
is added to the parser so that the user can configure BPF objects through
config terms enclosed in '/' '/'.

After this patch, the following syntax is available:

 # perf record -e ./test_bpf_map_1.c/maps:channel.value=10/ ...

It takes effect after applying the following commits.

Test result:

 # cat ./test_bpf_map_1.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
     (void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
     .type = BPF_MAP_TYPE_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = 1,
 };
 SEC("func=sys_nanosleep")
 int func(void *ctx)
 {
     int key = 0;
     char fmt[] = "%d\n";
     int *pval = map_lookup_elem(&channel, &key);
     if (!pval)
         return 0;
     trace_printk(fmt, sizeof(fmt), *pval);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 - Normal case:
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=10/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]

 - Error case:

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value/' usleep 10
 event syntax error: '..ps:channel.value/'
                                   \___ Config value not set (lost '=')
 Hint:	Valid config term:
      	maps:[<arraymap>].value=[value]
	     	(add -v to see detail)
	Run 'perf list' for a list of valid events

 Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event <event>   event selector. use 'perf list' to list available events

 # ./perf record -e './test_bpf_map_1.c/xmaps:channel.value=10/' usleep 10
 event syntax error: '..pf_map_1.c/xmaps:channel.value=10/'
                                   \___ Invalid object config option
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:xchannel.value=10/' usleep 10
 event syntax error: '..p_1.c/maps:xchannel.value=10/'
                                   \___ Target map not exist
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:channel.xvalue=10/' usleep 10
 event syntax error: '..ps:channel.xvalue=10/'
                                   \___ Invalid object maps config option
 [SNIP]

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=x10/' usleep 10
 event syntax error: '..nnel.value=x10/'
                                   \___ Incorrect value type for map
 [SNIP]

 Change the map type from BPF_MAP_TYPE_ARRAY to '1' (BPF_MAP_TYPE_HASH):

 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=10/' usleep 10
 event syntax error: '..ps:channel.value=10/'
                                   \___ Can't use this config term to this type of map

 Hint:	Valid config term:
     	maps:[<arraymap>].value=[value]
     	(add -v to see detail)
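
The position marked in the error cases above is an offset into the config
term, which the parser adds to the term's own position in the event
string. A minimal sketch of how such an offset could be derived, loosely
mirroring bpf__obj_config_map() from the previous patch in simplified
form (the helper name config_error_offset and the map_exists flag are
hypothetical stand-ins for the real map lookup):

```c
#include <string.h>

/*
 * Return the offset within 'key' at which scanning stopped:
 *   0                         -> the "maps:" prefix itself is wrong
 *   strlen("maps:")           -> missing option or unknown map name
 *   offset of the option name -> map found, the option is what remains
 */
static size_t config_error_offset(const char *key, int map_exists)
{
	static const char prefix[] = "maps:";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *dot;

	if (strncmp(key, prefix, prefix_len) != 0)
		return 0;

	dot = strchr(key + prefix_len, '.');
	if (!dot || dot[1] == '\0' || !map_exists)
		return prefix_len;

	/* Scanning got past "<mapname>."; point at the option name. */
	return (size_t)(dot + 1 - key);
}
```

With this sketch, "maps:channel.xvalue" on an existing map gives offset 13,
the 'x' of "xvalue", while an unknown map leaves the offset at 5, the
first character of the map name.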

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c | 56 +++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/parse-events.h |  3 ++-
 tools/perf/util/parse-events.l |  2 +-
 tools/perf/util/parse-events.y | 23 ++++++++++++++---
 4 files changed, 75 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4f7b0ef..1c2dc5d 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -628,17 +628,64 @@ errout:
 	return err;
 }
 
+static int
+parse_events_config_bpf(struct parse_events_evlist *data,
+		       struct bpf_object *obj,
+		       struct list_head *head_config)
+{
+	struct parse_events_term *term;
+	int error_pos;
+
+	if (!head_config || list_empty(head_config))
+		return 0;
+
+	list_for_each_entry(term, head_config, list) {
+		char errbuf[BUFSIZ];
+		int err;
+
+		if (term->type_term != PARSE_EVENTS__TERM_TYPE_USER) {
+			snprintf(errbuf, sizeof(errbuf),
+				 "Invalid config term for BPF object");
+			errbuf[BUFSIZ - 1] = '\0';
+
+			data->error->idx = term->err_term;
+			data->error->str = strdup(errbuf);
+			return -EINVAL;
+		}
+
+		err = bpf__config_obj(obj, term, NULL, &error_pos);
+		if (err) {
+			bpf__strerror_config_obj(obj, term, NULL,
+						 &error_pos, err, errbuf,
+						 sizeof(errbuf));
+			data->error->help = strdup(
+"Hint:\tValid config term:\n"
+"     \tmaps:[<arraymap>].value=[value]\n"
+"     \t(add -v to see detail)");
+			data->error->str = strdup(errbuf);
+			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
+				data->error->idx = term->err_val;
+			else
+				data->error->idx = term->err_term + error_pos;
+			return err;
+		}
+	}
+	return 0;
+
+}
+
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
-			  bool source)
+			  bool source,
+			  struct list_head *head_config)
 {
 	struct bpf_object *obj;
+	int err;
 
 	obj = bpf__prepare_load(bpf_file_name, source);
 	if (IS_ERR(obj)) {
 		char errbuf[BUFSIZ];
-		int err;
 
 		err = PTR_ERR(obj);
 
@@ -656,7 +703,10 @@ int parse_events_load_bpf(struct parse_events_evlist *data,
 		return err;
 	}
 
-	return parse_events_load_bpf_obj(data, list, obj);
+	err = parse_events_load_bpf_obj(data, list, obj);
+	if (err)
+		return err;
+	return parse_events_config_bpf(data, obj, head_config);
 }
 
 static int
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index f1a6db1..84694f3 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -126,7 +126,8 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
 int parse_events_load_bpf(struct parse_events_evlist *data,
 			  struct list_head *list,
 			  char *bpf_file_name,
-			  bool source);
+			  bool source,
+			  struct list_head *head_config);
 /* Provide this function for perf test */
 struct bpf_object;
 int parse_events_load_bpf_obj(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 58c5831..4387728 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -122,7 +122,7 @@ num_dec		[0-9]+
 num_hex		0x[a-fA-F0-9]+
 num_raw_hex	[a-fA-F0-9]+
 name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
-name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.]*
+name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
 /* If you add a modifier you need to update check_modifier() */
 modifier_event	[ukhpPGHSDI]+
 modifier_bp	[rwx]{1,3}
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index ad37996..8992d16 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -64,6 +64,7 @@ static inc_group_count(struct list_head *list,
 %type <str> PE_PMU_EVENT_PRE PE_PMU_EVENT_SUF PE_KERNEL_PMU_EVENT
 %type <num> value_sym
 %type <head> event_config
+%type <head> event_bpf_config
 %type <term> event_term
 %type <head> event_pmu
 %type <head> event_legacy_symbol
@@ -455,27 +456,41 @@ PE_RAW
 }
 
 event_bpf_file:
-PE_BPF_OBJECT
+PE_BPF_OBJECT event_bpf_config
 {
 	struct parse_events_evlist *data = _data;
 	struct parse_events_error *error = data->error;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_load_bpf(data, list, $1, false));
+	ABORT_ON(parse_events_load_bpf(data, list, $1, false, $2));
+	if ($2)
+		parse_events__free_terms($2);
 	$$ = list;
 }
 |
-PE_BPF_SOURCE
+PE_BPF_SOURCE event_bpf_config
 {
 	struct parse_events_evlist *data = _data;
 	struct list_head *list;
 
 	ALLOC_LIST(list);
-	ABORT_ON(parse_events_load_bpf(data, list, $1, true));
+	ABORT_ON(parse_events_load_bpf(data, list, $1, true, $2));
+	if ($2)
+		parse_events__free_terms($2);
 	$$ = list;
 }
 
+event_bpf_config:
+'/' event_config '/'
+{
+	$$ = $2;
+}
+|
+{
+	$$ = NULL;
+}
+
 start_terms: event_config
 {
 	struct parse_events_terms *data = _data;
-- 
1.8.3.4

* [PATCH 20/53] perf record: Apply config to BPF objects before recording
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (18 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 19/53] perf tools: Enable BPF object configure syntax Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 21/53] perf tools: Enable passing event to BPF object Wang Nan
                   ` (32 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim

bpf__apply_obj_config() is introduced as the core API to apply object
config options to all BPF objects. This patch also does the real work
of setting values for BPF_MAP_TYPE_ARRAY maps by inserting the value
stored in the map's private field into the BPF map.
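
The configured value is parsed as a 64-bit number, while the map's
value_size may be 1, 2, 4 or 8 bytes, so the value has to be narrowed
before it is written. A minimal sketch of that step follows, assuming a
plain memory slot in place of the BPF map; store_config_value() is a
hypothetical name and makes no kernel or libbpf call:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the map update: copies exactly val_size
 * bytes of val into slot. Returns 0 on success, -1 for an unsupported
 * size. No kernel or libbpf call is made here.
 */
static int store_config_value(void *slot, size_t val_size, uint64_t val)
{
	assert(slot != NULL);

	switch (val_size) {
	case 1: {
		uint8_t v = (uint8_t)val;	/* keep the low 8 bits */
		memcpy(slot, &v, sizeof(v));
		return 0;
	}
	case 2: {
		uint16_t v = (uint16_t)val;	/* keep the low 16 bits */
		memcpy(slot, &v, sizeof(v));
		return 0;
	}
	case 4: {
		uint32_t v = (uint32_t)val;	/* keep the low 32 bits */
		memcpy(slot, &v, sizeof(v));
		return 0;
	}
	case 8:
		memcpy(slot, &val, sizeof(val));
		return 0;
	default:
		return -1;			/* unsupported value size */
	}
}
```

Narrowing into a correctly sized temporary, rather than passing a
pointer to the 64-bit value directly, keeps the result independent of
byte order: on a big-endian machine the first 4 bytes of a u64 are its
high half.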

This patch is required because we are not always able to apply all
BPF config options during parsing. A further patch will put events
created by perf into BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, and those
events do not exist until perf_evsel__open().

bpf_map_config_foreach_key() is introduced to iterate over each key
that needs to be configured. This function will be extended to support
more map types and different key settings.
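
The iteration can be pictured as a callback applied to every key of an
array-type map. The sketch below is only an illustration under stated
assumptions: struct fake_array_map, foreach_key_all() and set_slot() are
hypothetical names, and an int array in user memory stands in for the
BPF map:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an array-type BPF map held in user memory. */
struct fake_array_map {
	unsigned int max_entries;
	int *slots;
};

/* Callback invoked once per key; a non-zero return aborts the walk. */
typedef int (*key_func_t)(struct fake_array_map *map,
			  unsigned int key, void *arg);

/* Visit keys 0 .. max_entries - 1 and stop at the first failure. */
static int foreach_key_all(struct fake_array_map *map,
			   key_func_t func, void *arg)
{
	unsigned int i;
	int err;

	assert(map != NULL && func != NULL);

	for (i = 0; i < map->max_entries; i++) {
		err = func(map, i, arg);
		if (err)
			return err;
	}
	return 0;
}

/* Example callback: store the int pointed to by arg into the slot. */
static int set_slot(struct fake_array_map *map, unsigned int key, void *arg)
{
	map->slots[key] = *(int *)arg;
	return 0;
}
```

Keeping the per-key action behind a function pointer is what lets the
same walk be reused when other operations or key selections are added.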

In perf record, call bpf__apply_obj_config() before recording starts
to apply all BPF config options.

Test result:

 # cat ./test_bpf_map_1.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
     (void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
     .type = BPF_MAP_TYPE_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = 1,
 };
 SEC("func=sys_nanosleep")
 int func(void *ctx)
 {
     int key = 0;
     char fmt[] = "%d\n";
     int *pval = map_lookup_elem(&channel, &key);
     if (!pval)
         return 0;
     trace_printk(fmt, sizeof(fmt), *pval);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=11/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 1/1   #P:8
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            usleep-18593 [007] d... 2394714.395539: : 11
 # ./perf record -e './test_bpf_map_1.c/maps:channel.value=101/' usleep 10
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 1/1   #P:8
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            usleep-18593 [007] d... 2394714.395539: : 11
            usleep-19000 [006] d... 2394831.057840: : 101

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c  |  11 +++
 tools/perf/util/bpf-loader.c | 180 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h |  15 ++++
 3 files changed, 206 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dc4e0ad..bd1692c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -32,6 +32,7 @@
 #include "util/parse-branch-options.h"
 #include "util/parse-regs-options.h"
 #include "util/llvm-utils.h"
+#include "util/bpf-loader.h"
 
 #include <unistd.h>
 #include <sched.h>
@@ -526,6 +527,16 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		goto out_child;
 	}
 
+	err = bpf__apply_obj_config();
+	if (err) {
+		char errbuf[BUFSIZ];
+
+		bpf__strerror_apply_obj_config(err, errbuf, sizeof(errbuf));
+		pr_err("ERROR: Apply config to BPF failed: %s\n",
+			 errbuf);
+		goto out_child;
+	}
+
 	/*
 	 * Normally perf_session__new would do this, but it doesn't have the
 	 * evlist.
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 7d361aa..96fd18b 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -7,6 +7,7 @@
 
 #include <linux/bpf.h>
 #include <bpf/libbpf.h>
+#include <bpf/bpf.h>
 #include <linux/err.h>
 #include <linux/string.h>
 #include "perf.h"
@@ -984,6 +985,178 @@ out:
 
 }
 
+typedef int (*map_config_func_t)(const char *name, int map_fd,
+				 struct bpf_map_def *pdef,
+				 struct bpf_map_op *op,
+				 void *pkey, void *arg);
+
+static int
+foreach_key_array_all(map_config_func_t func,
+		      void *arg, const char *name,
+		      int map_fd, struct bpf_map_def *pdef,
+		      struct bpf_map_op *op)
+{
+	unsigned int i;
+	int err;
+
+	for (i = 0; i < pdef->max_entries; i++) {
+		err = func(name, map_fd, pdef, op, &i, arg);
+		if (err) {
+			pr_debug("ERROR: failed to insert value to %s[%u]\n",
+				 name, i);
+			return err;
+		}
+	}
+	return 0;
+}
+
+static int
+bpf_map_config_foreach_key(struct bpf_map *map,
+			   map_config_func_t func,
+			   void *arg)
+{
+	int err, map_fd;
+	const char *name;
+	struct bpf_map_op *op;
+	struct bpf_map_def def;
+	struct bpf_map_priv *priv;
+
+	name = bpf_map__get_name(map);
+
+	err = bpf_map__get_private(map, (void **)&priv);
+	if (err) {
+		pr_debug("ERROR: failed to get private from map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	if (!priv || list_empty(&priv->ops_list)) {
+		pr_debug("INFO: nothing to config for map %s\n", name);
+		return 0;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("ERROR: failed to get definition from map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	map_fd = bpf_map__get_fd(map);
+	if (map_fd < 0) {
+		pr_debug("ERROR: failed to get fd from map %s\n", name);
+		return map_fd;
+	}
+
+	list_for_each_entry(op, &priv->ops_list, list) {
+		switch (def.type) {
+		case BPF_MAP_TYPE_ARRAY:
+			switch (op->key_type) {
+			case BPF_MAP_KEY_ALL:
+				return foreach_key_array_all(func, arg, name,
+							     map_fd, &def, op);
+			default:
+				pr_debug("ERROR: keytype for map '%s' invalid\n",
+					 name);
+				return -BPF_LOADER_ERRNO__INTERNAL;
+		}
+		default:
+			pr_debug("ERROR: type of '%s' incorrect\n", name);
+			return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+		}
+	}
+
+	return 0;
+}
+
+static int
+apply_config_value_for_key(int map_fd, void *pkey,
+			   size_t val_size, u64 val)
+{
+	int err = 0;
+
+	switch (val_size) {
+	case 1: {
+		u8 _val = (u8)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 2: {
+		u16 _val = (u16)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 4: {
+		u32 _val = (u32)(val);
+		err = bpf_map_update_elem(map_fd, pkey, &_val, BPF_ANY);
+		break;
+	}
+	case 8: {
+		err = bpf_map_update_elem(map_fd, pkey, &val, BPF_ANY);
+		break;
+	}
+	default:
+		pr_debug("ERROR: invalid value size\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
+	}
+	if (err && errno)
+		err = -errno;
+	return err;
+}
+
+static int
+apply_obj_config_map_for_key(const char *name, int map_fd,
+			     struct bpf_map_def *pdef __maybe_unused,
+			     struct bpf_map_op *op,
+			     void *pkey, void *arg __maybe_unused)
+{
+	int err;
+
+	switch (op->op_type) {
+	case BPF_MAP_OP_SET_VALUE:
+		err = apply_config_value_for_key(map_fd, pkey,
+						 pdef->value_size,
+						 op->v.value);
+		break;
+	default:
+		pr_debug("ERROR: unknown value type for '%s'\n", name);
+		err = -BPF_LOADER_ERRNO__INTERNAL;
+	}
+	return err;
+}
+
+static int
+apply_obj_config_map(struct bpf_map *map)
+{
+	return bpf_map_config_foreach_key(map,
+					  apply_obj_config_map_for_key,
+					  NULL);
+}
+
+static int
+apply_obj_config_object(struct bpf_object *obj)
+{
+	struct bpf_map *map;
+	int err;
+
+	bpf_map__for_each(map, obj) {
+		err = apply_obj_config_map(map);
+		if (err)
+			return err;
+	}
+	return 0;
+}
+
+int bpf__apply_obj_config(void)
+{
+	struct bpf_object *obj, *tmp;
+	int err;
+
+	bpf_object__for_each_safe(obj, tmp) {
+		err = apply_obj_config_object(obj);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 #define ERRNO_OFFSET(e)		((e) - __BPF_LOADER_ERRNO__START)
 #define ERRCODE_OFFSET(c)	ERRNO_OFFSET(BPF_LOADER_ERRNO__##c)
 #define NR_ERRNO	(__BPF_LOADER_ERRNO__END - __BPF_LOADER_ERRNO__START)
@@ -1138,3 +1311,10 @@ int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 	bpf__strerror_end(buf, size);
 	return 0;
 }
+
+int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
+{
+	bpf__strerror_head(err, buf, size);
+	bpf__strerror_end(buf, size);
+	return 0;
+}
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 2464db9..db3c34c 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -71,6 +71,8 @@ int bpf__strerror_config_obj(struct bpf_object *obj,
 			     struct perf_evlist *evlist,
 			     int *error_pos, int err, char *buf,
 			     size_t size);
+int bpf__apply_obj_config(void);
+int bpf__strerror_apply_obj_config(int err, char *buf, size_t size);
 #else
 static inline struct bpf_object *
 bpf__prepare_load(const char *filename __maybe_unused,
@@ -111,6 +113,12 @@ bpf__config_obj(struct bpf_object *obj __maybe_unused,
 }
 
 static inline int
+bpf__apply_obj_config(void)
+{
+	return 0;
+}
+
+static inline int
 __bpf_strerror(char *buf, size_t size)
 {
 	if (!size)
@@ -156,5 +164,12 @@ bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 {
 	return __bpf_strerror(buf, size);
 }
+
+static inline int
+bpf__strerror_apply_obj_config(int err __maybe_unused,
+			       char *buf, size_t size)
+{
+	return __bpf_strerror(buf, size);
+}
 #endif
 #endif
-- 
1.8.3.4

* [PATCH 21/53] perf tools: Enable passing event to BPF object
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (19 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 20/53] perf record: Apply config to BPF objects before recording Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 22/53] perf tools: Support perf event alias name Wang Nan
                   ` (31 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Namhyung Kim

A new syntax is added to the parser so that users can pass predefined
perf events into BPF objects.

After this patch, BPF programs for perf are finally able to utilize
bpf_perf_event_read() introduced in commit 35578d7984003097af2b1e3
(bpf: Implement function bpf_perf_event_read() that get the selected
hardware PMU conuter).

Test result:

 # cat ./test_bpf_map_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
     (void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_read)(struct bpf_map_def *, int) =
     (void *)BPF_FUNC_perf_event_read;

 struct bpf_map_def SEC("maps") pmu_map = {
     .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = __NR_CPUS__,
 };
 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
     unsigned long long val;
     char fmt[] = "sys_write:        pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }

 SEC("func_write_return=sys_write%return")
 int func_write_return(void *ctx)
 {
     unsigned long long val = 0;
     char fmt[] = "sys_write_return: pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

Normal case:
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 [SNIP]
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.013 MB perf.data (7 samples) ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-17066 [000] d... 938449.863301: : sys_write:        pmu=1157327
               ls-17066 [000] dN.. 938449.863342: : sys_write_return: pmu=1225218
               ls-17066 [000] d... 938449.863349: : sys_write:        pmu=1241922
               ls-17066 [000] dN.. 938449.863369: : sys_write_return: pmu=1267445

Normal case (system wide):
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' -a
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.811 MB perf.data (120 samples) ]

 # cat /sys/kernel/debug/tracing/trace | grep -v '18446744073709551594' | grep -v perf | head -n 20
 [SNIP]
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
            gmain-30828 [002] d... 2740551.068992: : sys_write:        pmu=84373
            gmain-30828 [002] d... 2740551.068992: : sys_write_return: pmu=87696
            gmain-30828 [002] d... 2740551.068996: : sys_write:        pmu=100658
            gmain-30828 [002] d... 2740551.068997: : sys_write_return: pmu=102572

Error case 1:

 # ./perf record -e './test_bpf_map_2.c' ls /
 [SNIP]
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.014 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-17115 [007] d... 2724279.665625: : sys_write:        pmu=18446744073709551614
               ls-17115 [007] dN.. 2724279.665651: : sys_write_return: pmu=18446744073709551614
               ls-17115 [007] d... 2724279.665658: : sys_write:        pmu=18446744073709551614
               ls-17115 [007] dN.. 2724279.665677: : sys_write_return: pmu=18446744073709551614

 (18446744073709551614 is 0xfffffffffffffffe (-2))
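
The test program declares perf_event_read() as returning int and stores
the result in an unsigned long long, so a negative error code is
sign-extended and printed as a huge number. A small sketch of how such
a value can be decoded follows; counter_errno() is a hypothetical helper,
and the [-4095, -1] range is an assumption borrowed from the kernel's
MAX_ERRNO convention:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the negative error code hidden in raw,
 * or 0 if raw looks like an ordinary counter value.
 * Assumption: error codes lie in [-4095, -1] (the kernel's MAX_ERRNO).
 */
static int counter_errno(uint64_t raw)
{
	int64_t s = (int64_t)raw;	/* reinterpret as signed */

	assert(sizeof(s) == sizeof(raw));

	if (s < 0 && s >= -4095)
		return (int)s;
	return 0;
}
```

With this helper, 18446744073709551614 decodes to -2 (ENOENT), and the
18446744073709551594 filtered out by grep in the system-wide case
decodes to -22 (EINVAL).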

Error case 2:
 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=evt/' -a
 event syntax error: '..ps:pmu_map.event=evt/'
                                   \___ Event not found for map setting

 Hint:	Valid config terms:
      	maps:[<arraymap>].value=[value]
      	maps:[<eventmap>].event=[event]
 [SNIP]

Error case 3:
 # ls /proc/2348/task/
 2348  2505  2506  2507  2508
 # ./perf record -i -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' -p 2348
 ERROR: Apply config to BPF failed: Cannot set event to BPF maps in multi-thread tracing

Error case 4:
 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i to turn off inherit)

Error case 5:
 # ./perf record -i -e raw_syscalls:sys_enter -e './test_bpf_map_2.c/maps:pmu_map.event=raw_syscalls:sys_enter/' ls
 ERROR: Apply config to BPF failed: Can only put raw, hardware and BPF output event into a BPF map

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 138 ++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/bpf-loader.h   |   5 ++
 tools/perf/util/evlist.c       |  16 +++++
 tools/perf/util/evlist.h       |   3 +
 tools/perf/util/parse-events.c |  15 +++--
 tools/perf/util/parse-events.h |   1 +
 6 files changed, 171 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 96fd18b..84b4581 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -742,6 +742,7 @@ int bpf__foreach_tev(struct bpf_object *obj,
 
 enum bpf_map_op_type {
 	BPF_MAP_OP_SET_VALUE,
+	BPF_MAP_OP_SET_EVSEL,
 };
 
 enum bpf_map_key_type {
@@ -754,6 +755,7 @@ struct bpf_map_op {
 	enum bpf_map_key_type key_type;
 	union {
 		u64 value;
+		struct perf_evsel *evsel;
 	} v;
 };
 
@@ -891,10 +893,73 @@ bpf__obj_config_map_value(struct bpf_map *map,
 	if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM)
 		return bpf__obj_config_map_array_value(map, term);
 
-	pr_debug("ERROR: wrong value type\n");
+	pr_debug("ERROR: wrong value type for 'value'\n");
 	return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
 }
 
+static int
+bpf__obj_config_map_array_event(struct bpf_map *map,
+				struct parse_events_term *term,
+				struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	struct bpf_map_def def;
+	struct bpf_map_op *op;
+	const char *map_name;
+	int err;
+
+	map_name = bpf_map__get_name(map);
+	evsel = perf_evlist__find_evsel_by_str(evlist, term->val.str);
+	if (!evsel) {
+		pr_debug("Event (for '%s') '%s' doesn't exist\n",
+			 map_name, term->val.str);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_NOEVT;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("Unable to get map definition from '%s'\n",
+			 map_name);
+		return err;
+	}
+
+	/*
+	 * No need to check key_size and value_size:
+	 * kernel has already checked them.
+	 */
+	if (def.type != BPF_MAP_TYPE_PERF_EVENT_ARRAY) {
+		pr_debug("Map %s type is not BPF_MAP_TYPE_PERF_EVENT_ARRAY\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
+	}
+
+	op = bpf_map_op__alloc(map);
+	if (IS_ERR(op))
+		return PTR_ERR(op);
+
+	op->v.evsel = evsel;
+	op->op_type = BPF_MAP_OP_SET_EVSEL;
+	return 0;
+}
+
+static int
+bpf__obj_config_map_event(struct bpf_map *map,
+			  struct parse_events_term *term,
+			  struct perf_evlist *evlist)
+{
+	if (!term->err_val) {
+		pr_debug("Config value not set\n");
+		return -BPF_LOADER_ERRNO__OBJCONF_CONF;
+	}
+
+	if (term->type_val == PARSE_EVENTS__TERM_TYPE_STR)
+		return bpf__obj_config_map_array_event(map, term, evlist);
+
+	pr_debug("ERROR: wrong value type for 'event'\n");
+	return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE;
+}
+
+
 struct bpf_obj_config_map_func {
 	const char *config_opt;
 	int (*config_func)(struct bpf_map *, struct parse_events_term *,
@@ -903,6 +968,7 @@ struct bpf_obj_config_map_func {
 
 struct bpf_obj_config_map_func bpf_obj_config_map_funcs[] = {
 	{"value", bpf__obj_config_map_value},
+	{"event", bpf__obj_config_map_event},
 };
 
 static int
@@ -1047,6 +1113,7 @@ bpf_map_config_foreach_key(struct bpf_map *map,
 	list_for_each_entry(op, &priv->ops_list, list) {
 		switch (def.type) {
 		case BPF_MAP_TYPE_ARRAY:
+		case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
 			switch (op->key_type) {
 			case BPF_MAP_KEY_ALL:
 				return foreach_key_array_all(func, arg, name,
@@ -1101,6 +1168,60 @@ apply_config_value_for_key(int map_fd, void *pkey,
 }
 
 static int
+apply_config_evsel_for_key(const char *name, int map_fd, void *pkey,
+			   struct perf_evsel *evsel)
+{
+	struct xyarray *xy = evsel->fd;
+	struct perf_event_attr *attr;
+	unsigned int key, events;
+	bool check_pass = false;
+	int *evt_fd;
+	int err;
+
+	if (!xy) {
+		pr_debug("ERROR: evsel not ready for map %s\n", name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	if (xy->row_size / xy->entry_size != 1) {
+		pr_debug("ERROR: Dimension of target event is incorrect for map %s\n",
+			 name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM;
+	}
+
+	attr = &evsel->attr;
+	if (attr->inherit) {
+		pr_debug("ERROR: Can't put inherit event into map %s\n", name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH;
+	}
+
+	if (attr->type == PERF_TYPE_RAW)
+		check_pass = true;
+	if (attr->type == PERF_TYPE_HARDWARE)
+		check_pass = true;
+	if (attr->type == PERF_TYPE_SOFTWARE &&
+			attr->config == PERF_COUNT_SW_BPF_OUTPUT)
+		check_pass = true;
+	if (!check_pass) {
+		pr_debug("ERROR: Event type is wrong for map %s\n", name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE;
+	}
+
+	events = xy->entries / (xy->row_size / xy->entry_size);
+	key = *((unsigned int *)pkey);
+	if (key >= events) {
+		pr_debug("ERROR: there is no event %d for map %s\n",
+			 key, name);
+		return -BPF_LOADER_ERRNO__OBJCONF_MAP_MAPSIZE;
+	}
+	evt_fd = xyarray__entry(xy, key, 0);
+	err = bpf_map_update_elem(map_fd, pkey, evt_fd, BPF_ANY);
+	if (err && errno)
+		err = -errno;
+	return err;
+}
+
+static int
 apply_obj_config_map_for_key(const char *name, int map_fd,
 			     struct bpf_map_def *pdef __maybe_unused,
 			     struct bpf_map_op *op,
@@ -1114,6 +1235,10 @@ apply_obj_config_map_for_key(const char *name, int map_fd,
 						 pdef->value_size,
 						 op->v.value);
 		break;
+	case BPF_MAP_OP_SET_EVSEL:
+		err = apply_config_evsel_for_key(name, map_fd, pkey,
+						 op->v.evsel);
+		break;
 	default:
 		pr_debug("ERROR: unknown value type for '%s'\n", name);
 		err = -BPF_LOADER_ERRNO__INTERNAL;
@@ -1179,6 +1304,11 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(OBJCONF_MAP_TYPE)]	= "Incorrect map type",
 	[ERRCODE_OFFSET(OBJCONF_MAP_KEYSIZE)]	= "Incorrect map key size",
 	[ERRCODE_OFFSET(OBJCONF_MAP_VALUESIZE)]	= "Incorrect map value size",
+	[ERRCODE_OFFSET(OBJCONF_MAP_NOEVT)]	= "Event not found for map setting",
+	[ERRCODE_OFFSET(OBJCONF_MAP_MAPSIZE)]	= "Invalid map size for event setting",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTDIM)]	= "Event dimension too large",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTINH)]	= "Doesn't support inherit event",
+	[ERRCODE_OFFSET(OBJCONF_MAP_EVTTYPE)]	= "Wrong event type for map",
 };
 
 static int
@@ -1315,6 +1445,12 @@ int bpf__strerror_config_obj(struct bpf_object *obj __maybe_unused,
 int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
 {
 	bpf__strerror_head(err, buf, size);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,
+			    "Cannot set event to BPF maps in multi-thread tracing");
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,
+			    "%s (Hint: use -i to turn off inherit)", emsg);
+	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,
+			    "Can only put raw, hardware and BPF output event into a BPF map");
 	bpf__strerror_end(buf, size);
 	return 0;
 }
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index db3c34c..c9ce792 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -33,6 +33,11 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE,	/* Incorrect map type */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_KEYSIZE,	/* Incorrect map key size */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE,/* Incorrect map value size */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_NOEVT,	/* Event not found for map setting */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_MAPSIZE,	/* Invalid map size for event setting */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,	/* Event dimension too large */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,	/* Doesn't support inherit event */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,	/* Wrong event type for map */
 	__BPF_LOADER_ERRNO__END,
 };
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d81f13d..9b56390 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1723,3 +1723,19 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 
 	tracking_evsel->tracking = true;
 }
+
+struct perf_evsel *
+perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
+			       const char *str)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (!evsel->name)
+			continue;
+		if (strcmp(str, evsel->name) == 0)
+			return evsel;
+	}
+
+	return NULL;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7c4d9a2..a0d1522 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -294,4 +294,7 @@ void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 				     struct perf_evsel *tracking_evsel);
 
 void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr);
+
+struct perf_evsel *
+perf_evlist__find_evsel_by_str(struct perf_evlist *evlist, const char *str);
 #endif /* __PERF_EVLIST_H */
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 1c2dc5d..6e2543c 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -653,14 +653,16 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 			return -EINVAL;
 		}
 
-		err = bpf__config_obj(obj, term, NULL, &error_pos);
+		err = bpf__config_obj(obj, term, data->evlist, &error_pos);
 		if (err) {
-			bpf__strerror_config_obj(obj, term, NULL,
+			bpf__strerror_config_obj(obj, term, data->evlist,
 						 &error_pos, err, errbuf,
 						 sizeof(errbuf));
 			data->error->help = strdup(
-"Hint:\tValid config term:\n"
+"Hint:\tValid config terms:\n"
 "     \tmaps:[<arraymap>].value=[value]\n"
+"     \tmaps:[<eventmap>].event=[event]\n"
+"\n"
 "     \t(add -v to see detail)");
 			data->error->str = strdup(errbuf);
 			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
@@ -1442,9 +1444,10 @@ int parse_events(struct perf_evlist *evlist, const char *str,
 		 struct parse_events_error *err)
 {
 	struct parse_events_evlist data = {
-		.list  = LIST_HEAD_INIT(data.list),
-		.idx   = evlist->nr_entries,
-		.error = err,
+		.list   = LIST_HEAD_INIT(data.list),
+		.idx    = evlist->nr_entries,
+		.error  = err,
+		.evlist = evlist,
 	};
 	int ret;
 
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 84694f3..2a2b172 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -98,6 +98,7 @@ struct parse_events_evlist {
 	int			   idx;
 	int			   nr_groups;
 	struct parse_events_error *error;
+	struct perf_evlist	  *evlist;
 };
 
 struct parse_events_terms {
-- 
1.8.3.4

* [PATCH 22/53] perf tools: Support perf event alias name
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (20 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 21/53] perf tools: Enable passing event to BPF object Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 23/53] perf tools: Support setting different slots in a BPF map separately Wang Nan
                   ` (30 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, He Kuang,
	Wang Nan, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Namhyung Kim

From: He Kuang <hekuang@huawei.com>

This patch is useful when trying to pass a perf event to a BPF map.
Before this patch we were unable to pass an event with config terms to
BPF maps. For example:

 # perf record -a -e cycles/no-inherit,period=0x7fffffffffffffff/ \
                  -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/no-inherit,period=0x7fffffffffffffff//' ls /
 event syntax error: '..ps:pmu_map.event=cycles/'
                                   \___ Event not found for map setting

This fails because the '/' and ',' characters confuse the parser.

This patch adds new bison rules for giving an alias name to a perf
event, which allows the cmdline to refer to a previously defined perf
event through its name. With this patch the above goal can be achieved
using:

 # perf record -a -e cyc=cycles/no-inherit,period=0x7fffffffffffffff/ \
                  -e './test_bpf_map_2.c/maps:pmu_map.event=cyc/' ls /

If no alias is provided (normal case):

 # perf record -e cycles ...

The alias is then set to the event's name automatically ('cycles' in
the above example).

To allow the parser to refer to an existing event selector, the event
list is passed through 'struct parse_events_evlist'.
perf_evlist__find_evsel_by_str() is changed to look up an evsel through
its alias.
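
The lookup itself is a linear search that compares the requested string
with each event's alias and skips events that have none. A simplified
sketch follows, in which struct fake_evsel and find_by_alias() are
hypothetical stand-ins for the perf structures and an array replaces
the evlist:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for an event selector. */
struct fake_evsel {
	const char *name;	/* full event string */
	const char *alias;	/* short name; may be NULL */
};

/* Return the first event whose alias equals str, or NULL if none does. */
static struct fake_evsel *find_by_alias(struct fake_evsel *evs, size_t n,
					const char *str)
{
	size_t i;

	assert(evs != NULL && str != NULL);

	for (i = 0; i < n; i++) {
		if (!evs[i].alias)
			continue;	/* no alias: cannot match */
		if (strcmp(str, evs[i].alias) == 0)
			return &evs[i];
	}
	return NULL;
}
```

Because the match is on the alias only, an event whose full name
contains '/' or ',' can still be referenced by a short name that the
parser accepts.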

Test result:
 # cat ./test_bpf_map_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
     unsigned int type;
     unsigned int key_size;
     unsigned int value_size;
     unsigned int max_entries;
 };
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
     (void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
     (void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_read)(struct bpf_map_def *, int) =
     (void *)BPF_FUNC_perf_event_read;

 struct bpf_map_def SEC("maps") pmu_map = {
     .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
     .key_size = sizeof(int),
     .value_size = sizeof(int),
     .max_entries = __NR_CPUS__,
 };
 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
     unsigned long long val;
     char fmt[] = "sys_write:        pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }

 SEC("func_write_return=sys_write%return")
 int func_write_return(void *ctx)
 {
     unsigned long long val = 0;
     char fmt[] = "sys_write_return: pmu=%llu\n";
     val = perf_event_read(&pmu_map, get_smp_processor_id());
     trace_printk(fmt, sizeof(fmt), val);
     return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -a -e cyc=cycles/no-inherit,period=0x7fffffffffffffff/ \
                    -e './test_bpf_map_2.c/maps:pmu_map.event=cyc/' ls /
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.755 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep ls
               ls-25328 [002] d... 940138.313178: : sys_write:        pmu=4503165
               ls-25328 [002] dN.. 940138.313207: : sys_write_return: pmu=4582975
               ls-25328 [002] d... 940138.313211: : sys_write:        pmu=4599840
               ls-25328 [002] dN.. 940138.313220: : sys_write_return: pmu=4633352
 # ./perf report --stdio
 Error:
 The perf.data file has no samples!
 ...
 (This is expected: the period of the cycles event is set to a very
 large value because we want to use this event as a counter only and
 do not need sampling.)

 # ./perf record -e cycles -e './test_bpf_map_2.c/maps:pmu_map.event=cycles/' ls /
 ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i or use /no-inherit/ to turn off inherit)

Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   |  2 +-
 tools/perf/util/evlist.c       |  4 ++--
 tools/perf/util/evsel.c        |  1 +
 tools/perf/util/evsel.h        |  1 +
 tools/perf/util/parse-events.c | 26 ++++++++++++++++++++++++++
 tools/perf/util/parse-events.h |  4 ++++
 tools/perf/util/parse-events.y | 15 ++++++++++++++-
 7 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 84b4581..2893b4e 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -1448,7 +1448,7 @@ int bpf__strerror_apply_obj_config(int err, char *buf, size_t size)
 	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,
 			    "Cannot set event to BPF maps in multi-thread tracing");
 	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,
-			    "%s (Hint: use -i to turn off inherit)", emsg);
+			    "%s (Hint: use -i or use /no-inherit/ to turn off inherit)", emsg);
 	bpf__strerror_entry(BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,
 			    "Can only put raw, hardware and BPF output event into a BPF map");
 	bpf__strerror_end(buf, size);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9b56390..890b08b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1731,9 +1731,9 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 	struct perf_evsel *evsel;
 
 	evlist__for_each(evlist, evsel) {
-		if (!evsel->name)
+		if (!evsel->alias)
 			continue;
-		if (strcmp(str, evsel->name) == 0)
+		if (strcmp(str, evsel->alias) == 0)
 			return evsel;
 	}
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index cdbaf9b..a6b3b07 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1076,6 +1076,7 @@ void perf_evsel__exit(struct perf_evsel *evsel)
 	thread_map__put(evsel->threads);
 	zfree(&evsel->group_name);
 	zfree(&evsel->name);
+	zfree(&evsel->alias);
 	perf_evsel__object.fini(evsel);
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8e75434..19885fb 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -89,6 +89,7 @@ struct perf_evsel {
 	int			idx;
 	u32			ids;
 	char			*name;
+	char			*alias;
 	double			scale;
 	const char		*unit;
 	struct event_format	*tp_format;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6e2543c..1e0ac77 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1091,6 +1091,30 @@ int parse_events__modifier_group(struct list_head *list,
 	return parse_events__modifier_event(list, event_mod, true);
 }
 
+int parse_events__set_event_alias(struct parse_events_evlist *data,
+				  struct list_head *list,
+				  const char *str,
+				  void *loc_alias_)
+{
+	struct perf_evsel *evsel;
+	YYLTYPE *loc_alias = loc_alias_;
+
+	if (!str)
+		return 0;
+
+	if (!list_is_singular(list)) {
+		struct parse_events_error *err = data->error;
+
+		err->idx = loc_alias->first_column;
+		err->str = strdup("One alias can be applied to one event only");
+		return -EINVAL;
+	}
+
+	evsel = list_first_entry(list, struct perf_evsel, node);
+	evsel->alias = strdup(str);
+	return evsel->alias ? 0 : -ENOMEM;
+}
+
 void parse_events__set_leader(char *name, struct list_head *list)
 {
 	struct perf_evsel *leader;
@@ -1283,6 +1307,8 @@ int parse_events_name(struct list_head *list, char *name)
 	__evlist__for_each(list, evsel) {
 		if (!evsel->name)
 			evsel->name = strdup(name);
+		if (!evsel->alias)
+			evsel->alias = strdup(name);
 	}
 
 	return 0;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 2a2b172..20ad3c2 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -172,4 +172,8 @@ extern int is_valid_tracepoint(const char *event_string);
 int valid_event_mount(const char *eventfs);
 char *parse_events_formats_error_string(char *additional_terms);
 
+int parse_events__set_event_alias(struct parse_events_evlist *data,
+				  struct list_head *list,
+				  const char *str,
+				  void *loc_alias_);
 #endif /* __PERF_PARSE_EVENTS_H */
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 8992d16..c3cbd7a 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -77,6 +77,7 @@ static inc_group_count(struct list_head *list,
 %type <head> event_bpf_file
 %type <head> event_def
 %type <head> event_mod
+%type <head> event_alias
 %type <head> event_name
 %type <head> event
 %type <head> events
@@ -193,13 +194,25 @@ event_name PE_MODIFIER_EVENT
 event_name
 
 event_name:
-PE_EVENT_NAME event_def
+PE_EVENT_NAME event_alias
 {
 	ABORT_ON(parse_events_name($2, $1));
 	free($1);
 	$$ = $2;
 }
 |
+event_alias
+
+event_alias:
+PE_NAME '=' event_def
+{
+	struct list_head *list = $3;
+	struct parse_events_evlist *data = _data;
+
+	ABORT_ON(parse_events__set_event_alias(data, list, $1, &@1));
+	$$ = list;
+}
+|
 event_def
 
 event_def: event_pmu |
-- 
1.8.3.4


* [PATCH 23/53] perf tools: Support setting different slots in a BPF map separately
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (21 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 22/53] perf tools: Support perf event alias name Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 24/53] perf tools: Enable indices setting syntax for BPF maps Wang Nan
                   ` (29 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Alexei Starovoitov, Arnaldo Carvalho de Melo, Jiri Olsa,
	Masami Hiramatsu, Namhyung Kim

This patch introduces basic facilities to support configuring different
slots in a BPF map one by one.

array.nr_ranges and array.ranges are introduced into 'struct
parse_events_term', where ranges is an array of index ranges (start,
length) which will be configured by this config term, and nr_ranges
is the number of entries in that array. The array is passed to
'struct bpf_map_priv'. To indicate the new type of configuration,
BPF_MAP_KEY_RANGES is added as a new key type.
bpf_map_config_foreach_key() is extended to iterate over those indices
instead of all possible keys.

Code in this commit will be enabled by the following commit, which
enables the indices syntax for array configuration.
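
The (start, length) representation described above can be sketched in
isolation roughly as follows. This is only an illustration, not the code
in this patch: 'struct idx_range', 'struct idx_array', idx_array_check(),
idx_array_foreach() and sum_cb() are hypothetical names. The bounds check
is written without computing start + length - 1, so it cannot wrap around
for an empty or very long range; that is a choice made for this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the range structures described above. */
struct idx_range {
	unsigned int start;
	size_t length;
};

struct idx_array {
	size_t nr_ranges;
	struct idx_range *ranges;
};

/* Callback type used by the iterator below (an assumption of this sketch). */
typedef int (*idx_func_t)(unsigned int idx, void *arg);

/* Return 0 if every range fits in [0, max_entries), -1 otherwise. */
static int idx_array_check(const struct idx_array *a, unsigned int max_entries)
{
	size_t i;

	for (i = 0; i < a->nr_ranges; i++) {
		const struct idx_range *r = &a->ranges[i];

		if (r->length == 0)
			continue;	/* empty range: nothing to validate */
		if (r->start >= max_entries ||
		    r->length > (size_t)(max_entries - r->start))
			return -1;
	}
	return 0;
}

/* Call func() once per index, range by range; stop at the first error. */
static int idx_array_foreach(const struct idx_array *a, idx_func_t func,
			     void *arg)
{
	size_t i, j;

	for (i = 0; i < a->nr_ranges; i++) {
		const struct idx_range *r = &a->ranges[i];

		for (j = 0; j < r->length; j++) {
			int err = func(r->start + (unsigned int)j, arg);

			if (err)
				return err;
		}
	}
	return 0;
}

/* Example callback: add each visited index to an unsigned int sum. */
static int sum_cb(unsigned int idx, void *arg)
{
	*(unsigned int *)arg += idx;
	return 0;
}
```

With ranges {0, 3} and {5, 2}, the iterator visits indices 0, 1, 2, 5
and 6, and the check fails for any map with fewer than 7 entries.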

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 132 ++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/bpf-loader.h   |   1 +
 tools/perf/util/parse-events.c |  33 ++++++++++-
 tools/perf/util/parse-events.h |  12 ++++
 4 files changed, 170 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 2893b4e..6c25de8 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -17,6 +17,7 @@
 #include "llvm-utils.h"
 #include "probe-event.h"
 #include "probe-finder.h" // for MAX_PROBES
+#include "parse-events.h"
 #include "llvm-utils.h"
 
 #define DEFINE_PRINT_FN(name, level) \
@@ -747,6 +748,7 @@ enum bpf_map_op_type {
 
 enum bpf_map_key_type {
 	BPF_MAP_KEY_ALL,
+	BPF_MAP_KEY_RANGES,
 };
 
 struct bpf_map_op {
@@ -754,6 +756,9 @@ struct bpf_map_op {
 	enum bpf_map_op_type op_type;
 	enum bpf_map_key_type key_type;
 	union {
+		struct parse_events_array array;
+	} k;
+	union {
 		u64 value;
 		struct perf_evsel *evsel;
 	} v;
@@ -779,6 +784,8 @@ bpf_map_op__free(struct bpf_map_op *op)
 	 */
 	if ((list->next != LIST_POISON1) && (list->prev != LIST_POISON2))
 		list_del(list);
+	if (op->key_type == BPF_MAP_KEY_RANGES)
+		parse_events__clear_array(&op->k.array);
 	free(op);
 }
 
@@ -794,8 +801,30 @@ bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
 	free(priv);
 }
 
+static int
+bpf_map_op_setkey(struct bpf_map_op *op, struct parse_events_term *term,
+		  const char *map_name)
+{
+	op->key_type = BPF_MAP_KEY_ALL;
+
+	if (term->array.nr_ranges) {
+		size_t memsz = term->array.nr_ranges *
+				sizeof(op->k.array.ranges[0]);
+
+		op->k.array.ranges = memdup(term->array.ranges, memsz);
+		if (!op->k.array.ranges) {
+			pr_debug("Not enough memory to alloc indices for %s\n",
+				 map_name);
+			return -ENOMEM;
+		}
+		op->key_type = BPF_MAP_KEY_RANGES;
+		op->k.array.nr_ranges = term->array.nr_ranges;
+	}
+	return 0;
+}
+
 static struct bpf_map_op *
-bpf_map_op__alloc(struct bpf_map *map)
+bpf_map_op__alloc(struct bpf_map *map, struct parse_events_term *term)
 {
 	struct bpf_map_op *op;
 	struct bpf_map_priv *priv;
@@ -829,7 +858,12 @@ bpf_map_op__alloc(struct bpf_map *map)
 		return ERR_PTR(-ENOMEM);
 	}
 
-	op->key_type = BPF_MAP_KEY_ALL;
+	err = bpf_map_op_setkey(op, term, map_name);
+	if (err) {
+		free(op);
+		return ERR_PTR(err);
+	}
+
 	list_add_tail(&op->list, &priv->ops_list);
 	return op;
 }
@@ -872,7 +906,7 @@ bpf__obj_config_map_array_value(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUESIZE;
 	}
 
-	op = bpf_map_op__alloc(map);
+	op = bpf_map_op__alloc(map, term);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 	op->op_type = BPF_MAP_OP_SET_VALUE;
@@ -933,7 +967,7 @@ bpf__obj_config_map_array_event(struct bpf_map *map,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
 	}
 
-	op = bpf_map_op__alloc(map);
+	op = bpf_map_op__alloc(map, term);
 	if (IS_ERR(op))
 		return PTR_ERR(op);
 
@@ -972,6 +1006,44 @@ struct bpf_obj_config_map_func bpf_obj_config_map_funcs[] = {
 };
 
 static int
+config_map_indices_range_check(struct parse_events_term *term,
+			       struct bpf_map *map,
+			       const char *map_name)
+{
+	struct parse_events_array *array = &term->array;
+	struct bpf_map_def def;
+	unsigned int i;
+	int err;
+
+	if (!array->nr_ranges)
+		return 0;
+	if (!array->ranges) {
+		pr_debug("ERROR: map %s: array->nr_ranges is %d but range array is NULL\n",
+			 map_name, (int)array->nr_ranges);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	err = bpf_map__get_def(map, &def);
+	if (err) {
+		pr_debug("ERROR: Unable to get map definition from '%s'\n",
+			 map_name);
+		return -BPF_LOADER_ERRNO__INTERNAL;
+	}
+
+	for (i = 0; i < array->nr_ranges; i++) {
+		unsigned int start = array->ranges[i].start;
+		size_t length = array->ranges[i].length;
+		unsigned int idx = start + length - 1;
+
+		if (idx >= def.max_entries) {
+			pr_debug("ERROR: index %d too large\n", idx);
+			return -BPF_LOADER_ERRNO__OBJCONF_MAP_IDX2BIG;
+		}
+	}
+	return 0;
+}
+
+static int
 bpf__obj_config_map(struct bpf_object *obj,
 		    struct parse_events_term *term,
 		    struct perf_evlist *evlist,
@@ -1007,6 +1079,13 @@ bpf__obj_config_map(struct bpf_object *obj,
 	}
 
 	*key_scan_pos += map_opt - map_name;
+
+	*key_scan_pos += strlen(map_opt);
+	err = config_map_indices_range_check(term, map, map_name);
+	if (err)
+		goto out;
+	*key_scan_pos -= strlen(map_opt);
+
 	for (i = 0; i < ARRAY_SIZE(bpf_obj_config_map_funcs); i++) {
 		struct bpf_obj_config_map_func *func =
 				&bpf_obj_config_map_funcs[i];
@@ -1077,6 +1156,33 @@ foreach_key_array_all(map_config_func_t func,
 }
 
 static int
+foreach_key_array_ranges(map_config_func_t func, void *arg,
+			 const char *name, int map_fd,
+			 struct bpf_map_def *pdef,
+			 struct bpf_map_op *op)
+{
+	unsigned int i, j;
+	int err;
+
+	for (i = 0; i < op->k.array.nr_ranges; i++) {
+		unsigned int start = op->k.array.ranges[i].start;
+		size_t length = op->k.array.ranges[i].length;
+
+		for (j = 0; j < length; j++) {
+			unsigned int idx = start + j;
+
+			err = func(name, map_fd, pdef, op, &idx, arg);
+			if (err) {
+				pr_debug("ERROR: failed to insert value to %s[%u]\n",
+					 name, idx);
+				return err;
+			}
+		}
+	}
+	return 0;
+}
+
+static int
 bpf_map_config_foreach_key(struct bpf_map *map,
 			   map_config_func_t func,
 			   void *arg)
@@ -1116,13 +1222,24 @@ bpf_map_config_foreach_key(struct bpf_map *map,
 		case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
 			switch (op->key_type) {
 			case BPF_MAP_KEY_ALL:
-				return foreach_key_array_all(func, arg, name,
-							     map_fd, &def, op);
+				err = foreach_key_array_all(func, arg, name,
+							    map_fd, &def, op);
+				if (err)
+					return err;
+				break;
+			case BPF_MAP_KEY_RANGES:
+				err = foreach_key_array_ranges(func, arg, name,
+							       map_fd, &def,
+							       op);
+				if (err)
+					return err;
+				break;
 			default:
 				pr_debug("ERROR: keytype for map '%s' invalid\n",
 					 name);
 				return -BPF_LOADER_ERRNO__INTERNAL;
-		}
+			}
+			break;
 		default:
 			pr_debug("ERROR: type of '%s' incorrect\n", name);
 			return -BPF_LOADER_ERRNO__OBJCONF_MAP_TYPE;
@@ -1309,6 +1426,7 @@ static const char *bpf_loader_strerror_table[NR_ERRNO] = {
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTDIM)]	= "Event dimension too large",
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTINH)]	= "Doesn't support inherit event",
 	[ERRCODE_OFFSET(OBJCONF_MAP_EVTTYPE)]	= "Wrong event type for map",
+	[ERRCODE_OFFSET(OBJCONF_MAP_IDX2BIG)]	= "Index too large",
 };
 
 static int
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index c9ce792..30ee519 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -38,6 +38,7 @@ enum bpf_loader_errno {
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTDIM,	/* Event dimension too large */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH,	/* Doesn't support inherit event */
 	BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE,	/* Wrong event type for map */
+	BPF_LOADER_ERRNO__OBJCONF_MAP_IDX2BIG,	/* Index too large */
 	__BPF_LOADER_ERRNO__END,
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 1e0ac77..f229663 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2148,8 +2148,39 @@ void parse_events__free_terms(struct list_head *terms)
 {
 	struct parse_events_term *term, *h;
 
-	list_for_each_entry_safe(term, h, terms, list)
+	list_for_each_entry_safe(term, h, terms, list) {
+		if (term->array.nr_ranges)
+			free(term->array.ranges);
 		free(term);
+	}
+}
+
+int parse_events__merge_arrays(struct parse_events_array *dest,
+			       struct parse_events_array *another)
+{
+	struct parse_events_array new;
+
+	if (!dest || !another)
+		return -EINVAL;
+
+	new.nr_ranges = dest->nr_ranges + another->nr_ranges;
+	new.ranges = malloc(sizeof(new.ranges[0]) * new.nr_ranges);
+	if (!new.ranges)
+		return -ENOMEM;
+
+	memcpy(&new.ranges[0], dest->ranges,
+	       sizeof(new.ranges[0]) * dest->nr_ranges);
+	memcpy(&new.ranges[dest->nr_ranges], another->ranges,
+	       sizeof(new.ranges[0]) * another->nr_ranges);
+	free(dest->ranges);
+	free(another->ranges);
+	*dest = new;
+	return 0;
+}
+
+void parse_events__clear_array(struct parse_events_array *a)
+{
+	free(a->ranges);
 }
 
 void parse_events_evlist_error(struct parse_events_evlist *data,
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 20ad3c2..c34615f 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -71,8 +71,17 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_INHERIT
 };
 
+struct parse_events_array {
+	size_t nr_ranges;
+	struct {
+		unsigned int start;
+		size_t length;
+	} *ranges;
+};
+
 struct parse_events_term {
 	char *config;
+	struct parse_events_array array;
 	union {
 		char *str;
 		u64  num;
@@ -117,6 +126,9 @@ int parse_events_term__sym_hw(struct parse_events_term **term,
 int parse_events_term__clone(struct parse_events_term **new,
 			     struct parse_events_term *term);
 void parse_events__free_terms(struct list_head *terms);
+int parse_events__merge_arrays(struct parse_events_array *dest,
+			       struct parse_events_array *another);
+void parse_events__clear_array(struct parse_events_array *a);
 int parse_events__modifier_event(struct list_head *list, char *str, bool add);
 int parse_events__modifier_group(struct list_head *list, char *event_mod);
 int parse_events_name(struct list_head *list, char *name);
-- 
1.8.3.4


* [PATCH 24/53] perf tools: Enable indices setting syntax for BPF maps
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (22 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 23/53] perf tools: Support setting different slots in a BPF map separately Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 25/53] perf tools: Introduce bpf-output event Wang Nan
                   ` (28 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Alexei Starovoitov, Arnaldo Carvalho de Melo, Masami Hiramatsu,
	Namhyung Kim

This patch introduces a new syntax to the perf event parser:

 # perf record -e './test_bpf_map_3.c/maps:channel.value[0,1,2,3...5]=101/' usleep 2

By utilizing the basic facilities in bpf-loader.c which allow setting
different slots in a BPF map separately, the newly introduced syntax
allows perf to control specific elements in a BPF map.
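
As a rough illustration of what the index list means, the text between
the brackets can be turned into (start, length) pairs as sketched below.
This is a hand-written stand-in, not the flex/bison grammar added by
this patch: parse_indices() and 'struct idx_range' are hypothetical
names, only decimal numbers are handled, and the '[all]' form is left
out.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical (start, length) pair, one per list element. */
struct idx_range {
	unsigned int start;
	size_t length;
};

/*
 * Parse a list such as "0,1,2,3...5" (without the brackets) into out[].
 * Returns the number of ranges written, or -1 on a syntax error, a
 * reversed range, a value that does not fit, or too many elements.
 */
static int parse_indices(const char *s, struct idx_range *out, size_t max_out)
{
	size_t n = 0;

	while (*s) {
		char *end;
		unsigned long first, last;

		/* strtoul() would accept signs and blanks; require a digit. */
		if (*s < '0' || *s > '9')
			return -1;
		first = strtoul(s, &end, 10);
		last = first;
		s = end;

		if (strncmp(s, "...", 3) == 0) {
			s += 3;
			if (*s < '0' || *s > '9')
				return -1;
			last = strtoul(s, &end, 10);
			s = end;
		}

		if (last < first || last > UINT_MAX || n == max_out)
			return -1;
		out[n].start = (unsigned int)first;
		out[n].length = (size_t)(last - first) + 1;
		n++;

		if (*s == ',') {
			s++;
			if (!*s)
				return -1;	/* trailing comma */
		} else if (*s) {
			return -1;		/* unexpected character */
		}
	}
	return (int)n;
}
```

For "0,1,2,3...5" this yields four ranges: three of length 1 starting at
0, 1 and 2, and one of length 3 starting at 3.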

Test result:

 # cat ./test_bpf_map_3.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 #define SEC(NAME) __attribute__((section(NAME), used))
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };
 static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
 	(void *)BPF_FUNC_map_lookup_elem;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(unsigned char),
 	.max_entries = 100,
 };
 SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
 int func(void *ctx, int err, long nsec)
 {
 	char fmt[] = "%ld\n";
 	long usec = nsec * 0x10624dd3 >> 38; // nsec / 1000
 	int key = (int)usec;
 	unsigned char *pval = map_lookup_elem(&channel, &key);

 	if (!pval)
 		return 0;
 	trace_printk(fmt, sizeof(fmt), (unsigned char)*pval);
 	return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/
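
A note on the constant in func() above: 0x10624dd3 is 274877907, i.e.
2^38 / 1000 rounded up, so multiplying by it and shifting right by 38
computes nsec / 1000 without a division instruction (presumably the
reason it is written this way in a BPF program). A minimal sketch for
checking it in user space, where div1000_mulshift() is a hypothetical
helper name:

```c
#include <stdint.h>

/*
 * n / 1000 computed with a multiply and a shift.
 * 0x10624dd3 == 274877907 == ceil(2^38 / 1000); the result matches
 * integer division for 32-bit unsigned inputs.
 */
static uint32_t div1000_mulshift(uint32_t n)
{
	return (uint32_t)(((uint64_t)n * 0x10624dd3ULL) >> 38);
}
```

So 'usleep 2' (2000 ns) looks up key 2, and 'usleep 15' looks up key 15.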

Normal case:
 # echo "" > /sys/kernel/debug/tracing/trace
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0,1,2,3...5]=101/' usleep 2
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0...9,20...29]=102,maps:channel.value[10...19]=103/' usleep 3
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[0...9,20...29]=102,maps:channel.value[10...19]=103/' usleep 15
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data ]
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
           usleep-655   [006] d... 2745434.122814: : 102
           usleep-904   [006] d... 2745439.916264: : 103
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[all]=104/' usleep 99
 # cat /sys/kernel/debug/tracing/trace | grep usleep
           usleep-405   [004] d... 2745423.547822: : 101
           usleep-655   [006] d... 2745434.122814: : 102
           usleep-904   [006] d... 2745439.916264: : 103
           usleep-1537  [003] d... 2745538.053737: : 104

Error case:
 # ./perf record -e './test_bpf_map_3.c/maps:channel.value[10...1000]=104/' usleep 99
 event syntax error: '..annel.value[10...1000]=104/'
                                   \___ Index too large
 Hint:	Valid config terms:
      	maps:[<arraymap>].value<indices>=[value]
      	maps:[<eventmap>].event<indices>=[event]

      	where <indices> is something like [0,3...5] or [all]
      	(add -v to see detail)
 Run 'perf list' for a list of valid events

  Usage: perf record [<options>] [<command>]
     or: perf record [<options>] -- <command> [<options>]

     -e, --event <event>   event selector. use 'perf list' to list available events

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/parse-events.c |  5 ++-
 tools/perf/util/parse-events.l | 13 ++++++-
 tools/perf/util/parse-events.y | 85 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index f229663..03d18f4 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -660,9 +660,10 @@ parse_events_config_bpf(struct parse_events_evlist *data,
 						 sizeof(errbuf));
 			data->error->help = strdup(
 "Hint:\tValid config terms:\n"
-"     \tmaps:[<arraymap>].value=[value]\n"
-"     \tmaps:[<eventmap>].event=[event]\n"
+"     \tmaps:[<arraymap>].value<indices>=[value]\n"
+"     \tmaps:[<eventmap>].event<indices>=[event]\n"
 "\n"
+"     \twhere <indices> is something like [0,3...5] or [all]\n"
 "     \t(add -v to see detail)");
 			data->error->str = strdup(errbuf);
 			if (err == -BPF_LOADER_ERRNO__OBJCONF_MAP_VALUE)
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 4387728..8bb3437 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -9,8 +9,8 @@
 %{
 #include <errno.h>
 #include "../perf.h"
-#include "parse-events-bison.h"
 #include "parse-events.h"
+#include "parse-events-bison.h"
 
 char *parse_events_get_text(yyscan_t yyscanner);
 YYSTYPE *parse_events_get_lval(yyscan_t yyscanner);
@@ -111,6 +111,7 @@ do {							\
 %x mem
 %s config
 %x event
+%x array
 
 group		[^,{}/]*[{][^}]*[}][^,{}/]*
 event_pmu	[^,{}/]+[/][^/]*[/][^,{}/]*
@@ -176,6 +177,14 @@ modifier_bp	[rwx]{1,3}
 
 }
 
+<array>{
+"]"			{ BEGIN(config); return ']'; }
+{num_dec}		{ return value(yyscanner, 10); }
+{num_hex}		{ return value(yyscanner, 16); }
+,			{ return ','; }
+"\.\.\."		{ return PE_ARRAY_RANGE; }
+}
+
 <config>{
 	/*
 	 * Please update parse_events_formats_error_string any time
@@ -196,6 +205,8 @@ no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
+\[all\]			{ return PE_ARRAY_ALL; }
+"["			{ BEGIN(array); return '['; }
 }
 
 <mem>{
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index c3cbd7a..7e93b9f 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -48,6 +48,7 @@ static inc_group_count(struct list_head *list,
 %token PE_PREFIX_MEM PE_PREFIX_RAW PE_PREFIX_GROUP
 %token PE_ERROR
 %token PE_PMU_EVENT_PRE PE_PMU_EVENT_SUF PE_KERNEL_PMU_EVENT
+%token PE_ARRAY_ALL PE_ARRAY_RANGE
 %type <num> PE_VALUE
 %type <num> PE_VALUE_SYM_HW
 %type <num> PE_VALUE_SYM_SW
@@ -84,6 +85,9 @@ static inc_group_count(struct list_head *list,
 %type <head> group_def
 %type <head> group
 %type <head> groups
+%type <array> array
+%type <array> array_term
+%type <array> array_terms
 
 %union
 {
@@ -95,6 +99,7 @@ static inc_group_count(struct list_head *list,
 		char *sys;
 		char *event;
 	} tracepoint_name;
+	struct parse_events_array array;
 }
 %%
 
@@ -601,6 +606,86 @@ PE_TERM
 	ABORT_ON(parse_events_term__num(&term, (int)$1, NULL, 1, &@1, NULL));
 	$$ = term;
 }
+|
+PE_NAME array '=' PE_NAME
+{
+	struct parse_events_term *term;
+	int i;
+
+	ABORT_ON(parse_events_term__str(&term, PARSE_EVENTS__TERM_TYPE_USER,
+					$1, $4, &@1, &@4));
+
+	term->array = $2;
+	$$ = term;
+}
+|
+PE_NAME array '=' PE_VALUE
+{
+	struct parse_events_term *term;
+
+	ABORT_ON(parse_events_term__num(&term, PARSE_EVENTS__TERM_TYPE_USER,
+					$1, $4, &@1, &@4));
+	term->array = $2;
+	$$ = term;
+}
+
+array:
+'[' array_terms ']'
+{
+	$$ = $2;
+}
+|
+PE_ARRAY_ALL
+{
+	$$.nr_ranges = 0;
+	$$.ranges = NULL;
+}
+
+array_terms:
+array_terms ',' array_term
+{
+	struct parse_events_array new_array;
+
+	new_array.nr_ranges = $1.nr_ranges + $3.nr_ranges;
+	new_array.ranges = malloc(sizeof(new_array.ranges[0]) *
+				  new_array.nr_ranges);
+	ABORT_ON(!new_array.ranges);
+	memcpy(&new_array.ranges[0], $1.ranges,
+	       $1.nr_ranges * sizeof(new_array.ranges[0]));
+	memcpy(&new_array.ranges[$1.nr_ranges], $3.ranges,
+	       $3.nr_ranges * sizeof(new_array.ranges[0]));
+	free($1.ranges);
+	free($3.ranges);
+	$$ = new_array;
+}
+|
+array_term
+
+array_term:
+PE_VALUE
+{
+	struct parse_events_array array;
+
+	array.nr_ranges = 1;
+	array.ranges = malloc(sizeof(array.ranges[0]));
+	ABORT_ON(!array.ranges);
+	array.ranges[0].start = $1;
+	array.ranges[0].length = 1;
+	$$ = array;
+}
+|
+PE_VALUE PE_ARRAY_RANGE PE_VALUE
+{
+	struct parse_events_array array;
+
+	ABORT_ON($3 < $1);
+	array.nr_ranges = 1;
+	array.ranges = malloc(sizeof(array.ranges[0]));
+	ABORT_ON(!array.ranges);
+	array.ranges[0].start = $1;
+	array.ranges[0].length = $3 - $1 + 1;
+	$$ = array;
+}
 
 sep_dc: ':' |
 
-- 
1.8.3.4


* [PATCH 25/53] perf tools: Introduce bpf-output event
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (23 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 24/53] perf tools: Enable indices setting syntax for BPF maps Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 26/53] perf data: Support converting data from bpf_perf_event_output() Wang Nan
                   ` (27 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Alexei Starovoitov, Arnaldo Carvalho de Melo, Brendan Gregg,
	Masami Hiramatsu, Namhyung Kim

Commit a43eec304259a6c637f4014a6d4767159b6a3aa3 (bpf: introduce
bpf_perf_event_output() helper) adds a helper that enables a BPF
program to output data to the perf ring buffer through a new type of
perf event, PERF_COUNT_SW_BPF_OUTPUT. This patch enables perf to create
perf events of that type. Now perf users can use the following cmdline
to receive output data from BPF programs:

 # ./perf record -a -e evt=bpf-output/no-inherit/ \
                    -e ./test_bpf_output.c/maps:channel.event=evt/ ls /
 # ./perf script
	perf 12927 [004] 355971.129276:          0 evt=bpf-output/no-inherit/:  ffffffff811ed5f1 sys_write
	perf 12927 [004] 355971.129279:          0 evt=bpf-output/no-inherit/:  ffffffff811ed5f1 sys_write
	...

Test result:
 # cat ./test_bpf_output.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };

 #define SEC(NAME) __attribute__((section(NAME), used))
 static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
 };

 SEC("func_write=sys_write")
 int func_write(void *ctx)
 {
 	struct {
 		u64 ktime;
 		int cpuid;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output: %d\n";

 	output_data.cpuid = get_smp_processor_id();
 	output_data.ktime = ktime_get_ns();
 	int err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				    &output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data), err);
 	return 0;
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************ END ***************************/

 # ./perf record -a -e evt=bpf-output/no-inherit/ \
                    -e ./test_bpf_output.c/maps:channel.event=evt/ ls /
 # ./perf script | grep ls
              ls  4085 [000] 2746114.230215: evt=bpf-output/no-inherit/:  ffffffff811ed5f1 sys_write (/lib/modules/4.3.0-rc4+/build/vmlinux)
              ls  4085 [000] 2746114.230244: evt=bpf-output/no-inherit/:  ffffffff811ed5f1 sys_write (/lib/modules/4.3.0-rc4+/build/vmlinux)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/bpf-loader.c   | 5 ++---
 tools/perf/util/evsel.c        | 5 +++++
 tools/perf/util/evsel.h        | 8 ++++++++
 tools/perf/util/parse-events.l | 1 +
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 6c25de8..92b815e 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -1312,13 +1312,12 @@ apply_config_evsel_for_key(const char *name, int map_fd, void *pkey,
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel))
+		check_pass = true;
 	if (attr->type == PERF_TYPE_RAW)
 		check_pass = true;
 	if (attr->type == PERF_TYPE_HARDWARE)
 		check_pass = true;
-	if (attr->type == PERF_TYPE_SOFTWARE &&
-			attr->config == PERF_COUNT_SW_BPF_OUTPUT)
-		check_pass = true;
 	if (!check_pass) {
 		pr_debug("ERROR: Event type is wrong for map %s\n", name);
 		return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTTYPE;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a6b3b07..f1b633e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -225,6 +225,11 @@ struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
 	if (evsel != NULL)
 		perf_evsel__init(evsel, attr, idx);
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		evsel->attr.sample_type |= PERF_SAMPLE_RAW;
+		evsel->attr.sample_period = 1;
+	}
+
 	return evsel;
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 19885fb..022fcff 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -365,6 +365,14 @@ static inline bool perf_evsel__is_function_event(struct perf_evsel *evsel)
 #undef FUNCTION_EVENT
 }
 
+static inline bool perf_evsel__is_bpf_output(struct perf_evsel *evsel)
+{
+	struct perf_event_attr *attr = &evsel->attr;
+
+	return (attr->config == PERF_COUNT_SW_BPF_OUTPUT) &&
+		(attr->type == PERF_TYPE_SOFTWARE);
+}
+
 struct perf_attr_details {
 	bool freq;
 	bool verbose;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 8bb3437..27d567f 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -249,6 +249,7 @@ cpu-migrations|migrations			{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COU
 alignment-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
 emulation-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
 dummy						{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
+bpf-output					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
 
 	/*
 	 * We have to handle the kernel PMU event cycles-ct/cycles-t/mem-loads/mem-stores separately.
-- 
1.8.3.4


* [PATCH 26/53] perf data: Support converting data from bpf_perf_event_output()
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (24 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 25/53] perf tools: Introduce bpf-output event Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE Wang Nan
                   ` (26 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Alexei Starovoitov, Arnaldo Carvalho de Melo, Brendan Gregg,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim

bpf_perf_event_output() outputs data through sample->raw_data. This
patch adds support for converting that data into CTF, so that a Python
script can then process the output of BPF programs.

Test result:

 # cat ./test_bpf_output_2.c
 /************************ BEGIN **************************/
 #include <uapi/linux/bpf.h>
 struct bpf_map_def {
 	unsigned int type;
 	unsigned int key_size;
 	unsigned int value_size;
 	unsigned int max_entries;
 };
 #define SEC(NAME) __attribute__((section(NAME), used))
 static u64 (*ktime_get_ns)(void) =
 	(void *)BPF_FUNC_ktime_get_ns;
 static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
 	(void *)BPF_FUNC_trace_printk;
 static int (*get_smp_processor_id)(void) =
 	(void *)BPF_FUNC_get_smp_processor_id;
 static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
 	(void *)BPF_FUNC_perf_event_output;

 struct bpf_map_def SEC("maps") channel = {
 	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
 	.key_size = sizeof(int),
 	.value_size = sizeof(u32),
 	.max_entries = __NR_CPUS__,
 };

 static inline int __attribute__((always_inline))
 func(void *ctx, int type)
 {
 	struct {
 		u64 ktime;
 		int type;
 	} __attribute__((packed)) output_data;
 	char error_data[] = "Error: failed to output\n";
 	int err;

 	output_data.type = type;
 	output_data.ktime = ktime_get_ns();
 	err = perf_event_output(ctx, &channel, get_smp_processor_id(),
 				&output_data, sizeof(output_data));
 	if (err)
 		trace_printk(error_data, sizeof(error_data));
 	return 0;
 }
 SEC("func_begin=sys_nanosleep")
 int func_begin(void *ctx) {return func(ctx, 1);}
 SEC("func_end=sys_nanosleep%return")
 int func_end(void *ctx) { return func(ctx, 2);}
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;
 /************************* END ***************************/

 # ./perf record -e evt=bpf-output/no-inherit/ \
                 -e ./test_bpf_output_2.c/maps:channel.event=evt/ \
                 usleep 100000
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]

 # ./perf script
          usleep 14942 92503.198504: evt=bpf-output/no-inherit/:  ffffffff810e0ba1 sys_nanosleep (/lib/modules/4.3.0....
          usleep 14942 92503.298562: evt=bpf-output/no-inherit/:  ffffffff810585e9 kretprobe_trampoline_holder (/lib....

 # ./perf data convert --to-ctf ./out.ctf
 [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
 [ perf data convert: Converted and wrote 0.000 MB (2 samples) ]

 # babeltrace ./out.ctf
 [01:41:43.198504134] (+?.?????????) evt=bpf-output/no-inherit/: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E0BA1, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x32C0C07B, [1] = 0x5421, [2] = 0x1 ] }
 [01:41:43.298562257] (+0.100058123) evt=bpf-output/no-inherit/: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810585E9, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x38B77FAA, [1] = 0x5421, [2] = 0x2 ] }

 # cat ./test_bpf_output_2.py
 from babeltrace import TraceCollection
 tc = TraceCollection()
 tc.add_trace('./out.ctf', 'ctf')
 d = {1:[], 2:[]}
 for event in tc.events:
     if not event.name.startswith('evt=bpf-output/no-inherit/'):
         continue
     raw_data = event['raw_data']
     (time, type) = ((raw_data[0] + (raw_data[1] << 32)), raw_data[2])
     d[type].append(time)
 print(list(map(lambda i: d[2][i] - d[1][i], range(len(d[1])))))

 # python3 ./test_bpf_output_2.py
 [100056879]
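
For reference, the decoding done by the Python script can also be sketched in C. This is only an illustration, assuming the raw data holds the packed { u64 ktime; int type; } structure from the BPF program above, stored as little-endian 32-bit words. The names struct bpf_output_rec and decode_raw_words() are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded form of the packed output_data structure. */
struct bpf_output_rec {
	uint64_t ktime;
	uint32_t type;
};

/*
 * Rebuild one record from the u32 sequence produced by the CTF
 * conversion: words[0] and words[1] are the low and high halves of
 * ktime, words[2] is the type.
 */
static struct bpf_output_rec decode_raw_words(const uint32_t *words,
					      unsigned int nr)
{
	struct bpf_output_rec rec;

	assert(nr >= 3);
	rec.ktime = (uint64_t)words[0] | ((uint64_t)words[1] << 32);
	rec.type = words[2];
	return rec;
}
```

Applied to the two raw_data arrays in the babeltrace output above, the difference between the two ktime values is 100056879 ns, which matches the value printed by the script.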

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data-convert-bt.c | 112 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 111 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/data-convert-bt.c b/tools/perf/util/data-convert-bt.c
index 34cd1e4..62ccf8d 100644
--- a/tools/perf/util/data-convert-bt.c
+++ b/tools/perf/util/data-convert-bt.c
@@ -352,6 +352,84 @@ static int add_tracepoint_values(struct ctf_writer *cw,
 	return ret;
 }
 
+static int
+add_bpf_output_values(struct bt_ctf_event_class *event_class,
+		      struct bt_ctf_event *event,
+		      struct perf_sample *sample)
+{
+	struct bt_ctf_field_type *len_type, *seq_type;
+	struct bt_ctf_field *len_field, *seq_field;
+	unsigned int raw_size = sample->raw_size;
+	unsigned int nr_elements = raw_size / sizeof(u32);
+	unsigned int i;
+	int ret;
+
+	if (nr_elements * sizeof(u32) != raw_size)
+		pr_warning("Incorrect raw_size (%u) in bpf output event, skip %lu bytes\n",
+			   raw_size, raw_size - nr_elements * sizeof(u32));
+
+	len_type = bt_ctf_event_class_get_field_by_name(event_class, "raw_len");
+	len_field = bt_ctf_field_create(len_type);
+	if (!len_field) {
+		pr_err("failed to create 'raw_len' for bpf output event\n");
+		ret = -1;
+		goto put_len_type;
+	}
+
+	ret = bt_ctf_field_unsigned_integer_set_value(len_field, nr_elements);
+	if (ret) {
+		pr_err("failed to set field value for raw_len\n");
+		goto put_len_field;
+	}
+	ret = bt_ctf_event_set_payload(event, "raw_len", len_field);
+	if (ret) {
+		pr_err("failed to set payload to raw_len\n");
+		goto put_len_field;
+	}
+
+	seq_type = bt_ctf_event_class_get_field_by_name(event_class, "raw_data");
+	seq_field = bt_ctf_field_create(seq_type);
+	if (!seq_field) {
+		pr_err("failed to create 'raw_data' for bpf output event\n");
+		ret = -1;
+		goto put_seq_type;
+	}
+
+	ret = bt_ctf_field_sequence_set_length(seq_field, len_field);
+	if (ret) {
+		pr_err("failed to set length of 'raw_data'\n");
+		goto put_seq_field;
+	}
+
+	for (i = 0; i < nr_elements; i++) {
+		struct bt_ctf_field *elem_field =
+			bt_ctf_field_sequence_get_field(seq_field, i);
+
+		ret = bt_ctf_field_unsigned_integer_set_value(elem_field,
+				((u32 *)(sample->raw_data))[i]);
+
+		bt_ctf_field_put(elem_field);
+		if (ret) {
+			pr_err("failed to set raw_data[%d]\n", i);
+			goto put_seq_field;
+		}
+	}
+
+	ret = bt_ctf_event_set_payload(event, "raw_data", seq_field);
+	if (ret)
+		pr_err("failed to set payload for raw_data\n");
+
+put_seq_field:
+	bt_ctf_field_put(seq_field);
+put_seq_type:
+	bt_ctf_field_type_put(seq_type);
+put_len_field:
+	bt_ctf_field_put(len_field);
+put_len_type:
+	bt_ctf_field_type_put(len_type);
+	return ret;
+}
+
 static int add_generic_values(struct ctf_writer *cw,
 			      struct bt_ctf_event *event,
 			      struct perf_evsel *evsel,
@@ -597,6 +675,12 @@ static int process_sample_event(struct perf_tool *tool,
 			return -1;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		ret = add_bpf_output_values(event_class, event, sample);
+		if (ret)
+			return -1;
+	}
+
 	cs = ctf_stream(cw, get_sample_cpu(cw, sample, evsel));
 	if (cs) {
 		if (is_flush_needed(cs))
@@ -744,6 +828,25 @@ static int add_tracepoint_types(struct ctf_writer *cw,
 	return ret;
 }
 
+static int add_bpf_output_types(struct ctf_writer *cw,
+				struct bt_ctf_event_class *class)
+{
+	struct bt_ctf_field_type *len_type = cw->data.u32;
+	struct bt_ctf_field_type *seq_base_type = cw->data.u32_hex;
+	struct bt_ctf_field_type *seq_type;
+	int ret;
+
+	ret = bt_ctf_event_class_add_field(class, len_type, "raw_len");
+	if (ret)
+		return ret;
+
+	seq_type = bt_ctf_field_type_sequence_create(seq_base_type, "raw_len");
+	if (!seq_type)
+		return -1;
+
+	return bt_ctf_event_class_add_field(class, seq_type, "raw_data");
+}
+
 static int add_generic_types(struct ctf_writer *cw, struct perf_evsel *evsel,
 			     struct bt_ctf_event_class *event_class)
 {
@@ -755,7 +858,8 @@ static int add_generic_types(struct ctf_writer *cw, struct perf_evsel *evsel,
 	 *                              ctf event header
 	 *   PERF_SAMPLE_READ         - TODO
 	 *   PERF_SAMPLE_CALLCHAIN    - TODO
-	 *   PERF_SAMPLE_RAW          - tracepoint fields are handled separately
+	 *   PERF_SAMPLE_RAW          - tracepoint fields and BPF output
+	 *                              are handled separately
 	 *   PERF_SAMPLE_BRANCH_STACK - TODO
 	 *   PERF_SAMPLE_REGS_USER    - TODO
 	 *   PERF_SAMPLE_STACK_USER   - TODO
@@ -824,6 +928,12 @@ static int add_event(struct ctf_writer *cw, struct perf_evsel *evsel)
 			goto err;
 	}
 
+	if (perf_evsel__is_bpf_output(evsel)) {
+		ret = add_bpf_output_types(cw, event_class);
+		if (ret)
+			goto err;
+	}
+
 	ret = bt_ctf_stream_class_add_event_class(cw->stream_class, event_class);
 	if (ret) {
 		pr("Failed to add event class into stream.\n");
-- 
1.8.3.4

* [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (25 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 26/53] perf data: Support converting data from bpf_perf_event_output() Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 18:09   ` Alexei Starovoitov
  2016-01-12 14:14   ` Peter Zijlstra
  2016-01-11 13:48 ` [PATCH 28/53] perf tools: Move timestamp creation to util Wang Nan
                   ` (25 subsequent siblings)
  52 siblings, 2 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Adrian Hunter, Arnaldo Carvalho de Melo, David Ahern,
	Ingo Molnar, Peter Zijlstra, Yunlong Song

This patch introduces a PERF_SAMPLE_TAILSIZE flag which appends a size
field to the end of a sample. The idea comes from [1]: with the size
stored at the tail of each event, a user program reading from the ring
buffer can parse events backward.

For example:

   head
    |
    V
 +--+---+-------+----------+------+---+
 |E6|...|   B  8|   C    11|  D  7|E..|
 +--+---+-------+----------+------+---+

In this case, starting from the 'head' pointer provided by the kernel, a
user program first reads '6' via (*(head - sizeof(u64))), which gives it
the start of record 'E'. It can then read the preceding size and find
the start of records D, C and B in the same way.
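
The backward walk described above can be sketched in user space as follows. This is a minimal illustration, not perf's reader: it assumes a flat buffer that does not wrap, and the record layout (payload followed by a u64 total size) as well as the helpers put_record() and walk_backward() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical record layout, for illustration only: 'payload_len'
 * bytes filled with 'tag', followed by a u64 holding the total record
 * size (payload plus the u64 itself). Returns the new head offset.
 */
static size_t put_record(uint8_t *buf, size_t head, char tag,
			 size_t payload_len)
{
	uint64_t size = payload_len + sizeof(uint64_t);

	assert(payload_len > 0);
	memset(buf + head, tag, payload_len);
	memcpy(buf + head + payload_len, &size, sizeof(size));
	return head + size;
}

/*
 * Walk backward from 'head': read the size stored just before the
 * current position, step back by that size to reach the record start,
 * and remember the first payload byte. Returns the number of records
 * visited; tags[] holds them newest first.
 */
static int walk_backward(const uint8_t *buf, size_t head, char *tags,
			 int max)
{
	int n = 0;

	while (head >= sizeof(uint64_t) && n < max) {
		uint64_t size;

		memcpy(&size, buf + head - sizeof(size), sizeof(size));
		if (size <= sizeof(size) || size > head)
			break;	/* corrupt or partially overwritten record */
		head -= size;
		tags[n++] = (char)buf[head];
	}
	return n;
}
```

Writing records B, C, D and E and then calling walk_backward() from the final head visits E, D, C and B in that order. A real reader would also have to handle wrap-around in the ring buffer, which this sketch leaves out.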

The implementation is simple: when PERF_SAMPLE_TAILSIZE is set,
perf_output_sample() writes the size at the end of the sample.

The following is done to ensure the ring buffer is safe for backward
parsing:

 - Don't allow two events with different PERF_SAMPLE_TAILSIZE settings
   to redirect their output to each other;

 - For non-sample events, also output the tail size if required.

This patch has a limitation for perf:

Before reading such a ring buffer, perf must ensure that all events which
may output to it have already stopped, so that the 'head' pointer it gets
is the end of the last record.

[1] http://lkml.kernel.org/g/1449063499-236703-1-git-send-email-wangnan0@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yunlong Song <yunlong.song@huawei.com>
---
 include/linux/perf_event.h      | 17 ++++++---
 include/uapi/linux/perf_event.h |  3 +-
 kernel/events/core.c            | 82 +++++++++++++++++++++++++++++------------
 kernel/events/ring_buffer.c     |  7 ++--
 4 files changed, 75 insertions(+), 34 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f9828a4..c5df1e82 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -835,13 +835,13 @@ extern void perf_event_output(struct perf_event *event,
 				struct pt_regs *regs);
 
 extern void
-perf_event_header__init_id(struct perf_event_header *header,
-			   struct perf_sample_data *data,
-			   struct perf_event *event);
+perf_event_header__init_extra(struct perf_event_header *header,
+			      struct perf_sample_data *data,
+			      struct perf_event *event);
 extern void
-perf_event__output_id_sample(struct perf_event *event,
-			     struct perf_output_handle *handle,
-			     struct perf_sample_data *sample);
+perf_event__output_extra(struct perf_event *event, u64 evt_size,
+			 struct perf_output_handle *handle,
+			 struct perf_sample_data *sample);
 
 extern void
 perf_log_lost_samples(struct perf_event *event, u64 lost);
@@ -1032,6 +1032,11 @@ static inline bool has_aux(struct perf_event *event)
 	return event->pmu->setup_aux;
 }
 
+static inline bool has_tailsize(struct perf_event *event)
+{
+	return event->attr.sample_type & PERF_SAMPLE_TAILSIZE;
+}
+
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
 extern void perf_output_end(struct perf_output_handle *handle);
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 1afe962..4e8dde8 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -139,8 +139,9 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_IDENTIFIER			= 1U << 16,
 	PERF_SAMPLE_TRANSACTION			= 1U << 17,
 	PERF_SAMPLE_REGS_INTR			= 1U << 18,
+	PERF_SAMPLE_TAILSIZE			= 1U << 19,
 
-	PERF_SAMPLE_MAX = 1U << 19,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 20,		/* non-ABI */
 };
 
 /*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index bf82441..2d59b59 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5128,12 +5128,14 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
 	}
 }
 
-void perf_event_header__init_id(struct perf_event_header *header,
-				struct perf_sample_data *data,
-				struct perf_event *event)
+void perf_event_header__init_extra(struct perf_event_header *header,
+				   struct perf_sample_data *data,
+				   struct perf_event *event)
 {
 	if (event->attr.sample_id_all)
 		__perf_event_header__init_id(header, data, event);
+	if (has_tailsize(event))
+		header->size += sizeof(u64);
 }
 
 static void __perf_event__output_id_sample(struct perf_output_handle *handle,
@@ -5160,12 +5162,14 @@ static void __perf_event__output_id_sample(struct perf_output_handle *handle,
 		perf_output_put(handle, data->id);
 }
 
-void perf_event__output_id_sample(struct perf_event *event,
-				  struct perf_output_handle *handle,
-				  struct perf_sample_data *sample)
+void perf_event__output_extra(struct perf_event *event, u64 evt_size,
+			      struct perf_output_handle *handle,
+			      struct perf_sample_data *sample)
 {
 	if (event->attr.sample_id_all)
 		__perf_event__output_id_sample(handle, sample);
+	if (has_tailsize(event))
+		perf_output_put(handle, evt_size);
 }
 
 static void perf_output_read_one(struct perf_output_handle *handle,
@@ -5407,6 +5411,13 @@ void perf_output_sample(struct perf_output_handle *handle,
 		}
 	}
 
+	/* Should be the last one */
+	if (sample_type & PERF_SAMPLE_TAILSIZE) {
+		u64 evt_size = header->size;
+
+		perf_output_put(handle, evt_size);
+	}
+
 	if (!event->attr.watermark) {
 		int wakeup_events = event->attr.wakeup_events;
 
@@ -5526,6 +5537,9 @@ void perf_prepare_sample(struct perf_event_header *header,
 
 		header->size += size;
 	}
+
+	if (sample_type & PERF_SAMPLE_TAILSIZE)
+		header->size += sizeof(u64);
 }
 
 void perf_event_output(struct perf_event *event,
@@ -5579,14 +5593,15 @@ perf_event_read_event(struct perf_event *event,
 	};
 	int ret;
 
-	perf_event_header__init_id(&read_event.header, &sample, event);
+	perf_event_header__init_extra(&read_event.header, &sample, event);
 	ret = perf_output_begin(&handle, event, read_event.header.size);
 	if (ret)
 		return;
 
 	perf_output_put(&handle, read_event);
 	perf_output_read(&handle, event);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, read_event.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 }
@@ -5698,7 +5713,7 @@ static void perf_event_task_output(struct perf_event *event,
 	if (!perf_event_task_match(event))
 		return;
 
-	perf_event_header__init_id(&task_event->event_id.header, &sample, event);
+	perf_event_header__init_extra(&task_event->event_id.header, &sample, event);
 
 	ret = perf_output_begin(&handle, event,
 				task_event->event_id.header.size);
@@ -5715,7 +5730,9 @@ static void perf_event_task_output(struct perf_event *event,
 
 	perf_output_put(&handle, task_event->event_id);
 
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event,
+				 task_event->event_id.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 out:
@@ -5794,7 +5811,7 @@ static void perf_event_comm_output(struct perf_event *event,
 	if (!perf_event_comm_match(event))
 		return;
 
-	perf_event_header__init_id(&comm_event->event_id.header, &sample, event);
+	perf_event_header__init_extra(&comm_event->event_id.header, &sample, event);
 	ret = perf_output_begin(&handle, event,
 				comm_event->event_id.header.size);
 
@@ -5808,7 +5825,8 @@ static void perf_event_comm_output(struct perf_event *event,
 	__output_copy(&handle, comm_event->comm,
 				   comm_event->comm_size);
 
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, comm_event->event_id.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 out:
@@ -5917,7 +5935,7 @@ static void perf_event_mmap_output(struct perf_event *event,
 		mmap_event->event_id.header.size += sizeof(mmap_event->flags);
 	}
 
-	perf_event_header__init_id(&mmap_event->event_id.header, &sample, event);
+	perf_event_header__init_extra(&mmap_event->event_id.header, &sample, event);
 	ret = perf_output_begin(&handle, event,
 				mmap_event->event_id.header.size);
 	if (ret)
@@ -5940,7 +5958,8 @@ static void perf_event_mmap_output(struct perf_event *event,
 	__output_copy(&handle, mmap_event->file_name,
 				   mmap_event->file_size);
 
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, mmap_event->event_id.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 out:
@@ -6123,14 +6142,15 @@ void perf_event_aux_event(struct perf_event *event, unsigned long head,
 	};
 	int ret;
 
-	perf_event_header__init_id(&rec.header, &sample, event);
+	perf_event_header__init_extra(&rec.header, &sample, event);
 	ret = perf_output_begin(&handle, event, rec.header.size);
 
 	if (ret)
 		return;
 
 	perf_output_put(&handle, rec);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, rec.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 }
@@ -6156,7 +6176,7 @@ void perf_log_lost_samples(struct perf_event *event, u64 lost)
 		.lost		= lost,
 	};
 
-	perf_event_header__init_id(&lost_samples_event.header, &sample, event);
+	perf_event_header__init_extra(&lost_samples_event.header, &sample, event);
 
 	ret = perf_output_begin(&handle, event,
 				lost_samples_event.header.size);
@@ -6164,7 +6184,8 @@ void perf_log_lost_samples(struct perf_event *event, u64 lost)
 		return;
 
 	perf_output_put(&handle, lost_samples_event);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, lost_samples_event.header.size,
+				 &handle, &sample);
 	perf_output_end(&handle);
 }
 
@@ -6211,7 +6232,7 @@ static void perf_event_switch_output(struct perf_event *event, void *data)
 					perf_event_tid(event, se->next_prev);
 	}
 
-	perf_event_header__init_id(&se->event_id.header, &sample, event);
+	perf_event_header__init_extra(&se->event_id.header, &sample, event);
 
 	ret = perf_output_begin(&handle, event, se->event_id.header.size);
 	if (ret)
@@ -6222,7 +6243,8 @@ static void perf_event_switch_output(struct perf_event *event, void *data)
 	else
 		perf_output_put(&handle, se->event_id);
 
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, se->event_id.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 }
@@ -6282,7 +6304,7 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 	if (enable)
 		throttle_event.header.type = PERF_RECORD_UNTHROTTLE;
 
-	perf_event_header__init_id(&throttle_event.header, &sample, event);
+	perf_event_header__init_extra(&throttle_event.header, &sample, event);
 
 	ret = perf_output_begin(&handle, event,
 				throttle_event.header.size);
@@ -6290,7 +6312,8 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 		return;
 
 	perf_output_put(&handle, throttle_event);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, throttle_event.header.size,
+				 &handle, &sample);
 	perf_output_end(&handle);
 }
 
@@ -6318,14 +6341,15 @@ static void perf_log_itrace_start(struct perf_event *event)
 	rec.pid	= perf_event_pid(event, current);
 	rec.tid	= perf_event_tid(event, current);
 
-	perf_event_header__init_id(&rec.header, &sample, event);
+	perf_event_header__init_extra(&rec.header, &sample, event);
 	ret = perf_output_begin(&handle, event, rec.header.size);
 
 	if (ret)
 		return;
 
 	perf_output_put(&handle, rec);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, rec.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 }
@@ -8098,6 +8122,16 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 	    event->pmu != output_event->pmu)
 		goto out;
 
+	/*
+	 * Don't allow mixed tailsize settings since the resulting
+	 * ring buffer could not be parsed backward.
+	 *
+	 * '!=' is safe because has_tailsize() returns bool, so two different
+	 * non-zero values would be treated as equal (both true).
+	 */
+	if (has_tailsize(event) != has_tailsize(output_event))
+		goto out;
+
 set:
 	mutex_lock(&event->mmap_mutex);
 	/* Can't redirect output if we've got an active mmap() */
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index adfdc05..5f8bd89 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -186,10 +186,11 @@ int perf_output_begin(struct perf_output_handle *handle,
 		lost_event.id          = event->id;
 		lost_event.lost        = local_xchg(&rb->lost, 0);
 
-		perf_event_header__init_id(&lost_event.header,
-					   &sample_data, event);
+		perf_event_header__init_extra(&lost_event.header,
+					      &sample_data, event);
 		perf_output_put(handle, lost_event);
-		perf_event__output_id_sample(event, handle, &sample_data);
+		perf_event__output_extra(event, lost_event.header.size,
+					 handle, &sample_data);
 	}
 
 	return 0;
-- 
1.8.3.4

* [PATCH 28/53] perf tools: Move timestamp creation to util
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (26 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 29/53] perf tools: Make ordered_events reusable Wang Nan
                   ` (24 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Timestamp generation becomes a publicly available helper. It will be
used by 'perf record' to split its output into files named by
timestamp.

For example:

 perf.data.2015122620363710
 perf.data.2015122620364092
 perf.data.2015122620365423
 ...
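
The suffix in the names above is YYYYMMDDHHMMSS followed by two digits for hundredths of a second. A minimal sketch of that formatting, assuming a caller-supplied time and UTC rather than the local time used by the helper in this patch; format_timestamp() is a hypothetical name, and snprintf() stands in for scnprintf():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/*
 * Format 'tv' as YYYYMMDDHHMMSS plus two digits of hundredths of a
 * second. Uses gmtime_r() so the result does not depend on the local
 * time zone (an assumption made here for reproducibility).
 * Returns 0 on success, -1 on failure.
 */
static int format_timestamp(const struct timeval *tv, char *buf, size_t sz)
{
	struct tm tm;
	char dt[32];
	time_t sec = tv->tv_sec;

	assert(buf != NULL && sz > 0);
	if (!gmtime_r(&sec, &tm))
		return -1;
	if (!strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", &tm))
		return -1;
	if (strlen(dt) + 3 > sz)	/* two digits plus the terminating NUL */
		return -1;
	snprintf(buf, sz, "%s%02u", dt, (unsigned)tv->tv_usec / 10000);
	return 0;
}
```

For the epoch with 420000 microseconds this produces 1970010100000042, the same shape as the file name suffixes listed above.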

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-buildid-cache.c | 14 +-------------
 tools/perf/util/util.c             | 17 +++++++++++++++++
 tools/perf/util/util.h             |  1 +
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index d93bff7..632efc6 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -38,19 +38,7 @@ static int build_id_cache__kcore_buildid(const char *proc_dir, char *sbuildid)
 
 static int build_id_cache__kcore_dir(char *dir, size_t sz)
 {
-	struct timeval tv;
-	struct tm tm;
-	char dt[32];
-
-	if (gettimeofday(&tv, NULL) || !localtime_r(&tv.tv_sec, &tm))
-		return -1;
-
-	if (!strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", &tm))
-		return -1;
-
-	scnprintf(dir, sz, "%s%02u", dt, (unsigned)tv.tv_usec / 10000);
-
-	return 0;
+	return fetch_current_timestamp(dir, sz);
 }
 
 static bool same_kallsyms_reloc(const char *from_dir, char *to_dir)
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 88b8f8d..5b8259b 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -690,3 +690,20 @@ out:
 
 	return tip;
 }
+
+int fetch_current_timestamp(char *buf, size_t sz)
+{
+	struct timeval tv;
+	struct tm tm;
+	char dt[32];
+
+	if (gettimeofday(&tv, NULL) || !localtime_r(&tv.tv_sec, &tm))
+		return -1;
+
+	if (!strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", &tm))
+		return -1;
+
+	scnprintf(buf, sz, "%s%02u", dt, (unsigned)tv.tv_usec / 10000);
+
+	return 0;
+}
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index fe915e6..ef97038 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -343,5 +343,6 @@ int fetch_kernel_version(unsigned int *puint,
 #define KVER_PARAM(x)	KVER_VERSION(x), KVER_PATCHLEVEL(x), KVER_SUBLEVEL(x)
 
 const char *perf_tip(const char *dirpath);
+int fetch_current_timestamp(char *buf, size_t sz);
 
 #endif /* GIT_COMPAT_UTIL_H */
-- 
1.8.3.4

* [PATCH 29/53] perf tools: Make ordered_events reusable
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (27 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 28/53] perf tools: Move timestamp creation to util Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 21:33   ` Arnaldo Carvalho de Melo
  2016-01-11 13:48 ` [PATCH 30/53] perf record: Extract synthesize code to record__synthesize() Wang Nan
                   ` (23 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

ordered_events__free() leaves the linked lists and timestamps uncleared.
Introduce ordered_events__reset() to reinitialize ordered_events so that
it can be reused.
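
The reset follows a common pattern: remember the one field that must survive, free the contents, zero the structure and initialize it again. A simplified sketch of that pattern, using a hypothetical struct queue and helpers rather than perf's struct ordered_events:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the ordered_events types. */
struct item {
	struct item *next;
	uint64_t timestamp;
};

typedef int (*deliver_t)(struct item *item);

struct queue {
	struct item *head;
	uint64_t last_flush;
	unsigned int nr_items;
	deliver_t deliver;
};

/* Example callback that accepts every item. */
static int drop_item(struct item *item)
{
	(void)item;
	return 0;
}

static void queue_init(struct queue *q, deliver_t deliver)
{
	q->head = NULL;
	q->last_flush = 0;
	q->nr_items = 0;
	q->deliver = deliver;
}

/* Push a new item at the front; returns 0 on success, -1 on failure. */
static int queue_add(struct queue *q, uint64_t timestamp)
{
	struct item *it = malloc(sizeof(*it));

	if (!it)
		return -1;
	it->timestamp = timestamp;
	it->next = q->head;
	q->head = it;
	q->nr_items++;
	return 0;
}

/* Release the items only; head, counters and timestamps are left stale. */
static void queue_free(struct queue *q)
{
	struct item *it = q->head;

	while (it) {
		struct item *next = it->next;

		free(it);
		it = next;
	}
}

/* Free, zero and reinitialize, keeping only the deliver callback. */
static void queue_reset(struct queue *q)
{
	deliver_t old_deliver;

	assert(q != NULL);
	old_deliver = q->deliver;
	queue_free(q);
	memset(q, 0, sizeof(*q));
	queue_init(q, old_deliver);
}
```

After queue_reset() the queue is empty and its bookkeeping is back to the initial state, while the callback passed to the original queue_init() is still in place, so the same object can be used for another pass.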

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/ordered-events.c | 9 +++++++++
 tools/perf/util/ordered-events.h | 1 +
 tools/perf/util/session.c        | 4 ++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
index b1b9e23..81daada 100644
--- a/tools/perf/util/ordered-events.c
+++ b/tools/perf/util/ordered-events.c
@@ -308,3 +308,12 @@ void ordered_events__free(struct ordered_events *oe)
 		free(event);
 	}
 }
+
+void ordered_events__reset(struct ordered_events *oe)
+{
+	ordered_events__deliver_t old_deliver = oe->deliver;
+
+	ordered_events__free(oe);
+	memset(oe, '\0', sizeof(*oe));
+	ordered_events__init(oe, old_deliver);
+}
diff --git a/tools/perf/util/ordered-events.h b/tools/perf/util/ordered-events.h
index f403991..77e0f1b 100644
--- a/tools/perf/util/ordered-events.h
+++ b/tools/perf/util/ordered-events.h
@@ -49,6 +49,7 @@ void ordered_events__delete(struct ordered_events *oe, struct ordered_event *eve
 int ordered_events__flush(struct ordered_events *oe, enum oe_flush how);
 void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t deliver);
 void ordered_events__free(struct ordered_events *oe);
+void ordered_events__reset(struct ordered_events *oe);
 
 static inline
 void ordered_events__set_alloc_size(struct ordered_events *oe, u64 size)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index d5636ba..96e10d2 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1701,7 +1701,7 @@ done:
 out_err:
 	free(buf);
 	perf_session__warn_about_errors(session);
-	ordered_events__free(&session->ordered_events);
+	ordered_events__reset(&session->ordered_events);
 	auxtrace__free_events(session);
 	return err;
 }
@@ -1857,7 +1857,7 @@ out:
 out_err:
 	ui_progress__finish();
 	perf_session__warn_about_errors(session);
-	ordered_events__free(&session->ordered_events);
+	ordered_events__reset(&session->ordered_events);
 	auxtrace__free_events(session);
 	session->one_mmap = false;
 	return err;
-- 
1.8.3.4

* [PATCH 30/53] perf record: Extract synthesize code to record__synthesize()
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (28 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 29/53] perf tools: Make ordered_events reusable Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 31/53] perf tools: Add perf_data_file__switch() helper Wang Nan
                   ` (22 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Create record__synthesize(). It can be used to create tracking events
for each perf.data file once perf supports splitting its output into
multiple files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 132 +++++++++++++++++++++++++-------------------
 1 file changed, 76 insertions(+), 56 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index bd1692c..10f1349 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -475,6 +475,81 @@ static void workload_exec_failed_signal(int signo __maybe_unused,
 
 static void snapshot_sig_handler(int sig);
 
+static int record__synthesize(struct record *rec)
+{
+	struct perf_session *session = rec->session;
+	struct machine *machine = &session->machines.host;
+	struct perf_data_file *file = &rec->file;
+	struct record_opts *opts = &rec->opts;
+	struct perf_tool *tool = &rec->tool;
+	int fd = perf_data_file__fd(file);
+	int err = 0;
+	static bool warned_kmaps = false, warned_modules = false;
+
+	if (file->is_pipe) {
+		err = perf_event__synthesize_attrs(tool, session,
+						   process_synthesized_event);
+		if (err < 0) {
+			pr_err("Couldn't synthesize attrs.\n");
+			goto out;
+		}
+
+		if (have_tracepoints(&rec->evlist->entries)) {
+			/*
+			 * FIXME err <= 0 here actually means that
+			 * there were no tracepoints so its not really
+			 * an error, just that we don't need to
+			 * synthesize anything.  We really have to
+			 * return this more properly and also
+			 * propagate errors that now are calling die()
+			 */
+			err = perf_event__synthesize_tracing_data(tool,	fd, rec->evlist,
+								  process_synthesized_event);
+			if (err <= 0) {
+				pr_err("Couldn't record tracing data.\n");
+				goto out;
+			}
+			rec->bytes_written += err;
+		}
+	}
+
+	if (rec->opts.full_auxtrace) {
+		err = perf_event__synthesize_auxtrace_info(rec->itr, tool,
+					session, process_synthesized_event);
+		if (err)
+			goto out;
+	}
+
+	err = perf_event__synthesize_kernel_mmap(tool, process_synthesized_event,
+						 machine);
+	if (err < 0 && !warned_kmaps) {
+		warned_kmaps = true;
+		pr_err("Couldn't record kernel reference relocation symbol\n"
+		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
+		       "Check /proc/kallsyms permission or run as root.\n");
+	}
+
+	err = perf_event__synthesize_modules(tool, process_synthesized_event,
+					     machine);
+	if (err < 0 && !warned_modules) {
+		warned_modules = true;
+		pr_err("Couldn't record kernel module information.\n"
+		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
+		       "Check /proc/modules permission or run as root.\n");
+	}
+
+	if (perf_guest) {
+		machines__process_guests(&session->machines,
+					 perf_event__synthesize_guest_os, tool);
+	}
+
+	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
+					    process_synthesized_event, opts->sample_address,
+					    opts->proc_map_timeout);
+out:
+	return err;
+}
+
 static int __cmd_record(struct record *rec, int argc, const char **argv)
 {
 	int err;
@@ -569,63 +644,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	if (file->is_pipe) {
-		err = perf_event__synthesize_attrs(tool, session,
-						   process_synthesized_event);
-		if (err < 0) {
-			pr_err("Couldn't synthesize attrs.\n");
-			goto out_child;
-		}
-
-		if (have_tracepoints(&rec->evlist->entries)) {
-			/*
-			 * FIXME err <= 0 here actually means that
-			 * there were no tracepoints so its not really
-			 * an error, just that we don't need to
-			 * synthesize anything.  We really have to
-			 * return this more properly and also
-			 * propagate errors that now are calling die()
-			 */
-			err = perf_event__synthesize_tracing_data(tool,	fd, rec->evlist,
-								  process_synthesized_event);
-			if (err <= 0) {
-				pr_err("Couldn't record tracing data.\n");
-				goto out_child;
-			}
-			rec->bytes_written += err;
-		}
-	}
-
-	if (rec->opts.full_auxtrace) {
-		err = perf_event__synthesize_auxtrace_info(rec->itr, tool,
-					session, process_synthesized_event);
-		if (err)
-			goto out_delete_session;
-	}
-
-	err = perf_event__synthesize_kernel_mmap(tool, process_synthesized_event,
-						 machine);
-	if (err < 0)
-		pr_err("Couldn't record kernel reference relocation symbol\n"
-		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
-		       "Check /proc/kallsyms permission or run as root.\n");
-
-	err = perf_event__synthesize_modules(tool, process_synthesized_event,
-					     machine);
+	err = record__synthesize(rec);
 	if (err < 0)
-		pr_err("Couldn't record kernel module information.\n"
-		       "Symbol resolution may be skewed if relocation was used (e.g. kexec).\n"
-		       "Check /proc/modules permission or run as root.\n");
-
-	if (perf_guest) {
-		machines__process_guests(&session->machines,
-					 perf_event__synthesize_guest_os, tool);
-	}
-
-	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
-					    process_synthesized_event, opts->sample_address,
-					    opts->proc_map_timeout);
-	if (err != 0)
 		goto out_child;
 
 	if (rec->realtime_prio) {
-- 
1.8.3.4


* [PATCH 31/53] perf tools: Add perf_data_file__switch() helper
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (29 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 30/53] perf record: Extract synthesize code to record__synthesize() Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 32/53] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
                   ` (21 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

perf_data_file__switch() renames the current output file and, unless it
is called at exit, closes it and opens a new one to continue recording.
It will be used by perf record to split output into multiple perf.data
files.
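
The rename-then-reopen sequence can be sketched in plain POSIX C as
below. This is only an illustration: rotate_output() and its parameters
are hypothetical names, not part of perf, and unlike the patch it checks
the return value of rename().

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the perf implementation: rename the file
 * behind *fd to "<path>.<postfix>", then reopen <path> as a fresh,
 * empty output and seek to 'pos' so the caller can leave room for a
 * header. Returns the new fd, or a negative errno value on failure.
 */
static int rotate_output(int *fd, const char *path, const char *postfix,
			 off_t pos)
{
	char newpath[4096];
	int n, newfd;

	n = snprintf(newpath, sizeof(newpath), "%s.%s", path, postfix);
	if (n < 0 || (size_t)n >= sizeof(newpath))
		return -ENAMETOOLONG;

	/* The data written so far stays reachable under the new name. */
	if (rename(path, newpath) < 0)
		return -errno;

	close(*fd);
	*fd = -1;

	newfd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
	if (newfd < 0)
		return -errno;

	if (lseek(newfd, pos, SEEK_SET) == (off_t)-1) {
		int err = -errno;

		close(newfd);
		return err;
	}

	*fd = newfd;
	return newfd;
}
```

A caller would pass the descriptor it is currently writing to, the
output path and a suffix such as a timestamp, and keep writing to the
returned descriptor afterwards.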

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/data.c | 36 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/data.h | 11 ++++++++++-
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 1921942..bfded6a 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -136,3 +136,39 @@ ssize_t perf_data_file__write(struct perf_data_file *file,
 {
 	return writen(file->fd, buf, size);
 }
+
+int perf_data_file__switch(struct perf_data_file *file,
+			   const char *postfix,
+			   size_t pos, bool at_exit)
+{
+	char *new_filepath;
+	int ret;
+
+	if (check_pipe(file))
+		return -EINVAL;
+	if (perf_data_file__is_read(file))
+		return -EINVAL;
+
+	if (asprintf(&new_filepath, "%s.%s", file->path, postfix) < 0)
+		return -ENOMEM;
+
+	rename(file->path, new_filepath);
+
+	if (!at_exit) {
+		close(file->fd);
+		ret = perf_data_file__open(file);
+		if (ret < 0)
+			goto out;
+
+		if (lseek(file->fd, pos, SEEK_SET) == (off_t)-1) {
+			ret = -errno;
+			pr_debug("Failed to lseek to %zu: %s",
+				 pos, strerror(errno));
+			goto out;
+		}
+	}
+	ret = file->fd;
+out:
+	free(new_filepath);
+	return ret;
+}
diff --git a/tools/perf/util/data.h b/tools/perf/util/data.h
index 2b15d0c..7763300 100644
--- a/tools/perf/util/data.h
+++ b/tools/perf/util/data.h
@@ -46,5 +46,14 @@ int perf_data_file__open(struct perf_data_file *file);
 void perf_data_file__close(struct perf_data_file *file);
 ssize_t perf_data_file__write(struct perf_data_file *file,
 			      void *buf, size_t size);
-
+/*
+ * If at_exit is set, only rename current perf.data to
+ * perf.data.<postfix>, continue write on original file.
+ * Used when flushing the last output.
+ *
+ * Return value is fd of new output.
+ */
+int perf_data_file__switch(struct perf_data_file *file,
+			   const char *postfix,
+			   size_t pos, bool at_exit);
 #endif /* __PERF_DATA_H */
-- 
1.8.3.4


* [PATCH 32/53] perf record: Turns auxtrace_snapshot_enable into 3 states
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (30 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 31/53] perf tools: Add perf_data_file__switch() helper Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 33/53] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
                   ` (20 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	Wang Nan, He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa,
	Masami Hiramatsu, Namhyung Kim

auxtrace_snapshot_enabled has only two states (0/1). Turn it into a
three-state enum so that the SIGUSR2 handler can safely do other work
without triggering an auxtrace snapshot.
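
A reduced sketch of the three-state idea, using shortened hypothetical
names (snap_*) rather than the identifiers in the diff: while the state
is OFF, enable and disable are no-ops, so a signal handler shared with
another feature can never arm a snapshot that was not requested.

```c
#include <stdbool.h>

/* Hypothetical names; the patch uses AUXTRACE_SNAPSHOT_* instead. */
enum snap_state {
	SNAP_OFF = -1,		/* feature not requested at all */
	SNAP_DISABLED = 0,	/* requested, but not armed right now */
	SNAP_ENABLED = 1,	/* requested and armed */
};

static volatile enum snap_state snap_state = SNAP_OFF;

/* Called once at startup when the feature is requested. */
static void snap_on(void)
{
	snap_state = SNAP_DISABLED;
}

/* Arm the snapshot; ignored while the feature is OFF. */
static void snap_enable(void)
{
	if (snap_state != SNAP_OFF)
		snap_state = SNAP_ENABLED;
}

/* Disarm the snapshot; ignored while the feature is OFF. */
static void snap_disable(void)
{
	if (snap_state != SNAP_OFF)
		snap_state = SNAP_DISABLED;
}

static bool snap_is_enabled(void)
{
	return snap_state == SNAP_ENABLED;
}
```

With only two states, the unconditional "enable" before the main loop
would arm the snapshot even when snapshot mode was never selected.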

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 59 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 10f1349..318b90f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -120,7 +120,43 @@ out:
 static volatile int done;
 static volatile int signr = -1;
 static volatile int child_finished;
-static volatile int auxtrace_snapshot_enabled;
+
+static volatile enum {
+	AUXTRACE_SNAPSHOT_OFF = -1,
+	AUXTRACE_SNAPSHOT_DISABLED = 0,
+	AUXTRACE_SNAPSHOT_ENABLED = 1,
+} auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_OFF;
+
+static inline void
+auxtrace_snapshot_on(void)
+{
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_DISABLED;
+}
+
+static inline void
+auxtrace_snapshot_enable(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return;
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_ENABLED;
+}
+
+static inline void
+auxtrace_snapshot_disable(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return;
+	auxtrace_snapshot_state = AUXTRACE_SNAPSHOT_DISABLED;
+}
+
+static inline bool
+auxtrace_snapshot_is_enabled(void)
+{
+	if (auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_OFF)
+		return false;
+	return auxtrace_snapshot_state == AUXTRACE_SNAPSHOT_ENABLED;
+}
+
 static volatile int auxtrace_snapshot_err;
 static volatile int auxtrace_record__snapshot_started;
 
@@ -244,7 +280,7 @@ static void record__read_auxtrace_snapshot(struct record *rec)
 	} else {
 		auxtrace_snapshot_err = auxtrace_record__snapshot_finish(rec->itr);
 		if (!auxtrace_snapshot_err)
-			auxtrace_snapshot_enabled = 1;
+			auxtrace_snapshot_enable();
 	}
 }
 
@@ -570,10 +606,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
-	if (rec->opts.auxtrace_snapshot_mode)
+
+	if (rec->opts.auxtrace_snapshot_mode) {
 		signal(SIGUSR2, snapshot_sig_handler);
-	else
+		auxtrace_snapshot_on();
+	} else {
 		signal(SIGUSR2, SIG_IGN);
+	}
 
 	session = perf_session__new(file, false, tool);
 	if (session == NULL) {
@@ -699,12 +738,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		perf_evlist__enable(rec->evlist);
 	}
 
-	auxtrace_snapshot_enabled = 1;
+	auxtrace_snapshot_enable();
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
 		if (record__mmap_read_all(rec) < 0) {
-			auxtrace_snapshot_enabled = 0;
+			auxtrace_snapshot_disable();
 			err = -1;
 			goto out_child;
 		}
@@ -742,12 +781,12 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		 * disable events in this case.
 		 */
 		if (done && !disabled && !target__none(&opts->target)) {
-			auxtrace_snapshot_enabled = 0;
+			auxtrace_snapshot_disable();
 			perf_evlist__disable(rec->evlist);
 			disabled = true;
 		}
 	}
-	auxtrace_snapshot_enabled = 0;
+	auxtrace_snapshot_disable();
 
 	if (forks && workload_exec_errno) {
 		char msg[STRERR_BUFSIZE];
@@ -1301,9 +1340,9 @@ out_symbol_exit:
 
 static void snapshot_sig_handler(int sig __maybe_unused)
 {
-	if (!auxtrace_snapshot_enabled)
+	if (!auxtrace_snapshot_is_enabled())
 		return;
-	auxtrace_snapshot_enabled = 0;
+	auxtrace_snapshot_disable();
 	auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
 	auxtrace_record__snapshot_started = 1;
 }
-- 
1.8.3.4


* [PATCH 33/53] perf record: Introduce record__finish_output() to finish a perf.data
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (31 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 32/53] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 34/53] perf record: Use OPT_BOOLEAN_SET for buildid cache related options Wang Nan
                   ` (19 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Move the code for finalizing 'perf.data' into record__finish_output(). It
will be used by the following commits to split output into multiple files.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 318b90f..91bf4b1 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -493,6 +493,33 @@ static void record__init_features(struct record *rec)
 	perf_header__clear_feat(&session->header, HEADER_STAT);
 }
 
+static void
+record__finish_output(struct record *rec)
+{
+	struct perf_data_file *file = &rec->file;
+	int fd = perf_data_file__fd(file);
+
+	if (file->is_pipe)
+		return;
+
+	rec->session->header.data_size += rec->bytes_written;
+	file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
+
+	if (!rec->no_buildid) {
+		process_buildids(rec);
+		/*
+		 * We take all buildids when the file contains
+		 * AUX area tracing data because we do not decode the
+		 * trace because it would take too long.
+		 */
+		if (rec->opts.full_auxtrace)
+			dsos__hit_all(rec->session);
+	}
+	perf_session__write_header(rec->session, rec->evlist, fd, true);
+
+	return;
+}
+
 static volatile int workload_exec_errno;
 
 /*
@@ -820,22 +847,8 @@ out_child:
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
-	if (!err && !file->is_pipe) {
-		rec->session->header.data_size += rec->bytes_written;
-		file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
-
-		if (!rec->no_buildid) {
-			process_buildids(rec);
-			/*
-			 * We take all buildids when the file contains
-			 * AUX area tracing data because we do not decode the
-			 * trace because it would take too long.
-			 */
-			if (rec->opts.full_auxtrace)
-				dsos__hit_all(rec->session);
-		}
-		perf_session__write_header(rec->session, rec->evlist, fd, true);
-	}
+	if (!err)
+		record__finish_output(rec);
 
 	if (!err && !quiet) {
 		char samples[128];
-- 
1.8.3.4


* [PATCH 34/53] perf record: Use OPT_BOOLEAN_SET for buildid cache related options
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (32 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 33/53] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 35/53] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
                   ` (18 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Let 'perf record' know whether the buildid cache was enabled explicitly
(via --no-no-buildid-cache), because the buildid cache may be turned off
by default in some situations.

Output switching support needs this to turn off the buildid cache by
default.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 91bf4b1..7a26cb5 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -50,7 +50,9 @@ struct record {
 	const char		*progname;
 	int			realtime_prio;
 	bool			no_buildid;
+	bool			no_buildid_set;
 	bool			no_buildid_cache;
+	bool			no_buildid_cache_set;
 	unsigned long long	samples;
 };
 
@@ -1176,10 +1178,12 @@ struct option __record_options[] = {
 	OPT_BOOLEAN('P', "period", &record.opts.period, "Record the sample period"),
 	OPT_BOOLEAN('n', "no-samples", &record.opts.no_samples,
 		    "don't sample"),
-	OPT_BOOLEAN('N', "no-buildid-cache", &record.no_buildid_cache,
-		    "do not update the buildid cache"),
-	OPT_BOOLEAN('B', "no-buildid", &record.no_buildid,
-		    "do not collect buildids in perf.data"),
+	OPT_BOOLEAN_SET('N', "no-buildid-cache", &record.no_buildid_cache,
+			&record.no_buildid_cache_set,
+			"do not update the buildid cache"),
+	OPT_BOOLEAN_SET('B', "no-buildid", &record.no_buildid,
+			&record.no_buildid_set,
+			"do not collect buildids in perf.data"),
 	OPT_CALLBACK('G', "cgroup", &record.evlist, "name",
 		     "monitor event in cgroup name only",
 		     parse_cgroups),
-- 
1.8.3.4


* [PATCH 35/53] perf record: Add '--timestamp-filename' option to append timestamp to output filename
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (33 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 34/53] perf record: Use OPT_BOOLEAN_SET for buildid cache related options Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 36/53] perf record: Split output into multiple files via '--switch-output' Wang Nan
                   ` (17 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

This option appends the current timestamp to the output file name. For
example:

 # perf record -a --timestamp-filename
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2015122622265847 ]
 [ perf record: Captured and wrote 0.742 MB perf.data (90 samples) ]
 # ls
 perf.data.201512262226584
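
The body of fetch_current_timestamp() is not part of this patch. As a
rough sketch of how a 16-character suffix of this shape could be built,
assuming the last two digits are hundredths of a second (an assumption,
not something the patch states), a hypothetical format_timestamp() might
look like this:

```c
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/*
 * Hypothetical stand-in, not the perf helper: format 'tv' as
 * YYYYMMDDhhmmss followed by two digits of hundredths of a second.
 * Returns 0 on success, -1 if the time cannot be converted or the
 * buffer is too small for the 16 characters plus the terminator.
 */
static int format_timestamp(const struct timeval *tv, char *buf, size_t sz)
{
	struct tm tm;
	char date[32];
	time_t secs = tv->tv_sec;
	int n;

	if (!localtime_r(&secs, &tm))
		return -1;

	if (strftime(date, sizeof(date), "%Y%m%d%H%M%S", &tm) == 0)
		return -1;

	n = snprintf(buf, sz, "%s%02u", date,
		     (unsigned int)(tv->tv_usec / 10000));
	if (n < 0 || (size_t)n >= sz)
		return -1;

	return 0;
}
```

A caller would fill a struct timeval with gettimeofday() and pass a
buffer of at least 17 bytes, which matches the size of the
"InvalidTimestamp" placeholder array used in the diff.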

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 47 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7a26cb5..605cccc 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -53,6 +53,7 @@ struct record {
 	bool			no_buildid_set;
 	bool			no_buildid_cache;
 	bool			no_buildid_cache_set;
+	bool			timestamp_filename;
 	unsigned long long	samples;
 };
 
@@ -522,6 +523,37 @@ record__finish_output(struct record *rec)
 	return;
 }
 
+static int
+record__switch_output(struct record *rec, bool at_exit)
+{
+	struct perf_data_file *file = &rec->file;
+	int fd, err;
+
+	/* Same Size:      "2015122520103046"*/
+	char timestamp[] = "InvalidTimestamp";
+
+	rec->samples = 0;
+	record__finish_output(rec);
+	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
+	if (err) {
+		pr_err("Failed to get current timestamp\n");
+		return -EINVAL;
+	}
+
+	fd = perf_data_file__switch(file, timestamp,
+				    rec->session->header.data_offset,
+				    at_exit);
+	if (fd >= 0 && !at_exit) {
+		rec->bytes_written = 0;
+		rec->session->header.data_size = 0;
+	}
+
+	if (!quiet)
+		fprintf(stderr, "[ perf record: Dump %s.%s ]\n",
+			file->path, timestamp);
+	return fd;
+}
+
 static volatile int workload_exec_errno;
 
 /*
@@ -849,8 +881,17 @@ out_child:
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
-	if (!err)
-		record__finish_output(rec);
+	if (!err) {
+		if (!rec->timestamp_filename) {
+			record__finish_output(rec);
+		} else {
+			fd = record__switch_output(rec, true);
+			if (fd < 0) {
+				status = fd;
+				goto out_delete_session;
+			}
+		}
+	}
 
 	if (!err && !quiet) {
 		char samples[128];
@@ -1225,6 +1266,8 @@ struct option __record_options[] = {
 		   "options passed to clang when compiling BPF scriptlets"),
 	OPT_STRING(0, "vmlinux", &symbol_conf.vmlinux_name,
 		   "file", "vmlinux pathname"),
+	OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
+		    "append timestamp to output filename"),
 	OPT_END()
 };
 
-- 
1.8.3.4


* [PATCH 36/53] perf record: Split output into multiple files via '--switch-output'
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (34 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 35/53] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 37/53] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
                   ` (16 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Allow 'perf record' to split its output into multiple files.

For example:

 # ~/perf record -a --timestamp-filename --switch-output &
 [1] 10763
 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622314468 ]

 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622314762 ]

 # kill -s SIGUSR2 10763
 [ perf record: dump data: Woken up 1 times ]
 #[ perf record: Dump perf.data.2015122622315171 ]

 # fg
 perf record -a --timestamp-filename --switch-output
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2015122622315513 ]
 [ perf record: Captured and wrote 0.014 MB perf.data (296 samples) ]

 # ls -l
 total 920
 -rw------- 1 root root 797692 Dec 26 22:31 perf.data.2015122622314468
 -rw------- 1 root root  59960 Dec 26 22:31 perf.data.2015122622314762
 -rw------- 1 root root  59912 Dec 26 22:31 perf.data.2015122622315171
 -rw------- 1 root root  19220 Dec 26 22:31 perf.data.2015122622315513

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 605cccc..e8f930c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -54,6 +54,7 @@ struct record {
 	bool			no_buildid_cache;
 	bool			no_buildid_cache_set;
 	bool			timestamp_filename;
+	bool			switch_output;
 	unsigned long long	samples;
 };
 
@@ -162,6 +163,7 @@ auxtrace_snapshot_is_enabled(void)
 
 static volatile int auxtrace_snapshot_err;
 static volatile int auxtrace_record__snapshot_started;
+static volatile int switch_output_started;
 
 static void sig_handler(int sig)
 {
@@ -668,7 +670,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	signal(SIGINT, sig_handler);
 	signal(SIGTERM, sig_handler);
 
-	if (rec->opts.auxtrace_snapshot_mode) {
+	if (rec->opts.auxtrace_snapshot_mode || rec->switch_output) {
 		signal(SIGUSR2, snapshot_sig_handler);
 		auxtrace_snapshot_on();
 	} else {
@@ -820,9 +822,25 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			}
 		}
 
+		if (switch_output_started) {
+			switch_output_started = 0;
+
+			if (!quiet)
+				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
+					waking);
+			waking = 0;
+			fd = record__switch_output(rec, false);
+			if (fd < 0) {
+				pr_err("Failed to switch to new file\n");
+				err = fd;
+				goto out_child;
+			}
+		}
+
 		if (hits == rec->samples) {
 			if (done || draining)
 				break;
+
 			err = perf_evlist__poll(rec->evlist, -1);
 			/*
 			 * Propagate error, only if there's any. Ignore positive
@@ -1268,6 +1286,8 @@ struct option __record_options[] = {
 		   "file", "vmlinux pathname"),
 	OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
 		    "append timestamp to output filename"),
+	OPT_BOOLEAN(0, "switch-output", &record.switch_output,
+		    "Switch output when receive SIGUSR2"),
 	OPT_END()
 };
 
@@ -1400,9 +1420,11 @@ out_symbol_exit:
 
 static void snapshot_sig_handler(int sig __maybe_unused)
 {
-	if (!auxtrace_snapshot_is_enabled())
-		return;
-	auxtrace_snapshot_disable();
-	auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
-	auxtrace_record__snapshot_started = 1;
+	if (auxtrace_snapshot_is_enabled()) {
+		auxtrace_snapshot_disable();
+		auxtrace_snapshot_err = auxtrace_record__snapshot_start(record.itr);
+		auxtrace_record__snapshot_started = 1;
+	}
+
+	switch_output_started = 1;
 }
-- 
1.8.3.4


* [PATCH 37/53] perf record: Force enable --timestamp-filename when --switch-output is provided
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (35 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 36/53] perf record: Split output into multiple files via '--switch-output' Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 38/53] perf record: Disable buildid cache options by default in switch output mode Wang Nan
                   ` (15 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Without this patch, the last output doesn't have a timestamp appended if
--timestamp-filename is not explicitly provided. For example:

 # perf record -a --switch-output &
 [1] 11224
 # kill -s SIGUSR2 11224
 [ perf record: dump data: Woken up 1 times ]
 # [ perf record: Dump perf.data.2015122622372823 ]

 # fg
 perf record -a --switch-output
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.027 MB perf.data (540 samples) ]

 # ls -l
 total 836
 -rw------- 1 root root  33256 Dec 26 22:37 perf.data   <---- *Odd*
 -rw------- 1 root root 817156 Dec 26 22:37 perf.data.2015122622372823

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e8f930c..d32facd 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1343,6 +1343,9 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		return -EINVAL;
 	}
 
+	if (rec->switch_output)
+		rec->timestamp_filename = true;
+
 	if (!rec->itr) {
 		rec->itr = auxtrace_record__init(rec->evlist, &err);
 		if (err)
-- 
1.8.3.4


* [PATCH 38/53] perf record: Disable buildid cache options by default in switch output mode
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (36 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 37/53] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 39/53] perf record: Re-synthesize tracking events after output switching Wang Nan
                   ` (14 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

The cost of buildid cache processing is high: perf reads all events in the
output perf.data, opens ELF files to read their buildids, then copies them
into the ~/.debug directory. In switch output mode, this causes perf to
stop receiving perf events for too long.

Enable no-buildid and no-buildid-cache by default if --switch-output
is provided. Users can still pass --no-no-buildid to explicitly enable
buildids in this case.
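
The decision can be read as a small pure function over the option
values and their "was it given on the command line" flags. The sketch
below restates the logic of the diff in standalone form; struct
buildid_opts and should_disable_buildid() are hypothetical names used
only for illustration.

```c
#include <stdbool.h>

/* Hypothetical container for the relevant 'struct record' fields. */
struct buildid_opts {
	bool no_buildid;		/* value of -B / --no-buildid */
	bool no_buildid_set;		/* option appeared on the command line */
	bool no_buildid_cache;		/* value of -N / --no-buildid-cache */
	bool no_buildid_cache_set;	/* option appeared on the command line */
	bool switch_output;		/* --switch-output given */
};

/* Returns true when buildid processing should be turned off. */
static bool should_disable_buildid(const struct buildid_opts *o)
{
	/* An explicit request to disable always wins. */
	if (o->no_buildid_cache || o->no_buildid)
		return true;

	/* Outside switch output mode, buildids stay on by default. */
	if (!o->switch_output)
		return false;

	/* In switch output mode, an explicit --no-no-* keeps them on. */
	if (o->no_buildid_set && !o->no_buildid)
		return false;
	if (o->no_buildid_cache_set && !o->no_buildid_cache)
		return false;

	return true;
}
```

The "_set" flags are what make the third case expressible: a plain
boolean cannot tell "left at its default" apart from "explicitly set to
false".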

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d32facd..790361b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1371,8 +1371,36 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 "If some relocation was applied (e.g. kexec) symbols may be misresolved\n"
 "even with a suitable vmlinux or kallsyms file.\n\n");
 
-	if (rec->no_buildid_cache || rec->no_buildid)
+	if (rec->no_buildid_cache || rec->no_buildid) {
 		disable_buildid_cache();
+	} else if (rec->switch_output) {
+		/*
+		 * In 'perf record --switch-output', disable buildid
+		 * generation by default to reduce data file switching
+		 * overhead. Still generate buildid if they are required
+		 * explicitly using
+		 *
+		 *  perf record --signal-trigger --no-no-buildid \
+		 *              --no-no-buildid-cache
+		 *
+		 * Equals to:
+		 *
+		 * if ((rec->no_buildid || !rec->no_buildid_set) &&
+		 *     (rec->no_buildid_cache || !rec->no_buildid_cache_set))
+		 *         disable_buildid_cache();
+		 */
+		bool disable = true;
+
+		if (rec->no_buildid_set && !rec->no_buildid)
+			disable = false;
+		if (rec->no_buildid_cache_set && !rec->no_buildid_cache)
+			disable = false;
+		if (disable) {
+			rec->no_buildid = true;
+			rec->no_buildid_cache = true;
+			disable_buildid_cache();
+		}
+	}
 
 	if (rec->evlist->nr_entries == 0 &&
 	    perf_evlist__add_default(rec->evlist) < 0) {
-- 
1.8.3.4

* [PATCH 39/53] perf record: Re-synthesize tracking events after output switching
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (37 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 38/53] perf record: Disable buildid cache options by default in switch output mode Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 40/53] perf record: Generate tracking events for process forked by perf Wang Nan
                   ` (13 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Tracking events describe the kernel and threads. They are generated by
reading /proc/kallsyms, /proc/*/maps and /proc/*/task/* during
initialization of 'perf record', serialized into event sequences and put
at the head of 'perf.data'. With output switching, each output file
should contain those events.

This patch calls record__synthesize() during output switching, so the
event sequences described above are collected again.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 790361b..5305a30 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -525,6 +525,8 @@ record__finish_output(struct record *rec)
 	return;
 }
 
+static int record__synthesize(struct record *rec);
+
 static int
 record__switch_output(struct record *rec, bool at_exit)
 {
@@ -553,6 +555,15 @@ record__switch_output(struct record *rec, bool at_exit)
 	if (!quiet)
 		fprintf(stderr, "[ perf record: Dump %s.%s ]\n",
 			file->path, timestamp);
+
+	/* Reinit machine */
+	if (!at_exit) {
+		machines__exit(&rec->session->machines);
+		machines__init(&rec->session->machines);
+		perf_session__create_kernel_maps(rec->session);
+		perf_session__set_id_hdr_size(rec->session);
+		record__synthesize(rec);
+	}
 	return fd;
 }
 
-- 
1.8.3.4

* [PATCH 40/53] perf record: Generate tracking events for process forked by perf
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (38 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 39/53] perf record: Re-synthesize tracking events after output switching Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 41/53] perf record: Ensure return non-zero rc when mmap fail Wang Nan
                   ` (12 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

With 'perf record --switch-output' without -a, record__synthesize() in
record__switch_output() won't generate tracking events because there's
no thread_map in evlist. As a result, the newly created perf.data
doesn't contain map and comm information.

This patch creates a fake thread_map and directly calls
perf_event__synthesize_thread_map() for those events.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5305a30..aaf3b0f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -563,6 +563,23 @@ record__switch_output(struct record *rec, bool at_exit)
 		perf_session__create_kernel_maps(rec->session);
 		perf_session__set_id_hdr_size(rec->session);
 		record__synthesize(rec);
+
+		if (target__none(&rec->opts.target)) {
+			struct {
+				struct thread_map map;
+				struct thread_map_data map_data;
+			} thread_map;
+
+			thread_map.map.nr = 1;
+			thread_map.map.map[0].pid = rec->evlist->workload.pid;
+			thread_map.map.map[0].comm = NULL;
+			perf_event__synthesize_thread_map(&rec->tool,
+					&thread_map.map,
+					process_synthesized_event,
+					&rec->session->machines.host,
+					rec->opts.sample_address,
+					rec->opts.proc_map_timeout);
+		}
 	}
 	return fd;
 }
-- 
1.8.3.4

* [PATCH 41/53] perf record: Ensure return non-zero rc when mmap fail
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (39 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 40/53] perf record: Generate tracking events for process forked by perf Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 42/53] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
                   ` (11 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

perf_evlist__mmap_ex() can fail without setting errno (for example,
when it fails in condition checking; in this case all syscalls
succeed). If this happens, record__open() incorrectly returns 0.
Forcing rc to a non-zero value is a quick way to avoid this problem;
otherwise we would have to follow all possible code paths in
perf_evlist__mmap_ex() to make sure there's at least one system call
that sets errno before returning an error.
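
As a standalone illustration of the fallback, assuming a hypothetical
helper mmap_failure_rc() that does not exist in perf, the rule amounts
to the following:

```c
#include <errno.h>

/*
 * Hypothetical helper: map a saved errno value to a negative return
 * code, falling back to -EINVAL when no syscall recorded an error.
 */
static int mmap_failure_rc(int saved_errno)
{
	return saved_errno ? -saved_errno : -EINVAL;
}
```

The point of the fallback is that the caller never sees 0 on a failure
path, whatever errno happened to hold.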

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index aaf3b0f..b65b41f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -361,7 +361,10 @@ try_again:
 		} else {
 			pr_err("failed to mmap with %d (%s)\n", errno,
 				strerror_r(errno, msg, sizeof(msg)));
-			rc = -errno;
+			if (errno)
+				rc = -errno;
+			else
+				rc = -EINVAL;
 		}
 		goto out;
 	}
-- 
1.8.3.4

* [PATCH 42/53] perf record: Prevent reading invalid data in record__mmap_read
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (40 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 41/53] perf record: Ensure return non-zero rc when mmap fail Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 14:21   ` Sergei Shtylyov
  2016-01-11 13:48 ` [PATCH 43/53] perf tools: Add evlist channel helpers Wang Nan
                   ` (10 subsequent siblings)
  52 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

When record__mmap_read() is asked to read more data than the size of
the ring buffer, drop that data to avoid accessing invalid memory.

This can happen when reading from an overwritable ring buffer, which
should be avoided. However, check for it anyway for robustness.
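
The check itself is small enough to sketch on its own. The helper name
read_exceeds_ring() is hypothetical; the sketch assumes, as the diff
below does, that mask is the ring size minus one and that the positions
are free-running unsigned counters.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: 'mask' is the ring size minus one (the size is a
 * power of two), 'old' and 'head' are free-running byte positions.
 */
static bool read_exceeds_ring(uint64_t old, uint64_t head, uint64_t mask)
{
	/* Unsigned subtraction stays correct across counter wraparound. */
	uint64_t size = head - old;

	return size > mask + 1;
}
```

A pending span of exactly one ring size is still accepted; only a span
strictly larger than the buffer is treated as unreadable.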

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b65b41f..3f58426 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -37,6 +37,7 @@
 #include <unistd.h>
 #include <sched.h>
 #include <sys/mman.h>
+#include <asm/bug.h>
 
 
 struct record {
@@ -94,6 +95,13 @@ static int record__mmap_read(struct record *rec, int idx)
 	rec->samples++;
 
 	size = head - old;
+	if (size > (unsigned long)(md->mask) + 1) {
+		WARN_ONCE(1, "WARNING: failed to keep up with mmap data. (warn only once)\n");
+
+		md->prev = head;
+		perf_evlist__mmap_consume(rec->evlist, idx);
+		return 0;
+	}
 
 	if ((old & md->mask) + size != (head & md->mask)) {
 		buf = &data[old & md->mask];
-- 
1.8.3.4

* [PATCH 43/53] perf tools: Add evlist channel helpers
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (41 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 42/53] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 44/53] perf tools: Automatically add new channel according to evlist Wang Nan
                   ` (9 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

This commit introduces several helpers to support the concept of
channels. Channels hold different groups of evsels which are configured
differently. They will be used for overwritable evsels, which allow
perf to record some events continuously while capturing snapshots of
other events when something happens. Tracking events (mmap, mmap2,
fork, exit ...) are another kind of event worth putting into a
separate channel.

Channels are represented by an array of channel flags. Each channel
contains evlist->nr_mmaps mmaps. Channels are configured before
perf_evlist__mmap_ex(). In that function, the nr_mmaps mmaps of each
channel are allocated together as one big array.
perf_evlist__channel_idx() converts a channel number and a per-channel
index into an index in the big array. For API functions which accept
idx, _ex() versions are introduced which select an mmap from a given
channel.
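
The index arithmetic can be sketched outside of perf as follows. The
function names and the NR_CHANNELS value of 2 are assumptions made for
illustration (this patch itself sets PERF_EVLIST__NR_CHANNELS to 1).

```c
#include <errno.h>

#define NR_CHANNELS 2	/* assumed channel limit for this sketch */

/*
 * Hypothetical model of the mapping: channel 'ch' owns the slots
 * [ch * nr_mmaps, (ch + 1) * nr_mmaps) of one flat array.
 * Returns the flat index, or a negative errno value.
 */
static int flat_mmap_index(int nr_mmaps, int ch, int idx)
{
	if (nr_mmaps <= 0 || idx < 0 || ch < 0)
		return -EINVAL;
	if (ch >= NR_CHANNELS || idx >= nr_mmaps)
		return -E2BIG;
	return nr_mmaps * ch + idx;
}

/* Inverse direction: recover the channel that owns a flat index. */
static int flat_index_channel(int nr_mmaps, int flat)
{
	if (nr_mmaps <= 0 || flat < 0)
		return -EINVAL;
	return flat / nr_mmaps < NR_CHANNELS ? flat / nr_mmaps : -E2BIG;
}
```

With nr_mmaps == 4, slot 2 of channel 1 lands at flat index 6, and flat
index 6 maps back to channel 1.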

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |   6 ++
 tools/perf/util/evlist.c    | 132 ++++++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/evlist.h    |  58 +++++++++++++++++++
 3 files changed, 190 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3f58426..21da64d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -355,6 +355,12 @@ try_again:
 		goto out;
 	}
 
+	perf_evlist__channel_reset(evlist);
+	rc = perf_evlist__channel_add(evlist, 0, true);
+	if (rc < 0)
+		goto out;
+	rc = 0;
+
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 890b08b..ff1beac 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -679,14 +679,51 @@ static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
 	return NULL;
 }
 
-union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx)
+{
+	int channel = *p_channel;
+	int _idx = *p_idx;
+
+	if (_idx < 0)
+		return -EINVAL;
+	/*
+	 * Negative channel means caller explicitly use real index.
+	 */
+	if (channel < 0) {
+		channel = perf_evlist__idx_channel(evlist, _idx);
+		_idx = _idx % evlist->nr_mmaps;
+	}
+	if (channel < 0)
+		return channel;
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	if (_idx >= evlist->nr_mmaps)
+		return -E2BIG;
+
+	*p_channel = channel;
+	*p_idx = evlist->nr_mmaps * channel + _idx;
+	return 0;
+}
+
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx)
 {
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
 	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head;
-	u64 old = md->prev;
-	unsigned char *data = md->base + page_size;
+	u64 old;
+	unsigned char *data;
 	union perf_event *event = NULL;
 
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return NULL;
+	}
+	old = md->prev;
+	data = md->base + page_size;
+
 	/*
 	 * Check if event was unmapped due to a POLLHUP/POLLERR.
 	 */
@@ -748,6 +785,11 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 	return event;
 }
 
+union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+{
+	return perf_evlist__mmap_read_ex(evlist, -1, idx);
+}
+
 static bool perf_mmap__empty(struct perf_mmap *md)
 {
 	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
@@ -766,10 +808,18 @@ static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 		__perf_evlist__munmap(evlist, idx);
 }
 
-void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx)
 {
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
 	struct perf_mmap *md = &evlist->mmap[idx];
 
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return;
+	}
+
 	if (!evlist->overwrite) {
 		u64 old = md->prev;
 
@@ -780,6 +830,11 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 		perf_evlist__mmap_put(evlist, idx);
 }
 
+void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+{
+	perf_evlist__mmap_consume_ex(evlist, -1, idx);
+}
+
 int __weak auxtrace_mmap__mmap(struct auxtrace_mmap *mm __maybe_unused,
 			       struct auxtrace_mmap_params *mp __maybe_unused,
 			       void *userpg __maybe_unused,
@@ -825,7 +880,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 	if (evlist->mmap == NULL)
 		return;
 
-	for (i = 0; i < evlist->nr_mmaps; i++)
+	for (i = 0; i < perf_evlist__mmap_nr(evlist); i++)
 		__perf_evlist__munmap(evlist, i);
 
 	zfree(&evlist->mmap);
@@ -833,10 +888,17 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
+	int total_mmaps;
+
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
-	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
+
+	total_mmaps = perf_evlist__mmap_nr(evlist);
+	if (!total_mmaps)
+		return -EINVAL;
+
+	evlist->mmap = zalloc(total_mmaps * sizeof(struct perf_mmap));
 	return evlist->mmap != NULL ? 0 : -ENOMEM;
 }
 
@@ -1137,6 +1199,12 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	int err;
+
+	perf_evlist__channel_reset(evlist);
+	err = perf_evlist__channel_add(evlist, 0, true);
+	if (err < 0)
+		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
@@ -1739,3 +1807,55 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 
 	return NULL;
 }
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist)
+{
+	int i;
+
+	for (i = PERF_EVLIST__NR_CHANNELS - 1; i >= 0; i--) {
+		unsigned long flags = evlist->channel_flags[i];
+
+		if (flags & PERF_EVLIST__CHANNEL_ENABLED)
+			return i + 1;
+	}
+	return 0;
+}
+
+int perf_evlist__mmap_nr(struct perf_evlist *evlist)
+{
+	return evlist->nr_mmaps * perf_evlist__channel_nr(evlist);
+}
+
+void perf_evlist__channel_reset(struct perf_evlist *evlist)
+{
+	int i;
+
+	BUG_ON(evlist->mmap);
+
+	for (i = 0; i < PERF_EVLIST__NR_CHANNELS; i++)
+		evlist->channel_flags[i] = 0;
+}
+
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default)
+{
+	int n = perf_evlist__channel_nr(evlist);
+	unsigned long *flags = evlist->channel_flags;
+
+	BUG_ON(evlist->mmap);
+
+	if (n >= PERF_EVLIST__NR_CHANNELS) {
+		pr_debug("ERROR: too many channels. Increase PERF_EVLIST__NR_CHANNELS\n");
+		return -ENOSPC;
+	}
+
+	if (is_default) {
+		memmove(&flags[1], &flags[0],
+			sizeof(evlist->channel_flags) -
+			sizeof(evlist->channel_flags[0]));
+		n = 0;
+	}
+	flags[n] = flag | PERF_EVLIST__CHANNEL_ENABLED;
+	return n;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a0d1522..1812652 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,6 +20,11 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
+#define PERF_EVLIST__NR_CHANNELS	1
+enum perf_evlist_mmap_flag {
+	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+};
+
 /**
  * struct perf_mmap - perf's ring buffer mmap details
  *
@@ -52,6 +57,7 @@ struct perf_evlist {
 		pid_t	pid;
 	} workload;
 	struct fdarray	 pollfd;
+	unsigned long channel_flags[PERF_EVLIST__NR_CHANNELS];
 	struct perf_mmap *mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
@@ -116,9 +122,61 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx);
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
 
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx);
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
+int perf_evlist__mmap_nr(struct perf_evlist *evlist);
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist);
+void perf_evlist__channel_reset(struct perf_evlist *evlist);
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default);
+
+static inline bool
+__perf_evlist__channel_check(struct perf_evlist *evlist, int channel,
+			     enum perf_evlist_mmap_flag bits)
+{
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return false;
+
+	return (evlist->channel_flags[channel] & bits) ? true : false;
+}
+#define perf_evlist__channel_check(e, c, b) \
+		__perf_evlist__channel_check(e, c, PERF_EVLIST__CHANNEL_##b)
+
+static inline bool
+perf_evlist__channel_is_enabled(struct perf_evlist *evlist, int channel)
+{
+	return perf_evlist__channel_check(evlist, channel, ENABLED);
+}
+
+static inline int
+perf_evlist__idx_channel(struct perf_evlist *evlist, int idx)
+{
+	int channel = idx / evlist->nr_mmaps;
+
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	return channel;
+}
+
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx);
+
+static inline struct perf_mmap *
+perf_evlist__get_mmap(struct perf_evlist *evlist,
+		      int channel, int idx)
+{
+	if (perf_evlist__channel_idx(evlist, &channel, &idx))
+		return NULL;
+
+	return &evlist->mmap[idx];
+}
 
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
-- 
1.8.3.4

* [PATCH 44/53] perf tools: Automatically add new channel according to evlist
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (42 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 43/53] perf tools: Add evlist channel helpers Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 45/53] perf tools: Operate multiple channels Wang Nan
                   ` (8 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

perf_evlist__channel_find() can be used to find a proper channel based
on the properties of an evsel. If the channel doesn't exist, it can
create a new one for it. After this patch there's no need to create
the default channel explicitly.
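
The find-or-create behaviour can be sketched with a plain array. The
struct channels type, the constants and channel_find() below are
hypothetical simplifications; in particular the sketch keeps an explicit
count instead of scanning the flags as the real helpers do.

```c
#include <errno.h>

#define NR_CHANNELS	2	/* assumed limit for this sketch */
#define CHANNEL_ENABLED	1UL	/* assumed "slot in use" flag bit */

/* Hypothetical stand-in for the channel_flags array in the evlist. */
struct channels {
	unsigned long flags[NR_CHANNELS];
	int nr;			/* number of channels in use */
};

/* Return the channel whose flags match, optionally creating it. */
static int channel_find(struct channels *c, unsigned long flag, int add_new)
{
	int i;

	flag |= CHANNEL_ENABLED;
	for (i = 0; i < c->nr; i++)
		if (c->flags[i] == flag)
			return i;
	if (!add_new)
		return -ENOENT;
	if (c->nr >= NR_CHANNELS)
		return -ENOSPC;
	c->flags[c->nr] = flag;
	return c->nr++;
}
```

Looking up the same flags twice returns the same channel, so calling it
once per evsel creates each distinct channel exactly once.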

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  5 -----
 tools/perf/util/evlist.c    | 47 ++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 21da64d..1f9fb6e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -356,11 +356,6 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	rc = perf_evlist__channel_add(evlist, 0, true);
-	if (rc < 0)
-		goto out;
-	rc = 0;
-
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ff1beac..5a898be 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -943,6 +943,43 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
+static unsigned long
+perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
+{
+	return 0;
+}
+
+static int
+perf_evlist__channel_find(struct perf_evlist *evlist,
+			  struct perf_evsel *evsel,
+			  bool add_new)
+{
+	unsigned long flag = perf_evlist__channel_for_evsel(evsel);
+	int i;
+
+	flag |= PERF_EVLIST__CHANNEL_ENABLED;
+	for (i = 0; i < perf_evlist__channel_nr(evlist); i++)
+		if (evlist->channel_flags[i] == flag)
+			return i;
+	if (add_new)
+		return perf_evlist__channel_add(evlist, flag, false);
+	return -ENOENT;
+}
+
+static int
+perf_evlist__channel_complete(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	int err;
+
+	evlist__for_each(evlist, evsel) {
+		err = perf_evlist__channel_find(evlist, evsel, true);
+		if (err < 0)
+			return err;
+	}
+	return 0;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
@@ -1162,6 +1199,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 bool overwrite, unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite)
 {
+	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
@@ -1169,6 +1207,10 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
+	err = perf_evlist__channel_complete(evlist);
+	if (err)
+		return err;
+
 	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
 		return -ENOMEM;
 
@@ -1199,12 +1241,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
-	int err;
-
 	perf_evlist__channel_reset(evlist);
-	err = perf_evlist__channel_add(evlist, 0, true);
-	if (err < 0)
-		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
-- 
1.8.3.4

* [PATCH 45/53] perf tools: Operate multiple channels
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (43 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 44/53] perf tools: Automatically add new channel according to evlist Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 46/53] perf tools: Squash overwrite setting into channel Wang Nan
                   ` (7 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Before this patch perf operates on only the first channel. Make perf
mmap and read from multiple channels.
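
One detail of the multi-channel mmap path is that the first event fd
seen for a channel owns that channel's ring buffer, and later fds are
redirected to it. A rough sketch of that bookkeeping, using hypothetical
helper names and an assumed channel count of 2:

```c
#include <string.h>

#define NR_CHANNELS 2	/* assumed channel count */

/* Mark every channel as having no output fd yet. */
static void outputs_reset(int outputs[NR_CHANNELS])
{
	/* All-ones bytes read back as -1 for two's complement ints. */
	memset(outputs, -1, NR_CHANNELS * sizeof(outputs[0]));
}

/*
 * Returns 1 when 'fd' becomes the channel's ring buffer owner (it would
 * be mmapped), 0 when it should be redirected to the existing owner.
 */
static int outputs_claim(int outputs[NR_CHANNELS], int channel, int fd)
{
	if (outputs[channel] == -1) {
		outputs[channel] = fd;
		return 1;
	}
	return 0;
}
```

Each channel is tracked independently, so an fd claiming channel 0 does
not affect whether channel 1 still needs its own mmap.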

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  3 ++-
 tools/perf/util/evlist.c    | 55 ++++++++++++++++++++++++++++++++++-----------
 tools/perf/util/evlist.h    |  2 +-
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1f9fb6e..fee5fd2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -458,8 +458,9 @@ static int record__mmap_read_all(struct record *rec)
 	u64 bytes_written = rec->bytes_written;
 	int i;
 	int rc = 0;
+	int total_mmaps = perf_evlist__mmap_nr(rec->evlist);
 
-	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
+	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
 		if (rec->evlist->mmap[i].base) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5a898be..9187747 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -873,6 +873,21 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
 }
 
+static void
+__perf_evlist__munmap_channels(struct perf_evlist *evlist, int _idx)
+{
+	int _ch;
+
+	for (_ch = 0; _ch < perf_evlist__channel_nr(evlist); _ch++) {
+		int err, idx = _idx, ch = _ch;
+
+		err = perf_evlist__channel_idx(evlist, &ch, &idx);
+		if (err < 0)
+			continue;
+		__perf_evlist__munmap(evlist, idx);
+	}
+}
+
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	int i;
@@ -980,26 +995,38 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
+static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *outputs)
 {
 	struct perf_evsel *evsel;
 
 	evlist__for_each(evlist, evsel) {
-		int fd;
+		int fd, channel, idx, err;
+
+		channel = perf_evlist__channel_find(evlist, evsel, false);
+		if (channel < 0) {
+			pr_err("ERROR: unable to find suitable channel for %s\n",
+			       evsel->name);
+			return -1;
+		}
+
+		idx = _idx;
+		err = perf_evlist__channel_idx(evlist, &channel, &idx);
+		if (err < 0)
+			return err;
 
 		if (evsel->system_wide && thread)
 			continue;
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
-			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+		if (outputs[channel] == -1) {
+			outputs[channel] = fd;
+			if (__perf_evlist__mmap(evlist, idx, mp, outputs[channel]) < 0)
 				return -1;
 		} else {
-			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
+			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, outputs[channel]) != 0)
 				return -1;
 
 			perf_evlist__mmap_get(evlist, idx);
@@ -1039,14 +1066,15 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, outputs))
 				goto out_unmap;
 		}
 	}
@@ -1055,7 +1083,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 
 out_unmap:
 	for (cpu = 0; cpu < nr_cpus; cpu++)
-		__perf_evlist__munmap(evlist, cpu);
+		__perf_evlist__munmap_channels(evlist, cpu);
 	return -1;
 }
 
@@ -1067,13 +1095,14 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						outputs))
 			goto out_unmap;
 	}
 
@@ -1081,7 +1110,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 
 out_unmap:
 	for (thread = 0; thread < nr_threads; thread++)
-		__perf_evlist__munmap(evlist, thread);
+		__perf_evlist__munmap_channels(evlist, thread);
 	return -1;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1812652..b652587 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,7 +20,7 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
-#define PERF_EVLIST__NR_CHANNELS	1
+#define PERF_EVLIST__NR_CHANNELS	2
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 };
-- 
1.8.3.4

* [PATCH 46/53] perf tools: Squash overwrite setting into channel
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (44 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 45/53] perf tools: Operate multiple channels Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 47/53] perf record: Don't read from and poll overwrite channel Wang Nan
                   ` (6 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Make 'overwrite' a per-channel configuration rather than an evlist-wide
option. With this change an evlist can have two channels: a normal
channel and an overwritable channel.
perf_evlist__channel_for_evsel() ensures that events configured with
'overwrite' are placed in the overwritable channel.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
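The mapping from an event's 'overwrite' setting to a channel flag, and from
that flag to the mmap protection, can be modelled outside of perf. This is a
minimal sketch only: the flag values are copied from evlist.h as of this
patch, while channel_flags_for(), channel_prot() and channel_prot_example()
are hypothetical names that stand in for perf_evlist__channel_for_evsel()
and the new code in __perf_evlist__mmap().

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Flag values copied from enum perf_evlist_mmap_flag in this patch. */
enum channel_flag {
	CHANNEL_ENABLED = 1,
	CHANNEL_RDONLY  = 2,
};

/* Hypothetical stand-in for perf_evlist__channel_for_evsel(). */
static unsigned long channel_flags_for(bool overwrite)
{
	unsigned long flags = 0;

	if (overwrite)
		flags |= CHANNEL_RDONLY;
	return flags;
}

/* Hypothetical stand-in for the prot selection in __perf_evlist__mmap(). */
static int channel_prot(unsigned long flags)
{
	int prot = PROT_READ;

	/* Only a writable channel lets user space update the tail pointer. */
	if (!(flags & CHANNEL_RDONLY))
		prot |= PROT_WRITE;
	return prot;
}

/* Usage: an overwrite event maps read-only, a normal event read-write. */
void channel_prot_example(void)
{
	assert(channel_prot(channel_flags_for(true)) == PROT_READ);
	assert(channel_prot(channel_flags_for(false)) ==
	       (PROT_READ | PROT_WRITE));
}
```

Keeping the protection a pure function of the channel flags is what allows
the 'prot' member to be dropped from struct mmap_params.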
 tools/perf/builtin-record.c |  2 +-
 tools/perf/util/evlist.c    | 42 +++++++++++++++++++++++++++---------------
 tools/perf/util/evlist.h    |  5 ++---
 tools/perf/util/evsel.h     |  1 +
 4 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index fee5fd2..fafcee7 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -356,7 +356,7 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
+	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
 		if (errno == EPERM) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9187747..dc00840 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -731,7 +731,7 @@ union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
 		return NULL;
 
 	head = perf_mmap__read_head(md);
-	if (evlist->overwrite) {
+	if (perf_evlist__channel_check(evlist, channel, RDONLY)) {
 		/*
 		 * If we're further behind than half the buffer, there's a chance
 		 * the writer will bite our tail and mess up the samples under us.
@@ -820,7 +820,7 @@ void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
 		return;
 	}
 
-	if (!evlist->overwrite) {
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY)) {
 		u64 old = md->prev;
 
 		perf_mmap__write_tail(md, old);
@@ -918,7 +918,6 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 }
 
 struct mmap_params {
-	int prot;
 	int mask;
 	struct auxtrace_mmap_params auxtrace_mp;
 };
@@ -926,6 +925,15 @@ struct mmap_params {
 static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 			       struct mmap_params *mp, int fd)
 {
+	int channel = perf_evlist__idx_channel(evlist, idx);
+	int prot = PROT_READ;
+
+	if (channel < 0)
+		return -1;
+
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY))
+		prot |= PROT_WRITE;
+
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
 	 * make sure we don't prevent tools from consuming every last event in
@@ -942,7 +950,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	atomic_set(&evlist->mmap[idx].refcnt, 2);
 	evlist->mmap[idx].prev = 0;
 	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
+	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, prot,
 				      MAP_SHARED, fd, 0);
 	if (evlist->mmap[idx].base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
@@ -959,9 +967,13 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 }
 
 static unsigned long
-perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
+perf_evlist__channel_for_evsel(struct perf_evsel *evsel)
 {
-	return 0;
+	unsigned long flag = 0;
+
+	if (evsel->overwrite)
+		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	return flag;
 }
 
 static int
@@ -1211,11 +1223,10 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * perf_evlist__mmap_ex - Create mmaps to receive events.
  * @evlist: list of events
  * @pages: map length in pages
- * @overwrite: overwrite older events?
  * @auxtrace_pages - auxtrace map length in pages
  * @auxtrace_overwrite - overwrite older auxtrace data?
  *
- * If @overwrite is %false the user needs to signal event consumption using
+ * For writable channel, the user needs to signal event consumption using
  * perf_mmap__write_tail().  Using perf_evlist__mmap_read() does this
  * automatically.
  *
@@ -1225,16 +1236,13 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * Return: %0 on success, negative error code otherwise.
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite)
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite)
 {
 	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
-	struct mmap_params mp = {
-		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
-	};
+	struct mmap_params mp;
 
 	err = perf_evlist__channel_complete(evlist);
 	if (err)
@@ -1246,7 +1254,6 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
@@ -1270,8 +1277,13 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	struct perf_evsel *evsel;
+
 	perf_evlist__channel_reset(evlist);
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	evlist__for_each(evlist, evsel)
+		evsel->overwrite = overwrite;
+
+	return perf_evlist__mmap_ex(evlist, pages, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b652587..21a8b85 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -23,6 +23,7 @@ struct record_opts;
 #define PERF_EVLIST__NR_CHANNELS	2
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+	PERF_EVLIST__CHANNEL_RDONLY	= 2,
 };
 
 /**
@@ -45,7 +46,6 @@ struct perf_evlist {
 	int		 nr_entries;
 	int		 nr_groups;
 	int		 nr_mmaps;
-	bool		 overwrite;
 	bool		 enabled;
 	bool		 has_user_cpus;
 	size_t		 mmap_len;
@@ -203,8 +203,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 				  int unset);
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite);
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite);
 void perf_evlist__munmap(struct perf_evlist *evlist);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 022fcff..8932a5c 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -115,6 +115,7 @@ struct perf_evsel {
 	bool			tracking;
 	bool			per_pkg;
 	bool			precise_max;
+	bool			overwrite;
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
-- 
1.8.3.4

* [PATCH 47/53] perf record: Don't read from and poll overwrite channel
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (45 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 46/53] perf tools: Squash overwrite setting into channel Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 48/53] perf tools: Enable overwrite settings Wang Nan
                   ` (5 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Reading from an overwritable ring buffer is unreliable. Also, there's
no need to poll an overwritable channel because we don't consume data
from it. Only select POLLHUP and POLLERR events for such a channel.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
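The event mask that ends up in the pollfd entry can be sketched in
isolation. This is an illustration under assumptions, not the perf code:
channel_poll_events() and channel_poll_example() are hypothetical names,
and the system_wide case is left out because such events are not added to
the pollfd array at all.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>

/*
 * Hypothetical helper: a read-only (overwritable) channel drops POLLIN,
 * so data arriving in it never wakes up the reader. POLLERR and POLLHUP
 * are kept so that a closed or failed fd is still noticed.
 */
static short channel_poll_events(bool rdonly)
{
	short revent = rdonly ? 0 : POLLIN;

	return revent | POLLERR | POLLHUP;
}

/* Usage: compare the masks of a normal and an overwritable channel. */
void channel_poll_example(void)
{
	assert(channel_poll_events(false) & POLLIN);
	assert(!(channel_poll_events(true) & POLLIN));
	assert(channel_poll_events(true) & POLLHUP);
}
```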
 tools/perf/builtin-record.c | 15 ++++++++++++++-
 tools/perf/util/evlist.c    | 27 +++++++++++++++++++++++----
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index fafcee7..e55a23f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -453,6 +453,19 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static bool record__mmap_should_read(struct record *rec, int idx)
+{
+	int channel = -1;
+
+	if (!rec->evlist->mmap[idx].base)
+		return false;
+	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
+		return false;
+	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int record__mmap_read_all(struct record *rec)
 {
 	u64 bytes_written = rec->bytes_written;
@@ -463,7 +476,7 @@ static int record__mmap_read_all(struct record *rec)
 	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
-		if (rec->evlist->mmap[i].base) {
+		if (record__mmap_should_read(rec, i)) {
 			if (record__mmap_read(rec, i) != 0) {
 				rc = -1;
 				goto out;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index dc00840..0511fd2 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -461,9 +461,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
 {
-	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
 	 * Save the idx so that when we filter out fds POLLHUP'ed we can
 	 * close the associated evlist->mmap[] entry.
@@ -479,7 +479,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1);
+	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
@@ -1007,6 +1007,22 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
+static bool
+perf_evlist__should_poll(struct perf_evlist *evlist,
+			 struct perf_evsel *evsel,
+			 int channel, int idx)
+{
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
+
+	if (err)
+		return false;
+	if (evsel->system_wide)
+		return false;
+	if (perf_evlist__channel_check(evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *outputs)
@@ -1015,6 +1031,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 
 	evlist__for_each(evlist, evsel) {
 		int fd, channel, idx, err;
+		short revent = POLLIN;
 
 		channel = perf_evlist__channel_find(evlist, evsel, false);
 		if (channel < 0) {
@@ -1044,6 +1061,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 			perf_evlist__mmap_get(evlist, idx);
 		}
 
+		if (!perf_evlist__should_poll(evlist, evsel, channel, idx))
+			revent = 0;
 		/*
 		 * The system_wide flag causes a selected event to be opened
 		 * always without a pid.  Consequently it will never get a
@@ -1052,7 +1071,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
-- 
1.8.3.4

* [PATCH 48/53] perf tools: Enable overwrite settings
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (46 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 47/53] perf record: Don't read from and poll overwrite channel Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 49/53] perf tools: Consider TAILSIZE bit when calculate is_pos Wang Nan
                   ` (4 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

This patch adds the following option and config terms:

 # perf record --overwrite ...

   Put events into overwrite mode globally;

 # perf record --event cycles/overwrite/ ...
 # perf record --event cycles/no-overwrite/ ...

   Set individual events to overwrite or no-overwrite mode.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
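How the two terms and the global option combine can be sketched as below.
This is a simplified model under assumptions: struct term, the enum and
both helpers are hypothetical stand-ins, a bare term is assumed to carry
the numeric value 1, and a per-event term is assumed to replace the global
default because config terms are applied after the record options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for PARSE_EVENTS__TERM_TYPE_OVERWRITE / _NOOVERWRITE. */
enum term_type { TERM_OVERWRITE, TERM_NOOVERWRITE };

struct term {
	enum term_type type;
	unsigned long long num;	/* assumed to be 1 for a bare term */
};

/* Mirrors the two ADD_CONFIG_TERM() cases added to parse-events.c. */
static bool term_overwrite_value(const struct term *t)
{
	if (t->type == TERM_OVERWRITE)
		return t->num ? true : false;
	return t->num ? false : true;
}

/*
 * Hypothetical helper: the --overwrite value is the default and a
 * per-event term, when present, replaces it.
 */
static bool effective_overwrite(bool global, const struct term *t)
{
	return t ? term_overwrite_value(t) : global;
}

/* Usage: cycles/no-overwrite/ opts out even when --overwrite is given. */
void overwrite_term_example(void)
{
	struct term no_overwrite = { TERM_NOOVERWRITE, 1 };

	assert(effective_overwrite(true, &no_overwrite) == false);
	assert(effective_overwrite(true, NULL) == true);
}
```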
 tools/perf/builtin-record.c    |  1 +
 tools/perf/perf.h              |  1 +
 tools/perf/util/evsel.c        |  4 ++++
 tools/perf/util/evsel.h        |  2 ++
 tools/perf/util/parse-events.c | 14 ++++++++++++++
 tools/perf/util/parse-events.h |  4 +++-
 tools/perf/util/parse-events.l |  2 ++
 7 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e55a23f..8e56f92 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1267,6 +1267,7 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 		     "number of mmap data pages and AUX area tracing mmap pages",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 90129ac..71f305b 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -58,6 +58,7 @@ struct record_opts {
 	bool	     full_auxtrace;
 	bool	     auxtrace_snapshot_mode;
 	bool	     record_switch_events;
+	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f1b633e..6932b8b 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -670,6 +670,9 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			 */
 			attr->inherit = term->val.inherit ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+			evsel->overwrite = term->val.overwrite ? 1 : 0;
+			break;
 		default:
 			break;
 		}
@@ -745,6 +748,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
+	evsel->overwrite    = opts->overwrite;
 
 	perf_evsel__set_sample_bit(evsel, IP);
 	perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8932a5c..c76e385 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -44,6 +44,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
 	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_INHERIT,
+	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -57,6 +58,7 @@ struct perf_evsel_config_term {
 		char	*callgraph;
 		u64	stack_user;
 		bool	inherit;
+		bool	overwrite;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 03d18f4..c1d4f39 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -855,6 +855,12 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
@@ -892,6 +898,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
 	case PARSE_EVENTS__TERM_TYPE_INHERIT:
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -961,6 +969,12 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 			ADD_CONFIG_TERM(INHERIT, inherit, term->val.num ? 0 : 1);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index c34615f..29cc804 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -68,7 +68,9 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_CALLGRAPH,
 	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
-	PARSE_EVENTS__TERM_TYPE_INHERIT
+	PARSE_EVENTS__TERM_TYPE_INHERIT,
+	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
+	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 };
 
 struct parse_events_array {
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 27d567f..2ef6f96 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -202,6 +202,8 @@ call-graph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
 stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
 inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
+overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
+no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
1.8.3.4

* [PATCH 49/53] perf tools: Consider TAILSIZE bit when calculate is_pos
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (47 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 48/53] perf tools: Enable overwrite settings Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 50/53] perf tools: Set tailsize attribute bit for overwrite events Wang Nan
                   ` (3 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

evsel->is_pos indicates the location of the event id within an event,
counted backward from the end. It is used to find the id for tracking
events (mmap, exit...). If TAILSIZE is selected, this location must be
shifted accordingly.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
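A worked example of the position calculation, as a sketch under
assumptions: the four sample bits use the values from the perf_event ABI
header, SAMPLE_TAILSIZE is a placeholder value (the real bit comes from
the kernel patch in this series), and the CPU and STREAM_ID steps are not
visible in the hunk below and are reproduced from memory of evsel.c.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_ID		(1ULL << 6)
#define SAMPLE_CPU		(1ULL << 7)
#define SAMPLE_STREAM_ID	(1ULL << 9)
#define SAMPLE_IDENTIFIER	(1ULL << 16)
/* Placeholder only, not the real PERF_SAMPLE_TAILSIZE value. */
#define SAMPLE_TAILSIZE		(1ULL << 62)

/* Position of the id, in u64 words counted backward from the event end. */
static int calc_is_pos(uint64_t sample_type)
{
	/* The tail size is the last word, so everything moves back by one. */
	int idx = 1 + ((sample_type & SAMPLE_TAILSIZE) ? 1 : 0);

	if (sample_type & SAMPLE_IDENTIFIER)
		return idx;
	if (!(sample_type & SAMPLE_ID))
		return -1;
	if (sample_type & SAMPLE_CPU)
		idx += 1;
	if (sample_type & SAMPLE_STREAM_ID)
		idx += 1;
	return idx;
}

/* Usage: with IDENTIFIER set, the id moves from word 1 to word 2. */
void calc_is_pos_example(void)
{
	assert(calc_is_pos(SAMPLE_IDENTIFIER) == 1);
	assert(calc_is_pos(SAMPLE_IDENTIFIER | SAMPLE_TAILSIZE) == 2);
}
```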
 tools/perf/util/evsel.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6932b8b..c59ea34 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -144,10 +144,10 @@ static int __perf_evsel__calc_id_pos(u64 sample_type)
  */
 static int __perf_evsel__calc_is_pos(u64 sample_type)
 {
-	int idx = 1;
+	int idx = 1 + (sample_type & PERF_SAMPLE_TAILSIZE ? 1 : 0);
 
 	if (sample_type & PERF_SAMPLE_IDENTIFIER)
-		return 1;
+		return idx;
 
 	if (!(sample_type & PERF_SAMPLE_ID))
 		return -1;
-- 
1.8.3.4

* [PATCH 50/53] perf tools: Set tailsize attribute bit for overwrite events
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (48 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 49/53] perf tools: Consider TAILSIZE bit when calculate is_pos Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 51/53] perf record: Read from tailsize ring buffer Wang Nan
                   ` (2 subsequent siblings)
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

PERF_SAMPLE_TAILSIZE appends the size of an event to the end of that
event in the ring buffer, which makes reading from an overwrite ring
buffer possible. This patch sets that bit if evsel->overwrite is
selected explicitly by the user. Overwrite and tailsize are still
controlled separately for legacy read-only mmap users (most of them are
in perf/tests).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c |  2 ++
 tools/perf/util/evlist.h |  1 +
 tools/perf/util/evsel.c  | 28 ++++++++++++++++++++++++++++
 tools/perf/util/evsel.h  |  1 +
 4 files changed, 32 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 0511fd2..510e960 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -973,6 +973,8 @@ perf_evlist__channel_for_evsel(struct perf_evsel *evsel)
 
 	if (evsel->overwrite)
 		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	if (evsel->tailsize)
+		flag |= PERF_EVLIST__CHANNEL_TAILSIZE;
 	return flag;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 21a8b85..4dfcd67 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -24,6 +24,7 @@ struct record_opts;
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 	PERF_EVLIST__CHANNEL_RDONLY	= 2,
+	PERF_EVLIST__CHANNEL_TAILSIZE	= 4,
 };
 
 /**
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c59ea34..ae69a85 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -671,13 +671,33 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			attr->inherit = term->val.inherit ? 1 : 0;
 			break;
 		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+			/*
+			 * Let /overwrite/ control tailsize and overwrite
+			 * simultaneously, because /overwrite/ can only be
+			 * passed explicitly by the user. In that case the user
+			 * should be able to read from that event, so tailsize
+			 * must be set.
+			 *
+			 * (overwrite && !tailsize) can happen only when
+			 * perf_evlist__mmap() is called with overwrite == true.
+			 * In that case there's no chance to pass /overwrite/.
+			 */
 			evsel->overwrite = term->val.overwrite ? 1 : 0;
+			evsel->tailsize = term->val.overwrite ? 1 : 0;
 			break;
 		default:
 			break;
 		}
 	}
 
+	/*
+	 * Set tailsize sample bit after config term processing because
+	 * it is possible to set overwrite globally, without config
+	 * terms.
+	 */
+	if (evsel->tailsize)
+		perf_evsel__set_sample_bit(evsel, TAILSIZE);
+
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
 	if ((callgraph_buf != NULL) || (dump_size > 0)) {
 
@@ -748,7 +768,15 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
+
+	/*
+	 * opts->overwrite can only be set by the user.
+	 * Always keep evsel->overwrite == evsel->tailsize here.
+	 * (evsel->overwrite && !evsel->tailsize) can only happen
+	 * when calling perf_evlist__mmap() with overwrite == true.
+	 */
 	evsel->overwrite    = opts->overwrite;
+	evsel->tailsize	    = opts->overwrite;
 
 	perf_evsel__set_sample_bit(evsel, IP);
 	perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index c76e385..d93ee02 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -118,6 +118,7 @@ struct perf_evsel {
 	bool			per_pkg;
 	bool			precise_max;
 	bool			overwrite;
+	bool			tailsize;
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
-- 
1.8.3.4

* [PATCH 51/53] perf record: Read from tailsize ring buffer
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (49 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 50/53] perf tools: Set tailsize attribute bit for overwrite events Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 52/53] perf record: Toggle tailsize ring buffer for reading Wang Nan
  2016-01-11 13:48 ` [PATCH 53/53] perf record: Allow generate tracking events at the end of output Wang Nan
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

tailsize_rb_find_start() is introduced to find the first available event
in a tailsize ring buffer by walking backward through the tail sizes.
Events with the '/overwrite/' setting can now be read, and
record__mmap_should_read() is changed accordingly.

Reading an active tailsize ring buffer is unsafe. A global flag,
tailsize_evt_stopped, is introduced into 'struct record'. For a tailsize
channel, record__mmap_should_read() returns true only if
tailsize_evt_stopped is true. A following patch turns off the events
attached to the tailsize ring buffer and toggles this flag.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
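The backward walk can be tried on a toy buffer in user space. The sketch
below is a simplified model, not the code from this patch: struct
evt_header, rb_write() and rb_find_start() are hypothetical names, a
record is just a header plus a trailing u64 holding the record size, and
the walk stops before touching bytes that have already been overwritten,
which is an ordering of checks chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct perf_event_header. */
struct evt_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* whole record, including the trailing u64 */
};

/* Append one record of 'size' bytes at *head; offsets wrap on 'mask'. */
static void rb_write(unsigned char *buf, uint64_t mask, uint64_t *head,
		     uint16_t size)
{
	struct evt_header hdr = { .type = 0, .misc = 0, .size = size };
	uint64_t tail = size;

	assert(size % 8 == 0 && size >= sizeof(hdr) + sizeof(tail));
	memcpy(buf + (*head & mask), &hdr, sizeof(hdr));
	memcpy(buf + ((*head + size - sizeof(tail)) & mask), &tail,
	       sizeof(tail));
	*head += size;
}

/* Walk back from 'head' to the oldest record that is still intact. */
static int rb_find_start(const unsigned char *buf, uint64_t head,
			 uint64_t mask, uint64_t *start)
{
	uint64_t buf_size = mask + 1;
	uint64_t pos = head;

	/* Stop at offset 0 or once the previous tail word is overwritten. */
	while (pos > 0 && pos + buf_size > head) {
		struct evt_header hdr;
		uint64_t size;

		memcpy(&size, buf + ((pos - sizeof(size)) & mask),
		       sizeof(size));
		if (size == 0 || size % 8 != 0 || size > pos)
			return -1;	/* corrupted tail size */
		if (pos - size + buf_size < head)
			break;		/* record partly overwritten */
		memcpy(&hdr, buf + ((pos - size) & mask), sizeof(hdr));
		if (hdr.size != size)
			return -1;	/* header and tail disagree */
		pos -= size;
	}
	*start = pos;
	return 0;
}
```

With a 64-byte buffer, three 16-byte records give a start of 0; writing a
fourth record of 24 bytes wraps over the first record, and the walk then
stops at offset 16.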
 tools/perf/builtin-record.c | 69 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8e56f92..6c8905b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -56,6 +56,7 @@ struct record {
 	bool			no_buildid_cache_set;
 	bool			timestamp_filename;
 	bool			switch_output;
+	bool			tailsize_evt_stopped;
 	unsigned long long	samples;
 };
 
@@ -79,6 +80,63 @@ static int process_synthesized_event(struct perf_tool *tool,
 	return record__write(rec, event, event->header.size);
 }
 
+static int
+tailsize_rb_find_start(void *buf, u64 head, int mask, u64 *p_evt_head)
+{
+	int buf_size = mask + 1;
+	u64 evt_head = head;
+	u64 *pevt_size;
+
+	pr_debug("start reading tailsize, head=%"PRId64"\n", head);
+	while (true) {
+		struct perf_event_header *pheader;
+
+		pevt_size = buf + ((evt_head - sizeof(*pevt_size)) & mask);
+		pr_debug4("read tailsize: size: %"PRId64"\n", *pevt_size);
+
+		if (*pevt_size % sizeof(u64) != 0) {
+			pr_warning("Tailsize ring buffer corrupted: unaligned\n");
+			return -1;
+		}
+
+		if (!*pevt_size) {
+			if (evt_head) {
+				pr_warning("Tailsize ring buffer corrupted: size is 0 but evt_head (0x%"PRIx64") is not 0\n",
+					   evt_head);
+				return -1;
+			}
+			*p_evt_head = evt_head;
+			return 0;
+		}
+
+		if (evt_head < *pevt_size) {
+			pr_warning("Tailsize ring buffer corrupted: head (%"PRId64") < size (%"PRId64")\n",
+				   evt_head, *pevt_size);
+			return -1;
+		}
+
+		evt_head -= *pevt_size;
+
+		if (evt_head + buf_size < head) {
+			evt_head += *pevt_size;
+			pr_debug("Finish reading tailsize buffer, evt_head=%"PRIx64", head=%"PRIx64"\n",
+				 evt_head, head);
+			*p_evt_head = evt_head;
+			return 0;
+		}
+
+		pheader = (struct perf_event_header *)(buf + (evt_head & mask));
+		if (pheader->size != *pevt_size) {
+			pr_warning("Tailsize ring buffer corrupted: found size mismatch: %d vs %"PRId64"\n",
+				   pheader->size, *pevt_size);
+			return -1;
+		}
+	}
+
+	pr_warning("ERROR: shouldn't get here\n");
+	return -1;
+}
+
 static int record__mmap_read(struct record *rec, int idx)
 {
 	struct perf_mmap *md = &rec->evlist->mmap[idx];
@@ -88,10 +146,17 @@ static int record__mmap_read(struct record *rec, int idx)
 	unsigned long size;
 	void *buf;
 	int rc = 0;
+	int channel;
 
 	if (old == head)
 		return 0;
 
+	channel = perf_evlist__idx_channel(rec->evlist, idx);
+	if (perf_evlist__channel_check(rec->evlist, channel, TAILSIZE)) {
+		if (tailsize_rb_find_start(data, head, md->mask, &old))
+			return -1;
+	}
+
 	rec->samples++;
 
 	size = head - old;
@@ -462,7 +527,8 @@ static bool record__mmap_should_read(struct record *rec, int idx)
 	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
 		return false;
 	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
-		return false;
+		if (perf_evlist__channel_check(rec->evlist, channel, TAILSIZE))
+			return rec->tailsize_evt_stopped;
 	return true;
 }
 
@@ -1226,6 +1292,7 @@ static struct record record = {
 		.mmap2		= perf_event__process_mmap2,
 		.ordered_events	= true,
 	},
+	.tailsize_evt_stopped	= false,
 };
 
 const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
-- 
1.8.3.4

* [PATCH 52/53] perf record: Toggle tailsize ring buffer for reading
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (50 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 51/53] perf record: Read from tailsize ring buffer Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  2016-01-11 13:48 ` [PATCH 53/53] perf record: Allow generate tracking events at the end of output Wang Nan
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Toggle the tailsize_evt_stopped flag after receiving 'done' or a switch
output request. After this patch it is possible to trigger a dump with
SIGUSR2 when something happens.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6c8905b..6ec0529 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -591,6 +591,26 @@ static void record__init_features(struct record *rec)
 }
 
 static void
+record__toggle_tailsize_evsels(struct record *rec, bool stop)
+{
+	struct perf_evsel *pos;
+	struct perf_evlist *evlist = rec->evlist;
+
+	evlist__for_each(evlist, pos) {
+		if (!pos->tailsize)
+			continue;
+		if (!pos->overwrite)
+			continue;
+		if (stop)
+			perf_evsel__disable(pos);
+		else
+			perf_evsel__enable(pos);
+	}
+
+	rec->tailsize_evt_stopped = stop;
+}
+
+static void
 record__finish_output(struct record *rec)
 {
 	struct perf_data_file *file = &rec->file;
@@ -925,6 +945,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
+		if (switch_output_started || done)
+			record__toggle_tailsize_evsels(rec, true);
+
 		if (record__mmap_read_all(rec) < 0) {
 			auxtrace_snapshot_disable();
 			err = -1;
@@ -943,7 +966,20 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 
 		if (switch_output_started) {
+			/*
+			 * SIGUSR2 was raised during or after
+			 * record__mmap_read_all(): loop and read again.
+			 */
+			if (!rec->tailsize_evt_stopped)
+				continue;
+
 			switch_output_started = 0;
+			/*
+			 * Re-enable events in the tailsize ring buffer after
+			 * record__mmap_read_all(): we have collected
+			 * data from it.
+			 */
+			record__toggle_tailsize_evsels(rec, false);
 
 			if (!quiet)
 				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
-- 
1.8.3.4

* [PATCH 53/53] perf record: Allow generate tracking events at the end of output
  2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
                   ` (51 preceding siblings ...)
  2016-01-11 13:48 ` [PATCH 52/53] perf record: Toggle tailsize ring buffer for reading Wang Nan
@ 2016-01-11 13:48 ` Wang Nan
  52 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-11 13:48 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Wang Nan,
	He Kuang, Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Before this patch, tracking events are generated from information in
/proc ahead of all samples. With the introduction of overwrite evsels
in perf record, this becomes inconvenient: 'perf record' can now run
as a daemon for several hours and capture only the last snapshot when
it receives SIGUSR2. The tracking events generated at the head of the
output 'perf.data' are then too old, while most of the tracking events
emitted while 'perf record' was running have been dropped.

This patch generates tracking events at the end of the output, so the
resulting event series better reflects the state of the system when
SIGUSR2 was received.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 62 +++++++++++++++++++++++++++++++--------------
 1 file changed, 43 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6ec0529..c1023ce 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -56,6 +56,7 @@ struct record {
 	bool			no_buildid_cache_set;
 	bool			timestamp_filename;
 	bool			switch_output;
+	bool			tail_tracking;
 	bool			tailsize_evt_stopped;
 	unsigned long long	samples;
 };
@@ -639,6 +640,26 @@ record__finish_output(struct record *rec)
 
 static int record__synthesize(struct record *rec);
 
+static void record__synthesize_target(struct record *rec)
+{
+	if (target__none(&rec->opts.target)) {
+		struct {
+			struct thread_map map;
+			struct thread_map_data map_data;
+		} thread_map;
+
+		thread_map.map.nr = 1;
+		thread_map.map.map[0].pid = rec->evlist->workload.pid;
+		thread_map.map.map[0].comm = NULL;
+		perf_event__synthesize_thread_map(&rec->tool,
+				&thread_map.map,
+				process_synthesized_event,
+				&rec->session->machines.host,
+				rec->opts.sample_address,
+				rec->opts.proc_map_timeout);
+	}
+}
+
 static int
 record__switch_output(struct record *rec, bool at_exit)
 {
@@ -648,6 +669,11 @@ record__switch_output(struct record *rec, bool at_exit)
 	/* Same Size:      "2015122520103046"*/
 	char timestamp[] = "InvalidTimestamp";
 
+	if (rec->tail_tracking) {
+		record__synthesize(rec);
+		record__synthesize_target(rec);
+	}
+
 	rec->samples = 0;
 	record__finish_output(rec);
 	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
@@ -674,23 +700,10 @@ record__switch_output(struct record *rec, bool at_exit)
 		machines__init(&rec->session->machines);
 		perf_session__create_kernel_maps(rec->session);
 		perf_session__set_id_hdr_size(rec->session);
-		record__synthesize(rec);
 
-		if (target__none(&rec->opts.target)) {
-			struct {
-				struct thread_map map;
-				struct thread_map_data map_data;
-			} thread_map;
-
-			thread_map.map.nr = 1;
-			thread_map.map.map[0].pid = rec->evlist->workload.pid;
-			thread_map.map.map[0].comm = NULL;
-			perf_event__synthesize_thread_map(&rec->tool,
-					&thread_map.map,
-					process_synthesized_event,
-					&rec->session->machines.host,
-					rec->opts.sample_address,
-					rec->opts.proc_map_timeout);
+		if (!rec->tail_tracking) {
+			record__synthesize(rec);
+			record__synthesize_target(rec);
 		}
 	}
 	return fd;
@@ -886,9 +899,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	err = record__synthesize(rec);
-	if (err < 0)
-		goto out_child;
+	if (!rec->tail_tracking) {
+		err = record__synthesize(rec);
+		if (err < 0)
+			goto out_child;
+	}
 
 	if (rec->realtime_prio) {
 		struct sched_param param;
@@ -1021,6 +1036,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			disabled = true;
 		}
 	}
+
+	if (rec->tail_tracking) {
+		err = record__synthesize(rec);
+		if (err < 0)
+			goto out_child;
+	}
+
 	auxtrace_snapshot_disable();
 
 	if (forks && workload_exec_errno) {
@@ -1446,6 +1468,8 @@ struct option __record_options[] = {
 		    "append timestamp to output filename"),
 	OPT_BOOLEAN(0, "switch-output", &record.switch_output,
 		    "Switch output when receive SIGUSR2"),
+	OPT_BOOLEAN(0, "tail-tracking", &record.tail_tracking,
+		    "Generate tracking events at the end of output"),
 	OPT_END()
 };
 
-- 
1.8.3.4

* Re: [PATCH 07/53] tools: Move Makefile.arch from perf/config to tools/scripts
  2016-01-11 13:47 ` [PATCH 07/53] tools: Move Makefile.arch from perf/config to tools/scripts Wang Nan
@ 2016-01-11 13:52   ` Wangnan (F)
  2016-01-11 14:10     ` Arnaldo Carvalho de Melo
  2016-01-12 10:10   ` [tip:perf/urgent] tools: Move Makefile.arch from perf/ config " tip-bot for Wang Nan
  1 sibling, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-11 13:52 UTC (permalink / raw)
  To: acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Naveen N. Rao,
	Sukadev Bhattiprolu



On 2016/1/11 21:47, Wang Nan wrote:
> After this patch other directories can use this architecture detector
> without directly including it from perf's directory. Libbpf would
> utilize it to get proper $(ARCH) so it can receive correct uapi include
> directory.
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> [Add missing srctree definition in tests/make]
Hi Arnaldo, I guess you will be okay with providing your SOB, so I added
it here. You didn't provide it in your original code.

Thank you.

* Re: [PATCH 07/53] tools: Move Makefile.arch from perf/config to tools/scripts
  2016-01-11 13:52   ` Wangnan (F)
@ 2016-01-11 14:10     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 14:10 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Naveen N. Rao,
	Sukadev Bhattiprolu

Em Mon, Jan 11, 2016 at 09:52:38PM +0800, Wangnan (F) escreveu:
> 
> 
> On 2016/1/11 21:47, Wang Nan wrote:
> >After this patch other directories can use this architecture detector
> >without directly including it from perf's directory. Libbpf would
> >utilize it to get proper $(ARCH) so it can receive correct uapi include
> >directory.
> >
> >Signed-off-by: Wang Nan <wangnan0@huawei.com>
> >Signed-off-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> >[Add missing srctree definition in tests/make]
> Hi Arnaldo, I guess you will be okay with providing your SOB, so I added
> it here. You didn't provide it in your original code.

Sure, and I'll cherry pick this into perf/urgent, to get the PowerPC
build fixed.

- Arnaldo

* Re: [PATCH 42/53] perf record: Prevent reading invalid data in record__mmap_read
  2016-01-11 13:48 ` [PATCH 42/53] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
@ 2016-01-11 14:21   ` Sergei Shtylyov
  2016-01-11 15:00     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 124+ messages in thread
From: Sergei Shtylyov @ 2016-01-11 14:21 UTC (permalink / raw)
  To: Wang Nan, acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim

Hello.

On 01/11/2016 04:48 PM, Wang Nan wrote:

> When record__mmap_read() is asked for more data than the ring buffer
> can hold, drop that data to avoid accessing invalid memory.
>
> This can happen when reading from an overwritable ring buffer, which
> should be avoided. Check for it anyway, for robustness.
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>   tools/perf/builtin-record.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index b65b41f..3f58426 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -37,6 +37,7 @@
>   #include <unistd.h>
>   #include <sched.h>
>   #include <sys/mman.h>
> +#include <asm/bug.h>
>
>
>   struct record {
> @@ -94,6 +95,13 @@ static int record__mmap_read(struct record *rec, int idx)
>   	rec->samples++;
>
>   	size = head - old;
> +	if (size > (unsigned long)(md->mask) + 1) {
> +		WARN_ONCE(1, "WARNING: failed to keep up with mmap data. (warn only once)\n");

    WARNING is already printed by WARN*(), no?

[...]

MBR, Sergei
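
The size check quoted above can be modelled in isolation. The following is
only a minimal sketch, assuming 'head' and 'old' are free-running 64-bit byte
counters and the buffer size is a power of two ('mask' + 1);
mmap_span_valid() is a hypothetical helper name, not a function in perf:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if the span [old, head) fits into a ring buffer of
 * 'mask + 1' bytes, 0 if the reader fell behind and part of the
 * span has already been overwritten.
 *
 * Hypothetical helper: the name and signature are assumptions.
 */
static int mmap_span_valid(uint64_t head, uint64_t old, uint64_t mask)
{
	/* Unsigned subtraction stays correct when the counters wrap. */
	uint64_t size = head - old;

	/* The buffer size must be a power of two. */
	assert(((mask + 1) & mask) == 0);

	return size <= mask + 1;
}
```

A caller would skip the copy (and warn once) whenever this returns 0, which
is the behaviour the patch description asks for.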

* Re: [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist'
  2016-01-11 13:48 ` [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist' Wang Nan
@ 2016-01-11 14:25   ` Sergei Shtylyov
  2016-01-11 14:58     ` Arnaldo Carvalho de Melo
  2016-01-12 10:11   ` [tip:perf/urgent] perf test: Fix false TEST_OK result for ' perf " tip-bot for Wang Nan
  1 sibling, 1 reply; 124+ messages in thread
From: Sergei Shtylyov @ 2016-01-11 14:25 UTC (permalink / raw)
  To: Wang Nan, acme
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu

On 01/11/2016 04:48 PM, Wang Nan wrote:

> Commit 71d6de64feddd4b455555326fba2111b3006d9e0 ('perf test: Fix hist
> testcases when kptr_restrict is on') solves a double free problem when

    You didn't run this patch thru scripts/checkpatch.pl, I guess? A certain 
commit citing style is enforced now, and yours doesn't quite match it...

> 'perf test hist' calling setup_fake_machine(). However, the result is
> still incorrect. For example:
>
>   $ ./perf test -v 'filtering hist entries'
>   25: Test filtering hist entries                              :
>   --- start ---
>   test child forked, pid 4186
>   Cannot create kernel maps
>   test child finished with 0
>   ---- end ----
>   Test filtering hist entries: Ok
>
> In this case the body of this test is not executed at all, but the
> result is 'Ok'.
>
> Actually, in setup_fake_machine() there's no need to create real kernel
> maps. What we want is the fake maps. This patch removes the
> machine__create_kernel_maps() in setup_fake_machine(), so it won't be
> affected by kptr_restrict setting.
>
> Test result:
>
>   $ cat /proc/sys/kernel/kptr_restrict
>   1
>   $ ~/perf test -v hist
>   15: Test matching and linking multiple hists                 :
>   --- start ---
>   test child forked, pid 24031
>   test child finished with 0
>   ---- end ----
>   Test matching and linking multiple hists: Ok
>   [SNIP]
>
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Suggested-by: Namhyung Kim <namhyung@kernel.org>
> Acked-by: Namhyung Kim <namhyung@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
[...]

MBR, Sergei

* Re: [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist'
  2016-01-11 14:25   ` Sergei Shtylyov
@ 2016-01-11 14:58     ` Arnaldo Carvalho de Melo
  2016-01-11 15:32       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 14:58 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Wang Nan, acme, linux-kernel, pi3orama, lizefan, netdev, davem,
	Jiri Olsa, Masami Hiramatsu

Em Mon, Jan 11, 2016 at 05:25:48PM +0300, Sergei Shtylyov escreveu:
> On 01/11/2016 04:48 PM, Wang Nan wrote:
> 
> >Commit 71d6de64feddd4b455555326fba2111b3006d9e0 ('perf test: Fix hist
> >testcases when kptr_restrict is on') solves a double free problem when
> 
>    You didn't run this patch thru scripts/checkpatch.pl, I guess? A
> certain commit citing style is enforced now, and yours doesn't quite
> match it...

Which is? /me goes to read checkpatch.pl...

- Arnaldo
 
> >'perf test hist' calling setup_fake_machine(). However, the result is
> >still incorrect. For example:
> >
> >  $ ./perf test -v 'filtering hist entries'
> >  25: Test filtering hist entries                              :
> >  --- start ---
> >  test child forked, pid 4186
> >  Cannot create kernel maps
> >  test child finished with 0
> >  ---- end ----
> >  Test filtering hist entries: Ok
> >
> >In this case the body of this test is not executed at all, but the
> >result is 'Ok'.
> >
> >Actually, in setup_fake_machine() there's no need to create real kernel
> >maps. What we want is the fake maps. This patch removes the
> >machine__create_kernel_maps() in setup_fake_machine(), so it won't be
> >affected by kptr_restrict setting.
> >
> >Test result:
> >
> >  $ cat /proc/sys/kernel/kptr_restrict
> >  1
> >  $ ~/perf test -v hist
> >  15: Test matching and linking multiple hists                 :
> >  --- start ---
> >  test child forked, pid 24031
> >  test child finished with 0
> >  ---- end ----
> >  Test matching and linking multiple hists: Ok
> >  [SNIP]
> >
> >Signed-off-by: Wang Nan <wangnan0@huawei.com>
> >Suggested-by: Namhyung Kim <namhyung@kernel.org>
> >Acked-by: Namhyung Kim <namhyung@kernel.org>
> >Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> >Cc: Jiri Olsa <jolsa@kernel.org>
> >Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> [...]
> 
> MBR, Sergei

* Re: [PATCH 42/53] perf record: Prevent reading invalid data in record__mmap_read
  2016-01-11 14:21   ` Sergei Shtylyov
@ 2016-01-11 15:00     ` Arnaldo Carvalho de Melo
  2016-01-11 15:01       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 15:00 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Wang Nan, acme, linux-kernel, pi3orama, lizefan, netdev, davem,
	He Kuang, Jiri Olsa, Masami Hiramatsu, Namhyung Kim

Em Mon, Jan 11, 2016 at 05:21:44PM +0300, Sergei Shtylyov escreveu:
> On 01/11/2016 04:48 PM, Wang Nan wrote:
> >  	size = head - old;
> >+	if (size > (unsigned long)(md->mask) + 1) {
> >+		WARN_ONCE(1, "WARNING: failed to keep up with mmap data. (warn only once)\n");
> 
>    WARNING is already printed by WARN*(), no?

No, at least not in tools/include/asm/bug.h, perhaps include/asm/bug.h
has this now and tools/ drifted? Checking now...

- Arnaldo

* Re: [PATCH 42/53] perf record: Prevent reading invalid data in record__mmap_read
  2016-01-11 15:00     ` Arnaldo Carvalho de Melo
@ 2016-01-11 15:01       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 15:01 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Wang Nan, acme, linux-kernel, pi3orama, lizefan, netdev, davem,
	He Kuang, Jiri Olsa, Masami Hiramatsu, Namhyung Kim

Em Mon, Jan 11, 2016 at 01:00:07PM -0200, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 11, 2016 at 05:21:44PM +0300, Sergei Shtylyov escreveu:
> > On 01/11/2016 04:48 PM, Wang Nan wrote:
> > >  	size = head - old;
> > >+	if (size > (unsigned long)(md->mask) + 1) {
> > >+		WARN_ONCE(1, "WARNING: failed to keep up with mmap data. (warn only once)\n");
> > 
> >    WARNING is already printed by WARN*(), no?
> 
> No, at least not in tools/include/asm/bug.h, perhaps include/asm/bug.h
> has this now and tools/ drifted? Checking now...

Indeed, need to bring it closer together again...

- Arnaldo

* Re: [PATCH 05/53] perf tools: Test correct path of perf in build-test
  2016-01-11 13:47 ` [PATCH 05/53] perf tools: Test correct path of perf " Wang Nan
@ 2016-01-11 15:24   ` Arnaldo Carvalho de Melo
  2016-01-11 22:06     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 15:24 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa, Namhyung Kim

Em Mon, Jan 11, 2016 at 01:47:56PM +0000, Wang Nan escreveu:
> If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
> will fail because perf resides in a different directory. Fix this by
> computing PERF_OUT according to 'O' and test correct output files.
> For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
> instead because the path is different from others ($(O)/perf vs
>  $(O)/tools/perf).

Ok, applying up to this patch I now manage to almost cleanly build it using O=,
see below, but seems that we have some race, as not all tests end up producing
such warnings.

[acme@felicio linux]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; make O=/tmp/build/perf -C tools/perf build-test
make: Entering directory `/home/acme/git/linux/tools/perf'
Testing Makefile
- make_no_libperl: cd . && make -f Makefile   DESTDIR=/tmp/tmp.m1nXBMqhSA NO_LIBPERL=1
find: ‘/tmp/build/perf/util/trace-event-scripting.o’: No such file or directory
find: ‘/tmp/build/perf/util/build-id.o’: No such file or directory
- make_no_libdw_dwarf_unwind: cd . && make -f Makefile   DESTDIR=/tmp/tmp.RB7Ile9C0b NO_LIBDW_DWARF_UNWIND=1
- make_no_backtrace: cd . && make -f Makefile   DESTDIR=/tmp/tmp.HeNpC0PW1O NO_BACKTRACE=1
find: ‘/tmp/build/perf/util/trace-event-scripting.o’: No such file or directory
find: ‘/tmp/build/perf/util/alias.o’: No such file or directory
- make_install_prefix: cd . && make -f Makefile   DESTDIR=/tmp/tmp.JPK5a72h53 install prefix=/tmp/krava
find: ‘/tmp/build/perf/libapi.a’: No such file or directory
- make_help: cd . && make -f Makefile   DESTDIR=/tmp/tmp.F3Z0qPtslS help
- make_doc: cd . && make -f Makefile   DESTDIR=/tmp/tmp.6a2HbvC2ej doc

* Re: [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist'
  2016-01-11 14:58     ` Arnaldo Carvalho de Melo
@ 2016-01-11 15:32       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 15:32 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Wang Nan, linux-kernel, pi3orama, lizefan, netdev, davem,
	Jiri Olsa, Masami Hiramatsu

Em Mon, Jan 11, 2016 at 12:58:37PM -0200, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 11, 2016 at 05:25:48PM +0300, Sergei Shtylyov escreveu:
> > On 01/11/2016 04:48 PM, Wang Nan wrote:
> > 
> > >Commit 71d6de64feddd4b455555326fba2111b3006d9e0 ('perf test: Fix hist
> > >testcases when kptr_restrict is on') solves a double free problem when
> > 
> >    You didn't run this patch thru scripts/checkpatch.pl, I guess? A
> > certain commit citing style is enforced now, and yours doesn't quite
> > match it...
> 
> Which is? /me goes to read checkpatch.pl...

So, this is it:

[acme@felicio linux]$ scripts/checkpatch.pl /wb/1.patch 
ERROR: Please use git commit description style 'commit <12+ chars of
sha1> ("<title line>")' - ie: 'Commit 71d6de64fedd ("perf test: Fix hist
testcases when kptr_restrict is on")'
#62: 
Commit 71d6de64feddd4b455555326fba2111b3006d9e0 ('perf test: Fix hist

total: 1 errors, 0 warnings, 11 lines checked

/wb/1.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.
[acme@felicio linux]$ 

Ok, matches what I use with this macro that I run in vim with ':!fixes'
after selecting the long commit hash:

#!/bin/bash

if [ $# -eq 1 ] ; then
	cset=$1
else
	read cset
fi
git log --oneline $cset | head -1 | sed -r 's/([^ ]+) (.*)/Fixes: \1 \("\2\")/g'

------------------------

And I have:

[acme@felicio linux]$ grep abbrev .git/config
	abbrev = 12

- Arnaldo
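
For reference, the citation style that checkpatch.pl asks for above can also
be produced programmatically. This is only an illustrative sketch in C,
assuming a 12-character abbreviation as in the 'abbrev = 12' setting;
format_commit_cite() and CITE_ABBREV are hypothetical names:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CITE_ABBREV 12	/* assumed abbreviation length, as in 'abbrev = 12' */

/*
 * Write 'commit <12 chars of sha1> ("<title>")' into 'out'.
 * Returns the snprintf()-style length, or -1 if 'sha1' is shorter
 * than the abbreviation length.
 *
 * Hypothetical helper, not part of any kernel script.
 */
static int format_commit_cite(char *out, size_t outsz,
			      const char *sha1, const char *title)
{
	assert(out != NULL && sha1 != NULL && title != NULL);

	if (strlen(sha1) < CITE_ABBREV)
		return -1;

	/* "%.*s" copies at most CITE_ABBREV characters of the hash. */
	return snprintf(out, outsz, "commit %.*s (\"%s\")",
			CITE_ABBREV, sha1, title);
}
```

Passing the full hash and the title line from the patch under review yields
the same abbreviated form that appears in the checkpatch.pl message.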

* Re: [PATCH 13/53] perf tools: Prevent calling machine__delete() on non-allocated machine
  2016-01-11 13:48 ` [PATCH 13/53] perf tools: Prevent calling machine__delete() on non-allocated machine Wang Nan
@ 2016-01-11 15:42   ` Arnaldo Carvalho de Melo
  2016-01-12  7:03     ` Wangnan (F)
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 15:42 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa,
	Masami Hiramatsu, Namhyung Kim

Em Mon, Jan 11, 2016 at 01:48:04PM +0000, Wang Nan escreveu:
> To prevent further commits from calling machine__delete() on a
> non-allocated 'struct machine' (which would cause memory corruption),
> this patch makes machine__init() record whether a machine structure is
> dynamically allocated, and warns if machine__delete() is called
> on the wrong kind of object.

Not sure on this one, I think I voiced this before, this seems like
something to be tested using some static analysis tool or even checking
if the address for the struct hitting machine__delete() is from malloc
or not.

I.e. if we do it here, we may have to do it to any other struct where we
allocate it in the stack or via malloc, and furthermore there are cases
where we embed a struct in another, when we would free just the main
struct but not the second, embedded one, that would need just calling
foo__exit() and not foo__delete().

- Arnaldo
 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/tests/vmlinux-kallsyms.c |  4 ++--
>  tools/perf/util/machine.c           | 13 ++++++++-----
>  tools/perf/util/machine.h           |  3 ++-
>  3 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/perf/tests/vmlinux-kallsyms.c b/tools/perf/tests/vmlinux-kallsyms.c
> index f0bfc9e..441e93d 100644
> --- a/tools/perf/tests/vmlinux-kallsyms.c
> +++ b/tools/perf/tests/vmlinux-kallsyms.c
> @@ -35,8 +35,8 @@ int test__vmlinux_matches_kallsyms(int subtest __maybe_unused)
>  	 * Init the machines that will hold kernel, modules obtained from
>  	 * both vmlinux + .ko files and from /proc/kallsyms split by modules.
>  	 */
> -	machine__init(&kallsyms, "", HOST_KERNEL_ID);
> -	machine__init(&vmlinux, "", HOST_KERNEL_ID);
> +	machine__init(&kallsyms, "", HOST_KERNEL_ID, false);
> +	machine__init(&vmlinux, "", HOST_KERNEL_ID, false);
>  
>  	/*
>  	 * Step 2:
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index ad79297..59a3c01 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1,3 +1,4 @@
> +#include <asm/bug.h>
>  #include "callchain.h"
>  #include "debug.h"
>  #include "event.h"
> @@ -23,7 +24,7 @@ static void dsos__init(struct dsos *dsos)
>  	pthread_rwlock_init(&dsos->lock, NULL);
>  }
>  
> -int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
> +int machine__init(struct machine *machine, const char *root_dir, pid_t pid, bool allocated)
>  {
>  	memset(machine, 0, sizeof(*machine));
>  	map_groups__init(&machine->kmaps, machine);
> @@ -65,6 +66,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
>  	}
>  
>  	machine->current_tid = NULL;
> +	machine->allocated = allocated;
>  
>  	return 0;
>  }
> @@ -74,7 +76,7 @@ struct machine *machine__new_host(void)
>  	struct machine *machine = malloc(sizeof(*machine));
>  
>  	if (machine != NULL) {
> -		machine__init(machine, "", HOST_KERNEL_ID);
> +		machine__init(machine, "", HOST_KERNEL_ID, true);
>  
>  		if (machine__create_kernel_maps(machine) < 0)
>  			goto out_delete;
> @@ -137,12 +139,13 @@ void machine__exit(struct machine *machine)
>  void machine__delete(struct machine *machine)
>  {
>  	machine__exit(machine);
> -	free(machine);
> +	WARN_ONCE((machine->allocated ? free(machine), 0 : -1),
> +		  "WARNING: deleting a non-allocated machine. Skip.\n");
>  }
>  
>  void machines__init(struct machines *machines)
>  {
> -	machine__init(&machines->host, "", HOST_KERNEL_ID);
> +	machine__init(&machines->host, "", HOST_KERNEL_ID, false);
>  	machines->guests = RB_ROOT;
>  	machines->symbol_filter = NULL;
>  }
> @@ -163,7 +166,7 @@ struct machine *machines__add(struct machines *machines, pid_t pid,
>  	if (machine == NULL)
>  		return NULL;
>  
> -	if (machine__init(machine, root_dir, pid) != 0) {
> +	if (machine__init(machine, root_dir, pid, true) != 0) {
>  		free(machine);
>  		return NULL;
>  	}
> diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
> index 2c2b443..24dfd46 100644
> --- a/tools/perf/util/machine.h
> +++ b/tools/perf/util/machine.h
> @@ -28,6 +28,7 @@ struct machine {
>  	pid_t		  pid;
>  	u16		  id_hdr_size;
>  	bool		  comm_exec;
> +	bool		  allocated;
>  	char		  *root_dir;
>  	struct rb_root	  threads;
>  	pthread_rwlock_t  threads_lock;
> @@ -131,7 +132,7 @@ void machines__set_symbol_filter(struct machines *machines,
>  void machines__set_comm_exec(struct machines *machines, bool comm_exec);
>  
>  struct machine *machine__new_host(void);
> -int machine__init(struct machine *machine, const char *root_dir, pid_t pid);
> +int machine__init(struct machine *machine, const char *root_dir, pid_t pid, bool allocated);
>  void machine__exit(struct machine *machine);
>  void machine__delete_threads(struct machine *machine);
>  void machine__delete(struct machine *machine);
> -- 
> 1.8.3.4

* Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-11 13:48 ` [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE Wang Nan
@ 2016-01-11 18:09   ` Alexei Starovoitov
  2016-01-12  5:33     ` Wangnan (F)
  2016-01-12 14:14   ` Peter Zijlstra
  1 sibling, 1 reply; 124+ messages in thread
From: Alexei Starovoitov @ 2016-01-11 18:09 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, linux-kernel, pi3orama, lizefan, netdev, davem,
	Adrian Hunter, Arnaldo Carvalho de Melo, David Ahern,
	Ingo Molnar, Peter Zijlstra, Yunlong Song

On Mon, Jan 11, 2016 at 01:48:18PM +0000, Wang Nan wrote:
> This patch introduces a PERF_SAMPLE_TAILSIZE flag which allows a size
> field to be attached at the end of a sample. The idea comes from [1]:
> with the size at the tail of an event, a user program reading
> from the ring buffer can parse events backward.
> 
> For example:
> 
>    head
>     |
>     V
>  +--+---+-------+----------+------+---+
>  |E6|...|   B  8|   C    11|  D  7|E..|
>  +--+---+-------+----------+------+---+
> 
> In this case, from the 'head' pointer provided by kernel, user program
> can first see '6' by (*(head - sizeof(u64))), then it can get the start
> pointer of record 'E', then it can read size and find start position
> of record D, C, B in similar way.

adding extra 8 bytes for every sample is quite unfortunate.
How about another idea:
. update data_tail pointer when head is about to overwrite it

Ex:
   head   data_tail
    |       |
    V       V
 +--+-------+-------+---+----+---+
 |E |  ...  |   B   | C |  D | E |
 +--+-------+-------+---+----+---+

if new sample F is about to overwrite B, the kernel would need
to read the size of B from B's header and update data_tail to point to C.
Or even further.
Compared to the TAILSIZE approach, the kernel will now be doing both reads
and writes into the ring-buffer, and there is a concern that reads may
be hitting cold data, but if the records are small they may be
actually on the same cache line brought by the previous
read A's header, write E record cycle. So I think we shouldn't see
cache misses.
Another concern is validity of records stored. If user space messes
with ring-buffer, kernel won't be able to move data_tail properly
and would need to indicate that to userspace somehow.
But memory saving of 8 bytes per record could be sizable and
user space wouldn't need to walk the whole buffer backwards and
can just start from valid data_tail, so the dumps of overwrite
ring-buffer will be faster too.
Thoughts?
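
For comparison, the backward walk that the quoted TAILSIZE description relies
on could look roughly like the sketch below. It is a simplified user-space
model, not the perf implementation: it assumes a power-of-two buffer, a
free-running 'head' byte offset, and that every record ends with a u64
holding the size of the whole record (trailing field included).
rb_read_u64() and walk_backward() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a u64 at free-running offset 'pos', handling wrap-around. */
static uint64_t rb_read_u64(const unsigned char *buf, uint64_t mask,
			    uint64_t pos)
{
	unsigned char tmp[sizeof(uint64_t)];
	uint64_t val;
	size_t i;

	for (i = 0; i < sizeof(tmp); i++)
		tmp[i] = buf[(pos + i) & mask];
	memcpy(&val, tmp, sizeof(val));
	return val;
}

/*
 * Walk backward from 'head' and store the free-running start offset
 * of each record (newest first) in 'starts'. Stops at the first
 * record whose start has already been overwritten, or whose size
 * field is implausible. Returns the number of records found.
 */
static size_t walk_backward(const unsigned char *buf, uint64_t mask,
			    uint64_t head, uint64_t *starts, size_t max)
{
	uint64_t bufsize = mask + 1;
	uint64_t pos = head;
	size_t n = 0;

	assert((bufsize & mask) == 0);	/* power-of-two buffer size */

	while (n < max && pos >= sizeof(uint64_t)) {
		uint64_t size = rb_read_u64(buf, mask,
					    pos - sizeof(uint64_t));

		/* A record holds at least its own size field. */
		if (size < sizeof(uint64_t) || size > pos)
			break;
		/* The start of this record was already overwritten. */
		if (head - (pos - size) > bufsize)
			break;

		pos -= size;
		starts[n++] = pos;
	}
	return n;
}
```

The second break is what limits the walk to one buffer's worth of data; the
data_tail proposal above would let user space skip this walk and start
parsing forward from a known-valid position instead.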

* Re: [PATCH 16/53] perf tools: Fix mmap2 event allocation in synthesize code
  2016-01-11 13:48 ` [PATCH 16/53] perf tools: Fix mmap2 event allocation in synthesize code Wang Nan
@ 2016-01-11 21:03   ` Arnaldo Carvalho de Melo
  2016-01-12 10:12     ` [PATCH 16/53 v2] " Wang Nan
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 21:03 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux-kernel, pi3orama, lizefan, He Kuang, Masami Hiramatsu,
	Namhyung Kim

Em Mon, Jan 11, 2016 at 01:48:07PM +0000, Wang Nan escreveu:
> perf_event__synthesize_mmap_events() issues mmap2 events, but the
> memory of that event is allocated using:
> 
>  mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
> 
> If the path of the mmap source file is long (near PATH_MAX), random
> crashes can happen. sizeof(mmap_event->mmap2) should be used instead.
> 
> Fix two memory allocations and rename all mmap_event to mmap2_event
> to make it clear.

Try not doing two things in the same patch, i.e. do one minimal patch
with just the fix, i.e. this part:

-     mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
+     mmap_event = malloc(sizeof(mmap_event->mmap2) + machine->id_hdr_size);

This way we see the fix straight away, no extra renaming noise.

And the other with the rename, but I wouldn't bother doing that,
'mmap_event' is descriptive enough, and we may end up having a mmap3
event, when we would go on touching all those places again...

We're passing around union perf_event pointers, so what we could do is,
at perf_event allocation time, set mmap_event->header.type to
PERF_RECORD_MMAP2 and, when about to use the mmap_event->mmap2 fields,
check that what was passed is indeed of the expected type (and size).
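
Something along these lines, as a standalone sketch. The structs below
are cut-down stand-ins for union perf_event and its mmap/mmap2 members,
and ev_alloc()/ev_as_mmap2() are hypothetical helpers, not existing perf
functions:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for PERF_RECORD_MMAP / PERF_RECORD_MMAP2. */
enum { REC_MMAP = 1, REC_MMAP2 = 10 };

struct ev_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Simplified layouts; the real events carry a PATH_MAX filename. */
struct ev_mmap {
	struct ev_header header;
	uint32_t pid, tid;
	uint64_t start, len, pgoff;
	char filename[256];
};

struct ev_mmap2 {
	struct ev_header header;
	uint32_t pid, tid;
	uint64_t start, len, pgoff;
	uint32_t maj, min;
	uint64_t ino, ino_generation;
	uint32_t prot, flags;
	char filename[256];
};

union ev {
	struct ev_header header;
	struct ev_mmap mmap;
	struct ev_mmap2 mmap2;
};

/* Allocate an event sized for its type and record type/size in the header. */
static union ev *ev_alloc(uint32_t type, size_t id_hdr_size)
{
	size_t total;
	union ev *e;

	switch (type) {
	case REC_MMAP:
		total = sizeof(struct ev_mmap);
		break;
	case REC_MMAP2:
		total = sizeof(struct ev_mmap2);
		break;
	default:
		return NULL;
	}

	total += id_hdr_size;
	if (total > UINT16_MAX)
		return NULL;

	e = calloc(1, total);
	if (e == NULL)
		return NULL;

	e->header.type = type;
	e->header.size = (uint16_t)total;
	return e;
}

/* Hand out the mmap2 view only if the buffer was allocated for it. */
static struct ev_mmap2 *ev_as_mmap2(union ev *e)
{
	if (e == NULL || e->header.type != REC_MMAP2 ||
	    e->header.size < sizeof(struct ev_mmap2))
		return NULL;
	return &e->mmap2;
}
```

A buffer allocated as REC_MMAP is then rejected by ev_as_mmap2() instead
of being silently overrun when the mmap2 fields are filled in.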

- Arnaldo
 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Acked-by: Jiri Olsa <jolsa@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/util/event.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index cd61bb1..cde8228 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -413,7 +413,7 @@ int perf_event__synthesize_modules(struct perf_tool *tool,
>  }
>  
>  static int __event__synthesize_thread(union perf_event *comm_event,
> -				      union perf_event *mmap_event,
> +				      union perf_event *mmap2_event,
>  				      union perf_event *fork_event,
>  				      pid_t pid, int full,
>  					  perf_event__handler_t process,
> @@ -436,7 +436,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
>  		if (tgid == -1)
>  			return -1;
>  
> -		return perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
> +		return perf_event__synthesize_mmap_events(tool, mmap2_event, pid, tgid,
>  							  process, machine, mmap_data,
>  							  proc_map_timeout);
>  	}
> @@ -478,7 +478,7 @@ static int __event__synthesize_thread(union perf_event *comm_event,
>  		rc = 0;
>  		if (_pid == pid) {
>  			/* process the parent's maps too */
> -			rc = perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
> +			rc = perf_event__synthesize_mmap_events(tool, mmap2_event, pid, tgid,
>  						process, machine, mmap_data, proc_map_timeout);
>  			if (rc)
>  				break;
> @@ -496,15 +496,15 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
>  				      bool mmap_data,
>  				      unsigned int proc_map_timeout)
>  {
> -	union perf_event *comm_event, *mmap_event, *fork_event;
> +	union perf_event *comm_event, *mmap2_event, *fork_event;
>  	int err = -1, thread, j;
>  
>  	comm_event = malloc(sizeof(comm_event->comm) + machine->id_hdr_size);
>  	if (comm_event == NULL)
>  		goto out;
>  
> -	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
> -	if (mmap_event == NULL)
> +	mmap2_event = malloc(sizeof(mmap2_event->mmap2) + machine->id_hdr_size);
> +	if (mmap2_event == NULL)
>  		goto out_free_comm;
>  
>  	fork_event = malloc(sizeof(fork_event->fork) + machine->id_hdr_size);
> @@ -513,7 +513,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
>  
>  	err = 0;
>  	for (thread = 0; thread < threads->nr; ++thread) {
> -		if (__event__synthesize_thread(comm_event, mmap_event,
> +		if (__event__synthesize_thread(comm_event, mmap2_event,
>  					       fork_event,
>  					       thread_map__pid(threads, thread), 0,
>  					       process, tool, machine,
> @@ -539,7 +539,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
>  
>  			/* if not, generate events for it */
>  			if (need_leader &&
> -			    __event__synthesize_thread(comm_event, mmap_event,
> +			    __event__synthesize_thread(comm_event, mmap2_event,
>  						       fork_event,
>  						       comm_event->comm.pid, 0,
>  						       process, tool, machine,
> @@ -551,7 +551,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
>  	}
>  	free(fork_event);
>  out_free_mmap:
> -	free(mmap_event);
> +	free(mmap2_event);
>  out_free_comm:
>  	free(comm_event);
>  out:
> @@ -567,7 +567,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
>  	DIR *proc;
>  	char proc_path[PATH_MAX];
>  	struct dirent dirent, *next;
> -	union perf_event *comm_event, *mmap_event, *fork_event;
> +	union perf_event *comm_event, *mmap2_event, *fork_event;
>  	int err = -1;
>  
>  	if (machine__is_default_guest(machine))
> @@ -577,8 +577,8 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
>  	if (comm_event == NULL)
>  		goto out;
>  
> -	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
> -	if (mmap_event == NULL)
> +	mmap2_event = malloc(sizeof(mmap2_event->mmap2) + machine->id_hdr_size);
> +	if (mmap2_event == NULL)
>  		goto out_free_comm;
>  
>  	fork_event = malloc(sizeof(fork_event->fork) + machine->id_hdr_size);
> @@ -601,7 +601,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
>   		 * We may race with exiting thread, so don't stop just because
>   		 * one thread couldn't be synthesized.
>   		 */
> -		__event__synthesize_thread(comm_event, mmap_event, fork_event, pid,
> +		__event__synthesize_thread(comm_event, mmap2_event, fork_event, pid,
>  					   1, process, tool, machine, mmap_data,
>  					   proc_map_timeout);
>  	}
> @@ -611,7 +611,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
>  out_free_fork:
>  	free(fork_event);
>  out_free_mmap:
> -	free(mmap_event);
> +	free(mmap2_event);
>  out_free_comm:
>  	free(comm_event);
>  out:
> -- 
> 1.8.3.4

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 29/53] perf tools: Make ordered_events reusable
  2016-01-11 13:48 ` [PATCH 29/53] perf tools: Make ordered_events reusable Wang Nan
@ 2016-01-11 21:33   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 21:33 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, He Kuang,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim

Em Mon, Jan 11, 2016 at 01:48:20PM +0000, Wang Nan escreveu:
> ordered_events__free() leaves linked lists and timestamps not cleared.
> Introduce ordered_events__reset() to reinit ordered_events so it can
> be reused again.

Reused where? Can you mention the usecase?

Do we have to introduce a new function? Why not just make
ordered_events__free() restore the state to what it was right after
ordered_events__init()?
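
I.e. something like the sketch below, where the free routine ends by
re-running init. struct queue and the queue_*() helpers are simplified,
hypothetical stand-ins for struct ordered_events and its functions, only
meant to show the shape:

```c
#include <stdint.h>
#include <stdlib.h>

struct node {
	uint64_t timestamp;
	struct node *next;
};

typedef int (*deliver_t)(uint64_t timestamp);

struct queue {
	struct node *head;
	uint64_t last_flush;
	unsigned int nr_events;
	deliver_t deliver;
};

/* Placeholder delivery callback, used only for illustration. */
static int noop_deliver(uint64_t timestamp)
{
	(void)timestamp;
	return 0;
}

static void queue_init(struct queue *q, deliver_t deliver)
{
	q->head = NULL;
	q->last_flush = 0;
	q->nr_events = 0;
	q->deliver = deliver;
}

/* Prepend one event; returns 0 on success, -1 on allocation failure. */
static int queue_push(struct queue *q, uint64_t timestamp)
{
	struct node *n = malloc(sizeof(*n));

	if (n == NULL)
		return -1;
	n->timestamp = timestamp;
	n->next = q->head;
	q->head = n;
	q->nr_events++;
	return 0;
}

/* Release all events and leave the queue as it was right after init. */
static void queue_free(struct queue *q)
{
	deliver_t deliver = q->deliver;

	while (q->head != NULL) {
		struct node *n = q->head;

		q->head = n->next;
		free(n);
	}
	queue_init(q, deliver);
}
```

Callers can then keep calling the free routine and reuse the same object,
with no second entry point to remember.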

- Arnaldo
 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/util/ordered-events.c | 9 +++++++++
>  tools/perf/util/ordered-events.h | 1 +
>  tools/perf/util/session.c        | 4 ++--
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
> index b1b9e23..81daada 100644
> --- a/tools/perf/util/ordered-events.c
> +++ b/tools/perf/util/ordered-events.c
> @@ -308,3 +308,12 @@ void ordered_events__free(struct ordered_events *oe)
>  		free(event);
>  	}
>  }
> +
> +void ordered_events__reset(struct ordered_events *oe)
> +{
> +	ordered_events__deliver_t old_deliver = oe->deliver;
> +
> +	ordered_events__free(oe);
> +	memset(oe, '\0', sizeof(*oe));
> +	ordered_events__init(oe, old_deliver);
> +}
> diff --git a/tools/perf/util/ordered-events.h b/tools/perf/util/ordered-events.h
> index f403991..77e0f1b 100644
> --- a/tools/perf/util/ordered-events.h
> +++ b/tools/perf/util/ordered-events.h
> @@ -49,6 +49,7 @@ void ordered_events__delete(struct ordered_events *oe, struct ordered_event *eve
>  int ordered_events__flush(struct ordered_events *oe, enum oe_flush how);
>  void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t deliver);
>  void ordered_events__free(struct ordered_events *oe);
> +void ordered_events__reset(struct ordered_events *oe);
>  
>  static inline
>  void ordered_events__set_alloc_size(struct ordered_events *oe, u64 size)
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index d5636ba..96e10d2 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1701,7 +1701,7 @@ done:
>  out_err:
>  	free(buf);
>  	perf_session__warn_about_errors(session);
> -	ordered_events__free(&session->ordered_events);
> +	ordered_events__reset(&session->ordered_events);
>  	auxtrace__free_events(session);
>  	return err;
>  }
> @@ -1857,7 +1857,7 @@ out:
>  out_err:
>  	ui_progress__finish();
>  	perf_session__warn_about_errors(session);
> -	ordered_events__free(&session->ordered_events);
> +	ordered_events__reset(&session->ordered_events);
>  	auxtrace__free_events(session);
>  	session->one_mmap = false;
>  	return err;
> -- 
> 1.8.3.4

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 17/53] perf test: Improve bp_signal
  2016-01-11 13:48 ` [PATCH 17/53] perf test: Improve bp_signal Wang Nan
@ 2016-01-11 21:37   ` Arnaldo Carvalho de Melo
  2016-01-12  4:13     ` Wangnan (F)
  2016-01-12  9:21     ` Jiri Olsa
  0 siblings, 2 replies; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 21:37 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Will Deacon, Jiri Olsa

Em Mon, Jan 11, 2016 at 01:48:08PM +0000, Wang Nan escreveu:
> Will Deacon [1] has some question on patch [2]. This patch improves
> test__bp_signal so we can test:
> 
>  1. A watchpoint and a breakpoint that fire on the same instruction
>  2. Nested signals
> 
> Test result:
> 
>  On x86_64 and ARM64 (result are similar with patch [2] on ARM64):
> 
>  # ./perf test -v signal
>  17: Test breakpoint overflow signal handler                  :
>  --- start ---
>  test child forked, pid 10213
>  count1 1, count2 3, count3 2, overflow 3, overflows_2 3
>  test child finished with 0
>  ---- end ----
>  Test breakpoint overflow signal handler: Ok
> 
> So at least 2 cases Will doubted are handled correctly.
> 
> [1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
> [2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Will Deacon <will.deacon@arm.com>

Will, are you ok with this one? Can I have an Acked-by or better,
Tested-by for the AARCH64 base?

IIRC Jiri made some comment about this one?

- Arnaldo

> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/tests/bp_signal.c | 140 ++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 118 insertions(+), 22 deletions(-)
> 
> diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
> index fb80c9e..1d1bb48 100644
> --- a/tools/perf/tests/bp_signal.c
> +++ b/tools/perf/tests/bp_signal.c
> @@ -29,14 +29,59 @@
>  
>  static int fd1;
>  static int fd2;
> +static int fd3;
>  static int overflows;
> +static int overflows_2;
> +
> +volatile long the_var;
> +
> +
> +/*
> + * Use ASM to ensure watchpoint and breakpoint can be triggered
> + * at one instruction.
> + */
> +#if defined (__x86_64__)
> +extern void __test_function(volatile long *ptr);
> +asm (
> +	".globl __test_function\n"
> +	"__test_function:\n"
> +	"incq (%rdi)\n"
> +	"ret\n");
> +#elif defined (__aarch64__)
> +extern void __test_function(volatile long *ptr);
> +asm (
> +	".globl __test_function\n"
> +	"__test_function:\n"
> +	"str x30, [x0]\n"
> +	"ret\n");
> +
> +#else
> +static void __test_function(volatile long *ptr)
> +{
> +	*ptr = 0x1234;
> +}
> +#endif
>  
>  __attribute__ ((noinline))
>  static int test_function(void)
>  {
> +	__test_function(&the_var);
> +	the_var++;
>  	return time(NULL);
>  }
>  
> +static void sig_handler_2(int signum __maybe_unused,
> +			  siginfo_t *oh __maybe_unused,
> +			  void *uc __maybe_unused)
> +{
> +	overflows_2++;
> +	if (overflows_2 > 10) {
> +		ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
> +		ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
> +		ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
> +	}
> +}
> +
>  static void sig_handler(int signum __maybe_unused,
>  			siginfo_t *oh __maybe_unused,
>  			void *uc __maybe_unused)
> @@ -54,10 +99,11 @@ static void sig_handler(int signum __maybe_unused,
>  		 */
>  		ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
>  		ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
> +		ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
>  	}
>  }
>  
> -static int bp_event(void *fn, int setup_signal)
> +static int __event(bool is_x, void *addr, int signal)
>  {
>  	struct perf_event_attr pe;
>  	int fd;
> @@ -67,8 +113,8 @@ static int bp_event(void *fn, int setup_signal)
>  	pe.size = sizeof(struct perf_event_attr);
>  
>  	pe.config = 0;
> -	pe.bp_type = HW_BREAKPOINT_X;
> -	pe.bp_addr = (unsigned long) fn;
> +	pe.bp_type = is_x ? HW_BREAKPOINT_X : HW_BREAKPOINT_W;
> +	pe.bp_addr = (unsigned long) addr;
>  	pe.bp_len = sizeof(long);
>  
>  	pe.sample_period = 1;
> @@ -86,17 +132,25 @@ static int bp_event(void *fn, int setup_signal)
>  		return TEST_FAIL;
>  	}
>  
> -	if (setup_signal) {
> -		fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
> -		fcntl(fd, F_SETSIG, SIGIO);
> -		fcntl(fd, F_SETOWN, getpid());
> -	}
> +	fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
> +	fcntl(fd, F_SETSIG, signal);
> +	fcntl(fd, F_SETOWN, getpid());
>  
>  	ioctl(fd, PERF_EVENT_IOC_RESET, 0);
>  
>  	return fd;
>  }
>  
> +static int bp_event(void *addr, int signal)
> +{
> +	return __event(true, addr, signal);
> +}
> +
> +static int wp_event(void *addr, int signal)
> +{
> +	return __event(false, addr, signal);
> +}
> +
>  static long long bp_count(int fd)
>  {
>  	long long count;
> @@ -114,7 +168,7 @@ static long long bp_count(int fd)
>  int test__bp_signal(int subtest __maybe_unused)
>  {
>  	struct sigaction sa;
> -	long long count1, count2;
> +	long long count1, count2, count3;
>  
>  	/* setup SIGIO signal handler */
>  	memset(&sa, 0, sizeof(struct sigaction));
> @@ -126,21 +180,52 @@ int test__bp_signal(int subtest __maybe_unused)
>  		return TEST_FAIL;
>  	}
>  
> +	sa.sa_sigaction = (void *) sig_handler_2;
> +	if (sigaction(SIGUSR1, &sa, NULL) < 0) {
> +		pr_debug("failed setting up signal handler 2\n");
> +		return TEST_FAIL;
> +	}
> +
>  	/*
>  	 * We create following events:
>  	 *
> -	 * fd1 - breakpoint event on test_function with SIGIO
> +	 * fd1 - breakpoint event on __test_function with SIGIO
>  	 *       signal configured. We should get signal
>  	 *       notification each time the breakpoint is hit
>  	 *
> -	 * fd2 - breakpoint event on sig_handler without SIGIO
> +	 * fd2 - breakpoint event on sig_handler with SIGUSR1
> +	 *       configured. We should get SIGUSR1 each time when
> +	 *       breakpoint is hit
> +	 *
> +	 * fd3 - watchpoint event on __test_function with SIGIO
>  	 *       configured.
>  	 *
>  	 * Following processing should happen:
> -	 *   - execute test_function
> -	 *   - fd1 event breakpoint hit -> count1 == 1
> -	 *   - SIGIO is delivered       -> overflows == 1
> -	 *   - fd2 event breakpoint hit -> count2 == 1
> +	 *   Exec:               Action:                       Result:
> +	 *   incq (%rdi)       - fd1 event breakpoint hit   -> count1 == 1
> +	 *                     - SIGIO is delivered
> +	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 1
> +	 *                     - SIGUSR1 is delivered
> +	 *   sig_handler_2                                  -> overflows_2 == 1  (nested signal)
> +	 *   sys_rt_sigreturn  - return from sig_handler_2
> +	 *   overflows++                                    -> overflows = 1
> +	 *   sys_rt_sigreturn  - return from sig_handler
> +	 *   incq (%rdi)       - fd3 event watchpoint hit   -> count3 == 1       (wp and bp in one insn)
> +	 *                     - SIGIO is delivered
> +	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 2
> +	 *                     - SIGUSR1 is delivered
> +	 *   sig_handler_2                                  -> overflows_2 == 2  (nested signal)
> +	 *   sys_rt_sigreturn  - return from sig_handler_2
> +	 *   overflows++                                    -> overflows = 2
> +	 *   sys_rt_sigreturn  - return from sig_handler
> +	 *   the_var++         - fd3 event watchpoint hit   -> count3 == 2       (standalone watchpoint)
> +	 *                     - SIGIO is delivered
> +	 *   sig_handler       - fd2 event breakpoint hit   -> count2 == 3
> +	 *                     - SIGUSR1 is delivered
> +	 *   sig_handler_2                                  -> overflows_2 == 3  (nested signal)
> +	 *   sys_rt_sigreturn  - return from sig_handler_2
> +	 *   overflows++                                    -> overflows == 3
> +	 *   sys_rt_sigreturn  - return from sig_handler
>  	 *
>  	 * The test case check following error conditions:
>  	 * - we get stuck in signal handler because of debug
> @@ -152,11 +237,13 @@ int test__bp_signal(int subtest __maybe_unused)
>  	 *
>  	 */
>  
> -	fd1 = bp_event(test_function, 1);
> -	fd2 = bp_event(sig_handler, 0);
> +	fd1 = bp_event(__test_function, SIGIO);
> +	fd2 = bp_event(sig_handler, SIGUSR1);
> +	fd3 = wp_event((void *)&the_var, SIGIO);
>  
>  	ioctl(fd1, PERF_EVENT_IOC_ENABLE, 0);
>  	ioctl(fd2, PERF_EVENT_IOC_ENABLE, 0);
> +	ioctl(fd3, PERF_EVENT_IOC_ENABLE, 0);
>  
>  	/*
>  	 * Kick off the test by trigering 'fd1'
> @@ -166,15 +253,18 @@ int test__bp_signal(int subtest __maybe_unused)
>  
>  	ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
>  	ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
> +	ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
>  
>  	count1 = bp_count(fd1);
>  	count2 = bp_count(fd2);
> +	count3 = bp_count(fd3);
>  
>  	close(fd1);
>  	close(fd2);
> +	close(fd3);
>  
> -	pr_debug("count1 %lld, count2 %lld, overflow %d\n",
> -		 count1, count2, overflows);
> +	pr_debug("count1 %lld, count2 %lld, count3 %lld, overflow %d, overflows_2 %d\n",
> +		 count1, count2, count3, overflows, overflows_2);
>  
>  	if (count1 != 1) {
>  		if (count1 == 11)
> @@ -183,12 +273,18 @@ int test__bp_signal(int subtest __maybe_unused)
>  			pr_debug("failed: wrong count for bp1%lld\n", count1);
>  	}
>  
> -	if (overflows != 1)
> +	if (overflows != 3)
>  		pr_debug("failed: wrong overflow hit\n");
>  
> -	if (count2 != 1)
> +	if (overflows_2 != 3)
> +		pr_debug("failed: wrong overflow_2 hit\n");
> +
> +	if (count2 != 3)
>  		pr_debug("failed: wrong count for bp2\n");
>  
> -	return count1 == 1 && overflows == 1 && count2 == 1 ?
> +	if (count3 != 2)
> +		pr_debug("failed: wrong count for bp3\n");
> +
> +	return count1 == 1 && overflows == 3 && count2 == 3 && overflows_2 == 3 && count3 == 2 ?
>  		TEST_OK : TEST_FAIL;
>  }
> -- 
> 1.8.3.4

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 14/53] perf test: Check environment before start real BPF test
  2016-01-11 13:48 ` [PATCH 14/53] perf test: Check environment before start real BPF test Wang Nan
@ 2016-01-11 21:55   ` Arnaldo Carvalho de Melo
  2016-01-12  7:40     ` Wangnan (F)
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 21:55 UTC (permalink / raw)
  To: Wang Nan; +Cc: linux-kernel, pi3orama, lizefan, netdev, davem

Em Mon, Jan 11, 2016 at 01:48:05PM +0000, Wang Nan escreveu:
> Copying perf to old kernel system results:
> 
>  # perf test bpf
>  37: Test BPF filter                                          :
>  37.1: Test basic BPF filtering                               : FAILED!
>  37.2: Test BPF prologue generation                           : Skip
> 
> However, in case when kernel doesn't support a test case it should
> return 'Skip', 'FAILED!' should be reserved for kernel tests for when
> the kernel supports a feature that then fails to work as advertised.
> 
> This patch checks environment before real testcase.

This is really strange: this other test fails when the above patch is
applied, as found by bisecting:

[acme@felicio linux]$ perf test decoder
47: Test x86 instruction decoder - new instructions          : FAILED!
[acme@felicio linux]$ git log --oneline -1
91fedd318e3d perf test: Check environment before start real BPF test
[acme@felicio linux]$ git reset --hard HEAD^
HEAD is now at f1f23526d3b6 perf test: Reset err after using it hold errcode in hist testcases
[acme@felicio linux]$ m
make: Entering directory `/home/acme/git/linux/tools/perf'
  BUILD:   Doing 'make -j4' parallel build
  CC       /tmp/build/perf/arch/common.o
  CC       /tmp/build/perf/util/abspath.o
  CC       /tmp/build/perf/builtin-bench.o
  CC       /tmp/build/perf/util/alias.o

<SNIP>
[acme@felicio linux]$ git log --oneline -1
f1f23526d3b6 perf test: Reset err after using it hold errcode in hist testcases
[acme@felicio linux]$ perf test decoder
47: Test x86 instruction decoder - new instructions          : Ok
[acme@felicio linux]$ 

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 05/53] perf tools: Test correct path of perf in build-test
  2016-01-11 15:24   ` Arnaldo Carvalho de Melo
@ 2016-01-11 22:06     ` Arnaldo Carvalho de Melo
  2016-01-11 22:39       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 22:06 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa, Namhyung Kim

Em Mon, Jan 11, 2016 at 12:24:56PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 11, 2016 at 01:47:56PM +0000, Wang Nan escreveu:
> > If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
> > will fail because perf resides in a different directory. Fix this by
> > computing PERF_OUT according to 'O' and test correct output files.
> > For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
> > instead because the path is different from others ($(O)/perf vs
> >  $(O)/tools/perf).
> 
> Ok, applying up to this patch I now manage to almost cleanly build it using O=,
> see below, but seems that we have some race, as not all tests end up producing
> such warnings.
> 
> [acme@felicio linux]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; make O=/tmp/build/perf -C tools/perf build-test
> make: Entering directory `/home/acme/git/linux/tools/perf'
> Testing Makefile
> - make_no_libperl: cd . && make -f Makefile   DESTDIR=/tmp/tmp.m1nXBMqhSA NO_LIBPERL=1
> find: ‘/tmp/build/perf/util/trace-event-scripting.o’: No such file or directory

Well, it is happening even without O=:


[acme@felicio linux]$ perf stat make -C tools/perf build-test
make: Entering directory `/home/acme/git/linux/tools/perf'
Testing Makefile
- make_doc: cd . && make -f Makefile   DESTDIR=/tmp/tmp.H8z3S3cEJ0 doc
- make_install_bin: cd . && make -f Makefile   DESTDIR=/tmp/tmp.njIAPXMF7f install-bin
- make_install_prefix: cd . && make -f Makefile   DESTDIR=/tmp/tmp.9FEKGBoeyN install prefix=/tmp/krava
- make_no_gtk2: cd . && make -f Makefile   DESTDIR=/tmp/tmp.nHl593wfMP NO_GTK2=1
- make_util_map_o: cd . && make -f Makefile   DESTDIR=/tmp/tmp.ZSmZP490hX util/map.o
- make_no_slang: cd . && make -f Makefile   DESTDIR=/tmp/tmp.7q24C1xmcu NO_SLANG=1
- make_pure: cd . && make -f Makefile   DESTDIR=/tmp/tmp.R51cy8kdWl 
- make_no_libpython: cd . && make -f Makefile   DESTDIR=/tmp/tmp.3t9tEc0e4b NO_LIBPYTHON=1
- make_no_libbionic: cd . && make -f Makefile   DESTDIR=/tmp/tmp.4yYelFUaq0 NO_LIBBIONIC=1
- make_no_newt: cd . && make -f Makefile   DESTDIR=/tmp/tmp.3Fg7hv3Hn1 NO_NEWT=1
- make_tags: cd . && make -f Makefile   DESTDIR=/tmp/tmp.8WMgskFkOH tags
- make_install: cd . && make -f Makefile   DESTDIR=/tmp/tmp.YQq3wOEkyB install
- make_no_libdw_dwarf_unwind: cd . && make -f Makefile   DESTDIR=/tmp/tmp.WKRVFDA2ty NO_LIBDW_DWARF_UNWIND=1
find: ‘/home/acme/git/linux/tools/perf/.gtk-in.o.cmd’: No such file or directory
find: ‘/home/acme/git/linux/tools/perf/builtin-script.o’: No such file or directory
- make_no_libunwind: cd . && make -f Makefile   DESTDIR=/tmp/tmp.SQftzGTUYf NO_LIBUNWIND=1
- make_no_auxtrace: cd . && make -f Makefile   DESTDIR=/tmp/tmp.Xy2xrSCVuO NO_AUXTRACE=1
- make_no_ui: cd . && make -f Makefile   DESTDIR=/tmp/tmp.ZFNEHWqQFN NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
- make_no_libnuma: cd . && make -f Makefile   DESTDIR=/tmp/tmp.68zRtMaEqf NO_LIBNUMA=1
- make_no_backtrace: cd . && make -f Makefile   DESTDIR=/tmp/tmp.5xcea8XfdC NO_BACKTRACE=1
find: ‘/home/acme/git/linux/tools/perf/arch/x86/tests/dwarf-unwind.o’: No such file or directory
- make_install_prefix_slash: cd . && make -f Makefile   DESTDIR=/tmp/tmp.2c5BqUKGef install prefix=/tmp/krava/
find: ‘/home/acme/git/linux/tools/perf/builtin-record.o’: No such file or directory
find: ‘/home/acme/git/linux/tools/perf/builtin-inject.o’: No such file or directory
find: ‘/home/acme/git/linux/tools/perf/builtin-bench.o’: No such file or directory
find: ‘/home/acme/git/linux/tools/perf/.builtin-lock.o.cmd’: No such file or directory
find: ‘/home/acme/git/linux/tools/perf/perf.o’: No such file or directory
find: ‘/home/acme/git/linux/tools/perf/scripts/.libperf-in.o.cmd’: No such file or directory
find: ‘/home/acme/git/linux/tools/perf/tests/evsel-tp-sched.o’: No such file or directory
find: ‘/home/acme/git/linux/tools/perf/tests/hists_cumulate.o’: No such file or directory
- make_util_pmu_bison_o: cd . && make -f Makefile   DESTDIR=/tmp/tmp.aJUWyFbXsp util/pmu-bis

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 05/53] perf tools: Test correct path of perf in build-test
  2016-01-11 22:06     ` Arnaldo Carvalho de Melo
@ 2016-01-11 22:39       ` Arnaldo Carvalho de Melo
  2016-01-11 22:39         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 22:39 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa, Namhyung Kim

Em Mon, Jan 11, 2016 at 07:06:18PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 11, 2016 at 12:24:56PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Mon, Jan 11, 2016 at 01:47:56PM +0000, Wang Nan escreveu:
> > > If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
> > > will fail because perf resides in a different directory. Fix this by
> > > computing PERF_OUT according to 'O' and test correct output files.
> > > For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
> > > instead because the path is different from others ($(O)/perf vs
> > >  $(O)/tools/perf).
> > 
> > Ok, applying up to this patch I now manage to almost cleanly build it using O=,
> > see below, but seems that we have some race, as not all tests end up producing
> > such warnings.
> > 
> > [acme@felicio linux]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; make O=/tmp/build/perf -C tools/perf build-test
> > make: Entering directory `/home/acme/git/linux/tools/perf'
> > Testing Makefile
> > - make_no_libperl: cd . && make -f Makefile   DESTDIR=/tmp/tmp.m1nXBMqhSA NO_LIBPERL=1
> > find: ‘/tmp/build/perf/util/trace-event-scripting.o’: No such file or directory
> 
> Well, it is happening even without O=:

So I removed a few patches and those aren't appearing anymore, please
take a look at my perf/core branch, running build-test on a few machines
now, will push soon.

My hunch is that build-test has issues with parallel builds, but I'm not
sure...

- Arnaldo
 
> 
> [acme@felicio linux]$ perf stat make -C tools/perf build-test
> make: Entering directory `/home/acme/git/linux/tools/perf'
> Testing Makefile
> - make_doc: cd . && make -f Makefile   DESTDIR=/tmp/tmp.H8z3S3cEJ0 doc
> - make_install_bin: cd . && make -f Makefile   DESTDIR=/tmp/tmp.njIAPXMF7f install-bin
> - make_install_prefix: cd . && make -f Makefile   DESTDIR=/tmp/tmp.9FEKGBoeyN install prefix=/tmp/krava
> - make_no_gtk2: cd . && make -f Makefile   DESTDIR=/tmp/tmp.nHl593wfMP NO_GTK2=1
> - make_util_map_o: cd . && make -f Makefile   DESTDIR=/tmp/tmp.ZSmZP490hX util/map.o
> - make_no_slang: cd . && make -f Makefile   DESTDIR=/tmp/tmp.7q24C1xmcu NO_SLANG=1
> - make_pure: cd . && make -f Makefile   DESTDIR=/tmp/tmp.R51cy8kdWl 
> - make_no_libpython: cd . && make -f Makefile   DESTDIR=/tmp/tmp.3t9tEc0e4b NO_LIBPYTHON=1
> - make_no_libbionic: cd . && make -f Makefile   DESTDIR=/tmp/tmp.4yYelFUaq0 NO_LIBBIONIC=1
> - make_no_newt: cd . && make -f Makefile   DESTDIR=/tmp/tmp.3Fg7hv3Hn1 NO_NEWT=1
> - make_tags: cd . && make -f Makefile   DESTDIR=/tmp/tmp.8WMgskFkOH tags
> - make_install: cd . && make -f Makefile   DESTDIR=/tmp/tmp.YQq3wOEkyB install
> - make_no_libdw_dwarf_unwind: cd . && make -f Makefile   DESTDIR=/tmp/tmp.WKRVFDA2ty NO_LIBDW_DWARF_UNWIND=1
> find: ‘/home/acme/git/linux/tools/perf/.gtk-in.o.cmd’: No such file or directory
> find: ‘/home/acme/git/linux/tools/perf/builtin-script.o’: No such file or directory
> - make_no_libunwind: cd . && make -f Makefile   DESTDIR=/tmp/tmp.SQftzGTUYf NO_LIBUNWIND=1
> - make_no_auxtrace: cd . && make -f Makefile   DESTDIR=/tmp/tmp.Xy2xrSCVuO NO_AUXTRACE=1
> - make_no_ui: cd . && make -f Makefile   DESTDIR=/tmp/tmp.ZFNEHWqQFN NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
> - make_no_libnuma: cd . && make -f Makefile   DESTDIR=/tmp/tmp.68zRtMaEqf NO_LIBNUMA=1
> - make_no_backtrace: cd . && make -f Makefile   DESTDIR=/tmp/tmp.5xcea8XfdC NO_BACKTRACE=1
> find: ‘/home/acme/git/linux/tools/perf/arch/x86/tests/dwarf-unwind.o’: No such file or directory
> - make_install_prefix_slash: cd . && make -f Makefile   DESTDIR=/tmp/tmp.2c5BqUKGef install prefix=/tmp/krava/
> find: ‘/home/acme/git/linux/tools/perf/builtin-record.o’: No such file or directory
> find: ‘/home/acme/git/linux/tools/perf/builtin-inject.o’: No such file or directory
> find: ‘/home/acme/git/linux/tools/perf/builtin-bench.o’: No such file or directory
> find: ‘/home/acme/git/linux/tools/perf/.builtin-lock.o.cmd’: No such file or directory
> find: ‘/home/acme/git/linux/tools/perf/perf.o’: No such file or directory
> find: ‘/home/acme/git/linux/tools/perf/scripts/.libperf-in.o.cmd’: No such file or directory
> find: ‘/home/acme/git/linux/tools/perf/tests/evsel-tp-sched.o’: No such file or directory
> find: ‘/home/acme/git/linux/tools/perf/tests/hists_cumulate.o’: No such file or directory
> - make_util_pmu_bison_o: cd . && make -f Makefile   DESTDIR=/tmp/tmp.aJUWyFbXsp util/pmu-bis

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 05/53] perf tools: Test correct path of perf in build-test
  2016-01-11 22:39       ` Arnaldo Carvalho de Melo
@ 2016-01-11 22:39         ` Arnaldo Carvalho de Melo
  2016-01-12  7:16           ` Wangnan (F)
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-11 22:39 UTC (permalink / raw)
  To: Wang Nan
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa, Namhyung Kim

Em Mon, Jan 11, 2016 at 07:39:04PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 11, 2016 at 07:06:18PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Mon, Jan 11, 2016 at 12:24:56PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Mon, Jan 11, 2016 at 01:47:56PM +0000, Wang Nan escreveu:
> > > > If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
> > > > will fail because perf resides in a different directory. Fix this by
> > > > computing PERF_OUT according to 'O' and test correct output files.
> > > > For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
> > > > instead because the path is different from others ($(O)/perf vs
> > > >  $(O)/tools/perf).
> > > 
> > > Ok, applying up to this patch I now manage to almost cleanly build it using O=,
> > > see below, but seems that we have some race, as not all tests end up producing
> > > such warnings.
> > > 
> > > [acme@felicio linux]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; make O=/tmp/build/perf -C tools/perf build-test
> > > make: Entering directory `/home/acme/git/linux/tools/perf'
> > > Testing Makefile
> > > - make_no_libperl: cd . && make -f Makefile   DESTDIR=/tmp/tmp.m1nXBMqhSA NO_LIBPERL=1
> > > find: ‘/tmp/build/perf/util/trace-event-scripting.o’: No such file or directory
> > 
> > Well, it is happening even without O=:
> 
> So I removed a few patches and those aren't appearing anymore, please
> take a look at my perf/core branch, running build-test on a few machines
> now, will push soon.
> 
> My hunch is that build-test has issues with parallel builds, but I'm not
> sure...


Good:

- make_perf_o_O: cd . && make -f Makefile O=/tmp/tmp.oLeg8aUaOo DESTDIR=/tmp/tmp.16WP4HTQJs perf.o
- make_util_pmu_bison_o_O: cd . && make -f Makefile O=/tmp/tmp.xNRV0pCXfD DESTDIR=/tmp/tmp.8dyU9uEbHe util/pmu-bison.o
- make_no_libdw_dwarf_unwind_O: cd . && make -f Makefile O=/tmp/tmp.pHH4HExHcH DESTDIR=/tmp/tmp.Wo0m8fF5cp NO_LIBDW_DWARF_UNWIND=1
- make_no_demangle_O: cd . && make -f Makefile O=/tmp/tmp.yWNsd4jOsI DESTDIR=/tmp/tmp.Q7eA4kCvwL NO_DEMANGLE=1
- tarpkg: ./tests/perf-targz-src-pkg .
- make -C <kernelsrc> tools/perf
- make -C <kernelsrc>/tools perf
OK

* Re: [PATCH 17/53] perf test: Improve bp_signal
  2016-01-11 21:37   ` Arnaldo Carvalho de Melo
@ 2016-01-12  4:13     ` Wangnan (F)
  2016-01-12  9:21     ` Jiri Olsa
  1 sibling, 0 replies; 124+ messages in thread
From: Wangnan (F) @ 2016-01-12  4:13 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Will Deacon, Jiri Olsa



On 2016/1/12 5:37, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 11, 2016 at 01:48:08PM +0000, Wang Nan escreveu:
>> Will Deacon [1] has some question on patch [2]. This patch improves
>> test__bp_signal so we can test:
>>
>>   1. A watchpoint and a breakpoint that fire on the same instruction
>>   2. Nested signals
>>
>> Test result:
>>
>>   On x86_64 and ARM64 (result are similar with patch [2] on ARM64):
>>
>>   # ./perf test -v signal
>>   17: Test breakpoint overflow signal handler                  :
>>   --- start ---
>>   test child forked, pid 10213
>>   count1 1, count2 3, count3 2, overflow 3, overflows_2 3
>>   test child finished with 0
>>   ---- end ----
>>   Test breakpoint overflow signal handler: Ok
>>
>> So at least 2 cases Will doubted are handled correctly.
>>
>> [1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
>> [2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Cc: Will Deacon <will.deacon@arm.com>
> Will, are you ok with this one? Can I have an Acked-by or better,
> Tested-by for the AARCH64 base?

Patch [2] is still in question. On AArch64 this test will fail even
without this patch.

Thank you.

* Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-11 18:09   ` Alexei Starovoitov
@ 2016-01-12  5:33     ` Wangnan (F)
  2016-01-12  6:11       ` Alexei Starovoitov
  2016-01-12 14:05       ` Peter Zijlstra
  0 siblings, 2 replies; 124+ messages in thread
From: Wangnan (F) @ 2016-01-12  5:33 UTC (permalink / raw)
  To: Alexei Starovoitov, Peter Zijlstra
  Cc: acme, linux-kernel, pi3orama, lizefan, netdev, davem,
	Adrian Hunter, Arnaldo Carvalho de Melo, David Ahern,
	Ingo Molnar, Yunlong Song



On 2016/1/12 2:09, Alexei Starovoitov wrote:
> On Mon, Jan 11, 2016 at 01:48:18PM +0000, Wang Nan wrote:
>> This patch introduces a PERF_SAMPLE_TAILSIZE flag which allows a size
>> field to be attached at the end of a sample. The idea comes from [1]:
>> with the size at the tail of an event, a user program reading from
>> the ring buffer can parse events backward.
>>
>> For example:
>>
>>     head
>>      |
>>      V
>>   +--+---+-------+----------+------+---+
>>   |E6|...|   B  8|   C    11|  D  7|E..|
>>   +--+---+-------+----------+------+---+
>>
>> In this case, from the 'head' pointer provided by kernel, user program
>> can first see '6' by (*(head - sizeof(u64))), then it can get the start
>> pointer of record 'E', then it can read size and find start position
>> of record D, C, B in similar way.
> adding extra 8 bytes for every sample is quite unfortunate.
> How about another idea:
> . update data_tail pointer when head is about to overwrite it
>
> Ex:
>     head   data_tail
>      |       |
>      V       V
>   +--+-------+-------+---+----+---+
>   |E |  ...  |   B   | C |  D | E |
>   +--+-------+-------+---+----+---+
>
> if new sample F is about to overwrite B, the kernel would need
> to read the size of B from B's header and update data_tail to point C.
> Or even further.
> Comparing to TAILSIZE approach, now kernel will be doing both reads
> and writes into ring-buffer and there is a concern that reads may
> be hitting cold data, but if the records are small they may be
> actually on the same cache line brought by the previous
> read A's header, write E record cycle. So I think we shouldn't see
> cache misses.

Once the ring buffer has wrapped, we need a read before nearly
every write operation. The performance penalty depends on how
write-allocate is configured. In addition, it introduces another
data dependency: we must wait until the size of event B has been
retrieved before overwriting it.

Even in the very first attempt in 2013 [1], reading from the ring
buffer was avoided. I don't think Peter has changed his mind since.

> Another concern is validity of records stored. If user space messes
> with ring-buffer, kernel won't be able to move data_tail properly
> and would need to indicate that to userspace somehow.
> But memory saving of 8 bytes per record could be sizable

Yes, but I have already discussed this with Peter in [2].
Last month I suggested:

<quote>

  1. If PERF_SAMPLE_SIZE is selected, we can avoid outputting the event
     size in header. Which eliminate extra space cost;
</quote>

However:

<quote>

That would mandate you always parse the stream backwards. Which seems
rather unfortunate. Also, no you cannot recoup the extra space, see the
alignment and size requirement.

</quote>

>   and
> user space wouldn't need to walk the whole buffer backwards and
> can just start from valid data_tail, so the dumps of overwrite
> ring-buffer will be faster too.
> Thoughts?
>
Please also refer to [3]. In that patch we introduced a userspace
ring buffer in perf and made it continuously collect data from
the normal ring buffers. Since we have to wake up perf to read data,
the cost is even higher.

[1] 
http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net
[2] 
http://lkml.kernel.org/r/20151203100801.GV3816@twins.programming.kicks-ass.net
[3] 
http://lkml.kernel.org/r/1448373632-8806-1-git-send-email-yunlong.song@huawei.com
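
To make the backward walk described above concrete, here is a minimal
userspace sketch in C. It is not the perf implementation: it assumes a
linear buffer (no wrap-around), assumes the trailing u64 holds the whole
record length including the trailer itself, and leaves out
struct perf_event_header. walk_backward() and append_record() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Simplified model (assumption): every record ends with a u64 that
 * holds the total record length, trailer included. Starting at 'head',
 * collect the start offset of each record, newest first.
 */
static size_t walk_backward(const unsigned char *buf, size_t head,
			    size_t *starts, size_t max)
{
	size_t n = 0;

	assert(buf && starts);

	while (n < max && head >= sizeof(uint64_t)) {
		uint64_t size;

		/* The size of the record ending at 'head' sits just before it. */
		memcpy(&size, buf + head - sizeof(size), sizeof(size));

		/* Stop on garbage: an undersized or oversized length. */
		if (size < sizeof(size) || size > head)
			break;

		head -= size;		/* start of this record */
		starts[n++] = head;
	}
	return n;
}

/* Hypothetical helper for building sample data: append one record. */
static size_t append_record(unsigned char *buf, size_t pos, size_t payload_len)
{
	uint64_t size = payload_len + sizeof(uint64_t);

	memset(buf + pos, 0xAA, payload_len);			/* payload */
	memcpy(buf + pos + payload_len, &size, sizeof(size));	/* tail size */
	return pos + size;
}
```

Each step costs one u64 read and one subtraction, so locating the oldest
intact record takes one read per record still present in the buffer.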

* Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-12  5:33     ` Wangnan (F)
@ 2016-01-12  6:11       ` Alexei Starovoitov
  2016-01-12 12:36         ` Wangnan (F)
  2016-01-12 14:05       ` Peter Zijlstra
  1 sibling, 1 reply; 124+ messages in thread
From: Alexei Starovoitov @ 2016-01-12  6:11 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Peter Zijlstra, acme, linux-kernel, pi3orama, lizefan, netdev,
	davem, Adrian Hunter, Arnaldo Carvalho de Melo, David Ahern,
	Ingo Molnar, Yunlong Song

On Tue, Jan 12, 2016 at 01:33:28PM +0800, Wangnan (F) wrote:
> 
> 
> On 2016/1/12 2:09, Alexei Starovoitov wrote:
> >On Mon, Jan 11, 2016 at 01:48:18PM +0000, Wang Nan wrote:
> >>This patch introduces a PERF_SAMPLE_TAILSIZE flag which allows a size
> >>field to be attached at the end of a sample. The idea comes from [1]:
> >>with the size at the tail of an event, a user program reading from
> >>the ring buffer can parse events backward.
> >>
> >>For example:
> >>
> >>    head
> >>     |
> >>     V
> >>  +--+---+-------+----------+------+---+
> >>  |E6|...|   B  8|   C    11|  D  7|E..|
> >>  +--+---+-------+----------+------+---+
> >>
> >>In this case, from the 'head' pointer provided by kernel, user program
> >>can first see '6' by (*(head - sizeof(u64))), then it can get the start
> >>pointer of record 'E', then it can read size and find start position
> >>of record D, C, B in similar way.
> >adding extra 8 bytes for every sample is quite unfortunate.
> >How about another idea:
> >. update data_tail pointer when head is about to overwrite it
> >
> >Ex:
> >    head   data_tail
> >     |       |
> >     V       V
> >  +--+-------+-------+---+----+---+
> >  |E |  ...  |   B   | C |  D | E |
> >  +--+-------+-------+---+----+---+
> >
> >if new sample F is about to overwrite B, the kernel would need
> >to read the size of B from B's header and update data_tail to point C.
> >Or even further.
> >Comparing to TAILSIZE approach, now kernel will be doing both reads
> >and writes into ring-buffer and there is a concern that reads may
> >be hitting cold data, but if the records are small they may be
> >actually on the same cache line brought by the previous
> >read A's header, write E record cycle. So I think we shouldn't see
> >cache misses.
> 
> Once the ring buffer has wrapped, we need a read before nearly
> every write operation. The performance penalty depends on how
> write-allocate is configured. In addition, it introduces another
> data dependency: we must wait until the size of event B has been
> retrieved before overwriting it.
> 
> Even in the very first attempt in 2013 [1], reading from the ring
> buffer was avoided. I don't think Peter has changed his mind since.
> 
> >Another concern is validity of records stored. If user space messes
> >with ring-buffer, kernel won't be able to move data_tail properly
> >and would need to indicate that to userspace somehow.
> >But memory saving of 8 bytes per record could be sizable
> 
> Yes. But I have already discussed with Peter on this in [2].
> Last month I suggested:
> 
> <quote>
> 
>  1. If PERF_SAMPLE_SIZE is selected, we can avoid outputting the event
>     size in header. Which eliminate extra space cost;
> </quote>
> 
> However:
> 
> <quote>
> 
> That would mandate you always parse the stream backwards. Which seems
> rather unfortunate. Also, no you cannot recoup the extra space, see the
> alignment and size requirement.

hmm, in this kernel patch I see that you're adding 8 bytes for
every record via this extra TAILSIZE flag and in perf you're
walking the ring buffer backwards by reading these 8-byte
sizes, comparing header sizes and so on until reaching the beginning,
where you start dumping it as normal.
So for this 'signal to perf' approach to work the ring buffer
will contain tailsizes everywhere just so that user space can
find the beginning. That's not very pretty. imo if the kernel
can do a header read to adjust data_tail it would make the user
space side clean. Maybe there are other solutions.
Adding tailsize seems like brute force hack.
There must be some nicer way.
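
For comparison, the header-read idea can be modelled roughly as below.
This is a userspace sketch under assumptions, not kernel code: each
record is assumed to start with a u64 holding its total length, lengths
are multiples of 8, the buffer size is a power of two, and only headers
are written (payload copying is omitted). struct ring, make_room() and
ring_write() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct ring {
	unsigned char *data;
	uint64_t size;	/* bytes, power of two */
	uint64_t head;	/* free-running write position */
	uint64_t tail;	/* free-running start of the oldest intact record */
};

/* Read the length stored in the header of the record at 'pos'. */
static uint64_t record_size_at(const struct ring *rb, uint64_t pos)
{
	uint64_t size;

	memcpy(&size, rb->data + (pos & (rb->size - 1)), sizeof(size));
	return size;
}

/*
 * Advance tail over whole records until 'need' bytes fit at head.
 * This is the extra read-before-write the thread is debating.
 * Returns 0 on success, -1 if a header looks corrupted.
 */
static int make_room(struct ring *rb, uint64_t need)
{
	while (rb->head + need - rb->tail > rb->size) {
		uint64_t size = record_size_at(rb, rb->tail);

		if (size < 8 || size % 8 || size > rb->head - rb->tail)
			return -1;
		rb->tail += size;
	}
	return 0;
}

/* Reserve a record of 'size' bytes and write only its header. */
static int ring_write(struct ring *rb, uint64_t size)
{
	assert(rb && rb->data);

	if (size < 8 || size % 8 || size > rb->size)
		return -1;
	if (make_room(rb, size))
		return -1;

	memcpy(rb->data + (rb->head & (rb->size - 1)), &size, sizeof(size));
	rb->head += size;
	return 0;
}
```

With this layout a reader can start at tail and parse forward, at the
price of one header read per overwritten record on the write side.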

* Re: [PATCH 13/53] perf tools: Prevent calling machine__delete() on non-allocated machine
  2016-01-11 15:42   ` Arnaldo Carvalho de Melo
@ 2016-01-12  7:03     ` Wangnan (F)
  2016-01-12 14:07       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-12  7:03 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa,
	Masami Hiramatsu, Namhyung Kim



On 2016/1/11 23:42, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 11, 2016 at 01:48:04PM +0000, Wang Nan escreveu:
>> To prevent further commits calling machine__delete() on non-allocated
>> 'struct machine' (which would cause memory corruption), this patch
>> enforces machine__init(), record whether a machine structure is
>> dynamically allocated or not, and warn if machine__delete() is called
>> on incorrect object.
> Not sure on this one, I think I voiced this before, this seems like
> something to be tested using some static analysis tool or even checking
> if the address for the struct hitting machine__delete() is from malloc
> or not.
>
> I.e. if we do it here, we may have to do it to any other struct where we
> allocate it in the stack or via malloc, and furthermore there are cases
> where we embed a struct in another, when we would free just the main
> struct but not the second, embedded one, that would need just calling
> foo__exit() and not foo__delete().
>
> - Arnaldo
>   
OK. Let's drop this one.

Thank you.
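
For readers unfamiliar with the convention mentioned above, a minimal
sketch of the __init()/__exit() versus __new()/__delete() split follows.
struct foo and struct bar are hypothetical; the point is only that
__exit() releases what a struct owns, while __delete() additionally
frees the struct itself and is therefore valid only for heap objects.

```c
#include <assert.h>
#include <stdlib.h>

struct bar {
	int *vals;
};

struct foo {
	struct bar bar;		/* embedded: released via bar__exit() only */
	char *name;
};

static int bar__init(struct bar *bar)
{
	bar->vals = calloc(4, sizeof(*bar->vals));
	return bar->vals ? 0 : -1;
}

static void bar__exit(struct bar *bar)
{
	free(bar->vals);
	bar->vals = NULL;
}

static int foo__init(struct foo *foo)
{
	foo->name = calloc(16, 1);
	if (!foo->name)
		return -1;
	if (bar__init(&foo->bar)) {
		free(foo->name);
		foo->name = NULL;
		return -1;
	}
	return 0;
}

/* Release owned resources; safe for stack, static or embedded objects. */
static void foo__exit(struct foo *foo)
{
	assert(foo);
	bar__exit(&foo->bar);
	free(foo->name);
	foo->name = NULL;
}

static struct foo *foo__new(void)
{
	struct foo *foo = malloc(sizeof(*foo));

	if (foo && foo__init(foo)) {
		free(foo);
		foo = NULL;
	}
	return foo;
}

/* Only for objects returned by foo__new(): also frees the struct. */
static void foo__delete(struct foo *foo)
{
	if (!foo)
		return;
	foo__exit(foo);
	free(foo);
}
```

Calling foo__delete() on a stack object would pass a non-heap pointer to
free(), which is the kind of corruption the dropped patch tried to catch
at run time.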

* Re: [PATCH 05/53] perf tools: Test correct path of perf in build-test
  2016-01-11 22:39         ` Arnaldo Carvalho de Melo
@ 2016-01-12  7:16           ` Wangnan (F)
  2016-01-12 14:08             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-12  7:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa, Namhyung Kim



On 2016/1/12 6:39, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 11, 2016 at 07:39:04PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Mon, Jan 11, 2016 at 07:06:18PM -0300, Arnaldo Carvalho de Melo escreveu:
>>> Em Mon, Jan 11, 2016 at 12:24:56PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>> Em Mon, Jan 11, 2016 at 01:47:56PM +0000, Wang Nan escreveu:
>>>>> If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
>>>>> will fail because perf resides in a different directory. Fix this by
>>>>> computing PERF_OUT according to 'O' and test correct output files.
>>>>> For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
>>>>> instead because the path is different from others ($(O)/perf vs
>>>>>   $(O)/tools/perf).
>>>> Ok, applying up to this patch I now manage to almost cleanly build it using O=,
>>>> see below, but seems that we have some race, as not all tests end up producing
>>>> such warnings.
>>>>
>>>> [acme@felicio linux]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; make O=/tmp/build/perf -C tools/perf build-test
>>>> make: Entering directory `/home/acme/git/linux/tools/perf'
>>>> Testing Makefile
>>>> - make_no_libperl: cd . && make -f Makefile   DESTDIR=/tmp/tmp.m1nXBMqhSA NO_LIBPERL=1
>>>> find: ‘/tmp/build/perf/util/trace-event-scripting.o’: No such file or directory

This can happen when find and rm run in parallel on the same directory.
However, I've never seen this message in build-test before.

>>> Well, it is happening even without O=:
>> So I removed a few patches and those aren't appearing anymore, please
>> take a look at my perf/core branch, running build-test on a few machines
>> now, will push soon.
>>
>> My hunch is that build-test has issues with parallel builds, but I'm not
>> sure...
>
> Good:
>
> - make_perf_o_O: cd . && make -f Makefile O=/tmp/tmp.oLeg8aUaOo DESTDIR=/tmp/tmp.16WP4HTQJs perf.o
> - make_util_pmu_bison_o_O: cd . && make -f Makefile O=/tmp/tmp.xNRV0pCXfD DESTDIR=/tmp/tmp.8dyU9uEbHe util/pmu-bison.o
> - make_no_libdw_dwarf_unwind_O: cd . && make -f Makefile O=/tmp/tmp.pHH4HExHcH DESTDIR=/tmp/tmp.Wo0m8fF5cp NO_LIBDW_DWARF_UNWIND=1
> - make_no_demangle_O: cd . && make -f Makefile O=/tmp/tmp.yWNsd4jOsI DESTDIR=/tmp/tmp.Q7eA4kCvwL NO_DEMANGLE=1
> - tarpkg: ./tests/perf-targz-src-pkg .
> - make -C <kernelsrc> tools/perf
> - make -C <kernelsrc>/tools perf
> OK
Glad to see this.

Thank you.

* Re: [PATCH 14/53] perf test: Check environment before start real BPF test
  2016-01-11 21:55   ` Arnaldo Carvalho de Melo
@ 2016-01-12  7:40     ` Wangnan (F)
  2016-01-12 14:10       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-12  7:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-kernel, pi3orama, lizefan, netdev, davem



On 2016/1/12 5:55, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 11, 2016 at 01:48:05PM +0000, Wang Nan escreveu:
>> Copying perf to old kernel system results:
>>
>>   # perf test bpf
>>   37: Test BPF filter                                          :
>>   37.1: Test basic BPF filtering                               : FAILED!
>>   37.2: Test BPF prologue generation                           : Skip
>>
>> However, in case when kernel doesn't support a test case it should
>> return 'Skip', 'FAILED!' should be reserved for kernel tests for when
>> the kernel supports a feature that then fails to work as advertised.
>>
>> This patch checks environment before real testcase.
> This is really strange, this other test is failing if the above patch is
> present, found by bisecting:
>
> [acme@felicio linux]$ perf test decoder
> 47: Test x86 instruction decoder - new instructions          : FAILED!
> [acme@felicio linux]$ git log --oneline -1
> 91fedd318e3d perf test: Check environment before start real BPF test
> [acme@felicio linux]$ git reset --hard HEAD^
> HEAD is now at f1f23526d3b6 perf test: Reset err after using it hold
> errcode in hist testcases
> [acme@felicio linux]$ m
> make: Entering directory `/home/acme/git/linux/tools/perf'
>    BUILD:   Doing 'make -j4' parallel build
>    CC       /tmp/build/perf/arch/common.o
>    CC       /tmp/build/perf/util/abspath.o
>    CC       /tmp/build/perf/builtin-bench.o
>    CC       /tmp/build/perf/util/alias.o
>
> <SNIP>
> [acme@felicio linux]$ git log --oneline -1
> f1f23526d3b6 perf test: Reset err after using it hold errcode in hist
> testcases
> [acme@felicio linux]$ perf test decoder
> 47: Test x86 instruction decoder - new instructions          : Ok
> [acme@felicio linux]$

Yes, really strange, and I can't reproduce your result
in my environment. What's the result of test -v?

Thank you.
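
The Skip versus FAILED! distinction in the commit message boils down to
probing the environment before running the test body. A minimal sketch,
assuming result codes like the ones in tools/perf/tests/tests.h (quoted
from memory, so treat the values as an assumption); run_guarded_test()
and the four stub callbacks are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror perf's test result codes. */
enum {
	TEST_OK   =  0,
	TEST_FAIL = -1,
	TEST_SKIP = -2,
};

/*
 * Run 'body' only if 'env_ok' reports that the kernel supports the
 * feature. An unsupported environment is a skip, not a failure.
 */
static int run_guarded_test(int (*env_ok)(void), int (*body)(void))
{
	assert(env_ok && body);

	if (!env_ok())
		return TEST_SKIP;

	return body() ? TEST_FAIL : TEST_OK;
}

/* Hypothetical stand-ins for a real probe and a real test body. */
static int env_present(void) { return 1; }
static int env_missing(void) { return 0; }
static int body_pass(void)   { return 0; }
static int body_fail(void)   { return 1; }
```

FAILED! then only shows up when the feature is advertised and the body
still returns an error.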

* Re: [PATCH 17/53] perf test: Improve bp_signal
  2016-01-11 21:37   ` Arnaldo Carvalho de Melo
  2016-01-12  4:13     ` Wangnan (F)
@ 2016-01-12  9:21     ` Jiri Olsa
  2016-01-12 14:11       ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 124+ messages in thread
From: Jiri Olsa @ 2016-01-12  9:21 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Wang Nan, linux-kernel, pi3orama, lizefan, netdev, davem,
	Will Deacon, Jiri Olsa

On Mon, Jan 11, 2016 at 06:37:29PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jan 11, 2016 at 01:48:08PM +0000, Wang Nan escreveu:
> > Will Deacon [1] has some question on patch [2]. This patch improves
> > test__bp_signal so we can test:
> > 
> >  1. A watchpoint and a breakpoint that fire on the same instruction
> >  2. Nested signals
> > 
> > Test result:
> > 
> >  On x86_64 and ARM64 (result are similar with patch [2] on ARM64):
> > 
> >  # ./perf test -v signal
> >  17: Test breakpoint overflow signal handler                  :
> >  --- start ---
> >  test child forked, pid 10213
> >  count1 1, count2 3, count3 2, overflow 3, overflows_2 3
> >  test child finished with 0
> >  ---- end ----
> >  Test breakpoint overflow signal handler: Ok
> > 
> > So at least 2 cases Will doubted are handled correctly.
> > 
> > [1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
> > [2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com
> > 
> > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> 
> Will, are you ok with this one? Can I have an Acked-by or better,
> Tested-by for the AARCH64 base?
> 
> IIRC Jiri made some comment about this one?

I thought I acked this one.. all comments were addressed, so:

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

* Re: [PATCH 01/53] perf tools: Add -lutil in python lib list for broken python-config
  2016-01-11 13:47 ` [PATCH 01/53] perf tools: Add -lutil in python lib list for broken python-config Wang Nan
@ 2016-01-12  9:43   ` Jiri Olsa
  2016-01-12 10:09   ` [tip:perf/urgent] " tip-bot for Wang Nan
  1 sibling, 0 replies; 124+ messages in thread
From: Jiri Olsa @ 2016-01-12  9:43 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa,
	Namhyung Kim

On Mon, Jan 11, 2016 at 01:47:52PM +0000, Wang Nan wrote:
> On some systems python-config is broken, causing link failures like this:

I never got the 0/53 email of this patchset.. is there one?

I was just wondering if there's a git tree with the patchset,
it'd make it easier for me to review it

thanks,
jirka


> 
>  /usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_forkpty':
>  /opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3816: undefined reference to `forkpty'
>  /usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_openpty':
>  /opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3756: undefined reference to `openpty'
>  collect2: error: ld returned 1 exit status
> make[1]: *** [/home/wangnan/kernel-hydrogen/tools/perf/out/perf] Error 1
> make: *** [all] Error 2
> 
>  $ python-config --libs
>  -lpthread -ldl -lpthread -lutil -lm -lpython2.7
> 
> In this case a '-lutil' should be appended to -lpython2.7.
> 
> (I know we have --start-group and --end-group. I can see them in
> command line of collect2 by strace. However it doesn't work. Seems
> I have a broken environment?)
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/config/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
> index 254d06e..0793c76 100644
> --- a/tools/perf/config/Makefile
> +++ b/tools/perf/config/Makefile
> @@ -493,7 +493,7 @@ else
>  
>        PYTHON_EMBED_LDOPTS := $(shell $(PYTHON_CONFIG_SQ) --ldflags 2>/dev/null)
>        PYTHON_EMBED_LDFLAGS := $(call strip-libs,$(PYTHON_EMBED_LDOPTS))
> -      PYTHON_EMBED_LIBADD := $(call grep-libs,$(PYTHON_EMBED_LDOPTS))
> +      PYTHON_EMBED_LIBADD := $(call grep-libs,$(PYTHON_EMBED_LDOPTS)) -lutil
>        PYTHON_EMBED_CCOPTS := $(shell $(PYTHON_CONFIG_SQ) --cflags 2>/dev/null)
>        FLAGS_PYTHON_EMBED := $(PYTHON_EMBED_CCOPTS) $(PYTHON_EMBED_LDOPTS)
>  
> -- 
> 1.8.3.4
> 

* [tip:perf/urgent] perf tools: Add -lutil in python lib list for broken python-config
  2016-01-11 13:47 ` [PATCH 01/53] perf tools: Add -lutil in python lib list for broken python-config Wang Nan
  2016-01-12  9:43   ` Jiri Olsa
@ 2016-01-12 10:09   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 124+ messages in thread
From: tip-bot for Wang Nan @ 2016-01-12 10:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, acme, tglx, jolsa, wangnan0, namhyung, linux-kernel, hpa, lizefan

Commit-ID:  11dc0c57ba8a935dec5a3240370941a4380721c4
Gitweb:     http://git.kernel.org/tip/11dc0c57ba8a935dec5a3240370941a4380721c4
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 11 Jan 2016 13:47:52 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Jan 2016 19:03:16 -0300

perf tools: Add -lutil in python lib list for broken python-config

On some systems python-config is broken, causing link failures like this:

   /usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_forkpty':
   /opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3816: undefined reference to `forkpty'
   /usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_openpty':
   /opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3756: undefined reference to `openpty'
   collect2: error: ld returned 1 exit status
  make[1]: *** [/home/wangnan/kernel-hydrogen/tools/perf/out/perf] Error 1
  make: *** [all] Error 2

  $ python-config --libs
  -lpthread -ldl -lpthread -lutil -lm -lpython2.7

In this case a '-lutil' should be appended to -lpython2.7.

(I know we have --start-group and --end-group. I can see them in command
line of collect2 by strace. However it doesn't work. Seems I have a
broken environment?)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/config/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 254d06e..0793c76 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -493,7 +493,7 @@ else
 
       PYTHON_EMBED_LDOPTS := $(shell $(PYTHON_CONFIG_SQ) --ldflags 2>/dev/null)
       PYTHON_EMBED_LDFLAGS := $(call strip-libs,$(PYTHON_EMBED_LDOPTS))
-      PYTHON_EMBED_LIBADD := $(call grep-libs,$(PYTHON_EMBED_LDOPTS))
+      PYTHON_EMBED_LIBADD := $(call grep-libs,$(PYTHON_EMBED_LDOPTS)) -lutil
       PYTHON_EMBED_CCOPTS := $(shell $(PYTHON_CONFIG_SQ) --cflags 2>/dev/null)
       FLAGS_PYTHON_EMBED := $(PYTHON_EMBED_CCOPTS) $(PYTHON_EMBED_LDOPTS)
 

* [tip:perf/urgent] perf tools: Fix phony build target for build-test
  2016-01-11 13:47 ` [PATCH 02/53] perf tools: Fix phony build target for build-test Wang Nan
@ 2016-01-12 10:09   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 124+ messages in thread
From: tip-bot for Wang Nan @ 2016-01-12 10:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, mingo, linux-kernel, lizefan, wangnan0, jolsa, hpa, acme, namhyung

Commit-ID:  3167eea27b27b29d375ee6b34dd83035c04d5da8
Gitweb:     http://git.kernel.org/tip/3167eea27b27b29d375ee6b34dd83035c04d5da8
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 11 Jan 2016 13:47:53 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Jan 2016 19:03:24 -0300

perf tools: Fix phony build target for build-test

make_kernelsrc and make_kernelsrc_tools are skipped if a previous
build-test has been run, because 'make build-test' creates two files
with the same names. To avoid this, add them to the .PHONY list.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/make | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index c1fbb8e..130be7c 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -280,5 +280,5 @@ all: $(run) $(run_O) tarpkg make_kernelsrc make_kernelsrc_tools
 out: $(run_O)
 	@echo OK
 
-.PHONY: all $(run) $(run_O) tarpkg clean
+.PHONY: all $(run) $(run_O) tarpkg clean make_kernelsrc make_kernelsrc_tools
 endif # ifndef MK

* [tip:perf/urgent] perf tools: Fix PowerPC native building
  2016-01-11 13:47 ` [PATCH 06/53] perf tools: Fix PowerPC native building Wang Nan
@ 2016-01-12 10:10   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 124+ messages in thread
From: tip-bot for Wang Nan @ 2016-01-12 10:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, acme, hpa, linux-kernel, jolsa, mingo, wangnan0,
	naveen.n.rao, sukadev, lizefan

Commit-ID:  8f9e05fb298f16c0cda2e7e78b603331a79f9c10
Gitweb:     http://git.kernel.org/tip/8f9e05fb298f16c0cda2e7e78b603331a79f9c10
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 11 Jan 2016 13:47:57 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Jan 2016 19:22:20 -0300

perf tools: Fix PowerPC native building

Check the BPF syscall number and turn off libbpf building on platforms
that don't correctly support sys_bpf, instead of blocking compilation.

Reported-and-Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/build/feature/test-bpf.c | 20 +++++++++++++++++++-
 tools/lib/bpf/bpf.c            |  4 ++--
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/tools/build/feature/test-bpf.c b/tools/build/feature/test-bpf.c
index 062bac8..b389026 100644
--- a/tools/build/feature/test-bpf.c
+++ b/tools/build/feature/test-bpf.c
@@ -1,9 +1,23 @@
+#include <asm/unistd.h>
 #include <linux/bpf.h>
+#include <unistd.h>
+
+#ifndef __NR_bpf
+# if defined(__i386__)
+#  define __NR_bpf 357
+# elif defined(__x86_64__)
+#  define __NR_bpf 321
+# elif defined(__aarch64__)
+#  define __NR_bpf 280
+#  error __NR_bpf not defined. libbpf does not support your arch.
+# endif
+#endif
 
 int main(void)
 {
 	union bpf_attr attr;
 
+	/* Check fields in attr */
 	attr.prog_type = BPF_PROG_TYPE_KPROBE;
 	attr.insn_cnt = 0;
 	attr.insns = 0;
@@ -14,5 +28,9 @@ int main(void)
 	attr.kern_version = 0;
 
 	attr = attr;
-	return 0;
+	/*
+	 * Test existence of __NR_bpf and BPF_PROG_LOAD.
+	 * This call should fail if we run the testcase.
+	 */
+	return syscall(__NR_bpf, BPF_PROG_LOAD, attr, sizeof(attr));
 }
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 5bdc6ea..1f91cc9 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -14,8 +14,8 @@
 #include "bpf.h"
 
 /*
- * When building perf, unistd.h is override. Define __NR_bpf is
- * required to be defined.
+ * When building perf, unistd.h is overrided. __NR_bpf is
+ * required to be defined explicitly.
  */
 #ifndef __NR_bpf
 # if defined(__i386__)

* [tip:perf/urgent] tools: Move Makefile.arch from perf/config to tools/scripts
  2016-01-11 13:47 ` [PATCH 07/53] tools: Move Makefile.arch from perf/config to tools/scripts Wang Nan
  2016-01-11 13:52   ` Wangnan (F)
@ 2016-01-12 10:10   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 124+ messages in thread
From: tip-bot for Wang Nan @ 2016-01-12 10:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: wangnan0, jolsa, sukadev, tglx, hpa, mingo, linux-kernel,
	lizefan, acme, naveen.n.rao

Commit-ID:  935e6bd310f20d3371ae6bd6f01dd3430a4123b6
Gitweb:     http://git.kernel.org/tip/935e6bd310f20d3371ae6bd6f01dd3430a4123b6
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 11 Jan 2016 13:47:58 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Jan 2016 19:22:20 -0300

tools: Move Makefile.arch from perf/config to tools/scripts

After this patch other directories can use this architecture detector
without directly including it from perf's directory. Libbpf would
utilize it to get proper $(ARCH) so it can receive correct uapi include
directory.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452520124-2073-8-git-send-email-wangnan0@huawei.com
[ Add missing srctree definition in tests/make ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@kernel.org>
---
 tools/perf/config/Makefile                   |  2 +-
 tools/perf/tests/make                        | 16 +++++++++++++++-
 tools/{perf/config => scripts}/Makefile.arch |  0
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 0793c76..7545ba60 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -17,7 +17,7 @@ detected_var = $(shell echo "$(1)=$($(1))" >> $(OUTPUT).config-detected)
 
 CFLAGS := $(EXTRA_CFLAGS) $(EXTRA_WARNINGS)
 
-include $(src-perf)/config/Makefile.arch
+include $(srctree)/tools/scripts/Makefile.arch
 
 $(call detected_var,ARCH)
 
diff --git a/tools/perf/tests/make b/tools/perf/tests/make
index 130be7c..df38dec 100644
--- a/tools/perf/tests/make
+++ b/tools/perf/tests/make
@@ -1,3 +1,5 @@
+include ../scripts/Makefile.include
+
 ifndef MK
 ifeq ($(MAKECMDGOALS),)
 # no target specified, trigger the whole suite
@@ -12,7 +14,19 @@ endif
 else
 PERF := .
 
-include config/Makefile.arch
+# As per kernel Makefile, avoid funny character set dependencies
+unexport LC_ALL
+LC_COLLATE=C
+LC_NUMERIC=C
+export LC_COLLATE LC_NUMERIC
+
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(shell pwd)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+#$(info Determined 'srctree' to be $(srctree))
+endif
+
+include $(srctree)/tools/scripts/Makefile.arch
 
 # FIXME looks like x86 is the only arch running tests ;-)
 # we need some IS_(32/64) flag to make this generic
diff --git a/tools/perf/config/Makefile.arch b/tools/scripts/Makefile.arch
similarity index 100%
rename from tools/perf/config/Makefile.arch
rename to tools/scripts/Makefile.arch

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [tip:perf/urgent] perf bpf: Fix build breakage due to libbpf
  2016-01-11 13:48 ` [PATCH 09/53] perf: bpf: Fix build breakage due to libbpf Wang Nan
@ 2016-01-12 10:10   ` tip-bot for Naveen N. Rao
  0 siblings, 0 replies; 124+ messages in thread
From: tip-bot for Naveen N. Rao @ 2016-01-12 10:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, acme, jolsa, mingo, wangnan0, naveen.n.rao,
	lizefan, hpa, tglx, sukadev

Commit-ID:  d5ef3140351450d240f864208317f5665e7bbd1c
Gitweb:     http://git.kernel.org/tip/d5ef3140351450d240f864208317f5665e7bbd1c
Author:     Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
AuthorDate: Mon, 11 Jan 2016 13:48:00 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Jan 2016 19:22:21 -0300

perf bpf: Fix build breakage due to libbpf

perf build is currently (v4.4-rc5) broken on powerpc:

  bpf.c:28:4: error: #error __NR_bpf not defined. libbpf does not support
  your arch.
   #  error __NR_bpf not defined. libbpf does not support your arch.
      ^

Fix this by including tools/scripts/Makefile.arch for the proper $ARCH
macro. While at it, remove redundant LP64 macro definition.

Also, since libbpf requires $(srctree) now, detect the path of srctree
like perf.
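
For readers unfamiliar with the #error quoted above, here is a minimal
sketch of the kind of per-architecture fallback that produces it. The
syscall numbers are the ones I know from the Linux syscall tables for
these three architectures; the helper bpf_syscall_nr() is hypothetical
and exists only for illustration. It is not the code in bpf.c.

```c
#include <sys/syscall.h>	/* may or may not provide __NR_bpf */

/*
 * Fallback numbers for a few architectures. Any other architecture
 * has to get __NR_bpf from the matching uapi headers, which is why
 * the build needs a correct $(ARCH).
 */
#ifndef __NR_bpf
# if defined(__i386__)
#  define __NR_bpf 357
# elif defined(__x86_64__)
#  define __NR_bpf 321
# elif defined(__aarch64__)
#  define __NR_bpf 280
# else
#  error __NR_bpf not defined. libbpf does not support your arch.
# endif
#endif

/* Hypothetical helper, for illustration only. */
static inline long bpf_syscall_nr(void)
{
	return __NR_bpf;
}
```

On an architecture outside the fallback list (powerpc here), the build
only succeeds if the headers found through the include path already
define __NR_bpf.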

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-10-git-send-email-wangnan0@huawei.com
[Use tools/scripts/Makefile.arch]
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/bpf/Makefile | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
index 84e0e98..fc1bc75 100644
--- a/tools/lib/bpf/Makefile
+++ b/tools/lib/bpf/Makefile
@@ -6,6 +6,12 @@ BPF_EXTRAVERSION = 1
 
 MAKEFLAGS += --no-print-directory
 
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(shell pwd)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+#$(info Determined 'srctree' to be $(srctree))
+endif
 
 # Makefiles suck: This macro sets a default value of $(2) for the
 # variable named by $(1), unless the variable has been set by
@@ -31,7 +37,8 @@ INSTALL = install
 DESTDIR ?=
 DESTDIR_SQ = '$(subst ','\'',$(DESTDIR))'
 
-LP64 := $(shell echo __LP64__ | ${CC} ${CFLAGS} -E -x c - | tail -n 1)
+include $(srctree)/tools/scripts/Makefile.arch
+
 ifeq ($(LP64), 1)
   libdir_relative = lib64
 else
@@ -57,13 +64,6 @@ ifndef VERBOSE
   VERBOSE = 0
 endif
 
-ifeq ($(srctree),)
-srctree := $(patsubst %/,%,$(dir $(shell pwd)))
-srctree := $(patsubst %/,%,$(dir $(srctree)))
-srctree := $(patsubst %/,%,$(dir $(srctree)))
-#$(info Determined 'srctree' to be $(srctree))
-endif
-
 FEATURE_USER = .libbpf
 FEATURE_TESTS = libelf libelf-getphdrnum libelf-mmap bpf
 FEATURE_DISPLAY = libelf bpf

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [tip:perf/urgent] tools build: Add BPF feature check to test-all
  2016-01-11 13:48 ` [PATCH 10/53] tools build: Add BPF feature check to test-all Wang Nan
@ 2016-01-12 10:11   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 124+ messages in thread
From: tip-bot for Wang Nan @ 2016-01-12 10:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: lizefan, mingo, tglx, namhyung, acme, linux-kernel, jolsa, hpa, wangnan0

Commit-ID:  0c4d40d580752dd7d639208782f71e317b16be67
Gitweb:     http://git.kernel.org/tip/0c4d40d580752dd7d639208782f71e317b16be67
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 11 Jan 2016 13:48:01 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Jan 2016 19:22:21 -0300

tools build: Add BPF feature check to test-all

The test-all.c file doesn't check BPF related features. For an
environment with all other features enabled, BPF would be considered
enabled without doing real feature check.

This patch adds test-bpf.c into test-all.c.
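
The renaming trick that test-all.c relies on can be sketched as below.
This is an illustration only: the two function bodies stand in for the
contents of the included test-*.c files, and run_all_feature_tests()
is a hypothetical name (test-all.c itself uses main()).

```c
/*
 * Stand-in for '# include "test-bpf.c"': while the macro is active,
 * the 'main' defined here is compiled as main_test_bpf().
 */
#define main main_test_bpf
int main(void)
{
	return 0;	/* a real feature test would reference the API under test */
}
#undef main

/* Same pattern for a second feature test. */
#define main main_test_lzma
int main(void)
{
	return 0;
}
#undef main

/* Hypothetical aggregate entry point that references every renamed test. */
int run_all_feature_tests(void)
{
	main_test_lzma();
	main_test_bpf();

	return 0;
}
```

Because every renamed main is compiled into one translation unit, a
single failing feature makes the whole file fail to build, and the
build system then falls back to checking features one by one.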

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-11-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/build/feature/test-all.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index 33cf6f2..81025ca 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -125,6 +125,10 @@
 # include "test-get_cpuid.c"
 #undef main
 
+#define main main_test_bpf
+# include "test-bpf.c"
+#undef main
+
 int main(int argc, char *argv[])
 {
 	main_test_libpython();
@@ -153,6 +157,7 @@ int main(int argc, char *argv[])
 	main_test_pthread_attr_setaffinity_np();
 	main_test_lzma();
 	main_test_get_cpuid();
+	main_test_bpf();
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [tip:perf/urgent] perf test: Fix false TEST_OK result for 'perf test hist'
  2016-01-11 13:48 ` [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist' Wang Nan
  2016-01-11 14:25   ` Sergei Shtylyov
@ 2016-01-12 10:11   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 124+ messages in thread
From: tip-bot for Wang Nan @ 2016-01-12 10:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, lizefan, tglx, hpa, masami.hiramatsu.pt, acme,
	namhyung, jolsa, mingo, wangnan0

Commit-ID:  71b3ee7e65ffb48135d875d9c36e3183b9ecffeb
Gitweb:     http://git.kernel.org/tip/71b3ee7e65ffb48135d875d9c36e3183b9ecffeb
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 11 Jan 2016 13:48:02 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Jan 2016 19:22:22 -0300

perf test: Fix false TEST_OK result for 'perf test hist'

Commit 71d6de64fedd ("perf test: Fix hist testcases when kptr_restrict is on")
solves a double free problem when 'perf test hist' calls
setup_fake_machine(). However, the result is still incorrect. For example:

  $ ./perf test -v 'filtering hist entries'
  25: Test filtering hist entries                              :
  --- start ---
  test child forked, pid 4186
  Cannot create kernel maps
  test child finished with 0
  ---- end ----
  Test filtering hist entries: Ok

In this case the body of this test is not executed at all, but the
result is 'Ok'.

Actually, in setup_fake_machine() there's no need to create real kernel
maps. What we want are the fake maps. This patch removes the
machine__create_kernel_maps() in setup_fake_machine(), so it won't be
affected by kptr_restrict setting.

Test result:

  $ cat /proc/sys/kernel/kptr_restrict
  1
  $ ~/perf test -v hist
  15: Test matching and linking multiple hists                 :
  --- start ---
  test child forked, pid 24031
  test child finished with 0
  ---- end ----
  Test matching and linking multiple hists: Ok
  [SNIP]

Suggested-and-Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/hists_common.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index bcfd081..071a8b5 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -87,11 +87,6 @@ struct machine *setup_fake_machine(struct machines *machines)
 		return NULL;
 	}
 
-	if (machine__create_kernel_maps(machine)) {
-		pr_debug("Cannot create kernel maps\n");
-		return NULL;
-	}
-
 	for (i = 0; i < ARRAY_SIZE(fake_threads); i++) {
 		struct thread *thread;
 

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [tip:perf/urgent] perf test: Reset err after using it hold errcode in hist testcases
  2016-01-11 13:48 ` [PATCH 12/53] perf test: Reset err after using it hold errcode in hist testcases Wang Nan
@ 2016-01-12 10:11   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 124+ messages in thread
From: tip-bot for Wang Nan @ 2016-01-12 10:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: masami.hiramatsu.pt, mingo, tglx, jolsa, hpa, lizefan, acme,
	linux-kernel, wangnan0, namhyung

Commit-ID:  b0500c169b4069e40f03391c7280cd6eaf849e49
Gitweb:     http://git.kernel.org/tip/b0500c169b4069e40f03391c7280cd6eaf849e49
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Mon, 11 Jan 2016 13:48:03 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Jan 2016 19:22:22 -0300

perf test: Reset err after using it hold errcode in hist testcases

All hists test cases forget to reset err after using it to hold an
error code. If an error occurs in setup_fake_machine(), the test
incorrectly returns TEST_OK.

This patch fixes it.
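
A minimal sketch of the problem, reduced to its control flow. The
fake_* helpers are hypothetical stand-ins for parse_events() and
setup_fake_machine(); only the TEST_OK/TEST_FAIL values and the
'err = TEST_FAIL' reset mirror the patch.

```c
#include <stddef.h>

enum { TEST_OK = 0, TEST_FAIL = -1 };

/* Hypothetical stand-ins for parse_events() and setup_fake_machine(). */
static int fake_parse_events(void)
{
	return 0;			/* success leaves err == 0 */
}

static void *fake_setup_machine(int fail)
{
	static int machine;

	return fail ? NULL : &machine;
}

/* Old shape: err still holds 0 when the setup step fails. */
int hists_test_without_reset(int setup_fails)
{
	int err = fake_parse_events();

	if (err)
		goto out;
	if (!fake_setup_machine(setup_fails))
		goto out;		/* returns 0, i.e. TEST_OK */
	err = TEST_OK;
out:
	return err;
}

/* Fixed shape: err is reset so later failures are reported. */
int hists_test_with_reset(int setup_fails)
{
	int err = fake_parse_events();

	if (err)
		goto out;
	err = TEST_FAIL;
	if (!fake_setup_machine(setup_fails))
		goto out;		/* returns TEST_FAIL */
	err = TEST_OK;
out:
	return err;
}
```

With a failing setup step the first function reports success and the
second reports failure, which is the behaviour change the one-line
additions below are after.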

Suggested-and-Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-13-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/hists_cumulate.c | 1 +
 tools/perf/tests/hists_filter.c   | 1 +
 tools/perf/tests/hists_link.c     | 1 +
 tools/perf/tests/hists_output.c   | 1 +
 4 files changed, 4 insertions(+)

diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index e360892..5e6a86e 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -706,6 +706,7 @@ int test__hists_cumulate(int subtest __maybe_unused)
 	err = parse_events(evlist, "cpu-clock", NULL);
 	if (err)
 		goto out;
+	err = TEST_FAIL;
 
 	machines__init(&machines);
 
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 2a784be..351a424 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -120,6 +120,7 @@ int test__hists_filter(int subtest __maybe_unused)
 	err = parse_events(evlist, "task-clock", NULL);
 	if (err)
 		goto out;
+	err = TEST_FAIL;
 
 	/* default sort order (comm,dso,sym) will be used */
 	if (setup_sorting(NULL) < 0)
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index c764d69..64b257d 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -293,6 +293,7 @@ int test__hists_link(int subtest __maybe_unused)
 	if (err)
 		goto out;
 
+	err = TEST_FAIL;
 	/* default sort order (comm,dso,sym) will be used */
 	if (setup_sorting(NULL) < 0)
 		goto out;
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index ebe6cd4..b231265 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -597,6 +597,7 @@ int test__hists_output(int subtest __maybe_unused)
 	err = parse_events(evlist, "cpu-clock", NULL);
 	if (err)
 		goto out;
+	err = TEST_FAIL;
 
 	machines__init(&machines);
 

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 16/53 v2] perf tools: Fix mmap2 event allocation in synthesize code
  2016-01-11 21:03   ` Arnaldo Carvalho de Melo
@ 2016-01-12 10:12     ` Wang Nan
  2016-01-12 10:49       ` 平松雅巳 / HIRAMATU,MASAMI
  2016-01-13  9:40       ` [tip:perf/urgent] " tip-bot for Wang Nan
  0 siblings, 2 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-12 10:12 UTC (permalink / raw)
  To: acme
  Cc: jolsa, linux-kernel, Wang Nan, Arnaldo Carvalho de Melo,
	He Kuang, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama

perf_event__synthesize_mmap_events() issues mmap2 events, but the
memory of that event is allocated using:

 mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);

If the path of the mmap source file is long (near PATH_MAX), random
crashes can happen. The allocation should use sizeof(mmap_event->mmap2).

Fix two memory allocations.
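
To make the size mismatch concrete, here is a sketch with simplified
stand-in structures. The field lists are my assumption, modelled on
perf's mmap and mmap2 event layouts, and every sketch_* name is
hypothetical. The point is only that the mmap2 record carries extra
fields before its filename buffer, so it is larger than the mmap one.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PATH_MAX 4096	/* assumed value of PATH_MAX */

struct sketch_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct sketch_mmap {
	struct sketch_header header;
	uint32_t pid, tid;
	uint64_t start, len, pgoff;
	char filename[SKETCH_PATH_MAX];
};

struct sketch_mmap2 {
	struct sketch_header header;
	uint32_t pid, tid;
	uint64_t start, len, pgoff;
	uint32_t maj, min;		/* extra fields push filename further out */
	uint64_t ino, ino_generation;
	uint32_t prot, flags;
	char filename[SKETCH_PATH_MAX];
};

union sketch_event {
	struct sketch_header header;
	struct sketch_mmap mmap;
	struct sketch_mmap2 mmap2;
};

/* Bytes an mmap2 record can occupy beyond an mmap-sized allocation. */
size_t mmap2_shortfall(void)
{
	union sketch_event *ev = NULL;

	/* sizeof does not evaluate its operand, so ev is never dereferenced */
	return sizeof(ev->mmap2) - sizeof(ev->mmap);
}
```

A filename close to PATH_MAX written through the mmap2 member of a
buffer sized for the mmap member runs past the end of the allocation
by this shortfall.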

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---

v1 -> v2: Don't rename mmap to mmap2.

---
 tools/perf/util/event.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index cd61bb1..85155e9 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -503,7 +503,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 	if (comm_event == NULL)
 		goto out;
 
-	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
+	mmap_event = malloc(sizeof(mmap_event->mmap2) + machine->id_hdr_size);
 	if (mmap_event == NULL)
 		goto out_free_comm;
 
@@ -577,7 +577,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
 	if (comm_event == NULL)
 		goto out;
 
-	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
+	mmap_event = malloc(sizeof(mmap_event->mmap2) + machine->id_hdr_size);
 	if (mmap_event == NULL)
 		goto out_free_comm;
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* RE: [PATCH 16/53 v2] perf tools: Fix mmap2 event allocation in synthesize code
  2016-01-12 10:12     ` [PATCH 16/53 v2] " Wang Nan
@ 2016-01-12 10:49       ` 平松雅巳 / HIRAMATU,MASAMI
  2016-01-12 10:51         ` Wangnan (F)
  2016-01-13  9:40       ` [tip:perf/urgent] " tip-bot for Wang Nan
  1 sibling, 1 reply; 124+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2016-01-12 10:49 UTC (permalink / raw)
  To: 'Wang Nan', acme
  Cc: jolsa, linux-kernel, Arnaldo Carvalho de Melo, He Kuang,
	Namhyung Kim, Zefan Li, pi3orama

>From: Wang Nan [mailto:wangnan0@huawei.com]
>
>perf_event__synthesize_mmap_events() issues mmap2 events, but the
>memory of that event is allocated using:
>
> mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
>
>If path of mmap source file is long (near PATH_MAX), random crash
>would happen. Should use sizeof(mmap_event->mmap2).
>
>Fix two memory allocations.

Looks good to me. But hope to have another rename patch soon after this...

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thanks,

>
>Signed-off-by: Wang Nan <wangnan0@huawei.com>
>Acked-by: Jiri Olsa <jolsa@kernel.org>
>Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>Cc: He Kuang <hekuang@huawei.com>
>Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>Cc: Namhyung Kim <namhyung@kernel.org>
>Cc: Zefan Li <lizefan@huawei.com>
>Cc: pi3orama@163.com
>---
>
>v1 -> v2: Don't rename mmap to mmap2.
>
>---
> tools/perf/util/event.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
>index cd61bb1..85155e9 100644
>--- a/tools/perf/util/event.c
>+++ b/tools/perf/util/event.c
>@@ -503,7 +503,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
> 	if (comm_event == NULL)
> 		goto out;
>
>-	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
>+	mmap_event = malloc(sizeof(mmap_event->mmap2) + machine->id_hdr_size);
> 	if (mmap_event == NULL)
> 		goto out_free_comm;
>
>@@ -577,7 +577,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
> 	if (comm_event == NULL)
> 		goto out;
>
>-	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
>+	mmap_event = malloc(sizeof(mmap_event->mmap2) + machine->id_hdr_size);
> 	if (mmap_event == NULL)
> 		goto out_free_comm;
>
>--
>1.8.3.4

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 16/53 v2] perf tools: Fix mmap2 event allocation in synthesize code
  2016-01-12 10:49       ` 平松雅巳 / HIRAMATU,MASAMI
@ 2016-01-12 10:51         ` Wangnan (F)
  2016-01-12 14:24           ` acme
  0 siblings, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-12 10:51 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI, acme
  Cc: jolsa, linux-kernel, Arnaldo Carvalho de Melo, He Kuang,
	Namhyung Kim, Zefan Li, pi3orama



On 2016/1/12 18:49, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: Wang Nan [mailto:wangnan0@huawei.com]
>>
>> perf_event__synthesize_mmap_events() issues mmap2 events, but the
>> memory of that event is allocated using:
>>
>> mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
>>
>> If path of mmap source file is long (near PATH_MAX), random crash
>> would happen. Should use sizeof(mmap_event->mmap2).
>>
>> Fix two memory allocations.
> Looks good to me. But hope to have another rename patch soon after this...

According to Arnaldo, we don't need a rename patch. He thinks mmap_event
is okay. Right?

Thank you.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-12  6:11       ` Alexei Starovoitov
@ 2016-01-12 12:36         ` Wangnan (F)
  2016-01-12 19:56           ` Alexei Starovoitov
  0 siblings, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-12 12:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Peter Zijlstra
  Cc: acme, linux-kernel, pi3orama, lizefan, netdev, davem,
	Adrian Hunter, Arnaldo Carvalho de Melo, David Ahern,
	Ingo Molnar, Yunlong Song



On 2016/1/12 14:11, Alexei Starovoitov wrote:
> On Tue, Jan 12, 2016 at 01:33:28PM +0800, Wangnan (F) wrote:
>>
>> On 2016/1/12 2:09, Alexei Starovoitov wrote:
>>> On Mon, Jan 11, 2016 at 01:48:18PM +0000, Wang Nan wrote:
>>>> This patch introduces a PERF_SAMPLE_TAILSIZE flag which allows a size
>>>> field attached at the end of a sample. The idea comes from [1]:
>>>> with the size at the tail of an event, a user program that reads
>>>> from the ring buffer can parse events backward.
>>>>
>>>> For example:
>>>>
>>>>     head
>>>>      |
>>>>      V
>>>>   +--+---+-------+----------+------+---+
>>>>   |E6|...|   B  8|   C    11|  D  7|E..|
>>>>   +--+---+-------+----------+------+---+
>>>>
>>>> In this case, from the 'head' pointer provided by kernel, user program
>>>> can first see '6' by (*(head - sizeof(u64))), then it can get the start
>>>> pointer of record 'E', then it can read size and find start position
>>>> of record D, C, B in similar way.
>>> adding extra 8 bytes for every sample is quite unfortunate.
>>> How about another idea:
>>> . update data_tail pointer when head is about to overwrite it
>>>
>>> Ex:
>>>     head   data_tail
>>>      |       |
>>>      V       V
>>>   +--+-------+-------+---+----+---+
>>>   |E |  ...  |   B   | C |  D | E |
>>>   +--+-------+-------+---+----+---+
>>>
>>> if new sample F is about to overwrite B, the kernel would need
>>> to read the size of B from B's header and update data_tail to point C.
>>> Or even further.
>>> Comparing to TAILSIZE approach, now kernel will be doing both reads
>>> and writes into ring-buffer and there is a concern that reads may
>>> be hitting cold data, but if the records are small they may be
>>> actually on the same cache line brought by the previous
>>> read A's header, write E record cycle. So I think we shouldn't see
>>> cache misses.
>> After ring buffer rewind, we need a read before nearly
>> every write operations. The performance penalty depends on
>> configuration of write allocate. In addition, another data
>> dependency is required: we must wait until the size of
>> event B is retrieved before overwriting it.
>>
>> Even in the very first try at 2013 in [1], reading from the ring
>> buffer is avoided. I don't think Peter changes his mind now.
>>
>>> Another concern is validity of records stored. If user space messes
>>> with ring-buffer, kernel won't be able to move data_tail properly
>>> and would need to indicate that to userspace somehow.
>>> But memory saving of 8 bytes per record could be sizable
>> Yes. But I have already discussed with Peter on this in [2].
>> Last month I suggested:
>>
>> <quote>
>>
>>   1. If PERF_SAMPLE_SIZE is selected, we can avoid outputting the event
>>      size in header. Which eliminate extra space cost;
>> </quote>
>>
>> However:
>>
>> <quote>
>>
>> That would mandate you always parse the stream backwards. Which seems
>> rather unfortunate. Also, no you cannot recoup the extra space, see the
>> alignment and size requirement.
> hmm, in this kernel patch I see that you're adding 8 bytes for
> every record via this extra TAILSIZE flag and in perf you're
> walking the ring buffer backwards by reading this 8 byte
> sizes, comparing header sizes and so on until reaching beginning,
> where you start dumping it as normal.
> So for this 'signal to perf' approach to work the ring buffer
> will contain tailsizes everywhere just so that user space can
> find the beginning. That's not very pretty. imo if kernel
> can do header read to adjust data_tail it would make user
> space side clean. May be there are other solutions.
> Adding tailsize seems like brute force hack.
> There must be some nicer way.
Hi Peter,

  What's your opinion? Should we reconsider moving the size field from the
header to the end, or moving the whole header to the end of a record?

Thank you.
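
For reference, the backward walk described in the quoted patch
description can be sketched as follows. This is a simplified
illustration, not the perf implementation: it assumes a flat buffer
with no wrap-around, and that each record ends with a u64 holding the
size of the whole record (trailer included). Both function names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Walk backward from 'head'. Stores up to 'max' record start offsets,
 * newest record first, and returns how many were found.
 */
size_t walk_backward(const unsigned char *buf, size_t head,
		     size_t *starts, size_t max)
{
	size_t n = 0, pos = head;

	while (pos >= sizeof(uint64_t) && n < max) {
		uint64_t size;

		/* the size of the record ending at 'pos' sits just before it */
		memcpy(&size, buf + pos - sizeof(size), sizeof(size));
		if (size < sizeof(size) || size > pos)
			break;	/* corrupt or partially overwritten record */
		pos -= size;
		starts[n++] = pos;
	}
	return n;
}

/* Helper for building sample data: append one record, return new head. */
size_t append_record(unsigned char *buf, size_t head, size_t payload)
{
	uint64_t size = payload + sizeof(uint64_t);

	memset(buf + head, 0xab, payload);
	memcpy(buf + head + payload, &size, sizeof(size));
	return head + size;
}
```

The sanity check on 'size' is what stops the walk once it reaches a
record that has been partly overwritten; a real reader would also have
to handle the wrap point of the ring buffer.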

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-12  5:33     ` Wangnan (F)
  2016-01-12  6:11       ` Alexei Starovoitov
@ 2016-01-12 14:05       ` Peter Zijlstra
  1 sibling, 0 replies; 124+ messages in thread
From: Peter Zijlstra @ 2016-01-12 14:05 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Alexei Starovoitov, acme, linux-kernel, pi3orama, lizefan,
	netdev, davem, Adrian Hunter, Arnaldo Carvalho de Melo,
	David Ahern, Ingo Molnar, Yunlong Song

On Tue, Jan 12, 2016 at 01:33:28PM +0800, Wangnan (F) wrote:
> >How about another idea:
> >. update data_tail pointer when head is about to overwrite it

> Even in the very first try at 2013 in [1], reading from the ring
> buffer is avoided. I don't think Peter changes his mind now.

So I don't object to this approach per se, as per:

 lkml.kernel.org/r/20151023151205.GW11639@twins.programming.kicks-ass.net

The main concern is doing it so that the regular !overwrite mode doesn't
suffer in performance.

The patch above chooses to sacrifice about half the buffer space to avoid
having to fwd parse events on every overwrite, but doing the fwd search
is certainly possible (just really expensive).

> >Another concern is validity of records stored. If user space messes
> >with ring-buffer, kernel won't be able to move data_tail properly
> >and would need to indicate that to userspace somehow.
> >But memory saving of 8 bytes per record could be sizable
> 
> Yes. But I have already discussed with Peter on this in [2].

So given the trade-off between losing half the buffer and adding 8
bytes to every event I'm not sure which is the worst.

Also, the way it's implemented means they're not in fact mutually
exclusive, both can be had.
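
As a rough illustration of the forward search mentioned above (reading
each header to move data_tail past whole records before overwriting
them), here is a sketch. It is not kernel code: the header struct is a
simplified stand-in for perf_event_header, advance_tail() is a
hypothetical name, and it assumes the buffer size is a power of two
and every record size is a multiple of 8, so a header never straddles
the wrap point.

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct perf_event_header. */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* size of the whole record, header included */
};

/*
 * Advance 'tail' past whole records until 'need' bytes are free.
 * 'head' and 'tail' are free-running byte counters.
 */
uint64_t advance_tail(const unsigned char *buf, uint64_t buf_size,
		      uint64_t head, uint64_t tail, uint64_t need)
{
	while (tail < head && buf_size - (head - tail) < need) {
		struct rec_header hdr;

		/* this read of possibly cold data is the cost discussed above */
		memcpy(&hdr, buf + (tail & (buf_size - 1)), sizeof(hdr));
		if (hdr.size < sizeof(hdr))
			break;	/* corrupted header: cannot continue safely */
		tail += hdr.size;
	}
	return tail;
}
```

Every iteration is a read from the ring buffer followed by a dependent
update, which is where the expense of this variant comes from.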

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 13/53] perf tools: Prevent calling machine__delete() on non-allocated machine
  2016-01-12  7:03     ` Wangnan (F)
@ 2016-01-12 14:07       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-12 14:07 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa,
	Masami Hiramatsu, Namhyung Kim

Em Tue, Jan 12, 2016 at 03:03:39PM +0800, Wangnan (F) escreveu:
> On 2016/1/11 23:42, Arnaldo Carvalho de Melo wrote:
> >Em Mon, Jan 11, 2016 at 01:48:04PM +0000, Wang Nan escreveu:
> >>To prevent further commits calling machine__delete() on non-allocated
> >>'struct machine' (which would cause memory corruption), this patch
> >>enforces machine__init(), record whether a machine structure is
> >>dynamically allocated or not, and warn if machine__delete() is called
> >>on incorrect object.
> >Not sure on this one, I think I voiced this before, this seems like
> >something to be tested using some static analysis tool or even checking
> >if the address for the struct hitting machine__delete() is from malloc
> >or not.
> >
> >I.e. if we do it here, we may have to do it to any other struct where we
> >allocate it in the stack or via malloc, and furthermore there are cases
> >where we embed a struct in another, when we would free just the main
> >struct but not the second, embedded one, that would need just calling
> >foo__exit() and not foo__delete().

> OK. Let's drop this one.

I'll leave a note in my TODO list to improve this situation, dropping the
rename, thanks for the resent patch with just the fix.

Try to do it like that in the future, if possible, i.e. one thing per
patch, one with the super-minimal fix, anything else in a separate
patch.
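
For anyone following along, the foo__exit() versus foo__delete()
distinction from the quoted text looks roughly like this. A generic
sketch with a hypothetical 'struct foo', not code from perf:

```c
#include <stdlib.h>

struct foo {
	int *data;
};

/* Initialize an object whose storage the caller owns (stack or embedded). */
int foo__init(struct foo *foo)
{
	foo->data = calloc(16, sizeof(*foo->data));
	return foo->data ? 0 : -1;
}

/* Release what foo__init() acquired, but not the struct itself. */
void foo__exit(struct foo *foo)
{
	free(foo->data);
	foo->data = NULL;
}

/* Allocate and initialize; pairs with foo__delete(). */
struct foo *foo__new(void)
{
	struct foo *foo = malloc(sizeof(*foo));

	if (foo && foo__init(foo)) {
		free(foo);
		foo = NULL;
	}
	return foo;
}

/* Only valid for objects that came from foo__new(). */
void foo__delete(struct foo *foo)
{
	if (!foo)
		return;
	foo__exit(foo);
	free(foo);
}

/* An embedding struct calls foo__exit() on its member, never foo__delete(). */
struct bar {
	struct foo foo;
};
```

Calling foo__delete() on a stack or embedded object would pass a
non-heap pointer to free(), which is the memory corruption the dropped
patch tried to catch at runtime.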

- Arnaldo

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 05/53] perf tools: Test correct path of perf in build-test
  2016-01-12  7:16           ` Wangnan (F)
@ 2016-01-12 14:08             ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-12 14:08 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: linux-kernel, pi3orama, lizefan, netdev, davem, Jiri Olsa, Namhyung Kim

Em Tue, Jan 12, 2016 at 03:16:08PM +0800, Wangnan (F) escreveu:
> 
> 
> On 2016/1/12 6:39, Arnaldo Carvalho de Melo wrote:
> >Em Mon, Jan 11, 2016 at 07:39:04PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>Em Mon, Jan 11, 2016 at 07:06:18PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>>Em Mon, Jan 11, 2016 at 12:24:56PM -0300, Arnaldo Carvalho de Melo escreveu:
> >>>>Em Mon, Jan 11, 2016 at 01:47:56PM +0000, Wang Nan escreveu:
> >>>>>If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
> >>>>>will fail because perf resides in a different directory. Fix this by
> >>>>>computing PERF_OUT according to 'O' and test correct output files.
> >>>>>For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
> >>>>>instead because the path is different from others ($(O)/perf vs
> >>>>>  $(O)/tools/perf).
> >>>>Ok, applying up to this patch I now manage to almost cleanly build it using O=,
> >>>>see below, but seems that we have some race, as not all tests end up producing
> >>>>such warnings.
> >>>>
> >>>>[acme@felicio linux]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; make O=/tmp/build/perf -C tools/perf build-test
> >>>>make: Entering directory `/home/acme/git/linux/tools/perf'
> >>>>Testing Makefile
> >>>>- make_no_libperl: cd . && make -f Makefile   DESTDIR=/tmp/tmp.m1nXBMqhSA NO_LIBPERL=1
> >>>>find: ‘/tmp/build/perf/util/trace-event-scripting.o’: No such file or directory
> 
> This can happen when you run find and rm in parallel on one directory.
> However, I've never seen this message in build-test before.

I'll leave this in the backburner for now, there are other, more
important patches to process, we should revisit this as soon as we
process the other eBPF patches :-\
 
> >>>Well, it is happening even without O=:
> >>So I removed a few patches and those aren't appearing anymore, please
> >>take a look at my perf/core branch, running build-test on a few machines
> >>now, will push soon.
> >>
> >>My hunch is that build-test has issues with parallel builds, but I'm not
> >>sure...
> >
> >Good:
> >
> >- make_perf_o_O: cd . && make -f Makefile O=/tmp/tmp.oLeg8aUaOo DESTDIR=/tmp/tmp.16WP4HTQJs perf.o
> >- make_util_pmu_bison_o_O: cd . && make -f Makefile O=/tmp/tmp.xNRV0pCXfD DESTDIR=/tmp/tmp.8dyU9uEbHe util/pmu-bison.o
> >- make_no_libdw_dwarf_unwind_O: cd . && make -f Makefile O=/tmp/tmp.pHH4HExHcH DESTDIR=/tmp/tmp.Wo0m8fF5cp NO_LIBDW_DWARF_UNWIND=1
> >- make_no_demangle_O: cd . && make -f Makefile O=/tmp/tmp.yWNsd4jOsI DESTDIR=/tmp/tmp.Q7eA4kCvwL NO_DEMANGLE=1
> >- tarpkg: ./tests/perf-targz-src-pkg .
> >- make -C <kernelsrc> tools/perf
> >- make -C <kernelsrc>/tools perf
> >OK
> Glad to see this.
> 
> Thank you.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 14/53] perf test: Check environment before start real BPF test
  2016-01-12  7:40     ` Wangnan (F)
@ 2016-01-12 14:10       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-12 14:10 UTC (permalink / raw)
  To: Wangnan (F); +Cc: linux-kernel, pi3orama, lizefan, netdev, davem

Em Tue, Jan 12, 2016 at 03:40:40PM +0800, Wangnan (F) escreveu:
> 
> 
> On 2016/1/12 5:55, Arnaldo Carvalho de Melo wrote:
> >Em Mon, Jan 11, 2016 at 01:48:05PM +0000, Wang Nan escreveu:
> >>Copying perf to old kernel system results:
> >>
> >>  # perf test bpf
> >>  37: Test BPF filter                                          :
> >>  37.1: Test basic BPF filtering                               : FAILED!
> >>  37.2: Test BPF prologue generation                           : Skip
> >>
> >>However, in case when kernel doesn't support a test case it should
> >>return 'Skip', 'FAILED!' should be reserved for kernel tests for when
> >>the kernel supports a feature that then fails to work as advertised.
> >>
> >>This patch checks environment before real testcase.
> >This is really strange, this other test is failing if the above patch is
> >present, found by bisecting:
> >
> >[acme@felicio linux]$ perf test decoder
> >47: Test x86 instruction decoder - new instructions          : FAILED!
> >[acme@felicio linux]$ git log --oneline -1
> >91fedd318e3d perf test: Check environment before start real BPF test
> >[acme@felicio linux]$ git reset --hard HEAD^
> >HEAD is now at f1f23526d3b6 perf test: Reset err after using it hold
> >errcode in hist testcases
> >[acme@felicio linux]$ m
> >make: Entering directory `/home/acme/git/linux/tools/perf'
> >   BUILD:   Doing 'make -j4' parallel build
> >   CC       /tmp/build/perf/arch/common.o
> >   CC       /tmp/build/perf/util/abspath.o
> >   CC       /tmp/build/perf/builtin-bench.o
> >   CC       /tmp/build/perf/util/alias.o
> >
> ><SNIP>
> >[acme@felicio linux]$ git log --oneline -1
> >f1f23526d3b6 perf test: Reset err after using it hold errcode in hist
> >testcases
> >[acme@felicio linux]$ perf test decoder
> >47: Test x86 instruction decoder - new instructions          : Ok
> >[acme@felicio linux]$
> 
> Yes, really strange, and I can't reproduce your result
> in my environment. What's the result of test -v?
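
For reference, the Skip-versus-FAILED! convention from the quoted commit message could be sketched roughly as below. This is only an illustration under assumptions: `Outcome`, `run_test`, `env_supported` and `real_test` are hypothetical names made up for the sketch, not perf's actual code, and the environment probe is reduced to a callable returning a boolean.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    """Hypothetical result labels for this sketch (not perf's definitions)."""
    OK = "Ok"
    SKIP = "Skip"
    FAILED = "FAILED!"


def run_test(env_supported: Callable[[], bool],
             real_test: Callable[[], bool]) -> Outcome:
    # Probe the environment first: a missing kernel feature is a Skip,
    # not a failure, and the real test is never started.
    if not env_supported():
        return Outcome.SKIP
    # Only a feature that is supported but misbehaves counts as FAILED!.
    return Outcome.OK if real_test() else Outcome.FAILED


if __name__ == "__main__":
    # Old kernel: feature absent, so the real test is not run.
    print(run_test(lambda: False, lambda: False).value)
    # Newer kernel where the feature exists but does not work.
    print(run_test(lambda: True, lambda: False).value)
```

The point of probing first is that the real test body never runs on a kernel that lacks the feature, so a FAILED! result always points at a feature that is advertised but broken.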

So, the verbose output of '47: Test x86 instruction decoder' without the patch,
where it works, goes below. I'll send another message with it failing, after I
re-apply that patch.


got: test__dwarf_unwind 0x47edbe, expecting test__dwarf_unwind
test child finished with 0
---- end ----
Test dwarf unwind: Ok
47: Test x86 instruction decoder - new instructions          :
--- start ---
test child forked, pid 22923
Decoded ok: 0f 31                	rdtsc  
Decoded ok: f3 0f 1b 00          	bndmk  (%eax),%bnd0
Decoded ok: f3 0f 1b 05 78 56 34 12 	bndmk  0x12345678,%bnd0
Decoded ok: f3 0f 1b 18          	bndmk  (%eax),%bnd3
Decoded ok: f3 0f 1b 04 01       	bndmk  (%ecx,%eax,1),%bnd0
Decoded ok: f3 0f 1b 04 05 78 56 34 12 	bndmk  0x12345678(,%eax,1),%bnd0
Decoded ok: f3 0f 1b 04 08       	bndmk  (%eax,%ecx,1),%bnd0
Decoded ok: f3 0f 1b 04 c8       	bndmk  (%eax,%ecx,8),%bnd0
Decoded ok: f3 0f 1b 40 12       	bndmk  0x12(%eax),%bnd0
Decoded ok: f3 0f 1b 45 12       	bndmk  0x12(%ebp),%bnd0
Decoded ok: f3 0f 1b 44 01 12    	bndmk  0x12(%ecx,%eax,1),%bnd0
Decoded ok: f3 0f 1b 44 05 12    	bndmk  0x12(%ebp,%eax,1),%bnd0
Decoded ok: f3 0f 1b 44 08 12    	bndmk  0x12(%eax,%ecx,1),%bnd0
Decoded ok: f3 0f 1b 44 c8 12    	bndmk  0x12(%eax,%ecx,8),%bnd0
Decoded ok: f3 0f 1b 80 78 56 34 12 	bndmk  0x12345678(%eax),%bnd0
Decoded ok: f3 0f 1b 85 78 56 34 12 	bndmk  0x12345678(%ebp),%bnd0
Decoded ok: f3 0f 1b 84 01 78 56 34 12 	bndmk  0x12345678(%ecx,%eax,1),%bnd0
Decoded ok: f3 0f 1b 84 05 78 56 34 12 	bndmk  0x12345678(%ebp,%eax,1),%bnd0
Decoded ok: f3 0f 1b 84 08 78 56 34 12 	bndmk  0x12345678(%eax,%ecx,1),%bnd0
Decoded ok: f3 0f 1b 84 c8 78 56 34 12 	bndmk  0x12345678(%eax,%ecx,8),%bnd0
Decoded ok: f3 0f 1a 00          	bndcl  (%eax),%bnd0
Decoded ok: f3 0f 1a 05 78 56 34 12 	bndcl  0x12345678,%bnd0
Decoded ok: f3 0f 1a 18          	bndcl  (%eax),%bnd3
Decoded ok: f3 0f 1a 04 01       	bndcl  (%ecx,%eax,1),%bnd0
Decoded ok: f3 0f 1a 04 05 78 56 34 12 	bndcl  0x12345678(,%eax,1),%bnd0
Decoded ok: f3 0f 1a 04 08       	bndcl  (%eax,%ecx,1),%bnd0
Decoded ok: f3 0f 1a 04 c8       	bndcl  (%eax,%ecx,8),%bnd0
Decoded ok: f3 0f 1a 40 12       	bndcl  0x12(%eax),%bnd0
Decoded ok: f3 0f 1a 45 12       	bndcl  0x12(%ebp),%bnd0
Decoded ok: f3 0f 1a 44 01 12    	bndcl  0x12(%ecx,%eax,1),%bnd0
Decoded ok: f3 0f 1a 44 05 12    	bndcl  0x12(%ebp,%eax,1),%bnd0
Decoded ok: f3 0f 1a 44 08 12    	bndcl  0x12(%eax,%ecx,1),%bnd0
Decoded ok: f3 0f 1a 44 c8 12    	bndcl  0x12(%eax,%ecx,8),%bnd0
Decoded ok: f3 0f 1a 80 78 56 34 12 	bndcl  0x12345678(%eax),%bnd0
Decoded ok: f3 0f 1a 85 78 56 34 12 	bndcl  0x12345678(%ebp),%bnd0
Decoded ok: f3 0f 1a 84 01 78 56 34 12 	bndcl  0x12345678(%ecx,%eax,1),%bnd0
Decoded ok: f3 0f 1a 84 05 78 56 34 12 	bndcl  0x12345678(%ebp,%eax,1),%bnd0
Decoded ok: f3 0f 1a 84 08 78 56 34 12 	bndcl  0x12345678(%eax,%ecx,1),%bnd0
Decoded ok: f3 0f 1a 84 c8 78 56 34 12 	bndcl  0x12345678(%eax,%ecx,8),%bnd0
Decoded ok: f3 0f 1a c0          	bndcl  %eax,%bnd0
Decoded ok: f2 0f 1a 00          	bndcu  (%eax),%bnd0
Decoded ok: f2 0f 1a 05 78 56 34 12 	bndcu  0x12345678,%bnd0
Decoded ok: f2 0f 1a 18          	bndcu  (%eax),%bnd3
Decoded ok: f2 0f 1a 04 01       	bndcu  (%ecx,%eax,1),%bnd0
Decoded ok: f2 0f 1a 04 05 78 56 34 12 	bndcu  0x12345678(,%eax,1),%bnd0
Decoded ok: f2 0f 1a 04 08       	bndcu  (%eax,%ecx,1),%bnd0
Decoded ok: f2 0f 1a 04 c8       	bndcu  (%eax,%ecx,8),%bnd0
Decoded ok: f2 0f 1a 40 12       	bndcu  0x12(%eax),%bnd0
Decoded ok: f2 0f 1a 45 12       	bndcu  0x12(%ebp),%bnd0
Decoded ok: f2 0f 1a 44 01 12    	bndcu  0x12(%ecx,%eax,1),%bnd0
Decoded ok: f2 0f 1a 44 05 12    	bndcu  0x12(%ebp,%eax,1),%bnd0
Decoded ok: f2 0f 1a 44 08 12    	bndcu  0x12(%eax,%ecx,1),%bnd0
Decoded ok: f2 0f 1a 44 c8 12    	bndcu  0x12(%eax,%ecx,8),%bnd0
Decoded ok: f2 0f 1a 80 78 56 34 12 	bndcu  0x12345678(%eax),%bnd0
Decoded ok: f2 0f 1a 85 78 56 34 12 	bndcu  0x12345678(%ebp),%bnd0
Decoded ok: f2 0f 1a 84 01 78 56 34 12 	bndcu  0x12345678(%ecx,%eax,1),%bnd0
Decoded ok: f2 0f 1a 84 05 78 56 34 12 	bndcu  0x12345678(%ebp,%eax,1),%bnd0
Decoded ok: f2 0f 1a 84 08 78 56 34 12 	bndcu  0x12345678(%eax,%ecx,1),%bnd0
Decoded ok: f2 0f 1a 84 c8 78 56 34 12 	bndcu  0x12345678(%eax,%ecx,8),%bnd0
Decoded ok: f2 0f 1a c0          	bndcu  %eax,%bnd0
Decoded ok: f2 0f 1b 00          	bndcn  (%eax),%bnd0
Decoded ok: f2 0f 1b 05 78 56 34 12 	bndcn  0x12345678,%bnd0
Decoded ok: f2 0f 1b 18          	bndcn  (%eax),%bnd3
Decoded ok: f2 0f 1b 04 01       	bndcn  (%ecx,%eax,1),%bnd0
Decoded ok: f2 0f 1b 04 05 78 56 34 12 	bndcn  0x12345678(,%eax,1),%bnd0
Decoded ok: f2 0f 1b 04 08       	bndcn  (%eax,%ecx,1),%bnd0
Decoded ok: f2 0f 1b 04 c8       	bndcn  (%eax,%ecx,8),%bnd0
Decoded ok: f2 0f 1b 40 12       	bndcn  0x12(%eax),%bnd0
Decoded ok: f2 0f 1b 45 12       	bndcn  0x12(%ebp),%bnd0
Decoded ok: f2 0f 1b 44 01 12    	bndcn  0x12(%ecx,%eax,1),%bnd0
Decoded ok: f2 0f 1b 44 05 12    	bndcn  0x12(%ebp,%eax,1),%bnd0
Decoded ok: f2 0f 1b 44 08 12    	bndcn  0x12(%eax,%ecx,1),%bnd0
Decoded ok: f2 0f 1b 44 c8 12    	bndcn  0x12(%eax,%ecx,8),%bnd0
Decoded ok: f2 0f 1b 80 78 56 34 12 	bndcn  0x12345678(%eax),%bnd0
Decoded ok: f2 0f 1b 85 78 56 34 12 	bndcn  0x12345678(%ebp),%bnd0
Decoded ok: f2 0f 1b 84 01 78 56 34 12 	bndcn  0x12345678(%ecx,%eax,1),%bnd0
Decoded ok: f2 0f 1b 84 05 78 56 34 12 	bndcn  0x12345678(%ebp,%eax,1),%bnd0
Decoded ok: f2 0f 1b 84 08 78 56 34 12 	bndcn  0x12345678(%eax,%ecx,1),%bnd0
Decoded ok: f2 0f 1b 84 c8 78 56 34 12 	bndcn  0x12345678(%eax,%ecx,8),%bnd0
Decoded ok: f2 0f 1b c0          	bndcn  %eax,%bnd0
Decoded ok: 66 0f 1a 00          	bndmov (%eax),%bnd0
Decoded ok: 66 0f 1a 05 78 56 34 12 	bndmov 0x12345678,%bnd0
Decoded ok: 66 0f 1a 18          	bndmov (%eax),%bnd3
Decoded ok: 66 0f 1a 04 01       	bndmov (%ecx,%eax,1),%bnd0
Decoded ok: 66 0f 1a 04 05 78 56 34 12 	bndmov 0x12345678(,%eax,1),%bnd0
Decoded ok: 66 0f 1a 04 08       	bndmov (%eax,%ecx,1),%bnd0
Decoded ok: 66 0f 1a 04 c8       	bndmov (%eax,%ecx,8),%bnd0
Decoded ok: 66 0f 1a 40 12       	bndmov 0x12(%eax),%bnd0
Decoded ok: 66 0f 1a 45 12       	bndmov 0x12(%ebp),%bnd0
Decoded ok: 66 0f 1a 44 01 12    	bndmov 0x12(%ecx,%eax,1),%bnd0
Decoded ok: 66 0f 1a 44 05 12    	bndmov 0x12(%ebp,%eax,1),%bnd0
Decoded ok: 66 0f 1a 44 08 12    	bndmov 0x12(%eax,%ecx,1),%bnd0
Decoded ok: 66 0f 1a 44 c8 12    	bndmov 0x12(%eax,%ecx,8),%bnd0
Decoded ok: 66 0f 1a 80 78 56 34 12 	bndmov 0x12345678(%eax),%bnd0
Decoded ok: 66 0f 1a 85 78 56 34 12 	bndmov 0x12345678(%ebp),%bnd0
Decoded ok: 66 0f 1a 84 01 78 56 34 12 	bndmov 0x12345678(%ecx,%eax,1),%bnd0
Decoded ok: 66 0f 1a 84 05 78 56 34 12 	bndmov 0x12345678(%ebp,%eax,1),%bnd0
Decoded ok: 66 0f 1a 84 08 78 56 34 12 	bndmov 0x12345678(%eax,%ecx,1),%bnd0
Decoded ok: 66 0f 1a 84 c8 78 56 34 12 	bndmov 0x12345678(%eax,%ecx,8),%bnd0
Decoded ok: 66 0f 1b 00          	bndmov %bnd0,(%eax)
Decoded ok: 66 0f 1b 05 78 56 34 12 	bndmov %bnd0,0x12345678
Decoded ok: 66 0f 1b 18          	bndmov %bnd3,(%eax)
Decoded ok: 66 0f 1b 04 01       	bndmov %bnd0,(%ecx,%eax,1)
Decoded ok: 66 0f 1b 04 05 78 56 34 12 	bndmov %bnd0,0x12345678(,%eax,1)
Decoded ok: 66 0f 1b 04 08       	bndmov %bnd0,(%eax,%ecx,1)
Decoded ok: 66 0f 1b 04 c8       	bndmov %bnd0,(%eax,%ecx,8)
Decoded ok: 66 0f 1b 40 12       	bndmov %bnd0,0x12(%eax)
Decoded ok: 66 0f 1b 45 12       	bndmov %bnd0,0x12(%ebp)
Decoded ok: 66 0f 1b 44 01 12    	bndmov %bnd0,0x12(%ecx,%eax,1)
Decoded ok: 66 0f 1b 44 05 12    	bndmov %bnd0,0x12(%ebp,%eax,1)
Decoded ok: 66 0f 1b 44 08 12    	bndmov %bnd0,0x12(%eax,%ecx,1)
Decoded ok: 66 0f 1b 44 c8 12    	bndmov %bnd0,0x12(%eax,%ecx,8)
Decoded ok: 66 0f 1b 80 78 56 34 12 	bndmov %bnd0,0x12345678(%eax)
Decoded ok: 66 0f 1b 85 78 56 34 12 	bndmov %bnd0,0x12345678(%ebp)
Decoded ok: 66 0f 1b 84 01 78 56 34 12 	bndmov %bnd0,0x12345678(%ecx,%eax,1)
Decoded ok: 66 0f 1b 84 05 78 56 34 12 	bndmov %bnd0,0x12345678(%ebp,%eax,1)
Decoded ok: 66 0f 1b 84 08 78 56 34 12 	bndmov %bnd0,0x12345678(%eax,%ecx,1)
Decoded ok: 66 0f 1b 84 c8 78 56 34 12 	bndmov %bnd0,0x12345678(%eax,%ecx,8)
Decoded ok: 66 0f 1a c8          	bndmov %bnd0,%bnd1
Decoded ok: 66 0f 1a c1          	bndmov %bnd1,%bnd0
Decoded ok: 0f 1a 00             	bndldx (%eax),%bnd0
Decoded ok: 0f 1a 05 78 56 34 12 	bndldx 0x12345678,%bnd0
Decoded ok: 0f 1a 18             	bndldx (%eax),%bnd3
Decoded ok: 0f 1a 04 01          	bndldx (%ecx,%eax,1),%bnd0
Decoded ok: 0f 1a 04 05 78 56 34 12 	bndldx 0x12345678(,%eax,1),%bnd0
Decoded ok: 0f 1a 04 08          	bndldx (%eax,%ecx,1),%bnd0
Decoded ok: 0f 1a 40 12          	bndldx 0x12(%eax),%bnd0
Decoded ok: 0f 1a 45 12          	bndldx 0x12(%ebp),%bnd0
Decoded ok: 0f 1a 44 01 12       	bndldx 0x12(%ecx,%eax,1),%bnd0
Decoded ok: 0f 1a 44 05 12       	bndldx 0x12(%ebp,%eax,1),%bnd0
Decoded ok: 0f 1a 44 08 12       	bndldx 0x12(%eax,%ecx,1),%bnd0
Decoded ok: 0f 1a 80 78 56 34 12 	bndldx 0x12345678(%eax),%bnd0
Decoded ok: 0f 1a 85 78 56 34 12 	bndldx 0x12345678(%ebp),%bnd0
Decoded ok: 0f 1a 84 01 78 56 34 12 	bndldx 0x12345678(%ecx,%eax,1),%bnd0
Decoded ok: 0f 1a 84 05 78 56 34 12 	bndldx 0x12345678(%ebp,%eax,1),%bnd0
Decoded ok: 0f 1a 84 08 78 56 34 12 	bndldx 0x12345678(%eax,%ecx,1),%bnd0
Decoded ok: 0f 1b 00             	bndstx %bnd0,(%eax)
Decoded ok: 0f 1b 05 78 56 34 12 	bndstx %bnd0,0x12345678
Decoded ok: 0f 1b 18             	bndstx %bnd3,(%eax)
Decoded ok: 0f 1b 04 01          	bndstx %bnd0,(%ecx,%eax,1)
Decoded ok: 0f 1b 04 05 78 56 34 12 	bndstx %bnd0,0x12345678(,%eax,1)
Decoded ok: 0f 1b 04 08          	bndstx %bnd0,(%eax,%ecx,1)
Decoded ok: 0f 1b 40 12          	bndstx %bnd0,0x12(%eax)
Decoded ok: 0f 1b 45 12          	bndstx %bnd0,0x12(%ebp)
Decoded ok: 0f 1b 44 01 12       	bndstx %bnd0,0x12(%ecx,%eax,1)
Decoded ok: 0f 1b 44 05 12       	bndstx %bnd0,0x12(%ebp,%eax,1)
Decoded ok: 0f 1b 44 08 12       	bndstx %bnd0,0x12(%eax,%ecx,1)
Decoded ok: 0f 1b 80 78 56 34 12 	bndstx %bnd0,0x12345678(%eax)
Decoded ok: 0f 1b 85 78 56 34 12 	bndstx %bnd0,0x12345678(%ebp)
Decoded ok: 0f 1b 84 01 78 56 34 12 	bndstx %bnd0,0x12345678(%ecx,%eax,1)
Decoded ok: 0f 1b 84 05 78 56 34 12 	bndstx %bnd0,0x12345678(%ebp,%eax,1)
Decoded ok: 0f 1b 84 08 78 56 34 12 	bndstx %bnd0,0x12345678(%eax,%ecx,1)
Decoded ok: f2 e8 fc ff ff ff    	bnd call 3c3 <main+0x3c3>
Decoded ok: f2 ff 10             	bnd call *(%eax)
Decoded ok: f2 c3                	bnd ret 
Decoded ok: f2 e9 fc ff ff ff    	bnd jmp 3ce <main+0x3ce>
Decoded ok: f2 e9 fc ff ff ff    	bnd jmp 3d4 <main+0x3d4>
Decoded ok: f2 ff 21             	bnd jmp *(%ecx)
Decoded ok: f2 0f 85 fc ff ff ff 	bnd jne 3de <main+0x3de>
Decoded ok: 0f 3a cc c1 00       	sha1rnds4 $0x0,%xmm1,%xmm0
Decoded ok: 0f 3a cc d7 91       	sha1rnds4 $0x91,%xmm7,%xmm2
Decoded ok: 0f 3a cc 00 91       	sha1rnds4 $0x91,(%eax),%xmm0
Decoded ok: 0f 3a cc 05 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678,%xmm0
Decoded ok: 0f 3a cc 18 91       	sha1rnds4 $0x91,(%eax),%xmm3
Decoded ok: 0f 3a cc 04 01 91    	sha1rnds4 $0x91,(%ecx,%eax,1),%xmm0
Decoded ok: 0f 3a cc 04 05 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(,%eax,1),%xmm0
Decoded ok: 0f 3a cc 04 08 91    	sha1rnds4 $0x91,(%eax,%ecx,1),%xmm0
Decoded ok: 0f 3a cc 04 c8 91    	sha1rnds4 $0x91,(%eax,%ecx,8),%xmm0
Decoded ok: 0f 3a cc 40 12 91    	sha1rnds4 $0x91,0x12(%eax),%xmm0
Decoded ok: 0f 3a cc 45 12 91    	sha1rnds4 $0x91,0x12(%ebp),%xmm0
Decoded ok: 0f 3a cc 44 01 12 91 	sha1rnds4 $0x91,0x12(%ecx,%eax,1),%xmm0
Decoded ok: 0f 3a cc 44 05 12 91 	sha1rnds4 $0x91,0x12(%ebp,%eax,1),%xmm0
Decoded ok: 0f 3a cc 44 08 12 91 	sha1rnds4 $0x91,0x12(%eax,%ecx,1),%xmm0
Decoded ok: 0f 3a cc 44 c8 12 91 	sha1rnds4 $0x91,0x12(%eax,%ecx,8),%xmm0
Decoded ok: 0f 3a cc 80 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%eax),%xmm0
Decoded ok: 0f 3a cc 85 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%ebp),%xmm0
Decoded ok: 0f 3a cc 84 01 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%ecx,%eax,1),%xmm0
Decoded ok: 0f 3a cc 84 05 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%ebp,%eax,1),%xmm0
Decoded ok: 0f 3a cc 84 08 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%eax,%ecx,1),%xmm0
Decoded ok: 0f 3a cc 84 c8 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 c8 c1          	sha1nexte %xmm1,%xmm0
Decoded ok: 0f 38 c8 d7          	sha1nexte %xmm7,%xmm2
Decoded ok: 0f 38 c8 00          	sha1nexte (%eax),%xmm0
Decoded ok: 0f 38 c8 05 78 56 34 12 	sha1nexte 0x12345678,%xmm0
Decoded ok: 0f 38 c8 18          	sha1nexte (%eax),%xmm3
Decoded ok: 0f 38 c8 04 01       	sha1nexte (%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 c8 04 05 78 56 34 12 	sha1nexte 0x12345678(,%eax,1),%xmm0
Decoded ok: 0f 38 c8 04 08       	sha1nexte (%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 c8 04 c8       	sha1nexte (%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 c8 40 12       	sha1nexte 0x12(%eax),%xmm0
Decoded ok: 0f 38 c8 45 12       	sha1nexte 0x12(%ebp),%xmm0
Decoded ok: 0f 38 c8 44 01 12    	sha1nexte 0x12(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 c8 44 05 12    	sha1nexte 0x12(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 c8 44 08 12    	sha1nexte 0x12(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 c8 44 c8 12    	sha1nexte 0x12(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 c8 80 78 56 34 12 	sha1nexte 0x12345678(%eax),%xmm0
Decoded ok: 0f 38 c8 85 78 56 34 12 	sha1nexte 0x12345678(%ebp),%xmm0
Decoded ok: 0f 38 c8 84 01 78 56 34 12 	sha1nexte 0x12345678(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 c8 84 05 78 56 34 12 	sha1nexte 0x12345678(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 c8 84 08 78 56 34 12 	sha1nexte 0x12345678(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 c8 84 c8 78 56 34 12 	sha1nexte 0x12345678(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 c9 c1          	sha1msg1 %xmm1,%xmm0
Decoded ok: 0f 38 c9 d7          	sha1msg1 %xmm7,%xmm2
Decoded ok: 0f 38 c9 00          	sha1msg1 (%eax),%xmm0
Decoded ok: 0f 38 c9 05 78 56 34 12 	sha1msg1 0x12345678,%xmm0
Decoded ok: 0f 38 c9 18          	sha1msg1 (%eax),%xmm3
Decoded ok: 0f 38 c9 04 01       	sha1msg1 (%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 c9 04 05 78 56 34 12 	sha1msg1 0x12345678(,%eax,1),%xmm0
Decoded ok: 0f 38 c9 04 08       	sha1msg1 (%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 c9 04 c8       	sha1msg1 (%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 c9 40 12       	sha1msg1 0x12(%eax),%xmm0
Decoded ok: 0f 38 c9 45 12       	sha1msg1 0x12(%ebp),%xmm0
Decoded ok: 0f 38 c9 44 01 12    	sha1msg1 0x12(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 c9 44 05 12    	sha1msg1 0x12(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 c9 44 08 12    	sha1msg1 0x12(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 c9 44 c8 12    	sha1msg1 0x12(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 c9 80 78 56 34 12 	sha1msg1 0x12345678(%eax),%xmm0
Decoded ok: 0f 38 c9 85 78 56 34 12 	sha1msg1 0x12345678(%ebp),%xmm0
Decoded ok: 0f 38 c9 84 01 78 56 34 12 	sha1msg1 0x12345678(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 c9 84 05 78 56 34 12 	sha1msg1 0x12345678(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 c9 84 08 78 56 34 12 	sha1msg1 0x12345678(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 c9 84 c8 78 56 34 12 	sha1msg1 0x12345678(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 ca c1          	sha1msg2 %xmm1,%xmm0
Decoded ok: 0f 38 ca d7          	sha1msg2 %xmm7,%xmm2
Decoded ok: 0f 38 ca 00          	sha1msg2 (%eax),%xmm0
Decoded ok: 0f 38 ca 05 78 56 34 12 	sha1msg2 0x12345678,%xmm0
Decoded ok: 0f 38 ca 18          	sha1msg2 (%eax),%xmm3
Decoded ok: 0f 38 ca 04 01       	sha1msg2 (%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 ca 04 05 78 56 34 12 	sha1msg2 0x12345678(,%eax,1),%xmm0
Decoded ok: 0f 38 ca 04 08       	sha1msg2 (%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 ca 04 c8       	sha1msg2 (%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 ca 40 12       	sha1msg2 0x12(%eax),%xmm0
Decoded ok: 0f 38 ca 45 12       	sha1msg2 0x12(%ebp),%xmm0
Decoded ok: 0f 38 ca 44 01 12    	sha1msg2 0x12(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 ca 44 05 12    	sha1msg2 0x12(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 ca 44 08 12    	sha1msg2 0x12(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 ca 44 c8 12    	sha1msg2 0x12(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 ca 80 78 56 34 12 	sha1msg2 0x12345678(%eax),%xmm0
Decoded ok: 0f 38 ca 85 78 56 34 12 	sha1msg2 0x12345678(%ebp),%xmm0
Decoded ok: 0f 38 ca 84 01 78 56 34 12 	sha1msg2 0x12345678(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 ca 84 05 78 56 34 12 	sha1msg2 0x12345678(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 ca 84 08 78 56 34 12 	sha1msg2 0x12345678(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 ca 84 c8 78 56 34 12 	sha1msg2 0x12345678(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 cb cc          	sha256rnds2 %xmm0,%xmm4,%xmm1
Decoded ok: 0f 38 cb d7          	sha256rnds2 %xmm0,%xmm7,%xmm2
Decoded ok: 0f 38 cb 08          	sha256rnds2 %xmm0,(%eax),%xmm1
Decoded ok: 0f 38 cb 0d 78 56 34 12 	sha256rnds2 %xmm0,0x12345678,%xmm1
Decoded ok: 0f 38 cb 18          	sha256rnds2 %xmm0,(%eax),%xmm3
Decoded ok: 0f 38 cb 0c 01       	sha256rnds2 %xmm0,(%ecx,%eax,1),%xmm1
Decoded ok: 0f 38 cb 0c 05 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(,%eax,1),%xmm1
Decoded ok: 0f 38 cb 0c 08       	sha256rnds2 %xmm0,(%eax,%ecx,1),%xmm1
Decoded ok: 0f 38 cb 0c c8       	sha256rnds2 %xmm0,(%eax,%ecx,8),%xmm1
Decoded ok: 0f 38 cb 48 12       	sha256rnds2 %xmm0,0x12(%eax),%xmm1
Decoded ok: 0f 38 cb 4d 12       	sha256rnds2 %xmm0,0x12(%ebp),%xmm1
Decoded ok: 0f 38 cb 4c 01 12    	sha256rnds2 %xmm0,0x12(%ecx,%eax,1),%xmm1
Decoded ok: 0f 38 cb 4c 05 12    	sha256rnds2 %xmm0,0x12(%ebp,%eax,1),%xmm1
Decoded ok: 0f 38 cb 4c 08 12    	sha256rnds2 %xmm0,0x12(%eax,%ecx,1),%xmm1
Decoded ok: 0f 38 cb 4c c8 12    	sha256rnds2 %xmm0,0x12(%eax,%ecx,8),%xmm1
Decoded ok: 0f 38 cb 88 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%eax),%xmm1
Decoded ok: 0f 38 cb 8d 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%ebp),%xmm1
Decoded ok: 0f 38 cb 8c 01 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%ecx,%eax,1),%xmm1
Decoded ok: 0f 38 cb 8c 05 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%ebp,%eax,1),%xmm1
Decoded ok: 0f 38 cb 8c 08 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%eax,%ecx,1),%xmm1
Decoded ok: 0f 38 cb 8c c8 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%eax,%ecx,8),%xmm1
Decoded ok: 0f 38 cc c1          	sha256msg1 %xmm1,%xmm0
Decoded ok: 0f 38 cc d7          	sha256msg1 %xmm7,%xmm2
Decoded ok: 0f 38 cc 00          	sha256msg1 (%eax),%xmm0
Decoded ok: 0f 38 cc 05 78 56 34 12 	sha256msg1 0x12345678,%xmm0
Decoded ok: 0f 38 cc 18          	sha256msg1 (%eax),%xmm3
Decoded ok: 0f 38 cc 04 01       	sha256msg1 (%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 cc 04 05 78 56 34 12 	sha256msg1 0x12345678(,%eax,1),%xmm0
Decoded ok: 0f 38 cc 04 08       	sha256msg1 (%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 cc 04 c8       	sha256msg1 (%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 cc 40 12       	sha256msg1 0x12(%eax),%xmm0
Decoded ok: 0f 38 cc 45 12       	sha256msg1 0x12(%ebp),%xmm0
Decoded ok: 0f 38 cc 44 01 12    	sha256msg1 0x12(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 cc 44 05 12    	sha256msg1 0x12(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 cc 44 08 12    	sha256msg1 0x12(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 cc 44 c8 12    	sha256msg1 0x12(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 cc 80 78 56 34 12 	sha256msg1 0x12345678(%eax),%xmm0
Decoded ok: 0f 38 cc 85 78 56 34 12 	sha256msg1 0x12345678(%ebp),%xmm0
Decoded ok: 0f 38 cc 84 01 78 56 34 12 	sha256msg1 0x12345678(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 cc 84 05 78 56 34 12 	sha256msg1 0x12345678(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 cc 84 08 78 56 34 12 	sha256msg1 0x12345678(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 cc 84 c8 78 56 34 12 	sha256msg1 0x12345678(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 cd c1          	sha256msg2 %xmm1,%xmm0
Decoded ok: 0f 38 cd d7          	sha256msg2 %xmm7,%xmm2
Decoded ok: 0f 38 cd 00          	sha256msg2 (%eax),%xmm0
Decoded ok: 0f 38 cd 05 78 56 34 12 	sha256msg2 0x12345678,%xmm0
Decoded ok: 0f 38 cd 18          	sha256msg2 (%eax),%xmm3
Decoded ok: 0f 38 cd 04 01       	sha256msg2 (%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 cd 04 05 78 56 34 12 	sha256msg2 0x12345678(,%eax,1),%xmm0
Decoded ok: 0f 38 cd 04 08       	sha256msg2 (%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 cd 04 c8       	sha256msg2 (%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 cd 40 12       	sha256msg2 0x12(%eax),%xmm0
Decoded ok: 0f 38 cd 45 12       	sha256msg2 0x12(%ebp),%xmm0
Decoded ok: 0f 38 cd 44 01 12    	sha256msg2 0x12(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 cd 44 05 12    	sha256msg2 0x12(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 cd 44 08 12    	sha256msg2 0x12(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 cd 44 c8 12    	sha256msg2 0x12(%eax,%ecx,8),%xmm0
Decoded ok: 0f 38 cd 80 78 56 34 12 	sha256msg2 0x12345678(%eax),%xmm0
Decoded ok: 0f 38 cd 85 78 56 34 12 	sha256msg2 0x12345678(%ebp),%xmm0
Decoded ok: 0f 38 cd 84 01 78 56 34 12 	sha256msg2 0x12345678(%ecx,%eax,1),%xmm0
Decoded ok: 0f 38 cd 84 05 78 56 34 12 	sha256msg2 0x12345678(%ebp,%eax,1),%xmm0
Decoded ok: 0f 38 cd 84 08 78 56 34 12 	sha256msg2 0x12345678(%eax,%ecx,1),%xmm0
Decoded ok: 0f 38 cd 84 c8 78 56 34 12 	sha256msg2 0x12345678(%eax,%ecx,8),%xmm0
Decoded ok: 66 0f ae 38          	clflushopt (%eax)
Decoded ok: 66 0f ae 3d 78 56 34 12 	clflushopt 0x12345678
Decoded ok: 66 0f ae bc c8 78 56 34 12 	clflushopt 0x12345678(%eax,%ecx,8)
Decoded ok: 0f ae 38             	clflush (%eax)
Decoded ok: 0f ae f8             	sfence 
Decoded ok: 66 0f ae 30          	clwb   (%eax)
Decoded ok: 66 0f ae 35 78 56 34 12 	clwb   0x12345678
Decoded ok: 66 0f ae b4 c8 78 56 34 12 	clwb   0x12345678(%eax,%ecx,8)
Decoded ok: 0f ae 30             	xsaveopt (%eax)
Decoded ok: 0f ae f0             	mfence 
Decoded ok: 0f c7 20             	xsavec (%eax)
Decoded ok: 0f c7 25 78 56 34 12 	xsavec 0x12345678
Decoded ok: 0f c7 a4 c8 78 56 34 12 	xsavec 0x12345678(%eax,%ecx,8)
Decoded ok: 0f c7 28             	xsaves (%eax)
Decoded ok: 0f c7 2d 78 56 34 12 	xsaves 0x12345678
Decoded ok: 0f c7 ac c8 78 56 34 12 	xsaves 0x12345678(%eax,%ecx,8)
Decoded ok: 0f c7 18             	xrstors (%eax)
Decoded ok: 0f c7 1d 78 56 34 12 	xrstors 0x12345678
Decoded ok: 0f c7 9c c8 78 56 34 12 	xrstors 0x12345678(%eax,%ecx,8)
Decoded ok: 66 0f ae f8          	pcommit 
Decoded ok: 0f 01 ee             	rdpkru
Decoded ok: 0f 01 ef             	wrpkru
Decoded ok: 0f 31                	rdtsc  
Decoded ok: f3 0f 1b 00          	bndmk  (%rax),%bnd0
Decoded ok: f3 41 0f 1b 00       	bndmk  (%r8),%bnd0
Decoded ok: f3 0f 1b 04 25 78 56 34 12 	bndmk  0x12345678,%bnd0
Decoded ok: f3 0f 1b 18          	bndmk  (%rax),%bnd3
Decoded ok: f3 0f 1b 04 01       	bndmk  (%rcx,%rax,1),%bnd0
Decoded ok: f3 0f 1b 04 05 78 56 34 12 	bndmk  0x12345678(,%rax,1),%bnd0
Decoded ok: f3 0f 1b 04 08       	bndmk  (%rax,%rcx,1),%bnd0
Decoded ok: f3 0f 1b 04 c8       	bndmk  (%rax,%rcx,8),%bnd0
Decoded ok: f3 0f 1b 40 12       	bndmk  0x12(%rax),%bnd0
Decoded ok: f3 0f 1b 45 12       	bndmk  0x12(%rbp),%bnd0
Decoded ok: f3 0f 1b 44 01 12    	bndmk  0x12(%rcx,%rax,1),%bnd0
Decoded ok: f3 0f 1b 44 05 12    	bndmk  0x12(%rbp,%rax,1),%bnd0
Decoded ok: f3 0f 1b 44 08 12    	bndmk  0x12(%rax,%rcx,1),%bnd0
Decoded ok: f3 0f 1b 44 c8 12    	bndmk  0x12(%rax,%rcx,8),%bnd0
Decoded ok: f3 0f 1b 80 78 56 34 12 	bndmk  0x12345678(%rax),%bnd0
Decoded ok: f3 0f 1b 85 78 56 34 12 	bndmk  0x12345678(%rbp),%bnd0
Decoded ok: f3 0f 1b 84 01 78 56 34 12 	bndmk  0x12345678(%rcx,%rax,1),%bnd0
Decoded ok: f3 0f 1b 84 05 78 56 34 12 	bndmk  0x12345678(%rbp,%rax,1),%bnd0
Decoded ok: f3 0f 1b 84 08 78 56 34 12 	bndmk  0x12345678(%rax,%rcx,1),%bnd0
Decoded ok: f3 0f 1b 84 c8 78 56 34 12 	bndmk  0x12345678(%rax,%rcx,8),%bnd0
Decoded ok: f3 0f 1a 00          	bndcl  (%rax),%bnd0
Decoded ok: f3 41 0f 1a 00       	bndcl  (%r8),%bnd0
Decoded ok: f3 0f 1a 04 25 78 56 34 12 	bndcl  0x12345678,%bnd0
Decoded ok: f3 0f 1a 18          	bndcl  (%rax),%bnd3
Decoded ok: f3 0f 1a 04 01       	bndcl  (%rcx,%rax,1),%bnd0
Decoded ok: f3 0f 1a 04 05 78 56 34 12 	bndcl  0x12345678(,%rax,1),%bnd0
Decoded ok: f3 0f 1a 04 08       	bndcl  (%rax,%rcx,1),%bnd0
Decoded ok: f3 0f 1a 04 c8       	bndcl  (%rax,%rcx,8),%bnd0
Decoded ok: f3 0f 1a 40 12       	bndcl  0x12(%rax),%bnd0
Decoded ok: f3 0f 1a 45 12       	bndcl  0x12(%rbp),%bnd0
Decoded ok: f3 0f 1a 44 01 12    	bndcl  0x12(%rcx,%rax,1),%bnd0
Decoded ok: f3 0f 1a 44 05 12    	bndcl  0x12(%rbp,%rax,1),%bnd0
Decoded ok: f3 0f 1a 44 08 12    	bndcl  0x12(%rax,%rcx,1),%bnd0
Decoded ok: f3 0f 1a 44 c8 12    	bndcl  0x12(%rax,%rcx,8),%bnd0
Decoded ok: f3 0f 1a 80 78 56 34 12 	bndcl  0x12345678(%rax),%bnd0
Decoded ok: f3 0f 1a 85 78 56 34 12 	bndcl  0x12345678(%rbp),%bnd0
Decoded ok: f3 0f 1a 84 01 78 56 34 12 	bndcl  0x12345678(%rcx,%rax,1),%bnd0
Decoded ok: f3 0f 1a 84 05 78 56 34 12 	bndcl  0x12345678(%rbp,%rax,1),%bnd0
Decoded ok: f3 0f 1a 84 08 78 56 34 12 	bndcl  0x12345678(%rax,%rcx,1),%bnd0
Decoded ok: f3 0f 1a 84 c8 78 56 34 12 	bndcl  0x12345678(%rax,%rcx,8),%bnd0
Decoded ok: f3 0f 1a c0          	bndcl  %rax,%bnd0
Decoded ok: f2 0f 1a 00          	bndcu  (%rax),%bnd0
Decoded ok: f2 41 0f 1a 00       	bndcu  (%r8),%bnd0
Decoded ok: f2 0f 1a 04 25 78 56 34 12 	bndcu  0x12345678,%bnd0
Decoded ok: f2 0f 1a 18          	bndcu  (%rax),%bnd3
Decoded ok: f2 0f 1a 04 01       	bndcu  (%rcx,%rax,1),%bnd0
Decoded ok: f2 0f 1a 04 05 78 56 34 12 	bndcu  0x12345678(,%rax,1),%bnd0
Decoded ok: f2 0f 1a 04 08       	bndcu  (%rax,%rcx,1),%bnd0
Decoded ok: f2 0f 1a 04 c8       	bndcu  (%rax,%rcx,8),%bnd0
Decoded ok: f2 0f 1a 40 12       	bndcu  0x12(%rax),%bnd0
Decoded ok: f2 0f 1a 45 12       	bndcu  0x12(%rbp),%bnd0
Decoded ok: f2 0f 1a 44 01 12    	bndcu  0x12(%rcx,%rax,1),%bnd0
Decoded ok: f2 0f 1a 44 05 12    	bndcu  0x12(%rbp,%rax,1),%bnd0
Decoded ok: f2 0f 1a 44 08 12    	bndcu  0x12(%rax,%rcx,1),%bnd0
Decoded ok: f2 0f 1a 44 c8 12    	bndcu  0x12(%rax,%rcx,8),%bnd0
Decoded ok: f2 0f 1a 80 78 56 34 12 	bndcu  0x12345678(%rax),%bnd0
Decoded ok: f2 0f 1a 85 78 56 34 12 	bndcu  0x12345678(%rbp),%bnd0
Decoded ok: f2 0f 1a 84 01 78 56 34 12 	bndcu  0x12345678(%rcx,%rax,1),%bnd0
Decoded ok: f2 0f 1a 84 05 78 56 34 12 	bndcu  0x12345678(%rbp,%rax,1),%bnd0
Decoded ok: f2 0f 1a 84 08 78 56 34 12 	bndcu  0x12345678(%rax,%rcx,1),%bnd0
Decoded ok: f2 0f 1a 84 c8 78 56 34 12 	bndcu  0x12345678(%rax,%rcx,8),%bnd0
Decoded ok: f2 0f 1a c0          	bndcu  %rax,%bnd0
Decoded ok: f2 0f 1b 00          	bndcn  (%rax),%bnd0
Decoded ok: f2 41 0f 1b 00       	bndcn  (%r8),%bnd0
Decoded ok: f2 0f 1b 04 25 78 56 34 12 	bndcn  0x12345678,%bnd0
Decoded ok: f2 0f 1b 18          	bndcn  (%rax),%bnd3
Decoded ok: f2 0f 1b 04 01       	bndcn  (%rcx,%rax,1),%bnd0
Decoded ok: f2 0f 1b 04 05 78 56 34 12 	bndcn  0x12345678(,%rax,1),%bnd0
Decoded ok: f2 0f 1b 04 08       	bndcn  (%rax,%rcx,1),%bnd0
Decoded ok: f2 0f 1b 04 c8       	bndcn  (%rax,%rcx,8),%bnd0
Decoded ok: f2 0f 1b 40 12       	bndcn  0x12(%rax),%bnd0
Decoded ok: f2 0f 1b 45 12       	bndcn  0x12(%rbp),%bnd0
Decoded ok: f2 0f 1b 44 01 12    	bndcn  0x12(%rcx,%rax,1),%bnd0
Decoded ok: f2 0f 1b 44 05 12    	bndcn  0x12(%rbp,%rax,1),%bnd0
Decoded ok: f2 0f 1b 44 08 12    	bndcn  0x12(%rax,%rcx,1),%bnd0
Decoded ok: f2 0f 1b 44 c8 12    	bndcn  0x12(%rax,%rcx,8),%bnd0
Decoded ok: f2 0f 1b 80 78 56 34 12 	bndcn  0x12345678(%rax),%bnd0
Decoded ok: f2 0f 1b 85 78 56 34 12 	bndcn  0x12345678(%rbp),%bnd0
Decoded ok: f2 0f 1b 84 01 78 56 34 12 	bndcn  0x12345678(%rcx,%rax,1),%bnd0
Decoded ok: f2 0f 1b 84 05 78 56 34 12 	bndcn  0x12345678(%rbp,%rax,1),%bnd0
Decoded ok: f2 0f 1b 84 08 78 56 34 12 	bndcn  0x12345678(%rax,%rcx,1),%bnd0
Decoded ok: f2 0f 1b 84 c8 78 56 34 12 	bndcn  0x12345678(%rax,%rcx,8),%bnd0
Decoded ok: f2 0f 1b c0          	bndcn  %rax,%bnd0
Decoded ok: 66 0f 1a 00          	bndmov (%rax),%bnd0
Decoded ok: 66 41 0f 1a 00       	bndmov (%r8),%bnd0
Decoded ok: 66 0f 1a 04 25 78 56 34 12 	bndmov 0x12345678,%bnd0
Decoded ok: 66 0f 1a 18          	bndmov (%rax),%bnd3
Decoded ok: 66 0f 1a 04 01       	bndmov (%rcx,%rax,1),%bnd0
Decoded ok: 66 0f 1a 04 05 78 56 34 12 	bndmov 0x12345678(,%rax,1),%bnd0
Decoded ok: 66 0f 1a 04 08       	bndmov (%rax,%rcx,1),%bnd0
Decoded ok: 66 0f 1a 04 c8       	bndmov (%rax,%rcx,8),%bnd0
Decoded ok: 66 0f 1a 40 12       	bndmov 0x12(%rax),%bnd0
Decoded ok: 66 0f 1a 45 12       	bndmov 0x12(%rbp),%bnd0
Decoded ok: 66 0f 1a 44 01 12    	bndmov 0x12(%rcx,%rax,1),%bnd0
Decoded ok: 66 0f 1a 44 05 12    	bndmov 0x12(%rbp,%rax,1),%bnd0
Decoded ok: 66 0f 1a 44 08 12    	bndmov 0x12(%rax,%rcx,1),%bnd0
Decoded ok: 66 0f 1a 44 c8 12    	bndmov 0x12(%rax,%rcx,8),%bnd0
Decoded ok: 66 0f 1a 80 78 56 34 12 	bndmov 0x12345678(%rax),%bnd0
Decoded ok: 66 0f 1a 85 78 56 34 12 	bndmov 0x12345678(%rbp),%bnd0
Decoded ok: 66 0f 1a 84 01 78 56 34 12 	bndmov 0x12345678(%rcx,%rax,1),%bnd0
Decoded ok: 66 0f 1a 84 05 78 56 34 12 	bndmov 0x12345678(%rbp,%rax,1),%bnd0
Decoded ok: 66 0f 1a 84 08 78 56 34 12 	bndmov 0x12345678(%rax,%rcx,1),%bnd0
Decoded ok: 66 0f 1a 84 c8 78 56 34 12 	bndmov 0x12345678(%rax,%rcx,8),%bnd0
Decoded ok: 66 0f 1b 00          	bndmov %bnd0,(%rax)
Decoded ok: 66 41 0f 1b 00       	bndmov %bnd0,(%r8)
Decoded ok: 66 0f 1b 04 25 78 56 34 12 	bndmov %bnd0,0x12345678
Decoded ok: 66 0f 1b 18          	bndmov %bnd3,(%rax)
Decoded ok: 66 0f 1b 04 01       	bndmov %bnd0,(%rcx,%rax,1)
Decoded ok: 66 0f 1b 04 05 78 56 34 12 	bndmov %bnd0,0x12345678(,%rax,1)
Decoded ok: 66 0f 1b 04 08       	bndmov %bnd0,(%rax,%rcx,1)
Decoded ok: 66 0f 1b 04 c8       	bndmov %bnd0,(%rax,%rcx,8)
Decoded ok: 66 0f 1b 40 12       	bndmov %bnd0,0x12(%rax)
Decoded ok: 66 0f 1b 45 12       	bndmov %bnd0,0x12(%rbp)
Decoded ok: 66 0f 1b 44 01 12    	bndmov %bnd0,0x12(%rcx,%rax,1)
Decoded ok: 66 0f 1b 44 05 12    	bndmov %bnd0,0x12(%rbp,%rax,1)
Decoded ok: 66 0f 1b 44 08 12    	bndmov %bnd0,0x12(%rax,%rcx,1)
Decoded ok: 66 0f 1b 44 c8 12    	bndmov %bnd0,0x12(%rax,%rcx,8)
Decoded ok: 66 0f 1b 80 78 56 34 12 	bndmov %bnd0,0x12345678(%rax)
Decoded ok: 66 0f 1b 85 78 56 34 12 	bndmov %bnd0,0x12345678(%rbp)
Decoded ok: 66 0f 1b 84 01 78 56 34 12 	bndmov %bnd0,0x12345678(%rcx,%rax,1)
Decoded ok: 66 0f 1b 84 05 78 56 34 12 	bndmov %bnd0,0x12345678(%rbp,%rax,1)
Decoded ok: 66 0f 1b 84 08 78 56 34 12 	bndmov %bnd0,0x12345678(%rax,%rcx,1)
Decoded ok: 66 0f 1b 84 c8 78 56 34 12 	bndmov %bnd0,0x12345678(%rax,%rcx,8)
Decoded ok: 66 0f 1a c8          	bndmov %bnd0,%bnd1
Decoded ok: 66 0f 1a c1          	bndmov %bnd1,%bnd0
Decoded ok: 0f 1a 00             	bndldx (%rax),%bnd0
Decoded ok: 41 0f 1a 00          	bndldx (%r8),%bnd0
Decoded ok: 0f 1a 04 25 78 56 34 12 	bndldx 0x12345678,%bnd0
Decoded ok: 0f 1a 18             	bndldx (%rax),%bnd3
Decoded ok: 0f 1a 04 01          	bndldx (%rcx,%rax,1),%bnd0
Decoded ok: 0f 1a 04 05 78 56 34 12 	bndldx 0x12345678(,%rax,1),%bnd0
Decoded ok: 0f 1a 04 08          	bndldx (%rax,%rcx,1),%bnd0
Decoded ok: 0f 1a 40 12          	bndldx 0x12(%rax),%bnd0
Decoded ok: 0f 1a 45 12          	bndldx 0x12(%rbp),%bnd0
Decoded ok: 0f 1a 44 01 12       	bndldx 0x12(%rcx,%rax,1),%bnd0
Decoded ok: 0f 1a 44 05 12       	bndldx 0x12(%rbp,%rax,1),%bnd0
Decoded ok: 0f 1a 44 08 12       	bndldx 0x12(%rax,%rcx,1),%bnd0
Decoded ok: 0f 1a 80 78 56 34 12 	bndldx 0x12345678(%rax),%bnd0
Decoded ok: 0f 1a 85 78 56 34 12 	bndldx 0x12345678(%rbp),%bnd0
Decoded ok: 0f 1a 84 01 78 56 34 12 	bndldx 0x12345678(%rcx,%rax,1),%bnd0
Decoded ok: 0f 1a 84 05 78 56 34 12 	bndldx 0x12345678(%rbp,%rax,1),%bnd0
Decoded ok: 0f 1a 84 08 78 56 34 12 	bndldx 0x12345678(%rax,%rcx,1),%bnd0
Decoded ok: 0f 1b 00             	bndstx %bnd0,(%rax)
Decoded ok: 41 0f 1b 00          	bndstx %bnd0,(%r8)
Decoded ok: 0f 1b 04 25 78 56 34 12 	bndstx %bnd0,0x12345678
Decoded ok: 0f 1b 18             	bndstx %bnd3,(%rax)
Decoded ok: 0f 1b 04 01          	bndstx %bnd0,(%rcx,%rax,1)
Decoded ok: 0f 1b 04 05 78 56 34 12 	bndstx %bnd0,0x12345678(,%rax,1)
Decoded ok: 0f 1b 04 08          	bndstx %bnd0,(%rax,%rcx,1)
Decoded ok: 0f 1b 40 12          	bndstx %bnd0,0x12(%rax)
Decoded ok: 0f 1b 45 12          	bndstx %bnd0,0x12(%rbp)
Decoded ok: 0f 1b 44 01 12       	bndstx %bnd0,0x12(%rcx,%rax,1)
Decoded ok: 0f 1b 44 05 12       	bndstx %bnd0,0x12(%rbp,%rax,1)
Decoded ok: 0f 1b 44 08 12       	bndstx %bnd0,0x12(%rax,%rcx,1)
Decoded ok: 0f 1b 80 78 56 34 12 	bndstx %bnd0,0x12345678(%rax)
Decoded ok: 0f 1b 85 78 56 34 12 	bndstx %bnd0,0x12345678(%rbp)
Decoded ok: 0f 1b 84 01 78 56 34 12 	bndstx %bnd0,0x12345678(%rcx,%rax,1)
Decoded ok: 0f 1b 84 05 78 56 34 12 	bndstx %bnd0,0x12345678(%rbp,%rax,1)
Decoded ok: 0f 1b 84 08 78 56 34 12 	bndstx %bnd0,0x12345678(%rax,%rcx,1)
Decoded ok: f2 e8 00 00 00 00    	bnd callq 3f6 <main+0x3f6>
Decoded ok: 67 f2 ff 10          	bnd callq *(%eax)
Decoded ok: f2 c3                	bnd retq 
Decoded ok: f2 e9 00 00 00 00    	bnd jmpq 402 <main+0x402>
Decoded ok: f2 e9 00 00 00 00    	bnd jmpq 408 <main+0x408>
Decoded ok: 67 f2 ff 21          	bnd jmpq *(%ecx)
Decoded ok: f2 0f 85 00 00 00 00 	bnd jne 413 <main+0x413>
Decoded ok: 0f 3a cc c1 00       	sha1rnds4 $0x0,%xmm1,%xmm0
Decoded ok: 0f 3a cc d7 91       	sha1rnds4 $0x91,%xmm7,%xmm2
Decoded ok: 41 0f 3a cc c0 91    	sha1rnds4 $0x91,%xmm8,%xmm0
Decoded ok: 44 0f 3a cc c7 91    	sha1rnds4 $0x91,%xmm7,%xmm8
Decoded ok: 45 0f 3a cc c7 91    	sha1rnds4 $0x91,%xmm15,%xmm8
Decoded ok: 0f 3a cc 00 91       	sha1rnds4 $0x91,(%rax),%xmm0
Decoded ok: 41 0f 3a cc 00 91    	sha1rnds4 $0x91,(%r8),%xmm0
Decoded ok: 0f 3a cc 04 25 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678,%xmm0
Decoded ok: 0f 3a cc 18 91       	sha1rnds4 $0x91,(%rax),%xmm3
Decoded ok: 0f 3a cc 04 01 91    	sha1rnds4 $0x91,(%rcx,%rax,1),%xmm0
Decoded ok: 0f 3a cc 04 05 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(,%rax,1),%xmm0
Decoded ok: 0f 3a cc 04 08 91    	sha1rnds4 $0x91,(%rax,%rcx,1),%xmm0
Decoded ok: 0f 3a cc 04 c8 91    	sha1rnds4 $0x91,(%rax,%rcx,8),%xmm0
Decoded ok: 0f 3a cc 40 12 91    	sha1rnds4 $0x91,0x12(%rax),%xmm0
Decoded ok: 0f 3a cc 45 12 91    	sha1rnds4 $0x91,0x12(%rbp),%xmm0
Decoded ok: 0f 3a cc 44 01 12 91 	sha1rnds4 $0x91,0x12(%rcx,%rax,1),%xmm0
Decoded ok: 0f 3a cc 44 05 12 91 	sha1rnds4 $0x91,0x12(%rbp,%rax,1),%xmm0
Decoded ok: 0f 3a cc 44 08 12 91 	sha1rnds4 $0x91,0x12(%rax,%rcx,1),%xmm0
Decoded ok: 0f 3a cc 44 c8 12 91 	sha1rnds4 $0x91,0x12(%rax,%rcx,8),%xmm0
Decoded ok: 0f 3a cc 80 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%rax),%xmm0
Decoded ok: 0f 3a cc 85 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%rbp),%xmm0
Decoded ok: 0f 3a cc 84 01 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%rcx,%rax,1),%xmm0
Decoded ok: 0f 3a cc 84 05 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%rbp,%rax,1),%xmm0
Decoded ok: 0f 3a cc 84 08 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%rax,%rcx,1),%xmm0
Decoded ok: 0f 3a cc 84 c8 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm0
Decoded ok: 44 0f 3a cc bc c8 78 56 34 12 91 	sha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm15
Decoded ok: 0f 38 c8 c1          	sha1nexte %xmm1,%xmm0
Decoded ok: 0f 38 c8 d7          	sha1nexte %xmm7,%xmm2
Decoded ok: 41 0f 38 c8 c0       	sha1nexte %xmm8,%xmm0
Decoded ok: 44 0f 38 c8 c7       	sha1nexte %xmm7,%xmm8
Decoded ok: 45 0f 38 c8 c7       	sha1nexte %xmm15,%xmm8
Decoded ok: 0f 38 c8 00          	sha1nexte (%rax),%xmm0
Decoded ok: 41 0f 38 c8 00       	sha1nexte (%r8),%xmm0
Decoded ok: 0f 38 c8 04 25 78 56 34 12 	sha1nexte 0x12345678,%xmm0
Decoded ok: 0f 38 c8 18          	sha1nexte (%rax),%xmm3
Decoded ok: 0f 38 c8 04 01       	sha1nexte (%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 c8 04 05 78 56 34 12 	sha1nexte 0x12345678(,%rax,1),%xmm0
Decoded ok: 0f 38 c8 04 08       	sha1nexte (%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 c8 04 c8       	sha1nexte (%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 c8 40 12       	sha1nexte 0x12(%rax),%xmm0
Decoded ok: 0f 38 c8 45 12       	sha1nexte 0x12(%rbp),%xmm0
Decoded ok: 0f 38 c8 44 01 12    	sha1nexte 0x12(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 c8 44 05 12    	sha1nexte 0x12(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 c8 44 08 12    	sha1nexte 0x12(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 c8 44 c8 12    	sha1nexte 0x12(%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 c8 80 78 56 34 12 	sha1nexte 0x12345678(%rax),%xmm0
Decoded ok: 0f 38 c8 85 78 56 34 12 	sha1nexte 0x12345678(%rbp),%xmm0
Decoded ok: 0f 38 c8 84 01 78 56 34 12 	sha1nexte 0x12345678(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 c8 84 05 78 56 34 12 	sha1nexte 0x12345678(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 c8 84 08 78 56 34 12 	sha1nexte 0x12345678(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 c8 84 c8 78 56 34 12 	sha1nexte 0x12345678(%rax,%rcx,8),%xmm0
Decoded ok: 44 0f 38 c8 bc c8 78 56 34 12 	sha1nexte 0x12345678(%rax,%rcx,8),%xmm15
Decoded ok: 0f 38 c9 c1          	sha1msg1 %xmm1,%xmm0
Decoded ok: 0f 38 c9 d7          	sha1msg1 %xmm7,%xmm2
Decoded ok: 41 0f 38 c9 c0       	sha1msg1 %xmm8,%xmm0
Decoded ok: 44 0f 38 c9 c7       	sha1msg1 %xmm7,%xmm8
Decoded ok: 45 0f 38 c9 c7       	sha1msg1 %xmm15,%xmm8
Decoded ok: 0f 38 c9 00          	sha1msg1 (%rax),%xmm0
Decoded ok: 41 0f 38 c9 00       	sha1msg1 (%r8),%xmm0
Decoded ok: 0f 38 c9 04 25 78 56 34 12 	sha1msg1 0x12345678,%xmm0
Decoded ok: 0f 38 c9 18          	sha1msg1 (%rax),%xmm3
Decoded ok: 0f 38 c9 04 01       	sha1msg1 (%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 c9 04 05 78 56 34 12 	sha1msg1 0x12345678(,%rax,1),%xmm0
Decoded ok: 0f 38 c9 04 08       	sha1msg1 (%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 c9 04 c8       	sha1msg1 (%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 c9 40 12       	sha1msg1 0x12(%rax),%xmm0
Decoded ok: 0f 38 c9 45 12       	sha1msg1 0x12(%rbp),%xmm0
Decoded ok: 0f 38 c9 44 01 12    	sha1msg1 0x12(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 c9 44 05 12    	sha1msg1 0x12(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 c9 44 08 12    	sha1msg1 0x12(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 c9 44 c8 12    	sha1msg1 0x12(%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 c9 80 78 56 34 12 	sha1msg1 0x12345678(%rax),%xmm0
Decoded ok: 0f 38 c9 85 78 56 34 12 	sha1msg1 0x12345678(%rbp),%xmm0
Decoded ok: 0f 38 c9 84 01 78 56 34 12 	sha1msg1 0x12345678(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 c9 84 05 78 56 34 12 	sha1msg1 0x12345678(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 c9 84 08 78 56 34 12 	sha1msg1 0x12345678(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 c9 84 c8 78 56 34 12 	sha1msg1 0x12345678(%rax,%rcx,8),%xmm0
Decoded ok: 44 0f 38 c9 bc c8 78 56 34 12 	sha1msg1 0x12345678(%rax,%rcx,8),%xmm15
Decoded ok: 0f 38 ca c1          	sha1msg2 %xmm1,%xmm0
Decoded ok: 0f 38 ca d7          	sha1msg2 %xmm7,%xmm2
Decoded ok: 41 0f 38 ca c0       	sha1msg2 %xmm8,%xmm0
Decoded ok: 44 0f 38 ca c7       	sha1msg2 %xmm7,%xmm8
Decoded ok: 45 0f 38 ca c7       	sha1msg2 %xmm15,%xmm8
Decoded ok: 0f 38 ca 00          	sha1msg2 (%rax),%xmm0
Decoded ok: 41 0f 38 ca 00       	sha1msg2 (%r8),%xmm0
Decoded ok: 0f 38 ca 04 25 78 56 34 12 	sha1msg2 0x12345678,%xmm0
Decoded ok: 0f 38 ca 18          	sha1msg2 (%rax),%xmm3
Decoded ok: 0f 38 ca 04 01       	sha1msg2 (%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 ca 04 05 78 56 34 12 	sha1msg2 0x12345678(,%rax,1),%xmm0
Decoded ok: 0f 38 ca 04 08       	sha1msg2 (%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 ca 04 c8       	sha1msg2 (%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 ca 40 12       	sha1msg2 0x12(%rax),%xmm0
Decoded ok: 0f 38 ca 45 12       	sha1msg2 0x12(%rbp),%xmm0
Decoded ok: 0f 38 ca 44 01 12    	sha1msg2 0x12(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 ca 44 05 12    	sha1msg2 0x12(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 ca 44 08 12    	sha1msg2 0x12(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 ca 44 c8 12    	sha1msg2 0x12(%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 ca 80 78 56 34 12 	sha1msg2 0x12345678(%rax),%xmm0
Decoded ok: 0f 38 ca 85 78 56 34 12 	sha1msg2 0x12345678(%rbp),%xmm0
Decoded ok: 0f 38 ca 84 01 78 56 34 12 	sha1msg2 0x12345678(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 ca 84 05 78 56 34 12 	sha1msg2 0x12345678(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 ca 84 08 78 56 34 12 	sha1msg2 0x12345678(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 ca 84 c8 78 56 34 12 	sha1msg2 0x12345678(%rax,%rcx,8),%xmm0
Decoded ok: 44 0f 38 ca bc c8 78 56 34 12 	sha1msg2 0x12345678(%rax,%rcx,8),%xmm15
Decoded ok: 0f 38 cb cc          	sha256rnds2 %xmm0,%xmm4,%xmm1
Decoded ok: 0f 38 cb d7          	sha256rnds2 %xmm0,%xmm7,%xmm2
Decoded ok: 41 0f 38 cb c8       	sha256rnds2 %xmm0,%xmm8,%xmm1
Decoded ok: 44 0f 38 cb c7       	sha256rnds2 %xmm0,%xmm7,%xmm8
Decoded ok: 45 0f 38 cb c7       	sha256rnds2 %xmm0,%xmm15,%xmm8
Decoded ok: 0f 38 cb 08          	sha256rnds2 %xmm0,(%rax),%xmm1
Decoded ok: 41 0f 38 cb 08       	sha256rnds2 %xmm0,(%r8),%xmm1
Decoded ok: 0f 38 cb 0c 25 78 56 34 12 	sha256rnds2 %xmm0,0x12345678,%xmm1
Decoded ok: 0f 38 cb 18          	sha256rnds2 %xmm0,(%rax),%xmm3
Decoded ok: 0f 38 cb 0c 01       	sha256rnds2 %xmm0,(%rcx,%rax,1),%xmm1
Decoded ok: 0f 38 cb 0c 05 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(,%rax,1),%xmm1
Decoded ok: 0f 38 cb 0c 08       	sha256rnds2 %xmm0,(%rax,%rcx,1),%xmm1
Decoded ok: 0f 38 cb 0c c8       	sha256rnds2 %xmm0,(%rax,%rcx,8),%xmm1
Decoded ok: 0f 38 cb 48 12       	sha256rnds2 %xmm0,0x12(%rax),%xmm1
Decoded ok: 0f 38 cb 4d 12       	sha256rnds2 %xmm0,0x12(%rbp),%xmm1
Decoded ok: 0f 38 cb 4c 01 12    	sha256rnds2 %xmm0,0x12(%rcx,%rax,1),%xmm1
Decoded ok: 0f 38 cb 4c 05 12    	sha256rnds2 %xmm0,0x12(%rbp,%rax,1),%xmm1
Decoded ok: 0f 38 cb 4c 08 12    	sha256rnds2 %xmm0,0x12(%rax,%rcx,1),%xmm1
Decoded ok: 0f 38 cb 4c c8 12    	sha256rnds2 %xmm0,0x12(%rax,%rcx,8),%xmm1
Decoded ok: 0f 38 cb 88 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%rax),%xmm1
Decoded ok: 0f 38 cb 8d 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%rbp),%xmm1
Decoded ok: 0f 38 cb 8c 01 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%rcx,%rax,1),%xmm1
Decoded ok: 0f 38 cb 8c 05 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%rbp,%rax,1),%xmm1
Decoded ok: 0f 38 cb 8c 08 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%rax,%rcx,1),%xmm1
Decoded ok: 0f 38 cb 8c c8 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm1
Decoded ok: 44 0f 38 cb bc c8 78 56 34 12 	sha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm15
Decoded ok: 0f 38 cc c1          	sha256msg1 %xmm1,%xmm0
Decoded ok: 0f 38 cc d7          	sha256msg1 %xmm7,%xmm2
Decoded ok: 41 0f 38 cc c0       	sha256msg1 %xmm8,%xmm0
Decoded ok: 44 0f 38 cc c7       	sha256msg1 %xmm7,%xmm8
Decoded ok: 45 0f 38 cc c7       	sha256msg1 %xmm15,%xmm8
Decoded ok: 0f 38 cc 00          	sha256msg1 (%rax),%xmm0
Decoded ok: 41 0f 38 cc 00       	sha256msg1 (%r8),%xmm0
Decoded ok: 0f 38 cc 04 25 78 56 34 12 	sha256msg1 0x12345678,%xmm0
Decoded ok: 0f 38 cc 18          	sha256msg1 (%rax),%xmm3
Decoded ok: 0f 38 cc 04 01       	sha256msg1 (%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 cc 04 05 78 56 34 12 	sha256msg1 0x12345678(,%rax,1),%xmm0
Decoded ok: 0f 38 cc 04 08       	sha256msg1 (%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 cc 04 c8       	sha256msg1 (%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 cc 40 12       	sha256msg1 0x12(%rax),%xmm0
Decoded ok: 0f 38 cc 45 12       	sha256msg1 0x12(%rbp),%xmm0
Decoded ok: 0f 38 cc 44 01 12    	sha256msg1 0x12(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 cc 44 05 12    	sha256msg1 0x12(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 cc 44 08 12    	sha256msg1 0x12(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 cc 44 c8 12    	sha256msg1 0x12(%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 cc 80 78 56 34 12 	sha256msg1 0x12345678(%rax),%xmm0
Decoded ok: 0f 38 cc 85 78 56 34 12 	sha256msg1 0x12345678(%rbp),%xmm0
Decoded ok: 0f 38 cc 84 01 78 56 34 12 	sha256msg1 0x12345678(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 cc 84 05 78 56 34 12 	sha256msg1 0x12345678(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 cc 84 08 78 56 34 12 	sha256msg1 0x12345678(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 cc 84 c8 78 56 34 12 	sha256msg1 0x12345678(%rax,%rcx,8),%xmm0
Decoded ok: 44 0f 38 cc bc c8 78 56 34 12 	sha256msg1 0x12345678(%rax,%rcx,8),%xmm15
Decoded ok: 0f 38 cd c1          	sha256msg2 %xmm1,%xmm0
Decoded ok: 0f 38 cd d7          	sha256msg2 %xmm7,%xmm2
Decoded ok: 41 0f 38 cd c0       	sha256msg2 %xmm8,%xmm0
Decoded ok: 44 0f 38 cd c7       	sha256msg2 %xmm7,%xmm8
Decoded ok: 45 0f 38 cd c7       	sha256msg2 %xmm15,%xmm8
Decoded ok: 0f 38 cd 00          	sha256msg2 (%rax),%xmm0
Decoded ok: 41 0f 38 cd 00       	sha256msg2 (%r8),%xmm0
Decoded ok: 0f 38 cd 04 25 78 56 34 12 	sha256msg2 0x12345678,%xmm0
Decoded ok: 0f 38 cd 18          	sha256msg2 (%rax),%xmm3
Decoded ok: 0f 38 cd 04 01       	sha256msg2 (%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 cd 04 05 78 56 34 12 	sha256msg2 0x12345678(,%rax,1),%xmm0
Decoded ok: 0f 38 cd 04 08       	sha256msg2 (%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 cd 04 c8       	sha256msg2 (%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 cd 40 12       	sha256msg2 0x12(%rax),%xmm0
Decoded ok: 0f 38 cd 45 12       	sha256msg2 0x12(%rbp),%xmm0
Decoded ok: 0f 38 cd 44 01 12    	sha256msg2 0x12(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 cd 44 05 12    	sha256msg2 0x12(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 cd 44 08 12    	sha256msg2 0x12(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 cd 44 c8 12    	sha256msg2 0x12(%rax,%rcx,8),%xmm0
Decoded ok: 0f 38 cd 80 78 56 34 12 	sha256msg2 0x12345678(%rax),%xmm0
Decoded ok: 0f 38 cd 85 78 56 34 12 	sha256msg2 0x12345678(%rbp),%xmm0
Decoded ok: 0f 38 cd 84 01 78 56 34 12 	sha256msg2 0x12345678(%rcx,%rax,1),%xmm0
Decoded ok: 0f 38 cd 84 05 78 56 34 12 	sha256msg2 0x12345678(%rbp,%rax,1),%xmm0
Decoded ok: 0f 38 cd 84 08 78 56 34 12 	sha256msg2 0x12345678(%rax,%rcx,1),%xmm0
Decoded ok: 0f 38 cd 84 c8 78 56 34 12 	sha256msg2 0x12345678(%rax,%rcx,8),%xmm0
Decoded ok: 44 0f 38 cd bc c8 78 56 34 12 	sha256msg2 0x12345678(%rax,%rcx,8),%xmm15
Decoded ok: 66 0f ae 38          	clflushopt (%rax)
Decoded ok: 66 41 0f ae 38       	clflushopt (%r8)
Decoded ok: 66 0f ae 3c 25 78 56 34 12 	clflushopt 0x12345678
Decoded ok: 66 0f ae bc c8 78 56 34 12 	clflushopt 0x12345678(%rax,%rcx,8)
Decoded ok: 66 41 0f ae bc c8 78 56 34 12 	clflushopt 0x12345678(%r8,%rcx,8)
Decoded ok: 0f ae 38             	clflush (%rax)
Decoded ok: 41 0f ae 38          	clflush (%r8)
Decoded ok: 0f ae f8             	sfence 
Decoded ok: 66 0f ae 30          	clwb   (%rax)
Decoded ok: 66 41 0f ae 30       	clwb   (%r8)
Decoded ok: 66 0f ae 34 25 78 56 34 12 	clwb   0x12345678
Decoded ok: 66 0f ae b4 c8 78 56 34 12 	clwb   0x12345678(%rax,%rcx,8)
Decoded ok: 66 41 0f ae b4 c8 78 56 34 12 	clwb   0x12345678(%r8,%rcx,8)
Decoded ok: 0f ae 30             	xsaveopt (%rax)
Decoded ok: 41 0f ae 30          	xsaveopt (%r8)
Decoded ok: 0f ae f0             	mfence 
Decoded ok: 0f c7 20             	xsavec (%rax)
Decoded ok: 41 0f c7 20          	xsavec (%r8)
Decoded ok: 0f c7 24 25 78 56 34 12 	xsavec 0x12345678
Decoded ok: 0f c7 a4 c8 78 56 34 12 	xsavec 0x12345678(%rax,%rcx,8)
Decoded ok: 41 0f c7 a4 c8 78 56 34 12 	xsavec 0x12345678(%r8,%rcx,8)
Decoded ok: 0f c7 28             	xsaves (%rax)
Decoded ok: 41 0f c7 28          	xsaves (%r8)
Decoded ok: 0f c7 2c 25 78 56 34 12 	xsaves 0x12345678
Decoded ok: 0f c7 ac c8 78 56 34 12 	xsaves 0x12345678(%rax,%rcx,8)
Decoded ok: 41 0f c7 ac c8 78 56 34 12 	xsaves 0x12345678(%r8,%rcx,8)
Decoded ok: 0f c7 18             	xrstors (%rax)
Decoded ok: 41 0f c7 18          	xrstors (%r8)
Decoded ok: 0f c7 1c 25 78 56 34 12 	xrstors 0x12345678
Decoded ok: 0f c7 9c c8 78 56 34 12 	xrstors 0x12345678(%rax,%rcx,8)
Decoded ok: 41 0f c7 9c c8 78 56 34 12 	xrstors 0x12345678(%r8,%rcx,8)
Decoded ok: 66 0f ae f8          	pcommit 
Decoded ok: 0f 01 ee             	rdpkru
Decoded ok: 0f 01 ef             	wrpkru
test child finished with 0
---- end ----
Test x86 instruction decoder - new instructions: Ok
48: Test intel cqm nmi context read                          :
--- start ---
test child forked, pid 22924
parse_events failed, is "intel_cqm/llc_occupancy/" available?
test child finished with -2
---- end ----
Test intel cqm nmi context read: Skip
[acme@zoo linux]$ 


* Re: [PATCH 17/53] perf test: Improve bp_signal
  2016-01-12  9:21     ` Jiri Olsa
@ 2016-01-12 14:11       ` Arnaldo Carvalho de Melo
  2016-01-12 14:17         ` Will Deacon
  0 siblings, 1 reply; 124+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-01-12 14:11 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jiri Olsa, Wang Nan, linux-kernel, pi3orama, lizefan, netdev, Jiri Olsa

Em Tue, Jan 12, 2016 at 10:21:29AM +0100, Jiri Olsa escreveu:
> On Mon, Jan 11, 2016 at 06:37:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Jan 11, 2016 at 01:48:08PM +0000, Wang Nan escreveu:
> > > Will Deacon [1] has some question on patch [2]. This patch improves
> > > test__bp_signal so we can test:
> > > 
> > >  1. A watchpoint and a breakpoint that fire on the same instruction
> > >  2. Nested signals
> > > 
> > > Test result:
> > > 
> > >  On x86_64 and ARM64 (result are similar with patch [2] on ARM64):
> > > 
> > >  # ./perf test -v signal
> > >  17: Test breakpoint overflow signal handler                  :
> > >  --- start ---
> > >  test child forked, pid 10213
> > >  count1 1, count2 3, count3 2, overflow 3, overflows_2 3
> > >  test child finished with 0
> > >  ---- end ----
> > >  Test breakpoint overflow signal handler: Ok
> > > 
> > > So at least 2 cases Will doubted are handled correctly.
> > > 
> > > [1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
> > > [2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com
> > > 
> > > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > > Cc: Will Deacon <will.deacon@arm.com>
> > 
> > Will, are you ok with this one? Can I have an Acked-by or better,
> > Tested-by for the AARCH64 base?
> > 
> > IIRC Jiri made some comment about this one?
> 
> I thought I acked this one.. all comments were addressed, so:
> 
> Acked-by: Jiri Olsa <jolsa@kernel.org>

Ok, so, Will, any comments? Nack?

- Arnaldo


* Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-11 13:48 ` [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE Wang Nan
  2016-01-11 18:09   ` Alexei Starovoitov
@ 2016-01-12 14:14   ` Peter Zijlstra
  2016-01-18 11:52     ` [PATCH] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
  1 sibling, 1 reply; 124+ messages in thread
From: Peter Zijlstra @ 2016-01-12 14:14 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, linux-kernel, pi3orama, lizefan, netdev, davem,
	Adrian Hunter, Arnaldo Carvalho de Melo, David Ahern,
	Ingo Molnar, Yunlong Song

On Mon, Jan 11, 2016 at 01:48:18PM +0000, Wang Nan wrote:
> Before reading such a ring buffer, perf must ensure all events which may
> output to it are already stopped, so the 'head' pointer it gets is the
> end of the last record.

We could add an extra ioctl() to pause/resume ring-buffer output to make
this more convenient to achieve.

Say, replace the !rb->nr_pages test in perf_output_begin() with a control
variable load and ensure it's set to pause/fail when !nr_pages.

That way we only have to issue a single ioctl to freeze the buffer,
without adding extra instructions/loads to the fast path.
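
A minimal user-space sketch of that idea, assuming a toy buffer rather than
the real kernel structures. Every name below (toy_rb, toy_rb_set_paused,
toy_rb_output) is hypothetical; only perf_output_begin() and the !nr_pages
test come from the discussion above. The point it tries to show is that the
writer performs a single load of one control variable, and that the variable
is forced to "paused" whenever no pages are mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of these names are kernel API. */
struct toy_rb {
	unsigned char data[64];
	size_t nr_pages;	/* 0 means "no buffer mapped" */
	size_t head;		/* monotonically increasing write offset */
	bool paused;		/* control variable; always true when nr_pages == 0 */
};

static void toy_rb_init(struct toy_rb *rb, size_t nr_pages)
{
	memset(rb, 0, sizeof(*rb));
	rb->nr_pages = nr_pages;
	rb->paused = (nr_pages == 0);
}

/* Stand-in for the proposed pause/resume ioctl. */
static int toy_rb_set_paused(struct toy_rb *rb, bool pause)
{
	if (rb->nr_pages == 0)
		return -1;	/* an unmapped buffer must stay paused */
	rb->paused = pause;
	return 0;
}

/* Stand-in for the check at the top of perf_output_begin(): one load. */
static int toy_rb_output(struct toy_rb *rb, const void *rec, size_t len)
{
	const unsigned char *src = rec;

	if (rb->paused)
		return -1;
	if (len > sizeof(rb->data))
		return -1;
	for (size_t i = 0; i < len; i++)
		rb->data[(rb->head + i) % sizeof(rb->data)] = src[i];
	rb->head += len;
	return 0;
}
```

With this shape a reader would pause the buffer once, read 'head', consume
the data and resume, instead of stopping every event that may output to it.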


* Re: [PATCH 17/53] perf test: Improve bp_signal
  2016-01-12 14:11       ` Arnaldo Carvalho de Melo
@ 2016-01-12 14:17         ` Will Deacon
  0 siblings, 0 replies; 124+ messages in thread
From: Will Deacon @ 2016-01-12 14:17 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Wang Nan, linux-kernel, pi3orama, lizefan, netdev, Jiri Olsa

On Tue, Jan 12, 2016 at 11:11:23AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Jan 12, 2016 at 10:21:29AM +0100, Jiri Olsa escreveu:
> > On Mon, Jan 11, 2016 at 06:37:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Jan 11, 2016 at 01:48:08PM +0000, Wang Nan escreveu:
> > > > Will Deacon [1] has some question on patch [2]. This patch improves
> > > > test__bp_signal so we can test:
> > > > 
> > > >  1. A watchpoint and a breakpoint that fire on the same instruction
> > > >  2. Nested signals
> > > > 
> > > > Test result:
> > > > 
> > > >  On x86_64 and ARM64 (result are similar with patch [2] on ARM64):
> > > > 
> > > >  # ./perf test -v signal
> > > >  17: Test breakpoint overflow signal handler                  :
> > > >  --- start ---
> > > >  test child forked, pid 10213
> > > >  count1 1, count2 3, count3 2, overflow 3, overflows_2 3
> > > >  test child finished with 0
> > > >  ---- end ----
> > > >  Test breakpoint overflow signal handler: Ok
> > > > 
> > > > So at least 2 cases Will doubted are handled correctly.
> > > > 
> > > > [1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
> > > > [2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com
> > > > 
> > > > Signed-off-by: Wang Nan <wangnan0@huawei.com>
> > > > Cc: Will Deacon <will.deacon@arm.com>
> > > 
> > > Will, are you ok with this one? Can I have an Acked-by or better,
> > > Tested-by for the AARCH64 base?
> > > 
> > > IIRC Jiri made some comment about this one?
> > 
> > I thought I acked this one.. all comments were addressed, so:
> > 
> > Acked-by: Jiri Olsa <jolsa@kernel.org>
> 
> Ok, so, Will, any comments? Nack?

Sorry, snowed under at the moment. I need to go back over the arch/arm64
patch, since I did have some concerns on that and the changes to the
perf tool don't do a lot without the corresponding architecture update
which I'm extremely nervous about.

I'll revisit that patch once I've got through the more pressing changes
in the queue.

Will


* Re: [PATCH 16/53 v2] perf tools: Fix mmap2 event allocation in synthesize code
  2016-01-12 10:51         ` Wangnan (F)
@ 2016-01-12 14:24           ` acme
  2016-01-13  0:40             ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 1 reply; 124+ messages in thread
From: acme @ 2016-01-12 14:24 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: 平松雅巳 / HIRAMATU,MASAMI, jolsa,
	linux-kernel, He Kuang, Namhyung Kim, Zefan Li, pi3orama

Em Tue, Jan 12, 2016 at 06:51:07PM +0800, Wangnan (F) escreveu:
> On 2016/1/12 18:49, 平松雅巳 / HIRAMATU,MASAMI wrote:
> >>From: Wang Nan [mailto:wangnan0@huawei.com]
> >>perf_event__synthesize_mmap_events() issues mmap2 events, but the
> >>memory of that event is allocated using:
> >>
> >>mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
> >>
> >>If the path of the mmap source file is long (near PATH_MAX), a random
> >>crash can happen. sizeof(mmap_event->mmap2) should be used instead.
> >>
> >>Fix two memory allocations.
> >Looks good to me. But hope to have another rename patch soon after this...
> 
> According to Arnaldo, we don't need a rename patch. He thinks mmap_event
> is okay. Right?

Right, no need for the rename.

- Arnaldo


* Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-12 12:36         ` Wangnan (F)
@ 2016-01-12 19:56           ` Alexei Starovoitov
  2016-01-13  4:34             ` Wangnan (F)
  0 siblings, 1 reply; 124+ messages in thread
From: Alexei Starovoitov @ 2016-01-12 19:56 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Peter Zijlstra, acme, linux-kernel, pi3orama, lizefan, netdev,
	davem, Adrian Hunter, Arnaldo Carvalho de Melo, David Ahern,
	Ingo Molnar, Yunlong Song

On Tue, Jan 12, 2016 at 08:36:23PM +0800, Wangnan (F) wrote:
> >hmm, in this kernel patch I see that you're adding 8 bytes for
> >every record via this extra TAILSIZE flag and in perf you're
> >walking the ring buffer backwards by reading these 8-byte
> >sizes, comparing header sizes and so on until reaching the beginning,
> >where you start dumping it as normal.
> >So for this 'signal to perf' approach to work the ring buffer
> >will contain tailsizes everywhere just so that user space can
> >find the beginning. That's not very pretty. imo if the kernel
> >can do a header read to adjust data_tail it would make the user
> >space side clean. Maybe there are other solutions.
> >Adding tailsize seems like a brute force hack.
> >There must be some nicer way.
> Hi Peter,
> 
>  What's your opinion? Should we reconsider moving the size field from the
> header to the end?
> Or moving the whole header to the end of a record?

I think moving the whole header under a new TAILHEADER flag is
actually a very good idea. The ring buffer will be fully utilized
and no extra bytes are necessary. User space would need to parse it
backwards, but for this use case it fits well.
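
For reference, the backward walk that the tail-size approach needs in user
space can be sketched roughly as below. This is only a toy model under
stated assumptions: the record layout [u64 size][payload][u64 size], the
64-byte buffer and all names (tail_rb, tail_rb_write, tail_rb_find_oldest)
are made up for illustration and are not the format used by the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define TAIL_RB_SIZE 64u	/* toy size, must be a power of two */

struct tail_rb {
	unsigned char data[TAIL_RB_SIZE];
	uint64_t head;		/* monotonically increasing write position */
};

static void tail_rb_copy_in(struct tail_rb *rb, uint64_t pos,
			    const void *src, size_t len)
{
	const unsigned char *s = src;

	for (size_t i = 0; i < len; i++)
		rb->data[(pos + i) & (TAIL_RB_SIZE - 1)] = s[i];
}

static void tail_rb_copy_out(const struct tail_rb *rb, uint64_t pos,
			     void *dst, size_t len)
{
	unsigned char *d = dst;

	for (size_t i = 0; i < len; i++)
		d[i] = rb->data[(pos + i) & (TAIL_RB_SIZE - 1)];
}

/* Append [u64 size][payload][u64 size], overwriting the oldest data. */
static void tail_rb_write(struct tail_rb *rb, const void *payload, size_t len)
{
	uint64_t size = 8 + len + 8;

	assert(size <= TAIL_RB_SIZE);
	tail_rb_copy_in(rb, rb->head, &size, 8);
	tail_rb_copy_in(rb, rb->head + 8, payload, len);
	tail_rb_copy_in(rb, rb->head + 8 + len, &size, 8);
	rb->head += size;
}

/*
 * Walk backwards from head using the trailing sizes. Returns the number
 * of intact records and stores the start of the oldest one in *oldest.
 */
static unsigned tail_rb_find_oldest(const struct tail_rb *rb, uint64_t *oldest)
{
	uint64_t pos = rb->head;
	uint64_t avail = rb->head < TAIL_RB_SIZE ? rb->head : TAIL_RB_SIZE;
	unsigned n = 0;

	while (avail >= 16) {
		uint64_t tail, hdr;

		tail_rb_copy_out(rb, pos - 8, &tail, 8);
		if (tail < 16 || tail > avail)
			break;	/* record starts in the overwritten area */
		tail_rb_copy_out(rb, pos - tail, &hdr, 8);
		if (hdr != tail)
			break;	/* header and tail size disagree */
		pos -= tail;
		avail -= tail;
		n++;
	}
	*oldest = pos;
	return n;
}
```

Once the oldest intact record is found, dumping can proceed forwards from
*oldest to head as usual.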


* RE: [PATCH 16/53 v2] perf tools: Fix mmap2 event allocation in synthesize code
  2016-01-12 14:24           ` acme
@ 2016-01-13  0:40             ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 0 replies; 124+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2016-01-13  0:40 UTC (permalink / raw)
  To: 'acme@kernel.org', Wangnan (F)
  Cc: jolsa, linux-kernel, He Kuang, Namhyung Kim, Zefan Li, pi3orama

From: acme@kernel.org [mailto:acme@kernel.org]
>
>Em Tue, Jan 12, 2016 at 06:51:07PM +0800, Wangnan (F) escreveu:
>> On 2016/1/12 18:49, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> >>From: Wang Nan [mailto:wangnan0@huawei.com]
>> >>perf_event__synthesize_mmap_events() issues mmap2 events, but the
>> >>memory of that event is allocated using:
>> >>
>> >>mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
>> >>
>> >>If the path of the mmap source file is long (near PATH_MAX), a random
>> >>crash can happen. sizeof(mmap_event->mmap2) should be used instead.
>> >>
>> >>Fix two memory allocations.
>> >Looks good to me. But hope to have another rename patch soon after this...
>>
>> According to Arnaldo, we don't need a rename patch. He thinks mmap_event
>> is okay. Right?
>
>Right, no need for the rename.
>

OK, confirmed :)

Thanks!


* Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-12 19:56           ` Alexei Starovoitov
@ 2016-01-13  4:34             ` Wangnan (F)
  2016-01-13  5:14               ` Alexei Starovoitov
  0 siblings, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-13  4:34 UTC (permalink / raw)
  To: Alexei Starovoitov, Peter Zijlstra
  Cc: acme, linux-kernel, pi3orama, lizefan, netdev, davem,
	Adrian Hunter, Arnaldo Carvalho de Melo, David Ahern,
	Ingo Molnar, Yunlong Song



On 2016/1/13 3:56, Alexei Starovoitov wrote:
> On Tue, Jan 12, 2016 at 08:36:23PM +0800, Wangnan (F) wrote:
>>> hmm, in this kernel patch I see that you're adding 8 bytes for
>>> every record via this extra TAILSIZE flag and in perf you're
>>> walking the ring buffer backwards by reading these 8-byte
>>> sizes, comparing header sizes and so on until reaching the beginning,
>>> where you start dumping it as normal.
>>> So for this 'signal to perf' approach to work the ring buffer
>>> will contain tailsizes everywhere just so that user space can
>>> find the beginning. That's not very pretty. imo if the kernel
>>> can do a header read to adjust data_tail it would make the user
>>> space side clean. Maybe there are other solutions.
>>> Adding tailsize seems like a brute force hack.
>>> There must be some nicer way.
>> Hi Peter,
>>
>>   What's your opinion? Should we reconsider moving the size field from the
>> header to the end?
>> Or moving the whole header to the end of a record?
> I think moving the whole header under a new TAILHEADER flag is
> actually a very good idea. The ring buffer will be fully utilized
> and no extra bytes are necessary. User space would need to parse it
> backwards, but for this use case it fits well.

I have another crazy suggestion: can we make the kernel write to
the ring buffer from the end to the beginning? For example:

This is the initial state of the ring buffer; the head pointer
points to the end of it:

       -------------> Address increase

                                     head
                                       |
                                       V
  +--+---+-------+----------+------+---+
  |                                    |
  +--+---+-------+----------+------+---+


Write the first event at the end of the ring buffer, and *decrease*
the head pointer:

                                 head
                                   |
                                   V
  +--+---+-------+----------+------+---+
  |                                | A |
  +--+---+-------+----------+------+---+


Another record:
                           head
                            |
                            V
  +--+---+-------+----------+------+---+
  |                         |   B  | A |
  +--+---+-------+----------+------+---+


Ring buffer rewind, A is fully overwritten and B is broken:

                                head
                                  |
                                  V
  +--+---+-------+----------+-----+----+
  |F | E |   D   | C        | ... | F  |
  +--+---+-------+----------+-----+----+

At this point the user can parse the ring buffer normally from
F to C. From the timestamps in it, the user knows which record is
the oldest.

This way perf doesn't need much extra work. There's no
performance penalty at all, and the 8 bytes are saved.
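
A rough user-space model of this backward-writing scheme, to make the
parsing concrete. It is only a sketch: the record layout [u64 size][payload],
the 64-byte buffer, the 'written' counter and all names (bw_rb, bw_write,
bw_read) are assumptions for illustration, not the kernel's format.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BW_SIZE 64u	/* toy size, must be a power of two */

struct bw_rb {
	unsigned char data[BW_SIZE];
	uint64_t head;		/* start of the newest record; decreases on write */
	uint64_t written;	/* total bytes ever written */
};

static void bw_copy_in(struct bw_rb *rb, uint64_t pos,
		       const void *src, size_t len)
{
	const unsigned char *s = src;

	for (size_t i = 0; i < len; i++)
		rb->data[(pos + i) & (BW_SIZE - 1)] = s[i];
}

static void bw_copy_out(const struct bw_rb *rb, uint64_t pos,
			void *dst, size_t len)
{
	unsigned char *d = dst;

	for (size_t i = 0; i < len; i++)
		d[i] = rb->data[(pos + i) & (BW_SIZE - 1)];
}

/* Move head down by the record size, then write [u64 size][payload] there. */
static void bw_write(struct bw_rb *rb, const void *payload, size_t len)
{
	uint64_t size = 8 + len;

	assert(size <= BW_SIZE);
	rb->head -= size;	/* unsigned wrap is fine: positions are masked */
	bw_copy_in(rb, rb->head, &size, 8);
	bw_copy_in(rb, rb->head + 8, payload, len);
	rb->written += size;
}

/*
 * Parse forwards from head, newest record first. Stores the first payload
 * byte of each intact record in tags[] and returns how many were found.
 */
static unsigned bw_read(const struct bw_rb *rb, unsigned char *tags, unsigned max)
{
	uint64_t pos = rb->head;
	uint64_t avail = rb->written < BW_SIZE ? rb->written : BW_SIZE;
	unsigned n = 0;

	while (avail >= 8 && n < max) {
		uint64_t hdr;

		bw_copy_out(rb, pos, &hdr, 8);
		if (hdr < 8 || hdr > avail)
			break;	/* record runs into the overwritten area */
		tags[n] = 0;
		if (hdr > 8)
			bw_copy_out(rb, pos + 8, &tags[n], 1);
		n++;
		pos += hdr;
		avail -= hdr;
	}
	return n;
}
```

In this model the reader never walks backwards: it starts at head and stops
at the first record whose size no longer fits in the remaining valid bytes.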

Thoughts?

Thank you.


* Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-13  4:34             ` Wangnan (F)
@ 2016-01-13  5:14               ` Alexei Starovoitov
  0 siblings, 0 replies; 124+ messages in thread
From: Alexei Starovoitov @ 2016-01-13  5:14 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Peter Zijlstra, acme, linux-kernel, pi3orama, lizefan, netdev,
	davem, Adrian Hunter, Arnaldo Carvalho de Melo, David Ahern,
	Ingo Molnar, Yunlong Song

On Wed, Jan 13, 2016 at 12:34:19PM +0800, Wangnan (F) wrote:
> 
> >>Or moving the whole header to the end of a record?
> >I think moving the whole header under a new TAILHEADER flag is
> >actually a very good idea. The ring buffer will be fully utilized
> >and no extra bytes are necessary. User space would need to parse it
> >backwards, but for this use case it fits well.
> 
> I have another crazy suggestion: can we make the kernel write to
> the ring buffer from the end to the beginning? For example:
> 
> This is the initial state of the ring buffer; the head pointer
> points to the end of it:
> 
>       -------------> Address increase
> 
>                                     head
>                                       |
>                                       V
>  +--+---+-------+----------+------+---+
>  |                                    |
>  +--+---+-------+----------+------+---+
> 
> 
> Write the first event at the end of the ring buffer, and *decrease*
> the head pointer:
> 
>                                 head
>                                   |
>                                   V
>  +--+---+-------+----------+------+---+
>  |                                | A |
>  +--+---+-------+----------+------+---+
> 
> 
> Another record:
>                           head
>                            |
>                            V
>  +--+---+-------+----------+------+---+
>  |                         |   B  | A |
>  +--+---+-------+----------+------+---+
> 
> 
> Ring buffer rewind, A is fully overwritten and B is broken:
> 
>                                head
>                                  |
>                                  V
>  +--+---+-------+----------+-----+----+
>  |F | E |   D   | C        | ... | F  |
>  +--+---+-------+----------+-----+----+
> 
> At this point the user can parse the ring buffer normally from
> F to C. From the timestamps in it, the user knows which record is
> the oldest.
> 
> This way perf doesn't need much extra work. There's no
> performance penalty at all, and the 8 bytes are saved.
> 
> Thoughts?

I like it.
I think from an algorithmic standpoint it's very pretty, but real
CPUs may not like to stream the data backwards. x86 can detect
the stride and prefetch the next cache line when the stride is
positive. I don't think there is such hw logic for negative strides.
So if it's not too hard, I would suggest implementing both of
your ideas. If negative stride is just as fast as normal, then
let's use that, since it doesn't change the header and nothing
needs to change on the perf side or in any other tools that read
the ring buffer manually.


* [tip:perf/urgent] perf tools: Fix mmap2 event allocation in synthesize code
  2016-01-12 10:12     ` [PATCH 16/53 v2] " Wang Nan
  2016-01-12 10:49       ` 平松雅巳 / HIRAMATU,MASAMI
@ 2016-01-13  9:40       ` tip-bot for Wang Nan
  1 sibling, 0 replies; 124+ messages in thread
From: tip-bot for Wang Nan @ 2016-01-13  9:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hekuang, hpa, namhyung, jolsa, tglx, linux-kernel, acme, lizefan,
	wangnan0, mingo, masami.hiramatsu.pt

Commit-ID:  b0fb978e97f58ca930f7cafc4ddc264218710765
Gitweb:     http://git.kernel.org/tip/b0fb978e97f58ca930f7cafc4ddc264218710765
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Tue, 12 Jan 2016 10:12:04 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 12 Jan 2016 11:24:43 -0300

perf tools: Fix mmap2 event allocation in synthesize code

perf_event__synthesize_mmap_events() issues mmap2 events, but the memory
of that event is allocated using:

 mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);

If the path of the mmap source file is long (near PATH_MAX), random
crashes can happen. sizeof(mmap_event->mmap2) should be used instead.

Fix two memory allocations.
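
The sizing problem can be shown in isolation. The following is a minimal
sketch under stated assumptions: sketch_mmap, sketch_mmap2, sketch_event
and SKETCH_PATH_MAX are hypothetical stand-ins and do not match perf's
real struct layouts. The point is only that the second record type
carries extra fields before the filename, so a buffer sized for the
first type is too small when the filename is close to its maximum
length.

```c
#include <stdlib.h>

#define SKETCH_PATH_MAX 4096	/* hypothetical stand-in for PATH_MAX */

struct sketch_mmap {
	unsigned long long start, len, pgoff;
	char filename[SKETCH_PATH_MAX];
};

struct sketch_mmap2 {
	unsigned long long start, len, pgoff;
	unsigned int maj, min;
	unsigned long long ino, ino_generation;
	unsigned int prot, flags;
	char filename[SKETCH_PATH_MAX];
};

union sketch_event {
	struct sketch_mmap mmap;
	struct sketch_mmap2 mmap2;
};

/* Bytes that would be missing if the buffer were sized for .mmap. */
static size_t sketch_shortfall(void)
{
	return sizeof(struct sketch_mmap2) - sizeof(struct sketch_mmap);
}

/* Size the buffer by the member that is actually written. */
static union sketch_event *sketch_alloc_mmap2(size_t id_hdr_size)
{
	union sketch_event *ev = NULL;

	/* sizeof does not evaluate its operand, so ev is never dereferenced */
	return malloc(sizeof(ev->mmap2) + id_hdr_size);
}
```

Sizing by the member name, rather than by the whole union, keeps the
allocation tight while still covering the larger record.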

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452593524-138970-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/event.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index cd61bb1..85155e9 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -503,7 +503,7 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 	if (comm_event == NULL)
 		goto out;
 
-	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
+	mmap_event = malloc(sizeof(mmap_event->mmap2) + machine->id_hdr_size);
 	if (mmap_event == NULL)
 		goto out_free_comm;
 
@@ -577,7 +577,7 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
 	if (comm_event == NULL)
 		goto out;
 
-	mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
+	mmap_event = malloc(sizeof(mmap_event->mmap2) + machine->id_hdr_size);
 	if (mmap_event == NULL)
 		goto out_free_comm;
 


* [PATCH] perf core: Introduce new ioctl options to pause and resume ring buffer
  2016-01-12 14:14   ` Peter Zijlstra
@ 2016-01-18 11:52     ` Wang Nan
  2016-01-18 12:02       ` Peter Zijlstra
  0 siblings, 1 reply; 124+ messages in thread
From: Wang Nan @ 2016-01-18 11:52 UTC (permalink / raw)
  To: acme, peterz
  Cc: linux-kernel, pi3orama, lizefan, Wang Nan, He Kuang,
	Alexei Starovoitov, Arnaldo Carvalho de Melo, Brendan Gregg,
	David S. Miller, Jiri Olsa, Masami Hiramatsu, Namhyung Kim

Add an extra ioctl() to pause/resume ring-buffer output.

In some situations we want to read from the ring buffer only when we
can ensure that nothing writes to it during reading. Without this
patch we have to turn off all events attached to this ring buffer.
This patch is for supporting an overwritable ring buffer with TAILSIZE
selected.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/uapi/linux/perf_event.h |  2 ++
 kernel/events/core.c            | 14 ++++++++++++++
 kernel/events/internal.h        | 11 +++++++++++
 kernel/events/ring_buffer.c     |  4 +++-
 4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 4e8dde8..9508070 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -402,6 +402,8 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
 #define PERF_EVENT_IOC_ID		_IOR('$', 7, __u64 *)
 #define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, __u32)
+#define PERF_EVENT_IOC_PAUSE_OUTPUT	_IO ('$', 9)
+#define PERF_EVENT_IOC_RESUME_OUTPUT	_IO ('$', 10)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2d59b59..d5a0c34 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4241,6 +4241,20 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 	case PERF_EVENT_IOC_SET_BPF:
 		return perf_event_set_bpf_prog(event, arg);
 
+	case PERF_EVENT_IOC_PAUSE_OUTPUT:
+	case PERF_EVENT_IOC_RESUME_OUTPUT: {
+		struct ring_buffer *rb;
+
+		rcu_read_lock();
+		rb = rcu_dereference(event->rb);
+		if (!rb) {
+			rcu_read_unlock();
+			return -EINVAL;
+		}
+		rb_toggle_paused(rb, cmd == PERF_EVENT_IOC_PAUSE_OUTPUT);
+		rcu_read_unlock();
+		return 0;
+	}
 	default:
 		return -ENOTTY;
 	}
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 2bbad9c..6a93d1b 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -18,6 +18,7 @@ struct ring_buffer {
 #endif
 	int				nr_pages;	/* nr of data pages  */
 	int				overwrite;	/* can overwrite itself */
+	int				paused;		/* can write into ring buffer */
 
 	atomic_t			poll;		/* POLL_ for wakeups */
 
@@ -65,6 +66,16 @@ static inline void rb_free_rcu(struct rcu_head *rcu_head)
 	rb_free(rb);
 }
 
+static inline void
+rb_toggle_paused(struct ring_buffer *rb,
+		 bool pause)
+{
+	if (!pause && rb->nr_pages)
+		rb->paused = 0;
+	else
+		rb->paused = 1;
+}
+
 extern struct ring_buffer *
 rb_alloc(int nr_pages, long watermark, int cpu, int flags);
 extern void perf_event_wakeup(struct perf_event *event);
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 5f8bd89..11a1676 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -125,7 +125,7 @@ int perf_output_begin(struct perf_output_handle *handle,
 	if (unlikely(!rb))
 		goto out;
 
-	if (unlikely(!rb->nr_pages))
+	if (unlikely(rb->paused))
 		goto out;
 
 	handle->rb    = rb;
@@ -245,6 +245,8 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
 	INIT_LIST_HEAD(&rb->event_list);
 	spin_lock_init(&rb->event_lock);
 	init_irq_work(&rb->irq_work, rb_irq_work);
+
+	rb->paused = rb->nr_pages ? 0 : 1;
 }
 
 static void ring_buffer_put_async(struct ring_buffer *rb)
-- 
1.8.3.4


* Re: [PATCH] perf core: Introduce new ioctl options to pause and resume ring buffer
  2016-01-18 11:52     ` [PATCH] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
@ 2016-01-18 12:02       ` Peter Zijlstra
  2016-01-19  2:55         ` Wangnan (F)
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
  0 siblings, 2 replies; 124+ messages in thread
From: Peter Zijlstra @ 2016-01-18 12:02 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, linux-kernel, pi3orama, lizefan, He Kuang,
	Alexei Starovoitov, Arnaldo Carvalho de Melo, Brendan Gregg,
	David S. Miller, Jiri Olsa, Masami Hiramatsu, Namhyung Kim

On Mon, Jan 18, 2016 at 11:52:01AM +0000, Wang Nan wrote:

> +#define PERF_EVENT_IOC_PAUSE_OUTPUT	_IO ('$', 9)
> +#define PERF_EVENT_IOC_RESUME_OUTPUT	_IO ('$', 10)

Would not a single IOCTL with a 'boolean' parameter make more sense?

> +++ b/kernel/events/ring_buffer.c
> @@ -125,7 +125,7 @@ int perf_output_begin(struct perf_output_handle *handle,
>  	if (unlikely(!rb))
>  		goto out;
>  
> -	if (unlikely(!rb->nr_pages))
> +	if (unlikely(rb->paused))
>  		goto out;

Should we increment rb->lost in this case?


* Re: [PATCH] perf core: Introduce new ioctl options to pause and resume ring buffer
  2016-01-18 12:02       ` Peter Zijlstra
@ 2016-01-19  2:55         ` Wangnan (F)
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
  1 sibling, 0 replies; 124+ messages in thread
From: Wangnan (F) @ 2016-01-19  2:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, linux-kernel, pi3orama, lizefan, He Kuang,
	Alexei Starovoitov, Arnaldo Carvalho de Melo, Brendan Gregg,
	David S. Miller, Jiri Olsa, Masami Hiramatsu, Namhyung Kim



On 2016/1/18 20:02, Peter Zijlstra wrote:
> On Mon, Jan 18, 2016 at 11:52:01AM +0000, Wang Nan wrote:
>
>> +#define PERF_EVENT_IOC_PAUSE_OUTPUT	_IO ('$', 9)
>> +#define PERF_EVENT_IOC_RESUME_OUTPUT	_IO ('$', 10)
> Would not a single IOCTL with a 'boolean' parameter make more sense?

Good suggestion.

>> +++ b/kernel/events/ring_buffer.c
>> @@ -125,7 +125,7 @@ int perf_output_begin(struct perf_output_handle *handle,
>>   	if (unlikely(!rb))
>>   		goto out;
>>   
>> -	if (unlikely(!rb->nr_pages))
>> +	if (unlikely(rb->paused))
>>   		goto out;
> Should we increment rb->lost in this case?

Not sure about this. The ring buffer is paused deliberately, so should we
consider the events we miss as lost events? However, I'll try it in the
next version.

Thank you.


* [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-18 12:02       ` Peter Zijlstra
  2016-01-19  2:55         ` Wangnan (F)
@ 2016-01-19 11:16         ` Wang Nan
  2016-01-19 11:16           ` [PATCH 1/6] perf core: Introduce new ioctl options to pause and resume " Wang Nan
                             ` (7 more replies)
  1 sibling, 8 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-19 11:16 UTC (permalink / raw)
  To: peterz, ast
  Cc: linux-kernel, Wang Nan, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama

This patchset introduces two methods to support reading from an
overwrite ring buffer:

 1) Tailsize: write the size of an event at the end of it
 2) Backward writing: write the ring buffer from the end of it to the
    beginning.
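
The first method can be sketched in user space as follows. This is an
illustration only, under assumptions that are not taken from the
patches: the names ts_ring, ts_write and ts_read_newest_first, the
16-byte buffer and the 1-byte trailing size are hypothetical. Records
are written forward, each followed by its total length, so a reader can
start at head and step backward one record at a time.

```c
#include <stddef.h>

#define TS_RING_SIZE 16u	/* hypothetical size; must be a power of two */

struct ts_ring {
	unsigned char data[TS_RING_SIZE];	/* zero-initialised by caller */
	unsigned long head;	/* free-running; increases on every write */
};

/* Store one record: the payload followed by a 1-byte total length. */
static int ts_write(struct ts_ring *rb, const void *payload,
		    unsigned char payload_len)
{
	const unsigned char *p = payload;
	unsigned long len = payload_len + 1ul;
	unsigned long i;

	if (payload_len == 0 || len > TS_RING_SIZE)
		return -1;

	for (i = 0; i < payload_len; i++)
		rb->data[(rb->head + i) & (TS_RING_SIZE - 1)] = p[i];
	rb->data[(rb->head + payload_len) & (TS_RING_SIZE - 1)] =
		(unsigned char)len;
	rb->head += len;
	return 0;
}

/*
 * Walk backward from head using the trailing sizes, newest to oldest.
 * Stores the first payload byte of each complete record in first_bytes
 * and returns the number of complete records found.
 */
static size_t ts_read_newest_first(const struct ts_ring *rb,
				   unsigned char *first_bytes, size_t max)
{
	unsigned long pos = rb->head, consumed = 0;
	size_t n = 0;

	while (n < max) {
		unsigned char len = rb->data[(pos - 1) & (TS_RING_SIZE - 1)];

		/* 0: never written; too long: partially overwritten record */
		if (len == 0 || consumed + len > TS_RING_SIZE)
			break;
		pos -= len;
		consumed += len;
		first_bytes[n++] = rb->data[pos & (TS_RING_SIZE - 1)];
	}
	return n;
}
```

The second method needs no trailing size at all, because a record's
header is then the first thing a reader meets when walking from head.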

Patch 1/6 introduces a new ioctl operation to pause and resume the ring
buffer, since reading from an overwrite ring buffer while events are
still being written to it is not reliable.

To reduce overhead as much as possible, always set overflow_handler and
create separate functions for backward writing and onward writing.

Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com

Wang Nan (6):
  perf core: Introduce new ioctl options to pause and resume ring buffer
  perf core: Set event's default overflow_handler
  perf core: Prepare writing into ring buffer from end
  perf core: Add backwork attribute to perf event
  perf core: Reduce perf event output overhead by setting overwrite
    handler
  perf/core: Put size of a sample at the end of it by
    PERF_SAMPLE_TAILSIZE

 include/linux/perf_event.h      |  39 +++++++---
 include/uapi/linux/perf_event.h |   7 +-
 kernel/events/core.c            | 155 +++++++++++++++++++++++++++++++---------
 kernel/events/internal.h        |  11 +++
 kernel/events/ring_buffer.c     |  65 ++++++++++++++---
 5 files changed, 223 insertions(+), 54 deletions(-)

-- 
1.8.3.4


* [PATCH 1/6] perf core: Introduce new ioctl options to pause and resume ring buffer
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
@ 2016-01-19 11:16           ` Wang Nan
  2016-01-19 11:16           ` [PATCH 2/6] perf core: Set event's default overflow_handler Wang Nan
                             ` (6 subsequent siblings)
  7 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-19 11:16 UTC (permalink / raw)
  To: peterz, ast
  Cc: linux-kernel, Wang Nan, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama

Add new ioctl() to pause/resume ring-buffer output.

In some situations we want to read from the ring buffer only when we
can ensure that nothing writes to it during reading. Without this
patch we have to turn off all events attached to this ring buffer.
This patch is for supporting an overwritable ring buffer with TAILSIZE
selected.
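
The intended behaviour can be modelled outside the kernel. The sketch
below is a hedged user-space model of the pause logic, not kernel code:
sketch_rb and the sketch_* functions are hypothetical names, and plain
counters stand in for the kernel's local_t fields. It follows the same
rules as the hunks in this patch: a buffer without data pages is always
paused, and a record dropped on a paused buffer that does have pages is
counted as lost.

```c
#include <stdbool.h>

struct sketch_rb {
	int nr_pages;		/* number of data pages */
	int paused;		/* non-zero: writers must not write */
	unsigned long lost;	/* records dropped while paused */
	unsigned long written;	/* records accepted */
};

static void sketch_rb_init(struct sketch_rb *rb, int nr_pages)
{
	rb->nr_pages = nr_pages;
	rb->paused = nr_pages ? 0 : 1;	/* no pages: permanently paused */
	rb->lost = 0;
	rb->written = 0;
}

static void sketch_rb_toggle_paused(struct sketch_rb *rb, bool pause)
{
	/* Resuming is only possible when there is somewhere to write. */
	if (!pause && rb->nr_pages)
		rb->paused = 0;
	else
		rb->paused = 1;
}

/* Returns 0 if the record is accepted, -1 if it is dropped. */
static int sketch_output_begin(struct sketch_rb *rb)
{
	if (rb->paused) {
		if (rb->nr_pages)
			rb->lost++;
		return -1;
	}
	rb->written++;
	return 0;
}
```

A reader would pause the buffer, consume its contents, and then resume
it; anything that fires in between shows up in the lost counter.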

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 13 +++++++++++++
 kernel/events/internal.h        | 11 +++++++++++
 kernel/events/ring_buffer.c     |  7 ++++++-
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 1afe962..2c7f00c 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -401,6 +401,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
 #define PERF_EVENT_IOC_ID		_IOR('$', 7, __u64 *)
 #define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, __u32)
+#define PERF_EVENT_IOC_PAUSE_OUTPUT	_IO ('$', 9)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index bf82441..9e9c84da 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4241,6 +4241,19 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 	case PERF_EVENT_IOC_SET_BPF:
 		return perf_event_set_bpf_prog(event, arg);
 
+	case PERF_EVENT_IOC_PAUSE_OUTPUT: {
+		struct ring_buffer *rb;
+
+		rcu_read_lock();
+		rb = rcu_dereference(event->rb);
+		if (!rb) {
+			rcu_read_unlock();
+			return -EINVAL;
+		}
+		rb_toggle_paused(rb, !!arg);
+		rcu_read_unlock();
+		return 0;
+	}
 	default:
 		return -ENOTTY;
 	}
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 2bbad9c..6a93d1b 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -18,6 +18,7 @@ struct ring_buffer {
 #endif
 	int				nr_pages;	/* nr of data pages  */
 	int				overwrite;	/* can overwrite itself */
+	int				paused;		/* can write into ring buffer */
 
 	atomic_t			poll;		/* POLL_ for wakeups */
 
@@ -65,6 +66,16 @@ static inline void rb_free_rcu(struct rcu_head *rcu_head)
 	rb_free(rb);
 }
 
+static inline void
+rb_toggle_paused(struct ring_buffer *rb,
+		 bool pause)
+{
+	if (!pause && rb->nr_pages)
+		rb->paused = 0;
+	else
+		rb->paused = 1;
+}
+
 extern struct ring_buffer *
 rb_alloc(int nr_pages, long watermark, int cpu, int flags);
 extern void perf_event_wakeup(struct perf_event *event);
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index adfdc05..9f1a93f 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -125,8 +125,11 @@ int perf_output_begin(struct perf_output_handle *handle,
 	if (unlikely(!rb))
 		goto out;
 
-	if (unlikely(!rb->nr_pages))
+	if (unlikely(rb->paused)) {
+		if (rb->nr_pages)
+			local_inc(&rb->lost);
 		goto out;
+	}
 
 	handle->rb    = rb;
 	handle->event = event;
@@ -244,6 +247,8 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
 	INIT_LIST_HEAD(&rb->event_list);
 	spin_lock_init(&rb->event_lock);
 	init_irq_work(&rb->irq_work, rb_irq_work);
+
+	rb->paused = rb->nr_pages ? 0 : 1;
 }
 
 static void ring_buffer_put_async(struct ring_buffer *rb)
-- 
1.8.3.4


* [PATCH 2/6] perf core: Set event's default overflow_handler
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
  2016-01-19 11:16           ` [PATCH 1/6] perf core: Introduce new ioctl options to pause and resume " Wang Nan
@ 2016-01-19 11:16           ` Wang Nan
  2016-01-19 11:16           ` [PATCH 3/6] perf core: Prepare writing into ring buffer from end Wang Nan
                             ` (5 subsequent siblings)
  7 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-19 11:16 UTC (permalink / raw)
  To: peterz, ast
  Cc: linux-kernel, Wang Nan, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama

Set a default event->overflow_handler in perf_event_alloc() so that
__perf_event_overflow() no longer needs to check event->overflow_handler.
Following commits can give a different default overflow_handler.

No extra overhead is introduced into the hot path because the original
code already had to read this handler from memory. A conditional branch
is avoided, so we actually remove some instructions.
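
The pattern is the usual "install a default callback at allocation
time" idiom. As a minimal sketch, assuming hypothetical stand-in types
(sketch_event, sketch_handler_t and the sketch_* functions are not
kernel names), it looks like this:

```c
#include <stddef.h>

struct sketch_event;
typedef void (*sketch_handler_t)(struct sketch_event *ev);

struct sketch_event {
	sketch_handler_t overflow_handler;	/* never NULL after init */
	int default_calls;
	int custom_calls;
};

/* Stand-in for the default output path. */
static void sketch_default_output(struct sketch_event *ev)
{
	ev->default_calls++;
}

/* Stand-in for a handler supplied by the caller. */
static void sketch_custom_output(struct sketch_event *ev)
{
	ev->custom_calls++;
}

/* Decide once, at setup time, which handler the event uses. */
static void sketch_event_init(struct sketch_event *ev,
			      sketch_handler_t handler)
{
	ev->overflow_handler = handler ? handler : sketch_default_output;
	ev->default_calls = 0;
	ev->custom_calls = 0;
}

/* Hot path: one indirect call, no NULL check. */
static void sketch_overflow(struct sketch_event *ev)
{
	ev->overflow_handler(ev);
}
```

The choice is made once per event instead of once per overflow, which is
where the saved branch comes from.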

Initial idea comes from Peter at [1].

[1] http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 kernel/events/core.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9e9c84da..f79c4be 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6402,10 +6402,7 @@ static int __perf_event_overflow(struct perf_event *event,
 		irq_work_queue(&event->pending);
 	}
 
-	if (event->overflow_handler)
-		event->overflow_handler(event, data, regs);
-	else
-		perf_event_output(event, data, regs);
+	event->overflow_handler(event, data, regs);
 
 	if (*perf_event_fasync(event) && event->pending_kill) {
 		event->pending_wakeup = 1;
@@ -7874,8 +7871,13 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		context = parent_event->overflow_handler_context;
 	}
 
-	event->overflow_handler	= overflow_handler;
-	event->overflow_handler_context = context;
+	if (overflow_handler) {
+		event->overflow_handler	= overflow_handler;
+		event->overflow_handler_context = context;
+	} else {
+		event->overflow_handler = perf_event_output;
+		event->overflow_handler_context = NULL;
+	}
 
 	perf_event__state_init(event);
 
-- 
1.8.3.4


* [PATCH 3/6] perf core: Prepare writing into ring buffer from end
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
  2016-01-19 11:16           ` [PATCH 1/6] perf core: Introduce new ioctl options to pause and resume " Wang Nan
  2016-01-19 11:16           ` [PATCH 2/6] perf core: Set event's default overflow_handler Wang Nan
@ 2016-01-19 11:16           ` Wang Nan
  2016-01-19 11:16           ` [PATCH 4/6] perf core: Add backwork attribute to perf event Wang Nan
                             ` (4 subsequent siblings)
  7 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-19 11:16 UTC (permalink / raw)
  To: peterz, ast
  Cc: linux-kernel, Wang Nan, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama

Convert perf_output_begin to __perf_output_begin and make the latter
function able to write records from the end of the ring buffer.
Following commits will utilize the 'backward' flag.

This patch doesn't introduce any extra performance overhead since we
use always_inline.
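
The space check for both directions can be tried in user space. This is
a sketch under assumptions: sketch_has_space is a hypothetical name, the
two macros are copied from include/linux/circ_buf.h (they require a
power-of-two size), and plain "static inline" stands in for the kernel's
__always_inline. The helper returns true when at least 'size' bytes are
free; for backward writing the roles of head and tail are swapped.

```c
#include <stdbool.h>

/* As in include/linux/circ_buf.h; 'size' must be a power of two. */
#define CIRC_CNT(head, tail, size)	(((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size)	CIRC_CNT((tail), ((head) + 1), (size))

static inline bool
sketch_has_space(unsigned long head, unsigned long tail,
		 unsigned long data_size, unsigned int size,
		 bool backward)
{
	if (!backward)
		return CIRC_SPACE(head, tail, data_size) >= size;
	else
		return CIRC_SPACE(tail, head, data_size) >= size;
}
```

When 'backward' is a compile-time constant at the call site and the
helper is inlined, the compiler can drop the unused branch, which is the
reason given above for expecting no extra overhead.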

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 kernel/events/ring_buffer.c | 37 +++++++++++++++++++++++++++++++------
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 9f1a93f..bbc3bc6 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -102,8 +102,21 @@ out:
 	preempt_enable();
 }
 
-int perf_output_begin(struct perf_output_handle *handle,
-		      struct perf_event *event, unsigned int size)
+static bool __always_inline
+ring_buffer_has_space(unsigned long head, unsigned long tail,
+		      unsigned long data_size, unsigned int size,
+		      bool backward)
+{
+	if (!backward)
+		return CIRC_SPACE(head, tail, data_size) >= size;
+	else
+		return CIRC_SPACE(tail, head, data_size) >= size;
+}
+
+static int __always_inline
+__perf_output_begin(struct perf_output_handle *handle,
+		    struct perf_event *event, unsigned int size,
+		    bool backward)
 {
 	struct ring_buffer *rb;
 	unsigned long tail, offset, head;
@@ -146,9 +159,12 @@ int perf_output_begin(struct perf_output_handle *handle,
 	do {
 		tail = READ_ONCE(rb->user_page->data_tail);
 		offset = head = local_read(&rb->head);
-		if (!rb->overwrite &&
-		    unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))
-			goto fail;
+		if (!rb->overwrite) {
+			if (unlikely(!ring_buffer_has_space(head, tail,
+							    perf_data_size(rb),
+							    size, backward)))
+				goto fail;
+		}
 
 		/*
 		 * The above forms a control dependency barrier separating the
@@ -162,7 +178,10 @@ int perf_output_begin(struct perf_output_handle *handle,
 		 * See perf_output_put_handle().
 		 */
 
-		head += size;
+		if (!backward)
+			head += size;
+		else
+			head -= size;
 	} while (local_cmpxchg(&rb->head, offset, head) != offset);
 
 	/*
@@ -206,6 +225,12 @@ out:
 	return -ENOSPC;
 }
 
+int perf_output_begin(struct perf_output_handle *handle,
+		      struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, false);
+}
+
 unsigned int perf_output_copy(struct perf_output_handle *handle,
 		      const void *buf, unsigned int len)
 {
-- 
1.8.3.4


* [PATCH 4/6] perf core: Add backwork attribute to perf event
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
                             ` (2 preceding siblings ...)
  2016-01-19 11:16           ` [PATCH 3/6] perf core: Prepare writing into ring buffer from end Wang Nan
@ 2016-01-19 11:16           ` Wang Nan
  2016-01-19 11:16           ` [PATCH 5/6] perf core: Reduce perf event output overhead by setting overwrite handler Wang Nan
                             ` (3 subsequent siblings)
  7 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-19 11:16 UTC (permalink / raw)
  To: peterz, ast
  Cc: linux-kernel, Wang Nan, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama

In perf_event_attr a new bit 'write_backward' is appended to indicate
that this event should write the ring buffer from its end to its
beginning.

In perf_output_begin(), prepare the ring buffer according to this bit.

This patch introduces a small overhead into perf_output_begin():
an extra memory read and a conditional branch. A later patch can remove
this overhead using a custom output handler.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/linux/perf_event.h      | 5 +++++
 include/uapi/linux/perf_event.h | 3 ++-
 kernel/events/core.c            | 7 +++++++
 kernel/events/ring_buffer.c     | 2 ++
 4 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f9828a4..54c3fb2 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1032,6 +1032,11 @@ static inline bool has_aux(struct perf_event *event)
 	return event->pmu->setup_aux;
 }
 
+static inline bool is_write_backward(struct perf_event *event)
+{
+	return !!event->attr.write_backward;
+}
+
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
 extern void perf_output_end(struct perf_output_handle *handle);
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 2c7f00c..598b9b0 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -340,7 +340,8 @@ struct perf_event_attr {
 				comm_exec      :  1, /* flag comm events that are due to an exec */
 				use_clockid    :  1, /* use @clockid for time fields */
 				context_switch :  1, /* context switch data */
-				__reserved_1   : 37;
+				write_backward :  1, /* Write ring buffer from end to beginning */
+				__reserved_1   : 36;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f79c4be..8ad22a5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8107,6 +8107,13 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 		goto out;
 
 	/*
+	 * Either writing ring buffer from beginning or from end.
+	 * Mixing is not allowed.
+	 */
+	if (is_write_backward(output_event) != is_write_backward(event))
+		goto out;
+
+	/*
 	 * If both events generate aux data, they must be on the same PMU
 	 */
 	if (has_aux(event) && has_aux(output_event) &&
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index bbc3bc6..1372427 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -228,6 +228,8 @@ out:
 int perf_output_begin(struct perf_output_handle *handle,
 		      struct perf_event *event, unsigned int size)
 {
+	if (unlikely(is_write_backward(event)))
+		return __perf_output_begin(handle, event, size, true);
 	return __perf_output_begin(handle, event, size, false);
 }
 
-- 
1.8.3.4


* [PATCH 5/6] perf core: Reduce perf event output overhead by setting overwrite handler
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
                             ` (3 preceding siblings ...)
  2016-01-19 11:16           ` [PATCH 4/6] perf core: Add backwork attribute to perf event Wang Nan
@ 2016-01-19 11:16           ` Wang Nan
  2016-01-19 11:16           ` [PATCH 6/6] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE Wang Nan
                             ` (2 subsequent siblings)
  7 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-19 11:16 UTC (permalink / raw)
  To: peterz, ast
  Cc: linux-kernel, Wang Nan, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama

By creating onward-specific and backward-specific overflow handlers and
selecting one according to the event's backward setting, normal sampling
events no longer need to check the backward setting of an event.

This is the last patch of the backward writing patchset. After this
patch, no extra overhead is introduced into the fast path of sampling
output.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/linux/perf_event.h  | 17 +++++++++++++++--
 kernel/events/core.c        | 41 ++++++++++++++++++++++++++++++++++++-----
 kernel/events/ring_buffer.c | 12 ++++++++++++
 3 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 54c3fb2..c0335b9 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -830,9 +830,15 @@ extern int perf_event_overflow(struct perf_event *event,
 				 struct perf_sample_data *data,
 				 struct pt_regs *regs);
 
+extern void perf_event_output_onward(struct perf_event *event,
+				     struct perf_sample_data *data,
+				     struct pt_regs *regs);
+extern void perf_event_output_backward(struct perf_event *event,
+				       struct perf_sample_data *data,
+				       struct pt_regs *regs);
 extern void perf_event_output(struct perf_event *event,
-				struct perf_sample_data *data,
-				struct pt_regs *regs);
+			      struct perf_sample_data *data,
+			      struct pt_regs *regs);
 
 extern void
 perf_event_header__init_id(struct perf_event_header *header,
@@ -1039,6 +1045,13 @@ static inline bool is_write_backward(struct perf_event *event)
 
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
+extern int perf_output_begin_onward(struct perf_output_handle *handle,
+				    struct perf_event *event,
+				    unsigned int size);
+extern int perf_output_begin_backward(struct perf_output_handle *handle,
+				      struct perf_event *event,
+				      unsigned int size);
+
 extern void perf_output_end(struct perf_output_handle *handle);
 extern unsigned int perf_output_copy(struct perf_output_handle *handle,
 			     const void *buf, unsigned int len);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8ad22a5..fa32d8c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5541,9 +5541,13 @@ void perf_prepare_sample(struct perf_event_header *header,
 	}
 }
 
-void perf_event_output(struct perf_event *event,
-			struct perf_sample_data *data,
-			struct pt_regs *regs)
+static void __always_inline
+__perf_event_output(struct perf_event *event,
+		    struct perf_sample_data *data,
+		    struct pt_regs *regs,
+		    int (*output_begin)(struct perf_output_handle *,
+			    		struct perf_event *,
+					unsigned int))
 {
 	struct perf_output_handle handle;
 	struct perf_event_header header;
@@ -5553,7 +5557,7 @@ void perf_event_output(struct perf_event *event,
 
 	perf_prepare_sample(&header, data, event, regs);
 
-	if (perf_output_begin(&handle, event, header.size))
+	if (output_begin(&handle, event, header.size))
 		goto exit;
 
 	perf_output_sample(&handle, &header, data, event);
@@ -5564,6 +5568,30 @@ exit:
 	rcu_read_unlock();
 }
 
+void
+perf_event_output_onward(struct perf_event *event,
+			 struct perf_sample_data *data,
+			 struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin_onward);
+}
+
+void
+perf_event_output_backward(struct perf_event *event,
+			   struct perf_sample_data *data,
+			   struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin_backward);
+}
+
+void
+perf_event_output(struct perf_event *event,
+		  struct perf_sample_data *data,
+		  struct pt_regs *regs)
+{
+	__perf_event_output(event, data, regs, perf_output_begin);
+}
+
 /*
  * read event_id
  */
@@ -7874,8 +7902,11 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	if (overflow_handler) {
 		event->overflow_handler	= overflow_handler;
 		event->overflow_handler_context = context;
+	} else if (is_write_backward(event)){
+		event->overflow_handler = perf_event_output_backward;
+		event->overflow_handler_context = NULL;
 	} else {
-		event->overflow_handler = perf_event_output;
+		event->overflow_handler = perf_event_output_onward;
 		event->overflow_handler_context = NULL;
 	}
 
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 1372427..4b0ef33 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -225,6 +225,18 @@ out:
 	return -ENOSPC;
 }
 
+int perf_output_begin_onward(struct perf_output_handle *handle,
+			     struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, false);
+}
+
+int perf_output_begin_backward(struct perf_output_handle *handle,
+			       struct perf_event *event, unsigned int size)
+{
+	return __perf_output_begin(handle, event, size, true);
+}
+
 int perf_output_begin(struct perf_output_handle *handle,
 		      struct perf_event *event, unsigned int size)
 {
-- 
1.8.3.4
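
A note on the diff above: __perf_event_output() is marked __always_inline
and takes the begin function as a parameter, so each thin wrapper can be
compiled with a direct call to its own begin function instead of an
indirect one. The stand-alone sketch below only illustrates that pattern
in user space. It is not kernel code: struct handle, begin_onward(),
begin_backward(), output_common() and the two wrappers are hypothetical
names, and the always_inline attribute is written in the GCC/Clang
spelling.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct perf_output_handle. */
struct handle {
	bool backward;
	unsigned int size;
};

static int begin_onward(struct handle *h, unsigned int size)
{
	h->backward = false;
	h->size = size;
	return 0;
}

static int begin_backward(struct handle *h, unsigned int size)
{
	h->backward = true;
	h->size = size;
	return 0;
}

/*
 * Shared body. It is always inlined and every caller passes a function
 * known at compile time, which gives the compiler the chance to replace
 * the indirect call with a direct one in each wrapper.
 */
static inline __attribute__((always_inline)) int
output_common(struct handle *h, unsigned int size,
	      int (*begin)(struct handle *, unsigned int))
{
	if (begin(h, size))
		return -1;
	/* The shared record-writing code would go here. */
	return 0;
}

int output_onward(struct handle *h, unsigned int size)
{
	return output_common(h, size, begin_onward);
}

int output_backward(struct handle *h, unsigned int size)
{
	return output_common(h, size, begin_backward);
}
```

The cost of this design is some code duplication in the binary, since the
shared body is emitted once per wrapper.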

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 6/6] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
                             ` (4 preceding siblings ...)
  2016-01-19 11:16           ` [PATCH 5/6] perf core: Reduce perf event output overhead by setting overwrite handler Wang Nan
@ 2016-01-19 11:16           ` Wang Nan
  2016-01-19 13:58           ` [PATCH 0/6] perf core: Read from overwrite ring buffer Namhyung Kim
  2016-01-19 17:42           ` Alexei Starovoitov
  7 siblings, 0 replies; 124+ messages in thread
From: Wang Nan @ 2016-01-19 11:16 UTC (permalink / raw)
  To: peterz, ast
  Cc: linux-kernel, Wang Nan, He Kuang, Yunlong Song,
	Arnaldo Carvalho de Melo, Brendan Gregg, Jiri Olsa,
	Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama

This patch introduces a PERF_SAMPLE_TAILSIZE flag which attaches a size
field to the end of a sample. The idea comes from [1]: with the size at
the tail of an event, a user program that reads from the ring buffer can
parse events backward.

For example:

   head
    |
    V
 +--+---+-------+----------+------+---+
 |E6|...|   B  8|   C    11|  D  7|E..|
 +--+---+-------+----------+------+---+

In this case, starting from the 'head' pointer provided by the kernel, a
user program can first read '6' via (*(head - sizeof(u64))), which gives
it the start pointer of record 'E'. It can then read the size stored just
before that point and find the start positions of records D, C and B in
the same way.
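
The backward walk described above can be sketched in user space as
follows. This is only an illustration under stated assumptions, not the
perf implementation: the 64-byte ring, the record layout (payload bytes
followed by a u64 total size) and the helpers ring_read(), ring_write(),
ring_append() and walk_backward() are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64u		/* hypothetical size; must be a power of two */
#define RING_MASK (RING_SIZE - 1)

static unsigned char ring[RING_SIZE];

/* Copy len bytes out of the ring, starting at free-running offset pos. */
static void ring_read(uint64_t pos, void *dst, size_t len)
{
	unsigned char *d = dst;
	size_t i;

	for (i = 0; i < len; i++)
		d[i] = ring[(pos + i) & RING_MASK];
}

/* Copy len bytes into the ring, overwriting whatever was there. */
static void ring_write(uint64_t pos, const void *src, size_t len)
{
	const unsigned char *s = src;
	size_t i;

	for (i = 0; i < len; i++)
		ring[(pos + i) & RING_MASK] = s[i];
}

/*
 * Append a record of 'size' bytes (size must be larger than 8): a payload
 * filled with 'tag', then a u64 holding the total size. Returns new head.
 */
static uint64_t ring_append(uint64_t head, char tag, uint64_t size)
{
	uint64_t i;

	for (i = 0; i + sizeof(size) < size; i++)
		ring_write(head + i, &tag, 1);
	ring_write(head + size - sizeof(size), &size, sizeof(size));
	return head + size;
}

/*
 * Walk backward from head. Stores the first payload byte of each intact
 * record in tags[], newest first, and returns how many were found.
 */
static int walk_backward(uint64_t head, char *tags, int max)
{
	uint64_t valid = head < RING_SIZE ? head : RING_SIZE;
	int n = 0;

	while (n < max && valid >= sizeof(uint64_t)) {
		uint64_t size;

		ring_read(head - sizeof(size), &size, sizeof(size));
		if (size <= sizeof(size) || size > valid)
			break;	/* start of this record was overwritten */
		head -= size;
		valid -= size;
		ring_read(head, &tags[n++], 1);
	}
	return n;
}
```

The walk stops at the first record whose size reaches past the bytes that
are still valid, because the start of that record has been overwritten by
newer data.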

The implementation is simple: add a PERF_SAMPLE_TAILSIZE flag which makes
perf_output_sample() output the size at the end of a sample.

The following is done to ensure the ring buffer is safe for
backward parsing:

 - Don't allow two events with different PERF_SAMPLE_TAILSIZE settings
   to set their output to each other;

 - For non-sample events, also output the tail size if required.

This patch has a limitation for perf:

Before reading such a ring buffer, perf must ensure that all events which
may output to it are already stopped, so that the 'head' pointer it gets
is the end of the last record.

[1] http://lkml.kernel.org/g/1449063499-236703-1-git-send-email-wangnan0@huawei.com

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Yunlong Song <yunlong.song@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 include/linux/perf_event.h      | 17 ++++++---
 include/uapi/linux/perf_event.h |  3 +-
 kernel/events/core.c            | 82 +++++++++++++++++++++++++++++------------
 kernel/events/ring_buffer.c     |  7 ++--
 4 files changed, 75 insertions(+), 34 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c0335b9..7c70d4b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -841,13 +841,13 @@ extern void perf_event_output(struct perf_event *event,
 			      struct pt_regs *regs);
 
 extern void
-perf_event_header__init_id(struct perf_event_header *header,
-			   struct perf_sample_data *data,
-			   struct perf_event *event);
+perf_event_header__init_extra(struct perf_event_header *header,
+			      struct perf_sample_data *data,
+			      struct perf_event *event);
 extern void
-perf_event__output_id_sample(struct perf_event *event,
-			     struct perf_output_handle *handle,
-			     struct perf_sample_data *sample);
+perf_event__output_extra(struct perf_event *event, u64 evt_size,
+			 struct perf_output_handle *handle,
+			 struct perf_sample_data *sample);
 
 extern void
 perf_log_lost_samples(struct perf_event *event, u64 lost);
@@ -1043,6 +1043,11 @@ static inline bool is_write_backward(struct perf_event *event)
 	return !!event->attr.write_backward;
 }
 
+static inline bool has_tailsize(struct perf_event *event)
+{
+	return !!(event->attr.sample_type & PERF_SAMPLE_TAILSIZE);
+}
+
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
 extern int perf_output_begin_onward(struct perf_output_handle *handle,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 598b9b0..f0cad26 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -139,8 +139,9 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_IDENTIFIER			= 1U << 16,
 	PERF_SAMPLE_TRANSACTION			= 1U << 17,
 	PERF_SAMPLE_REGS_INTR			= 1U << 18,
+	PERF_SAMPLE_TAILSIZE			= 1U << 19,
 
-	PERF_SAMPLE_MAX = 1U << 19,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 20,		/* non-ABI */
 };
 
 /*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index fa32d8c..d8bb92e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5141,12 +5141,14 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
 	}
 }
 
-void perf_event_header__init_id(struct perf_event_header *header,
-				struct perf_sample_data *data,
-				struct perf_event *event)
+void perf_event_header__init_extra(struct perf_event_header *header,
+				   struct perf_sample_data *data,
+				   struct perf_event *event)
 {
 	if (event->attr.sample_id_all)
 		__perf_event_header__init_id(header, data, event);
+	if (has_tailsize(event))
+		header->size += sizeof(u64);
 }
 
 static void __perf_event__output_id_sample(struct perf_output_handle *handle,
@@ -5173,12 +5175,14 @@ static void __perf_event__output_id_sample(struct perf_output_handle *handle,
 		perf_output_put(handle, data->id);
 }
 
-void perf_event__output_id_sample(struct perf_event *event,
-				  struct perf_output_handle *handle,
-				  struct perf_sample_data *sample)
+void perf_event__output_extra(struct perf_event *event, u64 evt_size,
+			      struct perf_output_handle *handle,
+			      struct perf_sample_data *sample)
 {
 	if (event->attr.sample_id_all)
 		__perf_event__output_id_sample(handle, sample);
+	if (has_tailsize(event))
+		perf_output_put(handle, evt_size);
 }
 
 static void perf_output_read_one(struct perf_output_handle *handle,
@@ -5420,6 +5424,13 @@ void perf_output_sample(struct perf_output_handle *handle,
 		}
 	}
 
+	/* Should be the last one */
+	if (sample_type & PERF_SAMPLE_TAILSIZE) {
+		u64 evt_size = header->size;
+
+		perf_output_put(handle, evt_size);
+	}
+
 	if (!event->attr.watermark) {
 		int wakeup_events = event->attr.wakeup_events;
 
@@ -5539,6 +5550,9 @@ void perf_prepare_sample(struct perf_event_header *header,
 
 		header->size += size;
 	}
+
+	if (sample_type & PERF_SAMPLE_TAILSIZE)
+		header->size += sizeof(u64);
 }
 
 static void __always_inline
@@ -5620,14 +5634,15 @@ perf_event_read_event(struct perf_event *event,
 	};
 	int ret;
 
-	perf_event_header__init_id(&read_event.header, &sample, event);
+	perf_event_header__init_extra(&read_event.header, &sample, event);
 	ret = perf_output_begin(&handle, event, read_event.header.size);
 	if (ret)
 		return;
 
 	perf_output_put(&handle, read_event);
 	perf_output_read(&handle, event);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, read_event.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 }
@@ -5739,7 +5754,7 @@ static void perf_event_task_output(struct perf_event *event,
 	if (!perf_event_task_match(event))
 		return;
 
-	perf_event_header__init_id(&task_event->event_id.header, &sample, event);
+	perf_event_header__init_extra(&task_event->event_id.header, &sample, event);
 
 	ret = perf_output_begin(&handle, event,
 				task_event->event_id.header.size);
@@ -5756,7 +5771,9 @@ static void perf_event_task_output(struct perf_event *event,
 
 	perf_output_put(&handle, task_event->event_id);
 
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event,
+				 task_event->event_id.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 out:
@@ -5835,7 +5852,7 @@ static void perf_event_comm_output(struct perf_event *event,
 	if (!perf_event_comm_match(event))
 		return;
 
-	perf_event_header__init_id(&comm_event->event_id.header, &sample, event);
+	perf_event_header__init_extra(&comm_event->event_id.header, &sample, event);
 	ret = perf_output_begin(&handle, event,
 				comm_event->event_id.header.size);
 
@@ -5849,7 +5866,8 @@ static void perf_event_comm_output(struct perf_event *event,
 	__output_copy(&handle, comm_event->comm,
 				   comm_event->comm_size);
 
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, comm_event->event_id.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 out:
@@ -5958,7 +5976,7 @@ static void perf_event_mmap_output(struct perf_event *event,
 		mmap_event->event_id.header.size += sizeof(mmap_event->flags);
 	}
 
-	perf_event_header__init_id(&mmap_event->event_id.header, &sample, event);
+	perf_event_header__init_extra(&mmap_event->event_id.header, &sample, event);
 	ret = perf_output_begin(&handle, event,
 				mmap_event->event_id.header.size);
 	if (ret)
@@ -5981,7 +5999,8 @@ static void perf_event_mmap_output(struct perf_event *event,
 	__output_copy(&handle, mmap_event->file_name,
 				   mmap_event->file_size);
 
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, mmap_event->event_id.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 out:
@@ -6164,14 +6183,15 @@ void perf_event_aux_event(struct perf_event *event, unsigned long head,
 	};
 	int ret;
 
-	perf_event_header__init_id(&rec.header, &sample, event);
+	perf_event_header__init_extra(&rec.header, &sample, event);
 	ret = perf_output_begin(&handle, event, rec.header.size);
 
 	if (ret)
 		return;
 
 	perf_output_put(&handle, rec);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, rec.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 }
@@ -6197,7 +6217,7 @@ void perf_log_lost_samples(struct perf_event *event, u64 lost)
 		.lost		= lost,
 	};
 
-	perf_event_header__init_id(&lost_samples_event.header, &sample, event);
+	perf_event_header__init_extra(&lost_samples_event.header, &sample, event);
 
 	ret = perf_output_begin(&handle, event,
 				lost_samples_event.header.size);
@@ -6205,7 +6225,8 @@ void perf_log_lost_samples(struct perf_event *event, u64 lost)
 		return;
 
 	perf_output_put(&handle, lost_samples_event);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, lost_samples_event.header.size,
+				 &handle, &sample);
 	perf_output_end(&handle);
 }
 
@@ -6252,7 +6273,7 @@ static void perf_event_switch_output(struct perf_event *event, void *data)
 					perf_event_tid(event, se->next_prev);
 	}
 
-	perf_event_header__init_id(&se->event_id.header, &sample, event);
+	perf_event_header__init_extra(&se->event_id.header, &sample, event);
 
 	ret = perf_output_begin(&handle, event, se->event_id.header.size);
 	if (ret)
@@ -6263,7 +6284,8 @@ static void perf_event_switch_output(struct perf_event *event, void *data)
 	else
 		perf_output_put(&handle, se->event_id);
 
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, se->event_id.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 }
@@ -6323,7 +6345,7 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 	if (enable)
 		throttle_event.header.type = PERF_RECORD_UNTHROTTLE;
 
-	perf_event_header__init_id(&throttle_event.header, &sample, event);
+	perf_event_header__init_extra(&throttle_event.header, &sample, event);
 
 	ret = perf_output_begin(&handle, event,
 				throttle_event.header.size);
@@ -6331,7 +6353,8 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 		return;
 
 	perf_output_put(&handle, throttle_event);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, throttle_event.header.size,
+				 &handle, &sample);
 	perf_output_end(&handle);
 }
 
@@ -6359,14 +6382,15 @@ static void perf_log_itrace_start(struct perf_event *event)
 	rec.pid	= perf_event_pid(event, current);
 	rec.tid	= perf_event_tid(event, current);
 
-	perf_event_header__init_id(&rec.header, &sample, event);
+	perf_event_header__init_extra(&rec.header, &sample, event);
 	ret = perf_output_begin(&handle, event, rec.header.size);
 
 	if (ret)
 		return;
 
 	perf_output_put(&handle, rec);
-	perf_event__output_id_sample(event, &handle, &sample);
+	perf_event__output_extra(event, rec.header.size,
+				 &handle, &sample);
 
 	perf_output_end(&handle);
 }
@@ -8151,6 +8175,16 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
 	    event->pmu != output_event->pmu)
 		goto out;
 
+	/*
+	 * Don't allow mixed tailsize settings since the resulting
+	 * ring buffer could not be parsed backward.
+	 *
+	 * '!=' is safe because has_tailsize() returns bool: two different
+	 * non-zero values would be treated as equal (both true).
+	 */
+	if (has_tailsize(event) != has_tailsize(output_event))
+		goto out;
+
 set:
 	mutex_lock(&event->mmap_mutex);
 	/* Can't redirect output if we've got an active mmap() */
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 4b0ef33..5cb098e 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -208,10 +208,11 @@ __perf_output_begin(struct perf_output_handle *handle,
 		lost_event.id          = event->id;
 		lost_event.lost        = local_xchg(&rb->lost, 0);
 
-		perf_event_header__init_id(&lost_event.header,
-					   &sample_data, event);
+		perf_event_header__init_extra(&lost_event.header,
+					      &sample_data, event);
 		perf_output_put(handle, lost_event);
-		perf_event__output_id_sample(event, handle, &sample_data);
+		perf_event__output_extra(event, lost_event.header.size,
+					 handle, &sample_data);
 	}
 
 	return 0;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* Re: [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
                             ` (5 preceding siblings ...)
  2016-01-19 11:16           ` [PATCH 6/6] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE Wang Nan
@ 2016-01-19 13:58           ` Namhyung Kim
  2016-01-19 14:14             ` pi3orama
  2016-01-19 17:42           ` Alexei Starovoitov
  7 siblings, 1 reply; 124+ messages in thread
From: Namhyung Kim @ 2016-01-19 13:58 UTC (permalink / raw)
  To: Wang Nan
  Cc: peterz, ast, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Zefan Li, pi3orama

Hi,

On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
> This patchset introduces two methods to support reading from overwrite.
> 
>  1) Tailsize: write the size of an event at the end of it
>  2) Backward writing: write the ring buffer from the end of it to the
>     beginning.

So should the two methods be used together?

Thanks,
Namhyung


> 
> Patch 1/6 introduces a new ioctl operation to pause and resume ring
> buffer since reading from a overwrite ring buffer is not reliable.
> 
> To reduce overhead as much as possible, force setting overflow_handler
> and create specific function for backward writing and onward writing.
> 
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> 
> Wang Nan (6):
>   perf core: Introduce new ioctl options to pause and resume ring buffer
>   perf core: Set event's default overflow_handler
>   perf core: Prepare writing into ring buffer from end
>   perf core: Add backwork attribute to perf event
>   perf core: Reduce perf event output overhead by setting overwrite
>     handler
>   perf/core: Put size of a sample at the end of it by
>     PERF_SAMPLE_TAILSIZE
> 
>  include/linux/perf_event.h      |  39 +++++++---
>  include/uapi/linux/perf_event.h |   7 +-
>  kernel/events/core.c            | 155 +++++++++++++++++++++++++++++++---------
>  kernel/events/internal.h        |  11 +++
>  kernel/events/ring_buffer.c     |  65 ++++++++++++++---
>  5 files changed, 223 insertions(+), 54 deletions(-)
> 
> -- 
> 1.8.3.4
> 

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-19 13:58           ` [PATCH 0/6] perf core: Read from overwrite ring buffer Namhyung Kim
@ 2016-01-19 14:14             ` pi3orama
  0 siblings, 0 replies; 124+ messages in thread
From: pi3orama @ 2016-01-19 14:14 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Wang Nan, peterz, ast, linux-kernel, He Kuang,
	Arnaldo Carvalho de Melo, Brendan Gregg, Jiri Olsa,
	Masami Hiramatsu, Zefan Li



Sent from my iPhone

> On Jan 19, 2016, at 9:58 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> Hi,
> 
>> On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
>> This patchset introduces two methods to support reading from overwrite.
>> 
>> 1) Tailsize: write the size of an event at the end of it
>> 2) Backward writing: write the ring buffer from the end of it to the
>>    beginning.
> 
> So both of two methods should be used together?
> 

They are separate; we should use only one
of them. I prefer backward writing, but if we
select both of them they should still work.

Both of them have drawbacks. The tailsize
method adds 8 bytes to each record
(it needs more cycles to write and consumes
more space); backward writing may cause
more cache misses, since processors don't
support backward cache prefetching. Tomorrow
I will compare their performance penalties
to see which one is better.
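
To make the backward writing layout concrete, here is a small user-space
sketch. It is an illustration under stated assumptions, not the code in
this patchset: the 64-byte ring, the record layout (a u64 size header
followed by payload) and the helpers ring_read(), ring_write(),
ring_prepend() and walk_forward() are hypothetical. The writer moves head
toward lower offsets, so a reader that starts at head and walks forward
sees the newest record first and needs no extra 8 bytes per record.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64u		/* hypothetical size; must be a power of two */
#define RING_MASK (RING_SIZE - 1)

static unsigned char ring[RING_SIZE];

/* Copy len bytes out of the ring, starting at free-running offset pos. */
static void ring_read(uint64_t pos, void *dst, size_t len)
{
	unsigned char *d = dst;
	size_t i;

	for (i = 0; i < len; i++)
		d[i] = ring[(pos + i) & RING_MASK];
}

/* Copy len bytes into the ring, overwriting whatever was there. */
static void ring_write(uint64_t pos, const void *src, size_t len)
{
	const unsigned char *s = src;
	size_t i;

	for (i = 0; i < len; i++)
		ring[(pos + i) & RING_MASK] = s[i];
}

/*
 * Write a record of 'size' bytes (size must be larger than 8) below the
 * current head: a u64 size header, then a payload filled with 'tag'.
 * Returns the new, lower head.
 */
static uint64_t ring_prepend(uint64_t head, char tag, uint64_t size)
{
	uint64_t i;

	head -= size;	/* reserve space in front of the older records */
	ring_write(head, &size, sizeof(size));
	for (i = sizeof(size); i < size; i++)
		ring_write(head + i, &tag, 1);
	return head;
}

/*
 * Walk forward from head, newest record first. 'written' is the total
 * number of bytes ever written. Stores the first payload byte of each
 * intact record in tags[] and returns how many were found.
 */
static int walk_forward(uint64_t head, uint64_t written, char *tags, int max)
{
	uint64_t valid = written < RING_SIZE ? written : RING_SIZE;
	int n = 0;

	while (n < max && valid >= sizeof(uint64_t)) {
		uint64_t size;

		ring_read(head, &size, sizeof(size));
		if (size <= sizeof(size) || size > valid)
			break;	/* tail of this record was overwritten */
		ring_read(head + sizeof(size), &tags[n++], 1);
		head += size;
		valid -= size;
	}
	return n;
}
```

A caller would start head at some offset, prepend records, and pass the
difference between the starting offset and the current head as 'written'.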

Thank you.

> Thanks,
> Namhyung
> 
> 
>> 
>> Patch 1/6 introduces a new ioctl operation to pause and resume ring
>> buffer since reading from a overwrite ring buffer is not reliable.
>> 
>> To reduce overhead as much as possible, force setting overflow_handler
>> and create specific function for backward writing and onward writing.
>> 
>> Cc: He Kuang <hekuang@huawei.com>
>> Cc: Alexei Starovoitov <ast@kernel.org>
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> 
>> Wang Nan (6):
>>  perf core: Introduce new ioctl options to pause and resume ring buffer
>>  perf core: Set event's default overflow_handler
>>  perf core: Prepare writing into ring buffer from end
>>  perf core: Add backwork attribute to perf event
>>  perf core: Reduce perf event output overhead by setting overwrite
>>    handler
>>  perf/core: Put size of a sample at the end of it by
>>    PERF_SAMPLE_TAILSIZE
>> 
>> include/linux/perf_event.h      |  39 +++++++---
>> include/uapi/linux/perf_event.h |   7 +-
>> kernel/events/core.c            | 155 +++++++++++++++++++++++++++++++---------
>> kernel/events/internal.h        |  11 +++
>> kernel/events/ring_buffer.c     |  65 ++++++++++++++---
>> 5 files changed, 223 insertions(+), 54 deletions(-)
>> 
>> -- 
>> 1.8.3.4
>> 

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
                             ` (6 preceding siblings ...)
  2016-01-19 13:58           ` [PATCH 0/6] perf core: Read from overwrite ring buffer Namhyung Kim
@ 2016-01-19 17:42           ` Alexei Starovoitov
  2016-01-20  1:37             ` Wangnan (F)
  7 siblings, 1 reply; 124+ messages in thread
From: Alexei Starovoitov @ 2016-01-19 17:42 UTC (permalink / raw)
  To: Wang Nan
  Cc: peterz, ast, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama

On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
> This patchset introduces two methods to support reading from overwrite.
> 
>  1) Tailsize: write the size of an event at the end of it
>  2) Backward writing: write the ring buffer from the end of it to the
>     beginning.

What happened to your other idea of moving the whole header to the end?
That felt better than either of these options.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-19 17:42           ` Alexei Starovoitov
@ 2016-01-20  1:37             ` Wangnan (F)
  2016-01-20  2:20               ` Alexei Starovoitov
  0 siblings, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-20  1:37 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: peterz, ast, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama



On 2016/1/20 1:42, Alexei Starovoitov wrote:
> On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
>> This patchset introduces two methods to support reading from overwrite.
>>
>>   1) Tailsize: write the size of an event at the end of it
>>   2) Backward writing: write the ring buffer from the end of it to the
>>      beginning.
> what happend with your other idea of moving the whole header to the end?
> That felt better than either of these options.

I'll try it today. However, putting all of the three together is
not as easy as this patchset.

Thank you.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-20  1:37             ` Wangnan (F)
@ 2016-01-20  2:20               ` Alexei Starovoitov
  2016-01-21  6:51                 ` Wangnan (F)
  0 siblings, 1 reply; 124+ messages in thread
From: Alexei Starovoitov @ 2016-01-20  2:20 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: peterz, ast, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama

On Wed, Jan 20, 2016 at 09:37:42AM +0800, Wangnan (F) wrote:
> 
> 
> On 2016/1/20 1:42, Alexei Starovoitov wrote:
> >On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
> >>This patchset introduces two methods to support reading from overwrite.
> >>
> >>  1) Tailsize: write the size of an event at the end of it
> >>  2) Backward writing: write the ring buffer from the end of it to the
> >>     beginning.
> >what happend with your other idea of moving the whole header to the end?
> >That felt better than either of these options.
> 
> I'll try it today. However, putting all of the three together is
> not as easy as this patchset.

I'm missing something. Why all three in one set?
Since you have 1 and 2 implemented, benchmark them with absolute
numbers and then implement this last one without any prior baggage
and benchmark it as well. I think it should be the fastest
and the cleanest. We don't need ten different ways to do one thing.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-20  2:20               ` Alexei Starovoitov
@ 2016-01-21  6:51                 ` Wangnan (F)
  2016-01-22  2:21                   ` Wangnan (F)
  0 siblings, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-21  6:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: peterz, ast, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama



On 2016/1/20 10:20, Alexei Starovoitov wrote:
> On Wed, Jan 20, 2016 at 09:37:42AM +0800, Wangnan (F) wrote:
>>
>> On 2016/1/20 1:42, Alexei Starovoitov wrote:
>>> On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
>>>> This patchset introduces two methods to support reading from overwrite.
>>>>
>>>>   1) Tailsize: write the size of an event at the end of it
>>>>   2) Backward writing: write the ring buffer from the end of it to the
>>>>      beginning.
>>> what happend with your other idea of moving the whole header to the end?
>>> That felt better than either of these options.
>> I'll try it today. However, putting all of the three together is
>> not as easy as this patchset.
> I'm missing something. Why all three in one set?

We can't implement all three in one set, but implementing two of them
makes benchmarking simpler :)

Here are some numbers.

I attach a target program at the end of this mail. It calls
close(-1) 3000000 times and uses gettimeofday() to measure
how many microseconds that takes.

The following cases are tested:


  BASE    : ./a.out
  RAWPERF : ./perf record -o /dev/null -e raw_syscalls:* ./a.out
  WRTBKWRD: ./perf record -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out
  TAILSIZE: ./perf record --no-has-write-backward -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out
  RAWOVWRT: ./perf record --no-has-write-backward --no-has-tailsize -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out

With this script:

func() {
         for x in `seq 1 100` ; do $1; done | tee data_$2
}

func ./a.out base
func "./perf record -o /dev/null -e raw_syscalls:* ./a.out" rawperf
func "./perf record -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out" wrtbkwrd
func "./perf record -o /dev/null --no-has-write-backward -e raw_syscalls:*/overwrite/ ./a.out" tailsize
func "./perf record -o /dev/null --no-has-write-backward --no-has-tailsize -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out" rawovwrt

Result:

             MEAN           STDVAR
BASE    :  879870.81      11913.13
RAWPERF : 2603854.7      706658.4
WRTBKWRD: 2313301.220      6727.957
TAILSIZE: 2383051.860      5248.061
RAWOVWRT: 2315273.180      5221.025

So it seems the backward writing method is good enough. We don't need
to consider the tailsize method.
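
The comparison can be redone from the means in the result table with a few
lines of arithmetic. The sketch below is only a convenience for readers:
the helper names rel_percent(), per_call_us() and report() are
hypothetical, and the constants are copied from the table. TAILSIZE comes
out roughly 2.9% slower than RAWOVWRT, while WRTBKWRD is within about 0.1%
of it.

```c
#include <stdio.h>

/* Mean run times in microseconds, copied from the result table. */
#define MEAN_BASE      879870.81
#define MEAN_WRTBKWRD 2313301.220
#define MEAN_TAILSIZE 2383051.860
#define MEAN_RAWOVWRT 2315273.180
#define NR_CALLS      3000000.0	/* close(-1) calls per run */

/* Relative difference of 'mean' against a reference mean, in percent. */
static double rel_percent(double mean, double ref)
{
	return (mean - ref) / ref * 100.0;
}

/* Extra cost per close() call over the untraced BASE run, in us. */
static double per_call_us(double mean)
{
	return (mean - MEAN_BASE) / NR_CALLS;
}

/* Print one row: difference against RAWOVWRT and per-call cost. */
static void report(const char *name, double mean)
{
	printf("%-8s %+7.3f%% vs RAWOVWRT, %.4f us extra per close()\n",
	       name, rel_percent(mean, MEAN_RAWOVWRT), per_call_us(mean));
}
```

Calling report("WRTBKWRD", MEAN_WRTBKWRD) and report("TAILSIZE",
MEAN_TAILSIZE) prints one line for each method.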

Code for this benchmark can be found from:

https://git.kernel.org/cgit/linux/kernel/git/pi3orama/linux.git/ 
perf/overwrite-benchmark

Thank you.

-------- Test program ----------
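#include <unistd.h>
#include <sys/time.h>
#include <stdio.h>

int main(void)
{
         int i;
         struct timeval tv1, tv2;
         long long us1, us2;

         gettimeofday(&tv1, NULL);
         /* close(-1) fails immediately, so the loop mostly measures syscall entry/exit. */
         for (i = 0; i < 1000 * 1000 * 3; i++) {
                 close(-1);
         }
         gettimeofday(&tv2, NULL);

         /* Widen before multiplying so the result cannot overflow a 32-bit long. */
         us1 = (long long)tv1.tv_sec * 1000000 + tv1.tv_usec;
         us2 = (long long)tv2.tv_sec * 1000000 + tv2.tv_usec;
         /* The difference is a long long, so print it with %lld. */
         printf("%lld\n", us2 - us1);

         return 0;
}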

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-21  6:51                 ` Wangnan (F)
@ 2016-01-22  2:21                   ` Wangnan (F)
  2016-01-22  3:21                     ` Alexei Starovoitov
  0 siblings, 1 reply; 124+ messages in thread
From: Wangnan (F) @ 2016-01-22  2:21 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: peterz, ast, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama



On 2016/1/21 14:51, Wangnan (F) wrote:
>
>
> On 2016/1/20 10:20, Alexei Starovoitov wrote:
>> On Wed, Jan 20, 2016 at 09:37:42AM +0800, Wangnan (F) wrote:
>>>
>>> On 2016/1/20 1:42, Alexei Starovoitov wrote:
>>>> On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
>>>>> This patchset introduces two methods to support reading from 
>>>>> overwrite.
>>>>>
>>>>>   1) Tailsize: write the size of an event at the end of it
>>>>>   2) Backward writing: write the ring buffer from the end of it to 
>>>>> the
>>>>>      beginning.
>>>> what happend with your other idea of moving the whole header to the 
>>>> end?
>>>> That felt better than either of these options.
>>> I'll try it today. However, putting all of the three together is
>>> not as easy as this patchset.
>> I'm missing something. Why all three in one set?
>
> Can't implement all three in one, but implement two of them make
> benchmarking simpler :)
>
> Here comes some numbers.
>
> I attach a target program at the end of this mail. It calls
> close(-1) for 3000000 times, and use gettimeofday to check
> how many us it takes.
>
> Following cases are tested:
>
>
>  BASE    : ./a.out
>  RAWPERF : ./perf record -o /dev/null -e raw_syscalls:* ./a.out
>  WRTBKWRD: ./perf record -o /dev/null -e raw_syscalls:* ./a.out
>  TAILSIZE: ./perf record --no-has-write-backward -o /dev/null -e 
> raw_syscalls:*/overwrite/ ./a.out
>  RAWOVWRT: ./perf record --no-has-write-backward --no-has-tailsize -o 
> /dev/null -e raw_syscalls:*/overwrite/ ./a.out
>
> With this script:
>
> func() {
>         for x in `seq 1 100` ; do $1; done | tee data_$2
> }
>
> func ./a.out base
> func "./perf record -o /dev/null -e raw_syscalls:* ./a.out" rawperf
> func "./perf record -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out" 
> wrtbkwrd
> func "./perf record -o /dev/null --no-has-write-backward -e 
> raw_syscalls:*/overwrite/ ./a.out" tailsize
> func "./perf record -o /dev/null --no-has-write-backward 
> --no-has-tailsize -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out" 
> rawovwrt
>
> Result:
>
>             MEAN           STDVAR
> BASE    :  879870.81      11913.13
> RAWPERF : 2603854.7      706658.4
> WRTBKWRD: 2313301.220      6727.957
> TAILSIZE: 2383051.860      5248.061
> RAWOVWRT: 2315273.180      5221.025

One more number: I tested the original perf overwrite ring buffer on pure
v4.4 on the same machine:

                     MEAN          STDVAR
RAWOVWRT(original): 2323970.45    5103.39

So I think the backward writing method doesn't add extra overhead to the
fastpath.

I will send this patchset again with several bugs fixed. After that
I'll start working on tail-header if it is still required.

Thank you.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-22  2:21                   ` Wangnan (F)
@ 2016-01-22  3:21                     ` Alexei Starovoitov
  2016-01-22  4:45                       ` Wangnan (F)
  0 siblings, 1 reply; 124+ messages in thread
From: Alexei Starovoitov @ 2016-01-22  3:21 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: peterz, ast, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama

On Fri, Jan 22, 2016 at 10:21:19AM +0800, Wangnan (F) wrote:
> 
> 
> On 2016/1/21 14:51, Wangnan (F) wrote:
> >
> >
> >On 2016/1/20 10:20, Alexei Starovoitov wrote:
> >>On Wed, Jan 20, 2016 at 09:37:42AM +0800, Wangnan (F) wrote:
> >>>
> >>>On 2016/1/20 1:42, Alexei Starovoitov wrote:
> >>>>On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
> >>>>>This patchset introduces two methods to support reading from
> >>>>>overwrite.
> >>>>>
> >>>>>  1) Tailsize: write the size of an event at the end of it
> >>>>>  2) Backward writing: write the ring buffer from the end of it to the
> >>>>>     beginning.
> >>>>what happened with your other idea of moving the whole header to the
> >>>>end?
> >>>>That felt better than either of these options.
> >>>I'll try it today. However, putting all of the three together is
> >>>not as easy as this patchset.
> >>I'm missing something. Why all three in one set?
> >
> >Can't implement all three in one, but implement two of them make
> >benchmarking simpler :)
> >
> >Here comes some numbers.
> >
> >I attach a target program at the end of this mail. It calls
> >close(-1) for 3000000 times, and use gettimeofday to check
> >how many us it takes.
> >
> >Following cases are tested:
> >
> >
> > BASE    : ./a.out
> > RAWPERF : ./perf record -o /dev/null -e raw_syscalls:* ./a.out
> > WRTBKWRD: ./perf record -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out
> > TAILSIZE: ./perf record --no-has-write-backward -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out
> > RAWOVWRT: ./perf record --no-has-write-backward --no-has-tailsize -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out
> >
> >With this script:
> >
> >func() {
> >        for x in `seq 1 100` ; do $1; done | tee data_$2
> >}
> >
> >func ./a.out base
> >func "./perf record -o /dev/null -e raw_syscalls:* ./a.out" rawperf
> >func "./perf record -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out" wrtbkwrd
> >func "./perf record -o /dev/null --no-has-write-backward -e raw_syscalls:*/overwrite/ ./a.out" tailsize
> >func "./perf record -o /dev/null --no-has-write-backward --no-has-tailsize -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out" rawovwrt
> >
> >Result:
> >
> >            MEAN           STDVAR
> >BASE    :  879870.81      11913.13
> >RAWPERF : 2603854.7      706658.4
> >WRTBKWRD: 2313301.220      6727.957
> >TAILSIZE: 2383051.860      5248.061
> >RAWOVWRT: 2315273.180      5221.025
> 
> Add a number: I tested original perf overwrite ring buffer in pure v4.4
> on the same machine:
> 
>                     MEAN          STDVAR
> RAWOVWRT(original): 2323970.45    5103.39
> 
> So I think backward writing method doesn't add extra overhead into
> fastpath.
> 
> I will send this patchset again with several bugs fixed. After that
> I'll start working on tail-header if it is still required.

interesting.
did I read the numbers correctly that 'write backwards' method
is actually the fastest? even faster than no-overwrite?
nice. I guess it makes sense that overwrite is faster.
I guess that moving the header to the end will have the same
performance in this benchmark, since RAWOVWRT is the same as well.


* Re: [PATCH 0/6] perf core: Read from overwrite ring buffer
  2016-01-22  3:21                     ` Alexei Starovoitov
@ 2016-01-22  4:45                       ` Wangnan (F)
  0 siblings, 0 replies; 124+ messages in thread
From: Wangnan (F) @ 2016-01-22  4:45 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: peterz, ast, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Brendan Gregg, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Zefan Li, pi3orama



On 2016/1/22 11:21, Alexei Starovoitov wrote:
> On Fri, Jan 22, 2016 at 10:21:19AM +0800, Wangnan (F) wrote:
>>
>> On 2016/1/21 14:51, Wangnan (F) wrote:
>>>
>>> On 2016/1/20 10:20, Alexei Starovoitov wrote:
>>>> On Wed, Jan 20, 2016 at 09:37:42AM +0800, Wangnan (F) wrote:
>>>>> On 2016/1/20 1:42, Alexei Starovoitov wrote:
>>>>>> On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
>>>>>>> This patchset introduces two methods to support reading from
>>>>>>> overwrite.
>>>>>>>
>>>>>>>   1) Tailsize: write the size of an event at the end of it
>>>>>>>   2) Backward writing: write the ring buffer from the end of it to the
>>>>>>>      beginning.
>>>>>> what happened with your other idea of moving the whole header to the
>>>>>> end?
>>>>>> That felt better than either of these options.
>>>>> I'll try it today. However, putting all of the three together is
>>>>> not as easy as this patchset.
>>>> I'm missing something. Why all three in one set?
>>> Can't implement all three in one, but implement two of them make
>>> benchmarking simpler :)
>>>
>>> Here comes some numbers.
>>>
>>> I attach a target program at the end of this mail. It calls
>>> close(-1) for 3000000 times, and use gettimeofday to check
>>> how many us it takes.
>>>
>>> Following cases are tested:
>>>
>>>
>>> BASE    : ./a.out
>>> RAWPERF : ./perf record -o /dev/null -e raw_syscalls:* ./a.out
>>> WRTBKWRD: ./perf record -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out
>>> TAILSIZE: ./perf record --no-has-write-backward -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out
>>> RAWOVWRT: ./perf record --no-has-write-backward --no-has-tailsize -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out
>>>
>>> With this script:
>>>
>>> func() {
>>>         for x in `seq 1 100` ; do $1; done | tee data_$2
>>> }
>>>
>>> func ./a.out base
>>> func "./perf record -o /dev/null -e raw_syscalls:* ./a.out" rawperf
>>> func "./perf record -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out" wrtbkwrd
>>> func "./perf record -o /dev/null --no-has-write-backward -e raw_syscalls:*/overwrite/ ./a.out" tailsize
>>> func "./perf record -o /dev/null --no-has-write-backward --no-has-tailsize -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out" rawovwrt
>>>
>>> Result:
>>>
>>>             MEAN           STDVAR
>>> BASE    :  879870.81      11913.13
>>> RAWPERF : 2603854.7      706658.4
>>> WRTBKWRD: 2313301.220      6727.957
>>> TAILSIZE: 2383051.860      5248.061
>>> RAWOVWRT: 2315273.180      5221.025
>> Add a number: I tested original perf overwrite ring buffer in pure v4.4
>> on the same machine:
>>
>>                      MEAN          STDVAR
>> RAWOVWRT(original): 2323970.45    5103.39
>>
>> So I think backward writing method doesn't add extra overhead into
>> fastpath.
>>
>> I will send this patchset again with several bugs fixed. After that
>> I'll start working on tail-header if it is still required.
> interesting.
> did I read the numbers correctly that 'write backwards' method
> is actually the fastest? even faster than no-overwrite?

Yes. But given the STDVAR, we can't say 'WRTBKWRD' outperforms 'RAWOVWRT':
the difference between the two means is smaller than the spread of either.
However, 'WRTBKWRD' should be at least as fast as 'RAWOVWRT'.

> nice. I guess it makes sense that overwrite is faster.

In the no-overwrite case perf itself wakes up many times to collect data.
I guess that is the source of the high STDVAR.

> I guess that moving the header to the end will have the same
> performance in this benchmark, since RAWOVWRT is the same as well.
>
Yes.

Do you want to test it yourself? The code is ready.

Thank you.


end of thread, other threads:[~2016-01-22  4:46 UTC | newest]

Thread overview: 124+ messages
2016-01-11 13:47 [PATCH 00/53] perf tools: Bugfix, BPF improvement and perf record flight record mode Wang Nan
2016-01-11 13:47 ` [PATCH 01/53] perf tools: Add -lutil in python lib list for broken python-config Wang Nan
2016-01-12  9:43   ` Jiri Olsa
2016-01-12 10:09   ` [tip:perf/urgent] " tip-bot for Wang Nan
2016-01-11 13:47 ` [PATCH 02/53] perf tools: Fix phony build target for build-test Wang Nan
2016-01-12 10:09   ` [tip:perf/urgent] " tip-bot for Wang Nan
2016-01-11 13:47 ` [PATCH 03/53] perf tools: Set parallel making options build-test Wang Nan
2016-01-11 13:47 ` [PATCH 04/53] perf tools: Pass O option to Makefile.perf in build-test Wang Nan
2016-01-11 13:47 ` [PATCH 05/53] perf tools: Test correct path of perf " Wang Nan
2016-01-11 15:24   ` Arnaldo Carvalho de Melo
2016-01-11 22:06     ` Arnaldo Carvalho de Melo
2016-01-11 22:39       ` Arnaldo Carvalho de Melo
2016-01-11 22:39         ` Arnaldo Carvalho de Melo
2016-01-12  7:16           ` Wangnan (F)
2016-01-12 14:08             ` Arnaldo Carvalho de Melo
2016-01-11 13:47 ` [PATCH 06/53] perf tools: Fix PowerPC native building Wang Nan
2016-01-12 10:10   ` [tip:perf/urgent] " tip-bot for Wang Nan
2016-01-11 13:47 ` [PATCH 07/53] tools: Move Makefile.arch from perf/config to tools/scripts Wang Nan
2016-01-11 13:52   ` Wangnan (F)
2016-01-11 14:10     ` Arnaldo Carvalho de Melo
2016-01-12 10:10   ` [tip:perf/urgent] tools: Move Makefile.arch from perf/ config " tip-bot for Wang Nan
2016-01-11 13:47 ` [PATCH 08/53] perf tools: Add missing sources in perf's MANIFEST Wang Nan
2016-01-11 13:48 ` [PATCH 09/53] perf: bpf: Fix build breakage due to libbpf Wang Nan
2016-01-12 10:10   ` [tip:perf/urgent] perf " tip-bot for Naveen N. Rao
2016-01-11 13:48 ` [PATCH 10/53] tools build: Add BPF feature check to test-all Wang Nan
2016-01-12 10:11   ` [tip:perf/urgent] " tip-bot for Wang Nan
2016-01-11 13:48 ` [PATCH 11/53] perf test: Fix false TEST_OK result for 'perf test hist' Wang Nan
2016-01-11 14:25   ` Sergei Shtylyov
2016-01-11 14:58     ` Arnaldo Carvalho de Melo
2016-01-11 15:32       ` Arnaldo Carvalho de Melo
2016-01-12 10:11   ` [tip:perf/urgent] perf test: Fix false TEST_OK result for ' perf " tip-bot for Wang Nan
2016-01-11 13:48 ` [PATCH 12/53] perf test: Reset err after using it hold errcode in hist testcases Wang Nan
2016-01-12 10:11   ` [tip:perf/urgent] " tip-bot for Wang Nan
2016-01-11 13:48 ` [PATCH 13/53] perf tools: Prevent calling machine__delete() on non-allocated machine Wang Nan
2016-01-11 15:42   ` Arnaldo Carvalho de Melo
2016-01-12  7:03     ` Wangnan (F)
2016-01-12 14:07       ` Arnaldo Carvalho de Melo
2016-01-11 13:48 ` [PATCH 14/53] perf test: Check environment before start real BPF test Wang Nan
2016-01-11 21:55   ` Arnaldo Carvalho de Melo
2016-01-12  7:40     ` Wangnan (F)
2016-01-12 14:10       ` Arnaldo Carvalho de Melo
2016-01-11 13:48 ` [PATCH 15/53] perf tools: Fix symbols searching for offline module in buildid-cache Wang Nan
2016-01-11 13:48 ` [PATCH 16/53] perf tools: Fix mmap2 event allocation in synthesize code Wang Nan
2016-01-11 21:03   ` Arnaldo Carvalho de Melo
2016-01-12 10:12     ` [PATCH 16/53 v2] " Wang Nan
2016-01-12 10:49       ` 平松雅巳 / HIRAMATU,MASAMI
2016-01-12 10:51         ` Wangnan (F)
2016-01-12 14:24           ` acme
2016-01-13  0:40             ` 平松雅巳 / HIRAMATU,MASAMI
2016-01-13  9:40       ` [tip:perf/urgent] " tip-bot for Wang Nan
2016-01-11 13:48 ` [PATCH 17/53] perf test: Improve bp_signal Wang Nan
2016-01-11 21:37   ` Arnaldo Carvalho de Melo
2016-01-12  4:13     ` Wangnan (F)
2016-01-12  9:21     ` Jiri Olsa
2016-01-12 14:11       ` Arnaldo Carvalho de Melo
2016-01-12 14:17         ` Will Deacon
2016-01-11 13:48 ` [PATCH 18/53] perf tools: Add API to config maps in bpf object Wang Nan
2016-01-11 13:48 ` [PATCH 19/53] perf tools: Enable BPF object configure syntax Wang Nan
2016-01-11 13:48 ` [PATCH 20/53] perf record: Apply config to BPF objects before recording Wang Nan
2016-01-11 13:48 ` [PATCH 21/53] perf tools: Enable passing event to BPF object Wang Nan
2016-01-11 13:48 ` [PATCH 22/53] perf tools: Support perf event alias name Wang Nan
2016-01-11 13:48 ` [PATCH 23/53] perf tools: Support setting different slots in a BPF map separately Wang Nan
2016-01-11 13:48 ` [PATCH 24/53] perf tools: Enable indices setting syntax for BPF maps Wang Nan
2016-01-11 13:48 ` [PATCH 25/53] perf tools: Introduce bpf-output event Wang Nan
2016-01-11 13:48 ` [PATCH 26/53] perf data: Support converting data from bpf_perf_event_output() Wang Nan
2016-01-11 13:48 ` [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE Wang Nan
2016-01-11 18:09   ` Alexei Starovoitov
2016-01-12  5:33     ` Wangnan (F)
2016-01-12  6:11       ` Alexei Starovoitov
2016-01-12 12:36         ` Wangnan (F)
2016-01-12 19:56           ` Alexei Starovoitov
2016-01-13  4:34             ` Wangnan (F)
2016-01-13  5:14               ` Alexei Starovoitov
2016-01-12 14:05       ` Peter Zijlstra
2016-01-12 14:14   ` Peter Zijlstra
2016-01-18 11:52     ` [PATCH] perf core: Introduce new ioctl options to pause and resume ring buffer Wang Nan
2016-01-18 12:02       ` Peter Zijlstra
2016-01-19  2:55         ` Wangnan (F)
2016-01-19 11:16         ` [PATCH 0/6] perf core: Read from overwrite " Wang Nan
2016-01-19 11:16           ` [PATCH 1/6] perf core: Introduce new ioctl options to pause and resume " Wang Nan
2016-01-19 11:16           ` [PATCH 2/6] perf core: Set event's default overflow_handler Wang Nan
2016-01-19 11:16           ` [PATCH 3/6] perf core: Prepare writing into ring buffer from end Wang Nan
2016-01-19 11:16           ` [PATCH 4/6] perf core: Add backwork attribute to perf event Wang Nan
2016-01-19 11:16           ` [PATCH 5/6] perf core: Reduce perf event output overhead by setting overwrite handler Wang Nan
2016-01-19 11:16           ` [PATCH 6/6] perf/core: Put size of a sample at the end of it by PERF_SAMPLE_TAILSIZE Wang Nan
2016-01-19 13:58           ` [PATCH 0/6] perf core: Read from overwrite ring buffer Namhyung Kim
2016-01-19 14:14             ` pi3orama
2016-01-19 17:42           ` Alexei Starovoitov
2016-01-20  1:37             ` Wangnan (F)
2016-01-20  2:20               ` Alexei Starovoitov
2016-01-21  6:51                 ` Wangnan (F)
2016-01-22  2:21                   ` Wangnan (F)
2016-01-22  3:21                     ` Alexei Starovoitov
2016-01-22  4:45                       ` Wangnan (F)
2016-01-11 13:48 ` [PATCH 28/53] perf tools: Move timestamp creation to util Wang Nan
2016-01-11 13:48 ` [PATCH 29/53] perf tools: Make ordered_events reusable Wang Nan
2016-01-11 21:33   ` Arnaldo Carvalho de Melo
2016-01-11 13:48 ` [PATCH 30/53] perf record: Extract synthesize code to record__synthesize() Wang Nan
2016-01-11 13:48 ` [PATCH 31/53] perf tools: Add perf_data_file__switch() helper Wang Nan
2016-01-11 13:48 ` [PATCH 32/53] perf record: Turns auxtrace_snapshot_enable into 3 states Wang Nan
2016-01-11 13:48 ` [PATCH 33/53] perf record: Introduce record__finish_output() to finish a perf.data Wang Nan
2016-01-11 13:48 ` [PATCH 34/53] perf record: Use OPT_BOOLEAN_SET for buildid cache related options Wang Nan
2016-01-11 13:48 ` [PATCH 35/53] perf record: Add '--timestamp-filename' option to append timestamp to output filename Wang Nan
2016-01-11 13:48 ` [PATCH 36/53] perf record: Split output into multiple files via '--switch-output' Wang Nan
2016-01-11 13:48 ` [PATCH 37/53] perf record: Force enable --timestamp-filename when --switch-output is provided Wang Nan
2016-01-11 13:48 ` [PATCH 38/53] perf record: Disable buildid cache options by default in switch output mode Wang Nan
2016-01-11 13:48 ` [PATCH 39/53] perf record: Re-synthesize tracking events after output switching Wang Nan
2016-01-11 13:48 ` [PATCH 40/53] perf record: Generate tracking events for process forked by perf Wang Nan
2016-01-11 13:48 ` [PATCH 41/53] perf record: Ensure return non-zero rc when mmap fail Wang Nan
2016-01-11 13:48 ` [PATCH 42/53] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
2016-01-11 14:21   ` Sergei Shtylyov
2016-01-11 15:00     ` Arnaldo Carvalho de Melo
2016-01-11 15:01       ` Arnaldo Carvalho de Melo
2016-01-11 13:48 ` [PATCH 43/53] perf tools: Add evlist channel helpers Wang Nan
2016-01-11 13:48 ` [PATCH 44/53] perf tools: Automatically add new channel according to evlist Wang Nan
2016-01-11 13:48 ` [PATCH 45/53] perf tools: Operate multiple channels Wang Nan
2016-01-11 13:48 ` [PATCH 46/53] perf tools: Squash overwrite setting into channel Wang Nan
2016-01-11 13:48 ` [PATCH 47/53] perf record: Don't read from and poll overwrite channel Wang Nan
2016-01-11 13:48 ` [PATCH 48/53] perf tools: Enable overwrite settings Wang Nan
2016-01-11 13:48 ` [PATCH 49/53] perf tools: Consider TAILSIZE bit when caclulate is_pos Wang Nan
2016-01-11 13:48 ` [PATCH 50/53] perf tools: Set tailsize attribut bit for overwrite events Wang Nan
2016-01-11 13:48 ` [PATCH 51/53] perf record: Read from tailsize ring buffer Wang Nan
2016-01-11 13:48 ` [PATCH 52/53] perf record: Toggle tailsize ring buffer for reading Wang Nan
2016-01-11 13:48 ` [PATCH 53/53] perf record: Allow generate tracking events at the end of output Wang Nan
