linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL 00/37] perf/core improvements and fixes
@ 2016-10-24 16:20 Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 01/37] perf intel-pt/bts: Tidy instruction buffer size usage Arnaldo Carvalho de Melo
                   ` (37 more replies)
  0 siblings, 38 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Alemayhu, Alexander Shishkin, Alexis Berlemont,
	Andi Kleen, Anton Blanchard, Brice Goglin, David Ahern,
	Davidlohr Bueso, Hitoshi Mitake, Jiri Olsa, Maciej Debski,
	Mathieu Poirier, Namhyung Kim, Nilay Vaish, Peter Zijlstra,
	Ross McIlroy, Sebastian Andrzej Siewior, Stefano Sanfilippo,
	Stephane Eranian, Steven Rostedt, Sukadev Bhattiprolu, Wang Nan,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling, more to come soon,

- Arnaldo

Build and test stats at the end of the message.

The following changes since commit e9c848928abf4cb60601e9ae7d336f0333c98bca:

  Merge tag 'perf-c2c-for-mingo-20161021' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-10-22 10:22:05 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161024

for you to fetch changes up to 04b553ad7dc347eabd3cb4705932272453175a80:

  perf coresight: Removing miscellaneous debug output (2016-10-24 11:07:47 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Dynamicly change verbosity level by pressing 'V' in the 'perf top/report'
  hists TUI browser (Alexis Berlemont)

- Implement 'perf trace --delay' in the same fashion as in 'perf record --delay',
  to skip sampling workload initialization events (Alexis Berlemont)

- Make vendor named events case insensitive in 'perf list', i.e.
  'perf list LONGEST_LAT' works just the same as  'perf list longest_lat' (Andi Kleen)

- Show instruction bytes and lenght in 'perf script' for Intel PT and BTS (Andi Kleen, Adrian Hunter)

   E.g:

    % perf record -e intel_pt// foo
    % perf script --itrace=i0ns -F ip,insn,insnlen
     ffffffff8101232f ilen: 5 insn: 0f 1f 44 00 00
     ffffffff81012334 ilen: 1 insn: 5b
     ffffffff81012335 ilen: 1 insn: 5d
     ffffffff81012336 ilen: 1 insn: c3
     ffffffff810123e3 ilen: 1 insn: 5b
     ffffffff810123e4 ilen: 2 insn: 41 5c
     ffffffff810123e6 ilen: 1 insn: 5d
     ffffffff810123e7 ilen: 1 insn: c3
     ffffffff810124a6 ilen: 2 insn: 31 c0
     ffffffff810124a8 ilen: 9 insn: 41 83 bc 24 a8 01 00 00 01
     ffffffff810124b1 ilen: 2 insn: 75 87

- Allow enabling the perf_event_attr.branch_type attribute member: (Andi Kleen)

  perf record -e sched:sched_switch,cpu/cpu-cycles,branch_type=any/ ...

- Add unwinding support for jitdump (Stefano Sanfilippo)

Fixes:

- Use raw_syscall:sys_enter timestamp in 'perf trace' (Arnaldo Carvalho de Melo)

Infrastructure:

- Allow jitdump to be built without libdwarf (Maciej Debski)

- Sync x86's syscall table tools/ copy (Arnaldo Carvalho de Melo)

- Fixes to avoid calling die() in library fuctions already propagating other
  errors (Arnaldo Carvalho de Melo)

- Improvements to allow libtraceevent to be properly installed in distro
  packages (Jiri Olsa)

- Removing coresight miscellaneous debug output (Mathieu Poirier)

- Cache align the 'perf bench futex' worker struct (Sebastian Andrzej Siewior)

Documentation:

- Minor improvements on the documentation of event parameters (Andi Kleen)

- Add jitdump format specification document (Stephane Eranian)

Spelling fixes:

- Fix typo "No enough" to "Not enough" (Alexander Alemayhu)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Adrian Hunter (1):
      perf intel-pt/bts: Tidy instruction buffer size usage

Alexander Alemayhu (1):
      perf tools: Fix typo "No enough" to "Not enough"

Alexis Berlemont (2):
      perf hists browser: Dynamically change verbosity level
      perf trace: Implement --delay

Andi Kleen (6):
      perf intel-pt/bts: Report instruction bytes and length in sample
      perf script: Support insn and insnlen
      perf record: Improve documentation of event parameters
      perf tools: Implement branch_type event parameter
      perf pmu: Only print Using CPUID message once
      perf list: Make vendor event matching case insensitive

Arnaldo Carvalho de Melo (8):
      perf tools: Sync copy of x86's syscall table
      perf jit: Avoid returning garbage for a ret variable
      perf jit: Add NT_GNU_BUILD_ID definition for older distros
      perf bench mem: Move boilerplate memory allocation to the infrastructure
      perf tools: Normalize sq_quote_argv() error reporting
      perf tools: Use normal error reporting when processing PERF_RECORD_READ events
      perf trace: Remove thread_trace->exit_time
      perf trace: Use the syscall raw_syscalls:sys_enter timestamp

Jiri Olsa (8):
      tools lib traceevent: Add install_headers target
      tools lib traceevent: Add do_install_mkdir Makefile function
      tools lib traceevent: Rename LIB_FILE to LIB_TARGET
      tools lib traceevent: Add version for traceevent shared object
      tools lib: Add for_each_clear_bit macro
      perf report: Move captured info to generic header info
      perf header: Display missing features
      perf header: Display feature name on write failure

Maciej Debski (1):
      perf jit: Enable jitdump support without dwarf

Mathieu Poirier (1):
      perf coresight: Removing miscellaneous debug output

Sebastian Andrzej Siewior (1):
      perf bench futex: Cache align the worker struct

Stefano Sanfilippo (5):
      perf jit: Make perf skip unknown records
      perf jit: Do not assume pgoff is zero
      perf jit: Add unwinding support
      perf jit: Generate .eh_frame/.eh_frame_hdr in DSO
      perf jit: Check JITHEADER_VERSION

Stephane Eranian (3):
      perf jit: Improve error messages from JVMTI
      perf jit: Remove unecessary padding in jitdump file
      perf jit: Add jitdump format specification document

 tools/include/asm-generic/bitops.h                 |   1 +
 tools/include/asm-generic/bitops/__ffz.h           |  12 ++
 tools/include/asm-generic/bitops/find.h            |  28 ++++
 tools/include/linux/bitops.h                       |   5 +
 tools/lib/find_bit.c                               |  25 +++
 tools/lib/traceevent/Makefile                      |  40 +++--
 tools/perf/Documentation/jitdump-specification.txt | 170 +++++++++++++++++++++
 tools/perf/Documentation/perf-record.txt           |   9 +-
 tools/perf/Documentation/perf-script.txt           |   6 +-
 tools/perf/Documentation/perf-trace.txt            |   5 +
 tools/perf/MANIFEST                                |   1 +
 tools/perf/Makefile.config                         |   2 +-
 tools/perf/arch/arm/util/cs-etm.c                  |   2 -
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl  |   4 +-
 tools/perf/bench/futex-hash.c                      |   5 +-
 tools/perf/bench/mem-functions.c                   |  77 ++++------
 tools/perf/builtin-report.c                        |  12 +-
 tools/perf/builtin-script.c                        |  24 ++-
 tools/perf/builtin-trace.c                         |  19 ++-
 tools/perf/jvmti/jvmti_agent.c                     |  38 +----
 tools/perf/jvmti/libjvmti.c                        |  39 +++--
 tools/perf/tests/backward-ring-buffer.c            |   2 +-
 tools/perf/tests/bpf.c                             |   2 +-
 tools/perf/ui/browsers/hists.c                     |   5 +-
 tools/perf/util/Build                              |   2 +-
 tools/perf/util/bpf-loader.c                       |  14 +-
 tools/perf/util/event.h                            |   3 +
 tools/perf/util/evsel.c                            |   9 ++
 tools/perf/util/evsel.h                            |   2 +
 tools/perf/util/genelf.c                           | 113 +++++++++++++-
 tools/perf/util/genelf.h                           |   5 +-
 tools/perf/util/header.c                           |  19 ++-
 tools/perf/util/intel-bts.c                        |   9 +-
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  |   2 +
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |   1 +
 .../util/intel-pt-decoder/intel-pt-insn-decoder.c  |  13 +-
 .../util/intel-pt-decoder/intel-pt-insn-decoder.h  |   6 +-
 tools/perf/util/intel-pt-decoder/intel-pt-log.c    |   4 +-
 tools/perf/util/intel-pt.c                         |  19 ++-
 tools/perf/util/jitdump.c                          |  82 ++++++++--
 tools/perf/util/jitdump.h                          |  12 ++
 tools/perf/util/llvm-utils.c                       |   2 +-
 tools/perf/util/map.c                              |  17 ++-
 tools/perf/util/parse-branch-options.c             |  85 ++++++-----
 tools/perf/util/parse-branch-options.h             |   3 +-
 tools/perf/util/parse-events.c                     |  15 +-
 tools/perf/util/pmu.c                              |  10 +-
 tools/perf/util/quote.c                            |   2 +-
 tools/perf/util/session.c                          |  10 --
 tools/perf/util/string.c                           |  21 ++-
 tools/perf/util/unwind-libunwind-local.c           |   4 +-
 tools/perf/util/util.h                             |   1 +
 tools/perf/util/values.c                           |  81 +++++++---
 tools/perf/util/values.h                           |   4 +-
 54 files changed, 830 insertions(+), 273 deletions(-)
 create mode 100644 tools/include/asm-generic/bitops/__ffz.h
 create mode 100644 tools/perf/Documentation/jitdump-specification.txt

  # time dm
     1 alpine:3.4: Ok
     2 android-ndk:r12b-arm: Ok
     3 archlinux:latest: Ok
     4 centos:5: Ok
     5 centos:6: Ok
     6 centos:7: Ok
     7 debian:7: Ok
     8 debian:8: Ok
     9 debian:experimental: Ok
    10 fedora:20: Ok
    11 fedora:21: Ok
    12 fedora:22: Ok
    13 fedora:23: Ok
    14 fedora:24: Ok
    15 fedora:24-x-ARC-uClibc: Ok
    16 fedora:rawhide: Ok
    17 mageia:5: Ok
    18 opensuse:13.2: Ok
    19 opensuse:42.1: Ok
    20 opensuse:tumbleweed: Ok
    21 ubuntu:12.04.5: Ok
    22 ubuntu:14.04: Ok
    23 ubuntu:14.04.4: Ok
    24 ubuntu:15.10: Ok
    25 ubuntu:16.04: Ok
    26 ubuntu:16.04-x-arm: Ok
    27 ubuntu:16.04-x-arm64: Ok
    28 ubuntu:16.04-x-powerpc: Ok
    29 ubuntu:16.04-x-powerpc64: Ok
    30 ubuntu:16.04-x-powerpc64el: Ok
    31 ubuntu:16.04-x-s390: Ok
    32 ubuntu:16.10: Ok
  
  real	46m1.737s
  user	0m2.200s
  sys	0m2.735s
  #

   # perf test
   1: vmlinux symtab matches kallsyms                          : Ok
   2: detect openat syscall event                              : Ok
   3: detect openat syscall event on all cpus                  : Ok
   4: read samples using the mmap interface                    : Ok
   5: parse events tests                                       : Ok
   6: Validate PERF_RECORD_* events & perf_sample fields       : Ok
   7: Test perf pmu format parsing                             : Ok
   8: Test dso data read                                       : Ok
   9: Test dso data cache                                      : Ok
  10: Test dso data reopen                                     : Ok
  11: roundtrip evsel->name check                              : Ok
  12: Check parsing of sched tracepoints fields                : Ok
  13: Generate and check syscalls:sys_enter_openat event fields: Ok
  14: struct perf_event_attr setup                             : Ok
  15: Test matching and linking multiple hists                 : Ok
  16: Try 'import perf' in python, checking link problems      : Ok
  17: Test breakpoint overflow signal handler                  : Ok
  18: Test breakpoint overflow sampling                        : Ok
  19: Test number of exit event of a simple workload           : Ok
  20: Test software clock events have valid period values      : Ok
  21: Test object code reading                                 : Ok
  22: Test sample parsing                                      : Ok
  23: Test using a dummy software event to keep tracking       : Ok
  24: Test parsing with no sample_id_all bit set               : Ok
  25: Test filtering hist entries                              : Ok
  26: Test mmap thread lookup                                  : Ok
  27: Test thread mg sharing                                   : Ok
  28: Test output sorting of hist entries                      : Ok
  29: Test cumulation of child hist entries                    : Ok
  30: Test tracking with sched_switch                          : Ok
  31: Filter fds with revents mask in a fdarray                : Ok
  32: Add fd to a fdarray, making it autogrow                  : Ok
  33: Test kmod_path__parse function                           : Ok
  34: Test thread map                                          : Ok
  35: Test LLVM searching and compiling                        :
  35.1: Basic BPF llvm compiling test                          : Ok
  35.2: Test kbuild searching                                  : Ok
  35.3: Compile source for BPF prologue generation test        : Ok
  35.4: Compile source for BPF relocation test                 : Ok
  36: Test topology in session                                 : Ok
  37: Test BPF filter                                          :
  37.1: Test basic BPF filtering                               : Ok
  37.2: Test BPF prologue generation                           : Ok
  37.3: Test BPF relocation checker                            : Ok
  38: Test thread map synthesize                               : Ok
  39: Test cpu map synthesize                                  : Ok
  40: Test stat config synthesize                              : Ok
  41: Test stat synthesize                                     : Ok
  42: Test stat round synthesize                               : Ok
  43: Test attr update synthesize                              : Ok
  44: Test events times                                        : Ok
  45: Test backward reading from ring buffer                   : Ok
  46: Test cpu map print                                       : Ok
  47: Test SDT event probing                                   : Ok
  48: Test is_printable_array function                         : Ok
  49: Test bitmap print                                        : Ok
  50: x86 rdpmc test                                           : Ok
  51: Test converting perf time to TSC                         : Ok
  52: Test dwarf unwind                                        : Ok
  53: Test x86 instruction decoder - new instructions          : Ok
  54: Test intel cqm nmi context read                          : Skip
  #

   $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/linux/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
                   make_tags_O: make tags 
                  make_debug_O: make DEBUG=1
              make_no_libbpf_O: make NO_LIBBPF=1
                make_no_newt_O: make NO_NEWT=1
             make_no_libperl_O: make NO_LIBPERL=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1
               make_no_slang_O: make NO_SLANG=1
            make_no_demangle_O: make NO_DEMANGLE=1
           make_no_libpython_O: make NO_LIBPYTHON=1
        make_with_babeltrace_O: make LIBBABELTRACE=1
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
              make_no_libelf_O: make NO_LIBELF=1
           make_no_backtrace_O: make NO_BACKTRACE=1
             make_util_map_o_O: make util/map.o
           make_no_libunwind_O: make NO_LIBUNWIND=1
             make_no_libnuma_O: make NO_LIBNUMA=1
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
                make_install_O: make install
              make_clean_all_O: make clean all
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
                 make_perf_o_O: make perf.o
                   make_help_O: make help
            make_no_libaudit_O: make NO_LIBAUDIT=1
       make_util_pmu_bison_o_O: make util/pmu-bison.o
                   make_pure_O: make
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
            make_no_auxtrace_O: make NO_AUXTRACE=1
         make_install_prefix_O: make install prefix=/tmp/krava
           make_no_libbionic_O: make NO_LIBBIONIC=1
            make_install_bin_O: make install-bin
                make_no_gtk2_O: make NO_GTK2=1
                 make_static_O: make LDFLAGS=-static
                    make_doc_O: make doc
  OK
  make: Leaving directory '/home/acme/git/linux/tools/perf'
  $

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 01/37] perf intel-pt/bts: Tidy instruction buffer size usage
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 02/37] perf intel-pt/bts: Report instruction bytes and length in sample Arnaldo Carvalho de Melo
                   ` (36 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Tidy instruction buffer size usage in preparation for copying the
instruction bytes onto samples.

The instruction buffer is presently used for debugging, so rename its
size macro from INTEL_PT_INSN_DBG_BUF_SZ to INTEL_PT_INSN_BUF_SZ, and
use it everywhere.

Note that the maximum instruction size is 15 which is a less efficient size
to copy than 16, which is why a separate buffer size is used.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1475847747-30994-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-bts.c                              |  8 +++-----
 tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c | 13 ++++++-------
 tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h |  6 ++----
 tools/perf/util/intel-pt-decoder/intel-pt-log.c          |  4 ++--
 tools/perf/util/intel-pt.c                               |  8 +++-----
 5 files changed, 16 insertions(+), 23 deletions(-)

diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index f545ec1e758a..8bc7fec817d7 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -319,15 +319,12 @@ static int intel_bts_get_next_insn(struct intel_bts_queue *btsq, u64 ip)
 	struct machine *machine = btsq->bts->machine;
 	struct thread *thread;
 	struct addr_location al;
-	unsigned char buf[1024];
-	size_t bufsz;
+	unsigned char buf[INTEL_PT_INSN_BUF_SZ];
 	ssize_t len;
 	int x86_64;
 	uint8_t cpumode;
 	int err = -1;
 
-	bufsz = intel_pt_insn_max_size();
-
 	if (machine__kernel_ip(machine, ip))
 		cpumode = PERF_RECORD_MISC_KERNEL;
 	else
@@ -341,7 +338,8 @@ static int intel_bts_get_next_insn(struct intel_bts_queue *btsq, u64 ip)
 	if (!al.map || !al.map->dso)
 		goto out_put;
 
-	len = dso__data_read_addr(al.map->dso, al.map, machine, ip, buf, bufsz);
+	len = dso__data_read_addr(al.map->dso, al.map, machine, ip, buf,
+				  INTEL_PT_INSN_BUF_SZ);
 	if (len <= 0)
 		goto out_put;
 
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
index d23138c06665..5f95cd442075 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
@@ -27,6 +27,10 @@
 
 #include "intel-pt-insn-decoder.h"
 
+#if INTEL_PT_INSN_BUF_SZ < MAX_INSN_SIZE
+#error Instruction buffer size too small
+#endif
+
 /* Based on branch_type() from perf_event_intel_lbr.c */
 static void intel_pt_insn_decoder(struct insn *insn,
 				  struct intel_pt_insn *intel_pt_insn)
@@ -166,10 +170,10 @@ int intel_pt_get_insn(const unsigned char *buf, size_t len, int x86_64,
 	if (!insn_complete(&insn) || insn.length > len)
 		return -1;
 	intel_pt_insn_decoder(&insn, intel_pt_insn);
-	if (insn.length < INTEL_PT_INSN_DBG_BUF_SZ)
+	if (insn.length < INTEL_PT_INSN_BUF_SZ)
 		memcpy(intel_pt_insn->buf, buf, insn.length);
 	else
-		memcpy(intel_pt_insn->buf, buf, INTEL_PT_INSN_DBG_BUF_SZ);
+		memcpy(intel_pt_insn->buf, buf, INTEL_PT_INSN_BUF_SZ);
 	return 0;
 }
 
@@ -211,11 +215,6 @@ int intel_pt_insn_desc(const struct intel_pt_insn *intel_pt_insn, char *buf,
 	return 0;
 }
 
-size_t intel_pt_insn_max_size(void)
-{
-	return MAX_INSN_SIZE;
-}
-
 int intel_pt_insn_type(enum intel_pt_insn_op op)
 {
 	switch (op) {
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
index b0adbf37323e..37ec5627ae9b 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
@@ -20,7 +20,7 @@
 #include <stdint.h>
 
 #define INTEL_PT_INSN_DESC_MAX		32
-#define INTEL_PT_INSN_DBG_BUF_SZ	16
+#define INTEL_PT_INSN_BUF_SZ		16
 
 enum intel_pt_insn_op {
 	INTEL_PT_OP_OTHER,
@@ -47,7 +47,7 @@ struct intel_pt_insn {
 	enum intel_pt_insn_branch	branch;
 	int				length;
 	int32_t				rel;
-	unsigned char			buf[INTEL_PT_INSN_DBG_BUF_SZ];
+	unsigned char			buf[INTEL_PT_INSN_BUF_SZ];
 };
 
 int intel_pt_get_insn(const unsigned char *buf, size_t len, int x86_64,
@@ -58,8 +58,6 @@ const char *intel_pt_insn_name(enum intel_pt_insn_op op);
 int intel_pt_insn_desc(const struct intel_pt_insn *intel_pt_insn, char *buf,
 		       size_t buf_len);
 
-size_t intel_pt_insn_max_size(void);
-
 int intel_pt_insn_type(enum intel_pt_insn_op op);
 
 #endif
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-log.c b/tools/perf/util/intel-pt-decoder/intel-pt-log.c
index 319bef33a64b..e02bc7b166a0 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-log.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-log.c
@@ -119,8 +119,8 @@ void __intel_pt_log_insn(struct intel_pt_insn *intel_pt_insn, uint64_t ip)
 	if (intel_pt_log_open())
 		return;
 
-	if (len > INTEL_PT_INSN_DBG_BUF_SZ)
-		len = INTEL_PT_INSN_DBG_BUF_SZ;
+	if (len > INTEL_PT_INSN_BUF_SZ)
+		len = INTEL_PT_INSN_BUF_SZ;
 	intel_pt_print_data(intel_pt_insn->buf, len, ip, 8);
 	if (intel_pt_insn_desc(intel_pt_insn, desc, INTEL_PT_INSN_DESC_MAX) > 0)
 		fprintf(f, "%s\n", desc);
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index dc041d4368c8..815a14d8904b 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -428,8 +428,7 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
 	struct machine *machine = ptq->pt->machine;
 	struct thread *thread;
 	struct addr_location al;
-	unsigned char buf[1024];
-	size_t bufsz;
+	unsigned char buf[INTEL_PT_INSN_BUF_SZ];
 	ssize_t len;
 	int x86_64;
 	u8 cpumode;
@@ -440,8 +439,6 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
 	if (to_ip && *ip == to_ip)
 		goto out_no_cache;
 
-	bufsz = intel_pt_insn_max_size();
-
 	if (*ip >= ptq->pt->kernel_start)
 		cpumode = PERF_RECORD_MISC_KERNEL;
 	else
@@ -493,7 +490,8 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
 
 		while (1) {
 			len = dso__data_read_offset(al.map->dso, machine,
-						    offset, buf, bufsz);
+						    offset, buf,
+						    INTEL_PT_INSN_BUF_SZ);
 			if (len <= 0)
 				return -EINVAL;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 02/37] perf intel-pt/bts: Report instruction bytes and length in sample
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 01/37] perf intel-pt/bts: Tidy instruction buffer size usage Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 03/37] perf script: Support insn and insnlen Arnaldo Carvalho de Melo
                   ` (35 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Andi Kleen, Adrian Hunter, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Change Intel PT and BTS to pass up the length and the instruction
bytes of the decoded or sampled instruction in the perf sample.

The decoder already knows this information, we just need to pass it
up. Since it is only a couple of movs it is not very expensive.

Handle instruction cache too. Make sure ilen is always initialized.

Used in the next patch.

[Adrian: re-base on top (and adjust for) instruction buffer size tidy-up]
[Adrian: add BTS support and adjust commit message accordingly]

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/1475847747-30994-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/event.h                                  |  3 +++
 tools/perf/util/intel-bts.c                              |  1 +
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c      |  2 ++
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.h      |  1 +
 tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c |  2 +-
 tools/perf/util/intel-pt.c                               | 11 +++++++++++
 6 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 8d363d5e65a2..c735c53a26f8 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -177,6 +177,8 @@ enum {
 	PERF_IP_FLAG_TRACE_BEGIN	|\
 	PERF_IP_FLAG_TRACE_END)
 
+#define MAX_INSN 16
+
 struct perf_sample {
 	u64 ip;
 	u32 pid, tid;
@@ -193,6 +195,7 @@ struct perf_sample {
 	u32 flags;
 	u16 insn_len;
 	u8  cpumode;
+	char insn[MAX_INSN];
 	void *raw_data;
 	struct ip_callchain *callchain;
 	struct branch_stack *branch_stack;
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 8bc7fec817d7..6c2eb5da4afc 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -295,6 +295,7 @@ static int intel_bts_synth_branch_sample(struct intel_bts_queue *btsq,
 	sample.cpu = btsq->cpu;
 	sample.flags = btsq->sample_flags;
 	sample.insn_len = btsq->intel_pt_insn.length;
+	memcpy(sample.insn, btsq->intel_pt_insn.buf, INTEL_PT_INSN_BUF_SZ);
 
 	if (bts->synth_opts.inject) {
 		event.sample.header.size = bts->branches_event_size;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 16c06d3ae577..e4e7dc781d21 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -980,6 +980,8 @@ static int intel_pt_walk_insn(struct intel_pt_decoder *decoder,
 out_no_progress:
 	decoder->state.insn_op = intel_pt_insn->op;
 	decoder->state.insn_len = intel_pt_insn->length;
+	memcpy(decoder->state.insn, intel_pt_insn->buf,
+	       INTEL_PT_INSN_BUF_SZ);
 
 	if (decoder->tx_flags & INTEL_PT_IN_TX)
 		decoder->state.flags |= INTEL_PT_IN_TX;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index 89399985fa4d..e90619a43c0c 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -66,6 +66,7 @@ struct intel_pt_state {
 	uint32_t flags;
 	enum intel_pt_insn_op insn_op;
 	int insn_len;
+	char insn[INTEL_PT_INSN_BUF_SZ];
 };
 
 struct intel_pt_insn;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
index 5f95cd442075..7913363bde5c 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
@@ -27,7 +27,7 @@
 
 #include "intel-pt-insn-decoder.h"
 
-#if INTEL_PT_INSN_BUF_SZ < MAX_INSN_SIZE
+#if INTEL_PT_INSN_BUF_SZ < MAX_INSN_SIZE || INTEL_PT_INSN_BUF_SZ > MAX_INSN
 #error Instruction buffer size too small
 #endif
 
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 815a14d8904b..85d5eeb66c75 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -143,6 +143,7 @@ struct intel_pt_queue {
 	u32 flags;
 	u16 insn_len;
 	u64 last_insn_cnt;
+	char insn[INTEL_PT_INSN_BUF_SZ];
 };
 
 static void intel_pt_dump(struct intel_pt *pt __maybe_unused,
@@ -315,6 +316,7 @@ struct intel_pt_cache_entry {
 	enum intel_pt_insn_branch	branch;
 	int				length;
 	int32_t				rel;
+	char				insn[INTEL_PT_INSN_BUF_SZ];
 };
 
 static int intel_pt_config_div(const char *var, const char *value, void *data)
@@ -400,6 +402,7 @@ static int intel_pt_cache_add(struct dso *dso, struct machine *machine,
 	e->branch = intel_pt_insn->branch;
 	e->length = intel_pt_insn->length;
 	e->rel = intel_pt_insn->rel;
+	memcpy(e->insn, intel_pt_insn->buf, INTEL_PT_INSN_BUF_SZ);
 
 	err = auxtrace_cache__add(c, offset, &e->entry);
 	if (err)
@@ -436,6 +439,8 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
 	u64 insn_cnt = 0;
 	bool one_map = true;
 
+	intel_pt_insn->length = 0;
+
 	if (to_ip && *ip == to_ip)
 		goto out_no_cache;
 
@@ -475,6 +480,8 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
 				intel_pt_insn->branch = e->branch;
 				intel_pt_insn->length = e->length;
 				intel_pt_insn->rel = e->rel;
+				memcpy(intel_pt_insn->buf, e->insn,
+				       INTEL_PT_INSN_BUF_SZ);
 				intel_pt_log_insn_no_data(intel_pt_insn, *ip);
 				return 0;
 			}
@@ -898,6 +905,7 @@ static void intel_pt_sample_flags(struct intel_pt_queue *ptq)
 		if (ptq->state->flags & INTEL_PT_IN_TX)
 			ptq->flags |= PERF_IP_FLAG_IN_TX;
 		ptq->insn_len = ptq->state->insn_len;
+		memcpy(ptq->insn, ptq->state->insn, INTEL_PT_INSN_BUF_SZ);
 	}
 }
 
@@ -1078,6 +1086,7 @@ static int intel_pt_synth_branch_sample(struct intel_pt_queue *ptq)
 	sample.cpu = ptq->cpu;
 	sample.flags = ptq->flags;
 	sample.insn_len = ptq->insn_len;
+	memcpy(sample.insn, ptq->insn, INTEL_PT_INSN_BUF_SZ);
 
 	/*
 	 * perf report cannot handle events without a branch stack when using
@@ -1139,6 +1148,7 @@ static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
 	sample.cpu = ptq->cpu;
 	sample.flags = ptq->flags;
 	sample.insn_len = ptq->insn_len;
+	memcpy(sample.insn, ptq->insn, INTEL_PT_INSN_BUF_SZ);
 
 	ptq->last_insn_cnt = ptq->state->tot_insn_cnt;
 
@@ -1201,6 +1211,7 @@ static int intel_pt_synth_transaction_sample(struct intel_pt_queue *ptq)
 	sample.cpu = ptq->cpu;
 	sample.flags = ptq->flags;
 	sample.insn_len = ptq->insn_len;
+	memcpy(sample.insn, ptq->insn, INTEL_PT_INSN_BUF_SZ);
 
 	if (pt->synth_opts.callchain) {
 		thread_stack__sample(ptq->thread, ptq->chain,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 03/37] perf script: Support insn and insnlen
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 01/37] perf intel-pt/bts: Tidy instruction buffer size usage Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 02/37] perf intel-pt/bts: Report instruction bytes and length in sample Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 04/37] perf tools: Sync copy of x86's syscall table Arnaldo Carvalho de Melo
                   ` (34 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

When looking at Intel PT traces with perf script it is useful to have
some indication of the instruction. Dump the instruction bytes and
instruction length, which can be used for simple pattern analysis in
scripts.

% perf record -e intel_pt// foo
% perf script --itrace=i0ns -F ip,insn,insnlen
 ffffffff8101232f ilen: 5 insn: 0f 1f 44 00 00
 ffffffff81012334 ilen: 1 insn: 5b
 ffffffff81012335 ilen: 1 insn: 5d
 ffffffff81012336 ilen: 1 insn: c3
 ffffffff810123e3 ilen: 1 insn: 5b
 ffffffff810123e4 ilen: 2 insn: 41 5c
 ffffffff810123e6 ilen: 1 insn: 5d
 ffffffff810123e7 ilen: 1 insn: c3
 ffffffff810124a6 ilen: 2 insn: 31 c0
 ffffffff810124a8 ilen: 9 insn: 41 83 bc 24 a8 01 00 00 01
 ffffffff810124b1 ilen: 2 insn: 75 87
...

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/1475847747-30994-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  6 +++++-
 tools/perf/builtin-script.c              | 24 ++++++++++++++++++++++--
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 053bbbd84ece..c01904f388ce 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,7 @@ OPTIONS
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
         srcline, period, iregs, brstack, brstacksym, flags, bpf-output,
-        callindent. Field list can be prepended with the type, trace, sw or hw,
+        callindent, insn, insnlen. Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
 
@@ -181,6 +181,10 @@ OPTIONS
 	Instruction Trace decoding. For calls and returns, it will display the
 	name of the symbol indented with spaces to reflect the stack depth.
 
+	When doing instruction trace decoding insn and insnlen give the
+	instruction bytes and the instruction length of the current
+	instruction.
+
 	Finally, a user may not set fields to none for all event types.
 	i.e., -F "" is not allowed.
 
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7228d141a789..412fb6e65ac0 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -66,6 +66,8 @@ enum perf_output_field {
 	PERF_OUTPUT_WEIGHT	    = 1U << 18,
 	PERF_OUTPUT_BPF_OUTPUT	    = 1U << 19,
 	PERF_OUTPUT_CALLINDENT	    = 1U << 20,
+	PERF_OUTPUT_INSN	    = 1U << 21,
+	PERF_OUTPUT_INSNLEN	    = 1U << 22,
 };
 
 struct output_option {
@@ -93,6 +95,8 @@ struct output_option {
 	{.str = "weight",   .field = PERF_OUTPUT_WEIGHT},
 	{.str = "bpf-output",   .field = PERF_OUTPUT_BPF_OUTPUT},
 	{.str = "callindent", .field = PERF_OUTPUT_CALLINDENT},
+	{.str = "insn", .field = PERF_OUTPUT_INSN},
+	{.str = "insnlen", .field = PERF_OUTPUT_INSNLEN},
 };
 
 /* default set to maintain compatibility with current format */
@@ -624,6 +628,20 @@ static void print_sample_callindent(struct perf_sample *sample,
 		printf("%*s", spacing - len, "");
 }
 
+static void print_insn(struct perf_sample *sample,
+		       struct perf_event_attr *attr)
+{
+	if (PRINT_FIELD(INSNLEN))
+		printf(" ilen: %d", sample->insn_len);
+	if (PRINT_FIELD(INSN)) {
+		int i;
+
+		printf(" insn:");
+		for (i = 0; i < sample->insn_len; i++)
+			printf(" %02x", (unsigned char)sample->insn[i]);
+	}
+}
+
 static void print_sample_bts(struct perf_sample *sample,
 			     struct perf_evsel *evsel,
 			     struct thread *thread,
@@ -668,6 +686,8 @@ static void print_sample_bts(struct perf_sample *sample,
 	if (print_srcline_last)
 		map__fprintf_srcline(al->map, al->addr, "\n  ", stdout);
 
+	print_insn(sample, attr);
+
 	printf("\n");
 }
 
@@ -911,7 +931,7 @@ static void process_event(struct perf_script *script,
 
 	if (perf_evsel__is_bpf_output(evsel) && PRINT_FIELD(BPF_OUTPUT))
 		print_sample_bpf_output(sample);
-
+	print_insn(sample, attr);
 	printf("\n");
 }
 
@@ -2124,7 +2144,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
 		     "addr,symoff,period,iregs,brstack,brstacksym,flags,"
-		     "bpf-output,callindent", parse_output_fields),
+		     "bpf-output,callindent,insn,insnlen", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 04/37] perf tools: Sync copy of x86's syscall table
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 03/37] perf script: Support insn and insnlen Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 05/37] tools lib traceevent: Add install_headers target Arnaldo Carvalho de Melo
                   ` (33 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Jiri Olsa, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

To get up to the recent compat pread/pwrite changes, that albeit not
being used by 'perf trace' due to some raw_syscalls tracepoint
limitations, trigger this warning when building perf:

  Warning: x86_64's syscall_64.tbl differs from kernel

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ilgqhxd9ubkg5f66bx0bht2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index 555263e385c9..e9ce9c7c39b4 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -374,5 +374,5 @@
 543	x32	io_setup		compat_sys_io_setup
 544	x32	io_submit		compat_sys_io_submit
 545	x32	execveat		compat_sys_execveat/ptregs
-534	x32	preadv2			compat_sys_preadv2
-535	x32	pwritev2		compat_sys_pwritev2
+546	x32	preadv2			compat_sys_preadv64v2
+547	x32	pwritev2		compat_sys_pwritev64v2
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 05/37] tools lib traceevent: Add install_headers target
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 04/37] perf tools: Sync copy of x86's syscall table Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 06/37] tools lib traceevent: Add do_install_mkdir Makefile function Arnaldo Carvalho de Melo
                   ` (32 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Steven Rostedt,
	Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Adding install_headers target to install all headers
under 'include/traceevent' path, like:

  $ make DESTDIR=/tmp/krava prefix=/usr install_headers
  $ find /tmp/krava/ -type f
  /tmp/krava/usr/include/traceevent/kbuffer.h
  /tmp/krava/usr/include/traceevent/event-utils.h
  /tmp/krava/usr/include/traceevent/event-parse.h

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-if70lj3zhdc3csdqm5webjvc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/Makefile | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile
index 7851df1490e0..8e44bea646ee 100644
--- a/tools/lib/traceevent/Makefile
+++ b/tools/lib/traceevent/Makefile
@@ -240,7 +240,7 @@ define do_install
 	if [ ! -d '$(DESTDIR_SQ)$2' ]; then		\
 		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$2';	\
 	fi;						\
-	$(INSTALL) $1 '$(DESTDIR_SQ)$2'
+	$(INSTALL) $(if $3,-m $3,) $1 '$(DESTDIR_SQ)$2'
 endef
 
 define do_install_plugins
@@ -264,6 +264,12 @@ install_plugins: $(PLUGINS)
 	$(call QUIET_INSTALL, trace_plugins) \
 		$(call do_install_plugins, $(PLUGINS))
 
+install_headers:
+	$(call QUIET_INSTALL, headers) \
+		$(call do_install,event-parse.h,$(prefix)/include/traceevent,644); \
+		$(call do_install,event-utils.h,$(prefix)/include/traceevent,644); \
+		$(call do_install,kbuffer.h,$(prefix)/include/traceevent,644)
+
 install: install_lib
 
 clean:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 06/37] tools lib traceevent: Add do_install_mkdir Makefile function
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 05/37] tools lib traceevent: Add install_headers target Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 07/37] tools lib traceevent: Rename LIB_FILE to LIB_TARGET Arnaldo Carvalho de Melo
                   ` (31 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Steven Rostedt,
	Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Decompose the do_install function to ease up
the following patch a little.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-zzs19yx8seyors532vuer37w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/Makefile | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile
index 8e44bea646ee..deeae5201ec9 100644
--- a/tools/lib/traceevent/Makefile
+++ b/tools/lib/traceevent/Makefile
@@ -236,10 +236,14 @@ TAGS:	force
 	find . -name '*.[ch]' | xargs etags \
 	--regex='/_PE(\([^,)]*\).*/PEVENT_ERRNO__\1/'
 
+define do_install_mkdir
+	if [ ! -d '$(DESTDIR_SQ)$1' ]; then		\
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$1';	\
+	fi
+endef
+
 define do_install
-	if [ ! -d '$(DESTDIR_SQ)$2' ]; then		\
-		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$2';	\
-	fi;						\
+	$(call do_install_mkdir,$2);			\
 	$(INSTALL) $(if $3,-m $3,) $1 '$(DESTDIR_SQ)$2'
 endef
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 07/37] tools lib traceevent: Rename LIB_FILE to LIB_TARGET
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 06/37] tools lib traceevent: Add do_install_mkdir Makefile function Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 08/37] tools lib traceevent: Add version for traceevent shared object Arnaldo Carvalho de Melo
                   ` (30 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Steven Rostedt,
	Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

To ease up following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-zpv5gd8y7clwrhh6dq03ucd5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/Makefile | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile
index deeae5201ec9..0d7e1725a0f8 100644
--- a/tools/lib/traceevent/Makefile
+++ b/tools/lib/traceevent/Makefile
@@ -99,7 +99,7 @@ libdir_SQ = $(subst ','\'',$(libdir))
 libdir_relative_SQ = $(subst ','\'',$(libdir_relative))
 plugin_dir_SQ = $(subst ','\'',$(plugin_dir))
 
-LIB_FILE = libtraceevent.a libtraceevent.so
+LIB_TARGET = libtraceevent.a libtraceevent.so
 
 CONFIG_INCLUDES = 
 CONFIG_LIBS	=
@@ -156,11 +156,11 @@ PLUGINS += plugin_cfg80211.so
 PLUGINS    := $(addprefix $(OUTPUT),$(PLUGINS))
 PLUGINS_IN := $(PLUGINS:.so=-in.o)
 
-TE_IN    := $(OUTPUT)libtraceevent-in.o
-LIB_FILE := $(addprefix $(OUTPUT),$(LIB_FILE))
+TE_IN      := $(OUTPUT)libtraceevent-in.o
+LIB_TARGET := $(addprefix $(OUTPUT),$(LIB_TARGET))
 DYNAMIC_LIST_FILE := $(OUTPUT)libtraceevent-dynamic-list
 
-CMD_TARGETS = $(LIB_FILE) $(PLUGINS) $(DYNAMIC_LIST_FILE)
+CMD_TARGETS = $(LIB_TARGET) $(PLUGINS) $(DYNAMIC_LIST_FILE)
 
 TARGETS = $(CMD_TARGETS)
 
@@ -261,8 +261,8 @@ define do_generate_dynamic_list_file
 endef
 
 install_lib: all_cmd install_plugins
-	$(call QUIET_INSTALL, $(LIB_FILE)) \
-		$(call do_install,$(LIB_FILE),$(libdir_SQ))
+	$(call QUIET_INSTALL, $(LIB_TARGET)) \
+		$(call do_install,$(LIB_TARGET),$(libdir_SQ))
 
 install_plugins: $(PLUGINS)
 	$(call QUIET_INSTALL, trace_plugins) \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 08/37] tools lib traceevent: Add version for traceevent shared object
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 07/37] tools lib traceevent: Rename LIB_FILE to LIB_TARGET Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 09/37] tools lib: Add for_each_clear_bit macro Arnaldo Carvalho de Melo
                   ` (29 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Namhyung Kim, Steven Rostedt,
	Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Adding version support for libtraceevent.so object.

Using the existing EVENT_PARSE_VERSION variable to construct
the .so object version string, which now consists of:

  $(EP_VERSION).$(EP_PATCHLEVEL).$(EP_EXTRAVERSION)

Looks like it was created for this purpose anyway.

The build will now produce following traeceevent libraries:

  $ ll libtraceevent*
  libtraceevent.a
  libtraceevent.so -> libtraceevent.so.1.1.0
  libtraceevent.so.1 -> libtraceevent.so.1.1.0
  libtraceevent.so.1.1.0

Also the install target will carry them:

  $ make DESTDIR=/tmp/krava prefix=/usr install
  INSTALL  trace_plugins
  INSTALL  libtraceevent.a
  INSTALL  libtraceevent.so.1.1.0

  $ find /tmp/krava/ | xargs ls -l
  ...
  /tmp/krava/usr/lib64:
  total 572
  libtraceevent.a
  libtraceevent.so -> libtraceevent.so.1.1.0
  libtraceevent.so.1 -> libtraceevent.so.1.1.0
  libtraceevent.so.1.1.0
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-v64z62fh0dwt0ueie5usrnac@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/Makefile | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile
index 0d7e1725a0f8..c76012ebdb9c 100644
--- a/tools/lib/traceevent/Makefile
+++ b/tools/lib/traceevent/Makefile
@@ -99,8 +99,6 @@ libdir_SQ = $(subst ','\'',$(libdir))
 libdir_relative_SQ = $(subst ','\'',$(libdir_relative))
 plugin_dir_SQ = $(subst ','\'',$(plugin_dir))
 
-LIB_TARGET = libtraceevent.a libtraceevent.so
-
 CONFIG_INCLUDES = 
 CONFIG_LIBS	=
 CONFIG_FLAGS	=
@@ -114,6 +112,9 @@ N		=
 
 EVENT_PARSE_VERSION = $(EP_VERSION).$(EP_PATCHLEVEL).$(EP_EXTRAVERSION)
 
+LIB_TARGET  = libtraceevent.a libtraceevent.so.$(EVENT_PARSE_VERSION)
+LIB_INSTALL = libtraceevent.a libtraceevent.so*
+
 INCLUDES = -I. -I $(srctree)/tools/include $(CONFIG_INCLUDES)
 
 # Set compile option CFLAGS
@@ -171,8 +172,10 @@ all_cmd: $(CMD_TARGETS)
 $(TE_IN): force
 	$(Q)$(MAKE) $(build)=libtraceevent
 
-$(OUTPUT)libtraceevent.so: $(TE_IN)
-	$(QUIET_LINK)$(CC) --shared $^ -o $@
+$(OUTPUT)libtraceevent.so.$(EVENT_PARSE_VERSION): $(TE_IN)
+	$(QUIET_LINK)$(CC) --shared $^ -Wl,-soname,libtraceevent.so.$(EP_VERSION) -o $@
+	@ln -sf $(@F) $(OUTPUT)libtraceevent.so
+	@ln -sf $(@F) $(OUTPUT)libtraceevent.so.$(EP_VERSION)
 
 $(OUTPUT)libtraceevent.a: $(TE_IN)
 	$(QUIET_LINK)$(RM) $@; $(AR) rcs $@ $^
@@ -262,7 +265,8 @@ endef
 
 install_lib: all_cmd install_plugins
 	$(call QUIET_INSTALL, $(LIB_TARGET)) \
-		$(call do_install,$(LIB_TARGET),$(libdir_SQ))
+		$(call do_install_mkdir,$(libdir_SQ)); \
+		cp -fpR $(LIB_INSTALL) $(DESTDIR)$(libdir_SQ)
 
 install_plugins: $(PLUGINS)
 	$(call QUIET_INSTALL, trace_plugins) \
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 09/37] tools lib: Add for_each_clear_bit macro
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 08/37] tools lib traceevent: Add version for traceevent shared object Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 10/37] perf report: Move captured info to generic header info Arnaldo Carvalho de Melo
                   ` (28 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Adrian Hunter, David Ahern,
	Namhyung Kim, Wang Nan, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Adding for_each_clear_bit macro plus all its the necessary backbone
functions. Taken from related kernel code. It will be used in following
patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-cayv2zbqi0nlmg5sjjxs1775@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/include/asm-generic/bitops.h       |  1 +
 tools/include/asm-generic/bitops/__ffz.h | 12 ++++++++++++
 tools/include/asm-generic/bitops/find.h  | 28 ++++++++++++++++++++++++++++
 tools/include/linux/bitops.h             |  5 +++++
 tools/lib/find_bit.c                     | 25 +++++++++++++++++++++++++
 tools/perf/MANIFEST                      |  1 +
 6 files changed, 72 insertions(+)
 create mode 100644 tools/include/asm-generic/bitops/__ffz.h

diff --git a/tools/include/asm-generic/bitops.h b/tools/include/asm-generic/bitops.h
index 653d1bad77de..0304600121da 100644
--- a/tools/include/asm-generic/bitops.h
+++ b/tools/include/asm-generic/bitops.h
@@ -13,6 +13,7 @@
  */
 
 #include <asm-generic/bitops/__ffs.h>
+#include <asm-generic/bitops/__ffz.h>
 #include <asm-generic/bitops/fls.h>
 #include <asm-generic/bitops/__fls.h>
 #include <asm-generic/bitops/fls64.h>
diff --git a/tools/include/asm-generic/bitops/__ffz.h b/tools/include/asm-generic/bitops/__ffz.h
new file mode 100644
index 000000000000..6744bd4cdf46
--- /dev/null
+++ b/tools/include/asm-generic/bitops/__ffz.h
@@ -0,0 +1,12 @@
+#ifndef _ASM_GENERIC_BITOPS_FFZ_H_
+#define _ASM_GENERIC_BITOPS_FFZ_H_
+
+/*
+ * ffz - find first zero in word.
+ * @word: The word to search
+ *
+ * Undefined if no zero exists, so code should check against ~0UL first.
+ */
+#define ffz(x)  __ffs(~(x))
+
+#endif /* _ASM_GENERIC_BITOPS_FFZ_H_ */
diff --git a/tools/include/asm-generic/bitops/find.h b/tools/include/asm-generic/bitops/find.h
index 31f51547fcd4..5538ecdc964a 100644
--- a/tools/include/asm-generic/bitops/find.h
+++ b/tools/include/asm-generic/bitops/find.h
@@ -15,6 +15,21 @@ extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
 		size, unsigned long offset);
 #endif
 
+#ifndef find_next_zero_bit
+
+/**
+ * find_next_zero_bit - find the next cleared bit in a memory region
+ * @addr: The address to base the search on
+ * @offset: The bitnumber to start searching at
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number of the next zero bit
+ * If no bits are zero, returns @size.
+ */
+unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
+				 unsigned long offset);
+#endif
+
 #ifndef find_first_bit
 
 /**
@@ -30,4 +45,17 @@ extern unsigned long find_first_bit(const unsigned long *addr,
 
 #endif /* find_first_bit */
 
+#ifndef find_first_zero_bit
+
+/**
+ * find_first_zero_bit - find the first cleared bit in a memory region
+ * @addr: The address to start the search at
+ * @size: The maximum number of bits to search
+ *
+ * Returns the bit number of the first cleared bit.
+ * If no bits are zero, returns @size.
+ */
+unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size);
+#endif
+
 #endif /*_TOOLS_LINUX_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/tools/include/linux/bitops.h b/tools/include/linux/bitops.h
index 49c929a104ee..fc446343ff41 100644
--- a/tools/include/linux/bitops.h
+++ b/tools/include/linux/bitops.h
@@ -39,6 +39,11 @@ extern unsigned long __sw_hweight64(__u64 w);
 	     (bit) < (size);					\
 	     (bit) = find_next_bit((addr), (size), (bit) + 1))
 
+#define for_each_clear_bit(bit, addr, size) \
+	for ((bit) = find_first_zero_bit((addr), (size));       \
+	     (bit) < (size);                                    \
+	     (bit) = find_next_zero_bit((addr), (size), (bit) + 1))
+
 /* same as for_each_set_bit() but use bit as value to start with */
 #define for_each_set_bit_from(bit, addr, size) \
 	for ((bit) = find_next_bit((addr), (size), (bit));	\
diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
index 9122a9e80046..6d8b8f22cf55 100644
--- a/tools/lib/find_bit.c
+++ b/tools/lib/find_bit.c
@@ -82,3 +82,28 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
 	return size;
 }
 #endif
+
+#ifndef find_first_zero_bit
+/*
+ * Find the first cleared bit in a memory region.
+ */
+unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+{
+	unsigned long idx;
+
+	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
+		if (addr[idx] != ~0UL)
+			return min(idx * BITS_PER_LONG + ffz(addr[idx]), size);
+	}
+
+	return size;
+}
+#endif
+
+#ifndef find_next_zero_bit
+unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
+				 unsigned long offset)
+{
+	return _find_next_bit(addr, size, offset, ~0UL);
+}
+#endif
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index 0bda2cca2b3a..a511e5f31e36 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -51,6 +51,7 @@ tools/include/asm-generic/bitops/arch_hweight.h
 tools/include/asm-generic/bitops/atomic.h
 tools/include/asm-generic/bitops/const_hweight.h
 tools/include/asm-generic/bitops/__ffs.h
+tools/include/asm-generic/bitops/__ffz.h
 tools/include/asm-generic/bitops/__fls.h
 tools/include/asm-generic/bitops/find.h
 tools/include/asm-generic/bitops/fls64.h
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 10/37] perf report: Move captured info to generic header info
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 09/37] tools lib: Add for_each_clear_bit macro Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 11/37] perf header: Display missing features Arnaldo Carvalho de Melo
                   ` (27 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Adrian Hunter, David Ahern,
	Namhyung Kim, Wang Nan, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

It's not displayed in TUI now, putting it into generic part.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5fk88kejqgi50ye7xdkhiloz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c  |  9 +++++++++
 tools/perf/util/session.c | 10 ----------
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 2f3eded54b0c..05f562760958 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2250,9 +2250,18 @@ int perf_header__fprintf_info(struct perf_session *session, FILE *fp, bool full)
 	struct header_print_data hd;
 	struct perf_header *header = &session->header;
 	int fd = perf_data_file__fd(session->file);
+	struct stat st;
+	int ret;
+
 	hd.fp = fp;
 	hd.full = full;
 
+	ret = fstat(fd, &st);
+	if (ret == -1)
+		return -1;
+
+	fprintf(fp, "# captured on: %s", ctime(&st.st_ctime));
+
 	perf_header__process_sections(header, fd, &hd,
 				      perf_file_section__fprintf_info);
 	return 0;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 5d61242a6e64..f268201048a0 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -2025,20 +2025,10 @@ int perf_session__cpu_bitmap(struct perf_session *session,
 void perf_session__fprintf_info(struct perf_session *session, FILE *fp,
 				bool full)
 {
-	struct stat st;
-	int fd, ret;
-
 	if (session == NULL || fp == NULL)
 		return;
 
-	fd = perf_data_file__fd(session->file);
-
-	ret = fstat(fd, &st);
-	if (ret == -1)
-		return;
-
 	fprintf(fp, "# ========\n");
-	fprintf(fp, "# captured on: %s", ctime(&st.st_ctime));
 	perf_header__fprintf_info(session, fp, full);
 	fprintf(fp, "# ========\n#\n");
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 11/37] perf header: Display missing features
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 10/37] perf report: Move captured info to generic header info Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 12/37] perf header: Display feature name on write failure Arnaldo Carvalho de Melo
                   ` (26 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Adrian Hunter, David Ahern,
	Namhyung Kim, Wang Nan, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Display missing features in header info, like:

  $ perf report --header-only
  # ========
  # captured on: Mon Oct 10 09:39:47 2016
  ...
  # missing features: HEADER_TRACING_DATA HEADER_CPU_TOPOLOGY ...

To help in diagnosing problems.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-bh5gp84gobdmyl345dcp64se@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 05f562760958..d6310c6dace5 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2251,7 +2251,7 @@ int perf_header__fprintf_info(struct perf_session *session, FILE *fp, bool full)
 	struct perf_header *header = &session->header;
 	int fd = perf_data_file__fd(session->file);
 	struct stat st;
-	int ret;
+	int ret, bit;
 
 	hd.fp = fp;
 	hd.full = full;
@@ -2264,6 +2264,14 @@ int perf_header__fprintf_info(struct perf_session *session, FILE *fp, bool full)
 
 	perf_header__process_sections(header, fd, &hd,
 				      perf_file_section__fprintf_info);
+
+	fprintf(fp, "# missing features: ");
+	for_each_clear_bit(bit, header->adds_features, HEADER_LAST_FEATURE) {
+		if (bit)
+			fprintf(fp, "%s ", feat_ops[bit].name);
+	}
+
+	fprintf(fp, "\n");
 	return 0;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 12/37] perf header: Display feature name on write failure
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 11/37] perf header: Display missing features Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 13/37] perf record: Improve documentation of event parameters Arnaldo Carvalho de Melo
                   ` (25 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Jiri Olsa, Adrian Hunter, David Ahern,
	Namhyung Kim, Wang Nan, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Display name of feature instead of just the number
during recording data.

Before:
  failed to write feature 13

Now:
  failed to write feature HEADER_CPU_TOPOLOGY

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-k9d9trozi5kkx737cy8n5xh5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index d6310c6dace5..d89c9c7ef4e5 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2290,7 +2290,7 @@ static int do_write_feat(int fd, struct perf_header *h, int type,
 
 		err = feat_ops[type].write(fd, h, evlist);
 		if (err < 0) {
-			pr_debug("failed to write feature %d\n", type);
+			pr_debug("failed to write feature %s\n", feat_ops[type].name);
 
 			/* undo anything written */
 			lseek(fd, (*p)->offset, SEEK_SET);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 13/37] perf record: Improve documentation of event parameters
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 12/37] perf header: Display feature name on write failure Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 14/37] perf tools: Implement branch_type event parameter Arnaldo Carvalho de Melo
                   ` (24 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

- Some editing (params -> parameters)
- Point to the now more complete list of parameters in the perf list
manpage.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1476381433-22959-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-record.txt | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 92335193dc33..27fc3617c6a4 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -45,9 +45,9 @@ OPTIONS
           param1 and param2 are defined as formats for the PMU in:
           /sys/bus/event_source/devices/<pmu>/format/*
 
-	  There are also some params which are not defined in .../<pmu>/format/*.
+	  There are also some parameters which are not defined in .../<pmu>/format/*.
 	  These params can be used to overload default config values per event.
-	  Here is a list of the params.
+	  Here are some common parameters:
 	  - 'period': Set event sampling period
 	  - 'freq': Set event sampling frequency
 	  - 'time': Disable/enable time stamping. Acceptable values are 1 for
@@ -57,8 +57,11 @@ OPTIONS
 			 FP mode, "dwarf" for DWARF mode, "lbr" for LBR mode and
 			 "no" for disable callgraph.
 	  - 'stack-size': user stack size for dwarf mode
+
+          See the linkperf:perf-list[1] man page for more parameters.
+
 	  Note: If user explicitly sets options which conflict with the params,
-	  the value set by the params will be overridden.
+	  the value set by the parameters will be overridden.
 
 	  Also not defined in .../<pmu>/format/* are PMU driver specific
 	  configuration parameters.  Any configuration parameter preceded by
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 14/37] perf tools: Implement branch_type event parameter
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 13/37] perf record: Improve documentation of event parameters Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 15/37] perf jit: Avoid returning garbage for a ret variable Arnaldo Carvalho de Melo
                   ` (23 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

It can be useful to specify branch type state per event, for example if
we want to collect both software trace points and last branch PMU events
in a single collection. Currently this doesn't work because the software
trace point errors out with -b.

There was already a branch-type parameter to configure branch sample
types per event in the parser, but it was stubbed out. This patch
implements the necessary plumbing to actually enable it.

Now:

  $ perf record -e sched:sched_switch,cpu/cpu-cycles,branch_type=any/ ...

works.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1476306127-19721-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c                |  9 ++++
 tools/perf/util/evsel.h                |  2 +
 tools/perf/util/parse-branch-options.c | 85 +++++++++++++++++++---------------
 tools/perf/util/parse-branch-options.h |  3 +-
 tools/perf/util/parse-events.c         | 15 ++++--
 5 files changed, 71 insertions(+), 43 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8bc271141d9d..e58a2fbf3b16 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -28,6 +28,7 @@
 #include "debug.h"
 #include "trace-event.h"
 #include "stat.h"
+#include "util/parse-branch-options.h"
 
 static struct {
 	bool sample_id_all;
@@ -708,6 +709,14 @@ static void apply_config_terms(struct perf_evsel *evsel,
 		case PERF_EVSEL__CONFIG_TERM_CALLGRAPH:
 			callgraph_buf = term->val.callgraph;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_BRANCH:
+			if (term->val.branch && strcmp(term->val.branch, "no")) {
+				perf_evsel__set_sample_bit(evsel, BRANCH_STACK);
+				parse_branch_str(term->val.branch,
+						 &attr->branch_sample_type);
+			} else
+				perf_evsel__reset_sample_bit(evsel, BRANCH_STACK);
+			break;
 		case PERF_EVSEL__CONFIG_TERM_STACK_USER:
 			dump_size = term->val.stack_user;
 			break;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index b1503b0ecdff..8cd7cd227483 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -47,6 +47,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_MAX_STACK,
 	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_DRV_CFG,
+	PERF_EVSEL__CONFIG_TERM_BRANCH,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -63,6 +64,7 @@ struct perf_evsel_config_term {
 		int	max_stack;
 		bool	inherit;
 		bool	overwrite;
+		char	*branch;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-branch-options.c b/tools/perf/util/parse-branch-options.c
index afc088dd7d20..3634d6974300 100644
--- a/tools/perf/util/parse-branch-options.c
+++ b/tools/perf/util/parse-branch-options.c
@@ -31,59 +31,51 @@ static const struct branch_mode branch_modes[] = {
 	BRANCH_END
 };
 
-int
-parse_branch_stack(const struct option *opt, const char *str, int unset)
+int parse_branch_str(const char *str, __u64 *mode)
 {
 #define ONLY_PLM \
 	(PERF_SAMPLE_BRANCH_USER	|\
 	 PERF_SAMPLE_BRANCH_KERNEL	|\
 	 PERF_SAMPLE_BRANCH_HV)
 
-	uint64_t *mode = (uint64_t *)opt->value;
+	int ret = 0;
+	char *p, *s;
+	char *os = NULL;
 	const struct branch_mode *br;
-	char *s, *os = NULL, *p;
-	int ret = -1;
 
-	if (unset)
+	if (str == NULL) {
+		*mode = PERF_SAMPLE_BRANCH_ANY;
 		return 0;
+	}
 
-	/*
-	 * cannot set it twice, -b + --branch-filter for instance
-	 */
-	if (*mode)
+	/* because str is read-only */
+	s = os = strdup(str);
+	if (!s)
 		return -1;
 
-	/* str may be NULL in case no arg is passed to -b */
-	if (str) {
-		/* because str is read-only */
-		s = os = strdup(str);
-		if (!s)
-			return -1;
-
-		for (;;) {
-			p = strchr(s, ',');
-			if (p)
-				*p = '\0';
-
-			for (br = branch_modes; br->name; br++) {
-				if (!strcasecmp(s, br->name))
-					break;
-			}
-			if (!br->name) {
-				ui__warning("unknown branch filter %s,"
-					    " check man page\n", s);
-				goto error;
-			}
-
-			*mode |= br->mode;
-
-			if (!p)
-				break;
+	for (;;) {
+		p = strchr(s, ',');
+		if (p)
+			*p = '\0';
 
-			s = p + 1;
+		for (br = branch_modes; br->name; br++) {
+			if (!strcasecmp(s, br->name))
+				break;
+		}
+		if (!br->name) {
+			ret = -1;
+			ui__warning("unknown branch filter %s,"
+				    " check man page\n", s);
+			goto error;
 		}
+
+		*mode |= br->mode;
+
+		if (!p)
+			break;
+
+		s = p + 1;
 	}
-	ret = 0;
 
 	/* default to any branch */
 	if ((*mode & ~ONLY_PLM) == 0) {
@@ -93,3 +85,20 @@ parse_branch_stack(const struct option *opt, const char *str, int unset)
 	free(os);
 	return ret;
 }
+
+int
+parse_branch_stack(const struct option *opt, const char *str, int unset)
+{
+	__u64 *mode = (__u64 *)opt->value;
+
+	if (unset)
+		return 0;
+
+	/*
+	 * cannot set it twice, -b + --branch-filter for instance
+	 */
+	if (*mode)
+		return -1;
+
+	return parse_branch_str(str, mode);
+}
diff --git a/tools/perf/util/parse-branch-options.h b/tools/perf/util/parse-branch-options.h
index b9d9470c2e82..6086fd90eb23 100644
--- a/tools/perf/util/parse-branch-options.h
+++ b/tools/perf/util/parse-branch-options.h
@@ -1,5 +1,6 @@
 #ifndef _PERF_PARSE_BRANCH_OPTIONS_H
 #define _PERF_PARSE_BRANCH_OPTIONS_H 1
-struct option;
+#include <stdint.h>
 int parse_branch_stack(const struct option *opt, const char *str, int unset);
+int parse_branch_str(const char *str, __u64 *mode);
 #endif /* _PERF_PARSE_BRANCH_OPTIONS_H */
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4e778eae1510..3c876b8ba4de 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -22,6 +22,7 @@
 #include "cpumap.h"
 #include "probe-file.h"
 #include "asm/bug.h"
+#include "util/parse-branch-options.h"
 
 #define MAX_NAME_LEN 100
 
@@ -973,10 +974,13 @@ do {									   \
 		CHECK_TYPE_VAL(NUM);
 		break;
 	case PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE:
-		/*
-		 * TODO uncomment when the field is available
-		 * attr->branch_sample_type = term->val.num;
-		 */
+		CHECK_TYPE_VAL(STR);
+		if (strcmp(term->val.str, "no") &&
+		    parse_branch_str(term->val.str, &attr->branch_sample_type)) {
+			err->str = strdup("invalid branch sample type");
+			err->idx = term->err_val;
+			return -EINVAL;
+		}
 		break;
 	case PARSE_EVENTS__TERM_TYPE_TIME:
 		CHECK_TYPE_VAL(NUM);
@@ -1119,6 +1123,9 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_CALLGRAPH:
 			ADD_CONFIG_TERM(CALLGRAPH, callgraph, term->val.str);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE:
+			ADD_CONFIG_TERM(BRANCH, branch, term->val.str);
+			break;
 		case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
 			ADD_CONFIG_TERM(STACK_USER, stack_user, term->val.num);
 			break;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 15/37] perf jit: Avoid returning garbage for a ret variable
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 14/37] perf tools: Implement branch_type event parameter Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 16/37] perf jit: Add NT_GNU_BUILD_ID definition for older distros Arnaldo Carvalho de Melo
                   ` (22 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Anton Blanchard,
	Jiri Olsa, Maciej Debski, Namhyung Kim, Peter Zijlstra,
	Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

When the loop body isn't executed at all, then the 'ret' local variable,
that is uninitialized will be used as the return value.

This triggers this error on Alpine Linux:

    CC	   /tmp/build/perf/util/demangle-java.o
    CC	   /tmp/build/perf/util/demangle-rust.o
    CC       /tmp/build/perf/util/jitdump.o
    CC       /tmp/build/perf/util/genelf.o
  util/jitdump.c: In function 'jit_process':
  util/jitdump.c:622:3: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
     fprintf(stderr, "injected: %s (%d)\n", path, ret);
     ^
  util/jitdump.c:584:6: note: 'ret' was declared here
    int ret;
        ^
    FLEX     /tmp/build/perf/util/parse-events-flex.c

  / $ gcc -v
  Using built-in specs.
  COLLECT_GCC=gcc
  COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-alpine-linux-musl/5.3.0/lto-wrapper
  Target: x86_64-alpine-linux-musl
  Configured with: /home/buildozer/aports/main/gcc/src/gcc-5.3.0/configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info
  +--build=x86_64-alpine-linux-musl --host=x86_64-alpine-linux-musl --target=x86_64-alpine-linux-musl --with-pkgversion='Alpine 5.3.0' --enable-checking=release
  +--disable-fixed-point --disable-libstdcxx-pch --disable-multilib --disable-nls --disable-werror --disable-symvers --enable-__cxa_atexit --enable-esp
  +--enable-cloog-backend --enable-languages=c,c++,objc,java,fortran,ada --disable-libssp --disable-libmudflap --disable-libsanitizer --enable-shared
  +--enable-threads --enable-tls --with-system-zlib
  Thread model: posix
  gcc version 5.3.0 (Alpine 5.3.0)

But this so far got under the radar, not causing any build problem, till the
"perf jit: enable jitdump support without dwarf" gets applied, when the above
problem takes place, some combination of inlining or whatever, the problem
is real, so fix it by initializing the variable to zero.

Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Maciej Debski <maciejd@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lkml.kernel.org/r/20161013200437.GA12815@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/jitdump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index 95f0884aae02..f3ed3c963c71 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -581,7 +581,7 @@ static int
 jit_process_dump(struct jit_buf_desc *jd)
 {
 	union jr_entry *jr;
-	int ret;
+	int ret = 0;
 
 	while ((jr = jit_get_next_entry(jd))) {
 		switch(jr->prefix.id) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 16/37] perf jit: Add NT_GNU_BUILD_ID definition for older distros
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 15/37] perf jit: Avoid returning garbage for a ret variable Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 17/37] perf jit: Improve error messages from JVMTI Arnaldo Carvalho de Melo
                   ` (21 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Anton Blanchard, David Ahern, Jiri Olsa, Maciej Debski,
	Namhyung Kim, Peter Zijlstra, Stephane Eranian, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Such as CentOS5, where such define is not present in elf.h.

This file, genelf.c, wasn't being built for several systems, because
it mistakenly was conditional on some DWARF features, now that it
is just needing libelf, after "perf jit: Enable jitdump support without
dwarf" it fails.

So, as preparation for "perf jit: Enable jitdump support without dwarf",
conditionally define it, if not available.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Maciej Debski <maciejd@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-k09qay1cmr0l3fzprmztzy3o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/genelf.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/genelf.c b/tools/perf/util/genelf.c
index c1ef805c6a8f..8a2995ea426f 100644
--- a/tools/perf/util/genelf.c
+++ b/tools/perf/util/genelf.c
@@ -25,6 +25,10 @@
 #include "genelf.h"
 #include "../util/jitdump.h"
 
+#ifndef NT_GNU_BUILD_ID
+#define NT_GNU_BUILD_ID 3
+#endif
+
 #define JVMTI
 
 #define BUILD_ID_URANDOM /* different uuid for each run */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 17/37] perf jit: Improve error messages from JVMTI
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 16/37] perf jit: Add NT_GNU_BUILD_ID definition for older distros Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 18/37] perf jit: Enable jitdump support without dwarf Arnaldo Carvalho de Melo
                   ` (20 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Stephane Eranian, Anton Blanchard, Jiri Olsa,
	Namhyung Kim, Nilay Vaish, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Stephane Eranian <eranian@google.com>

This patch improves the usefulness of error messages generated by the
JVMTI interfac.e This can help identify the root cause of a problem by
printing the actual error code. The patch adds a new helper function
called print_error().

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-2-git-send-email-eranian@google.com
[ Handle failure to convert numeric error to a string in print_error() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/jvmti/libjvmti.c | 39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/tools/perf/jvmti/libjvmti.c b/tools/perf/jvmti/libjvmti.c
index ac12e4b91a92..5612641c69b4 100644
--- a/tools/perf/jvmti/libjvmti.c
+++ b/tools/perf/jvmti/libjvmti.c
@@ -12,6 +12,19 @@
 static int has_line_numbers;
 void *jvmti_agent;
 
+static void print_error(jvmtiEnv *jvmti, const char *msg, jvmtiError ret)
+{
+	char *err_msg = NULL;
+	jvmtiError err;
+	err = (*jvmti)->GetErrorName(jvmti, ret, &err_msg);
+	if (err == JVMTI_ERROR_NONE) {
+		warnx("%s failed with %s", msg, err_msg);
+		(*jvmti)->Deallocate(jvmti, (unsigned char *)err_msg);
+	} else {
+		warnx("%s failed with an unknown error %d", msg, ret);
+	}
+}
+
 static jvmtiError
 do_get_line_numbers(jvmtiEnv *jvmti, void *pc, jmethodID m, jint bci,
 		    jvmti_line_info_t *tab, jint *nr)
@@ -22,8 +35,10 @@ do_get_line_numbers(jvmtiEnv *jvmti, void *pc, jmethodID m, jint bci,
 	jvmtiError ret;
 
 	ret = (*jvmti)->GetLineNumberTable(jvmti, m, &nr_lines, &loc_tab);
-	if (ret != JVMTI_ERROR_NONE)
+	if (ret != JVMTI_ERROR_NONE) {
+		print_error(jvmti, "GetLineNumberTable", ret);
 		return ret;
+	}
 
 	for (i = 0; i < nr_lines; i++) {
 		if (loc_tab[i].start_location < bci) {
@@ -71,6 +86,8 @@ get_line_numbers(jvmtiEnv *jvmti, const void *compile_info, jvmti_line_info_t **
 					/* free what was allocated for nothing */
 					(*jvmti)->Deallocate(jvmti, (unsigned char *)lne);
 					nr_total += (int)nr;
+				} else {
+					print_error(jvmti, "GetLineNumberTable", ret);
 				}
 			}
 		}
@@ -130,7 +147,7 @@ compiled_method_load_cb(jvmtiEnv *jvmti,
 	ret = (*jvmti)->GetMethodDeclaringClass(jvmti, method,
 						&decl_class);
 	if (ret != JVMTI_ERROR_NONE) {
-		warnx("jvmti: cannot get declaring class");
+		print_error(jvmti, "GetMethodDeclaringClass", ret);
 		return;
 	}
 
@@ -144,21 +161,21 @@ compiled_method_load_cb(jvmtiEnv *jvmti,
 
 	ret = (*jvmti)->GetSourceFileName(jvmti, decl_class, &file_name);
 	if (ret != JVMTI_ERROR_NONE) {
-		warnx("jvmti: cannot get source filename ret=%d", ret);
+		print_error(jvmti, "GetSourceFileName", ret);
 		goto error;
 	}
 
 	ret = (*jvmti)->GetClassSignature(jvmti, decl_class,
 					  &class_sign, NULL);
 	if (ret != JVMTI_ERROR_NONE) {
-		warnx("jvmti: getclassignature failed");
+		print_error(jvmti, "GetClassSignature", ret);
 		goto error;
 	}
 
 	ret = (*jvmti)->GetMethodName(jvmti, method, &func_name,
 				      &func_sign, NULL);
 	if (ret != JVMTI_ERROR_NONE) {
-		warnx("jvmti: failed getmethodname");
+		print_error(jvmti, "GetMethodName", ret);
 		goto error;
 	}
 
@@ -253,7 +270,7 @@ Agent_OnLoad(JavaVM *jvm, char *options, void *reserved __unused)
 
 	ret = (*jvmti)->AddCapabilities(jvmti, &caps1);
 	if (ret != JVMTI_ERROR_NONE) {
-		warnx("jvmti: acquire compiled_method capability failed");
+		print_error(jvmti, "AddCapabilities", ret);
 		return -1;
 	}
 	ret = (*jvmti)->GetJLocationFormat(jvmti, &format);
@@ -264,7 +281,9 @@ Agent_OnLoad(JavaVM *jvm, char *options, void *reserved __unused)
 		ret = (*jvmti)->AddCapabilities(jvmti, &caps1);
                 if (ret == JVMTI_ERROR_NONE)
                         has_line_numbers = 1;
-        }
+        } else if (ret != JVMTI_ERROR_NONE)
+		print_error(jvmti, "GetJLocationFormat", ret);
+
 
 	memset(&cb, 0, sizeof(cb));
 
@@ -273,21 +292,21 @@ Agent_OnLoad(JavaVM *jvm, char *options, void *reserved __unused)
 
 	ret = (*jvmti)->SetEventCallbacks(jvmti, &cb, sizeof(cb));
 	if (ret != JVMTI_ERROR_NONE) {
-		warnx("jvmti: cannot set event callbacks");
+		print_error(jvmti, "SetEventCallbacks", ret);
 		return -1;
 	}
 
 	ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
 			JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
 	if (ret != JVMTI_ERROR_NONE) {
-		warnx("jvmti: setnotification failed for method_load");
+		print_error(jvmti, "SetEventNotificationMode(METHOD_LOAD)", ret);
 		return -1;
 	}
 
 	ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
 			JVMTI_EVENT_DYNAMIC_CODE_GENERATED, NULL);
 	if (ret != JVMTI_ERROR_NONE) {
-		warnx("jvmti: setnotification failed on code_generated");
+		print_error(jvmti, "SetEventNotificationMode(CODE_GENERATED)", ret);
 		return -1;
 	}
 	return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 18/37] perf jit: Enable jitdump support without dwarf
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (16 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 17/37] perf jit: Improve error messages from JVMTI Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 19/37] perf jit: Remove unecessary padding in jitdump file Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Maciej Debski, Anton Blanchard, Jiri Olsa,
	Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Maciej Debski <maciejd@google.com>

This patch modifies the build dependencies on the jitdump support in
perf. As it stands jitdump was wrongfully made dependent 100% on using
DWARF. However, the dwarf dependency, only exist if generating the
source line table in genelf_debug.c. The rest of the support does not
need DWARF.

This patch removes the dependency on DWARF for the entire jitdump
support. It keeps it only for the genelf_debug.c support.

Signed-off-by: Maciej Debski <maciejd@google.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-3-git-send-email-eranian@google.com
Fixes: e12b202f8fb9 ("perf jitdump: Build only on supported archs")
[ Make it build only if NO_LIBELF isn't defined, as jitdump.o will only be built in that case ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.config | 2 +-
 tools/perf/util/Build      | 2 +-
 tools/perf/util/genelf.c   | 9 +++++++--
 tools/perf/util/genelf.h   | 2 ++
 4 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 72edf83d76b7..cffdd9cf3ebf 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -366,7 +366,7 @@ ifndef NO_SDT
 endif
 
 ifdef PERF_HAVE_JITDUMP
-  ifndef NO_DWARF
+  ifndef NO_LIBELF
     $(call detected,CONFIG_JITDUMP)
     CFLAGS += -DHAVE_JITDUMP
   endif
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index eb60e613d795..1dc67efad634 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -120,7 +120,7 @@ libperf-y += demangle-rust.o
 ifdef CONFIG_JITDUMP
 libperf-$(CONFIG_LIBELF) += jitdump.o
 libperf-$(CONFIG_LIBELF) += genelf.o
-libperf-$(CONFIG_LIBELF) += genelf_debug.o
+libperf-$(CONFIG_DWARF) += genelf_debug.o
 endif
 
 CFLAGS_config.o   += -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
diff --git a/tools/perf/util/genelf.c b/tools/perf/util/genelf.c
index 8a2995ea426f..30dece345b9f 100644
--- a/tools/perf/util/genelf.c
+++ b/tools/perf/util/genelf.c
@@ -19,7 +19,9 @@
 #include <limits.h>
 #include <fcntl.h>
 #include <err.h>
+#ifdef HAVE_DWARF_SUPPORT
 #include <dwarf.h>
+#endif
 
 #include "perf.h"
 #include "genelf.h"
@@ -161,7 +163,7 @@ gen_build_id(struct buildid_note *note, unsigned long load_addr, const void *cod
 int
 jit_write_elf(int fd, uint64_t load_addr, const char *sym,
 	      const void *code, int csize,
-	      void *debug, int nr_debug_entries)
+	      void *debug __maybe_unused, int nr_debug_entries __maybe_unused)
 {
 	Elf *e;
 	Elf_Data *d;
@@ -390,11 +392,14 @@ jit_write_elf(int fd, uint64_t load_addr, const char *sym,
 	shdr->sh_size = sizeof(bnote);
 	shdr->sh_entsize = 0;
 
+#ifdef HAVE_DWARF_SUPPORT
 	if (debug && nr_debug_entries) {
 		retval = jit_add_debug_info(e, load_addr, debug, nr_debug_entries);
 		if (retval)
 			goto error;
-	} else {
+	} else
+#endif
+	{
 		if (elf_update(e, ELF_C_WRITE) < 0) {
 			warnx("elf_update 4 failed");
 			goto error;
diff --git a/tools/perf/util/genelf.h b/tools/perf/util/genelf.h
index 2fbeb59c4bdd..5c933ac71451 100644
--- a/tools/perf/util/genelf.h
+++ b/tools/perf/util/genelf.h
@@ -4,8 +4,10 @@
 /* genelf.c */
 int jit_write_elf(int fd, uint64_t code_addr, const char *sym,
 		  const void *code, int csize, void *debug, int nr_debug_entries);
+#ifdef HAVE_DWARF_SUPPORT
 /* genelf_debug.c */
 int jit_add_debug_info(Elf *e, uint64_t code_addr, void *debug, int nr_debug_entries);
+#endif
 
 #if   defined(__arm__)
 #define GEN_ELF_ARCH	EM_ARM
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 19/37] perf jit: Remove unecessary padding in jitdump file
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (17 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 18/37] perf jit: Enable jitdump support without dwarf Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 20/37] perf jit: Make perf skip unknown records Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Stephane Eranian, Anton Blanchard, Jiri Olsa,
	Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Stephane Eranian <eranian@google.com>

This patch removes all the string padding generated in the jitdump file.
They are not necessary and were adding unnecessary complexity. Modern
processors can handle unaligned accesses quite well. The perf.data/
jitdump file are always post-processed, no need to add extra complexity
for no real gain.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-4-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/jvmti/jvmti_agent.c | 38 +-------------------------------------
 1 file changed, 1 insertion(+), 37 deletions(-)

diff --git a/tools/perf/jvmti/jvmti_agent.c b/tools/perf/jvmti/jvmti_agent.c
index 55daefff0d54..e9651a9d670e 100644
--- a/tools/perf/jvmti/jvmti_agent.c
+++ b/tools/perf/jvmti/jvmti_agent.c
@@ -44,11 +44,6 @@
 static char jit_path[PATH_MAX];
 static void *marker_addr;
 
-/*
- * padding buffer
- */
-static const char pad_bytes[7];
-
 static inline pid_t gettid(void)
 {
 	return (pid_t)syscall(__NR_gettid);
@@ -230,7 +225,6 @@ init_arch_timestamp(void)
 
 void *jvmti_open(void)
 {
-	int pad_cnt;
 	char dump_path[PATH_MAX];
 	struct jitheader header;
 	int fd;
@@ -288,10 +282,6 @@ void *jvmti_open(void)
 	header.total_size = sizeof(header);
 	header.pid        = getpid();
 
-	/* calculate amount of padding '\0' */
-	pad_cnt = PADDING_8ALIGNED(header.total_size);
-	header.total_size += pad_cnt;
-
 	header.timestamp = perf_get_timestamp();
 
 	if (use_arch_timestamp)
@@ -301,13 +291,6 @@ void *jvmti_open(void)
 		warn("jvmti: cannot write dumpfile header");
 		goto error;
 	}
-
-	/* write padding '\0' if necessary */
-	if (pad_cnt && !fwrite(pad_bytes, pad_cnt, 1, fp)) {
-		warn("jvmti: cannot write dumpfile header padding");
-		goto error;
-	}
-
 	return fp;
 error:
 	fclose(fp);
@@ -349,7 +332,6 @@ jvmti_write_code(void *agent, char const *sym,
 	static int code_generation = 1;
 	struct jr_code_load rec;
 	size_t sym_len;
-	size_t padding_count;
 	FILE *fp = agent;
 	int ret = -1;
 
@@ -366,8 +348,6 @@ jvmti_write_code(void *agent, char const *sym,
 
 	rec.p.id           = JIT_CODE_LOAD;
 	rec.p.total_size   = sizeof(rec) + sym_len;
-	padding_count      = PADDING_8ALIGNED(rec.p.total_size);
-	rec.p. total_size += padding_count;
 	rec.p.timestamp    = perf_get_timestamp();
 
 	rec.code_size  = size;
@@ -393,9 +373,6 @@ jvmti_write_code(void *agent, char const *sym,
 	ret = fwrite_unlocked(&rec, sizeof(rec), 1, fp);
 	fwrite_unlocked(sym, sym_len, 1, fp);
 
-	if (padding_count)
-		fwrite_unlocked(pad_bytes, padding_count, 1, fp);
-
 	if (code)
 		fwrite_unlocked(code, size, 1, fp);
 
@@ -412,7 +389,6 @@ jvmti_write_debug_info(void *agent, uint64_t code, const char *file,
 {
 	struct jr_code_debug_info rec;
 	size_t sret, len, size, flen;
-	size_t padding_count;
 	uint64_t addr;
 	const char *fn = file;
 	FILE *fp = agent;
@@ -443,16 +419,10 @@ jvmti_write_debug_info(void *agent, uint64_t code, const char *file,
 	 * int      : line number
 	 * int      : column discriminator
 	 * file[]   : source file name
-	 * padding  : pad to multiple of 8 bytes
 	 */
 	size += nr_lines * sizeof(struct debug_entry);
 	size += flen * nr_lines;
-	/*
-	 * pad to 8 bytes
-	 */
-	padding_count = PADDING_8ALIGNED(size);
-
-	rec.p.total_size = size + padding_count;
+	rec.p.total_size = size;
 
 	/*
 	 * If JVM is multi-threaded, nultiple concurrent calls to agent
@@ -486,12 +456,6 @@ jvmti_write_debug_info(void *agent, uint64_t code, const char *file,
 		if (sret != 1)
 			goto error;
 	}
-	if (padding_count) {
-		sret = fwrite_unlocked(pad_bytes, padding_count, 1, fp);
-		if (sret != 1)
-			goto error;
-	}
-
 	funlockfile(fp);
 	return 0;
 error:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 20/37] perf jit: Make perf skip unknown records
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (18 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 19/37] perf jit: Remove unecessary padding in jitdump file Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 21/37] perf jit: Do not assume pgoff is zero Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Stefano Sanfilippo, Ross McIlroy, Anton Blanchard,
	Jiri Olsa, Namhyung Kim, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Stefano Sanfilippo <ssanfilippo@chromium.org>

The behavior before this commit was to skip the remaining portion of the
jitdump in case an unknown record was found, including those records
that perf could handle.

With this change, parsing a record with an unknown id will cause a
warning to be emitted, the record will be skipped and parsing will
resume from the next (valid) one.

The patch aims at making perf more future proof, by extracting as much
information as possible from jitdumps.

Signed-off-by: Stefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-5-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/jitdump.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index f3ed3c963c71..75b66bbf8429 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -263,8 +263,7 @@ jit_get_next_entry(struct jit_buf_desc *jd)
 		return NULL;
 
 	if (id >= JIT_CODE_MAX) {
-		pr_warning("next_entry: unknown prefix %d, skipping\n", id);
-		return NULL;
+		pr_warning("next_entry: unknown record type %d, skipping\n", id);
 	}
 	if (bs > jd->bufsize) {
 		void *n;
@@ -322,7 +321,8 @@ jit_get_next_entry(struct jit_buf_desc *jd)
 		break;
 	case JIT_CODE_MAX:
 	default:
-		return NULL;
+		/* skip unknown record (we have read them) */
+		break;
 	}
 	return jr;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 21/37] perf jit: Do not assume pgoff is zero
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (19 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 20/37] perf jit: Make perf skip unknown records Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 22/37] perf jit: Add unwinding support Arnaldo Carvalho de Melo
                   ` (16 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Stefano Sanfilippo, Ross McIlroy, Anton Blanchard,
	Jiri Olsa, Namhyung Kim, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Stefano Sanfilippo <ssanfilippo@chromium.org>

When calculating .eh_frame_hdr base and LUT offsets do not always assume
that pgoff is zero.

The assumption is false for DSOs built from the jitdump by perf inject,
because the ELF header did not exist in memory at sampling time.

Signed-off-by: Stefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-6-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/unwind-libunwind-local.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index 20c2e5743903..6fec84dff3f7 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -357,8 +357,8 @@ find_proc_info(unw_addr_space_t as, unw_word_t ip, unw_proc_info_t *pi,
 		di.format   = UNW_INFO_FORMAT_REMOTE_TABLE;
 		di.start_ip = map->start;
 		di.end_ip   = map->end;
-		di.u.rti.segbase    = map->start + segbase;
-		di.u.rti.table_data = map->start + table_data;
+		di.u.rti.segbase    = map->start + segbase - map->pgoff;
+		di.u.rti.table_data = map->start + table_data - map->pgoff;
 		di.u.rti.table_len  = fde_count * sizeof(struct table_entry)
 				      / sizeof(unw_word_t);
 		ret = dwarf_search_unwind_table(as, ip, &di, pi,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 22/37] perf jit: Add unwinding support
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (20 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 21/37] perf jit: Do not assume pgoff is zero Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 23/37] perf jit: Generate .eh_frame/.eh_frame_hdr in DSO Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Stefano Sanfilippo, Ross McIlroy, Anton Blanchard,
	Jiri Olsa, Namhyung Kim, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Stefano Sanfilippo <ssanfilippo@chromium.org>

This record is intended to provide unwinding information in the
eh_frame format. This is required to unwind JITed code which
does not maintain the frame pointer register during function calls.

The eh_frame unwinding information can be emitted by V8 / Chromium
when the --perf_prof_unwinding_info is passed.

A record of type jr_code_unwinding_info comes before the jr_code_load
it referred to and contains both the .eh_frame and .eh_frame_hdr.

The fields in the header have the following meaning:

  * unwinding_size: size of the eh_frame and eh_frame_hdr, necessary
    for distinguishing the content from the padding.

  * eh_frame_hdr_size: as the name says.

  * mapped_size: size of the payload that was in memory at runtime.
    typically unwinding_size if the .eh_frame_hdr and .eh_frame were
    mapped, or 0 if they weren't. It should always be the former case,
    since the .eh_frame is guaranteed to be mapped in memory. However,
    certain JITs might want to inject an .eh_frame_hdr with an empty LUT
    to trigger fp-based unwinding fallback in libunwind. The only part
    of the .eh_frame_hdr that libunwind reads from remote memory is the
    LUT, and since there is none, mapping the unwinding info in memory
    is not necessary, and 0 in this field signifies that it wasn't.
    This practical hack allows to save bytes in code memory for those
    JIT compilers that might or might not maintain a valid frame pointer.

The payload that follows is assumed to contain first the .eh_frame and
then the .eh_header_hdr, with no padding between the two.

Signed-off-by: Stefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-7-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/jitdump.c | 57 ++++++++++++++++++++++++++++++++++++++++++++---
 tools/perf/util/jitdump.h | 12 ++++++++++
 2 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index 75b66bbf8429..9bae66cc78f2 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -37,6 +37,10 @@ struct jit_buf_desc {
 	bool		 needs_bswap; /* handles cross-endianess */
 	bool		 use_arch_timestamp;
 	void		 *debug_data;
+	void		 *unwinding_data;
+	uint64_t	 unwinding_size;
+	uint64_t	 unwinding_mapped_size;
+	uint64_t         eh_frame_hdr_size;
 	size_t		 nr_debug_entries;
 	uint32_t         code_load_count;
 	u64		 bytes_written;
@@ -295,6 +299,13 @@ jit_get_next_entry(struct jit_buf_desc *jd)
 			}
 		}
 		break;
+	case JIT_CODE_UNWINDING_INFO:
+		if (jd->needs_bswap) {
+			jr->unwinding.unwinding_size = bswap_64(jr->unwinding.unwinding_size);
+			jr->unwinding.eh_frame_hdr_size = bswap_64(jr->unwinding.eh_frame_hdr_size);
+			jr->unwinding.mapped_size = bswap_64(jr->unwinding.mapped_size);
+		}
+		break;
 	case JIT_CODE_CLOSE:
 		break;
 	case JIT_CODE_LOAD:
@@ -370,7 +381,7 @@ static int jit_repipe_code_load(struct jit_buf_desc *jd, union jr_entry *jr)
 	u16 idr_size;
 	const char *sym;
 	uint32_t count;
-	int ret, csize;
+	int ret, csize, usize;
 	pid_t pid, tid;
 	struct {
 		u32 pid, tid;
@@ -380,6 +391,7 @@ static int jit_repipe_code_load(struct jit_buf_desc *jd, union jr_entry *jr)
 	pid   = jr->load.pid;
 	tid   = jr->load.tid;
 	csize = jr->load.code_size;
+	usize = jd->unwinding_mapped_size;
 	addr  = jr->load.code_addr;
 	sym   = (void *)((unsigned long)jr + sizeof(jr->load));
 	code  = (unsigned long)jr + jr->load.p.total_size - csize;
@@ -408,6 +420,14 @@ static int jit_repipe_code_load(struct jit_buf_desc *jd, union jr_entry *jr)
 		jd->nr_debug_entries = 0;
 	}
 
+	if (jd->unwinding_data && jd->eh_frame_hdr_size) {
+		free(jd->unwinding_data);
+		jd->unwinding_data = NULL;
+		jd->eh_frame_hdr_size = 0;
+		jd->unwinding_mapped_size = 0;
+		jd->unwinding_size = 0;
+	}
+
 	if (ret) {
 		free(event);
 		return -1;
@@ -422,7 +442,7 @@ static int jit_repipe_code_load(struct jit_buf_desc *jd, union jr_entry *jr)
 
 	event->mmap2.pgoff = GEN_ELF_TEXT_OFFSET;
 	event->mmap2.start = addr;
-	event->mmap2.len   = csize;
+	event->mmap2.len   = usize ? ALIGN_8(csize) + usize : csize;
 	event->mmap2.pid   = pid;
 	event->mmap2.tid   = tid;
 	event->mmap2.ino   = st.st_ino;
@@ -473,6 +493,7 @@ static int jit_repipe_code_move(struct jit_buf_desc *jd, union jr_entry *jr)
 	char *filename;
 	size_t size;
 	struct stat st;
+	int usize;
 	u16 idr_size;
 	int ret;
 	pid_t pid, tid;
@@ -483,6 +504,7 @@ static int jit_repipe_code_move(struct jit_buf_desc *jd, union jr_entry *jr)
 
 	pid = jr->move.pid;
 	tid =  jr->move.tid;
+	usize = jd->unwinding_mapped_size;
 	idr_size = jd->machine->id_hdr_size;
 
 	/*
@@ -511,7 +533,8 @@ static int jit_repipe_code_move(struct jit_buf_desc *jd, union jr_entry *jr)
 			(sizeof(event->mmap2.filename) - size) + idr_size);
 	event->mmap2.pgoff = GEN_ELF_TEXT_OFFSET;
 	event->mmap2.start = jr->move.new_code_addr;
-	event->mmap2.len   = jr->move.code_size;
+	event->mmap2.len   = usize ? ALIGN_8(jr->move.code_size) + usize
+				   : jr->move.code_size;
 	event->mmap2.pid   = pid;
 	event->mmap2.tid   = tid;
 	event->mmap2.ino   = st.st_ino;
@@ -578,6 +601,31 @@ static int jit_repipe_debug_info(struct jit_buf_desc *jd, union jr_entry *jr)
 }
 
 static int
+jit_repipe_unwinding_info(struct jit_buf_desc *jd, union jr_entry *jr)
+{
+	void *unwinding_data;
+	uint32_t unwinding_data_size;
+
+	if (!(jd && jr))
+		return -1;
+
+	unwinding_data_size  = jr->prefix.total_size - sizeof(jr->unwinding);
+	unwinding_data = malloc(unwinding_data_size);
+	if (!unwinding_data)
+		return -1;
+
+	memcpy(unwinding_data, &jr->unwinding.unwinding_data,
+	       unwinding_data_size);
+
+	jd->eh_frame_hdr_size = jr->unwinding.eh_frame_hdr_size;
+	jd->unwinding_size = jr->unwinding.unwinding_size;
+	jd->unwinding_mapped_size = jr->unwinding.mapped_size;
+	jd->unwinding_data = unwinding_data;
+
+	return 0;
+}
+
+static int
 jit_process_dump(struct jit_buf_desc *jd)
 {
 	union jr_entry *jr;
@@ -594,6 +642,9 @@ jit_process_dump(struct jit_buf_desc *jd)
 		case JIT_CODE_DEBUG_INFO:
 			ret = jit_repipe_debug_info(jd, jr);
 			break;
+		case JIT_CODE_UNWINDING_INFO:
+			ret = jit_repipe_unwinding_info(jd, jr);
+			break;
 		default:
 			ret = 0;
 			continue;
diff --git a/tools/perf/util/jitdump.h b/tools/perf/util/jitdump.h
index bcacd20d0c1c..c6b9b67f43bf 100644
--- a/tools/perf/util/jitdump.h
+++ b/tools/perf/util/jitdump.h
@@ -19,6 +19,7 @@
 #define JITHEADER_MAGIC_SW	0x4454694A
 
 #define PADDING_8ALIGNED(x) ((((x) + 7) & 7) ^ 7)
+#define ALIGN_8(x) (((x) + 7) & (~7))
 
 #define JITHEADER_VERSION 1
 
@@ -48,6 +49,7 @@ enum jit_record_type {
         JIT_CODE_MOVE           = 1,
 	JIT_CODE_DEBUG_INFO	= 2,
 	JIT_CODE_CLOSE		= 3,
+	JIT_CODE_UNWINDING_INFO	= 4,
 
 	JIT_CODE_MAX,
 };
@@ -101,12 +103,22 @@ struct jr_code_debug_info {
 	struct debug_entry entries[0];
 };
 
+struct jr_code_unwinding_info {
+	struct jr_prefix p;
+
+	uint64_t unwinding_size;
+	uint64_t eh_frame_hdr_size;
+	uint64_t mapped_size;
+	const char unwinding_data[0];
+};
+
 union jr_entry {
         struct jr_code_debug_info info;
         struct jr_code_close close;
         struct jr_code_load load;
         struct jr_code_move move;
         struct jr_prefix prefix;
+        struct jr_code_unwinding_info unwinding;
 };
 
 static inline struct debug_entry *
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 23/37] perf jit: Generate .eh_frame/.eh_frame_hdr in DSO
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (21 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 22/37] perf jit: Add unwinding support Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 24/37] perf jit: Check JITHEADER_VERSION Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Stefano Sanfilippo, Ross McIlroy, Anton Blanchard,
	Jiri Olsa, Namhyung Kim, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Stefano Sanfilippo <ssanfilippo@chromium.org>

When the jit_buf_desc contains unwinding information, it is emitted as
eh_frame unwinding sections in the DSOs generated by perf inject.

The unwinding information is required to unwind of JITed code which do
not maintain the frame pointer register during function calls.  It can
be emitted by V8 / Chromium when the --perf_prof_unwinding_info is
passed to V8.

The eh_frame and eh_frame_hdr sections are emitted immediately after the
.text.

The .eh_frame is aligned at a 8-byte boundary, and .eh_frame_hdr at a
4-byte one. Since size of the .eh_frame is required to be a multiple of
the word size, which means there will never be additional padding
between it and the .eh_frame_hdr on machines where the word size is 4 or
8 bytes.

However, additional padding might be inserted between .text and
.eh_frame to reach the correct alignment, which will always be 8 bytes,
also on 32bit machines. The reasoning behind this choice is that 4 extra
bytes of padding worst case are not a large cost for the advantage of
removing word-size dependent offset calculations when emitting the
jitdump.

Signed-off-by: Stefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-8-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/genelf.c  | 102 ++++++++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/genelf.h  |   3 +-
 tools/perf/util/jitdump.c |  11 +++--
 3 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/genelf.c b/tools/perf/util/genelf.c
index 30dece345b9f..c540d47583e7 100644
--- a/tools/perf/util/genelf.c
+++ b/tools/perf/util/genelf.c
@@ -73,6 +73,8 @@ static char shd_string_table[] = {
 	'.', 'd', 'e', 'b', 'u', 'g', '_', 'l', 'i', 'n', 'e', 0, /* 52 */
 	'.', 'd', 'e', 'b', 'u', 'g', '_', 'i', 'n', 'f', 'o', 0, /* 64 */
 	'.', 'd', 'e', 'b', 'u', 'g', '_', 'a', 'b', 'b', 'r', 'e', 'v', 0, /* 76 */
+	'.', 'e', 'h', '_', 'f', 'r', 'a', 'm', 'e', '_', 'h', 'd', 'r', 0, /* 90 */
+	'.', 'e', 'h', '_', 'f', 'r', 'a', 'm', 'e', 0, /* 104 */
 };
 
 static struct buildid_note {
@@ -153,6 +155,86 @@ gen_build_id(struct buildid_note *note, unsigned long load_addr, const void *cod
 }
 #endif
 
+static int
+jit_add_eh_frame_info(Elf *e, void* unwinding, uint64_t unwinding_header_size,
+		      uint64_t unwinding_size, uint64_t base_offset)
+{
+	Elf_Data *d;
+	Elf_Scn *scn;
+	Elf_Shdr *shdr;
+	uint64_t unwinding_table_size = unwinding_size - unwinding_header_size;
+
+	/*
+	 * setup eh_frame section
+	 */
+	scn = elf_newscn(e);
+	if (!scn) {
+		warnx("cannot create section");
+		return -1;
+	}
+
+	d = elf_newdata(scn);
+	if (!d) {
+		warnx("cannot get new data");
+		return -1;
+	}
+
+	d->d_align = 8;
+	d->d_off = 0LL;
+	d->d_buf = unwinding;
+	d->d_type = ELF_T_BYTE;
+	d->d_size = unwinding_table_size;
+	d->d_version = EV_CURRENT;
+
+	shdr = elf_getshdr(scn);
+	if (!shdr) {
+		warnx("cannot get section header");
+		return -1;
+	}
+
+	shdr->sh_name = 104;
+	shdr->sh_type = SHT_PROGBITS;
+	shdr->sh_addr = base_offset;
+	shdr->sh_flags = SHF_ALLOC;
+	shdr->sh_entsize = 0;
+
+	/*
+	 * setup eh_frame_hdr section
+	 */
+	scn = elf_newscn(e);
+	if (!scn) {
+		warnx("cannot create section");
+		return -1;
+	}
+
+	d = elf_newdata(scn);
+	if (!d) {
+		warnx("cannot get new data");
+		return -1;
+	}
+
+	d->d_align = 4;
+	d->d_off = 0LL;
+	d->d_buf = unwinding + unwinding_table_size;
+	d->d_type = ELF_T_BYTE;
+	d->d_size = unwinding_header_size;
+	d->d_version = EV_CURRENT;
+
+	shdr = elf_getshdr(scn);
+	if (!shdr) {
+		warnx("cannot get section header");
+		return -1;
+	}
+
+	shdr->sh_name = 90;
+	shdr->sh_type = SHT_PROGBITS;
+	shdr->sh_addr = base_offset + unwinding_table_size;
+	shdr->sh_flags = SHF_ALLOC;
+	shdr->sh_entsize = 0;
+
+	return 0;
+}
+
 /*
  * fd: file descriptor open for writing for the output file
  * load_addr: code load address (could be zero, just used for buildid)
@@ -163,13 +245,15 @@ gen_build_id(struct buildid_note *note, unsigned long load_addr, const void *cod
 int
 jit_write_elf(int fd, uint64_t load_addr, const char *sym,
 	      const void *code, int csize,
-	      void *debug __maybe_unused, int nr_debug_entries __maybe_unused)
+	      void *debug __maybe_unused, int nr_debug_entries __maybe_unused,
+	      void *unwinding, uint64_t unwinding_header_size, uint64_t unwinding_size)
 {
 	Elf *e;
 	Elf_Data *d;
 	Elf_Scn *scn;
 	Elf_Ehdr *ehdr;
 	Elf_Shdr *shdr;
+	uint64_t eh_frame_base_offset;
 	char *strsym = NULL;
 	int symlen;
 	int retval = -1;
@@ -200,7 +284,7 @@ jit_write_elf(int fd, uint64_t load_addr, const char *sym,
 	ehdr->e_type = ET_DYN;
 	ehdr->e_entry = GEN_ELF_TEXT_OFFSET;
 	ehdr->e_version = EV_CURRENT;
-	ehdr->e_shstrndx= 2; /* shdr index for section name */
+	ehdr->e_shstrndx= unwinding ? 4 : 2; /* shdr index for section name */
 
 	/*
 	 * setup text section
@@ -237,6 +321,18 @@ jit_write_elf(int fd, uint64_t load_addr, const char *sym,
 	shdr->sh_entsize = 0;
 
 	/*
+	 * Setup .eh_frame_hdr and .eh_frame
+	 */
+	if (unwinding) {
+		eh_frame_base_offset = ALIGN_8(GEN_ELF_TEXT_OFFSET + csize);
+		retval = jit_add_eh_frame_info(e, unwinding,
+					       unwinding_header_size, unwinding_size,
+					       eh_frame_base_offset);
+		if (retval)
+			goto error;
+	}
+
+	/*
 	 * setup section headers string table
 	 */
 	scn = elf_newscn(e);
@@ -304,7 +400,7 @@ jit_write_elf(int fd, uint64_t load_addr, const char *sym,
 	shdr->sh_type = SHT_SYMTAB;
 	shdr->sh_flags = 0;
 	shdr->sh_entsize = sizeof(Elf_Sym);
-	shdr->sh_link = 4; /* index of .strtab section */
+	shdr->sh_link = unwinding ? 6 : 4; /* index of .strtab section */
 
 	/*
 	 * setup symbols string table
diff --git a/tools/perf/util/genelf.h b/tools/perf/util/genelf.h
index 5c933ac71451..2424bd9862a3 100644
--- a/tools/perf/util/genelf.h
+++ b/tools/perf/util/genelf.h
@@ -3,7 +3,8 @@
 
 /* genelf.c */
 int jit_write_elf(int fd, uint64_t code_addr, const char *sym,
-		  const void *code, int csize, void *debug, int nr_debug_entries);
+		  const void *code, int csize, void *debug, int nr_debug_entries,
+		  void *unwinding, uint64_t unwinding_header_size, uint64_t unwinding_size);
 #ifdef HAVE_DWARF_SUPPORT
 /* genelf_debug.c */
 int jit_add_debug_info(Elf *e, uint64_t code_addr, void *debug, int nr_debug_entries);
diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index 9bae66cc78f2..6a2688da3c4a 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -72,7 +72,10 @@ jit_emit_elf(char *filename,
 	     const void *code,
 	     int csize,
 	     void *debug,
-	     int nr_debug_entries)
+	     int nr_debug_entries,
+	     void *unwinding,
+	     uint32_t unwinding_header_size,
+	     uint32_t unwinding_size)
 {
 	int ret, fd;
 
@@ -85,7 +88,8 @@ jit_emit_elf(char *filename,
 		return -1;
 	}
 
-        ret = jit_write_elf(fd, code_addr, sym, (const void *)code, csize, debug, nr_debug_entries);
+	ret = jit_write_elf(fd, code_addr, sym, (const void *)code, csize, debug, nr_debug_entries,
+			    unwinding, unwinding_header_size, unwinding_size);
 
         close(fd);
 
@@ -412,7 +416,8 @@ static int jit_repipe_code_load(struct jit_buf_desc *jd, union jr_entry *jr)
 
 	size = PERF_ALIGN(size, sizeof(u64));
 	uaddr = (uintptr_t)code;
-	ret = jit_emit_elf(filename, sym, addr, (const void *)uaddr, csize, jd->debug_data, jd->nr_debug_entries);
+	ret = jit_emit_elf(filename, sym, addr, (const void *)uaddr, csize, jd->debug_data, jd->nr_debug_entries,
+			   jd->unwinding_data, jd->eh_frame_hdr_size, jd->unwinding_size);
 
 	if (jd->debug_data && jd->nr_debug_entries) {
 		free(jd->debug_data);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 24/37] perf jit: Check JITHEADER_VERSION
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (22 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 23/37] perf jit: Generate .eh_frame/.eh_frame_hdr in DSO Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 25/37] perf jit: Add jitdump format specification document Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Stefano Sanfilippo, Ross McIlroy, Anton Blanchard,
	Jiri Olsa, Namhyung Kim, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Stefano Sanfilippo <ssanfilippo@chromium.org>

Check the version number when opening a jitdump file.  Accept older
versions, but not newer ones.

Signed-off-by: Stefano Sanfilippo <ssanfilippo@chromium.org>
Signed-off-by: Ross McIlroy <rmcilroy@chromium.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-9-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/jitdump.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index 6a2688da3c4a..c9a941ef0f6d 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -180,6 +180,12 @@ jit_open(struct jit_buf_desc *jd, const char *name)
 			header.elf_mach,
 			jd->use_arch_timestamp);
 
+	if (header.version > JITHEADER_VERSION) {
+		pr_err("wrong jitdump version %u, expected " STR(JITHEADER_VERSION),
+			header.version);
+		goto error;
+	}
+
 	if (header.flags & JITDUMP_FLAGS_RESERVED) {
 		pr_err("jitdump file contains invalid or unsupported flags 0x%llx\n",
 		       (unsigned long long)header.flags & JITDUMP_FLAGS_RESERVED);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 25/37] perf jit: Add jitdump format specification document
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (23 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 24/37] perf jit: Check JITHEADER_VERSION Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 26/37] perf pmu: Only print Using CPUID message once Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Stephane Eranian, Anton Blanchard, Jiri Olsa,
	Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Stephane Eranian <eranian@google.com>

This patch adds a formal specification of the jitdump format. The goal
is to help jit runtime developers implement the jitdump support without
having to read the jvmti code.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-10-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/jitdump-specification.txt | 170 +++++++++++++++++++++
 1 file changed, 170 insertions(+)
 create mode 100644 tools/perf/Documentation/jitdump-specification.txt

diff --git a/tools/perf/Documentation/jitdump-specification.txt b/tools/perf/Documentation/jitdump-specification.txt
new file mode 100644
index 000000000000..4c62b0713651
--- /dev/null
+++ b/tools/perf/Documentation/jitdump-specification.txt
@@ -0,0 +1,170 @@
+JITDUMP specification version 2
+Last Revised: 09/15/2016
+Author: Stephane Eranian <eranian@gmail.com>
+
+--------------------------------------------------------
+| Revision  |    Date    | Description                 |
+--------------------------------------------------------
+|   1       | 09/07/2016 | Initial revision            |
+--------------------------------------------------------
+|   2       | 09/15/2016 | Add JIT_CODE_UNWINDING_INFO |
+--------------------------------------------------------
+
+
+I/ Introduction
+
+
+This document describes the jitdump file format. The file is generated by Just-In-time compiler runtimes to save meta-data information about the generated code, such as address, size, and name of generated functions, the native code generated, the source line information. The data may then be used by performance tools, such as Linux perf to generate function and assembly level profiles.
+
+The format is not specific to any particular programming language. It can be extended as need be.
+
+The format of the file is binary. It is self-describing in terms of endianness and is portable across multiple processor architectures.
+
+
+II/ Overview of the format
+
+
+The format requires only sequential accesses, i.e., append only mode. The file starts with a fixed size file header describing the version of the specification, the endianness.
+
+The header is followed by a series of records, each starting with a fixed size header describing the type of record and its size. It is, itself, followed by the payload for the record. Records can have a variable size even for a given type.
+
+Each entry in the file is timestamped. All timestamps must use the same clock source. The CLOCK_MONOTONIC clock source is recommended.
+
+
+III/ Jitdump file header format
+
+Each jitdump file starts with a fixed size header containing the following fields in order:
+
+
+* uint32_t magic     : a magic number tagging the file type. The value is 4-byte long and represents the string "JiTD" in ASCII form. It is 0x4A695444 or 0x4454694a depending on the endianness. The field can be used to detect the endianness of the file
+* uint32_t version   : a 4-byte value representing the format version. It is currently set to 2
+* uint32_t total_size: size in bytes of file header
+* uint32_t elf_mach  : ELF architecture encoding (ELF e_machine value as specified in /usr/include/elf.h)
+* uint32_t pad1      : padding. Reserved for future use
+* uint32_t pid       : JIT runtime process identification (OS specific)
+* uint64_t timestamp : timestamp of when the file was created
+* uint64_t flags     : a bitmask of flags
+
+The flags currently defined are as follows:
+ * bit 0: JITDUMP_FLAGS_ARCH_TIMESTAMP : set if the jitdump file is using an architecture-specific timestamp clock source. For instance, on x86, one could use TSC directly
+
+IV/ Record header
+
+The file header is immediately followed by records. Each record starts with a fixed size header describing the record that follows.
+
+The record header is specified in order as follows:
+* uint32_t id        : a value identifying the record type (see below)
+* uint32_t total_size: the size in bytes of the record including the header.
+* uint64_t timestamp : a timestamp of when the record was created.
+
+The following record types are defined:
+ * Value 0 : JIT_CODE_LOAD      : record describing a jitted function
+ * Value 1 : JIT_CODE_MOVE      : record describing an already jitted function which is moved
+ * Value 2 : JIT_CODE_DEBUG_INFO: record describing the debug information for a jitted function
+ * Value 3 : JIT_CODE_CLOSE     : record marking the end of the jit runtime (optional)
+ * Value 4 : JIT_CODE_UNWINDING_INFO: record describing a function unwinding information
+
+ The payload of the record must immediately follow the record header without padding.
+
+V/ JIT_CODE_LOAD record
+
+
+  The record has the following fields following the fixed-size record header in order:
+  * uint32_t pid: OS process id of the runtime generating the jitted code
+  * uint32_t tid: OS thread identification of the runtime thread generating the jitted code
+  * uint64_t vma: virtual address of jitted code start
+  * uint64_t code_addr: code start address for the jitted code. By default vma = code_addr
+  * uint64_t code_size: size in bytes of the generated jitted code
+  * uint64_t code_index: unique identifier for the jitted code (see below)
+  * char[n]: function name in ASCII including the null termination
+  * native code: raw byte encoding of the jitted code
+
+  The record header total_size field is inclusive of all components:
+  * record header
+  * fixed-sized fields
+  * function name string, including termination
+  * native code length
+  * record specific variable data (e.g., array of data entries)
+
+The code_index is used to uniquely identify each jitted function. The index can be a monotonically increasing 64-bit value. Each time a function is jitted it gets a new number. This value is used in case the code for a function is moved and avoids having to issue another JIT_CODE_LOAD record.
+
+The format supports empty functions with no native code.
+
+
+VI/ JIT_CODE_MOVE record
+
+  The record type is optional.
+
+  The record has the following fields following the fixed-size record header in order:
+  * uint32_t pid          : OS process id of the runtime generating the jitted code
+  * uint32_t tid          : OS thread identification of the runtime thread generating the jitted code
+  * uint64_t vma          : new virtual address of jitted code start
+  * uint64_t old_code_addr: previous code address for the same function
+  * uint64_t new_code_addr: alternate new code started address for the jitted code. By default it should be equal to the vma address.
+  * uint64_t code_size    : size in bytes of the jitted code
+  * uint64_t code_index   : index referring to the JIT_CODE_LOAD code_index record of when the function was initially jitted
+
+
+The MOVE record can be used in case an already jitted function is simply moved by the runtime inside the code cache.
+
+The JIT_CODE_MOVE record cannot come before the JIT_CODE_LOAD record for the same function name. The function cannot have changed name, otherwise a new JIT_CODE_LOAD record must be emitted.
+
+The code size of the function cannot change.
+
+
+VII/ JIT_DEBUG_INFO record
+
+The record type is optional.
+
+The record contains source lines debug information, i.e., a way to map a code address back to a source line. This information may be used by the performance tool.
+
+The record has the following fields following the fixed-size record header in order:
+  * uint64_t code_addr: address of function for which the debug information is generated
+  * uint64_t nr_entry : number of debug entries for the function
+  * debug_entry[n]: array of nr_entry debug entries for the function
+
+The debug_entry describes the source line information. It is defined as follows in order:
+* uint64_t code_addr: address of function for which the debug information is generated
+* uint32_t line     : source file line number (starting at 1)
+* uint32_t discrim  : column discriminator, 0 is default
+* char name[n]      : source file name in ASCII, including null termination
+
+The debug_entry entries are saved in sequence but given that they have variable sizes due to the file name string, they cannot be indexed directly.
+They need to be walked sequentially. The next debug_entry is found at sizeof(debug_entry) + strlen(name) + 1.
+
+IMPORTANT:
+  The JIT_CODE_DEBUG for a given function must always be generated BEFORE the JIT_CODE_LOAD for the function. This facilitates greatly the parser for the jitdump file.
+
+
+VIII/ JIT_CODE_CLOSE record
+
+
+The record type is optional.
+
+The record is used as a marker for the end of the jitted runtime. It can be replaced by the end of the file.
+
+The JIT_CODE_CLOSE record does not have any specific fields, the record header contains all the information needed.
+
+
+IX/ JIT_CODE_UNWINDING_INFO
+
+
+The record type is optional.
+
+The record is used to describe the unwinding information for a jitted function.
+
+The record has the following fields following the fixed-size record header in order:
+
+uint64_t unwind_data_size   : the size in bytes of the unwinding data table at the end of the record
+uint64_t eh_frame_hdr_size  : the size in bytes of the DWARF EH Frame Header at the start of the unwinding data table at the end of the record
+uint64_t mapped_size        : the size of the unwinding data mapped in memory
+const char unwinding_data[n]: an array of unwinding data, consisting of the EH Frame Header, followed by the actual EH Frame
+
+
+The EH Frame header follows the Linux Standard Base (LSB) specification as described in the document at https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/ehframehdr.html
+
+
+The EH Frame follows the LSB specicfication as described in the document at https://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/ehframechpt.html
+
+
+NOTE: The mapped_size is generally either the same as unwind_data_size (if the unwinding data was mapped in memory by the running process) or zero (if the unwinding data is not mapped by the process). If the unwinding data was not mapped, then only the EH Frame Header will be read, which can be used to specify FP based unwinding for a function which does not have unwinding information.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 26/37] perf pmu: Only print Using CPUID message once
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (24 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 25/37] perf jit: Add jitdump format specification document Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 27/37] perf tools: Fix typo "No enough" to "Not enough" Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Andi Kleen, Jiri Olsa, Stephane Eranian,
	Sukadev Bhattiprolu, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

With uncore event aliases which are duplicated over multiple PMUs the
"Using CPUID" message with -v could be printed many times.  Only print
it once.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1476393332-20732-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index b1474dcadfa2..d7174f340b53 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -504,6 +504,7 @@ static void pmu_add_cpu_aliases(struct list_head *head)
 	struct pmu_events_map *map;
 	struct pmu_event *pe;
 	char *cpuid;
+	static bool printed;
 
 	cpuid = getenv("PERF_CPUID");
 	if (cpuid)
@@ -513,7 +514,10 @@ static void pmu_add_cpu_aliases(struct list_head *head)
 	if (!cpuid)
 		return;
 
-	pr_debug("Using CPUID %s\n", cpuid);
+	if (!printed) {
+		pr_debug("Using CPUID %s\n", cpuid);
+		printed = true;
+	}
 
 	i = 0;
 	while (1) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 27/37] perf tools: Fix typo "No enough" to "Not enough"
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (25 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 26/37] perf pmu: Only print Using CPUID message once Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 28/37] perf hists browser: Dynamically change verbosity level Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Alexander Alemayhu, Alexander Shishkin,
	Peter Zijlstra, Wang Nan, Arnaldo Carvalho de Melo

From: Alexander Alemayhu <alexander@alemayhu.com>

The latter version occurs much more when running git grep.

Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161013161811.4939-1-alexander@alemayhu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/backward-ring-buffer.c |  2 +-
 tools/perf/tests/bpf.c                  |  2 +-
 tools/perf/util/bpf-loader.c            | 14 +++++++-------
 tools/perf/util/llvm-utils.c            |  2 +-
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index e6d1816e431a..42e892b1e979 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -97,7 +97,7 @@ int test__backward_ring_buffer(int subtest __maybe_unused)
 
 	evlist = perf_evlist__new();
 	if (!evlist) {
-		pr_debug("No enough memory to create evlist\n");
+		pr_debug("Not enough memory to create evlist\n");
 		return TEST_FAIL;
 	}
 
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 2673e86ed50f..8f0298aff222 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -125,7 +125,7 @@ static int do_test(struct bpf_object *obj, int (*func)(void),
 	/* Instead of perf_evlist__new_default, don't add default events */
 	evlist = perf_evlist__new();
 	if (!evlist) {
-		pr_debug("No enough memory to create evlist\n");
+		pr_debug("Not enough memory to create evlist\n");
 		return TEST_FAIL;
 	}
 
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 2b2c9b82f5ab..a5fd275238f7 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -241,7 +241,7 @@ parse_prog_config_kvpair(const char *config_str, struct perf_probe_event *pev)
 	int err = 0;
 
 	if (!text) {
-		pr_debug("No enough memory: dup config_str failed\n");
+		pr_debug("Not enough memory: dup config_str failed\n");
 		return ERR_PTR(-ENOMEM);
 	}
 
@@ -531,7 +531,7 @@ static int map_prologue(struct perf_probe_event *pev, int *mapping,
 
 	ptevs = malloc(array_sz);
 	if (!ptevs) {
-		pr_debug("No enough memory: alloc ptevs failed\n");
+		pr_debug("Not enough memory: alloc ptevs failed\n");
 		return -ENOMEM;
 	}
 
@@ -604,13 +604,13 @@ static int hook_load_preprocessor(struct bpf_program *prog)
 	priv->need_prologue = true;
 	priv->insns_buf = malloc(sizeof(struct bpf_insn) * BPF_MAXINSNS);
 	if (!priv->insns_buf) {
-		pr_debug("No enough memory: alloc insns_buf failed\n");
+		pr_debug("Not enough memory: alloc insns_buf failed\n");
 		return -ENOMEM;
 	}
 
 	priv->type_mapping = malloc(sizeof(int) * pev->ntevs);
 	if (!priv->type_mapping) {
-		pr_debug("No enough memory: alloc type_mapping failed\n");
+		pr_debug("Not enough memory: alloc type_mapping failed\n");
 		return -ENOMEM;
 	}
 	memset(priv->type_mapping, -1,
@@ -864,7 +864,7 @@ bpf_map_op_setkey(struct bpf_map_op *op, struct parse_events_term *term)
 
 		op->k.array.ranges = memdup(term->array.ranges, memsz);
 		if (!op->k.array.ranges) {
-			pr_debug("No enough memory to alloc indices for map\n");
+			pr_debug("Not enough memory to alloc indices for map\n");
 			return -ENOMEM;
 		}
 		op->key_type = BPF_MAP_KEY_RANGES;
@@ -929,7 +929,7 @@ bpf_map_priv__clone(struct bpf_map_priv *priv)
 
 	newpriv = zalloc(sizeof(*newpriv));
 	if (!newpriv) {
-		pr_debug("No enough memory to alloc map private\n");
+		pr_debug("Not enough memory to alloc map private\n");
 		return NULL;
 	}
 	INIT_LIST_HEAD(&newpriv->ops_list);
@@ -960,7 +960,7 @@ bpf_map__add_op(struct bpf_map *map, struct bpf_map_op *op)
 	if (!priv) {
 		priv = zalloc(sizeof(*priv));
 		if (!priv) {
-			pr_debug("No enough memory to alloc map private\n");
+			pr_debug("Not enough memory to alloc map private\n");
 			return -ENOMEM;
 		}
 		INIT_LIST_HEAD(&priv->ops_list);
diff --git a/tools/perf/util/llvm-utils.c b/tools/perf/util/llvm-utils.c
index bf7216b8731d..27b6f303720a 100644
--- a/tools/perf/util/llvm-utils.c
+++ b/tools/perf/util/llvm-utils.c
@@ -339,7 +339,7 @@ dump_obj(const char *path, void *obj_buf, size_t size)
 	char *p;
 
 	if (!obj_path) {
-		pr_warning("WARNING: No enough memory, skip object dumping\n");
+		pr_warning("WARNING: Not enough memory, skip object dumping\n");
 		return;
 	}
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 28/37] perf hists browser: Dynamically change verbosity level
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (26 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 27/37] perf tools: Fix typo "No enough" to "Not enough" Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 29/37] perf trace: Implement --delay Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Alexis Berlemont, Alexander Shishkin,
	Peter Zijlstra, Arnaldo Carvalho de Melo

From: Alexis Berlemont <alexis.berlemont@gmail.com>

Here is a small patch which tries to fulfill a point in the perf todo
list:

* Make pressing 'V' multiple times to go on cycling thru various
  verbosity levels in 'perf top', so that info that is present in
  'perf top -v' can be obtained without having to restart the tool
  (acme).

After a small grep in the code, the max verbosity level seems 3; so,
we cycle at 4; I did not dare define a MAX_VERBOSE_LEVEL constant.

Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161012214823.14324-2-alexis.berlemont@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/browsers/hists.c |  5 ++++-
 tools/perf/util/map.c          | 17 ++++++++++++-----
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 31d6d5a7c2dc..ddc4c3e59cc1 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2807,7 +2807,10 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
 			do_zoom_dso(browser, actions);
 			continue;
 		case 'V':
-			browser->show_dso = !browser->show_dso;
+			verbose = (verbose + 1) % 4;
+			browser->show_dso = verbose > 0;
+			ui_helpline__fpush("Verbosity level set to %d\n",
+					   verbose);
 			continue;
 		case 't':
 			actions->thread = thread;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index c662fef95d14..4f9a71c63026 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -682,9 +682,16 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp
 			continue;
 
 		if (verbose >= 2) {
-			fputs("overlapping maps:\n", fp);
-			map__fprintf(map, fp);
-			map__fprintf(pos, fp);
+
+			if (use_browser) {
+				pr_warning("overlapping maps in %s "
+					   "(disable tui for more info)\n",
+					   map->dso->name);
+			} else {
+				fputs("overlapping maps:\n", fp);
+				map__fprintf(map, fp);
+				map__fprintf(pos, fp);
+			}
 		}
 
 		rb_erase_init(&pos->rb_node, root);
@@ -702,7 +709,7 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp
 
 			before->end = map->start;
 			__map_groups__insert(pos->groups, before);
-			if (verbose >= 2)
+			if (verbose >= 2 && !use_browser)
 				map__fprintf(before, fp);
 			map__put(before);
 		}
@@ -717,7 +724,7 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp
 
 			after->start = map->end;
 			__map_groups__insert(pos->groups, after);
-			if (verbose >= 2)
+			if (verbose >= 2 && !use_browser)
 				map__fprintf(after, fp);
 			map__put(after);
 		}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 29/37] perf trace: Implement --delay
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (27 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 28/37] perf hists browser: Dynamically change verbosity level Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 30/37] perf bench mem: Move boilerplate memory allocation to the infrastructure Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Alexis Berlemont, Alexander Shishkin,
	Peter Zijlstra, Arnaldo Carvalho de Melo

From: Alexis Berlemont <alexis.berlemont@gmail.com>

In the perf wiki todo-list[1], there is an entry regarding initial-delay
and 'perf trace'; the following small patch tries to fulfill this point.
It has been generated against the branch tip/perf/core.

It has only been implemented in the "trace__run" case.

Ex.:

  $ sudo strace -- ./perf trace --delay 5 sleep 1 2>&1
  ...
  fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK)  = 0
  ioctl(7, PERF_EVENT_IOC_ID, 0x7ffc8fd35718) = 0
  ioctl(11, PERF_EVENT_IOC_SET_OUTPUT, 0x7) = 0
  fcntl(11, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  ioctl(11, PERF_EVENT_IOC_ID, 0x7ffc8fd35718) = 0
  write(6, "\0", 1)                       = 1
  close(6)                                = 0
  nanosleep({0, 5000000}, NULL)           = 0  # DELAY OF 5 MS BEFORE ENABLING THE EVENTS
  ioctl(3, PERF_EVENT_IOC_ENABLE, 0)      = 0
  ioctl(4, PERF_EVENT_IOC_ENABLE, 0)      = 0
  ioctl(5, PERF_EVENT_IOC_ENABLE, 0)      = 0
  ioctl(7, PERF_EVENT_IOC_ENABLE, 0)      = 0
  ...

[1]: https://perf.wiki.kernel.org/index.php/Todo

Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161010054328.4028-2-alexis.berlemont@gmail.com
[ Add entry to the manpage, cut'n'pasted from stat's and record's ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-trace.txt |  5 +++++
 tools/perf/builtin-trace.c              | 10 +++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
index 1ab0782369b1..781b019751a4 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -39,6 +39,11 @@ OPTIONS
 	Prefixing with ! shows all syscalls but the ones specified.  You may
 	need to escape it.
 
+-D msecs::
+--delay msecs::
+After starting the program, wait msecs before measuring. This is useful to
+filter out the startup phase of the program, which is often very different.
+
 -o::
 --output=::
 	Output file name.
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index c298bd3e1d90..0bae454e8efa 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2310,12 +2310,17 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	if (err < 0)
 		goto out_error_mmap;
 
-	if (!target__none(&trace->opts.target))
+	if (!target__none(&trace->opts.target) && !trace->opts.initial_delay)
 		perf_evlist__enable(evlist);
 
 	if (forks)
 		perf_evlist__start_workload(evlist);
 
+	if (trace->opts.initial_delay) {
+		usleep(trace->opts.initial_delay * 1000);
+		perf_evlist__enable(evlist);
+	}
+
 	trace->multiple_threads = thread_map__pid(evlist->threads, 0) == -1 ||
 				  evlist->threads->nr > 1 ||
 				  perf_evlist__first(evlist)->attr.inherit;
@@ -2816,6 +2821,9 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "Default: kernel.perf_event_max_stack or " __stringify(PERF_MAX_STACK_DEPTH)),
 	OPT_UINTEGER(0, "proc-map-timeout", &trace.opts.proc_map_timeout,
 			"per thread proc mmap processing timeout in ms"),
+	OPT_UINTEGER('D', "delay", &trace.opts.initial_delay,
+		     "ms to wait before starting measurement after program "
+		     "start"),
 	OPT_END()
 	};
 	bool __maybe_unused max_stack_user_set = true;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 30/37] perf bench mem: Move boilerplate memory allocation to the infrastructure
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (28 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 29/37] perf trace: Implement --delay Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 31/37] perf tools: Normalize sq_quote_argv() error reporting Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Hitoshi Mitake, Jiri Olsa, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Instead of having all tests perform alloc/free, do it in the code that
calls the do_cycles() and do_gettimeofday() functions.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lywj4mbdb1m9x1z9asivwuuy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/mem-functions.c | 77 ++++++++++++++++------------------------
 1 file changed, 30 insertions(+), 47 deletions(-)

diff --git a/tools/perf/bench/mem-functions.c b/tools/perf/bench/mem-functions.c
index c684910e5a48..52504a83b5a1 100644
--- a/tools/perf/bench/mem-functions.c
+++ b/tools/perf/bench/mem-functions.c
@@ -106,9 +106,10 @@ static double timeval2double(struct timeval *ts)
 
 struct bench_mem_info {
 	const struct function *functions;
-	u64 (*do_cycles)(const struct function *r, size_t size);
-	double (*do_gettimeofday)(const struct function *r, size_t size);
+	u64 (*do_cycles)(const struct function *r, size_t size, void *src, void *dst);
+	double (*do_gettimeofday)(const struct function *r, size_t size, void *src, void *dst);
 	const char *const *usage;
+	bool alloc_src;
 };
 
 static void __bench_mem_function(struct bench_mem_info *info, int r_idx, size_t size, double size_total)
@@ -116,16 +117,26 @@ static void __bench_mem_function(struct bench_mem_info *info, int r_idx, size_t
 	const struct function *r = &info->functions[r_idx];
 	double result_bps = 0.0;
 	u64 result_cycles = 0;
+	void *src = NULL, *dst = zalloc(size);
 
 	printf("# function '%s' (%s)\n", r->name, r->desc);
 
+	if (dst == NULL)
+		goto out_alloc_failed;
+
+	if (info->alloc_src) {
+		src = zalloc(size);
+		if (src == NULL)
+			goto out_alloc_failed;
+	}
+
 	if (bench_format == BENCH_FORMAT_DEFAULT)
 		printf("# Copying %s bytes ...\n\n", size_str);
 
 	if (use_cycles) {
-		result_cycles = info->do_cycles(r, size);
+		result_cycles = info->do_cycles(r, size, src, dst);
 	} else {
-		result_bps = info->do_gettimeofday(r, size);
+		result_bps = info->do_gettimeofday(r, size, src, dst);
 	}
 
 	switch (bench_format) {
@@ -149,6 +160,14 @@ static void __bench_mem_function(struct bench_mem_info *info, int r_idx, size_t
 		BUG_ON(1);
 		break;
 	}
+
+out_free:
+	free(src);
+	free(dst);
+	return;
+out_alloc_failed:
+	printf("# Memory allocation failed - maybe size (%s) is too large?\n", size_str);
+	goto out_free;
 }
 
 static int bench_mem_common(int argc, const char **argv, struct bench_mem_info *info)
@@ -201,28 +220,14 @@ static int bench_mem_common(int argc, const char **argv, struct bench_mem_info *
 	return 0;
 }
 
-static void memcpy_alloc_mem(void **dst, void **src, size_t size)
-{
-	*dst = zalloc(size);
-	if (!*dst)
-		die("memory allocation failed - maybe size is too large?\n");
-
-	*src = zalloc(size);
-	if (!*src)
-		die("memory allocation failed - maybe size is too large?\n");
-
-	/* Make sure to always prefault zero pages even if MMAP_THRESH is crossed: */
-	memset(*src, 0, size);
-}
-
-static u64 do_memcpy_cycles(const struct function *r, size_t size)
+static u64 do_memcpy_cycles(const struct function *r, size_t size, void *src, void *dst)
 {
 	u64 cycle_start = 0ULL, cycle_end = 0ULL;
-	void *src = NULL, *dst = NULL;
 	memcpy_t fn = r->fn.memcpy;
 	int i;
 
-	memcpy_alloc_mem(&dst, &src, size);
+	/* Make sure to always prefault zero pages even if MMAP_THRESH is crossed: */
+	memset(src, 0, size);
 
 	/*
 	 * We prefault the freshly allocated memory range here,
@@ -235,20 +240,15 @@ static u64 do_memcpy_cycles(const struct function *r, size_t size)
 		fn(dst, src, size);
 	cycle_end = get_cycles();
 
-	free(src);
-	free(dst);
 	return cycle_end - cycle_start;
 }
 
-static double do_memcpy_gettimeofday(const struct function *r, size_t size)
+static double do_memcpy_gettimeofday(const struct function *r, size_t size, void *src, void *dst)
 {
 	struct timeval tv_start, tv_end, tv_diff;
 	memcpy_t fn = r->fn.memcpy;
-	void *src = NULL, *dst = NULL;
 	int i;
 
-	memcpy_alloc_mem(&dst, &src, size);
-
 	/*
 	 * We prefault the freshly allocated memory range here,
 	 * to not measure page fault overhead:
@@ -262,9 +262,6 @@ static double do_memcpy_gettimeofday(const struct function *r, size_t size)
 
 	timersub(&tv_end, &tv_start, &tv_diff);
 
-	free(src);
-	free(dst);
-
 	return (double)(((double)size * nr_loops) / timeval2double(&tv_diff));
 }
 
@@ -294,27 +291,18 @@ int bench_mem_memcpy(int argc, const char **argv, const char *prefix __maybe_unu
 		.do_cycles		= do_memcpy_cycles,
 		.do_gettimeofday	= do_memcpy_gettimeofday,
 		.usage			= bench_mem_memcpy_usage,
+		.alloc_src              = true,
 	};
 
 	return bench_mem_common(argc, argv, &info);
 }
 
-static void memset_alloc_mem(void **dst, size_t size)
-{
-	*dst = zalloc(size);
-	if (!*dst)
-		die("memory allocation failed - maybe size is too large?\n");
-}
-
-static u64 do_memset_cycles(const struct function *r, size_t size)
+static u64 do_memset_cycles(const struct function *r, size_t size, void *src __maybe_unused, void *dst)
 {
 	u64 cycle_start = 0ULL, cycle_end = 0ULL;
 	memset_t fn = r->fn.memset;
-	void *dst = NULL;
 	int i;
 
-	memset_alloc_mem(&dst, size);
-
 	/*
 	 * We prefault the freshly allocated memory range here,
 	 * to not measure page fault overhead:
@@ -326,19 +314,15 @@ static u64 do_memset_cycles(const struct function *r, size_t size)
 		fn(dst, i, size);
 	cycle_end = get_cycles();
 
-	free(dst);
 	return cycle_end - cycle_start;
 }
 
-static double do_memset_gettimeofday(const struct function *r, size_t size)
+static double do_memset_gettimeofday(const struct function *r, size_t size, void *src __maybe_unused, void *dst)
 {
 	struct timeval tv_start, tv_end, tv_diff;
 	memset_t fn = r->fn.memset;
-	void *dst = NULL;
 	int i;
 
-	memset_alloc_mem(&dst, size);
-
 	/*
 	 * We prefault the freshly allocated memory range here,
 	 * to not measure page fault overhead:
@@ -352,7 +336,6 @@ static double do_memset_gettimeofday(const struct function *r, size_t size)
 
 	timersub(&tv_end, &tv_start, &tv_diff);
 
-	free(dst);
 	return (double)(((double)size * nr_loops) / timeval2double(&tv_diff));
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 31/37] perf tools: Normalize sq_quote_argv() error reporting
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (29 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 30/37] perf bench mem: Move boilerplate memory allocation to the infrastructure Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 32/37] perf tools: Use normal error reporting when processing PERF_RECORD_READ events Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Jiri Olsa, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

It already returns whatever strbuf_(grow|addch)() returns in case of
failure, so just return -ENOSPC in the only case where it was die()ing.
When it returns, its only caller will call die() anyway, so no need to
be so eager, die later.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-as05b7mbogprlwi8iarwns8e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/quote.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/quote.c b/tools/perf/util/quote.c
index 639d1da2f978..293534c1a474 100644
--- a/tools/perf/util/quote.c
+++ b/tools/perf/util/quote.c
@@ -54,7 +54,7 @@ int sq_quote_argv(struct strbuf *dst, const char** argv, size_t maxlen)
 			break;
 		ret = sq_quote_buf(dst, argv[i]);
 		if (maxlen && dst->len > maxlen)
-			die("Too many or long arguments");
+			return -ENOSPC;
 	}
 	return ret;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 32/37] perf tools: Use normal error reporting when processing PERF_RECORD_READ events
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (30 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 31/37] perf tools: Normalize sq_quote_argv() error reporting Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 33/37] perf bench futex: Cache align the worker struct Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Brice Goglin, David Ahern, Jiri Olsa, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We already have handling for errors when processing PERF_RECORD_ events,
so instead of calling die() when not being able to alloc, propagate the
error, so that the normal UI exit sequence can take place, the user be
warned and possibly the terminal be properly reset to a sane mode.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-r90je3c009a125dvs3525yge@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c | 12 +++++--
 tools/perf/util/values.c    | 81 +++++++++++++++++++++++++++++++++------------
 tools/perf/util/values.h    |  4 +--
 3 files changed, 70 insertions(+), 27 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6e88460cd13d..8064de8ceedc 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -207,11 +207,14 @@ static int process_read_event(struct perf_tool *tool,
 
 	if (rep->show_threads) {
 		const char *name = evsel ? perf_evsel__name(evsel) : "unknown";
-		perf_read_values_add_value(&rep->show_threads_values,
+		int err = perf_read_values_add_value(&rep->show_threads_values,
 					   event->read.pid, event->read.tid,
 					   event->read.id,
 					   name,
 					   event->read.value);
+
+		if (err)
+			return err;
 	}
 
 	dump_printf(": %d %d %s %" PRIu64 "\n", event->read.pid, event->read.tid,
@@ -539,8 +542,11 @@ static int __cmd_report(struct report *rep)
 		}
 	}
 
-	if (rep->show_threads)
-		perf_read_values_init(&rep->show_threads_values);
+	if (rep->show_threads) {
+		ret = perf_read_values_init(&rep->show_threads_values);
+		if (ret)
+			return ret;
+	}
 
 	ret = report__setup_sample_type(rep);
 	if (ret) {
diff --git a/tools/perf/util/values.c b/tools/perf/util/values.c
index 0fb3c1fcd3e6..5074be4ed467 100644
--- a/tools/perf/util/values.c
+++ b/tools/perf/util/values.c
@@ -2,15 +2,18 @@
 
 #include "util.h"
 #include "values.h"
+#include "debug.h"
 
-void perf_read_values_init(struct perf_read_values *values)
+int perf_read_values_init(struct perf_read_values *values)
 {
 	values->threads_max = 16;
 	values->pid = malloc(values->threads_max * sizeof(*values->pid));
 	values->tid = malloc(values->threads_max * sizeof(*values->tid));
 	values->value = malloc(values->threads_max * sizeof(*values->value));
-	if (!values->pid || !values->tid || !values->value)
-		die("failed to allocate read_values threads arrays");
+	if (!values->pid || !values->tid || !values->value) {
+		pr_debug("failed to allocate read_values threads arrays");
+		goto out_free_pid;
+	}
 	values->threads = 0;
 
 	values->counters_max = 16;
@@ -18,9 +21,22 @@ void perf_read_values_init(struct perf_read_values *values)
 				      * sizeof(*values->counterrawid));
 	values->countername = malloc(values->counters_max
 				     * sizeof(*values->countername));
-	if (!values->counterrawid || !values->countername)
-		die("failed to allocate read_values counters arrays");
+	if (!values->counterrawid || !values->countername) {
+		pr_debug("failed to allocate read_values counters arrays");
+		goto out_free_counter;
+	}
 	values->counters = 0;
+
+	return 0;
+
+out_free_counter:
+	zfree(&values->counterrawid);
+	zfree(&values->countername);
+out_free_pid:
+	zfree(&values->pid);
+	zfree(&values->tid);
+	zfree(&values->value);
+	return -ENOMEM;
 }
 
 void perf_read_values_destroy(struct perf_read_values *values)
@@ -41,17 +57,27 @@ void perf_read_values_destroy(struct perf_read_values *values)
 	zfree(&values->countername);
 }
 
-static void perf_read_values__enlarge_threads(struct perf_read_values *values)
+static int perf_read_values__enlarge_threads(struct perf_read_values *values)
 {
-	values->threads_max *= 2;
-	values->pid = realloc(values->pid,
-			      values->threads_max * sizeof(*values->pid));
-	values->tid = realloc(values->tid,
-			      values->threads_max * sizeof(*values->tid));
-	values->value = realloc(values->value,
-				values->threads_max * sizeof(*values->value));
-	if (!values->pid || !values->tid || !values->value)
-		die("failed to enlarge read_values threads arrays");
+	int nthreads_max = values->threads_max * 2;
+	void *npid = realloc(values->pid, nthreads_max * sizeof(*values->pid)),
+	     *ntid = realloc(values->tid, nthreads_max * sizeof(*values->tid)),
+	     *nvalue = realloc(values->value, nthreads_max * sizeof(*values->value));
+
+	if (!npid || !ntid || !nvalue)
+		goto out_err;
+
+	values->threads_max = nthreads_max;
+	values->pid = npid;
+	values->tid = ntid;
+	values->value = nvalue;
+	return 0;
+out_err:
+	free(npid);
+	free(ntid);
+	free(nvalue);
+	pr_debug("failed to enlarge read_values threads arrays");
+	return -ENOMEM;
 }
 
 static int perf_read_values__findnew_thread(struct perf_read_values *values,
@@ -63,15 +89,21 @@ static int perf_read_values__findnew_thread(struct perf_read_values *values,
 		if (values->pid[i] == pid && values->tid[i] == tid)
 			return i;
 
-	if (values->threads == values->threads_max)
-		perf_read_values__enlarge_threads(values);
+	if (values->threads == values->threads_max) {
+		i = perf_read_values__enlarge_threads(values);
+		if (i < 0)
+			return i;
+	}
 
-	i = values->threads++;
+	i = values->threads + 1;
+	values->value[i] = malloc(values->counters_max * sizeof(**values->value));
+	if (!values->value[i]) {
+		pr_debug("failed to allocate read_values counters array");
+		return -ENOMEM;
+	}
 	values->pid[i] = pid;
 	values->tid[i] = tid;
-	values->value[i] = malloc(values->counters_max * sizeof(**values->value));
-	if (!values->value[i])
-		die("failed to allocate read_values counters array");
+	values->threads = i;
 
 	return i;
 }
@@ -115,16 +147,21 @@ static int perf_read_values__findnew_counter(struct perf_read_values *values,
 	return i;
 }
 
-void perf_read_values_add_value(struct perf_read_values *values,
+int perf_read_values_add_value(struct perf_read_values *values,
 				u32 pid, u32 tid,
 				u64 rawid, const char *name, u64 value)
 {
 	int tindex, cindex;
 
 	tindex = perf_read_values__findnew_thread(values, pid, tid);
+	if (tindex < 0)
+		return tindex;
 	cindex = perf_read_values__findnew_counter(values, rawid, name);
+	if (cindex < 0)
+		return cindex;
 
 	values->value[tindex][cindex] = value;
+	return 0;
 }
 
 static void perf_read_values__display_pretty(FILE *fp,
diff --git a/tools/perf/util/values.h b/tools/perf/util/values.h
index b21a80c6cf8d..808ff9c73bf5 100644
--- a/tools/perf/util/values.h
+++ b/tools/perf/util/values.h
@@ -14,10 +14,10 @@ struct perf_read_values {
 	u64 **value;
 };
 
-void perf_read_values_init(struct perf_read_values *values);
+int perf_read_values_init(struct perf_read_values *values);
 void perf_read_values_destroy(struct perf_read_values *values);
 
-void perf_read_values_add_value(struct perf_read_values *values,
+int perf_read_values_add_value(struct perf_read_values *values,
 				u32 pid, u32 tid,
 				u64 rawid, const char *name, u64 value);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 33/37] perf bench futex: Cache align the worker struct
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (31 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 32/37] perf tools: Use normal error reporting when processing PERF_RECORD_READ events Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 18:50   ` Davidlohr Bueso
  2016-10-24 16:20 ` [PATCH 34/37] perf trace: Remove thread_trace->exit_time Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  37 siblings, 1 reply; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Sebastian Andrzej Siewior, Davidlohr Bueso,
	Peter Zijlstra, Arnaldo Carvalho de Melo

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

It popped up in perf testing that the worker consumes some amount of
CPU. It boils down to the increment of `ops` which causes cache line
bouncing between the individual threads.

This patch aligns the struct by 256 bytes to ensure that not a cache
line is shared among CPUs. 128 byte is the x86 worst case and grep says
that L1_CACHE_SHIFT is set to 8 on s390.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161016190803.3392-1-bigeasy@linutronix.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/futex-hash.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 8024cd5febd2..d9e5e80bb4d0 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -39,12 +39,15 @@ static unsigned int threads_starting;
 static struct stats throughput_stats;
 static pthread_cond_t thread_parent, thread_worker;
 
+#define SMP_CACHE_BYTES 256
+#define __cacheline_aligned __attribute__ ((aligned (SMP_CACHE_BYTES)))
+
 struct worker {
 	int tid;
 	u_int32_t *futex;
 	pthread_t thread;
 	unsigned long ops;
-};
+} __cacheline_aligned;
 
 static const struct option options[] = {
 	OPT_UINTEGER('t', "threads", &nthreads, "Specify amount of threads"),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 34/37] perf trace: Remove thread_trace->exit_time
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (32 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 33/37] perf bench futex: Cache align the worker struct Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 35/37] perf trace: Use the syscall raw_syscalls:sys_enter timestamp Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Jiri Olsa, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Not used at all, we need just the entry_time to calculate the syscall
duration.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-js6r09zdwlzecvaei7t4l3vd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 0bae454e8efa..c72a92c6d9c4 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -843,7 +843,6 @@ static size_t fprintf_duration(unsigned long t, FILE *fp)
  */
 struct thread_trace {
 	u64		  entry_time;
-	u64		  exit_time;
 	bool		  entry_pending;
 	unsigned long	  nr_events;
 	unsigned long	  pfmaj, pfmin;
@@ -1571,8 +1570,6 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel,
 		++trace->stats.vfs_getname;
 	}
 
-	ttrace->exit_time = sample->time;
-
 	if (ttrace->entry_time) {
 		duration = sample->time - ttrace->entry_time;
 		if (trace__filter_duration(trace, duration))
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 35/37] perf trace: Use the syscall raw_syscalls:sys_enter timestamp
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (33 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 34/37] perf trace: Remove thread_trace->exit_time Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 36/37] perf list: Make vendor event matching case insensitive Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Jiri Olsa, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Instead of the one when another syscall takes place while another is being
processed (in another CPU, but we show it serialized, so need to "interrupt"
the other), and also when finally showing the sys_enter + sys_exit + duration,
where we were showing the sample->time for the sys_exit, duh.

Before:

  # perf trace sleep 1
  <SNIP>
     0.373 (   0.001 ms): close(fd: 3                   ) = 0
  1000.626 (1000.211 ms): nanosleep(rqtp: 0x7ffd6ddddfb0) = 0
  1000.653 (   0.003 ms): close(fd: 1                   ) = 0
  1000.657 (   0.002 ms): close(fd: 2                   ) = 0
  1000.667 (   0.000 ms): exit_group(                   )
  #

After:

  # perf trace sleep 1
  <SNIP>
     0.336 (   0.001 ms): close(fd: 3                   ) = 0
     0.373 (1000.086 ms): nanosleep(rqtp: 0x7ffe303e9550) = 0
  1000.481 (   0.002 ms): close(fd: 1                   ) = 0
  1000.485 (   0.001 ms): close(fd: 2                   ) = 0
  1000.494 (   0.000 ms): exit_group(                   )
[root@jouet linux]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ecbzgmu2ni6glc6zkw8p1zmx@git.kernel.org
Fixes: 752fde44fd1c ("perf trace: Support interrupted syscalls")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index c72a92c6d9c4..5f45166c892d 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1451,7 +1451,7 @@ static int trace__printf_interrupted_entry(struct trace *trace, struct perf_samp
 
 	duration = sample->time - ttrace->entry_time;
 
-	printed  = trace__fprintf_entry_head(trace, trace->current, duration, sample->time, trace->output);
+	printed  = trace__fprintf_entry_head(trace, trace->current, duration, ttrace->entry_time, trace->output);
 	printed += fprintf(trace->output, "%-70s) ...\n", ttrace->entry_str);
 	ttrace->entry_pending = false;
 
@@ -1498,7 +1498,7 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel,
 
 	if (sc->is_exit) {
 		if (!(trace->duration_filter || trace->summary_only || trace->min_stack)) {
-			trace__fprintf_entry_head(trace, thread, 1, sample->time, trace->output);
+			trace__fprintf_entry_head(trace, thread, 1, ttrace->entry_time, trace->output);
 			fprintf(trace->output, "%-70s)\n", ttrace->entry_str);
 		}
 	} else {
@@ -1589,7 +1589,7 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel,
 	if (trace->summary_only)
 		goto out;
 
-	trace__fprintf_entry_head(trace, thread, duration, sample->time, trace->output);
+	trace__fprintf_entry_head(trace, thread, duration, ttrace->entry_time, trace->output);
 
 	if (ttrace->entry_pending) {
 		fprintf(trace->output, "%-70s", ttrace->entry_str);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 36/37] perf list: Make vendor event matching case insensitive
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (34 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 35/37] perf trace: Use the syscall raw_syscalls:sys_enter timestamp Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 16:20 ` [PATCH 37/37] perf coresight: Removing miscellaneous debug output Arnaldo Carvalho de Melo
  2016-10-24 18:44 ` [GIT PULL 00/37] perf/core improvements and fixes Ingo Molnar
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Make the 'perf list' glob matching for vendor events case insensitive.
This allows to use the upper case vendor events with perf list too.

Now the following works:

  % perf list LONGEST_LAT

  ...

  cache:
    longest_lat_cache.miss
         [Core-originated cacheable demand requests missed LLC]
    longest_lat_cache.reference
         [Core-originated cacheable demand requests that refer to LLC]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1476899402-31460-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c    |  4 ++--
 tools/perf/util/string.c | 21 ++++++++++++++++-----
 tools/perf/util/util.h   |  1 +
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index d7174f340b53..31b845ec32e2 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1139,8 +1139,8 @@ void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
 			bool is_cpu = !strcmp(pmu->name, "cpu");
 
 			if (event_glob != NULL &&
-			    !(strglobmatch(name, event_glob) ||
-			      (!is_cpu && strglobmatch(alias->name,
+			    !(strglobmatch_nocase(name, event_glob) ||
+			      (!is_cpu && strglobmatch_nocase(alias->name,
 						       event_glob))))
 				continue;
 
diff --git a/tools/perf/util/string.c b/tools/perf/util/string.c
index 7f7e072be746..d8dfaf64b32e 100644
--- a/tools/perf/util/string.c
+++ b/tools/perf/util/string.c
@@ -193,7 +193,8 @@ static bool __match_charclass(const char *pat, char c, const char **npat)
 }
 
 /* Glob/lazy pattern matching */
-static bool __match_glob(const char *str, const char *pat, bool ignore_space)
+static bool __match_glob(const char *str, const char *pat, bool ignore_space,
+			bool case_ins)
 {
 	while (*str && *pat && *pat != '*') {
 		if (ignore_space) {
@@ -219,8 +220,13 @@ static bool __match_glob(const char *str, const char *pat, bool ignore_space)
 				return false;
 		else if (*pat == '\\') /* Escaped char match as normal char */
 			pat++;
-		if (*str++ != *pat++)
+		if (case_ins) {
+			if (tolower(*str) != tolower(*pat))
+				return false;
+		} else if (*str != *pat)
 			return false;
+		str++;
+		pat++;
 	}
 	/* Check wild card */
 	if (*pat == '*') {
@@ -229,7 +235,7 @@ static bool __match_glob(const char *str, const char *pat, bool ignore_space)
 		if (!*pat)	/* Tail wild card matches all */
 			return true;
 		while (*str)
-			if (__match_glob(str++, pat, ignore_space))
+			if (__match_glob(str++, pat, ignore_space, case_ins))
 				return true;
 	}
 	return !*str && !*pat;
@@ -249,7 +255,12 @@ static bool __match_glob(const char *str, const char *pat, bool ignore_space)
  */
 bool strglobmatch(const char *str, const char *pat)
 {
-	return __match_glob(str, pat, false);
+	return __match_glob(str, pat, false, false);
+}
+
+bool strglobmatch_nocase(const char *str, const char *pat)
+{
+	return __match_glob(str, pat, false, true);
 }
 
 /**
@@ -262,7 +273,7 @@ bool strglobmatch(const char *str, const char *pat)
  */
 bool strlazymatch(const char *str, const char *pat)
 {
-	return __match_glob(str, pat, true);
+	return __match_glob(str, pat, true, false);
 }
 
 /**
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 43899e0d6fa1..71b6992f1d98 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -222,6 +222,7 @@ s64 perf_atoll(const char *str);
 char **argv_split(const char *str, int *argcp);
 void argv_free(char **argv);
 bool strglobmatch(const char *str, const char *pat);
+bool strglobmatch_nocase(const char *str, const char *pat);
 bool strlazymatch(const char *str, const char *pat);
 static inline bool strisglob(const char *str)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 37/37] perf coresight: Removing miscellaneous debug output
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (35 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 36/37] perf list: Make vendor event matching case insensitive Arnaldo Carvalho de Melo
@ 2016-10-24 16:20 ` Arnaldo Carvalho de Melo
  2016-10-24 18:44 ` [GIT PULL 00/37] perf/core improvements and fixes Ingo Molnar
  37 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 16:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Mathieu Poirier, Peter Zijlstra, linux-arm-kernel,
	Arnaldo Carvalho de Melo

From: Mathieu Poirier <mathieu.poirier@linaro.org>

Printing the full path of the selected link is obviously not needed,
hence removing.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1476913323-6836-1-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/arm/util/cs-etm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 47d584da5819..dfea6b635525 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -575,8 +575,6 @@ static FILE *cs_device__open_file(const char *name)
 	snprintf(path, PATH_MAX,
 		 "%s" CS_BUS_DEVICE_PATH "%s", sysfs, name);
 
-	printf("path: %s\n", path);
-
 	if (stat(path, &st) < 0)
 		return NULL;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [GIT PULL 00/37] perf/core improvements and fixes
  2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (36 preceding siblings ...)
  2016-10-24 16:20 ` [PATCH 37/37] perf coresight: Removing miscellaneous debug output Arnaldo Carvalho de Melo
@ 2016-10-24 18:44 ` Ingo Molnar
  37 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2016-10-24 18:44 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Alexander Alemayhu,
	Alexander Shishkin, Alexis Berlemont, Andi Kleen,
	Anton Blanchard, Brice Goglin, David Ahern, Davidlohr Bueso,
	Hitoshi Mitake, Jiri Olsa, Maciej Debski, Mathieu Poirier,
	Namhyung Kim, Nilay Vaish, Peter Zijlstra, Ross McIlroy,
	Sebastian Andrzej Siewior, Stefano Sanfilippo, Stephane Eranian,
	Steven Rostedt, Sukadev Bhattiprolu, Wang Nan,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling, more to come soon,
> 
> - Arnaldo
> 
> Build and test stats at the end of the message.
> 
> The following changes since commit e9c848928abf4cb60601e9ae7d336f0333c98bca:
> 
>   Merge tag 'perf-c2c-for-mingo-20161021' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-10-22 10:22:05 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161024
> 
> for you to fetch changes up to 04b553ad7dc347eabd3cb4705932272453175a80:
> 
>   perf coresight: Removing miscellaneous debug output (2016-10-24 11:07:47 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> New features:
> 
> - Dynamicly change verbosity level by pressing 'V' in the 'perf top/report'
>   hists TUI browser (Alexis Berlemont)
> 
> - Implement 'perf trace --delay' in the same fashion as in 'perf record --delay',
>   to skip sampling workload initialization events (Alexis Berlemont)
> 
> - Make vendor named events case insensitive in 'perf list', i.e.
>   'perf list LONGEST_LAT' works just the same as  'perf list longest_lat' (Andi Kleen)
> 
> - Show instruction bytes and lenght in 'perf script' for Intel PT and BTS (Andi Kleen, Adrian Hunter)
> 
>    E.g:
> 
>     % perf record -e intel_pt// foo
>     % perf script --itrace=i0ns -F ip,insn,insnlen
>      ffffffff8101232f ilen: 5 insn: 0f 1f 44 00 00
>      ffffffff81012334 ilen: 1 insn: 5b
>      ffffffff81012335 ilen: 1 insn: 5d
>      ffffffff81012336 ilen: 1 insn: c3
>      ffffffff810123e3 ilen: 1 insn: 5b
>      ffffffff810123e4 ilen: 2 insn: 41 5c
>      ffffffff810123e6 ilen: 1 insn: 5d
>      ffffffff810123e7 ilen: 1 insn: c3
>      ffffffff810124a6 ilen: 2 insn: 31 c0
>      ffffffff810124a8 ilen: 9 insn: 41 83 bc 24 a8 01 00 00 01
>      ffffffff810124b1 ilen: 2 insn: 75 87
> 
> - Allow enabling the perf_event_attr.branch_type attribute member: (Andi Kleen)
> 
>   perf record -e sched:sched_switch,cpu/cpu-cycles,branch_type=any/ ...
> 
> - Add unwinding support for jitdump (Stefano Sanfilippo)
> 
> Fixes:
> 
> - Use raw_syscall:sys_enter timestamp in 'perf trace' (Arnaldo Carvalho de Melo)
> 
> Infrastructure:
> 
> - Allow jitdump to be built without libdwarf (Maciej Debski)
> 
> - Sync x86's syscall table tools/ copy (Arnaldo Carvalho de Melo)
> 
> - Fixes to avoid calling die() in library fuctions already propagating other
>   errors (Arnaldo Carvalho de Melo)
> 
> - Improvements to allow libtraceevent to be properly installed in distro
>   packages (Jiri Olsa)
> 
> - Removing coresight miscellaneous debug output (Mathieu Poirier)
> 
> - Cache align the 'perf bench futex' worker struct (Sebastian Andrzej Siewior)
> 
> Documentation:
> 
> - Minor improvements on the documentation of event parameters (Andi Kleen)
> 
> - Add jitdump format specification document (Stephane Eranian)
> 
> Spelling fixes:
> 
> - Fix typo "No enough" to "Not enough" (Alexander Alemayhu)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Adrian Hunter (1):
>       perf intel-pt/bts: Tidy instruction buffer size usage
> 
> Alexander Alemayhu (1):
>       perf tools: Fix typo "No enough" to "Not enough"
> 
> Alexis Berlemont (2):
>       perf hists browser: Dynamically change verbosity level
>       perf trace: Implement --delay
> 
> Andi Kleen (6):
>       perf intel-pt/bts: Report instruction bytes and length in sample
>       perf script: Support insn and insnlen
>       perf record: Improve documentation of event parameters
>       perf tools: Implement branch_type event parameter
>       perf pmu: Only print Using CPUID message once
>       perf list: Make vendor event matching case insensitive
> 
> Arnaldo Carvalho de Melo (8):
>       perf tools: Sync copy of x86's syscall table
>       perf jit: Avoid returning garbage for a ret variable
>       perf jit: Add NT_GNU_BUILD_ID definition for older distros
>       perf bench mem: Move boilerplate memory allocation to the infrastructure
>       perf tools: Normalize sq_quote_argv() error reporting
>       perf tools: Use normal error reporting when processing PERF_RECORD_READ events
>       perf trace: Remove thread_trace->exit_time
>       perf trace: Use the syscall raw_syscalls:sys_enter timestamp
> 
> Jiri Olsa (8):
>       tools lib traceevent: Add install_headers target
>       tools lib traceevent: Add do_install_mkdir Makefile function
>       tools lib traceevent: Rename LIB_FILE to LIB_TARGET
>       tools lib traceevent: Add version for traceevent shared object
>       tools lib: Add for_each_clear_bit macro
>       perf report: Move captured info to generic header info
>       perf header: Display missing features
>       perf header: Display feature name on write failure
> 
> Maciej Debski (1):
>       perf jit: Enable jitdump support without dwarf
> 
> Mathieu Poirier (1):
>       perf coresight: Removing miscellaneous debug output
> 
> Sebastian Andrzej Siewior (1):
>       perf bench futex: Cache align the worker struct
> 
> Stefano Sanfilippo (5):
>       perf jit: Make perf skip unknown records
>       perf jit: Do not assume pgoff is zero
>       perf jit: Add unwinding support
>       perf jit: Generate .eh_frame/.eh_frame_hdr in DSO
>       perf jit: Check JITHEADER_VERSION
> 
> Stephane Eranian (3):
>       perf jit: Improve error messages from JVMTI
>       perf jit: Remove unecessary padding in jitdump file
>       perf jit: Add jitdump format specification document
> 
>  tools/include/asm-generic/bitops.h                 |   1 +
>  tools/include/asm-generic/bitops/__ffz.h           |  12 ++
>  tools/include/asm-generic/bitops/find.h            |  28 ++++
>  tools/include/linux/bitops.h                       |   5 +
>  tools/lib/find_bit.c                               |  25 +++
>  tools/lib/traceevent/Makefile                      |  40 +++--
>  tools/perf/Documentation/jitdump-specification.txt | 170 +++++++++++++++++++++
>  tools/perf/Documentation/perf-record.txt           |   9 +-
>  tools/perf/Documentation/perf-script.txt           |   6 +-
>  tools/perf/Documentation/perf-trace.txt            |   5 +
>  tools/perf/MANIFEST                                |   1 +
>  tools/perf/Makefile.config                         |   2 +-
>  tools/perf/arch/arm/util/cs-etm.c                  |   2 -
>  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl  |   4 +-
>  tools/perf/bench/futex-hash.c                      |   5 +-
>  tools/perf/bench/mem-functions.c                   |  77 ++++------
>  tools/perf/builtin-report.c                        |  12 +-
>  tools/perf/builtin-script.c                        |  24 ++-
>  tools/perf/builtin-trace.c                         |  19 ++-
>  tools/perf/jvmti/jvmti_agent.c                     |  38 +----
>  tools/perf/jvmti/libjvmti.c                        |  39 +++--
>  tools/perf/tests/backward-ring-buffer.c            |   2 +-
>  tools/perf/tests/bpf.c                             |   2 +-
>  tools/perf/ui/browsers/hists.c                     |   5 +-
>  tools/perf/util/Build                              |   2 +-
>  tools/perf/util/bpf-loader.c                       |  14 +-
>  tools/perf/util/event.h                            |   3 +
>  tools/perf/util/evsel.c                            |   9 ++
>  tools/perf/util/evsel.h                            |   2 +
>  tools/perf/util/genelf.c                           | 113 +++++++++++++-
>  tools/perf/util/genelf.h                           |   5 +-
>  tools/perf/util/header.c                           |  19 ++-
>  tools/perf/util/intel-bts.c                        |   9 +-
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.c  |   2 +
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |   1 +
>  .../util/intel-pt-decoder/intel-pt-insn-decoder.c  |  13 +-
>  .../util/intel-pt-decoder/intel-pt-insn-decoder.h  |   6 +-
>  tools/perf/util/intel-pt-decoder/intel-pt-log.c    |   4 +-
>  tools/perf/util/intel-pt.c                         |  19 ++-
>  tools/perf/util/jitdump.c                          |  82 ++++++++--
>  tools/perf/util/jitdump.h                          |  12 ++
>  tools/perf/util/llvm-utils.c                       |   2 +-
>  tools/perf/util/map.c                              |  17 ++-
>  tools/perf/util/parse-branch-options.c             |  85 ++++++-----
>  tools/perf/util/parse-branch-options.h             |   3 +-
>  tools/perf/util/parse-events.c                     |  15 +-
>  tools/perf/util/pmu.c                              |  10 +-
>  tools/perf/util/quote.c                            |   2 +-
>  tools/perf/util/session.c                          |  10 --
>  tools/perf/util/string.c                           |  21 ++-
>  tools/perf/util/unwind-libunwind-local.c           |   4 +-
>  tools/perf/util/util.h                             |   1 +
>  tools/perf/util/values.c                           |  81 +++++++---
>  tools/perf/util/values.h                           |   4 +-
>  54 files changed, 830 insertions(+), 273 deletions(-)
>  create mode 100644 tools/include/asm-generic/bitops/__ffz.h
>  create mode 100644 tools/perf/Documentation/jitdump-specification.txt
> 
>   # time dm
>      1 alpine:3.4: Ok
>      2 android-ndk:r12b-arm: Ok
>      3 archlinux:latest: Ok
>      4 centos:5: Ok
>      5 centos:6: Ok
>      6 centos:7: Ok
>      7 debian:7: Ok
>      8 debian:8: Ok
>      9 debian:experimental: Ok
>     10 fedora:20: Ok
>     11 fedora:21: Ok
>     12 fedora:22: Ok
>     13 fedora:23: Ok
>     14 fedora:24: Ok
>     15 fedora:24-x-ARC-uClibc: Ok
>     16 fedora:rawhide: Ok
>     17 mageia:5: Ok
>     18 opensuse:13.2: Ok
>     19 opensuse:42.1: Ok
>     20 opensuse:tumbleweed: Ok
>     21 ubuntu:12.04.5: Ok
>     22 ubuntu:14.04: Ok
>     23 ubuntu:14.04.4: Ok
>     24 ubuntu:15.10: Ok
>     25 ubuntu:16.04: Ok
>     26 ubuntu:16.04-x-arm: Ok
>     27 ubuntu:16.04-x-arm64: Ok
>     28 ubuntu:16.04-x-powerpc: Ok
>     29 ubuntu:16.04-x-powerpc64: Ok
>     30 ubuntu:16.04-x-powerpc64el: Ok
>     31 ubuntu:16.04-x-s390: Ok
>     32 ubuntu:16.10: Ok
>   
>   real	46m1.737s
>   user	0m2.200s
>   sys	0m2.735s
>   #
> 
>    # perf test
>    1: vmlinux symtab matches kallsyms                          : Ok
>    2: detect openat syscall event                              : Ok
>    3: detect openat syscall event on all cpus                  : Ok
>    4: read samples using the mmap interface                    : Ok
>    5: parse events tests                                       : Ok
>    6: Validate PERF_RECORD_* events & perf_sample fields       : Ok
>    7: Test perf pmu format parsing                             : Ok
>    8: Test dso data read                                       : Ok
>    9: Test dso data cache                                      : Ok
>   10: Test dso data reopen                                     : Ok
>   11: roundtrip evsel->name check                              : Ok
>   12: Check parsing of sched tracepoints fields                : Ok
>   13: Generate and check syscalls:sys_enter_openat event fields: Ok
>   14: struct perf_event_attr setup                             : Ok
>   15: Test matching and linking multiple hists                 : Ok
>   16: Try 'import perf' in python, checking link problems      : Ok
>   17: Test breakpoint overflow signal handler                  : Ok
>   18: Test breakpoint overflow sampling                        : Ok
>   19: Test number of exit event of a simple workload           : Ok
>   20: Test software clock events have valid period values      : Ok
>   21: Test object code reading                                 : Ok
>   22: Test sample parsing                                      : Ok
>   23: Test using a dummy software event to keep tracking       : Ok
>   24: Test parsing with no sample_id_all bit set               : Ok
>   25: Test filtering hist entries                              : Ok
>   26: Test mmap thread lookup                                  : Ok
>   27: Test thread mg sharing                                   : Ok
>   28: Test output sorting of hist entries                      : Ok
>   29: Test cumulation of child hist entries                    : Ok
>   30: Test tracking with sched_switch                          : Ok
>   31: Filter fds with revents mask in a fdarray                : Ok
>   32: Add fd to a fdarray, making it autogrow                  : Ok
>   33: Test kmod_path__parse function                           : Ok
>   34: Test thread map                                          : Ok
>   35: Test LLVM searching and compiling                        :
>   35.1: Basic BPF llvm compiling test                          : Ok
>   35.2: Test kbuild searching                                  : Ok
>   35.3: Compile source for BPF prologue generation test        : Ok
>   35.4: Compile source for BPF relocation test                 : Ok
>   36: Test topology in session                                 : Ok
>   37: Test BPF filter                                          :
>   37.1: Test basic BPF filtering                               : Ok
>   37.2: Test BPF prologue generation                           : Ok
>   37.3: Test BPF relocation checker                            : Ok
>   38: Test thread map synthesize                               : Ok
>   39: Test cpu map synthesize                                  : Ok
>   40: Test stat config synthesize                              : Ok
>   41: Test stat synthesize                                     : Ok
>   42: Test stat round synthesize                               : Ok
>   43: Test attr update synthesize                              : Ok
>   44: Test events times                                        : Ok
>   45: Test backward reading from ring buffer                   : Ok
>   46: Test cpu map print                                       : Ok
>   47: Test SDT event probing                                   : Ok
>   48: Test is_printable_array function                         : Ok
>   49: Test bitmap print                                        : Ok
>   50: x86 rdpmc test                                           : Ok
>   51: Test converting perf time to TSC                         : Ok
>   52: Test dwarf unwind                                        : Ok
>   53: Test x86 instruction decoder - new instructions          : Ok
>   54: Test intel cqm nmi context read                          : Skip
>   #
> 
>    $ make -C tools/perf build-test
>   make: Entering directory '/home/acme/git/linux/tools/perf'
>   - tarpkg: ./tests/perf-targz-src-pkg .
>                    make_tags_O: make tags 
>                   make_debug_O: make DEBUG=1
>               make_no_libbpf_O: make NO_LIBBPF=1
>                 make_no_newt_O: make NO_NEWT=1
>              make_no_libperl_O: make NO_LIBPERL=1
>                 make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1
>                make_no_slang_O: make NO_SLANG=1
>             make_no_demangle_O: make NO_DEMANGLE=1
>            make_no_libpython_O: make NO_LIBPYTHON=1
>         make_with_babeltrace_O: make LIBBABELTRACE=1
>                   make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
>               make_no_libelf_O: make NO_LIBELF=1
>            make_no_backtrace_O: make NO_BACKTRACE=1
>              make_util_map_o_O: make util/map.o
>            make_no_libunwind_O: make NO_LIBUNWIND=1
>              make_no_libnuma_O: make NO_LIBNUMA=1
>              make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
>                 make_install_O: make install
>               make_clean_all_O: make clean all
>    make_install_prefix_slash_O: make install prefix=/tmp/krava/
>                  make_perf_o_O: make perf.o
>                    make_help_O: make help
>             make_no_libaudit_O: make NO_LIBAUDIT=1
>        make_util_pmu_bison_o_O: make util/pmu-bison.o
>                    make_pure_O: make
>   make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
>             make_no_auxtrace_O: make NO_AUXTRACE=1
>          make_install_prefix_O: make install prefix=/tmp/krava
>            make_no_libbionic_O: make NO_LIBBIONIC=1
>             make_install_bin_O: make install-bin
>                 make_no_gtk2_O: make NO_GTK2=1
>                  make_static_O: make LDFLAGS=-static
>                     make_doc_O: make doc
>   OK
>   make: Leaving directory '/home/acme/git/linux/tools/perf'
>   $

Pulled, thanks a lot Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 33/37] perf bench futex: Cache align the worker struct
  2016-10-24 16:20 ` [PATCH 33/37] perf bench futex: Cache align the worker struct Arnaldo Carvalho de Melo
@ 2016-10-24 18:50   ` Davidlohr Bueso
  2016-10-24 18:53     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 41+ messages in thread
From: Davidlohr Bueso @ 2016-10-24 18:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, linux-kernel, Sebastian Andrzej Siewior,
	Peter Zijlstra, Arnaldo Carvalho de Melo

On 2016-10-24 09:20, Arnaldo Carvalho de Melo wrote:
> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> It popped up in perf testing that the worker consumes some amount of
> CPU. It boils down to the increment of `ops` which causes cache line
> bouncing between the individual threads.
> 
> This patch aligns the struct by 256 bytes to ensure that not a cache
> line is shared among CPUs. 128 byte is the x86 worst case and grep says
> that L1_CACHE_SHIFT is set to 8 on s390.

Hmm so this one should have been replaced with this:

https://marc.info/?l=linux-kernel&m=147690008801082

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 33/37] perf bench futex: Cache align the worker struct
  2016-10-24 18:50   ` Davidlohr Bueso
@ 2016-10-24 18:53     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-24 18:53 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, linux-kernel, Sebastian Andrzej Siewior,
	Peter Zijlstra, Arnaldo Carvalho de Melo

Em Mon, Oct 24, 2016 at 11:50:50AM -0700, Davidlohr Bueso escreveu:
> On 2016-10-24 09:20, Arnaldo Carvalho de Melo wrote:
> > From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > 
> > It popped up in perf testing that the worker consumes some amount of
> > CPU. It boils down to the increment of `ops` which causes cache line
> > bouncing between the individual threads.
> > 
> > This patch aligns the struct by 256 bytes to ensure that not a cache
> > line is shared among CPUs. 128 byte is the x86 worst case and grep says
> > that L1_CACHE_SHIFT is set to 8 on s390.
> 
> Hmm so this one should have been replaced with this:
> 
> https://marc.info/?l=linux-kernel&m=147690008801082

Humm, sorry, can you please send a patch fixing this?

- Arnaldo

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2016-10-24 19:03 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-24 16:20 [GIT PULL 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 01/37] perf intel-pt/bts: Tidy instruction buffer size usage Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 02/37] perf intel-pt/bts: Report instruction bytes and length in sample Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 03/37] perf script: Support insn and insnlen Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 04/37] perf tools: Sync copy of x86's syscall table Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 05/37] tools lib traceevent: Add install_headers target Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 06/37] tools lib traceevent: Add do_install_mkdir Makefile function Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 07/37] tools lib traceevent: Rename LIB_FILE to LIB_TARGET Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 08/37] tools lib traceevent: Add version for traceevent shared object Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 09/37] tools lib: Add for_each_clear_bit macro Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 10/37] perf report: Move captured info to generic header info Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 11/37] perf header: Display missing features Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 12/37] perf header: Display feature name on write failure Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 13/37] perf record: Improve documentation of event parameters Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 14/37] perf tools: Implement branch_type event parameter Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 15/37] perf jit: Avoid returning garbage for a ret variable Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 16/37] perf jit: Add NT_GNU_BUILD_ID definition for older distros Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 17/37] perf jit: Improve error messages from JVMTI Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 18/37] perf jit: Enable jitdump support without dwarf Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 19/37] perf jit: Remove unecessary padding in jitdump file Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 20/37] perf jit: Make perf skip unknown records Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 21/37] perf jit: Do not assume pgoff is zero Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 22/37] perf jit: Add unwinding support Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 23/37] perf jit: Generate .eh_frame/.eh_frame_hdr in DSO Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 24/37] perf jit: Check JITHEADER_VERSION Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 25/37] perf jit: Add jitdump format specification document Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 26/37] perf pmu: Only print Using CPUID message once Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 27/37] perf tools: Fix typo "No enough" to "Not enough" Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 28/37] perf hists browser: Dynamically change verbosity level Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 29/37] perf trace: Implement --delay Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 30/37] perf bench mem: Move boilerplate memory allocation to the infrastructure Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 31/37] perf tools: Normalize sq_quote_argv() error reporting Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 32/37] perf tools: Use normal error reporting when processing PERF_RECORD_READ events Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 33/37] perf bench futex: Cache align the worker struct Arnaldo Carvalho de Melo
2016-10-24 18:50   ` Davidlohr Bueso
2016-10-24 18:53     ` Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 34/37] perf trace: Remove thread_trace->exit_time Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 35/37] perf trace: Use the syscall raw_syscalls:sys_enter timestamp Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 36/37] perf list: Make vendor event matching case insensitive Arnaldo Carvalho de Melo
2016-10-24 16:20 ` [PATCH 37/37] perf coresight: Removing miscellaneous debug output Arnaldo Carvalho de Melo
2016-10-24 18:44 ` [GIT PULL 00/37] perf/core improvements and fixes Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).