linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL 00/32] perf/core improvements and fixes
@ 2018-01-17 16:11 Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 01/32] perf evsel: Fix incorrect handling of type _TERM_DRV_CFG Arnaldo Carvalho de Melo
                   ` (32 more replies)
  0 siblings, 33 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Andrew Morton,
	Dan Williams, David Ahern, Dongjiu Geng, Federico Vaga,
	Gopanapalli Pradeep, Heiko Carstens, Hendrik Brueckner,
	Jan Kiszka, Jin Yao, Jiri Olsa, Joe Perches, Kan Liang,
	Kim Phillips, linux-arm-kernel, Luis de Bethencourt,
	Marc Zyngier, Mark Rutland, Martin Schwidefsky, Mathieu Poirier,
	Michael Sartain, Namhyung Kim, Noel Grandin, Pawel Moll,
	Peter Zijlstra, Philippe Ombredanne, Rob Herring,
	Stephane Eranian, Steven Rostedt, Suzuki Poulouse, Taeung Song,
	Thomas Gleixner, Thomas Richter, Wang Nan, Will Deacon,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 1ccb8feda7471efb62ebf68d70055b4c51fa7d92:

  Merge tag 'perf-core-for-mingo-4.16-20180110' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2018-01-11 06:53:06 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.16-20180117

for you to fetch changes up to 81fccd6ca507d3b2012eaf1edeb9b1dbf4bd22db:

  perf record: Fix failed memory allocation for get_cpuid_str (2018-01-17 10:31:25 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

- Fix various per event 'max-stack' and 'call-graph=dwarf' issues,
  mostly in 'perf trace', allowing to use 'perf trace --call-graph' with
  'dwarf' and 'fp' to setup the callgraph details for the syscall events
  and make that apply to other events, whilhe allowing to override that on
  a per-event basis, using '-e sched:*switch/call-graph=dwarf/' for
  instance (Arnaldo Carvalho de Melo)

- Improve the --time percent support in record/report/script (Jin Yao)

- Fix copyfile_offset update of output offset (Jiri Olsa)

- Add python script to profile and resolve physical mem type (Kan Liang)

- Add ARM Statistical Profiling Extensions (SPE) support (Kim Phillips)

- Remove trailing semicolon in the evlist code (Luis de Bethencourt)

- Fix incorrect handling of type _TERM_DRV_CFG (Mathieu Poirier)

- Use asprintf when possible in libtraceevent (Federico Vaga)

- Fix bad force_token escape sequence in libtraceevent (Michael Sartain)

- Add UL suffix to MISSING_EVENTS in libtraceevent (Michael Sartain)

- value of unknown symbolic fields in libtraceevent (Jan Kiszka)

- libtraceevent updates: (Steven Rostedt)
  o Show value of flags that have not been parsed
  o Simplify pointer print logic and fix %pF
  o Handle new pointer processing of bprint strings
  o Show contents (in hex) of data of unrecognized type records
  o Fix get_field_str() for dynamic strings

- Add missing break in FALSE case of pevent_filter_clear_trivial() (Taeung Song)

- Fix failed memory allocation for get_cpuid_str (Thomas Richter)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------

Arnaldo Carvalho de Melo (8):
      perf trace: No need to set PERF_SAMPLE_IDENTIFIER explicitely
      perf evsel: Check if callchain is enabled before setting it up
      perf trace: Fix setting of --call-graph/--max-stack for non-syscall events
      perf callchain: Fix attr.sample_max_stack setting
      perf unwind: Do not look just at the global callchain_param.record_mode
      perf trace: Setup DWARF callchains for non-syscall events when --max-stack is used
      perf trace: Allow overriding global --max-stack per event
      perf callchains: Ask for PERF_RECORD_MMAP for data mmaps for DWARF unwinding

Federico Vaga (1):
      tools lib traceevent: Use asprintf when possible

Jan Kiszka (1):
      tools lib traceevent: Print value of unknown symbolic fields

Jin Yao (8):
      perf report: Improve error msg when no first/last sample time found
      perf script: Improve error msg when no first/last sample time found
      perf util: Improve error checking for time percent input
      perf util: Support no index time percent slice
      perf report: Add an indication of what time slices are used
      perf util: Allocate time slices buffer according to number of comma
      perf report: Remove the time slices number limitation
      perf script: Remove the time slices number limitation

Jiri Olsa (1):
      perf tools: Fix copyfile_offset update of output offset

Kan Liang (1):
      perf script python: Add script to profile and resolve physical mem type

Kim Phillips (1):
      perf tools: Add ARM Statistical Profiling Extensions (SPE) support

Luis de Bethencourt (1):
      perf evlist: Remove trailing semicolon

Mathieu Poirier (1):
      perf evsel: Fix incorrect handling of type _TERM_DRV_CFG

Michael Sartain (2):
      tools lib traceevent: Fix bad force_token escape sequence
      tools lib traceevent: Add UL suffix to MISSING_EVENTS

Steven Rostedt (VMware) (5):
      tools lib traceevent: Show value of flags that have not been parsed
      tools lib traceevent: Simplify pointer print logic and fix %pF
      tools lib traceevent: Handle new pointer processing of bprint strings
      tools lib traceevent: Show contents (in hex) of data of unrecognized type records
      tools lib traceevent: Fix get_field_str() for dynamic strings

Taeung Song (1):
      tools lib traceevent: Fix missing break in FALSE case of pevent_filter_clear_trivial()

Thomas Richter (1):
      perf record: Fix failed memory allocation for get_cpuid_str

 tools/lib/traceevent/event-parse.c                 |  62 ++-
 tools/lib/traceevent/event-plugin.c                |  24 +-
 tools/lib/traceevent/kbuffer-parse.c               |   4 +-
 tools/lib/traceevent/parse-filter.c                |  22 +-
 tools/perf/Documentation/perf-report.txt           |   2 +-
 tools/perf/Documentation/perf-script.txt           |  10 +-
 tools/perf/arch/arm/util/auxtrace.c                |  77 +++-
 tools/perf/arch/arm/util/pmu.c                     |   6 +
 tools/perf/arch/arm64/util/Build                   |   3 +-
 tools/perf/arch/arm64/util/arm-spe.c               | 225 ++++++++++
 tools/perf/arch/x86/util/header.c                  |   2 +-
 tools/perf/builtin-c2c.c                           |   5 +-
 tools/perf/builtin-report.c                        |  34 +-
 tools/perf/builtin-script.c                        |  25 +-
 tools/perf/builtin-trace.c                         |  62 +--
 tools/perf/scripts/python/bin/mem-phys-addr-record |  19 +
 tools/perf/scripts/python/bin/mem-phys-addr-report |   3 +
 tools/perf/scripts/python/mem-phys-addr.py         |  95 +++++
 tools/perf/tests/dwarf-unwind.c                    |   1 +
 tools/perf/util/Build                              |   2 +
 tools/perf/util/arm-spe-pkt-decoder.c              | 462 +++++++++++++++++++++
 tools/perf/util/arm-spe-pkt-decoder.h              |  43 ++
 tools/perf/util/arm-spe.c                          | 231 +++++++++++
 tools/perf/util/arm-spe.h                          |  31 ++
 tools/perf/util/auxtrace.c                         |   3 +
 tools/perf/util/auxtrace.h                         |   1 +
 tools/perf/util/callchain.c                        |  10 +
 tools/perf/util/callchain.h                        |   2 +
 tools/perf/util/evlist.c                           |   2 +-
 tools/perf/util/evsel.c                            |  40 +-
 .../util/scripting-engines/trace-event-python.c    |   2 +
 tools/perf/util/time-utils.c                       |  72 +++-
 tools/perf/util/time-utils.h                       |   2 +
 tools/perf/util/unwind-libunwind-local.c           |   9 +-
 tools/perf/util/util.c                             |   2 +-
 35 files changed, 1465 insertions(+), 130 deletions(-)
 create mode 100644 tools/perf/arch/arm64/util/arm-spe.c
 create mode 100644 tools/perf/scripts/python/bin/mem-phys-addr-record
 create mode 100644 tools/perf/scripts/python/bin/mem-phys-addr-report
 create mode 100644 tools/perf/scripts/python/mem-phys-addr.py
 create mode 100644 tools/perf/util/arm-spe-pkt-decoder.c
 create mode 100644 tools/perf/util/arm-spe-pkt-decoder.h
 create mode 100644 tools/perf/util/arm-spe.c
 create mode 100644 tools/perf/util/arm-spe.h

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support.  Where clang is available, it is also used to build
perf with/without libelf.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android ones,
and those may not have all the features built, due to lack of multi-arch
devel packages, available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

  Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  # dm
   1 36.63 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2 44.29 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3 41.00 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4 43.95 alpine:edge                   : Ok   gcc (Alpine 6.4.0) 6.4.0
   5 35.44 amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
   6 40.90 amazonlinux:2                 : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
   7 26.89 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
   8 27.57 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
   9 22.78 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  10 35.27 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
  11 40.16 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
  12 34.36 debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  13 37.23 debian:8                      : Ok   gcc (Debian 4.9.2-10) 4.9.2
  14 62.09 debian:9                      : Ok   gcc (Debian 6.3.0-18) 6.3.0 20170516
  15 64.09 debian:experimental           : Ok   gcc (Debian 7.2.0-18) 7.2.0
  16 39.24 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
  17 37.88 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
  18 34.81 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 7.2.0-11) 7.2.0
  19 37.26 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
  20 37.58 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  21 40.34 fedora:21                     : Ok   gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
  22 38.42 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  23 40.07 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  24 45.57 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  25 35.50 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  26 80.48 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  27 88.31 fedora:26                     : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
  28 86.02 fedora:27                     : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
  29 78.37 fedora:rawhide                : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-4)
  30 39.71 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 6.4.0-r1 p1.3) 6.4.0
  31 40.65 mageia:5                      : Ok   gcc (GCC) 4.9.2
  32 42.26 mageia:6                      : Ok   gcc (Mageia 5.4.0-5.mga6) 5.4.0
  33 40.11 opensuse:42.1                 : Ok   gcc (SUSE Linux) 4.8.5
  34 38.80 opensuse:42.2                 : Ok   gcc (SUSE Linux) 4.8.5
  35 40.91 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5
  36 83.55 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 7.2.1 20171020 [gcc-7-branch revision 253932]
  37 32.24 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
  38 42.29 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
  39 33.96 ubuntu:12.04.5                : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  40 38.79 ubuntu:14.04.4                : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
  41 32.93 ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
  42 61.08 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
  43 32.29 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  44 32.09 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  45 31.18 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  46 31.50 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.1) 5.4.0 20160609
  47 32.23 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  48 30.79 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  49 63.98 ubuntu:16.10                  : Ok   gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
  50 64.41 ubuntu:17.04                  : Ok   gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406
  51 64.98 ubuntu:17.10                  : Ok   gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
  52 68.13 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.2.0-18ubuntu2) 7.2.0
  #

  # uname -a
  Linux seventh 4.15.0-rc8+ #1 SMP Wed Jan 17 10:45:59 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Number of exit events of a simple workload            : Ok
  22: Software clock events period values                   : Ok
  23: Object code reading                                   : Ok
  24: Sample parsing                                        : Ok
  25: Use a dummy software event to keep tracking           : Ok
  26: Parse with no sample_id_all bit set                   : Ok
  27: Filter hist entries                                   : Ok
  28: Lookup mmap thread                                    : Ok
  29: Share thread mg                                       : Ok
  30: Sort output of hist entries                           : Ok
  31: Cumulate child hist entries                           : Ok
  32: Track with sched_switch                               : Ok
  33: Filter fds with revents mask in a fdarray             : Ok
  34: Add fd to a fdarray, making it autogrow               : Ok
  35: kmod_path__parse                                      : Ok
  36: Thread map                                            : Ok
  37: LLVM search and compile                               :
  37.1: Basic BPF llvm compile                              : Ok
  37.2: kbuild searching                                    : Ok
  37.3: Compile source for BPF prologue generation          : Ok
  37.4: Compile source for BPF relocation                   : Ok
  38: Session topology                                      : Ok
  39: BPF filter                                            :
  39.1: Basic BPF filtering                                 : Ok
  39.2: BPF pinning                                         : Ok
  39.3: BPF prologue generation                             : Ok
  39.4: BPF relocation checker                              : Ok
  40: Synthesize thread map                                 : Ok
  41: Remove thread map                                     : Ok
  42: Synthesize cpu map                                    : Ok
  43: Synthesize stat config                                : Ok
  44: Synthesize stat                                       : Ok
  45: Synthesize stat round                                 : Ok
  46: Synthesize attr update                                : Ok
  47: Event times                                           : Ok
  48: Read backward ring buffer                             : Ok
  49: Print cpu map                                         : Ok
  50: Probe SDT events                                      : Ok
  51: is_printable_array                                    : Ok
  52: Print bitmap                                          : Ok
  53: perf hooks                                            : Ok
  54: builtin clang support                                 : Skip (not compiled in)
  55: unit_number__scnprintf                                : Ok
  56: x86 rdpmc                                             : Ok
  57: Convert perf time to TSC                              : Ok
  58: DWARF unwind                                          : Ok
  59: x86 instruction decoder - new instructions            : Ok
  60: Use vfs_getname probe to get syscall args filenames   : Ok
  61: Check open filename arg using perf trace + vfs_getname: Ok
  62: probe libc's inet_pton & backtrace it with ping       : Ok
  63: Add vfs_getname probe to get syscall args filenames   : Ok
  #
  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
              make_clean_all_O: make clean all
            make_no_demangle_O: make NO_DEMANGLE=1
           make_no_backtrace_O: make NO_BACKTRACE=1
             make_no_libperl_O: make NO_LIBPERL=1
         make_with_clangllvm_O: make LIBCLANGLLVM=1
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
              make_no_libelf_O: make NO_LIBELF=1
                    make_doc_O: make doc
                  make_debug_O: make DEBUG=1
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
                   make_tags_O: make tags
                   make_help_O: make help
       make_util_pmu_bison_o_O: make util/pmu-bison.o
             make_util_map_o_O: make util/map.o
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
           make_no_libbionic_O: make NO_LIBBIONIC=1
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
        make_with_babeltrace_O: make LIBBABELTRACE=1
            make_no_auxtrace_O: make NO_AUXTRACE=1
         make_install_prefix_O: make install prefix=/tmp/krava
              make_no_libbpf_O: make NO_LIBBPF=1
                 make_perf_o_O: make perf.o
           make_no_libpython_O: make NO_LIBPYTHON=1
                make_install_O: make install
                   make_pure_O: make
                 make_static_O: make LDFLAGS=-static
               make_no_slang_O: make NO_SLANG=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
             make_no_libnuma_O: make NO_LIBNUMA=1
                make_no_gtk2_O: make NO_GTK2=1
           make_no_libunwind_O: make NO_LIBUNWIND=1
                make_no_newt_O: make NO_NEWT=1
            make_install_bin_O: make install-bin
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 01/32] perf evsel: Fix incorrect handling of type _TERM_DRV_CFG
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2018-01-17 16:11 ` Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 02/32] perf evlist: Remove trailing semicolon Arnaldo Carvalho de Melo
                   ` (31 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Mathieu Poirier,
	Alexander Shishkin, Andi Kleen, Namhyung Kim, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Mathieu Poirier <mathieu.poirier@linaro.org>

Commit ("d0565132605f perf evsel: Enable type checking for
perf_evsel_config_term types") assumes PERF_EVSEL__CONFIG_TERM_DRV_CFG
isn't used and as such adds a BUG_ON().

Since the enumeration type is used in macro ADD_CONFIG_TERM() the change
break CoreSight trace acquisition.

This patch restores the original code.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: d0565132605f ("perf evsel: Enable type checking for perf_evsel_config_term types")
Link: http://lkml.kernel.org/r/1515617211-32024-1-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d934f04e3110..4eea3b404507 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -781,7 +781,7 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			attr->write_backward = term->val.overwrite ? 1 : 0;
 			break;
 		case PERF_EVSEL__CONFIG_TERM_DRV_CFG:
-			BUG_ON(1);
+			break;
 		default:
 			break;
 		}
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 02/32] perf evlist: Remove trailing semicolon
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 01/32] perf evsel: Fix incorrect handling of type _TERM_DRV_CFG Arnaldo Carvalho de Melo
@ 2018-01-17 16:11 ` Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 03/32] perf script python: Add script to profile and resolve physical mem type Arnaldo Carvalho de Melo
                   ` (30 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Luis de Bethencourt,
	Alexander Shishkin, Jiri Olsa, Joe Perches, Namhyung Kim,
	Peter Zijlstra, Arnaldo Carvalho de Melo

From: Luis de Bethencourt <luisbg@kernel.org>

The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.

Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180111155020.9782-1-luisbg@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f0a5e09c4071..120efd85f2c8 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1760,7 +1760,7 @@ void perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist,
 	switch (old_state) {
 	case BKW_MMAP_NOTREADY: {
 		if (state != BKW_MMAP_RUNNING)
-			goto state_err;;
+			goto state_err;
 		break;
 	}
 	case BKW_MMAP_RUNNING: {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 03/32] perf script python: Add script to profile and resolve physical mem type
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 01/32] perf evsel: Fix incorrect handling of type _TERM_DRV_CFG Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 02/32] perf evlist: Remove trailing semicolon Arnaldo Carvalho de Melo
@ 2018-01-17 16:11 ` Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 04/32] perf trace: No need to set PERF_SAMPLE_IDENTIFIER explicitely Arnaldo Carvalho de Melo
                   ` (29 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Kan Liang, Dan Williams,
	Jiri Olsa, Peter Zijlstra, Philippe Ombredanne, Stephane Eranian,
	Arnaldo Carvalho de Melo

From: Kan Liang <Kan.liang@intel.com>

There could be different types of memory in the system. E.g normal
System Memory, Persistent Memory. To understand how the workload maps to
those memories, it's important to know the I/O statistics of them.  Perf
can collect physical addresses, but those are raw data.  It still needs
extra work to resolve the physical addresses.  Provide a script to
facilitate the physical addresses resolving and I/O statistics.

Profile with MEM_INST_RETIRED.ALL_LOADS or MEM_UOPS_RETIRED.ALL_LOADS
event if any of them is available.

Look up the /proc/iomem and resolve the physical address.  Provide
memory type summary.

Here is an example output:

  # perf script report mem-phys-addr
  Event: mem_inst_retired.all_loads:P
  Memory type                                    count   percentage
  ----------------------------------------  -----------  -----------
  System RAM                                        74        53.2%
  Persistent Memory                                 55        39.6%
  N/A

  ---

Changes since V2:
 - Apply the new license rules.
 - Add comments for globals

Changes since V1:
 - Do not mix DLA and Load Latency. Do not compare the loads and stores.
   Only profile the loads.
 - Use event name to replace the RAW event

Signed-off-by: Kan Liang <Kan.liang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lkml.kernel.org/r/1515099595-34770-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/scripts/python/bin/mem-phys-addr-record | 19 +++++
 tools/perf/scripts/python/bin/mem-phys-addr-report |  3 +
 tools/perf/scripts/python/mem-phys-addr.py         | 95 ++++++++++++++++++++++
 .../util/scripting-engines/trace-event-python.c    |  2 +
 4 files changed, 119 insertions(+)
 create mode 100644 tools/perf/scripts/python/bin/mem-phys-addr-record
 create mode 100644 tools/perf/scripts/python/bin/mem-phys-addr-report
 create mode 100644 tools/perf/scripts/python/mem-phys-addr.py

diff --git a/tools/perf/scripts/python/bin/mem-phys-addr-record b/tools/perf/scripts/python/bin/mem-phys-addr-record
new file mode 100644
index 000000000000..5a875122a904
--- /dev/null
+++ b/tools/perf/scripts/python/bin/mem-phys-addr-record
@@ -0,0 +1,19 @@
+#!/bin/bash
+
+#
+# Profiling physical memory by all retired load instructions/uops event
+# MEM_INST_RETIRED.ALL_LOADS or MEM_UOPS_RETIRED.ALL_LOADS
+#
+
+load=`perf list | grep mem_inst_retired.all_loads`
+if [ -z "$load" ]; then
+	load=`perf list | grep mem_uops_retired.all_loads`
+fi
+if [ -z "$load" ]; then
+	echo "There is no event to count all retired load instructions/uops."
+	exit 1
+fi
+
+arg=$(echo $load | tr -d ' ')
+arg="$arg:P"
+perf record --phys-data -e $arg $@
diff --git a/tools/perf/scripts/python/bin/mem-phys-addr-report b/tools/perf/scripts/python/bin/mem-phys-addr-report
new file mode 100644
index 000000000000..3f2b847e2eab
--- /dev/null
+++ b/tools/perf/scripts/python/bin/mem-phys-addr-report
@@ -0,0 +1,3 @@
+#!/bin/bash
+# description: resolve physical address samples
+perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/mem-phys-addr.py
diff --git a/tools/perf/scripts/python/mem-phys-addr.py b/tools/perf/scripts/python/mem-phys-addr.py
new file mode 100644
index 000000000000..ebee2c5ae496
--- /dev/null
+++ b/tools/perf/scripts/python/mem-phys-addr.py
@@ -0,0 +1,95 @@
+# mem-phys-addr.py: Resolve physical address samples
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (c) 2018, Intel Corporation.
+
+from __future__ import division
+import os
+import sys
+import struct
+import re
+import bisect
+import collections
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+	'/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+#physical address ranges for System RAM
+system_ram = []
+#physical address ranges for Persistent Memory
+pmem = []
+#file object for proc iomem
+f = None
+#Count for each type of memory
+load_mem_type_cnt = collections.Counter()
+#perf event name
+event_name = None
+
+def parse_iomem():
+	global f
+	f = open('/proc/iomem', 'r')
+	for i, j in enumerate(f):
+		m = re.split('-|:',j,2)
+		if m[2].strip() == 'System RAM':
+			system_ram.append(long(m[0], 16))
+			system_ram.append(long(m[1], 16))
+		if m[2].strip() == 'Persistent Memory':
+			pmem.append(long(m[0], 16))
+			pmem.append(long(m[1], 16))
+
+def print_memory_type():
+	print "Event: %s" % (event_name)
+	print "%-40s  %10s  %10s\n" % ("Memory type", "count", "percentage"),
+	print "%-40s  %10s  %10s\n" % ("----------------------------------------", \
+					"-----------", "-----------"),
+	total = sum(load_mem_type_cnt.values())
+	for mem_type, count in sorted(load_mem_type_cnt.most_common(), \
+					key = lambda(k, v): (v, k), reverse = True):
+		print "%-40s  %10d  %10.1f%%\n" % (mem_type, count, 100 * count / total),
+
+def trace_begin():
+	parse_iomem()
+
+def trace_end():
+	print_memory_type()
+	f.close()
+
+def is_system_ram(phys_addr):
+	#/proc/iomem is sorted
+	position = bisect.bisect(system_ram, phys_addr)
+	if position % 2 == 0:
+		return False
+	return True
+
+def is_persistent_mem(phys_addr):
+	position = bisect.bisect(pmem, phys_addr)
+	if position % 2 == 0:
+		return False
+	return True
+
+def find_memory_type(phys_addr):
+	if phys_addr == 0:
+		return "N/A"
+	if is_system_ram(phys_addr):
+		return "System RAM"
+
+	if is_persistent_mem(phys_addr):
+		return "Persistent Memory"
+
+	#slow path, search all
+	f.seek(0, 0)
+	for j in f:
+		m = re.split('-|:',j,2)
+		if long(m[0], 16) <= phys_addr <= long(m[1], 16):
+			return m[2]
+	return "N/A"
+
+def process_event(param_dict):
+	name       = param_dict["ev_name"]
+	sample     = param_dict["sample"]
+	phys_addr  = sample["phys_addr"]
+
+	global event_name
+	if event_name == None:
+		event_name = name
+	load_mem_type_cnt[find_memory_type(phys_addr)] += 1
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index c1848b543f27..ea070883c593 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -499,6 +499,8 @@ static PyObject *get_perf_sample_dict(struct perf_sample *sample,
 			PyLong_FromUnsignedLongLong(sample->time));
 	pydict_set_item_string_decref(dict_sample, "period",
 			PyLong_FromUnsignedLongLong(sample->period));
+	pydict_set_item_string_decref(dict_sample, "phys_addr",
+			PyLong_FromUnsignedLongLong(sample->phys_addr));
 	set_sample_read_in_dict(dict_sample, sample, evsel);
 	pydict_set_item_string_decref(dict, "sample", dict_sample);
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 04/32] perf trace: No need to set PERF_SAMPLE_IDENTIFIER explicitely
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2018-01-17 16:11 ` [PATCH 03/32] perf script python: Add script to profile and resolve physical mem type Arnaldo Carvalho de Melo
@ 2018-01-17 16:11 ` Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 05/32] perf tools: Fix copyfile_offset update of output offset Arnaldo Carvalho de Melo
                   ` (28 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Hendrick Brueckner, Jiri Olsa,
	Namhyung Kim, Thomas Richter, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Since 75562573bab3 ("perf tools: Add support for
PERF_SAMPLE_IDENTIFIER") we don't need explicitely set
PERF_SAMPLE_IDENTIFIER, as perf_evlist__config() will do this for us,
i.e. when there are more than one evsel in an evlist, it will check if
some evsel has a sample_type different than the one on the first evsel
in the list, setting PERF_SAMPLE_IDENTIFIER in that case.

So, to simplify 'perf trace' codebase, ditch that check.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-12xq6orhwttee2tdtu96ucrp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 23 -----------------------
 1 file changed, 23 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 71e64bdca86f..e84816d02117 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2348,40 +2348,17 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	perf_evlist__config(evlist, &trace->opts, NULL);
 
 	if (callchain_param.enabled) {
-		bool use_identifier = false;
-
 		if (trace->syscalls.events.sys_exit) {
 			perf_evsel__config_callchain(trace->syscalls.events.sys_exit,
 						     &trace->opts, &callchain_param);
-			use_identifier = true;
 		}
 
 		if (pgfault_maj) {
 			perf_evsel__config_callchain(pgfault_maj, &trace->opts, &callchain_param);
-			use_identifier = true;
 		}
 
 		if (pgfault_min) {
 			perf_evsel__config_callchain(pgfault_min, &trace->opts, &callchain_param);
-			use_identifier = true;
-		}
-
-		if (use_identifier) {
-		       /*
-			* Now we have evsels with different sample_ids, use
-			* PERF_SAMPLE_IDENTIFIER to map from sample to evsel
-			* from a fixed position in each ring buffer record.
-			*
-			* As of this the changeset introducing this comment, this
-			* isn't strictly needed, as the fields that can come before
-			* PERF_SAMPLE_ID are all used, but we'll probably disable
-			* some of those for things like copying the payload of
-			* pointer syscall arguments, and for vfs_getname we don't
-			* need PERF_SAMPLE_ADDR and PERF_SAMPLE_IP, so do this
-			* here as a warning we need to use PERF_SAMPLE_IDENTIFIER.
-			*/
-			perf_evlist__set_sample_bit(evlist, IDENTIFIER);
-			perf_evlist__reset_sample_bit(evlist, ID);
 		}
 	}
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 05/32] perf tools: Fix copyfile_offset update of output offset
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2018-01-17 16:11 ` [PATCH 04/32] perf trace: No need to set PERF_SAMPLE_IDENTIFIER explicitely Arnaldo Carvalho de Melo
@ 2018-01-17 16:11 ` Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 06/32] perf evsel: Check if callchain is enabled before setting it up Arnaldo Carvalho de Melo
                   ` (27 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, Alexander Shishkin,
	Andi Kleen, David Ahern, Namhyung Kim, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

We need to increase output offset in each iteration, not decrease it as
we currently do.

I guess we were lucky to finish in most cases in first iteration, so the
bug never showed. However it shows a lot when working with big (~4GB)
size data.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 9c9f5a2f1944 ("perf tools: Introduce copyfile_offset() function")
Link: http://lkml.kernel.org/r/20180109133923.25406-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/util.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index a789f952b3e9..443892dabedb 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -210,7 +210,7 @@ static int copyfile_offset(int ifd, loff_t off_in, int ofd, loff_t off_out, u64
 
 		size -= ret;
 		off_in += ret;
-		off_out -= ret;
+		off_out += ret;
 	}
 	munmap(ptr, off_in + size);
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 06/32] perf evsel: Check if callchain is enabled before setting it up
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2018-01-17 16:11 ` [PATCH 05/32] perf tools: Fix copyfile_offset update of output offset Arnaldo Carvalho de Melo
@ 2018-01-17 16:11 ` Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 07/32] perf trace: Fix setting of --call-graph/--max-stack for non-syscall events Arnaldo Carvalho de Melo
                   ` (26 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Hendrick Brueckner, Jiri Olsa,
	Namhyung Kim, Thomas Richter, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The construct:

	if (callchain_param)
		perf_evsel__config_callchain(evsel, opts, &callchain_param);

happens in several places, so make perf_evsel__config_callchain() work
just like free(NULL), do nothing if param->enabled is not set.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ykk0qzxnxwx3o611ctjnmxav@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4eea3b404507..efa2e629a669 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -651,9 +651,9 @@ int perf_evsel__group_desc(struct perf_evsel *evsel, char *buf, size_t size)
 	return ret;
 }
 
-void perf_evsel__config_callchain(struct perf_evsel *evsel,
-				  struct record_opts *opts,
-				  struct callchain_param *param)
+static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
+					   struct record_opts *opts,
+					   struct callchain_param *param)
 {
 	bool function = perf_evsel__is_function_event(evsel);
 	struct perf_event_attr *attr = &evsel->attr;
@@ -699,6 +699,14 @@ void perf_evsel__config_callchain(struct perf_evsel *evsel,
 	}
 }
 
+void perf_evsel__config_callchain(struct perf_evsel *evsel,
+				  struct record_opts *opts,
+				  struct callchain_param *param)
+{
+	if (param->enabled)
+		return __perf_evsel__config_callchain(evsel, opts, param);
+}
+
 static void
 perf_evsel__reset_callgraph(struct perf_evsel *evsel,
 			    struct callchain_param *param)
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 07/32] perf trace: Fix setting of --call-graph/--max-stack for non-syscall events
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2018-01-17 16:11 ` [PATCH 06/32] perf evsel: Check if callchain is enabled before setting it up Arnaldo Carvalho de Melo
@ 2018-01-17 16:11 ` Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 08/32] tools lib traceevent: Fix bad force_token escape sequence Arnaldo Carvalho de Melo
                   ` (25 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Hendrick Brueckner, Jiri Olsa,
	Namhyung Kim, Thomas Richter, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The raw_syscalls:sys_{enter,exit} were first supported in 'perf trace',
together with minor and major page faults, then we supported
--call-graph, then --max-stack, but when the other tracepoints got
supported, and bpf, etc, I forgot to make those global call-graph
settings apply to them.

Fix it by realizing that the global --max-stack and --call-graph
settings are done via:

        OPT_CALLBACK(0, "call-graph", &trace.opts,
                     "record_mode[,record_size]", record_callchain_help,
                     &record_parse_callchain_opt),

And then, when we go to parse the events in -e via:

        OPT_CALLBACK('e', "event", &trace, "event",
                     "event/syscall selector. use 'perf list' to list available events",
                     trace__parse_events_option),

And trace__parse_sevents_option() calls:

                struct option o = OPT_CALLBACK('e', "event", &trace->evlist, "event",
                                               "event selector. use 'perf list' to list available events",
                                               parse_events_option);
                err = parse_events_option(&o, lists[0], 0);

parse_events_option() will override the global --call-graph and
--max-stack if the "call-graph" and/or "max-stack" terms are in the
event definition, such as in the probe_libc:inet_pton event in one of the
examples below (-e probe_libc:inet_pton/max-stack=2).

Before:

  # perf trace --mmap 1024 --call-graph dwarf -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1
       1.525 (         ): probe_libc:inet_pton:(7f77f3ac9350))
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.071 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.071/0.071/0.071/0.000 ms
       1.677 ( 0.081 ms): ping/31296 sendto(fd: 3, buff: 0x55681b652720, len: 64, addr: 0x55681b650640, addr_len: 28) = 64
                                         __libc_sendto (/usr/lib64/libc-2.26.so)
                                         [0xffffaa97e4bc9cef] (/usr/bin/ping)
                                         [0xffffaa97e4bc656d] (/usr/bin/ping)
                                         [0xffffaa97e4bc7d0a] (/usr/bin/ping)
                                         [0xffffaa97e4bca447] (/usr/bin/ping)
                                         [0xffffaa97e4bc2f91] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa97e4bc3379] (/usr/bin/ping)
  #

After:

  # perf trace --mmap 1024 --call-graph dwarf -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.089 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.089/0.089/0.089/0.000 ms
       1.955 (         ): probe_libc:inet_pton:(7f383a311350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa5d91444f3f] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa5d91445379] (/usr/bin/ping)
       2.140 ( 0.101 ms): ping/32047 sendto(fd: 3, buff: 0x55a26edd0720, len: 64, addr: 0x55a26edce640, addr_len: 28) = 64
                                         __libc_sendto (/usr/lib64/libc-2.26.so)
                                         [0xffffaa5d9144bcef] (/usr/bin/ping)
                                         [0xffffaa5d9144856d] (/usr/bin/ping)
                                         [0xffffaa5d91449d0a] (/usr/bin/ping)
                                         [0xffffaa5d9144c447] (/usr/bin/ping)
                                         [0xffffaa5d91444f91] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa5d91445379] (/usr/bin/ping)
  #

Same thing for --max-stack, the global one:

  # perf trace --max-stack 3 -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.097 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.097/0.097/0.097/0.000 ms
       1.577 (         ): probe_libc:inet_pton:(7f32f3957350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
       1.738 ( 0.108 ms): ping/32103 sendto(fd: 3, buff: 0x55c3132d7720, len: 64, addr: 0x55c3132d5640, addr_len: 28) = 64
                                         __libc_sendto (/usr/lib64/libc-2.26.so)
                                         [0xffffaa3cecf44cef] (/usr/bin/ping)
                                         [0xffffaa3cecf4156d] (/usr/bin/ping)
  #

And then setting up a global setting (dwarf, max-stack=4), that will
affect the raw_syscall:sys_enter for the 'sendto' syscall and that will
be overriden in the probe_libc:inet_pton call to just one entry.

  # perf trace --max-stack=4 --call-graph dwarf -e sendto -e probe_libc:inet_pton/max-stack=1/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.090 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.090/0.090/0.090/0.000 ms
       2.140 (         ): probe_libc:inet_pton:(7f9fe9337350))
                                         __GI___inet_pton (/usr/lib64/libc-2.26.so)
       2.283 ( 0.103 ms): ping/31804 sendto(fd: 3, buff: 0x55c7f3e19720, len: 64, addr: 0x55c7f3e17640, addr_len: 28) = 64
                                         __libc_sendto (/usr/lib64/libc-2.26.so)
                                         [0xffffaa380c402cef] (/usr/bin/ping)
                                         [0xffffaa380c3ff56d] (/usr/bin/ping)
                                         [0xffffaa380c400d0a] (/usr/bin/ping)
  #

Install iputils-debuginfo to get those /usr/bin/ping addresses resolved,
those routines are not on its .dymsym nor .symtab :-)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-qgl2gse8elhh9zztw4ajopg3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index e84816d02117..0362974854e9 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2222,6 +2222,9 @@ static int trace__add_syscall_newtp(struct trace *trace)
 	if (perf_evsel__init_sc_tp_uint_field(sys_exit, ret))
 		goto out_delete_sys_exit;
 
+	perf_evsel__config_callchain(sys_enter, &trace->opts, &callchain_param);
+	perf_evsel__config_callchain(sys_exit, &trace->opts, &callchain_param);
+
 	perf_evlist__add(evlist, sys_enter);
 	perf_evlist__add(evlist, sys_exit);
 
@@ -2318,6 +2321,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		pgfault_maj = perf_evsel__new_pgfault(PERF_COUNT_SW_PAGE_FAULTS_MAJ);
 		if (pgfault_maj == NULL)
 			goto out_error_mem;
+		perf_evsel__config_callchain(pgfault_maj, &trace->opts, &callchain_param);
 		perf_evlist__add(evlist, pgfault_maj);
 	}
 
@@ -2325,6 +2329,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		pgfault_min = perf_evsel__new_pgfault(PERF_COUNT_SW_PAGE_FAULTS_MIN);
 		if (pgfault_min == NULL)
 			goto out_error_mem;
+		perf_evsel__config_callchain(pgfault_min, &trace->opts, &callchain_param);
 		perf_evlist__add(evlist, pgfault_min);
 	}
 
@@ -2347,21 +2352,6 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 
 	perf_evlist__config(evlist, &trace->opts, NULL);
 
-	if (callchain_param.enabled) {
-		if (trace->syscalls.events.sys_exit) {
-			perf_evsel__config_callchain(trace->syscalls.events.sys_exit,
-						     &trace->opts, &callchain_param);
-		}
-
-		if (pgfault_maj) {
-			perf_evsel__config_callchain(pgfault_maj, &trace->opts, &callchain_param);
-		}
-
-		if (pgfault_min) {
-			perf_evsel__config_callchain(pgfault_min, &trace->opts, &callchain_param);
-		}
-	}
-
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 08/32] tools lib traceevent: Fix bad force_token escape sequence
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2018-01-17 16:11 ` [PATCH 07/32] perf trace: Fix setting of --call-graph/--max-stack for non-syscall events Arnaldo Carvalho de Melo
@ 2018-01-17 16:11 ` Arnaldo Carvalho de Melo
  2018-01-17 16:11 ` [PATCH 09/32] tools lib traceevent: Show value of flags that have not been parsed Arnaldo Carvalho de Melo
                   ` (24 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Michael Sartain, Andrew Morton,
	Steven Rostedt, Arnaldo Carvalho de Melo

From: Michael Sartain <mikesart@fastmail.com>

Older kernels have a bug that creates invalid symbols. event-parse.c
handles them by replacing them with a "%s" token. But the fix included
an extra backslash, and "\%s" was added incorrectly.

Signed-off-by: Michael Sartain <mikesart@fastmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004821.827168881@goodmis.org
Link: http://lkml.kernel.org/r/d320000d37c10ce0912851e1fb78d1e0c946bcd9.1497486273.git.mikesart@fastmail.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/event-parse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 7ce724fc0544..0bc1a6df8a27 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -1094,7 +1094,7 @@ static enum event_type __read_token(char **tok)
 		if (strcmp(*tok, "LOCAL_PR_FMT") == 0) {
 			free(*tok);
 			*tok = NULL;
-			return force_token("\"\%s\" ", tok);
+			return force_token("\"%s\" ", tok);
 		} else if (strcmp(*tok, "STA_PR_FMT") == 0) {
 			free(*tok);
 			*tok = NULL;
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 09/32] tools lib traceevent: Show value of flags that have not been parsed
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2018-01-17 16:11 ` [PATCH 08/32] tools lib traceevent: Fix bad force_token escape sequence Arnaldo Carvalho de Melo
@ 2018-01-17 16:11 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 10/32] tools lib traceevent: Print value of unknown symbolic fields Arnaldo Carvalho de Melo
                   ` (23 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Steven Rostedt (VMware),
	Andrew Morton, Arnaldo Carvalho de Melo

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

If the value contains bits that are not defined by print_flags() helper,
then show the remaining bits. This aligns with the functionality of the
kernel.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/e60c889f-55e7-4ee8-0e50-151e435ffd8c@siemens.com
Link: http://lkml.kernel.org/r/20180112004821.976225232@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/event-parse.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 0bc1a6df8a27..96c9c0b33423 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -3970,6 +3970,11 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
 				val &= ~fval;
 			}
 		}
+		if (val) {
+			if (print && arg->flags.delim)
+				trace_seq_puts(s, arg->flags.delim);
+			trace_seq_printf(s, "0x%llx", val);
+		}
 		break;
 	case PRINT_SYMBOL:
 		val = eval_num_arg(data, size, event, arg->symbol.field);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 10/32] tools lib traceevent: Print value of unknown symbolic fields
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2018-01-17 16:11 ` [PATCH 09/32] tools lib traceevent: Show value of flags that have not been parsed Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 11/32] tools lib traceevent: Simplify pointer print logic and fix %pF Arnaldo Carvalho de Melo
                   ` (22 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jan Kiszka, Andrew Morton,
	Steven Rostedt, Arnaldo Carvalho de Melo

From: Jan Kiszka <jan.kiszka@siemens.com>

Aligns trace-cmd with the behavior of the kernel.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/e60c889f-55e7-4ee8-0e50-151e435ffd8c@siemens.com
Link: http://lkml.kernel.org/r/20180112004822.118332436@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/event-parse.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 96c9c0b33423..87757eabbb08 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -3985,6 +3985,8 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
 				break;
 			}
 		}
+		if (!flag)
+			trace_seq_printf(s, "0x%llx", val);
 		break;
 	case PRINT_HEX:
 	case PRINT_HEX_STR:
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 11/32] tools lib traceevent: Simplify pointer print logic and fix %pF
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 10/32] tools lib traceevent: Print value of unknown symbolic fields Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 12/32] tools lib traceevent: Handle new pointer processing of bprint strings Arnaldo Carvalho de Melo
                   ` (21 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Steven Rostedt (VMware),
	Andrew Morton, Arnaldo Carvalho de Melo

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

When processing %pX in pretty_print(), simplify the logic slightly by
incrementing the ptr to the format string if isalnum(ptr[1]) is true.
This follows the logic a bit more closely to what is in the kernel.

Also, this fixes a small bug where %pF was not giving the offset of the
function.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004822.260262257@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/event-parse.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 87757eabbb08..8757dd64e42c 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -4956,21 +4956,22 @@ static void pretty_print(struct trace_seq *s, void *data, int size, struct event
 				else
 					ls = 2;
 
-				if (*(ptr+1) == 'F' || *(ptr+1) == 'f' ||
-				    *(ptr+1) == 'S' || *(ptr+1) == 's') {
+				if (isalnum(ptr[1]))
 					ptr++;
+
+				if (*ptr == 'F' || *ptr == 'f' ||
+				    *ptr == 'S' || *ptr == 's') {
 					show_func = *ptr;
-				} else if (*(ptr+1) == 'M' || *(ptr+1) == 'm') {
-					print_mac_arg(s, *(ptr+1), data, size, event, arg);
-					ptr++;
+				} else if (*ptr == 'M' || *ptr == 'm') {
+					print_mac_arg(s, *ptr, data, size, event, arg);
 					arg = arg->next;
 					break;
-				} else if (*(ptr+1) == 'I' || *(ptr+1) == 'i') {
+				} else if (*ptr == 'I' || *ptr == 'i') {
 					int n;
 
-					n = print_ip_arg(s, ptr+1, data, size, event, arg);
+					n = print_ip_arg(s, ptr, data, size, event, arg);
 					if (n > 0) {
-						ptr += n;
+						ptr += n - 1;
 						arg = arg->next;
 						break;
 					}
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 12/32] tools lib traceevent: Handle new pointer processing of bprint strings
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 11/32] tools lib traceevent: Simplify pointer print logic and fix %pF Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 13/32] tools lib traceevent: Show contents (in hex) of data of unrecognized type records Arnaldo Carvalho de Melo
                   ` (20 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Steven Rostedt (VMware),
	Andrew Morton, Arnaldo Carvalho de Melo

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The Linux kernel printf() has some extended use cases that dereference
the pointer. This is dangerouse for tracing because the pointer that is
dereferenced can change or even be unmapped. It also causes issues when
the trace data is extracted, because user space does not have access to
the contents of the pointer even if it still exists.

To handle this, the kernel was updated to process these dereferenced
pointers at the time they are recorded, and not post processed. Now they
exist in the tracing buffer, and no dereference is needed at the time of
reading the trace.

The event parsing library needs to handle this new case.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004822.403349289@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/event-parse.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 8757dd64e42c..344a034a8fbc 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -4300,6 +4300,26 @@ static struct print_arg *make_bprint_args(char *fmt, void *data, int size, struc
 				goto process_again;
 			case 'p':
 				ls = 1;
+				if (isalnum(ptr[1])) {
+					ptr++;
+					/* Check for special pointers */
+					switch (*ptr) {
+					case 's':
+					case 'S':
+					case 'f':
+					case 'F':
+						break;
+					default:
+						/*
+						 * Older kernels do not process
+						 * dereferenced pointers.
+						 * Only process if the pointer
+						 * value is a printable.
+						 */
+						if (isprint(*(char *)bptr))
+							goto process_string;
+					}
+				}
 				/* fall through */
 			case 'd':
 			case 'u':
@@ -4352,6 +4372,7 @@ static struct print_arg *make_bprint_args(char *fmt, void *data, int size, struc
 
 				break;
 			case 's':
+ process_string:
 				arg = alloc_arg();
 				if (!arg) {
 					do_warning_event(event, "%s(%d): not enough memory!",
@@ -4959,6 +4980,11 @@ static void pretty_print(struct trace_seq *s, void *data, int size, struct event
 				if (isalnum(ptr[1]))
 					ptr++;
 
+				if (arg->type == PRINT_BSTRING) {
+					trace_seq_puts(s, arg->string.string);
+					break;
+				}
+
 				if (*ptr == 'F' || *ptr == 'f' ||
 				    *ptr == 'S' || *ptr == 's') {
 					show_func = *ptr;
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 13/32] tools lib traceevent: Show contents (in hex) of data of unrecognized type records
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 12/32] tools lib traceevent: Handle new pointer processing of bprint strings Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 14/32] tools lib traceevent: Use asprintf when possible Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Steven Rostedt (VMware),
	Andrew Morton, Arnaldo Carvalho de Melo

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

When a record has an unrecognized type, an error message is reported,
but it would also be helpful to see the contents of that record. At
least show what it is in hex, instead of just showing a blank line.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004822.542204577@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/event-parse.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 344a034a8fbc..e5f2acbb70cc 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -5566,8 +5566,14 @@ void pevent_print_event(struct pevent *pevent, struct trace_seq *s,
 
 	event = pevent_find_event_by_record(pevent, record);
 	if (!event) {
-		do_warning("ug! no event found for type %d",
-			   trace_parse_common_type(pevent, record->data));
+		int i;
+		int type = trace_parse_common_type(pevent, record->data);
+
+		do_warning("ug! no event found for type %d", type);
+		trace_seq_printf(s, "[UNKNOWN TYPE %d]", type);
+		for (i = 0; i < record->size; i++)
+			trace_seq_printf(s, " %02x",
+					 ((unsigned char *)record->data)[i]);
 		return;
 	}
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 14/32] tools lib traceevent: Use asprintf when possible
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 13/32] tools lib traceevent: Show contents (in hex) of data of unrecognized type records Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 15/32] tools lib traceevent: Add UL suffix to MISSING_EVENTS Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Federico Vaga, Andrew Morton,
	Steven Rostedt, Arnaldo Carvalho de Melo

From: Federico Vaga <federico.vaga@vaga.pv.it>

It makes the code clearer and less error prone.

clearer:
- less code
- the code is now using the same format to create strings dynamically

less error prone:
- no magic number +2 +9 +5 to compute the size
- no copy&paste of the strings to compute the size and to concatenate

The function `asprintf` is not POSIX standard but the program
was already using it. Later it can be decided to use only POSIX
functions, then we can easly replace all the `asprintf(3)` with a local
implementation of that function.

Signed-off-by: Federico Vaga <federico.vaga@vaga.pv.it>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Federico Vaga <federico.vaga@vaga.pv.it>
Link: http://lkml.kernel.org/r/20170802221558.9684-2-federico.vaga@vaga.pv.it
Link: http://lkml.kernel.org/r/20180112004822.686281649@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/event-plugin.c | 24 +++++++++---------------
 tools/lib/traceevent/parse-filter.c | 11 ++++-------
 2 files changed, 13 insertions(+), 22 deletions(-)

diff --git a/tools/lib/traceevent/event-plugin.c b/tools/lib/traceevent/event-plugin.c
index a16756ae3526..d542cb60ca1a 100644
--- a/tools/lib/traceevent/event-plugin.c
+++ b/tools/lib/traceevent/event-plugin.c
@@ -120,12 +120,12 @@ char **traceevent_plugin_list_options(void)
 		for (op = reg->options; op->name; op++) {
 			char *alias = op->plugin_alias ? op->plugin_alias : op->file;
 			char **temp = list;
+			int ret;
 
-			name = malloc(strlen(op->name) + strlen(alias) + 2);
-			if (!name)
+			ret = asprintf(&name, "%s:%s", alias, op->name);
+			if (ret < 0)
 				goto err;
 
-			sprintf(name, "%s:%s", alias, op->name);
 			list = realloc(list, count + 2);
 			if (!list) {
 				list = temp;
@@ -290,17 +290,14 @@ load_plugin(struct pevent *pevent, const char *path,
 	const char *alias;
 	char *plugin;
 	void *handle;
+	int ret;
 
-	plugin = malloc(strlen(path) + strlen(file) + 2);
-	if (!plugin) {
+	ret = asprintf(&plugin, "%s/%s", path, file);
+	if (ret < 0) {
 		warning("could not allocate plugin memory\n");
 		return;
 	}
 
-	strcpy(plugin, path);
-	strcat(plugin, "/");
-	strcat(plugin, file);
-
 	handle = dlopen(plugin, RTLD_NOW | RTLD_GLOBAL);
 	if (!handle) {
 		warning("could not load plugin '%s'\n%s\n",
@@ -391,6 +388,7 @@ load_plugins(struct pevent *pevent, const char *suffix,
 	char *home;
 	char *path;
 	char *envdir;
+	int ret;
 
 	if (pevent->flags & PEVENT_DISABLE_PLUGINS)
 		return;
@@ -421,16 +419,12 @@ load_plugins(struct pevent *pevent, const char *suffix,
 	if (!home)
 		return;
 
-	path = malloc(strlen(home) + strlen(LOCAL_PLUGIN_DIR) + 2);
-	if (!path) {
+	ret = asprintf(&path, "%s/%s", home, LOCAL_PLUGIN_DIR);
+	if (ret < 0) {
 		warning("could not allocate plugin memory\n");
 		return;
 	}
 
-	strcpy(path, home);
-	strcat(path, "/");
-	strcat(path, LOCAL_PLUGIN_DIR);
-
 	load_plugins_dir(pevent, suffix, path, load_plugin, data);
 
 	free(path);
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 315df0a70265..2410afdcbcfe 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -287,12 +287,10 @@ find_event(struct pevent *pevent, struct event_list **events,
 		sys_name = NULL;
 	}
 
-	reg = malloc(strlen(event_name) + 3);
-	if (reg == NULL)
+	ret = asprintf(&reg, "^%s$", event_name);
+	if (ret < 0)
 		return PEVENT_ERRNO__MEM_ALLOC_FAILED;
 
-	sprintf(reg, "^%s$", event_name);
-
 	ret = regcomp(&ereg, reg, REG_ICASE|REG_NOSUB);
 	free(reg);
 
@@ -300,13 +298,12 @@ find_event(struct pevent *pevent, struct event_list **events,
 		return PEVENT_ERRNO__INVALID_EVENT_NAME;
 
 	if (sys_name) {
-		reg = malloc(strlen(sys_name) + 3);
-		if (reg == NULL) {
+		ret = asprintf(&reg, "^%s$", sys_name);
+		if (ret < 0) {
 			regfree(&ereg);
 			return PEVENT_ERRNO__MEM_ALLOC_FAILED;
 		}
 
-		sprintf(reg, "^%s$", sys_name);
 		ret = regcomp(&sreg, reg, REG_ICASE|REG_NOSUB);
 		free(reg);
 		if (ret) {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 15/32] tools lib traceevent: Add UL suffix to MISSING_EVENTS
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 14/32] tools lib traceevent: Use asprintf when possible Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 16/32] tools lib traceevent: Fix missing break in FALSE case of pevent_filter_clear_trivial() Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Michael Sartain, Andrew Morton,
	Steven Rostedt, Arnaldo Carvalho de Melo

From: Michael Sartain <mikesart@fastmail.com>

Add UL suffix to MISSING_EVENTS since ints shouldn't be left shifted by 31.

Signed-off-by: Michael Sartain <mikesart@fastmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20171016165542.13038-4-mikesart@fastmail.com
Link: http://lkml.kernel.org/r/20180112004822.829533885@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/kbuffer-parse.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/kbuffer-parse.c b/tools/lib/traceevent/kbuffer-parse.c
index c94e3641b046..ca424b157e46 100644
--- a/tools/lib/traceevent/kbuffer-parse.c
+++ b/tools/lib/traceevent/kbuffer-parse.c
@@ -24,8 +24,8 @@
 
 #include "kbuffer.h"
 
-#define MISSING_EVENTS (1 << 31)
-#define MISSING_STORED (1 << 30)
+#define MISSING_EVENTS (1UL << 31)
+#define MISSING_STORED (1UL << 30)
 
 #define COMMIT_MASK ((1 << 27) - 1)
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 16/32] tools lib traceevent: Fix missing break in FALSE case of pevent_filter_clear_trivial()
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 15/32] tools lib traceevent: Add UL suffix to MISSING_EVENTS Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 17/32] tools lib traceevent: Fix get_field_str() for dynamic strings Arnaldo Carvalho de Melo
                   ` (16 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Taeung Song, Andrew Morton,
	Steven Rostedt, Arnaldo Carvalho de Melo

From: Taeung Song <treeze.taeung@gmail.com>

Currently the FILTER_TRIVIAL_FALSE case has a missing break statement,
if the trivial type is FALSE, it will also run into the TRUE case, and
always be skipped as the TRUE statement will continue the loop on the
inverse condition of the FALSE statement.

Reported-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004823.012918807@goodmis.org
Link: http://lkml.kernel.org/r/1493218540-12296-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/parse-filter.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 2410afdcbcfe..2b9048f90bae 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1631,6 +1631,7 @@ int pevent_filter_clear_trivial(struct event_filter *filter,
 		case FILTER_TRIVIAL_FALSE:
 			if (filter_type->filter->boolean.value)
 				continue;
+			break;
 		case FILTER_TRIVIAL_TRUE:
 			if (!filter_type->filter->boolean.value)
 				continue;
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 17/32] tools lib traceevent: Fix get_field_str() for dynamic strings
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 16/32] tools lib traceevent: Fix missing break in FALSE case of pevent_filter_clear_trivial() Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 18/32] perf tools: Add ARM Statistical Profiling Extensions (SPE) support Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Steven Rostedt (VMware),
	Andrew Morton, Arnaldo Carvalho de Melo

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

If a field is a dynamic string, get_field_str() returned just the
offset/size value and not the string. Have it parse the offset/size
correctly to return the actual string. Otherwise filtering fails when
trying to filter fields that are dynamic strings.

Reported-by: Gopanapalli Pradeep <prap_hai@yahoo.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004823.146333275@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/parse-filter.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 2b9048f90bae..431e8b309f6e 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1877,17 +1877,25 @@ static const char *get_field_str(struct filter_arg *arg, struct pevent_record *r
 	struct pevent *pevent;
 	unsigned long long addr;
 	const char *val = NULL;
+	unsigned int size;
 	char hex[64];
 
 	/* If the field is not a string convert it */
 	if (arg->str.field->flags & FIELD_IS_STRING) {
 		val = record->data + arg->str.field->offset;
+		size = arg->str.field->size;
+
+		if (arg->str.field->flags & FIELD_IS_DYNAMIC) {
+			addr = *(unsigned int *)val;
+			val = record->data + (addr & 0xffff);
+			size = addr >> 16;
+		}
 
 		/*
 		 * We need to copy the data since we can't be sure the field
 		 * is null terminated.
 		 */
-		if (*(val + arg->str.field->size - 1)) {
+		if (*(val + size - 1)) {
 			/* copy it */
 			memcpy(arg->str.buffer, val, arg->str.field->size);
 			/* the buffer is already NULL terminated */
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 18/32] perf tools: Add ARM Statistical Profiling Extensions (SPE) support
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (16 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 17/32] tools lib traceevent: Fix get_field_str() for dynamic strings Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 19/32] perf callchain: Fix attr.sample_max_stack setting Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Kim Phillips, Alexander Shishkin,
	Andi Kleen, Jiri Olsa, linux-arm-kernel, Marc Zyngier,
	Mark Rutland, Mathieu Poirier, Pawel Moll, Peter Zijlstra,
	Rob Herring, Suzuki Poulouse, Thomas Gleixner, Wang Nan,
	Will Deacon, Arnaldo Carvalho de Melo

From: Kim Phillips <kim.phillips@arm.com>

'perf record' and 'perf report --dump-raw-trace' supported in this
release.

Example usage:

 # perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000
 # perf report --dump-raw-trace

Note that the perf.data file is portable, so the report can be run on
another architecture host if necessary.

Output will contain raw SPE data and its textual representation, such
as:

0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000  offset: 0  ref: 0x1891ad0e  idx: 1  tid: 2227  cpu: 1
.
. ... ARM SPE data: size 2097152 bytes
.  00000000:  49 00                                           LD
.  00000002:  b2 c0 3b 29 0f 00 00 ff ff                      VA 0xffff00000f293bc0
.  0000000b:  b3 c0 eb 24 fb 00 00 00 80                      PA 0xfb24ebc0 ns=1
.  00000014:  9a 00 00                                        LAT 0 XLAT
.  00000017:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
.  00000019:  b0 00 c4 15 08 00 00 ff ff                      PC 0xff00000815c400 el3 ns=1
.  00000022:  98 00 00                                        LAT 0 TOT
.  00000025:  71 36 6c 21 2c 09 00 00 00                      TS 39395093558
.  0000002e:  49 00                                           LD
.  00000030:  b2 80 3c 29 0f 00 00 ff ff                      VA 0xffff00000f293c80
.  00000039:  b3 80 ec 24 fb 00 00 00 80                      PA 0xfb24ec80 ns=1
.  00000042:  9a 00 00                                        LAT 0 XLAT
.  00000045:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
.  00000047:  b0 f4 11 16 08 00 00 ff ff                      PC 0xff0000081611f4 el3 ns=1
.  00000050:  98 00 00                                        LAT 0 TOT
.  00000053:  71 36 6c 21 2c 09 00 00 00                      TS 39395093558
.  0000005c:  48 00                                           INSN-OTHER
.  0000005e:  42 02                                           EV RETIRED
.  00000060:  b0 2c ef 7f 08 00 00 ff ff                      PC 0xff0000087fef2c el3 ns=1
.  00000069:  98 00 00                                        LAT 0 TOT
.  0000006c:  71 d1 6f 21 2c 09 00 00 00                      TS 39395094481
...

Other release notes:

- applies to acme's perf/{core,urgent} branches, likely elsewhere

- Report is self-contained within the tool.
  Record requires enabling the kernel SPE driver by
  setting CONFIG_ARM_SPE_PMU.

- The intel-bts implementation was used as a starting point; its
  min/default/max buffer sizes and power of 2 pages granularity need to be
  revisited for ARM SPE

- Recording across multiple SPE clusters/domains not supported

- Snapshot support (record -S), and conversion to native perf events
  (e.g., via 'perf inject --itrace'), are also not supported

- Technically both cs-etm and spe can be used simultaneously, however
  disabled for simplicity in this release

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/arm/util/auxtrace.c   |  77 +++++-
 tools/perf/arch/arm/util/pmu.c        |   6 +
 tools/perf/arch/arm64/util/Build      |   3 +-
 tools/perf/arch/arm64/util/arm-spe.c  | 225 +++++++++++++++++
 tools/perf/util/Build                 |   2 +
 tools/perf/util/arm-spe-pkt-decoder.c | 462 ++++++++++++++++++++++++++++++++++
 tools/perf/util/arm-spe-pkt-decoder.h |  43 ++++
 tools/perf/util/arm-spe.c             | 231 +++++++++++++++++
 tools/perf/util/arm-spe.h             |  31 +++
 tools/perf/util/auxtrace.c            |   3 +
 tools/perf/util/auxtrace.h            |   1 +
 11 files changed, 1077 insertions(+), 7 deletions(-)
 create mode 100644 tools/perf/arch/arm64/util/arm-spe.c
 create mode 100644 tools/perf/util/arm-spe-pkt-decoder.c
 create mode 100644 tools/perf/util/arm-spe-pkt-decoder.h
 create mode 100644 tools/perf/util/arm-spe.c
 create mode 100644 tools/perf/util/arm-spe.h

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 8edf2cb71564..2323581b157d 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -22,6 +22,42 @@
 #include "../../util/evlist.h"
 #include "../../util/pmu.h"
 #include "cs-etm.h"
+#include "arm-spe.h"
+
+static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
+{
+	struct perf_pmu **arm_spe_pmus = NULL;
+	int ret, i, nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
+	/* arm_spe_xxxxxxxxx\0 */
+	char arm_spe_pmu_name[sizeof(ARM_SPE_PMU_NAME) + 10];
+
+	arm_spe_pmus = zalloc(sizeof(struct perf_pmu *) * nr_cpus);
+	if (!arm_spe_pmus) {
+		pr_err("spes alloc failed\n");
+		*err = -ENOMEM;
+		return NULL;
+	}
+
+	for (i = 0; i < nr_cpus; i++) {
+		ret = sprintf(arm_spe_pmu_name, "%s%d", ARM_SPE_PMU_NAME, i);
+		if (ret < 0) {
+			pr_err("sprintf failed\n");
+			*err = -ENOMEM;
+			return NULL;
+		}
+
+		arm_spe_pmus[*nr_spes] = perf_pmu__find(arm_spe_pmu_name);
+		if (arm_spe_pmus[*nr_spes]) {
+			pr_debug2("%s %d: arm_spe_pmu %d type %d name %s\n",
+				 __func__, __LINE__, *nr_spes,
+				 arm_spe_pmus[*nr_spes]->type,
+				 arm_spe_pmus[*nr_spes]->name);
+			(*nr_spes)++;
+		}
+	}
+
+	return arm_spe_pmus;
+}
 
 struct auxtrace_record
 *auxtrace_record__init(struct perf_evlist *evlist, int *err)
@@ -29,22 +65,51 @@ struct auxtrace_record
 	struct perf_pmu	*cs_etm_pmu;
 	struct perf_evsel *evsel;
 	bool found_etm = false;
+	bool found_spe = false;
+	static struct perf_pmu **arm_spe_pmus = NULL;
+	static int nr_spes = 0;
+	int i;
+
+	if (!evlist)
+		return NULL;
 
 	cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
 
-	if (evlist) {
-		evlist__for_each_entry(evlist, evsel) {
-			if (cs_etm_pmu &&
-			    evsel->attr.type == cs_etm_pmu->type)
-				found_etm = true;
+	if (!arm_spe_pmus)
+		arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
+
+	evlist__for_each_entry(evlist, evsel) {
+		if (cs_etm_pmu &&
+		    evsel->attr.type == cs_etm_pmu->type)
+			found_etm = true;
+
+		if (!nr_spes)
+			continue;
+
+		for (i = 0; i < nr_spes; i++) {
+			if (evsel->attr.type == arm_spe_pmus[i]->type) {
+				found_spe = true;
+				break;
+			}
 		}
 	}
 
+	if (found_etm && found_spe) {
+		pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
+		*err = -EOPNOTSUPP;
+		return NULL;
+	}
+
 	if (found_etm)
 		return cs_etm_record_init(err);
 
+#if defined(__aarch64__)
+	if (found_spe)
+		return arm_spe_recording_init(err, arm_spe_pmus[i]);
+#endif
+
 	/*
-	 * Clear 'err' even if we haven't found a cs_etm event - that way perf
+	 * Clear 'err' even if we haven't found an event - that way perf
 	 * record can still be used even if tracers aren't present.  The NULL
 	 * return value will take care of telling the infrastructure HW tracing
 	 * isn't available.
diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
index 98d67399a0d6..ac4dffc807b8 100644
--- a/tools/perf/arch/arm/util/pmu.c
+++ b/tools/perf/arch/arm/util/pmu.c
@@ -20,6 +20,7 @@
 #include <linux/perf_event.h>
 
 #include "cs-etm.h"
+#include "arm-spe.h"
 #include "../../util/pmu.h"
 
 struct perf_event_attr
@@ -30,7 +31,12 @@ struct perf_event_attr
 		/* add ETM default config here */
 		pmu->selectable = true;
 		pmu->set_drv_config = cs_etm_set_drv_config;
+#if defined(__aarch64__)
+	} else if (strstarts(pmu->name, ARM_SPE_PMU_NAME)) {
+		return arm_spe_pmu_default_config(pmu);
+#endif
 	}
+
 #endif
 	return NULL;
 }
diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index e04f6cdd6f32..c0b8dfef98ba 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -5,4 +5,5 @@ libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
 
 libperf-$(CONFIG_AUXTRACE) += ../../arm/util/pmu.o \
 			      ../../arm/util/auxtrace.o \
-			      ../../arm/util/cs-etm.o
+			      ../../arm/util/cs-etm.o \
+			      arm-spe.o
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
new file mode 100644
index 000000000000..1120e39c1b00
--- /dev/null
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -0,0 +1,225 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <linux/log2.h>
+#include <time.h>
+
+#include "../../util/cpumap.h"
+#include "../../util/evsel.h"
+#include "../../util/evlist.h"
+#include "../../util/session.h"
+#include "../../util/util.h"
+#include "../../util/pmu.h"
+#include "../../util/debug.h"
+#include "../../util/auxtrace.h"
+#include "../../util/arm-spe.h"
+
+#define KiB(x) ((x) * 1024)
+#define MiB(x) ((x) * 1024 * 1024)
+
+struct arm_spe_recording {
+	struct auxtrace_record		itr;
+	struct perf_pmu			*arm_spe_pmu;
+	struct perf_evlist		*evlist;
+};
+
+static size_t
+arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
+		       struct perf_evlist *evlist __maybe_unused)
+{
+	return ARM_SPE_AUXTRACE_PRIV_SIZE;
+}
+
+static int arm_spe_info_fill(struct auxtrace_record *itr,
+			     struct perf_session *session,
+			     struct auxtrace_info_event *auxtrace_info,
+			     size_t priv_size)
+{
+	struct arm_spe_recording *sper =
+			container_of(itr, struct arm_spe_recording, itr);
+	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
+
+	if (priv_size != ARM_SPE_AUXTRACE_PRIV_SIZE)
+		return -EINVAL;
+
+	if (!session->evlist->nr_mmaps)
+		return -EINVAL;
+
+	auxtrace_info->type = PERF_AUXTRACE_ARM_SPE;
+	auxtrace_info->priv[ARM_SPE_PMU_TYPE] = arm_spe_pmu->type;
+
+	return 0;
+}
+
+static int arm_spe_recording_options(struct auxtrace_record *itr,
+				     struct perf_evlist *evlist,
+				     struct record_opts *opts)
+{
+	struct arm_spe_recording *sper =
+			container_of(itr, struct arm_spe_recording, itr);
+	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
+	struct perf_evsel *evsel, *arm_spe_evsel = NULL;
+	bool privileged = geteuid() == 0 || perf_event_paranoid() < 0;
+	struct perf_evsel *tracking_evsel;
+	int err;
+
+	sper->evlist = evlist;
+
+	evlist__for_each_entry(evlist, evsel) {
+		if (evsel->attr.type == arm_spe_pmu->type) {
+			if (arm_spe_evsel) {
+				pr_err("There may be only one " ARM_SPE_PMU_NAME "x event\n");
+				return -EINVAL;
+			}
+			evsel->attr.freq = 0;
+			evsel->attr.sample_period = 1;
+			arm_spe_evsel = evsel;
+			opts->full_auxtrace = true;
+		}
+	}
+
+	if (!opts->full_auxtrace)
+		return 0;
+
+	/* We are in full trace mode but '-m,xyz' wasn't specified */
+	if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) {
+		if (privileged) {
+			opts->auxtrace_mmap_pages = MiB(4) / page_size;
+		} else {
+			opts->auxtrace_mmap_pages = KiB(128) / page_size;
+			if (opts->mmap_pages == UINT_MAX)
+				opts->mmap_pages = KiB(256) / page_size;
+		}
+	}
+
+	/* Validate auxtrace_mmap_pages */
+	if (opts->auxtrace_mmap_pages) {
+		size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size;
+		size_t min_sz = KiB(8);
+
+		if (sz < min_sz || !is_power_of_2(sz)) {
+			pr_err("Invalid mmap size for ARM SPE: must be at least %zuKiB and a power of 2\n",
+			       min_sz / 1024);
+			return -EINVAL;
+		}
+	}
+
+
+	/*
+	 * To obtain the auxtrace buffer file descriptor, the auxtrace event
+	 * must come first.
+	 */
+	perf_evlist__to_front(evlist, arm_spe_evsel);
+
+	perf_evsel__set_sample_bit(arm_spe_evsel, CPU);
+	perf_evsel__set_sample_bit(arm_spe_evsel, TIME);
+	perf_evsel__set_sample_bit(arm_spe_evsel, TID);
+
+	/* Add dummy event to keep tracking */
+	err = parse_events(evlist, "dummy:u", NULL);
+	if (err)
+		return err;
+
+	tracking_evsel = perf_evlist__last(evlist);
+	perf_evlist__set_tracking_event(evlist, tracking_evsel);
+
+	tracking_evsel->attr.freq = 0;
+	tracking_evsel->attr.sample_period = 1;
+	perf_evsel__set_sample_bit(tracking_evsel, TIME);
+	perf_evsel__set_sample_bit(tracking_evsel, CPU);
+	perf_evsel__reset_sample_bit(tracking_evsel, BRANCH_STACK);
+
+	return 0;
+}
+
+static u64 arm_spe_reference(struct auxtrace_record *itr __maybe_unused)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
+
+	return ts.tv_sec ^ ts.tv_nsec;
+}
+
+static void arm_spe_recording_free(struct auxtrace_record *itr)
+{
+	struct arm_spe_recording *sper =
+			container_of(itr, struct arm_spe_recording, itr);
+
+	free(sper);
+}
+
+static int arm_spe_read_finish(struct auxtrace_record *itr, int idx)
+{
+	struct arm_spe_recording *sper =
+			container_of(itr, struct arm_spe_recording, itr);
+	struct perf_evsel *evsel;
+
+	evlist__for_each_entry(sper->evlist, evsel) {
+		if (evsel->attr.type == sper->arm_spe_pmu->type)
+			return perf_evlist__enable_event_idx(sper->evlist,
+							     evsel, idx);
+	}
+	return -EINVAL;
+}
+
+struct auxtrace_record *arm_spe_recording_init(int *err,
+					       struct perf_pmu *arm_spe_pmu)
+{
+	struct arm_spe_recording *sper;
+
+	if (!arm_spe_pmu) {
+		*err = -ENODEV;
+		return NULL;
+	}
+
+	sper = zalloc(sizeof(struct arm_spe_recording));
+	if (!sper) {
+		*err = -ENOMEM;
+		return NULL;
+	}
+
+	sper->arm_spe_pmu = arm_spe_pmu;
+	sper->itr.recording_options = arm_spe_recording_options;
+	sper->itr.info_priv_size = arm_spe_info_priv_size;
+	sper->itr.info_fill = arm_spe_info_fill;
+	sper->itr.free = arm_spe_recording_free;
+	sper->itr.reference = arm_spe_reference;
+	sper->itr.read_finish = arm_spe_read_finish;
+	sper->itr.alignment = 0;
+
+	return &sper->itr;
+}
+
+struct perf_event_attr
+*arm_spe_pmu_default_config(struct perf_pmu *arm_spe_pmu)
+{
+	struct perf_event_attr *attr;
+
+	attr = zalloc(sizeof(struct perf_event_attr));
+	if (!attr) {
+		pr_err("arm_spe default config cannot allocate a perf_event_attr\n");
+		return NULL;
+	}
+
+	/*
+	 * If kernel driver doesn't advertise a minimum,
+	 * use max allowable by PMSIDR_EL1.INTERVAL
+	 */
+	if (perf_pmu__scan_file(arm_spe_pmu, "caps/min_interval", "%llu",
+				  &attr->sample_period) != 1) {
+		pr_debug("arm_spe driver doesn't advertise a min. interval. Using 4096\n");
+		attr->sample_period = 4096;
+	}
+
+	arm_spe_pmu->selectable = true;
+	arm_spe_pmu->is_uncore = false;
+
+	return attr;
+}
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index a3de7916fe63..7c6a8b461e24 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -86,6 +86,8 @@ libperf-$(CONFIG_AUXTRACE) += auxtrace.o
 libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
 libperf-$(CONFIG_AUXTRACE) += intel-pt.o
 libperf-$(CONFIG_AUXTRACE) += intel-bts.o
+libperf-$(CONFIG_AUXTRACE) += arm-spe.o
+libperf-$(CONFIG_AUXTRACE) += arm-spe-pkt-decoder.o
 libperf-y += parse-branch-options.o
 libperf-y += dump-insn.o
 libperf-y += parse-regs-options.o
diff --git a/tools/perf/util/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-pkt-decoder.c
new file mode 100644
index 000000000000..b94001b756c7
--- /dev/null
+++ b/tools/perf/util/arm-spe-pkt-decoder.c
@@ -0,0 +1,462 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <endian.h>
+#include <byteswap.h>
+
+#include "arm-spe-pkt-decoder.h"
+
+#define BIT(n)		(1ULL << (n))
+
+#define NS_FLAG		BIT(63)
+#define EL_FLAG		(BIT(62) | BIT(61))
+
+#define SPE_HEADER0_PAD			0x0
+#define SPE_HEADER0_END			0x1
+#define SPE_HEADER0_ADDRESS		0x30 /* address packet (short) */
+#define SPE_HEADER0_ADDRESS_MASK	0x38
+#define SPE_HEADER0_COUNTER		0x18 /* counter packet (short) */
+#define SPE_HEADER0_COUNTER_MASK	0x38
+#define SPE_HEADER0_TIMESTAMP		0x71
+#define SPE_HEADER0_TIMESTAMP		0x71
+#define SPE_HEADER0_EVENTS		0x2
+#define SPE_HEADER0_EVENTS_MASK		0xf
+#define SPE_HEADER0_SOURCE		0x3
+#define SPE_HEADER0_SOURCE_MASK		0xf
+#define SPE_HEADER0_CONTEXT		0x24
+#define SPE_HEADER0_CONTEXT_MASK	0x3c
+#define SPE_HEADER0_OP_TYPE		0x8
+#define SPE_HEADER0_OP_TYPE_MASK	0x3c
+#define SPE_HEADER1_ALIGNMENT		0x0
+#define SPE_HEADER1_ADDRESS		0xb0 /* address packet (extended) */
+#define SPE_HEADER1_ADDRESS_MASK	0xf8
+#define SPE_HEADER1_COUNTER		0x98 /* counter packet (extended) */
+#define SPE_HEADER1_COUNTER_MASK	0xf8
+
+#if __BYTE_ORDER == __BIG_ENDIAN
+#define le16_to_cpu bswap_16
+#define le32_to_cpu bswap_32
+#define le64_to_cpu bswap_64
+#define memcpy_le64(d, s, n) do { \
+	memcpy((d), (s), (n));    \
+	*(d) = le64_to_cpu(*(d)); \
+} while (0)
+#else
+#define le16_to_cpu
+#define le32_to_cpu
+#define le64_to_cpu
+#define memcpy_le64 memcpy
+#endif
+
+static const char * const arm_spe_packet_name[] = {
+	[ARM_SPE_PAD]		= "PAD",
+	[ARM_SPE_END]		= "END",
+	[ARM_SPE_TIMESTAMP]	= "TS",
+	[ARM_SPE_ADDRESS]	= "ADDR",
+	[ARM_SPE_COUNTER]	= "LAT",
+	[ARM_SPE_CONTEXT]	= "CONTEXT",
+	[ARM_SPE_OP_TYPE]	= "OP-TYPE",
+	[ARM_SPE_EVENTS]	= "EVENTS",
+	[ARM_SPE_DATA_SOURCE]	= "DATA-SOURCE",
+};
+
+const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
+{
+	return arm_spe_packet_name[type];
+}
+
+/* return ARM SPE payload size from its encoding,
+ * which is in bits 5:4 of the byte.
+ * 00 : byte
+ * 01 : halfword (2)
+ * 10 : word (4)
+ * 11 : doubleword (8)
+ */
+static int payloadlen(unsigned char byte)
+{
+	return 1 << ((byte & 0x30) >> 4);
+}
+
+static int arm_spe_get_payload(const unsigned char *buf, size_t len,
+			       struct arm_spe_pkt *packet)
+{
+	size_t payload_len = payloadlen(buf[0]);
+
+	if (len < 1 + payload_len)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	buf++;
+
+	switch (payload_len) {
+	case 1: packet->payload = *(uint8_t *)buf; break;
+	case 2: packet->payload = le16_to_cpu(*(uint16_t *)buf); break;
+	case 4: packet->payload = le32_to_cpu(*(uint32_t *)buf); break;
+	case 8: packet->payload = le64_to_cpu(*(uint64_t *)buf); break;
+	default: return ARM_SPE_BAD_PACKET;
+	}
+
+	return 1 + payload_len;
+}
+
+static int arm_spe_get_pad(struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_PAD;
+	return 1;
+}
+
+static int arm_spe_get_alignment(const unsigned char *buf, size_t len,
+				 struct arm_spe_pkt *packet)
+{
+	unsigned int alignment = 1 << ((buf[0] & 0xf) + 1);
+
+	if (len < alignment)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	packet->type = ARM_SPE_PAD;
+	return alignment - (((uintptr_t)buf) & (alignment - 1));
+}
+
+static int arm_spe_get_end(struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_END;
+	return 1;
+}
+
+static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
+				 struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_TIMESTAMP;
+	return arm_spe_get_payload(buf, len, packet);
+}
+
+static int arm_spe_get_events(const unsigned char *buf, size_t len,
+			      struct arm_spe_pkt *packet)
+{
+	int ret = arm_spe_get_payload(buf, len, packet);
+
+	packet->type = ARM_SPE_EVENTS;
+
+	/* we use index to identify Events with a less number of
+	 * comparisons in arm_spe_pkt_desc(): E.g., the LLC-ACCESS,
+	 * LLC-REFILL, and REMOTE-ACCESS events are identified iff
+	 * index > 1.
+	 */
+	packet->index = ret - 1;
+
+	return ret;
+}
+
+static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
+				   struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_DATA_SOURCE;
+	return arm_spe_get_payload(buf, len, packet);
+}
+
+static int arm_spe_get_context(const unsigned char *buf, size_t len,
+			       struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_CONTEXT;
+	packet->index = buf[0] & 0x3;
+
+	return arm_spe_get_payload(buf, len, packet);
+}
+
+static int arm_spe_get_op_type(const unsigned char *buf, size_t len,
+			       struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_OP_TYPE;
+	packet->index = buf[0] & 0x3;
+	return arm_spe_get_payload(buf, len, packet);
+}
+
+static int arm_spe_get_counter(const unsigned char *buf, size_t len,
+			       const unsigned char ext_hdr, struct arm_spe_pkt *packet)
+{
+	if (len < 2)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	packet->type = ARM_SPE_COUNTER;
+	if (ext_hdr)
+		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
+	else
+		packet->index = buf[0] & 0x7;
+
+	packet->payload = le16_to_cpu(*(uint16_t *)(buf + 1));
+
+	return 1 + ext_hdr + 2;
+}
+
+static int arm_spe_get_addr(const unsigned char *buf, size_t len,
+			    const unsigned char ext_hdr, struct arm_spe_pkt *packet)
+{
+	if (len < 8)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	packet->type = ARM_SPE_ADDRESS;
+	if (ext_hdr)
+		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
+	else
+		packet->index = buf[0] & 0x7;
+
+	memcpy_le64(&packet->payload, buf + 1, 8);
+
+	return 1 + ext_hdr + 8;
+}
+
+static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
+				 struct arm_spe_pkt *packet)
+{
+	unsigned int byte;
+
+	memset(packet, 0, sizeof(struct arm_spe_pkt));
+
+	if (!len)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	byte = buf[0];
+	if (byte == SPE_HEADER0_PAD)
+		return arm_spe_get_pad(packet);
+	else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
+		return arm_spe_get_end(packet);
+	else if (byte & 0xc0 /* 0y11xxxxxx */) {
+		if (byte & 0x80) {
+			if ((byte & SPE_HEADER0_ADDRESS_MASK) == SPE_HEADER0_ADDRESS)
+				return arm_spe_get_addr(buf, len, 0, packet);
+			if ((byte & SPE_HEADER0_COUNTER_MASK) == SPE_HEADER0_COUNTER)
+				return arm_spe_get_counter(buf, len, 0, packet);
+		} else
+			if (byte == SPE_HEADER0_TIMESTAMP)
+				return arm_spe_get_timestamp(buf, len, packet);
+			else if ((byte & SPE_HEADER0_EVENTS_MASK) == SPE_HEADER0_EVENTS)
+				return arm_spe_get_events(buf, len, packet);
+			else if ((byte & SPE_HEADER0_SOURCE_MASK) == SPE_HEADER0_SOURCE)
+				return arm_spe_get_data_source(buf, len, packet);
+			else if ((byte & SPE_HEADER0_CONTEXT_MASK) == SPE_HEADER0_CONTEXT)
+				return arm_spe_get_context(buf, len, packet);
+			else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == SPE_HEADER0_OP_TYPE)
+				return arm_spe_get_op_type(buf, len, packet);
+	} else if ((byte & 0xe0) == 0x20 /* 0y001xxxxx */) {
+		/* 16-bit header */
+		byte = buf[1];
+		if (byte == SPE_HEADER1_ALIGNMENT)
+			return arm_spe_get_alignment(buf, len, packet);
+		else if ((byte & SPE_HEADER1_ADDRESS_MASK) == SPE_HEADER1_ADDRESS)
+			return arm_spe_get_addr(buf, len, 1, packet);
+		else if ((byte & SPE_HEADER1_COUNTER_MASK) == SPE_HEADER1_COUNTER)
+			return arm_spe_get_counter(buf, len, 1, packet);
+	}
+
+	return ARM_SPE_BAD_PACKET;
+}
+
+int arm_spe_get_packet(const unsigned char *buf, size_t len,
+		       struct arm_spe_pkt *packet)
+{
+	int ret;
+
+	ret = arm_spe_do_get_packet(buf, len, packet);
+	/* put multiple consecutive PADs on the same line, up to
+	 * the fixed-width output format of 16 bytes per line.
+	 */
+	if (ret > 0 && packet->type == ARM_SPE_PAD) {
+		while (ret < 16 && len > (size_t)ret && !buf[ret])
+			ret += 1;
+	}
+	return ret;
+}
+
+int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
+		     size_t buf_len)
+{
+	int ret, ns, el, idx = packet->index;
+	unsigned long long payload = packet->payload;
+	const char *name = arm_spe_pkt_name(packet->type);
+
+	switch (packet->type) {
+	case ARM_SPE_BAD:
+	case ARM_SPE_PAD:
+	case ARM_SPE_END:
+		return snprintf(buf, buf_len, "%s", name);
+	case ARM_SPE_EVENTS: {
+		size_t blen = buf_len;
+
+		ret = 0;
+		ret = snprintf(buf, buf_len, "EV");
+		buf += ret;
+		blen -= ret;
+		if (payload & 0x1) {
+			ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x2) {
+			ret = snprintf(buf, buf_len, " RETIRED");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x4) {
+			ret = snprintf(buf, buf_len, " L1D-ACCESS");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x8) {
+			ret = snprintf(buf, buf_len, " L1D-REFILL");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x10) {
+			ret = snprintf(buf, buf_len, " TLB-ACCESS");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x20) {
+			ret = snprintf(buf, buf_len, " TLB-REFILL");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x40) {
+			ret = snprintf(buf, buf_len, " NOT-TAKEN");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x80) {
+			ret = snprintf(buf, buf_len, " MISPRED");
+			buf += ret;
+			blen -= ret;
+		}
+		if (idx > 1) {
+			if (payload & 0x100) {
+				ret = snprintf(buf, buf_len, " LLC-ACCESS");
+				buf += ret;
+				blen -= ret;
+			}
+			if (payload & 0x200) {
+				ret = snprintf(buf, buf_len, " LLC-REFILL");
+				buf += ret;
+				blen -= ret;
+			}
+			if (payload & 0x400) {
+				ret = snprintf(buf, buf_len, " REMOTE-ACCESS");
+				buf += ret;
+				blen -= ret;
+			}
+		}
+		if (ret < 0)
+			return ret;
+		blen -= ret;
+		return buf_len - blen;
+	}
+	case ARM_SPE_OP_TYPE:
+		switch (idx) {
+		case 0:	return snprintf(buf, buf_len, "%s", payload & 0x1 ?
+					"COND-SELECT" : "INSN-OTHER");
+		case 1:	{
+			size_t blen = buf_len;
+
+			if (payload & 0x1)
+				ret = snprintf(buf, buf_len, "ST");
+			else
+				ret = snprintf(buf, buf_len, "LD");
+			buf += ret;
+			blen -= ret;
+			if (payload & 0x2) {
+				if (payload & 0x4) {
+					ret = snprintf(buf, buf_len, " AT");
+					buf += ret;
+					blen -= ret;
+				}
+				if (payload & 0x8) {
+					ret = snprintf(buf, buf_len, " EXCL");
+					buf += ret;
+					blen -= ret;
+				}
+				if (payload & 0x10) {
+					ret = snprintf(buf, buf_len, " AR");
+					buf += ret;
+					blen -= ret;
+				}
+			} else if (payload & 0x4) {
+				ret = snprintf(buf, buf_len, " SIMD-FP");
+				buf += ret;
+				blen -= ret;
+			}
+			if (ret < 0)
+				return ret;
+			blen -= ret;
+			return buf_len - blen;
+		}
+		case 2:	{
+			size_t blen = buf_len;
+
+			ret = snprintf(buf, buf_len, "B");
+			buf += ret;
+			blen -= ret;
+			if (payload & 0x1) {
+				ret = snprintf(buf, buf_len, " COND");
+				buf += ret;
+				blen -= ret;
+			}
+			if (payload & 0x2) {
+				ret = snprintf(buf, buf_len, " IND");
+				buf += ret;
+				blen -= ret;
+			}
+			if (ret < 0)
+				return ret;
+			blen -= ret;
+			return buf_len - blen;
+			}
+		default: return 0;
+		}
+	case ARM_SPE_DATA_SOURCE:
+	case ARM_SPE_TIMESTAMP:
+		return snprintf(buf, buf_len, "%s %lld", name, payload);
+	case ARM_SPE_ADDRESS:
+		switch (idx) {
+		case 0:
+		case 1: ns = !!(packet->payload & NS_FLAG);
+			el = (packet->payload & EL_FLAG) >> 61;
+			payload &= ~(0xffULL << 56);
+			return snprintf(buf, buf_len, "%s 0x%llx el%d ns=%d",
+				        (idx == 1) ? "TGT" : "PC", payload, el, ns);
+		case 2:	return snprintf(buf, buf_len, "VA 0x%llx", payload);
+		case 3:	ns = !!(packet->payload & NS_FLAG);
+			payload &= ~(0xffULL << 56);
+			return snprintf(buf, buf_len, "PA 0x%llx ns=%d",
+					payload, ns);
+		default: return 0;
+		}
+	case ARM_SPE_CONTEXT:
+		return snprintf(buf, buf_len, "%s 0x%lx el%d", name,
+				(unsigned long)payload, idx + 1);
+	case ARM_SPE_COUNTER: {
+		size_t blen = buf_len;
+
+		ret = snprintf(buf, buf_len, "%s %d ", name,
+			       (unsigned short)payload);
+		buf += ret;
+		blen -= ret;
+		switch (idx) {
+		case 0:	ret = snprintf(buf, buf_len, "TOT"); break;
+		case 1:	ret = snprintf(buf, buf_len, "ISSUE"); break;
+		case 2:	ret = snprintf(buf, buf_len, "XLAT"); break;
+		default: ret = 0;
+		}
+		if (ret < 0)
+			return ret;
+		blen -= ret;
+		return buf_len - blen;
+	}
+	default:
+		break;
+	}
+
+	return snprintf(buf, buf_len, "%s 0x%llx (%d)",
+			name, payload, packet->index);
+}
diff --git a/tools/perf/util/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-pkt-decoder.h
new file mode 100644
index 000000000000..d786ef65113f
--- /dev/null
+++ b/tools/perf/util/arm-spe-pkt-decoder.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#ifndef INCLUDE__ARM_SPE_PKT_DECODER_H__
+#define INCLUDE__ARM_SPE_PKT_DECODER_H__
+
+#include <stddef.h>
+#include <stdint.h>
+
+#define ARM_SPE_PKT_DESC_MAX		256
+
+#define ARM_SPE_NEED_MORE_BYTES		-1
+#define ARM_SPE_BAD_PACKET		-2
+
+enum arm_spe_pkt_type {
+	ARM_SPE_BAD,
+	ARM_SPE_PAD,
+	ARM_SPE_END,
+	ARM_SPE_TIMESTAMP,
+	ARM_SPE_ADDRESS,
+	ARM_SPE_COUNTER,
+	ARM_SPE_CONTEXT,
+	ARM_SPE_OP_TYPE,
+	ARM_SPE_EVENTS,
+	ARM_SPE_DATA_SOURCE,
+};
+
+struct arm_spe_pkt {
+	enum arm_spe_pkt_type	type;
+	unsigned char		index;
+	uint64_t		payload;
+};
+
+const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
+
+int arm_spe_get_packet(const unsigned char *buf, size_t len,
+		       struct arm_spe_pkt *packet);
+
+int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf, size_t len);
+#endif
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
new file mode 100644
index 000000000000..6067267cc76c
--- /dev/null
+++ b/tools/perf/util/arm-spe.c
@@ -0,0 +1,231 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#include <endian.h>
+#include <errno.h>
+#include <byteswap.h>
+#include <inttypes.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <linux/log2.h>
+
+#include "cpumap.h"
+#include "color.h"
+#include "evsel.h"
+#include "evlist.h"
+#include "machine.h"
+#include "session.h"
+#include "util.h"
+#include "thread.h"
+#include "debug.h"
+#include "auxtrace.h"
+#include "arm-spe.h"
+#include "arm-spe-pkt-decoder.h"
+
+struct arm_spe {
+	struct auxtrace			auxtrace;
+	struct auxtrace_queues		queues;
+	struct auxtrace_heap		heap;
+	u32				auxtrace_type;
+	struct perf_session		*session;
+	struct machine			*machine;
+	u32				pmu_type;
+};
+
+struct arm_spe_queue {
+	struct arm_spe		*spe;
+	unsigned int		queue_nr;
+	struct auxtrace_buffer	*buffer;
+	bool			on_heap;
+	bool			done;
+	pid_t			pid;
+	pid_t			tid;
+	int			cpu;
+};
+
+static void arm_spe_dump(struct arm_spe *spe __maybe_unused,
+			 unsigned char *buf, size_t len)
+{
+	struct arm_spe_pkt packet;
+	size_t pos = 0;
+	int ret, pkt_len, i;
+	char desc[ARM_SPE_PKT_DESC_MAX];
+	const char *color = PERF_COLOR_BLUE;
+
+	color_fprintf(stdout, color,
+		      ". ... ARM SPE data: size %zu bytes\n",
+		      len);
+
+	while (len) {
+		ret = arm_spe_get_packet(buf, len, &packet);
+		if (ret > 0)
+			pkt_len = ret;
+		else
+			pkt_len = 1;
+		printf(".");
+		color_fprintf(stdout, color, "  %08x: ", pos);
+		for (i = 0; i < pkt_len; i++)
+			color_fprintf(stdout, color, " %02x", buf[i]);
+		for (; i < 16; i++)
+			color_fprintf(stdout, color, "   ");
+		if (ret > 0) {
+			ret = arm_spe_pkt_desc(&packet, desc,
+					       ARM_SPE_PKT_DESC_MAX);
+			if (ret > 0)
+				color_fprintf(stdout, color, " %s\n", desc);
+		} else {
+			color_fprintf(stdout, color, " Bad packet!\n");
+		}
+		pos += pkt_len;
+		buf += pkt_len;
+		len -= pkt_len;
+	}
+}
+
+static void arm_spe_dump_event(struct arm_spe *spe, unsigned char *buf,
+			       size_t len)
+{
+	printf(".\n");
+	arm_spe_dump(spe, buf, len);
+}
+
+static int arm_spe_process_event(struct perf_session *session __maybe_unused,
+				 union perf_event *event __maybe_unused,
+				 struct perf_sample *sample __maybe_unused,
+				 struct perf_tool *tool __maybe_unused)
+{
+	return 0;
+}
+
+static int arm_spe_process_auxtrace_event(struct perf_session *session,
+					  union perf_event *event,
+					  struct perf_tool *tool __maybe_unused)
+{
+	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
+					     auxtrace);
+	struct auxtrace_buffer *buffer;
+	off_t data_offset;
+	int fd = perf_data__fd(session->data);
+	int err;
+
+	if (perf_data__is_pipe(session->data)) {
+		data_offset = 0;
+	} else {
+		data_offset = lseek(fd, 0, SEEK_CUR);
+		if (data_offset == -1)
+			return -errno;
+	}
+
+	err = auxtrace_queues__add_event(&spe->queues, session, event,
+					 data_offset, &buffer);
+	if (err)
+		return err;
+
+	/* Dump here now we have copied a piped trace out of the pipe */
+	if (dump_trace) {
+		if (auxtrace_buffer__get_data(buffer, fd)) {
+			arm_spe_dump_event(spe, buffer->data,
+					     buffer->size);
+			auxtrace_buffer__put_data(buffer);
+		}
+	}
+
+	return 0;
+}
+
+static int arm_spe_flush(struct perf_session *session __maybe_unused,
+			 struct perf_tool *tool __maybe_unused)
+{
+	return 0;
+}
+
+static void arm_spe_free_queue(void *priv)
+{
+	struct arm_spe_queue *speq = priv;
+
+	if (!speq)
+		return;
+	free(speq);
+}
+
+static void arm_spe_free_events(struct perf_session *session)
+{
+	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
+					     auxtrace);
+	struct auxtrace_queues *queues = &spe->queues;
+	unsigned int i;
+
+	for (i = 0; i < queues->nr_queues; i++) {
+		arm_spe_free_queue(queues->queue_array[i].priv);
+		queues->queue_array[i].priv = NULL;
+	}
+	auxtrace_queues__free(queues);
+}
+
+static void arm_spe_free(struct perf_session *session)
+{
+	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
+					     auxtrace);
+
+	auxtrace_heap__free(&spe->heap);
+	arm_spe_free_events(session);
+	session->auxtrace = NULL;
+	free(spe);
+}
+
+static const char * const arm_spe_info_fmts[] = {
+	[ARM_SPE_PMU_TYPE]		= "  PMU Type           %"PRId64"\n",
+};
+
+static void arm_spe_print_info(u64 *arr)
+{
+	if (!dump_trace)
+		return;
+
+	fprintf(stdout, arm_spe_info_fmts[ARM_SPE_PMU_TYPE], arr[ARM_SPE_PMU_TYPE]);
+}
+
+int arm_spe_process_auxtrace_info(union perf_event *event,
+				  struct perf_session *session)
+{
+	struct auxtrace_info_event *auxtrace_info = &event->auxtrace_info;
+	size_t min_sz = sizeof(u64) * ARM_SPE_PMU_TYPE;
+	struct arm_spe *spe;
+	int err;
+
+	if (auxtrace_info->header.size < sizeof(struct auxtrace_info_event) +
+					min_sz)
+		return -EINVAL;
+
+	spe = zalloc(sizeof(struct arm_spe));
+	if (!spe)
+		return -ENOMEM;
+
+	err = auxtrace_queues__init(&spe->queues);
+	if (err)
+		goto err_free;
+
+	spe->session = session;
+	spe->machine = &session->machines.host; /* No kvm support */
+	spe->auxtrace_type = auxtrace_info->type;
+	spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
+
+	spe->auxtrace.process_event = arm_spe_process_event;
+	spe->auxtrace.process_auxtrace_event = arm_spe_process_auxtrace_event;
+	spe->auxtrace.flush_events = arm_spe_flush;
+	spe->auxtrace.free_events = arm_spe_free_events;
+	spe->auxtrace.free = arm_spe_free;
+	session->auxtrace = &spe->auxtrace;
+
+	arm_spe_print_info(&auxtrace_info->priv[0]);
+
+	return 0;
+
+err_free:
+	free(spe);
+	return err;
+}
diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
new file mode 100644
index 000000000000..98d3235781c3
--- /dev/null
+++ b/tools/perf/util/arm-spe.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#ifndef INCLUDE__PERF_ARM_SPE_H__
+#define INCLUDE__PERF_ARM_SPE_H__
+
+#define ARM_SPE_PMU_NAME "arm_spe_"
+
+enum {
+	ARM_SPE_PMU_TYPE,
+	ARM_SPE_PER_CPU_MMAPS,
+	ARM_SPE_AUXTRACE_PRIV_MAX,
+};
+
+#define ARM_SPE_AUXTRACE_PRIV_SIZE (ARM_SPE_AUXTRACE_PRIV_MAX * sizeof(u64))
+
+union perf_event;
+struct perf_session;
+struct perf_pmu;
+
+struct auxtrace_record *arm_spe_recording_init(int *err,
+					       struct perf_pmu *arm_spe_pmu);
+
+int arm_spe_process_auxtrace_info(union perf_event *event,
+				  struct perf_session *session);
+
+struct perf_event_attr *arm_spe_pmu_default_config(struct perf_pmu *arm_spe_pmu);
+#endif
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index c76687e42344..3bba9947ab7f 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -54,6 +54,7 @@
 
 #include "intel-pt.h"
 #include "intel-bts.h"
+#include "arm-spe.h"
 
 #include "sane_ctype.h"
 #include "symbol/kallsyms.h"
@@ -910,6 +911,8 @@ int perf_event__process_auxtrace_info(struct perf_tool *tool __maybe_unused,
 		return intel_pt_process_auxtrace_info(event, session);
 	case PERF_AUXTRACE_INTEL_BTS:
 		return intel_bts_process_auxtrace_info(event, session);
+	case PERF_AUXTRACE_ARM_SPE:
+		return arm_spe_process_auxtrace_info(event, session);
 	case PERF_AUXTRACE_CS_ETM:
 	case PERF_AUXTRACE_UNKNOWN:
 	default:
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index d19e11b68de7..453c148d2158 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -43,6 +43,7 @@ enum auxtrace_type {
 	PERF_AUXTRACE_INTEL_PT,
 	PERF_AUXTRACE_INTEL_BTS,
 	PERF_AUXTRACE_CS_ETM,
+	PERF_AUXTRACE_ARM_SPE,
 };
 
 enum itrace_period_type {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 19/32] perf callchain: Fix attr.sample_max_stack setting
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (17 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 18/32] perf tools: Add ARM Statistical Profiling Extensions (SPE) support Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 20/32] perf unwind: Do not look just at the global callchain_param.record_mode Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Hendrick Brueckner, Jiri Olsa,
	Namhyung Kim, Thomas Richter, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

When setting the "dwarf" unwinder for a specific event and not
specifying the max-stack, the attr.sample_max_stack ended up using an
uninitialized callchain_param.max_stack, fix it by using designated
initializers for that callchain_param variable, zeroing all non
explicitely initialized struct members.

Here is what happened:

  # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  callchain: type DWARF
  callchain: stack dump size 8192
  perf_event_attr:
    type                             2
    size                             112
    config                           0x730
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
    exclude_callchain_user           1
    { wakeup_events, wakeup_watermark } 1
    sample_regs_user                 0xff0fff
    sample_stack_user                8192
    sample_max_stack                 50656
  sys_perf_event_open failed, error -75
  Value too large for defined data type
  # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  callchain: type DWARF
  callchain: stack dump size 8192
  perf_event_attr:
    type                             2
    size                             112
    config                           0x730
    sample_type                      IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
    exclude_callchain_user           1
    sample_regs_user                 0xff0fff
    sample_stack_user                8192
    sample_max_stack                 30448
  sys_perf_event_open failed, error -75
  Value too large for defined data type
  #

Now the attr.sample_max_stack is set to zero and the above works as
expected:

  # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
       0.000 probe_libc:inet_pton:(7feb7a998350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa39b6108f3f] (/usr/bin/ping)
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-is9tramondqa9jlxxsgcm9iz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index efa2e629a669..8f971a2301d1 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -731,14 +731,14 @@ static void apply_config_terms(struct perf_evsel *evsel,
 	struct perf_evsel_config_term *term;
 	struct list_head *config_terms = &evsel->config_terms;
 	struct perf_event_attr *attr = &evsel->attr;
-	struct callchain_param param;
+	/* callgraph default */
+	struct callchain_param param = {
+		.record_mode = callchain_param.record_mode,
+	};
 	u32 dump_size = 0;
 	int max_stack = 0;
 	const char *callgraph_buf = NULL;
 
-	/* callgraph default */
-	param.record_mode = callchain_param.record_mode;
-
 	list_for_each_entry(term, config_terms, list) {
 		switch (term->type) {
 		case PERF_EVSEL__CONFIG_TERM_PERIOD:
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 20/32] perf unwind: Do not look just at the global callchain_param.record_mode
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (18 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 19/32] perf callchain: Fix attr.sample_max_stack setting Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 21/32] perf trace: Setup DWARF callchains for non-syscall events when --max-stack is used Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Hendrick Brueckner, Jiri Olsa,
	Thomas Richter, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

When setting up DWARF callchains on specific events, without using
'record' or 'trace' --call-graph, but instead doing it like:

	perf trace -e cycles/call-graph=dwarf/

The unwind__prepare_access() call in thread__insert_map() when we
process PERF_RECORD_MMAP(2) metadata events were not being performed,
precluding us from using per-event DWARF callchains, handling them just
when we asked for all events to be DWARF, using "--call-graph dwarf".

We do it in the PERF_RECORD_MMAP because we have to look at one of the
executable maps to figure out the executable type (64-bit, 32-bit) of
the DSO laid out in that mmap. Also to look at the architecture where
the perf.data file was recorded.

All this probably should be deferred to when we process a sample for
some thread that has callchains, so that we do this processing only for
the threads with samples, not for all of them.

For now, fix using DWARF on specific events.

Before:

  # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.048 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.048/0.048/0.048/0.000 ms
     0.000 probe_libc:inet_pton:(7fe9597bb350))
  Problem processing probe_libc:inet_pton callchain, skipping...
  #

After:

  # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.060 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.060/0.060/0.060/0.000 ms
       0.000 probe_libc:inet_pton:(7fd4aa930350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa804e51af3f] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa804e51b379] (/usr/bin/ping)
  #
  # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.057 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.057/0.057/0.057/0.000 ms
       0.000 probe_libc:inet_pton:(7f9363b9e350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffa9e8a14e0f3f] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffa9e8a14e1379] (/usr/bin/ping)
  #
  # perf trace --call-graph=fp --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.077 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.077/0.077/0.077/0.000 ms
       0.000 probe_libc:inet_pton:(7f4947e1c350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa716d88ef3f] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa716d88f379] (/usr/bin/ping)
  #
  # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=fp/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.078 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.078/0.078/0.078/0.000 ms
       0.000 probe_libc:inet_pton:(7fa157696350))
                                         __GI___inet_pton (/usr/lib64/libc-2.26.so)
                                         getaddrinfo (/usr/lib64/libc-2.26.so)
                                         [0xffffa9ba39c74f40] (/usr/bin/ping)
  #

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/r/20180116182650.GE16107@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-c2c.c                 |  5 +++--
 tools/perf/builtin-report.c              |  5 +++--
 tools/perf/builtin-script.c              |  5 +++--
 tools/perf/tests/dwarf-unwind.c          |  1 +
 tools/perf/util/callchain.c              | 10 ++++++++++
 tools/perf/util/callchain.h              |  2 ++
 tools/perf/util/unwind-libunwind-local.c |  9 +++------
 7 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index c0debc3f79b6..c0815a37fdb5 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2390,9 +2390,10 @@ static int setup_callchain(struct perf_evlist *evlist)
 	enum perf_call_graph_mode mode = CALLCHAIN_NONE;
 
 	if ((sample_type & PERF_SAMPLE_REGS_USER) &&
-	    (sample_type & PERF_SAMPLE_STACK_USER))
+	    (sample_type & PERF_SAMPLE_STACK_USER)) {
 		mode = CALLCHAIN_DWARF;
-	else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+		dwarf_callchain_users = true;
+	} else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
 		mode = CALLCHAIN_LBR;
 	else if (sample_type & PERF_SAMPLE_CALLCHAIN)
 		mode = CALLCHAIN_FP;
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index dd4df9a5cd06..6593779224d5 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -338,9 +338,10 @@ static int report__setup_sample_type(struct report *rep)
 
 	if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
 		if ((sample_type & PERF_SAMPLE_REGS_USER) &&
-		    (sample_type & PERF_SAMPLE_STACK_USER))
+		    (sample_type & PERF_SAMPLE_STACK_USER)) {
 			callchain_param.record_mode = CALLCHAIN_DWARF;
-		else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+			dwarf_callchain_users = true;
+		} else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
 			callchain_param.record_mode = CALLCHAIN_LBR;
 		else
 			callchain_param.record_mode = CALLCHAIN_FP;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c1cce474c0f1..08bc818f371b 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2919,9 +2919,10 @@ static void script__setup_sample_type(struct perf_script *script)
 
 	if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
 		if ((sample_type & PERF_SAMPLE_REGS_USER) &&
-		    (sample_type & PERF_SAMPLE_STACK_USER))
+		    (sample_type & PERF_SAMPLE_STACK_USER)) {
 			callchain_param.record_mode = CALLCHAIN_DWARF;
-		else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+			dwarf_callchain_users = true;
+		} else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
 			callchain_param.record_mode = CALLCHAIN_LBR;
 		else
 			callchain_param.record_mode = CALLCHAIN_FP;
diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index ac40e05bcab4..260418969120 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -173,6 +173,7 @@ int test__dwarf_unwind(struct test *test __maybe_unused, int subtest __maybe_unu
 	}
 
 	callchain_param.record_mode = CALLCHAIN_DWARF;
+	dwarf_callchain_users = true;
 
 	if (init_live_machine(machine)) {
 		pr_err("Could not init machine\n");
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 082505d08d72..32ef7bdca1cf 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -37,6 +37,15 @@ struct callchain_param callchain_param = {
 	CALLCHAIN_PARAM_DEFAULT
 };
 
+/*
+ * Are there any events usind DWARF callchains?
+ *
+ * I.e.
+ *
+ * -e cycles/call-graph=dwarf/
+ */
+bool dwarf_callchain_users;
+
 struct callchain_param callchain_param_default = {
 	CALLCHAIN_PARAM_DEFAULT
 };
@@ -265,6 +274,7 @@ int parse_callchain_record(const char *arg, struct callchain_param *param)
 			ret = 0;
 			param->record_mode = CALLCHAIN_DWARF;
 			param->dump_size = default_stack_dump_size;
+			dwarf_callchain_users = true;
 
 			tok = strtok_r(NULL, ",", &saveptr);
 			if (tok) {
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index b79ef2478a57..154560b1eb65 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -89,6 +89,8 @@ enum chain_value {
 	CCVAL_COUNT,
 };
 
+extern bool dwarf_callchain_users;
+
 struct callchain_param {
 	bool			enabled;
 	enum perf_call_graph_mode record_mode;
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index 7a42f703e858..af873044d33a 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -631,9 +631,8 @@ static unw_accessors_t accessors = {
 
 static int _unwind__prepare_access(struct thread *thread)
 {
-	if (callchain_param.record_mode != CALLCHAIN_DWARF)
+	if (!dwarf_callchain_users)
 		return 0;
-
 	thread->addr_space = unw_create_addr_space(&accessors, 0);
 	if (!thread->addr_space) {
 		pr_err("unwind: Can't create unwind address space.\n");
@@ -646,17 +645,15 @@ static int _unwind__prepare_access(struct thread *thread)
 
 static void _unwind__flush_access(struct thread *thread)
 {
-	if (callchain_param.record_mode != CALLCHAIN_DWARF)
+	if (!dwarf_callchain_users)
 		return;
-
 	unw_flush_cache(thread->addr_space, 0, 0);
 }
 
 static void _unwind__finish_access(struct thread *thread)
 {
-	if (callchain_param.record_mode != CALLCHAIN_DWARF)
+	if (!dwarf_callchain_users)
 		return;
-
 	unw_destroy_addr_space(thread->addr_space);
 }
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 21/32] perf trace: Setup DWARF callchains for non-syscall events when --max-stack is used
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (19 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 20/32] perf unwind: Do not look just at the global callchain_param.record_mode Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 22/32] perf trace: Allow overriding global --max-stack per event Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Hendrick Brueckner, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

If we use:

	perf trace --max-stack=4

then the syscall events will use DWARF callchains, when available
(libunwind enabled in the build) and the printing will stop at 4 levels.

When we introduced support for tracepoint events this ended up not
applying for them, fix it.

Before:

  # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.058 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.058/0.058/0.058/0.000 ms
       0.000 probe_libc:inet_pton:(7fc6c2a16350))
  #

After:

  # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.087 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.087/0.087/0.087/0.000 ms
       0.000 probe_libc:inet_pton:(7fbf9a041350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa947cb67f3f] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa947cb68379] (/usr/bin/ping)
  #

Reported-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-afsu9eegd43ppihiuafhh9qv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 0362974854e9..ee85c29dbf70 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2350,7 +2350,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		goto out_delete_evlist;
 	}
 
-	perf_evlist__config(evlist, &trace->opts, NULL);
+	perf_evlist__config(evlist, &trace->opts, &callchain_param);
 
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
@@ -3065,8 +3065,9 @@ int cmd_trace(int argc, const char **argv)
 	}
 
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-	if ((trace.min_stack || max_stack_user_set) && !callchain_param.enabled && trace.trace_syscalls)
+	if ((trace.min_stack || max_stack_user_set) && !callchain_param.enabled) {
 		record_opts__parse_callchain(&trace.opts, &callchain_param, "dwarf", false);
+	}
 #endif
 
 	if (callchain_param.enabled) {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 22/32] perf trace: Allow overriding global --max-stack per event
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (20 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 21/32] perf trace: Setup DWARF callchains for non-syscall events when --max-stack is used Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 23/32] perf callchains: Ask for PERF_RECORD_MMAP for data mmaps for DWARF unwinding Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Hendrick Brueckner, Jiri Olsa,
	Namhyung Kim, Thomas Richter, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The per-event max-stack setting wasn't overriding the global --max-stack
setting:

  # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf,max-stack=2/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
       0.000 probe_libc:inet_pton:(7feb7a998350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa39b6108f3f] (/usr/bin/ping)
  #

Fix it:

  # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf,max-stack=2/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.073 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.073/0.073/0.073/0.000 ms
       0.000 probe_libc:inet_pton:(7f1083221350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ic3g837xg8ob3kcpkspxwz0g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index ee85c29dbf70..531d43bf57e1 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1644,7 +1644,7 @@ static int trace__resolve_callchain(struct trace *trace, struct perf_evsel *evse
 	struct addr_location al;
 
 	if (machine__resolve(trace->host, &al, sample) < 0 ||
-	    thread__resolve_callchain(al.thread, cursor, evsel, sample, NULL, NULL, trace->max_stack))
+	    thread__resolve_callchain(al.thread, cursor, evsel, sample, NULL, NULL, evsel->attr.sample_max_stack))
 		return -1;
 
 	return 0;
@@ -2423,6 +2423,18 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	trace->multiple_threads = thread_map__pid(evlist->threads, 0) == -1 ||
 				  evlist->threads->nr > 1 ||
 				  perf_evlist__first(evlist)->attr.inherit;
+
+	/*
+	 * Now that we already used evsel->attr to ask the kernel to setup the
+	 * events, lets reuse evsel->attr.sample_max_stack as the limit in
+	 * trace__resolve_callchain(), allowing per-event max-stack settings
+	 * to override an explicitely set --max-stack global setting.
+	 */
+	evlist__for_each_entry(evlist, evsel) {
+		if ((evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
+		    evsel->attr.sample_max_stack == 0)
+			evsel->attr.sample_max_stack = trace->max_stack;
+	}
 again:
 	before = trace->nr_events;
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 23/32] perf callchains: Ask for PERF_RECORD_MMAP for data mmaps for DWARF unwinding
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (21 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 22/32] perf trace: Allow overriding global --max-stack per event Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 24/32] perf report: Improve error msg when no first/last sample time found Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Hendrick Brueckner, Jiri Olsa,
	Namhyung Kim, Noel Grandin, Thomas Richter, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

When we use a global DWARF setting as in:

	perf record --call-graph dwarf

According to 5c0cf22477ea ("perf record: Store data mmaps for dwarf unwind") we need
to set up some extra perf_event_attr bits.

But when we instead do a per event dwarf setting:

	perf record -e cycles/call-graph=dwarf/

This was not being done, make them equivalent.

This didn't produce any output changes in my tests while fixing up loose
ends in the per-event settings, I found it just by comparing the
perf_event_attr fields trying to find an explanation for those problems.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6476r53h2o38skbs9qa4ust4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8f971a2301d1..85eb84dfdf91 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -726,7 +726,7 @@ perf_evsel__reset_callgraph(struct perf_evsel *evsel,
 }
 
 static void apply_config_terms(struct perf_evsel *evsel,
-			       struct record_opts *opts)
+			       struct record_opts *opts, bool track)
 {
 	struct perf_evsel_config_term *term;
 	struct list_head *config_terms = &evsel->config_terms;
@@ -797,6 +797,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
 	if ((callgraph_buf != NULL) || (dump_size > 0) || max_stack) {
+		bool sample_address = false;
+
 		if (max_stack) {
 			param.max_stack = max_stack;
 			if (callgraph_buf == NULL)
@@ -816,6 +818,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 					       evsel->name);
 					return;
 				}
+				if (param.record_mode == CALLCHAIN_DWARF)
+					sample_address = true;
 			}
 		}
 		if (dump_size > 0) {
@@ -828,8 +832,14 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			perf_evsel__reset_callgraph(evsel, &callchain_param);
 
 		/* set perf-event callgraph */
-		if (param.enabled)
+		if (param.enabled) {
+			if (sample_address) {
+				perf_evsel__set_sample_bit(evsel, ADDR);
+				perf_evsel__set_sample_bit(evsel, DATA_SRC);
+				evsel->attr.mmap_data = track;
+			}
 			perf_evsel__config_callchain(evsel, opts, &param);
+		}
 	}
 }
 
@@ -1060,7 +1070,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 	 * Apply event specific term settings,
 	 * it overloads any global configuration.
 	 */
-	apply_config_terms(evsel, opts);
+	apply_config_terms(evsel, opts, track);
 
 	evsel->ignore_missing_thread = opts->ignore_missing_thread;
 }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 24/32] perf report: Improve error msg when no first/last sample time found
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (22 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 23/32] perf callchains: Ask for PERF_RECORD_MMAP for data mmaps for DWARF unwinding Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 25/32] perf script: " Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

The following message will be returned to user when executing
'perf report --time' if perf data file doesn't contain the
first/last sample time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6593779224d5..7d4f0a5de326 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1300,7 +1300,9 @@ int cmd_report(int argc, const char **argv)
 	if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
 		if (session->evlist->first_sample_time == 0 &&
 		    session->evlist->last_sample_time == 0) {
-			pr_err("No first/last sample time in perf data\n");
+			pr_err("HINT: no first/last sample time found in perf data.\n"
+			       "Please use latest perf binary to execute 'perf record'\n"
+			       "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
 			return -EINVAL;
 		}
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 25/32] perf script: Improve error msg when no first/last sample time found
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (23 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 24/32] perf report: Improve error msg when no first/last sample time found Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 26/32] perf util: Improve error checking for time percent input Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

The following message will be returned to user when executing 'perf
script --time' if perf data file doesn't contain the first/last sample
time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 08bc818f371b..ac781916e51e 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3449,7 +3449,9 @@ int cmd_script(int argc, const char **argv)
 	if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
 		if (session->evlist->first_sample_time == 0 &&
 		    session->evlist->last_sample_time == 0) {
-			pr_err("No first/last sample time in perf data\n");
+			pr_err("HINT: no first/last sample time found in perf data.\n"
+			       "Please use latest perf binary to execute 'perf record'\n"
+			       "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
 			err = -EINVAL;
 			goto out_delete;
 		}
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 26/32] perf util: Improve error checking for time percent input
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (24 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 25/32] perf script: " Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 27/32] perf util: Support no index time percent slice Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

The command line like 'perf report --stdio --time 1abc%/1' could be
accepted by perf. It looks not very good.

This patch uses strtod() to replace original atof() and check the entire
string. Now for the same command line, it would return error message
"Invalid time string".

root@skl:/tmp# perf report --stdio --time 1abc%/1
Invalid time string

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/time-utils.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 3f7f18f06982..88510ab6450e 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -116,7 +116,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
 
 static int parse_percent(double *pcnt, char *str)
 {
-	char *c;
+	char *c, *endptr;
+	double d;
 
 	c = strchr(str, '%');
 	if (c)
@@ -124,8 +125,11 @@ static int parse_percent(double *pcnt, char *str)
 	else
 		return -1;
 
-	*pcnt = atof(str) / 100.0;
+	d = strtod(str, &endptr);
+	if (endptr != str + strlen(str))
+		return -1;
 
+	*pcnt = d / 100.0;
 	return 0;
 }
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 27/32] perf util: Support no index time percent slice
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (25 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 26/32] perf util: Improve error checking for time percent input Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 28/32] perf report: Add an indication of what time slices are used Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

Previously, the time percent slice needs an index to specify which one
the user wants.

It may be easier to use if the index can be omitted.  So with this
patch, for example,

perf report --stdio --time 10%/1 should be equivalent to
perf report --stdio --time 10%

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-5-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/time-utils.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 88510ab6450e..5769f972c23e 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -261,6 +261,37 @@ static int percent_comma_split(struct perf_time_interval *ptime_buf, int num,
 	return i;
 }
 
+static int one_percent_convert(struct perf_time_interval *ptime_buf,
+			       const char *ostr, u64 start, u64 end, char *c)
+{
+	char *str;
+	int len = strlen(ostr), ret;
+
+	/*
+	 * c points to '%'.
+	 * '%' should be the last character
+	 */
+	if (ostr + len - 1 != c)
+		return -1;
+
+	/*
+	 * Construct a string like "xx%/1"
+	 */
+	str = malloc(len + 3);
+	if (str == NULL)
+		return -ENOMEM;
+
+	memcpy(str, ostr, len);
+	strcpy(str + len, "/1");
+
+	ret = percent_slash_split(str, ptime_buf, start, end);
+	if (ret == 0)
+		ret = 1;
+
+	free(str);
+	return ret;
+}
+
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 				 const char *ostr, u64 start, u64 end)
 {
@@ -270,6 +301,7 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 	 * ostr example:
 	 * 10%/2,10%/3: select the second 10% slice and the third 10% slice
 	 * 0%-10%,30%-40%: multiple time range
+	 * 50%: just one percent
 	 */
 
 	memset(ptime_buf, 0, sizeof(*ptime_buf) * num);
@@ -286,6 +318,10 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 					   end, percent_dash_split);
 	}
 
+	c = strchr(ostr, '%');
+	if (c)
+		return one_percent_convert(ptime_buf, ostr, start, end, c);
+
 	return -1;
 }
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 28/32] perf report: Add an indication of what time slices are used
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (26 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 27/32] perf util: Support no index time percent slice Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 29/32] perf util: Allocate time slices buffer according to number of comma Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

Add a time slices indication to the perf report header.

For example,

  # perf report --stdio --time 10%

  # Total Lost Samples: 0
  #
  # Samples: 9K of event 'cycles:ppp' (time slices: 10%)
  # Event count (approx.): 8951288803

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested--by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-6-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 7d4f0a5de326..4aaaa37262a8 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -404,6 +404,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
 	if (evname != NULL)
 		ret += fprintf(fp, " of event '%s'", evname);
 
+	if (rep->time_str)
+		ret += fprintf(fp, " (time slices: %s)", rep->time_str);
+
 	if (symbol_conf.show_ref_callgraph &&
 	    strstr(evname, "call-graph=no")) {
 		ret += fprintf(fp, ", show reference callgraph");
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 29/32] perf util: Allocate time slices buffer according to number of comma
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (27 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 28/32] perf report: Add an indication of what time slices are used Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 30/32] perf report: Remove the time slices number limitation Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

Previously we use a magic number 10 to limit the number of time slices.
It's not very good.

This patch creates a new function perf_time__range_alloc() to allocate
time slices buffer. The number of buffer entries is determined by the
number of comma in string but at least it will allocate one entry even
if no comma is found.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-7-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/time-utils.c | 28 ++++++++++++++++++++++++++++
 tools/perf/util/time-utils.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 5769f972c23e..6193b46050a5 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -325,6 +325,34 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 	return -1;
 }
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size)
+{
+	const char *p1, *p2;
+	int i = 1;
+	struct perf_time_interval *ptime;
+
+	/*
+	 * At least allocate one time range.
+	 */
+	if (!ostr)
+		goto alloc;
+
+	p1 = ostr;
+	while (p1 < ostr + strlen(ostr)) {
+		p2 = strchr(p1, ',');
+		if (!p2)
+			break;
+
+		p1 = p2 + 1;
+		i++;
+	}
+
+alloc:
+	*size = i;
+	ptime = calloc(i, sizeof(*ptime));
+	return ptime;
+}
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
 {
 	/* if time is not set don't drop sample */
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 34d5eba26bf5..70b177d2b98c 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -16,6 +16,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr);
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 				 const char *ostr, u64 start, u64 end);
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size);
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
 bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 30/32] perf report: Remove the time slices number limitation
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (28 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 29/32] perf util: Allocate time slices buffer according to number of comma Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 31/32] perf script: " Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Jiri Olsa, Kan Liang, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

Previously it was only allowed to use at most 10 time slices in 'perf
report --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf report --stdio --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-8-git-send-email-yao.jin@linux.intel.com
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-report.txt |  2 +-
 tools/perf/builtin-report.c              | 22 ++++++++++++++++------
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 63d0db3184c9..907e505b6309 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -403,7 +403,7 @@ OPTIONS
 	to end of file.
 
 	Also support time percent with multiple time range. Time string is
-	'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 10.
+	'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
 	For example:
 	Select the second 10% time slice:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 4aaaa37262a8..42a52dcc41cd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -54,8 +54,6 @@
 #include <unistd.h>
 #include <linux/mman.h>
 
-#define PTIME_RANGE_MAX	10
-
 struct report {
 	struct perf_tool	tool;
 	struct perf_session	*session;
@@ -76,7 +74,8 @@ struct report {
 	const char		*cpu_list;
 	const char		*symbol_filter_str;
 	const char		*time_str;
-	struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+	struct perf_time_interval *ptime_range;
+	int			range_size;
 	int			range_num;
 	float			min_percent;
 	u64			nr_entries;
@@ -1300,24 +1299,33 @@ int cmd_report(int argc, const char **argv)
 	if (symbol__init(&session->header.env) < 0)
 		goto error;
 
+	report.ptime_range = perf_time__range_alloc(report.time_str,
+						    &report.range_size);
+	if (!report.ptime_range) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
 	if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
 		if (session->evlist->first_sample_time == 0 &&
 		    session->evlist->last_sample_time == 0) {
 			pr_err("HINT: no first/last sample time found in perf data.\n"
 			       "Please use latest perf binary to execute 'perf record'\n"
 			       "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
-			return -EINVAL;
+			ret = -EINVAL;
+			goto error;
 		}
 
 		report.range_num = perf_time__percent_parse_str(
-					report.ptime_range, PTIME_RANGE_MAX,
+					report.ptime_range, report.range_size,
 					report.time_str,
 					session->evlist->first_sample_time,
 					session->evlist->last_sample_time);
 
 		if (report.range_num < 0) {
 			pr_err("Invalid time string\n");
-			return -EINVAL;
+			ret = -EINVAL;
+			goto error;
 		}
 	} else {
 		report.range_num = 1;
@@ -1333,6 +1341,8 @@ int cmd_report(int argc, const char **argv)
 		ret = 0;
 
 error:
+	zfree(&report.ptime_range);
+
 	perf_session__delete(session);
 	return ret;
 }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 31/32] perf script: Remove the time slices number limitation
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (29 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 30/32] perf report: Remove the time slices number limitation Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:12 ` [PATCH 32/32] perf record: Fix failed memory allocation for get_cpuid_str Arnaldo Carvalho de Melo
  2018-01-17 16:22 ` [GIT PULL 00/32] perf/core improvements and fixes Ingo Molnar
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

Previously it was only allowed to use at most 10 time slices in 'perf
script --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf script --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515596433-24653-9-git-send-email-yao.jin@linux.intel.com
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt | 10 +++++-----
 tools/perf/builtin-script.c              | 16 ++++++++++++----
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 806ec6391fd6..7730c1d2b5d3 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -351,19 +351,19 @@ include::itrace.txt[]
 	to end of file.
 
 	Also support time percent with multipe time range. Time string is
-	'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 10.
+	'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
 	For example:
-	Select the second 10% time slice
+	Select the second 10% time slice:
 	perf script --time 10%/2
 
-	Select from 0% to 10% time slice
+	Select from 0% to 10% time slice:
 	perf script --time 0%-10%
 
-	Select the first and second 10% time slices
+	Select the first and second 10% time slices:
 	perf script --time 10%/1,10%/2
 
-	Select from 0% to 10% and 30% to 40% slices
+	Select from 0% to 10% and 30% to 40% slices:
 	perf script --time 0%-10%,30%-40%
 
 --max-blocks::
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ac781916e51e..3499d68e1d70 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1480,8 +1480,6 @@ static int perf_sample__fprintf_synth(struct perf_sample *sample,
 	return 0;
 }
 
-#define PTIME_RANGE_MAX	10
-
 struct perf_script {
 	struct perf_tool	tool;
 	struct perf_session	*session;
@@ -1496,7 +1494,8 @@ struct perf_script {
 	struct thread_map	*threads;
 	int			name_width;
 	const char              *time_str;
-	struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+	struct perf_time_interval *ptime_range;
+	int			range_size;
 	int			range_num;
 };
 
@@ -3445,6 +3444,13 @@ int cmd_script(int argc, const char **argv)
 	if (err < 0)
 		goto out_delete;
 
+	script.ptime_range = perf_time__range_alloc(script.time_str,
+						    &script.range_size);
+	if (!script.ptime_range) {
+		err = -ENOMEM;
+		goto out_delete;
+	}
+
 	/* needs to be parsed after looking up reference time */
 	if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
 		if (session->evlist->first_sample_time == 0 &&
@@ -3457,7 +3463,7 @@ int cmd_script(int argc, const char **argv)
 		}
 
 		script.range_num = perf_time__percent_parse_str(
-					script.ptime_range, PTIME_RANGE_MAX,
+					script.ptime_range, script.range_size,
 					script.time_str,
 					session->evlist->first_sample_time,
 					session->evlist->last_sample_time);
@@ -3476,6 +3482,8 @@ int cmd_script(int argc, const char **argv)
 	flush_scripting();
 
 out_delete:
+	zfree(&script.ptime_range);
+
 	perf_evlist__free_stats(session->evlist);
 	perf_session__delete(session);
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 32/32] perf record: Fix failed memory allocation for get_cpuid_str
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (30 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 31/32] perf script: " Arnaldo Carvalho de Melo
@ 2018-01-17 16:12 ` Arnaldo Carvalho de Melo
  2018-01-17 16:22 ` [GIT PULL 00/32] perf/core improvements and fixes Ingo Molnar
  32 siblings, 0 replies; 34+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-01-17 16:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Thomas Richter, Heiko Carstens,
	Hendrik Brueckner, Martin Schwidefsky, Arnaldo Carvalho de Melo

From: Thomas Richter <tmricht@linux.vnet.ibm.com>

In x86 architecture dependend part function get_cpuid_str() mallocs a
128 byte buffer, but does not check if the memory allocation succeeded
or not.

When the memory allocation fails, function __get_cpuid() is called with
first parameter being a NULL pointer.  However this function references
its first parameter and operates on a NULL pointer which might cause
core dumps.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180117131611.34319-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/util/header.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/header.c b/tools/perf/arch/x86/util/header.c
index b626d2bad9f1..fb0d71afee8b 100644
--- a/tools/perf/arch/x86/util/header.c
+++ b/tools/perf/arch/x86/util/header.c
@@ -70,7 +70,7 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	char *buf = malloc(128);
 
-	if (__get_cpuid(buf, 128, "%s-%u-%X$") < 0) {
+	if (buf && __get_cpuid(buf, 128, "%s-%u-%X$") < 0) {
 		free(buf);
 		return NULL;
 	}
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [GIT PULL 00/32] perf/core improvements and fixes
  2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (31 preceding siblings ...)
  2018-01-17 16:12 ` [PATCH 32/32] perf record: Fix failed memory allocation for get_cpuid_str Arnaldo Carvalho de Melo
@ 2018-01-17 16:22 ` Ingo Molnar
  32 siblings, 0 replies; 34+ messages in thread
From: Ingo Molnar @ 2018-01-17 16:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, Adrian Hunter,
	Alexander Shishkin, Andi Kleen, Andrew Morton, Dan Williams,
	David Ahern, Dongjiu Geng, Federico Vaga, Gopanapalli Pradeep,
	Heiko Carstens, Hendrik Brueckner, Jan Kiszka, Jin Yao,
	Jiri Olsa, Joe Perches, Kan Liang, Kim Phillips,
	linux-arm-kernel, Luis de Bethencourt, Marc Zyngier,
	Mark Rutland, Martin Schwidefsky, Mathieu Poirier,
	Michael Sartain, Namhyung Kim, Noel Grandin, Pawel Moll,
	Peter Zijlstra, Philippe Ombredanne, Rob Herring,
	Stephane Eranian, Steven Rostedt, Suzuki Poulouse, Taeung Song,
	Thomas Gleixner, Thomas Richter, Wang Nan, Will Deacon,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 1ccb8feda7471efb62ebf68d70055b4c51fa7d92:
> 
>   Merge tag 'perf-core-for-mingo-4.16-20180110' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2018-01-11 06:53:06 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.16-20180117
> 
> for you to fetch changes up to 81fccd6ca507d3b2012eaf1edeb9b1dbf4bd22db:
> 
>   perf record: Fix failed memory allocation for get_cpuid_str (2018-01-17 10:31:25 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> - Fix various per event 'max-stack' and 'call-graph=dwarf' issues,
>   mostly in 'perf trace', allowing to use 'perf trace --call-graph' with
>   'dwarf' and 'fp' to setup the callgraph details for the syscall events
>   and make that apply to other events, whilhe allowing to override that on
>   a per-event basis, using '-e sched:*switch/call-graph=dwarf/' for
>   instance (Arnaldo Carvalho de Melo)
> 
> - Improve the --time percent support in record/report/script (Jin Yao)
> 
> - Fix copyfile_offset update of output offset (Jiri Olsa)
> 
> - Add python script to profile and resolve physical mem type (Kan Liang)
> 
> - Add ARM Statistical Profiling Extensions (SPE) support (Kim Phillips)
> 
> - Remove trailing semicolon in the evlist code (Luis de Bethencourt)
> 
> - Fix incorrect handling of type _TERM_DRV_CFG (Mathieu Poirier)
> 
> - Use asprintf when possible in libtraceevent (Federico Vaga)
> 
> - Fix bad force_token escape sequence in libtraceevent (Michael Sartain)
> 
> - Add UL suffix to MISSING_EVENTS in libtraceevent (Michael Sartain)
> 
> - value of unknown symbolic fields in libtraceevent (Jan Kiszka)
> 
> - libtraceevent updates: (Steven Rostedt)
>   o Show value of flags that have not been parsed
>   o Simplify pointer print logic and fix %pF
>   o Handle new pointer processing of bprint strings
>   o Show contents (in hex) of data of unrecognized type records
>   o Fix get_field_str() for dynamic strings
> 
> - Add missing break in FALSE case of pevent_filter_clear_trivial() (Taeung Song)
> 
> - Fix failed memory allocation for get_cpuid_str (Thomas Richter)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> 
> Arnaldo Carvalho de Melo (8):
>       perf trace: No need to set PERF_SAMPLE_IDENTIFIER explicitely
>       perf evsel: Check if callchain is enabled before setting it up
>       perf trace: Fix setting of --call-graph/--max-stack for non-syscall events
>       perf callchain: Fix attr.sample_max_stack setting
>       perf unwind: Do not look just at the global callchain_param.record_mode
>       perf trace: Setup DWARF callchains for non-syscall events when --max-stack is used
>       perf trace: Allow overriding global --max-stack per event
>       perf callchains: Ask for PERF_RECORD_MMAP for data mmaps for DWARF unwinding
> 
> Federico Vaga (1):
>       tools lib traceevent: Use asprintf when possible
> 
> Jan Kiszka (1):
>       tools lib traceevent: Print value of unknown symbolic fields
> 
> Jin Yao (8):
>       perf report: Improve error msg when no first/last sample time found
>       perf script: Improve error msg when no first/last sample time found
>       perf util: Improve error checking for time percent input
>       perf util: Support no index time percent slice
>       perf report: Add an indication of what time slices are used
>       perf util: Allocate time slices buffer according to number of comma
>       perf report: Remove the time slices number limitation
>       perf script: Remove the time slices number limitation
> 
> Jiri Olsa (1):
>       perf tools: Fix copyfile_offset update of output offset
> 
> Kan Liang (1):
>       perf script python: Add script to profile and resolve physical mem type
> 
> Kim Phillips (1):
>       perf tools: Add ARM Statistical Profiling Extensions (SPE) support
> 
> Luis de Bethencourt (1):
>       perf evlist: Remove trailing semicolon
> 
> Mathieu Poirier (1):
>       perf evsel: Fix incorrect handling of type _TERM_DRV_CFG
> 
> Michael Sartain (2):
>       tools lib traceevent: Fix bad force_token escape sequence
>       tools lib traceevent: Add UL suffix to MISSING_EVENTS
> 
> Steven Rostedt (VMware) (5):
>       tools lib traceevent: Show value of flags that have not been parsed
>       tools lib traceevent: Simplify pointer print logic and fix %pF
>       tools lib traceevent: Handle new pointer processing of bprint strings
>       tools lib traceevent: Show contents (in hex) of data of unrecognized type records
>       tools lib traceevent: Fix get_field_str() for dynamic strings
> 
> Taeung Song (1):
>       tools lib traceevent: Fix missing break in FALSE case of pevent_filter_clear_trivial()
> 
> Thomas Richter (1):
>       perf record: Fix failed memory allocation for get_cpuid_str
> 
>  tools/lib/traceevent/event-parse.c                 |  62 ++-
>  tools/lib/traceevent/event-plugin.c                |  24 +-
>  tools/lib/traceevent/kbuffer-parse.c               |   4 +-
>  tools/lib/traceevent/parse-filter.c                |  22 +-
>  tools/perf/Documentation/perf-report.txt           |   2 +-
>  tools/perf/Documentation/perf-script.txt           |  10 +-
>  tools/perf/arch/arm/util/auxtrace.c                |  77 +++-
>  tools/perf/arch/arm/util/pmu.c                     |   6 +
>  tools/perf/arch/arm64/util/Build                   |   3 +-
>  tools/perf/arch/arm64/util/arm-spe.c               | 225 ++++++++++
>  tools/perf/arch/x86/util/header.c                  |   2 +-
>  tools/perf/builtin-c2c.c                           |   5 +-
>  tools/perf/builtin-report.c                        |  34 +-
>  tools/perf/builtin-script.c                        |  25 +-
>  tools/perf/builtin-trace.c                         |  62 +--
>  tools/perf/scripts/python/bin/mem-phys-addr-record |  19 +
>  tools/perf/scripts/python/bin/mem-phys-addr-report |   3 +
>  tools/perf/scripts/python/mem-phys-addr.py         |  95 +++++
>  tools/perf/tests/dwarf-unwind.c                    |   1 +
>  tools/perf/util/Build                              |   2 +
>  tools/perf/util/arm-spe-pkt-decoder.c              | 462 +++++++++++++++++++++
>  tools/perf/util/arm-spe-pkt-decoder.h              |  43 ++
>  tools/perf/util/arm-spe.c                          | 231 +++++++++++
>  tools/perf/util/arm-spe.h                          |  31 ++
>  tools/perf/util/auxtrace.c                         |   3 +
>  tools/perf/util/auxtrace.h                         |   1 +
>  tools/perf/util/callchain.c                        |  10 +
>  tools/perf/util/callchain.h                        |   2 +
>  tools/perf/util/evlist.c                           |   2 +-
>  tools/perf/util/evsel.c                            |  40 +-
>  .../util/scripting-engines/trace-event-python.c    |   2 +
>  tools/perf/util/time-utils.c                       |  72 +++-
>  tools/perf/util/time-utils.h                       |   2 +
>  tools/perf/util/unwind-libunwind-local.c           |   9 +-
>  tools/perf/util/util.c                             |   2 +-
>  35 files changed, 1465 insertions(+), 130 deletions(-)
>  create mode 100644 tools/perf/arch/arm64/util/arm-spe.c
>  create mode 100644 tools/perf/scripts/python/bin/mem-phys-addr-record
>  create mode 100644 tools/perf/scripts/python/bin/mem-phys-addr-report
>  create mode 100644 tools/perf/scripts/python/mem-phys-addr.py
>  create mode 100644 tools/perf/util/arm-spe-pkt-decoder.c
>  create mode 100644 tools/perf/util/arm-spe-pkt-decoder.h
>  create mode 100644 tools/perf/util/arm-spe.c
>  create mode 100644 tools/perf/util/arm-spe.h

Pulled, thanks a lot Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2018-01-17 16:23 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-17 16:11 [GIT PULL 00/32] perf/core improvements and fixes Arnaldo Carvalho de Melo
2018-01-17 16:11 ` [PATCH 01/32] perf evsel: Fix incorrect handling of type _TERM_DRV_CFG Arnaldo Carvalho de Melo
2018-01-17 16:11 ` [PATCH 02/32] perf evlist: Remove trailing semicolon Arnaldo Carvalho de Melo
2018-01-17 16:11 ` [PATCH 03/32] perf script python: Add script to profile and resolve physical mem type Arnaldo Carvalho de Melo
2018-01-17 16:11 ` [PATCH 04/32] perf trace: No need to set PERF_SAMPLE_IDENTIFIER explicitely Arnaldo Carvalho de Melo
2018-01-17 16:11 ` [PATCH 05/32] perf tools: Fix copyfile_offset update of output offset Arnaldo Carvalho de Melo
2018-01-17 16:11 ` [PATCH 06/32] perf evsel: Check if callchain is enabled before setting it up Arnaldo Carvalho de Melo
2018-01-17 16:11 ` [PATCH 07/32] perf trace: Fix setting of --call-graph/--max-stack for non-syscall events Arnaldo Carvalho de Melo
2018-01-17 16:11 ` [PATCH 08/32] tools lib traceevent: Fix bad force_token escape sequence Arnaldo Carvalho de Melo
2018-01-17 16:11 ` [PATCH 09/32] tools lib traceevent: Show value of flags that have not been parsed Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 10/32] tools lib traceevent: Print value of unknown symbolic fields Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 11/32] tools lib traceevent: Simplify pointer print logic and fix %pF Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 12/32] tools lib traceevent: Handle new pointer processing of bprint strings Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 13/32] tools lib traceevent: Show contents (in hex) of data of unrecognized type records Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 14/32] tools lib traceevent: Use asprintf when possible Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 15/32] tools lib traceevent: Add UL suffix to MISSING_EVENTS Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 16/32] tools lib traceevent: Fix missing break in FALSE case of pevent_filter_clear_trivial() Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 17/32] tools lib traceevent: Fix get_field_str() for dynamic strings Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 18/32] perf tools: Add ARM Statistical Profiling Extensions (SPE) support Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 19/32] perf callchain: Fix attr.sample_max_stack setting Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 20/32] perf unwind: Do not look just at the global callchain_param.record_mode Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 21/32] perf trace: Setup DWARF callchains for non-syscall events when --max-stack is used Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 22/32] perf trace: Allow overriding global --max-stack per event Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 23/32] perf callchains: Ask for PERF_RECORD_MMAP for data mmaps for DWARF unwinding Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 24/32] perf report: Improve error msg when no first/last sample time found Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 25/32] perf script: " Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 26/32] perf util: Improve error checking for time percent input Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 27/32] perf util: Support no index time percent slice Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 28/32] perf report: Add an indication of what time slices are used Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 29/32] perf util: Allocate time slices buffer according to number of comma Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 30/32] perf report: Remove the time slices number limitation Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 31/32] perf script: " Arnaldo Carvalho de Melo
2018-01-17 16:12 ` [PATCH 32/32] perf record: Fix failed memory allocation for get_cpuid_str Arnaldo Carvalho de Melo
2018-01-17 16:22 ` [GIT PULL 00/32] perf/core improvements and fixes Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).