linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL] perf/core improvements and fixes
@ 2019-08-16 20:16 Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 01/17] perf vendor events intel: Add Tremontx event file v1.02 Arnaldo Carvalho de Melo
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Florian Weimer,
	William Cohen, Haiyan Song, John Keeping,
	Arnaldo Carvalho de Melo

Hi Ingo, Thomas,

	Please consider pulling,

Best regards,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 4511708b9a044f2bc83c7c7f7f8a2c45ec488219:

  Merge tag 'perf-core-for-mingo-5.4-20190814' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2019-08-15 11:10:38 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-5.4-20190816

for you to fetch changes up to e2736219e6ca3117e10651e215b96d66775220da:

  perf unwind: Remove unnecessary test (2019-08-16 12:30:14 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

report/script/trace/top:

  Arnaldo Carvalho de Melo:

  - Allow specifying marker events demarcating when to consider the other events,
    i.e. one now can state something like:

        # perf probe kernel_function
        # perf record -e cycles,probe:kernel_function

    And then, in 'perf script' or 'perf report' say:

        # perf report --switch-on=probe:kernel_function

    And then the cycles event samples will be considered only after we
    find the first probe:kernel_function event.

    There is also --switch-off=event, to make it stop considering events
    out of some window, say to avoid some winding down of a workload.

    The same can be done with the "live mode" tools: 'perf top' and 'perf trace'.

    There are examples in the cset comments showing how to use it with
    SDT events in things like 'systemtap', that have those tracepoint-like
    events for the start/end of passes, etc.

    Another example involves selecting scheduler events + entry/exit of
    a syscall, using the syscalls tracepoints, one can then see the
    scheduler events that take place while that syscall is being processed.

    In the future this should be possible in record/top/trace via eBPF
    where the perf tools would hook into the marker events and enable events
    put in place but not enabled when the on/off conditions are the desired
    ones, reducing the amount of events sampled, but this userspace only
    solution should be good enough for many scenarios.

perf vendor events intel:

  Haiyan Song:

  - Add Tremontx event file v1.02.

unwind:

  John Keeping:

  - Fix callchain unwinding when tid != pid, that was working only for the
    thread group leader.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (13):
      perf script: Allow specifying event to switch on processing of other events
      perf script: Allow showing the --switch-on event
      perf script: Allow specifying event to switch off processing of other events
      perf evswitch: Move struct to a separate header to use in other tools
      perf evswitch: Move switch logic to use in other tools
      perf evswitch: Add the names of on/off events
      perf evswitch: Introduce OPTS_EVSWITCH() for cmd line processing
      perf evswitch: Introduce init() method to set the on/off evsels from the command line
      perf evswitch: Move enoent error message printing to separate function
      perf evswitch: Add hint when not finding specified on/off events
      perf trace: Add --switch-on/--switch-off events
      perf top: Add --switch-on/--switch-off events
      perf report: Add --switch-on/--switch-off events

Haiyan Song (1):
      perf vendor events intel: Add Tremontx event file v1.02

John Keeping (3):
      perf map: Use zalloc for map_groups
      perf unwind: Fix libunwind when tid != pid
      perf unwind: Remove unnecessary test

 tools/perf/Documentation/perf-report.txt           |  17 +
 tools/perf/Documentation/perf-script.txt           |   9 +
 tools/perf/Documentation/perf-top.txt              |  38 ++
 tools/perf/Documentation/perf-trace.txt            |   9 +
 tools/perf/builtin-report.c                        |  10 +
 tools/perf/builtin-script.c                        |  10 +
 tools/perf/builtin-top.c                           |  10 +-
 tools/perf/builtin-trace.c                         |  10 +
 tools/perf/pmu-events/arch/x86/mapfile.csv         |   1 +
 tools/perf/pmu-events/arch/x86/tremontx/cache.json | 111 ++++++
 .../pmu-events/arch/x86/tremontx/frontend.json     |  26 ++
 .../perf/pmu-events/arch/x86/tremontx/memory.json  |  26 ++
 tools/perf/pmu-events/arch/x86/tremontx/other.json |  26 ++
 .../pmu-events/arch/x86/tremontx/pipeline.json     | 111 ++++++
 .../arch/x86/tremontx/uncore-memory.json           |  73 ++++
 .../pmu-events/arch/x86/tremontx/uncore-other.json | 431 +++++++++++++++++++++
 .../pmu-events/arch/x86/tremontx/uncore-power.json |  11 +
 .../arch/x86/tremontx/virtual-memory.json          |  86 ++++
 tools/perf/util/Build                              |   1 +
 tools/perf/util/evswitch.c                         |  61 +++
 tools/perf/util/evswitch.h                         |  31 ++
 tools/perf/util/map.c                              |   5 +-
 tools/perf/util/map_groups.h                       |   4 +
 tools/perf/util/thread.c                           |   7 +-
 tools/perf/util/thread.h                           |   4 -
 tools/perf/util/top.h                              |   2 +
 tools/perf/util/unwind-libunwind-local.c           |  18 +-
 tools/perf/util/unwind-libunwind.c                 |  40 +-
 tools/perf/util/unwind.h                           |  25 +-
 29 files changed, 1158 insertions(+), 55 deletions(-)
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/cache.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/frontend.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/pipeline.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-power.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/virtual-memory.json
 create mode 100644 tools/perf/util/evswitch.c
 create mode 100644 tools/perf/util/evswitch.h

Test results:

The first ones are container based builds of tools/perf with and without libelf
support.  Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

Clearlinux is failing when building with libpython, but that is not a perf
regression, will try to remove one compiler warning that is causing the problem
when building some of the glue code files in the python files, outside perf.

  # export PERF_TARBALL=http://192.168.124.1/perf/perf-5.3.0-rc4.tar.xz
  # dm
   1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final)
   2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final)
   3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final)
   4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
   5 alpine:3.8                    : Ok   gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
   6 alpine:3.9                    : Ok   gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
   7 alpine:3.10                   : Ok   gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
   8 alpine:edge                   : Ok   gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 8.0.1 (tags/RELEASE_801/final) (based on LLVM 8.0.1)
   9 amazonlinux:1                 : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final)
  10 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
  11 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  12 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  13 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  14 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
  15 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
  16 clearlinux:latest             : Ok   gcc (Clear Linux OS for Intel Architecture) 9.1.1 20190808 gcc-9-branch@274204, clang version 8.0.0 (tags/RELEASE_800/final)
  17 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
  18 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final)
  19 debian:10                     : Ok   gcc (Debian 8.3.0-6) 8.3.0, clang version 7.0.1-8 (tags/RELEASE_701/final)
  20 debian:experimental           : Ok   gcc (Debian 8.3.0-19) 8.3.0, clang version 7.0.1-9 (tags/RELEASE_701/final)
  21 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.3.0-19) 8.3.0
  22 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 8.3.0-19) 8.3.0
  23 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 8.3.0-7) 8.3.0
  24 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 8.3.0-19) 8.3.0
  25 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
  26 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.5.0 (tags/RELEASE_350/final)
  27 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final)
  28 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final)
  29 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  30 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final)
  31 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final)
  32 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final)
  33 fedora:28                     : Ok   gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final)
  34 fedora:29                     : Ok   gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29)
  35 fedora:30                     : Ok   gcc (GCC) 9.1.1 20190503 (Red Hat 9.1.1-1), clang version 8.0.0 (Fedora 8.0.0-1.fc30)
  36 fedora:30-x-ARC-glibc         : Ok   arc-linux-gcc (ARC HS GNU/Linux glibc toolchain 2019.03-rc1) 8.3.1 20190225
  37 fedora:30-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
  38 fedora:31                     : Ok   gcc (GCC) 9.1.1 20190605 (Red Hat 9.1.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc31.1)
  39 fedora:rawhide                : Ok   gcc (GCC) 9.1.1 20190605 (Red Hat 9.1.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc31.1)
  40 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 8.3.0-r1 p1.1) 8.3.0
  41 mageia:5                      : Ok   gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final)
  42 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final)
  43 mageia:7                      : Ok   gcc (Mageia 8.3.1-0.20190524.1.mga7) 8.3.1 20190524, clang version 8.0.0 (Mageia 8.0.0-1.mga7)
  44 manjaro:latest                : Ok   gcc (GCC) 9.1.0, clang version 8.0.1 (tags/RELEASE_801/final)
  45 opensuse:15.0                 : Ok   gcc (SUSE Linux) 7.4.1 20190424 [gcc-7-branch revision 270538], clang version 5.0.1 (tags/RELEASE_501/final 312548)
  46 opensuse:15.1                 : Ok   gcc (SUSE Linux) 7.4.0, clang version 7.0.1 (tags/RELEASE_701/final 349238)
  47 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5, clang version 3.8.0 (tags/RELEASE_380/final 262553)
  48 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 9.1.1 20190723 [gcc-9-branch revision 273734], clang version 8.0.1 (tags/RELEASE_801/final 366581)
  49 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
  50 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39.0.1), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
  51 oraclelinux:8                 : Ok   gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3.0.1), clang version 7.0.1 (tags/RELEASE_701/final)
  52 ubuntu:12.04                  : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
  53 ubuntu:14.04                  : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, Ubuntu clang version 3.4-1ubuntu3 (tags/RELEASE_34/final) (based on LLVM 3.4)
  54 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
  55 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  56 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  57 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  58 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  59 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  60 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  61 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
  62 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1) 7.4.0
  63 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1) 7.4.0
  64 ubuntu:18.04-x-m68k           : Ok   m68k-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  65 ubuntu:18.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  66 ubuntu:18.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  67 ubuntu:18.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  68 ubuntu:18.04-x-riscv64        : Ok   riscv64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  69 ubuntu:18.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  70 ubuntu:18.04-x-sh4            : Ok   sh4-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  71 ubuntu:18.04-x-sparc64        : Ok   sparc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  72 ubuntu:18.10                  : Ok   gcc (Ubuntu 8.3.0-6ubuntu1~18.10.1) 8.3.0, clang version 7.0.0-3 (tags/RELEASE_700/final)
  73 ubuntu:19.04                  : Ok   gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, clang version 8.0.0-3 (tags/RELEASE_800/final)
  74 ubuntu:19.04-x-alpha          : Ok   alpha-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
  75 ubuntu:19.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 8.3.0-6ubuntu1) 8.3.0
  76 ubuntu:19.04-x-hppa           : Ok   hppa-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
  77 ubuntu:19.10                  : Ok   gcc (Ubuntu 9.1.0-9ubuntu2) 9.1.0, clang version 8.0.1-+rc4-1 (tags/RELEASE_801/rc4)



  # uname -a
  Linux quaco 5.2.6-200.fc30.x86_64 #1 SMP Mon Aug 5 13:20:47 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  # git log --oneline -1
  e2736219e6ca perf unwind: Remove unnecessary test
  # perf version --build-options
  perf version 5.3.rc4.ge2736219e6ca
                   dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                   glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                    gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
           syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                  libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
               libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
               get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
                     aio: [ on  ]  # HAVE_AIO_SUPPORT
                    zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Breakpoint accounting                                 : Ok
  22: Watchpoint                                            :
  22.1: Read Only Watchpoint                                : Skip
  22.2: Write Only Watchpoint                               : Ok
  22.3: Read / Write Watchpoint                             : Ok
  22.4: Modify Watchpoint                                   : Ok
  23: Number of exit events of a simple workload            : Ok
  24: Software clock events period values                   : Ok
  25: Object code reading                                   : Ok
  26: Sample parsing                                        : Ok
  27: Use a dummy software event to keep tracking           : Ok
  28: Parse with no sample_id_all bit set                   : Ok
  29: Filter hist entries                                   : Ok
  30: Lookup mmap thread                                    : Ok
  31: Share thread mg                                       : Ok
  32: Sort output of hist entries                           : Ok
  33: Cumulate child hist entries                           : Ok
  34: Track with sched_switch                               : Ok
  35: Filter fds with revents mask in a fdarray             : Ok
  36: Add fd to a fdarray, making it autogrow               : Ok
  37: kmod_path__parse                                      : Ok
  38: Thread map                                            : Ok
  39: LLVM search and compile                               :
  39.1: Basic BPF llvm compile                              : Ok
  39.2: kbuild searching                                    : Ok
  39.3: Compile source for BPF prologue generation          : Ok
  39.4: Compile source for BPF relocation                   : Ok
  40: Session topology                                      : Ok
  41: BPF filter                                            :
  41.1: Basic BPF filtering                                 : Ok
  41.2: BPF pinning                                         : Ok
  41.3: BPF prologue generation                             : Ok
  41.4: BPF relocation checker                              : Ok
  42: Synthesize thread map                                 : Ok
  43: Remove thread map                                     : Ok
  44: Synthesize cpu map                                    : Ok
  45: Synthesize stat config                                : Ok
  46: Synthesize stat                                       : Ok
  47: Synthesize stat round                                 : Ok
  48: Synthesize attr update                                : Ok
  49: Event times                                           : Ok
  50: Read backward ring buffer                             : Ok
  51: Print cpu map                                         : Ok
  52: Probe SDT events                                      : Ok
  53: is_printable_array                                    : Ok
  54: Print bitmap                                          : Ok
  55: perf hooks                                            : Ok
  56: builtin clang support                                 : Skip (not compiled in)
  57: unit_number__scnprintf                                : Ok
  58: mem2node                                              : Ok
  59: time utils                                            : Ok
  60: map_groups__merge_in                                  : Ok
  61: x86 rdpmc                                             : Ok
  62: Convert perf time to TSC                              : Ok
  63: DWARF unwind                                          : Ok
  64: x86 instruction decoder - new instructions            : Ok
  65: Intel PT packet decoder                               : Ok
  66: x86 bp modify                                         : Ok
  67: probe libc's inet_pton & backtrace it with ping       : Ok
  68: Use vfs_getname probe to get syscall args filenames   : Ok
  69: Add vfs_getname probe to get syscall args filenames   : Ok
  70: Check open filename arg using perf trace + vfs_getname: Ok
  71: Zstd perf.data compression/decompression              : Ok
  #
  
  $ time make -C tools/perf build-test
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
                make_no_gtk2_O: make NO_GTK2=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
                 make_cscope_O: make cscope
                  make_debug_O: make DEBUG=1
           make_no_backtrace_O: make NO_BACKTRACE=1
                make_no_newt_O: make NO_NEWT=1
              make_no_libbpf_O: make NO_LIBBPF=1
             make_util_map_o_O: make util/map.o
         make_install_prefix_O: make install prefix=/tmp/krava
         make_with_clangllvm_O: make LIBCLANGLLVM=1
           make_no_libbionic_O: make NO_LIBBIONIC=1
             make_no_libnuma_O: make NO_LIBNUMA=1
              make_clean_all_O: make clean all
                   make_help_O: make help
           make_no_libpython_O: make NO_LIBPYTHON=1
       make_util_pmu_bison_o_O: make util/pmu-bison.o
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
            make_install_bin_O: make install-bin
            make_no_demangle_O: make NO_DEMANGLE=1
             make_no_libperl_O: make NO_LIBPERL=1
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
            make_no_auxtrace_O: make NO_AUXTRACE=1
                 make_static_O: make LDFLAGS=-static
              make_no_libelf_O: make NO_LIBELF=1
               make_no_slang_O: make NO_SLANG=1
                   make_tags_O: make tags
           make_no_libunwind_O: make NO_LIBUNWIND=1
                    make_doc_O: make doc
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
        make_with_babeltrace_O: make LIBBABELTRACE=1
                 make_perf_o_O: make perf.o
                make_install_O: make install
                   make_pure_O: make
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $ 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 01/17] perf vendor events intel: Add Tremontx event file v1.02
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 02/17] perf script: Allow specifying event to switch on processing of other events Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Haiyan Song, Kan Liang, Alexander Shishkin,
	Andi Kleen, Jin Yao, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Haiyan Song <haiyanx.song@intel.com>

Add a Intel event file for perf.

Signed-off-by: Haiyan Song <haiyanx.song@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190815035942.30602-1-haiyanx.song@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/pmu-events/arch/x86/mapfile.csv    |   1 +
 .../pmu-events/arch/x86/tremontx/cache.json   | 111 +++++
 .../arch/x86/tremontx/frontend.json           |  26 ++
 .../pmu-events/arch/x86/tremontx/memory.json  |  26 ++
 .../pmu-events/arch/x86/tremontx/other.json   |  26 ++
 .../arch/x86/tremontx/pipeline.json           | 111 +++++
 .../arch/x86/tremontx/uncore-memory.json      |  73 +++
 .../arch/x86/tremontx/uncore-other.json       | 431 ++++++++++++++++++
 .../arch/x86/tremontx/uncore-power.json       |  11 +
 .../arch/x86/tremontx/virtual-memory.json     |  86 ++++
 10 files changed, 902 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/cache.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/frontend.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/pipeline.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-power.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/virtual-memory.json

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index b90e5fec2f32..745ced083844 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -35,4 +35,5 @@ GenuineIntel-6-55-[01234],v1,skylakex,core
 GenuineIntel-6-55-[56789ABCDEF],v1,cascadelakex,core
 GenuineIntel-6-7D,v1,icelake,core
 GenuineIntel-6-7E,v1,icelake,core
+GenuineIntel-6-86,v1,tremontx,core
 AuthenticAMD-23-[[:xdigit:]]+,v1,amdfam17h,core
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/cache.json b/tools/perf/pmu-events/arch/x86/tremontx/cache.json
new file mode 100644
index 000000000000..f88040171b4d
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/cache.json
@@ -0,0 +1,111 @@
+[
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts cacheable memory requests that miss in the the Last Level Cache.  Requests include Demand Loads, Reads for Ownership(RFO), Instruction fetches and L1 HW prefetches. If the platform has an L3 cache, last level cache is the L3, otherwise it is the L2.",
+        "EventCode": "0x2e",
+        "Counter": "0,1,2,3",
+        "UMask": "0x41",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "LONGEST_LAT_CACHE.MISS",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts memory requests originating from the core that miss in the last level cache. If the platform has an L3 cache, last level cache is the L3, otherwise it is the L2."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts cacheable memory requests that access the Last Level Cache.  Requests include Demand Loads, Reads for Ownership(RFO), Instruction fetches and L1 HW prefetches. If the platform has an L3 cache, last level cache is the L3, otherwise it is the L2.",
+        "EventCode": "0x2e",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4f",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "LONGEST_LAT_CACHE.REFERENCE",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts memory requests originating from the core that reference a cache line in the last level cache. If the platform has an L3 cache, last level cache is the L3, otherwise it is the L2."
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of load uops retired. This event is Precise Event capable",
+        "EventCode": "0xd0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x81",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of load uops retired.",
+        "Data_LA": "1"
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of store uops retired. This event is Precise Event capable",
+        "EventCode": "0xd0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x82",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of store uops retired.",
+        "Data_LA": "1"
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "EventCode": "0xd1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "MEM_LOAD_UOPS_RETIRED.L1_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of load uops retired that hit the level 1 data cache",
+        "Data_LA": "1"
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "EventCode": "0xd1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "MEM_LOAD_UOPS_RETIRED.L2_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of load uops retired that hit in the level 2 cache",
+        "Data_LA": "1"
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "EventCode": "0xd1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "MEM_LOAD_UOPS_RETIRED.L3_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of load uops retired that miss in the level 3 cache"
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "EventCode": "0xd1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "MEM_LOAD_UOPS_RETIRED.L1_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of load uops retired that miss in the level 1 data cache",
+        "Data_LA": "1"
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "EventCode": "0xd1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "MEM_LOAD_UOPS_RETIRED.L2_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of load uops retired that miss in the level 2 cache",
+        "Data_LA": "1"
+    }
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/frontend.json b/tools/perf/pmu-events/arch/x86/tremontx/frontend.json
new file mode 100644
index 000000000000..73b0a1ed5756
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/frontend.json
@@ -0,0 +1,26 @@
+[
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts requests to the Instruction Cache (ICache)  for one or more bytes in an ICache Line and that cache line is not in the ICache (miss).  The event strives to count on a cache line basis, so that multiple accesses which miss in a single cache line count as one ICACHE.MISS.  Specifically, the event counts when straight line code crosses the cache line boundary, or when a branch target is to a new line, and that cache line is not in the ICache.",
+        "EventCode": "0x80",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "ICACHE.MISSES",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts requests to the Instruction Cache (ICache) for one or more bytes in a cache line and they do not hit in the ICache (miss)."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts requests to the Instruction Cache (ICache) for one or more bytes in an ICache Line.  The event strives to count on a cache line basis, so that multiple fetches to a single cache line count as one ICACHE.ACCESS.  Specifically, the event counts when accesses from straight line code crosses the cache line boundary, or when a branch target is to a new line.",
+        "EventCode": "0x80",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "ICACHE.ACCESSES",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts requests to the Instruction Cache (ICache) for one or more bytes cache Line."
+    }
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/memory.json b/tools/perf/pmu-events/arch/x86/tremontx/memory.json
new file mode 100644
index 000000000000..65469e84f35b
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/memory.json
@@ -0,0 +1,26 @@
+[
+    {
+        "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "EventCode": "0XB7",
+        "MSRValue": "0x000000003F04000001",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OCR.DEMAND_DATA_RD.L3_MISS",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts demand data reads that was not supplied by the L3 cache.",
+        "Offcore": "1"
+    },
+    {
+        "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "EventCode": "0XB7",
+        "MSRValue": "0x000000003F04000002",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OCR.DEMAND_RFO.L3_MISS",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts all demand reads for ownership (RFO) requests and software based prefetches for exclusive ownership (PREFETCHW) that was not supplied by the L3 cache.",
+        "Offcore": "1"
+    }
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/other.json b/tools/perf/pmu-events/arch/x86/tremontx/other.json
new file mode 100644
index 000000000000..85bf3c8f3914
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/other.json
@@ -0,0 +1,26 @@
+[
+    {
+        "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "EventCode": "0XB7",
+        "MSRValue": "0x000000000000010001",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OCR.DEMAND_DATA_RD.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts demand data reads that have any response type.",
+        "Offcore": "1"
+    },
+    {
+        "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "EventCode": "0XB7",
+        "MSRValue": "0x000000000000010002",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OCR.DEMAND_RFO.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts all demand reads for ownership (RFO) requests and software based prefetches for exclusive ownership (PREFETCHW) that have any response type.",
+        "Offcore": "1"
+    }
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/pipeline.json b/tools/perf/pmu-events/arch/x86/tremontx/pipeline.json
new file mode 100644
index 000000000000..05a8f6a7d9c0
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/pipeline.json
@@ -0,0 +1,111 @@
+[
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of instructions that retire. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers.  This event uses fixed counter 0.",
+        "Counter": "32",
+        "UMask": "0x1",
+        "PEBScounters": "32",
+        "EventName": "INST_RETIRED.ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of instructions retired. (Fixed event)"
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of core cycles while the core is not in a halt state.  The core enters the halt state when it is running the HLT instruction. The core frequency may change from time to time. For this reason this event may have a changing ratio with regards to time.  This event uses fixed counter 1.",
+        "Counter": "33",
+        "UMask": "0x2",
+        "PEBScounters": "33",
+        "EventName": "CPU_CLK_UNHALTED.CORE",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of unhalted core clock cycles. (Fixed event)"
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of reference cycles that the core is not in a halt state. The core enters the halt state when it is running the HLT instruction.  The core frequency may change from time.  This event is not affected by core frequency changes and at a fixed frequency.  This event uses fixed counter 2.",
+        "Counter": "34",
+        "UMask": "0x3",
+        "PEBScounters": "34",
+        "EventName": "CPU_CLK_UNHALTED.REF_TSC",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of unhalted reference clock cycles at TSC frequency. (Fixed event)"
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of core cycles while the core is not in a halt state.  The core enters the halt state when it is running the HLT instruction. The core frequency may change from time to time. For this reason this event may have a changing ratio with regards to time.  This event uses a programmable general purpose performance counter.",
+        "EventCode": "0x3c",
+        "Counter": "0,1,2,3",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.CORE_P",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of unhalted core clock cycles."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts reference cycles (at TSC frequency) when core is not halted.  This event uses a programmable general purpose perfmon counter.",
+        "EventCode": "0x3c",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.REF",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of unhalted reference clock cycles at TSC frequency."
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The event continues counting during hardware interrupts, traps, and inside interrupt handlers.  This is an architectural performance event.  This event uses a Programmable general purpose perfmon counter. *This event is Precise Event capable:  The EventingRIP field in the PEBS record is precise to the address of the instruction which caused the event.",
+        "EventCode": "0xc0",
+        "Counter": "0,1,2,3",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "INST_RETIRED.ANY_P",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of instructions retired."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "EventCode": "0xc3",
+        "Counter": "0,1,2,3",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.ANY",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "20003",
+        "BriefDescription": "Counts all machine clears due to, but not limited to memory ordering, memory disambiguation, SMC, page faults and FP assist."
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts branch instructions retired for all branch types. This event is Precise Event capable. This is an architectural event.",
+        "EventCode": "0xc4",
+        "Counter": "0,1,2,3",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of branch instructions retired for all branch types."
+    },
+    {
+        "PEBS": "1",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts mispredicted branch instructions retired for all branch types. This event is Precise Event capable. This is an architectural event.",
+        "EventCode": "0xc5",
+        "Counter": "0,1,2,3",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of mispredicted branch instructions retired."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "EventCode": "0xcd",
+        "Counter": "0,1,2,3",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "CYCLES_DIV_BUSY.ANY",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts cycles the floating point divider or integer divider or both are busy.  Does not imply a stall waiting for either divider."
+    }
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/uncore-memory.json b/tools/perf/pmu-events/arch/x86/tremontx/uncore-memory.json
new file mode 100644
index 000000000000..15376f2cf052
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/uncore-memory.json
@@ -0,0 +1,73 @@
+[
+    {
+        "BriefDescription": "read requests to memory controller. Derived from unc_m_cas_count.rd",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x04",
+        "EventName": "LLC_MISSES.MEM_READ",
+        "PerPkg": "1",
+        "ScaleUnit": "64Bytes",
+        "UMask": "0x0f",
+        "Unit": "iMC"
+    },
+    {
+        "BriefDescription": "write requests to memory controller. Derived from unc_m_cas_count.wr",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x04",
+        "EventName": "LLC_MISSES.MEM_WRITE",
+        "PerPkg": "1",
+        "ScaleUnit": "64Bytes",
+        "UMask": "0x30",
+        "Unit": "iMC"
+    },
+    {
+        "BriefDescription": "Memory controller clock ticks",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventName": "UNC_M_CLOCKTICKS",
+        "PerPkg": "1",
+        "Unit": "iMC"
+    },
+    {
+        "BriefDescription": "Pre-charge for reads",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x02",
+        "EventName": "UNC_M_PRE_COUNT.RD",
+        "PerPkg": "1",
+        "UMask": "0x04",
+        "Unit": "iMC"
+    },
+    {
+        "BriefDescription": "Pre-charge for writes",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x02",
+        "EventName": "UNC_M_PRE_COUNT.WR",
+        "PerPkg": "1",
+        "UMask": "0x08",
+        "Unit": "iMC"
+    },
+    {
+        "BriefDescription": "Precharge due to read on page miss, write on page miss or PGT",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x02",
+        "EventName": "UNC_M_PRE_COUNT.ALL",
+        "PerPkg": "1",
+        "UMask": "0x1c",
+        "Unit": "iMC"
+    },
+    {
+        "BriefDescription": "DRAM Precharge commands. : Precharge due to page table",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x02",
+        "EventName": "UNC_M_PRE_COUNT.PGT",
+        "PerPkg": "1",
+        "PublicDescription": "DRAM Precharge commands. : Precharge due to page table : Counts the number of DRAM Precharge commands sent on this channel.",
+        "UMask": "0x10",
+        "Unit": "iMC"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/uncore-other.json b/tools/perf/pmu-events/arch/x86/tremontx/uncore-other.json
new file mode 100644
index 000000000000..6deff1fe89e3
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/uncore-other.json
@@ -0,0 +1,431 @@
+[
+    {
+        "BriefDescription": "Uncore cache clock ticks",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventName": "UNC_CHA_CLOCKTICKS",
+        "PerPkg": "1",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "LLC misses - Uncacheable reads (from cpu) . Derived from unc_cha_tor_inserts.ia_miss",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "LLC_MISSES.UNCACHEABLE",
+        "Filter": "config1=0x40e33",
+        "PerPkg": "1",
+        "UMask": "0xC001FE01",
+        "UMaskExt": "0xC001FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "MMIO reads. Derived from unc_cha_tor_inserts.ia_miss",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "LLC_MISSES.MMIO_READ",
+        "Filter": "config1=0x40040e33",
+        "PerPkg": "1",
+        "UMask": "0xC001FE01",
+        "UMaskExt": "0xC001FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "MMIO writes. Derived from unc_cha_tor_inserts.ia_miss",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "LLC_MISSES.MMIO_WRITE",
+        "Filter": "config1=0x40041e33",
+        "PerPkg": "1",
+        "UMask": "0xC001FE01",
+        "UMaskExt": "0xC001FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "Streaming stores (full cache line). Derived from unc_cha_tor_inserts.ia_miss",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "LLC_REFERENCES.STREAMING_FULL",
+        "Filter": "config1=0x41833",
+        "PerPkg": "1",
+        "ScaleUnit": "64Bytes",
+        "UMask": "0xC001FE01",
+        "UMaskExt": "0xC001FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "Streaming stores (partial cache line). Derived from unc_cha_tor_inserts.ia_miss",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "LLC_REFERENCES.STREAMING_PARTIAL",
+        "Filter": "config1=0x41a33",
+        "PerPkg": "1",
+        "ScaleUnit": "64Bytes",
+        "UMask": "0xC001FE01",
+        "UMaskExt": "0xC001FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "PCI Express bandwidth reading at IIO. Derived from unc_iio_data_req_of_cpu.mem_read.part0",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "LLC_MISSES.PCIE_READ",
+        "FCMask": "0x07",
+        "Filter": "ch_mask=0x1f",
+        "MetricExpr": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART0 +UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART1 +UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART2 +UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART3",
+        "MetricName": "LLC_MISSES.PCIE_READ",
+        "PerPkg": "1",
+        "PortMask": "0x01",
+        "ScaleUnit": "4Bytes",
+        "UMask": "0x04",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "PCI Express bandwidth writing at IIO. Derived from unc_iio_data_req_of_cpu.mem_write.part0",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "LLC_MISSES.PCIE_WRITE",
+        "FCMask": "0x07",
+        "Filter": "ch_mask=0x1f",
+        "MetricExpr": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 +UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 +UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 +UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3",
+        "MetricName": "LLC_MISSES.PCIE_WRITE",
+        "PerPkg": "1",
+        "PortMask": "0x01",
+        "ScaleUnit": "4Bytes",
+        "UMask": "0x01",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "PCI Express bandwidth writing at IIO, part 1",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x02",
+        "ScaleUnit": "4Bytes",
+        "UMask": "0x01",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "PCI Express bandwidth writing at IIO, part 2",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x04",
+        "ScaleUnit": "4Bytes",
+        "UMask": "0x01",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "PCI Express bandwidth writing at IIO, part 3",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x08",
+        "ScaleUnit": "4Bytes",
+        "UMask": "0x01",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "PCI Express bandwidth reading at IIO, part 1",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART1",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x02",
+        "ScaleUnit": "4Bytes",
+        "UMask": "0x04",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "PCI Express bandwidth reading at IIO, part 2",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART2",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x04",
+        "ScaleUnit": "4Bytes",
+        "UMask": "0x04",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "PCI Express bandwidth reading at IIO, part 3",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART3",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x08",
+        "ScaleUnit": "4Bytes",
+        "UMask": "0x04",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "TOR Inserts; CRd misses from local IA",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_CRD",
+        "PerPkg": "1",
+        "PublicDescription": "TOR Inserts; Code read from local IA that misses in the snoop filter",
+        "UMask": "0xC80FFE01",
+        "UMaskExt": "0xC80FFE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "TOR Inserts; CRd Pref misses from local IA",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_CRD_PREF",
+        "PerPkg": "1",
+        "PublicDescription": "TOR Inserts; Code read prefetch from local IA that misses in the snoop filter",
+        "UMask": "0xC88FFE01",
+        "UMaskExt": "0xC88FFE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "TOR Inserts; DRd Opt misses from local IA",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_OPT",
+        "PerPkg": "1",
+        "PublicDescription": "TOR Inserts; Data read opt from local IA that misses in the snoop filter",
+        "UMask": "0xC827FE01",
+        "UMaskExt": "0xC827FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "TOR Inserts; DRd Opt Pref misses from local IA",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_OPT_PREF",
+        "PerPkg": "1",
+        "PublicDescription": "TOR Inserts; Data read opt prefetch from local IA that misses in the snoop filter",
+        "UMask": "0xC8A7FE01",
+        "UMaskExt": "0xC8A7FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "TOR Inserts; RFO misses from local IA",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO",
+        "PerPkg": "1",
+        "PublicDescription": "TOR Inserts; Read for ownership from local IA that misses in the snoop filter",
+        "UMask": "0xC807FE01",
+        "UMaskExt": "0xC807FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "TOR Inserts; RFO pref misses from local IA",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO_PREF",
+        "PerPkg": "1",
+        "PublicDescription": "TOR Inserts; Read for ownership prefetch from local IA that misses in the snoop filter",
+        "UMask": "0xC887FE01",
+        "UMaskExt": "0xC887FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "TOR Inserts; WCiL misses from local IA",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_WCIL",
+        "PerPkg": "1",
+        "PublicDescription": "TOR Inserts; Data read from local IA that misses in the snoop filter",
+        "UMask": "0xC86FFE01",
+        "UMaskExt": "0xC86FFE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "TOR Inserts; WCiLF misses from local IA",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x35",
+        "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_WCILF",
+        "PerPkg": "1",
+        "PublicDescription": "TOR Inserts; Data read from local IA that misses in the snoop filter",
+        "UMask": "0xC867FE01",
+        "UMaskExt": "0xC867FE",
+        "Unit": "CHA"
+    },
+    {
+        "BriefDescription": "Clockticks of the integrated IO (IIO) traffic controller",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x01",
+        "EventName": "UNC_IIO_CLOCKTICKS",
+        "PerPkg": "1",
+        "PublicDescription": "Clockticks of the integrated IO (IIO) traffic controller",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "Data requested of the CPU : Card reading from DRAM",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART4",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x10",
+        "PublicDescription": "Data requested of the CPU : Card reading from DRAM : Number of DWs (4 bytes) the card requests of the main die.    Includes all requests initiated by the Card, including reads and writes. : x16 card plugged in to stack, Or x8 card plugged in to Lane 0/1, Or x4 card is plugged in to slot 0",
+        "UMask": "0x04",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "Data requested of the CPU : Card reading from DRAM",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART5",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x20",
+        "PublicDescription": "Data requested of the CPU : Card reading from DRAM : Number of DWs (4 bytes) the card requests of the main die.    Includes all requests initiated by the Card, including reads and writes. : x4 card is plugged in to slot 1",
+        "UMask": "0x04",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "Data requested of the CPU : Card reading from DRAM",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART6",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x40",
+        "PublicDescription": "Data requested of the CPU : Card reading from DRAM : Number of DWs (4 bytes) the card requests of the main die.    Includes all requests initiated by the Card, including reads and writes. : x8 card plugged in to Lane 2/3, Or x4 card is plugged in to slot 1",
+        "UMask": "0x04",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "Data requested of the CPU : Card reading from DRAM",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART7",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x80",
+        "PublicDescription": "Data requested of the CPU : Card reading from DRAM : Number of DWs (4 bytes) the card requests of the main die.    Includes all requests initiated by the Card, including reads and writes. : x4 card is plugged in to slot 3",
+        "UMask": "0x04",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "Data requested of the CPU : Card writing to DRAM",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART4",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x10",
+        "PublicDescription": "Data requested of the CPU : Card writing to DRAM : Number of DWs (4 bytes) the card requests of the main die.    Includes all requests initiated by the Card, including reads and writes. : x16 card plugged in to stack, Or x8 card plugged in to Lane 0/1, Or x4 card is plugged in to slot 0",
+        "UMask": "0x01",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "Data requested of the CPU : Card writing to DRAM",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART5",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x20",
+        "PublicDescription": "Data requested of the CPU : Card writing to DRAM : Number of DWs (4 bytes) the card requests of the main die.    Includes all requests initiated by the Card, including reads and writes. : x4 card is plugged in to slot 1",
+        "UMask": "0x01",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "Data requested of the CPU : Card writing to DRAM",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART6",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x40",
+        "PublicDescription": "Data requested of the CPU : Card writing to DRAM : Number of DWs (4 bytes) the card requests of the main die.    Includes all requests initiated by the Card, including reads and writes. : x8 card plugged in to Lane 2/3, Or x4 card is plugged in to slot 1",
+        "UMask": "0x01",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "Data requested of the CPU : Card writing to DRAM",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x83",
+        "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART7",
+        "FCMask": "0x07",
+        "PerPkg": "1",
+        "PortMask": "0x80",
+        "PublicDescription": "Data requested of the CPU : Card writing to DRAM : Number of DWs (4 bytes) the card requests of the main die.    Includes all requests initiated by the Card, including reads and writes. : x4 card is plugged in to slot 3",
+        "UMask": "0x01",
+        "Unit": "IIO"
+    },
+    {
+        "BriefDescription": "Clockticks of the IO coherency tracker (IRP)",
+        "Counter": "0,1",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x01",
+        "EventName": "UNC_I_CLOCKTICKS",
+        "PerPkg": "1",
+        "PublicDescription": "Clockticks of the IO coherency tracker (IRP)",
+        "Unit": "IRP"
+    },
+    {
+        "BriefDescription": "Clockticks of the mesh to memory (M2M)",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventName": "UNC_M2M_CLOCKTICKS",
+        "PerPkg": "1",
+        "PublicDescription": "Clockticks of the mesh to memory (M2M)",
+        "Unit": "M2M"
+    },
+    {
+        "BriefDescription": "Clockticks of the mesh to PCI (M2P)",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventCode": "0x01",
+        "EventName": "UNC_M2P_CLOCKTICKS",
+        "PerPkg": "1",
+        "PublicDescription": "Clockticks of the mesh to PCI (M2P)",
+        "Unit": "M2PCIe"
+    },
+    {
+        "BriefDescription": "Clockticks in the UBOX using a dedicated 48-bit Fixed Counter",
+        "Counter": "FIXED",
+        "CounterType": "PGMABLE",
+        "EventCode": "0xff",
+        "EventName": "UNC_U_CLOCKTICKS",
+        "PerPkg": "1",
+        "PublicDescription": "Clockticks in the UBOX using a dedicated 48-bit Fixed Counter",
+        "Unit": "UBOX"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/uncore-power.json b/tools/perf/pmu-events/arch/x86/tremontx/uncore-power.json
new file mode 100644
index 000000000000..ea62c092b43f
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/uncore-power.json
@@ -0,0 +1,11 @@
+[
+    {
+        "BriefDescription": "Clockticks of the power control unit (PCU)",
+        "Counter": "0,1,2,3",
+        "CounterType": "PGMABLE",
+        "EventName": "UNC_P_CLOCKTICKS",
+        "PerPkg": "1",
+        "PublicDescription": "Clockticks of the power control unit (PCU)",
+        "Unit": "PCU"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/virtual-memory.json b/tools/perf/pmu-events/arch/x86/tremontx/virtual-memory.json
new file mode 100644
index 000000000000..93e407a0f645
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/virtual-memory.json
@@ -0,0 +1,86 @@
+[
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts page walks completed due to demand data loads (including SW prefetches) whose address translations missed in all TLB levels and were mapped to 4K pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_4K",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Page walk completed due to a demand load to a 4K page."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts page walks completed due to demand data loads (including SW prefetches) whose address translations missed in all TLB levels and were mapped to 2M or 4M pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Page walk completed due to a demand load to a 2M or 4M page."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 4K pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_4K",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Page walk completed due to a demand data store to a 4K page."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 2M or 4M pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Page walk completed due to a demand data store to a 2M or 4M page."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of times the machine was unable to find a translation in the Instruction Translation Lookaside Buffer (ITLB) and new translation was filled into the ITLB.  The event is speculative in nature, but will not count translations (page walks) that are begun and not finished, or translations that are finished but not filled into the ITLB.",
+        "EventCode": "0x81",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "ITLB.FILLS",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts the number of times there was an ITLB miss and a new translation was filled into the ITLB."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts page walks completed due to instruction fetches whose address translations missed in the TLB and were mapped to 4K pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x85",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "ITLB_MISSES.WALK_COMPLETED_4K",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Page walk completed due to an instruction fetch in a 4K page."
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts page walks completed due to instruction fetches whose address translations missed in the TLB and were mapped to 2M or 4M pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x85",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "PEBScounters": "0,1,2,3",
+        "EventName": "ITLB_MISSES.WALK_COMPLETED_2M_4M",
+        "PDIR_COUNTER": "na",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Page walk completed due to an instruction fetch in a 2M or 4M page."
+    }
+]
\ No newline at end of file
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 02/17] perf script: Allow specifying event to switch on processing of other events
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 01/17] perf vendor events intel: Add Tremontx event file v1.02 Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 03/17] perf script: Allow showing the --switch-on event Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Sometime we want to only consider events after something happens, so
allow discarding events till such events is found, e.g.:

Record all scheduler tracepoints and the sys_enter_nanosleep syscall
event for the 'sleep 1' workload:

  # perf record -e sched:*,syscalls:sys_enter_nanosleep sleep 1
  [ perf record: Woken up 31 times to write data ]
  [ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ]
  #

So we have these events in the generated perf data file:

  # perf evlist
  sched:sched_kthread_stop
  sched:sched_kthread_stop_ret
  sched:sched_waking
  sched:sched_wakeup
  sched:sched_wakeup_new
  sched:sched_switch
  sched:sched_migrate_task
  sched:sched_process_free
  sched:sched_process_exit
  sched:sched_wait_task
  sched:sched_process_wait
  sched:sched_process_fork
  sched:sched_process_exec
  sched:sched_stat_wait
  sched:sched_stat_sleep
  sched:sched_stat_iowait
  sched:sched_stat_blocked
  sched:sched_stat_runtime
  sched:sched_pi_setprio
  sched:sched_move_numa
  sched:sched_stick_numa
  sched:sched_swap_numa
  sched:sched_wake_idle_without_ipi
  syscalls:sys_enter_nanosleep
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  #

Then show all of the events that actually took place in this 'perf record' session:

  # perf script
          :13637 13637 [002] 108237.581529:            sched:sched_waking: comm=perf pid=13638 prio=120 target_cpu=001
          :13637 13637 [002] 108237.581537:            sched:sched_wakeup: perf:13638 [120] success=1 CPU:001
           sleep 13638 [001] 108237.581992:      sched:sched_process_exec: filename=/usr/bin/sleep pid=13638 old_pid=13638
           sleep 13638 [001] 108237.582286:  syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000
           sleep 13638 [001] 108237.582289:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
           sleep 13638 [001] 108237.582291:            sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
         swapper     0 [001] 108238.582428:            sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
         swapper     0 [001] 108238.582458:            sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
           sleep 13638 [001] 108238.582698:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
           sleep 13638 [001] 108238.582782:      sched:sched_process_exit: comm=sleep pid=13638 prio=120
  #

Now lets see only the ones that took place after a certain "marker":

  # perf script --switch-on syscalls:sys_enter_nanosleep
           sleep 13638 [001] 108237.582289:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
           sleep 13638 [001] 108237.582291:            sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
         swapper     0 [001] 108238.582428:            sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
         swapper     0 [001] 108238.582458:            sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
           sleep 13638 [001] 108238.582698:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
           sleep 13638 [001] 108238.582782:      sched:sched_process_exit: comm=sleep pid=13638 prio=120
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-f1oo0ufdhrkx6nhy2lj1ierm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  3 +++
 tools/perf/builtin-script.c              | 26 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index caaab28f8400..9936819aae1c 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -417,6 +417,9 @@ include::itrace.txt[]
 	For itrace only show specified functions and their callees for
 	itrace. Multiple functions can be separated by comma.
 
+--switch-on EVENT_NAME::
+	Only consider events after this event is found.
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 31a529ec139f..d0bc7ccaf7bf 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1616,6 +1616,11 @@ static int perf_sample__fprintf_synth(struct perf_sample *sample,
 	return 0;
 }
 
+struct evswitch {
+	struct evsel *on;
+	bool	     discarding;
+};
+
 struct perf_script {
 	struct perf_tool	tool;
 	struct perf_session	*session;
@@ -1628,6 +1633,7 @@ struct perf_script {
 	bool			show_bpf_events;
 	bool			allocated;
 	bool			per_event_dump;
+	struct evswitch		evswitch;
 	struct perf_cpu_map	*cpus;
 	struct perf_thread_map *threads;
 	int			name_width;
@@ -1805,6 +1811,13 @@ static void process_event(struct perf_script *script,
 	if (!show_event(sample, evsel, thread, al))
 		return;
 
+	if (script->evswitch.on && script->evswitch.discarding) {
+		if (script->evswitch.on != evsel)
+			return;
+
+		script->evswitch.discarding = false;
+	}
+
 	++es->samples;
 
 	perf_sample__fprintf_start(sample, thread, evsel,
@@ -3395,6 +3408,7 @@ int cmd_script(int argc, const char **argv)
 	struct utsname uts;
 	char *script_path = NULL;
 	const char **__argv;
+	const char *event_switch_on = NULL;
 	int i, j, err = 0;
 	struct perf_script script = {
 		.tool = {
@@ -3538,6 +3552,8 @@ int cmd_script(int argc, const char **argv)
 		   "file", "file saving guest os /proc/kallsyms"),
 	OPT_STRING(0, "guestmodules", &symbol_conf.default_guest_modules,
 		   "file", "file saving guest os /proc/modules"),
+	OPT_STRING(0, "switch-on", &event_switch_on,
+		   "event", "Consider events from the first ocurrence of this event"),
 	OPT_END()
 	};
 	const char * const script_subcommands[] = { "record", "report", NULL };
@@ -3862,6 +3878,16 @@ int cmd_script(int argc, const char **argv)
 						  script.range_num);
 	}
 
+	if (event_switch_on) {
+		script.evswitch.on = perf_evlist__find_evsel_by_str(session->evlist, event_switch_on);
+		if (script.evswitch.on == NULL) {
+			fprintf(stderr, "switch-on event not found (%s)\n", event_switch_on);
+			err = -ENOENT;
+			goto out_delete;
+		}
+		script.evswitch.discarding = true;
+	}
+
 	err = __cmd_script(&script);
 
 	flush_scripting();
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 03/17] perf script: Allow showing the --switch-on event
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 01/17] perf vendor events intel: Add Tremontx event file v1.02 Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 02/17] perf script: Allow specifying event to switch on processing of other events Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 04/17] perf script: Allow specifying event to switch off processing of other events Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	William Cohen, Florian Weimer

From: Arnaldo Carvalho de Melo <acme@redhat.com>

One may want to see the --switch-on event as well, allow for that, using
the previous cset example:

  # perf script --switch-on syscalls:sys_enter_nanosleep --show-on-off
        sleep 13638 [001] 108237.582286: syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000
        sleep 13638 [001] 108237.582289:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
        sleep 13638 [001] 108237.582291:           sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
      swapper     0 [001] 108238.582428:           sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
      swapper     0 [001] 108238.582458:           sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
        sleep 13638 [001] 108238.582698:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
        sleep 13638 [001] 108238.582782:     sched:sched_process_exit: comm=sleep pid=13638 prio=120
  #
  # perf script --switch-on syscalls:sys_enter_nanosleep
        sleep 13638 [001] 108237.582289:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
        sleep 13638 [001] 108237.582291:           sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
      swapper     0 [001] 108238.582428:           sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
      swapper     0 [001] 108238.582458:           sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
        sleep 13638 [001] 108238.582698:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
        sleep 13638 [001] 108238.582782:     sched:sched_process_exit: comm=sleep pid=13638 prio=120
  #

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Cc: Florian Weimer <fweimer@redhat.com>
Link: https://lkml.kernel.org/n/tip-0omwwoywj1v63gu8cz0tr0cy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt | 3 +++
 tools/perf/builtin-script.c              | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 9936819aae1c..24eea68ee829 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -420,6 +420,9 @@ include::itrace.txt[]
 --switch-on EVENT_NAME::
 	Only consider events after this event is found.
 
+--show-on-off-events::
+	Show the --switch-on event too.
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index d0bc7ccaf7bf..fa0cc8b0eccc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1619,6 +1619,7 @@ static int perf_sample__fprintf_synth(struct perf_sample *sample,
 struct evswitch {
 	struct evsel *on;
 	bool	     discarding;
+	bool	     show_on_off_events;
 };
 
 struct perf_script {
@@ -1816,6 +1817,9 @@ static void process_event(struct perf_script *script,
 			return;
 
 		script->evswitch.discarding = false;
+
+		if (!script->evswitch.show_on_off_events)
+			return;
 	}
 
 	++es->samples;
@@ -3554,6 +3558,8 @@ int cmd_script(int argc, const char **argv)
 		   "file", "file saving guest os /proc/modules"),
 	OPT_STRING(0, "switch-on", &event_switch_on,
 		   "event", "Consider events from the first ocurrence of this event"),
+	OPT_BOOLEAN(0, "show-on-off-events", &script.evswitch.show_on_off_events,
+		    "Show the on/off switch events, used with --switch-on"),
 	OPT_END()
 	};
 	const char * const script_subcommands[] = { "record", "report", NULL };
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 04/17] perf script: Allow specifying event to switch off processing of other events
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 03/17] perf script: Allow showing the --switch-on event Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 05/17] perf evswitch: Move struct to a separate header to use in other tools Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Counterpart of --switch-on:

  # perf record -e sched:*,syscalls:sys_*_nanosleep sleep 1
  [ perf record: Woken up 36 times to write data ]
  [ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ]
  #
  # perf script
      :20918 20918 [002] 109866.143696:            sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001
      :20918 20918 [002] 109866.143702:            sched:sched_wakeup: perf:20919 [120] success=1 CPU:001
       sleep 20919 [001] 109866.144081:      sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919
       sleep 20919 [001] 109866.144408:  syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
       sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
       sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
     swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
     swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
       sleep 20919 [001] 109867.144614:   syscalls:sys_exit_nanosleep: 0x0
       sleep 20919 [001] 109867.144753:      sched:sched_process_exit: comm=sleep pid=20919 prio=120
  #
  # perf script --switch-off syscalls:sys_exit_nanosleep
      :20918 20918 [002] 109866.143696:            sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001
      :20918 20918 [002] 109866.143702:            sched:sched_wakeup: perf:20919 [120] success=1 CPU:001
       sleep 20919 [001] 109866.144081:      sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919
       sleep 20919 [001] 109866.144408:  syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
       sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
       sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
     swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
     swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
       sleep 20919 [001] 109867.144753:      sched:sched_process_exit: comm=sleep pid=20919 prio=120
  #
  # perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep
       sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
       sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
     swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
     swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
  #
  # perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep --show-on-off
       sleep 20919 [001] 109866.144408:  syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
       sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
       sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
     swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
     swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
       sleep 20919 [001] 109867.144614:   syscalls:sys_exit_nanosleep: 0x0
  #

Now think about using this together with 'perf probe' to create custom on/off
events in your app :-)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-li3j01c4tmj9kw6ydsl8swej@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  5 +++-
 tools/perf/builtin-script.c              | 31 +++++++++++++++++++++---
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 24eea68ee829..2599b057e47b 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -420,8 +420,11 @@ include::itrace.txt[]
 --switch-on EVENT_NAME::
 	Only consider events after this event is found.
 
+--switch-off EVENT_NAME::
+	Stop considering events after this event is found.
+
 --show-on-off-events::
-	Show the --switch-on event too.
+	Show the --switch-on/off events too.
 
 SEE ALSO
 --------
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index fa0cc8b0eccc..b6eed0f7e835 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1617,7 +1617,8 @@ static int perf_sample__fprintf_synth(struct perf_sample *sample,
 }
 
 struct evswitch {
-	struct evsel *on;
+	struct evsel *on,
+		     *off;
 	bool	     discarding;
 	bool	     show_on_off_events;
 };
@@ -1820,8 +1821,20 @@ static void process_event(struct perf_script *script,
 
 		if (!script->evswitch.show_on_off_events)
 			return;
+
+		goto print_it;
 	}
 
+	if (script->evswitch.off && !script->evswitch.discarding) {
+		if (script->evswitch.off != evsel)
+			goto print_it;
+
+		script->evswitch.discarding = true;
+
+		if (!script->evswitch.show_on_off_events)
+			return;
+	}
+print_it:
 	++es->samples;
 
 	perf_sample__fprintf_start(sample, thread, evsel,
@@ -3412,7 +3425,8 @@ int cmd_script(int argc, const char **argv)
 	struct utsname uts;
 	char *script_path = NULL;
 	const char **__argv;
-	const char *event_switch_on = NULL;
+	const char *event_switch_on  = NULL,
+		   *event_switch_off = NULL;
 	int i, j, err = 0;
 	struct perf_script script = {
 		.tool = {
@@ -3557,7 +3571,9 @@ int cmd_script(int argc, const char **argv)
 	OPT_STRING(0, "guestmodules", &symbol_conf.default_guest_modules,
 		   "file", "file saving guest os /proc/modules"),
 	OPT_STRING(0, "switch-on", &event_switch_on,
-		   "event", "Consider events from the first ocurrence of this event"),
+		   "event", "Consider events after the ocurrence of this event"),
+	OPT_STRING(0, "switch-off", &event_switch_off,
+		   "event", "Stop considering events after the ocurrence of this event"),
 	OPT_BOOLEAN(0, "show-on-off-events", &script.evswitch.show_on_off_events,
 		    "Show the on/off switch events, used with --switch-on"),
 	OPT_END()
@@ -3894,6 +3910,15 @@ int cmd_script(int argc, const char **argv)
 		script.evswitch.discarding = true;
 	}
 
+	if (event_switch_off) {
+		script.evswitch.off = perf_evlist__find_evsel_by_str(session->evlist, event_switch_off);
+		if (script.evswitch.off == NULL) {
+			fprintf(stderr, "switch-off event not found (%s)\n", event_switch_off);
+			err = -ENOENT;
+			goto out_delete;
+		}
+	}
+
 	err = __cmd_script(&script);
 
 	flush_scripting();
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 05/17] perf evswitch: Move struct to a separate header to use in other tools
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 04/17] perf script: Allow specifying event to switch off processing of other events Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 06/17] perf evswitch: Move switch logic " Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Now that we see that the simple userspace-based "slicing" of events
using delimiter events ("markers") works, lets move it to a separate
header to make it available to other tools, next step will be having
the switch on/off check done at the PERF_RECORD_SAMPLE processing
function moved too.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-z0cyi9ifzlr37cedr9xztc1k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c |  8 +-------
 tools/perf/util/evswitch.h  | 16 ++++++++++++++++
 2 files changed, 17 insertions(+), 7 deletions(-)
 create mode 100644 tools/perf/util/evswitch.h

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index b6eed0f7e835..fff02e0d70c4 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -16,6 +16,7 @@
 #include "util/trace-event.h"
 #include "util/evlist.h"
 #include "util/evsel.h"
+#include "util/evswitch.h"
 #include "util/sort.h"
 #include "util/data.h"
 #include "util/auxtrace.h"
@@ -1616,13 +1617,6 @@ static int perf_sample__fprintf_synth(struct perf_sample *sample,
 	return 0;
 }
 
-struct evswitch {
-	struct evsel *on,
-		     *off;
-	bool	     discarding;
-	bool	     show_on_off_events;
-};
-
 struct perf_script {
 	struct perf_tool	tool;
 	struct perf_session	*session;
diff --git a/tools/perf/util/evswitch.h b/tools/perf/util/evswitch.h
new file mode 100644
index 000000000000..bb88e8002f39
--- /dev/null
+++ b/tools/perf/util/evswitch.h
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (C) 2019, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
+#ifndef __PERF_EVSWITCH_H
+#define __PERF_EVSWITCH_H 1
+
+#include <stdbool.h>
+
+struct evsel;
+
+struct evswitch {
+	struct evsel *on, *off;
+	bool	     discarding;
+	bool	     show_on_off_events;
+};
+
+#endif /* __PERF_EVSWITCH_H */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 06/17] perf evswitch: Move switch logic to use in other tools
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 05/17] perf evswitch: Move struct to a separate header to use in other tools Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 07/17] perf evswitch: Add the names of on/off events Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Now other tools that want switching can use an evswitch for that, just
set it up and add it to the PERF_RECORD_SAMPLE processing function.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-b1trj1q97qwfv251l66q3noj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 23 ++---------------------
 tools/perf/util/Build       |  1 +
 tools/perf/util/evswitch.c  | 31 +++++++++++++++++++++++++++++++
 tools/perf/util/evswitch.h  |  2 ++
 4 files changed, 36 insertions(+), 21 deletions(-)
 create mode 100644 tools/perf/util/evswitch.c

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index fff02e0d70c4..e7b950e977a9 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1807,28 +1807,9 @@ static void process_event(struct perf_script *script,
 	if (!show_event(sample, evsel, thread, al))
 		return;
 
-	if (script->evswitch.on && script->evswitch.discarding) {
-		if (script->evswitch.on != evsel)
-			return;
-
-		script->evswitch.discarding = false;
-
-		if (!script->evswitch.show_on_off_events)
-			return;
-
-		goto print_it;
-	}
-
-	if (script->evswitch.off && !script->evswitch.discarding) {
-		if (script->evswitch.off != evsel)
-			goto print_it;
-
-		script->evswitch.discarding = true;
+	if (evswitch__discard(&script->evswitch, evsel))
+		return;
 
-		if (!script->evswitch.show_on_off_events)
-			return;
-	}
-print_it:
 	++es->samples;
 
 	perf_sample__fprintf_start(sample, thread, evsel,
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 7cda749059a9..b922c8c297ca 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -9,6 +9,7 @@ perf-y += event.o
 perf-y += evlist.o
 perf-y += evsel.o
 perf-y += evsel_fprintf.o
+perf-y += evswitch.o
 perf-y += find_bit.o
 perf-y += get_current_dir_name.o
 perf-y += kallsyms.o
diff --git a/tools/perf/util/evswitch.c b/tools/perf/util/evswitch.c
new file mode 100644
index 000000000000..c87f988d81c8
--- /dev/null
+++ b/tools/perf/util/evswitch.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (C) 2019, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
+
+#include "evswitch.h"
+
+bool evswitch__discard(struct evswitch *evswitch, struct evsel *evsel)
+{
+	if (evswitch->on && evswitch->discarding) {
+		if (evswitch->on != evsel)
+			return true;
+
+		evswitch->discarding = false;
+
+		if (!evswitch->show_on_off_events)
+			return true;
+
+		return false;
+	}
+
+	if (evswitch->off && !evswitch->discarding) {
+		if (evswitch->off != evsel)
+			return false;
+
+		evswitch->discarding = true;
+
+		if (!evswitch->show_on_off_events)
+			return true;
+	}
+
+	return false;
+}
diff --git a/tools/perf/util/evswitch.h b/tools/perf/util/evswitch.h
index bb88e8002f39..bae3a22ad719 100644
--- a/tools/perf/util/evswitch.h
+++ b/tools/perf/util/evswitch.h
@@ -13,4 +13,6 @@ struct evswitch {
 	bool	     show_on_off_events;
 };
 
+bool evswitch__discard(struct evswitch *evswitch, struct evsel *evsel);
+
 #endif /* __PERF_EVSWITCH_H */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 07/17] perf evswitch: Add the names of on/off events
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 06/17] perf evswitch: Move switch logic " Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 08/17] perf evswitch: Introduce OPTS_EVSWITCH() for cmd line processing Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

So that we can have macros for the OPT_ entries and also for finding
those in an evlist, this way other tools will use this very easily.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-q0og1xoqqi0w38ve5u0a43k2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 18 ++++++++----------
 tools/perf/util/evswitch.h  |  1 +
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index e7b950e977a9..177e4e91b199 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3400,8 +3400,6 @@ int cmd_script(int argc, const char **argv)
 	struct utsname uts;
 	char *script_path = NULL;
 	const char **__argv;
-	const char *event_switch_on  = NULL,
-		   *event_switch_off = NULL;
 	int i, j, err = 0;
 	struct perf_script script = {
 		.tool = {
@@ -3545,9 +3543,9 @@ int cmd_script(int argc, const char **argv)
 		   "file", "file saving guest os /proc/kallsyms"),
 	OPT_STRING(0, "guestmodules", &symbol_conf.default_guest_modules,
 		   "file", "file saving guest os /proc/modules"),
-	OPT_STRING(0, "switch-on", &event_switch_on,
+	OPT_STRING(0, "switch-on", &script.evswitch.on_name,
 		   "event", "Consider events after the ocurrence of this event"),
-	OPT_STRING(0, "switch-off", &event_switch_off,
+	OPT_STRING(0, "switch-off", &script.evswitch.off_name,
 		   "event", "Stop considering events after the ocurrence of this event"),
 	OPT_BOOLEAN(0, "show-on-off-events", &script.evswitch.show_on_off_events,
 		    "Show the on/off switch events, used with --switch-on"),
@@ -3875,20 +3873,20 @@ int cmd_script(int argc, const char **argv)
 						  script.range_num);
 	}
 
-	if (event_switch_on) {
-		script.evswitch.on = perf_evlist__find_evsel_by_str(session->evlist, event_switch_on);
+	if (script.evswitch.on_name) {
+		script.evswitch.on = perf_evlist__find_evsel_by_str(session->evlist, script.evswitch.on_name);
 		if (script.evswitch.on == NULL) {
-			fprintf(stderr, "switch-on event not found (%s)\n", event_switch_on);
+			fprintf(stderr, "switch-on event not found (%s)\n", script.evswitch.on_name);
 			err = -ENOENT;
 			goto out_delete;
 		}
 		script.evswitch.discarding = true;
 	}
 
-	if (event_switch_off) {
-		script.evswitch.off = perf_evlist__find_evsel_by_str(session->evlist, event_switch_off);
+	if (script.evswitch.off_name) {
+		script.evswitch.off = perf_evlist__find_evsel_by_str(session->evlist, script.evswitch.off_name);
 		if (script.evswitch.off == NULL) {
-			fprintf(stderr, "switch-off event not found (%s)\n", event_switch_off);
+			fprintf(stderr, "switch-off event not found (%s)\n", script.evswitch.off_name);
 			err = -ENOENT;
 			goto out_delete;
 		}
diff --git a/tools/perf/util/evswitch.h b/tools/perf/util/evswitch.h
index bae3a22ad719..891164504080 100644
--- a/tools/perf/util/evswitch.h
+++ b/tools/perf/util/evswitch.h
@@ -9,6 +9,7 @@ struct evsel;
 
 struct evswitch {
 	struct evsel *on, *off;
+	const char   *on_name, *off_name;
 	bool	     discarding;
 	bool	     show_on_off_events;
 };
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 08/17] perf evswitch: Introduce OPTS_EVSWITCH() for cmd line processing
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 07/17] perf evswitch: Add the names of on/off events Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 09/17] perf evswitch: Introduce init() method to set the on/off evsels from the command line Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

All tools will want those, so provide a convenient way to get them.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-v16pe3sbf3wjmn152u18f649@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 7 +------
 tools/perf/util/evswitch.h  | 8 ++++++++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 177e4e91b199..2a5b8af80e6b 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3543,12 +3543,7 @@ int cmd_script(int argc, const char **argv)
 		   "file", "file saving guest os /proc/kallsyms"),
 	OPT_STRING(0, "guestmodules", &symbol_conf.default_guest_modules,
 		   "file", "file saving guest os /proc/modules"),
-	OPT_STRING(0, "switch-on", &script.evswitch.on_name,
-		   "event", "Consider events after the ocurrence of this event"),
-	OPT_STRING(0, "switch-off", &script.evswitch.off_name,
-		   "event", "Stop considering events after the ocurrence of this event"),
-	OPT_BOOLEAN(0, "show-on-off-events", &script.evswitch.show_on_off_events,
-		    "Show the on/off switch events, used with --switch-on"),
+	OPTS_EVSWITCH(&script.evswitch),
 	OPT_END()
 	};
 	const char * const script_subcommands[] = { "record", "report", NULL };
diff --git a/tools/perf/util/evswitch.h b/tools/perf/util/evswitch.h
index 891164504080..94220d1bb479 100644
--- a/tools/perf/util/evswitch.h
+++ b/tools/perf/util/evswitch.h
@@ -16,4 +16,12 @@ struct evswitch {
 
 bool evswitch__discard(struct evswitch *evswitch, struct evsel *evsel);
 
+#define OPTS_EVSWITCH(evswitch)								  \
+	OPT_STRING(0, "switch-on", &(evswitch)->on_name,				  \
+		   "event", "Consider events after the ocurrence of this event"),	  \
+	OPT_STRING(0, "switch-off", &(evswitch)->off_name,				  \
+		   "event", "Stop considering events after the ocurrence of this event"), \
+	OPT_BOOLEAN(0, "show-on-off-events", &(evswitch)->show_on_off_events,		  \
+		    "Show the on/off switch events, used with --switch-on and --switch-off")
+
 #endif /* __PERF_EVSWITCH_H */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 09/17] perf evswitch: Introduce init() method to set the on/off evsels from the command line
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 08/17] perf evswitch: Introduce OPTS_EVSWITCH() for cmd line processing Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 10/17] perf evswitch: Move enoent error message printing to separate function Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Another step in having all the boilerplate in just one place to then use
in the other tools.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-snreb1wmwyjei3eefwotxp1l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 21 +++------------------
 tools/perf/util/evswitch.c  | 23 +++++++++++++++++++++++
 tools/perf/util/evswitch.h  |  4 ++++
 3 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 2a5b8af80e6b..1764efd16cd4 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3868,24 +3868,9 @@ int cmd_script(int argc, const char **argv)
 						  script.range_num);
 	}
 
-	if (script.evswitch.on_name) {
-		script.evswitch.on = perf_evlist__find_evsel_by_str(session->evlist, script.evswitch.on_name);
-		if (script.evswitch.on == NULL) {
-			fprintf(stderr, "switch-on event not found (%s)\n", script.evswitch.on_name);
-			err = -ENOENT;
-			goto out_delete;
-		}
-		script.evswitch.discarding = true;
-	}
-
-	if (script.evswitch.off_name) {
-		script.evswitch.off = perf_evlist__find_evsel_by_str(session->evlist, script.evswitch.off_name);
-		if (script.evswitch.off == NULL) {
-			fprintf(stderr, "switch-off event not found (%s)\n", script.evswitch.off_name);
-			err = -ENOENT;
-			goto out_delete;
-		}
-	}
+	err = evswitch__init(&script.evswitch, session->evlist, stderr);
+	if (err)
+		goto out_delete;
 
 	err = __cmd_script(&script);
 
diff --git a/tools/perf/util/evswitch.c b/tools/perf/util/evswitch.c
index c87f988d81c8..b57b5f0816f5 100644
--- a/tools/perf/util/evswitch.c
+++ b/tools/perf/util/evswitch.c
@@ -2,6 +2,7 @@
 // Copyright (C) 2019, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
 
 #include "evswitch.h"
+#include "evlist.h"
 
 bool evswitch__discard(struct evswitch *evswitch, struct evsel *evsel)
 {
@@ -29,3 +30,25 @@ bool evswitch__discard(struct evswitch *evswitch, struct evsel *evsel)
 
 	return false;
 }
+
+int evswitch__init(struct evswitch *evswitch, struct evlist *evlist, FILE *fp)
+{
+	if (evswitch->on_name) {
+		evswitch->on = perf_evlist__find_evsel_by_str(evlist, evswitch->on_name);
+		if (evswitch->on == NULL) {
+			fprintf(fp, "switch-on event not found (%s)\n", evswitch->on_name);
+			return -ENOENT;
+		}
+		evswitch->discarding = true;
+	}
+
+	if (evswitch->off_name) {
+		evswitch->off = perf_evlist__find_evsel_by_str(evlist, evswitch->off_name);
+		if (evswitch->off == NULL) {
+			fprintf(fp, "switch-off event not found (%s)\n", evswitch->off_name);
+			return -ENOENT;
+		}
+	}
+
+	return 0;
+}
diff --git a/tools/perf/util/evswitch.h b/tools/perf/util/evswitch.h
index 94220d1bb479..fd30460b6218 100644
--- a/tools/perf/util/evswitch.h
+++ b/tools/perf/util/evswitch.h
@@ -4,8 +4,10 @@
 #define __PERF_EVSWITCH_H 1
 
 #include <stdbool.h>
+#include <stdio.h>
 
 struct evsel;
+struct evlist;
 
 struct evswitch {
 	struct evsel *on, *off;
@@ -14,6 +16,8 @@ struct evswitch {
 	bool	     show_on_off_events;
 };
 
+int evswitch__init(struct evswitch *evswitch, struct evlist *evlist, FILE *fp);
+
 bool evswitch__discard(struct evswitch *evswitch, struct evsel *evsel);
 
 #define OPTS_EVSWITCH(evswitch)								  \
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 10/17] perf evswitch: Move enoent error message printing to separate function
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 09/17] perf evswitch: Introduce init() method to set the on/off evsels from the command line Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 11/17] perf evswitch: Add hint when not finding specified on/off events Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Allows adding hints there, will be done in followup patch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-1kvrdi7weuz3hxycwvarcu6v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evswitch.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evswitch.c b/tools/perf/util/evswitch.c
index b57b5f0816f5..71daed615a2c 100644
--- a/tools/perf/util/evswitch.c
+++ b/tools/perf/util/evswitch.c
@@ -31,12 +31,17 @@ bool evswitch__discard(struct evswitch *evswitch, struct evsel *evsel)
 	return false;
 }
 
+static int evswitch__fprintf_enoent(FILE *fp, const char *evtype, const char *evname)
+{
+	return fprintf(fp, "ERROR: switch-%s event not found (%s)\n", evtype, evname);
+}
+
 int evswitch__init(struct evswitch *evswitch, struct evlist *evlist, FILE *fp)
 {
 	if (evswitch->on_name) {
 		evswitch->on = perf_evlist__find_evsel_by_str(evlist, evswitch->on_name);
 		if (evswitch->on == NULL) {
-			fprintf(fp, "switch-on event not found (%s)\n", evswitch->on_name);
+			evswitch__fprintf_enoent(fp, "on", evswitch->on_name);
 			return -ENOENT;
 		}
 		evswitch->discarding = true;
@@ -45,7 +50,7 @@ int evswitch__init(struct evswitch *evswitch, struct evlist *evlist, FILE *fp)
 	if (evswitch->off_name) {
 		evswitch->off = perf_evlist__find_evsel_by_str(evlist, evswitch->off_name);
 		if (evswitch->off == NULL) {
-			fprintf(fp, "switch-off event not found (%s)\n", evswitch->off_name);
+			evswitch__fprintf_enoent(fp, "off", evswitch->off_name);
 			return -ENOENT;
 		}
 	}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 11/17] perf evswitch: Add hint when not finding specified on/off events
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 10/17] perf evswitch: Move enoent error message printing to separate function Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 12/17] perf trace: Add --switch-on/--switch-off events Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

If the user specifies a on or off switch event and it isn't in the
perf.data file, provide a hint about how to see the events in the
perf.data evlist:

  # perf script --switch-on=syscall:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep
  ERROR: event_on event not found (syscall:sys_enter_nanosleep)
  HINT:  use 'perf evlist' to see the available event names
  #
  # perf evlist
  sched:sched_kthread_stop
  sched:sched_kthread_stop_ret
  sched:sched_waking
  sched:sched_wakeup
  sched:sched_wakeup_new
  sched:sched_switch
  sched:sched_migrate_task
  sched:sched_process_free
  sched:sched_process_exit
  sched:sched_wait_task
  sched:sched_process_wait
  sched:sched_process_fork
  sched:sched_process_exec
  sched:sched_stat_wait
  sched:sched_stat_sleep
  sched:sched_stat_iowait
  sched:sched_stat_blocked
  sched:sched_stat_runtime
  sched:sched_pi_setprio
  sched:sched_move_numa
  sched:sched_stick_numa
  sched:sched_swap_numa
  sched:sched_wake_idle_without_ipi
  syscalls:sys_enter_clock_nanosleep
  syscalls:sys_exit_clock_nanosleep
  syscalls:sys_enter_nanosleep
  syscalls:sys_exit_nanosleep
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  #
  # perf script --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep
       sleep 20919 [001] 109866.144411:  sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [ns]
       sleep 20919 [001] 109866.144412:        sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
     swapper     0 [001] 109867.144568:        sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
     swapper     0 [001] 109867.144586:        sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-iijjvdlyad973oskdq8gmi5w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evswitch.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evswitch.c b/tools/perf/util/evswitch.c
index 71daed615a2c..3ba72f743d3c 100644
--- a/tools/perf/util/evswitch.c
+++ b/tools/perf/util/evswitch.c
@@ -33,7 +33,9 @@ bool evswitch__discard(struct evswitch *evswitch, struct evsel *evsel)
 
 static int evswitch__fprintf_enoent(FILE *fp, const char *evtype, const char *evname)
 {
-	return fprintf(fp, "ERROR: switch-%s event not found (%s)\n", evtype, evname);
+	int printed = fprintf(fp, "ERROR: switch-%s event not found (%s)\n", evtype, evname);
+
+	return printed += fprintf(fp, "HINT:  use 'perf evlist' to see the available event names\n");
 }
 
 int evswitch__init(struct evswitch *evswitch, struct evlist *evlist, FILE *fp)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 12/17] perf trace: Add --switch-on/--switch-off events
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 11/17] perf evswitch: Add hint when not finding specified on/off events Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 13/17] perf top: " Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Just like with 'perf script':

  # perf trace -e sched:*,syscalls:*sleep* sleep 1
       0.000 :28345/28345 sched:sched_waking:comm=perf pid=28346 prio=120 target_cpu=005
       0.005 :28345/28345 sched:sched_wakeup:perf:28346 [120] success=1 CPU:005
       0.383 sleep/28346 sched:sched_process_exec:filename=/usr/bin/sleep pid=28346 old_pid=28346
       0.613 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=607375 [ns] vruntime=23289041218 [ns]
       0.689 sleep/28346 syscalls:sys_enter_nanosleep:rqtp: 0x7ffc491789b0
       0.693 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=72021 [ns] vruntime=23289113239 [ns]
       0.694 sleep/28346 sched:sched_switch:sleep:28346 [120] S ==> swapper/5:0 [120]
    1000.787 :0/0 sched:sched_waking:comm=sleep pid=28346 prio=120 target_cpu=005
    1000.824 :0/0 sched:sched_wakeup:sleep:28346 [120] success=1 CPU:005
    1000.908 sleep/28346 syscalls:sys_exit_nanosleep:0x0
    1001.218 sleep/28346 sched:sched_process_exit:comm=sleep pid=28346 prio=120
  # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep sleep 1
       0.000 sleep/28349 sched:sched_stat_runtime:comm=sleep pid=28349 runtime=603036 [ns] vruntime=23873537697 [ns]
       0.001 sleep/28349 sched:sched_switch:sleep:28349 [120] S ==> swapper/4:0 [120]
    1000.392 :0/0 sched:sched_waking:comm=sleep pid=28349 prio=120 target_cpu=004
    1000.443 :0/0 sched:sched_wakeup:sleep:28349 [120] success=1 CPU:004
    1000.540 sleep/28349 syscalls:sys_exit_nanosleep:0x0
    1000.852 sleep/28349 sched:sched_process_exit:comm=sleep pid=28349 prio=120
  # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep sleep 1
       0.000 sleep/28352 sched:sched_stat_runtime:comm=sleep pid=28352 runtime=610543 [ns] vruntime=24811686681 [ns]
       0.001 sleep/28352 sched:sched_switch:sleep:28352 [120] S ==> swapper/0:0 [120]
    1000.397 :0/0 sched:sched_waking:comm=sleep pid=28352 prio=120 target_cpu=000
    1000.440 :0/0 sched:sched_wakeup:sleep:28352 [120] success=1 CPU:000
  #
  # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep --show-on-off sleep 1
       0.000 sleep/28367 syscalls:sys_enter_nanosleep:rqtp: 0x7fffd1a25fc0
       0.004 sleep/28367 sched:sched_stat_runtime:comm=sleep pid=28367 runtime=628760 [ns] vruntime=22170052672 [ns]
       0.005 sleep/28367 sched:sched_switch:sleep:28367 [120] S ==> swapper/2:0 [120]
    1000.367 :0/0 sched:sched_waking:comm=sleep pid=28367 prio=120 target_cpu=002
    1000.412 :0/0 sched:sched_wakeup:sleep:28367 [120] success=1 CPU:002
    1000.512 sleep/28367 syscalls:sys_exit_nanosleep:0x0
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-t3ngpt1brcc1fm9gep9gxm4q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-trace.txt |  9 +++++++++
 tools/perf/builtin-trace.c              | 10 ++++++++++
 2 files changed, 19 insertions(+)

diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
index fc6e43262c41..25b74fdb36fa 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -176,6 +176,15 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
 	only at exit time or when a syscall is interrupted, i.e. in those cases this
 	option is equivalent to the number of lines printed.
 
+--switch-on EVENT_NAME::
+	Only consider events after this event is found.
+
+--switch-off EVENT_NAME::
+	Stop considering events after this event is found.
+
+--show-on-off-events::
+	Show the --switch-on/off events too.
+
 --max-stack::
         Set the stack depth limit when parsing the callchain, anything
         beyond the specified depth will be ignored. Note that at this point
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index d553d06a9aeb..bc44ed29e05a 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -27,6 +27,7 @@
 #include "util/env.h"
 #include "util/event.h"
 #include "util/evlist.h"
+#include "util/evswitch.h"
 #include <subcmd/exec-cmd.h>
 #include "util/machine.h"
 #include "util/map.h"
@@ -106,6 +107,7 @@ struct trace {
 	unsigned long		nr_events;
 	unsigned long		nr_events_printed;
 	unsigned long		max_events;
+	struct evswitch		evswitch;
 	struct strlist		*ev_qualifier;
 	struct {
 		size_t		nr;
@@ -2680,6 +2682,9 @@ static void trace__handle_event(struct trace *trace, union perf_event *event, st
 		return;
 	}
 
+	if (evswitch__discard(&trace->evswitch, evsel))
+		return;
+
 	trace__set_base_time(trace, evsel, sample);
 
 	if (evsel->core.attr.type == PERF_TYPE_TRACEPOINT &&
@@ -4157,6 +4162,7 @@ int cmd_trace(int argc, const char **argv)
 	OPT_UINTEGER('D', "delay", &trace.opts.initial_delay,
 		     "ms to wait before starting measurement after program "
 		     "start"),
+	OPTS_EVSWITCH(&trace.evswitch),
 	OPT_END()
 	};
 	bool __maybe_unused max_stack_user_set = true;
@@ -4380,6 +4386,10 @@ int cmd_trace(int argc, const char **argv)
 		}
 	}
 
+	err = evswitch__init(&trace.evswitch, trace.evlist, stderr);
+	if (err)
+		goto out_close;
+
 	err = target__validate(&trace.opts.target);
 	if (err) {
 		target__strerror(&trace.opts.target, err, bf, sizeof(bf));
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 13/17] perf top: Add --switch-on/--switch-off events
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 12/17] perf trace: Add --switch-on/--switch-off events Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 14/17] perf report: " Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Just like 'perf trace' and 'perf script', should be useful for instance
to only consider samples after the initialization phase of some
workload.

The man page has some examples and considerations about its current
interface, that still doesn't handle the on/off events in a special way,
behaving just like when multiple events are specified, i.e.:

- In non-group mode (when the event list is not enclosed in {}) show a
  a menu to allow choosing which event the user wants to see in the
  histograms browser

- In group mode, be it using {} or asking for --group, show one column
  per event.

Try for instance:

  # perf top -e '{cycles,instructions,probe:icmp_rcv}' --switch-on=probe:icmp_rcv

Replace probe:icmp_rcv, that I put in place using:

  # perf probe icmp_rcv:59

To hit when broadcast packets arrive, with a probe installed after an
initialization phase is over or after some other point of interest, some
garbage collection, etc, and also use --switch-off, for instance, on a
probe installed after said garbage collection is over.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-c7q7qjeqtyvc9mkeipxza6ne@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-top.txt | 38 +++++++++++++++++++++++++++
 tools/perf/builtin-top.c              | 10 ++++++-
 tools/perf/util/top.h                 |  2 ++
 3 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index cfea87c6f38e..5596129a71cf 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -266,6 +266,44 @@ Default is to monitor all CPUS.
 	Record events of type PERF_RECORD_NAMESPACES and display it with the
 	'cgroup_id' sort key.
 
+--switch-on EVENT_NAME::
+	Only consider events after this event is found.
+
+	E.g.:
+
+           Find out where broadcast packets are handled
+
+		perf probe -L icmp_rcv
+
+	   Insert a probe there:
+
+		perf probe icmp_rcv:59
+
+	   Start perf top and ask it to only consider the cycles events when a
+           broadcast packet arrives This will show a menu with two entries and
+           will start counting when a broadcast packet arrives:
+
+		perf top -e cycles,probe:icmp_rcv --switch-on=probe:icmp_rcv
+
+	   Alternatively one can ask for --group and then two overhead columns
+           will appear, the first for cycles and the second for the switch-on event.
+
+		perf top --group -e cycles,probe:icmp_rcv --switch-on=probe:icmp_rcv
+
+	This may be interesting to measure a workload only after some initialization
+	phase is over, i.e. insert a perf probe at that point and use the above
+	examples replacing probe:icmp_rcv with the just-after-init probe.
+
+--switch-off EVENT_NAME::
+	Stop considering events after this event is found.
+
+--show-on-off-events::
+	Show the --switch-on/off events too. This has no effect in 'perf top' now
+	but probably we'll make the default not to show the switch-on/off events
+        on the --group mode and if there is only one event besides the off/on ones,
+	go straight to the histogram browser, just like 'perf top' with no events
+	explicitely specified does.
+
 
 INTERACTIVE PROMPTING KEYS
 --------------------------
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 78e7efc597a6..5970723cd55a 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1148,8 +1148,11 @@ static int deliver_event(struct ordered_events *qe,
 	evsel = perf_evlist__id2evsel(session->evlist, sample.id);
 	assert(evsel != NULL);
 
-	if (event->header.type == PERF_RECORD_SAMPLE)
+	if (event->header.type == PERF_RECORD_SAMPLE) {
+		if (evswitch__discard(&top->evswitch, evsel))
+			return 0;
 		++top->samples;
+	}
 
 	switch (sample.cpumode) {
 	case PERF_RECORD_MISC_USER:
@@ -1534,6 +1537,7 @@ int cmd_top(int argc, const char **argv)
 			"number of thread to run event synthesize"),
 	OPT_BOOLEAN(0, "namespaces", &opts->record_namespaces,
 		    "Record namespaces events"),
+	OPTS_EVSWITCH(&top.evswitch),
 	OPT_END()
 	};
 	struct evlist *sb_evlist = NULL;
@@ -1567,6 +1571,10 @@ int cmd_top(int argc, const char **argv)
 		goto out_delete_evlist;
 	}
 
+	status = evswitch__init(&top.evswitch, top.evlist, stderr);
+	if (status)
+		goto out_delete_evlist;
+
 	if (symbol_conf.report_hierarchy) {
 		/* disable incompatible options */
 		symbol_conf.event_group = false;
diff --git a/tools/perf/util/top.h b/tools/perf/util/top.h
index 2023e0bf6165..dc4bb6e52a83 100644
--- a/tools/perf/util/top.h
+++ b/tools/perf/util/top.h
@@ -3,6 +3,7 @@
 #define __PERF_TOP_H 1
 
 #include "tool.h"
+#include "evswitch.h"
 #include "annotate.h"
 #include <linux/types.h>
 #include <stddef.h>
@@ -18,6 +19,7 @@ struct perf_top {
 	struct evlist *evlist;
 	struct record_opts record_opts;
 	struct annotation_options annotation_opts;
+	struct evswitch	   evswitch;
 	/*
 	 * Symbols will be added here in perf_event__process_sample and will
 	 * get out after decayed.
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 14/17] perf report: Add --switch-on/--switch-off events
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 13/17] perf top: " Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 15/17] perf map: Use zalloc for map_groups Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Florian Weimer, William Cohen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Since 'perf top' shares the histogram browser with 'perf report', then
the same explanation in the previous cset applies.

An additional example uses a pair of SDT events available for systemtap:

  # perf probe --exec=/usr/bin/stap '%*:*'
  Added new events:
    sdt_stap:benchmark__thread__start (on %* in /usr/bin/stap)
    sdt_stap:benchmark   (on %* in /usr/bin/stap)
    sdt_stap:benchmark__thread__end (on %* in /usr/bin/stap)
    sdt_stap:pass6__start (on %* in /usr/bin/stap)
    sdt_stap:pass6__end  (on %* in /usr/bin/stap)
    sdt_stap:pass5__start (on %* in /usr/bin/stap)
    sdt_stap:pass5__end  (on %* in /usr/bin/stap)
    sdt_stap:pass0__start (on %* in /usr/bin/stap)
    sdt_stap:pass0__end  (on %* in /usr/bin/stap)
    sdt_stap:pass1a__start (on %* in /usr/bin/stap)
    sdt_stap:pass1b__start (on %* in /usr/bin/stap)
    sdt_stap:pass1__end  (on %* in /usr/bin/stap)
    sdt_stap:pass2__start (on %* in /usr/bin/stap)
    sdt_stap:pass2__end  (on %* in /usr/bin/stap)
    sdt_stap:pass3__start (on %* in /usr/bin/stap)
    sdt_stap:pass3__end  (on %* in /usr/bin/stap)
    sdt_stap:pass4__start (on %* in /usr/bin/stap)
    sdt_stap:pass4__end  (on %* in /usr/bin/stap)
    sdt_stap:benchmark__start (on %* in /usr/bin/stap)
    sdt_stap:benchmark__end (on %* in /usr/bin/stap)
    sdt_stap:cache__get  (on %* in /usr/bin/stap)
    sdt_stap:cache__clean (on %* in /usr/bin/stap)
    sdt_stap:cache__add__module (on %* in /usr/bin/stap)
    sdt_stap:cache__add__source (on %* in /usr/bin/stap)
    sdt_stap:stap_system__complete (on %* in /usr/bin/stap)
    sdt_stap:stap_system__start (on %* in /usr/bin/stap)
    sdt_stap:stap_system__spawn (on %* in /usr/bin/stap)
    sdt_stap:stap_system__fork (on %* in /usr/bin/stap)
    sdt_stap:intern_string (on %* in /usr/bin/stap)
    sdt_stap:client__start (on %* in /usr/bin/stap)
    sdt_stap:client__end (on %* in /usr/bin/stap)

  You can now use it in all perf tools, such as:

  	perf record -e sdt_stap:client__end -aR sleep 1

  #

From these we're use the two below to run systemtap's test suite:

  # perf record -e sdt_stap:pass2__*,cycles:P make installcheck > /dev/null
  ^C[ perf record: Woken up 8 times to write data ]
  [ perf record: Captured and wrote 2.691 MB perf.data (39638 samples) ]
  Terminated
  # perf script | grep sdt_stap
              stap 28979 [000] 19424.302660: sdt_stap:pass2__start: (561b9a537de3) arg1=140730364262544
              stap 28979 [000] 19424.333083:   sdt_stap:pass2__end: (561b9a53a9e1) arg1=140730364262544
              stap 29045 [006] 19424.933460: sdt_stap:pass2__start: (563edddcede3) arg1=140722674883152
              stap 29045 [006] 19424.963794:   sdt_stap:pass2__end: (563edddd19e1) arg1=140722674883152
  # perf script | grep cycles |  wc -l
  39634
  #

Looking at the whole perf.data file:

  [root@quaco testsuite]# perf report | grep cycles:P -A25
  # Samples: 39K of event 'cycles:P'
  # Event count (approx.): 34044267368
  #
  # Overhead  Command  Shared Object         Symbol
  # ........  .......  ....................  ................................
  #
       3.50%  cc1      cc1                   [.] ht_lookup_with_hash
       3.04%  cc1      cc1                   [.] _cpp_lex_token
       2.11%  cc1      cc1                   [.] ggc_internal_alloc
       1.83%  cc1      cc1                   [.] cpp_get_token_with_location
       1.68%  cc1      libc-2.29.so          [.] _int_malloc
       1.41%  cc1      cc1                   [.] linemap_position_for_column
       1.25%  cc1      cc1                   [.] ggc_internal_cleared_alloc
       1.20%  cc1      cc1                   [.] c_lex_with_flags
       1.18%  cc1      cc1                   [.] get_combined_adhoc_loc
       1.05%  cc1      libc-2.29.so          [.] malloc
       1.01%  cc1      libc-2.29.so          [.] _int_free
       0.96%  stap     stap                  [.] std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, stringtable_hash, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > >
       0.78%  stap     stap                  [.] lexer::scan
       0.74%  cc1      cc1                   [.] _cpp_lex_direct
       0.70%  cc1      cc1                   [.] pop_scope
       0.70%  cc1      cc1                   [.] c_parser_declspecs
       0.69%  stap     libc-2.29.so          [.] _int_malloc
       0.68%  cc1      cc1                   [.] htab_find_slot
       0.68%  cc1      [kernel.vmlinux]      [k] prepare_exit_to_usermode
       0.64%  cc1      [kernel.vmlinux]      [k] clear_page_erms
  [root@quaco testsuite]#

And now only what happens in slices demarcated by those start/end SDT
events:

  [root@quaco testsuite]# perf report --switch-on=sdt_stap:pass2__start --switch-off=sdt_stap:pass2__end | grep cycles:P -A100
  # Samples: 240  of event 'cycles:P'
  # Event count (approx.): 206491934
  #
  # Overhead  Command  Shared Object        Symbol
  # ........  .......  ...................  ................................................
  #
      38.99%  stap     stap                 [.] systemtap_session::register_library_aliases
      19.47%  stap     stap                 [.] match_key::operator<
      15.01%  stap     libc-2.29.so         [.] __memcmp_avx2_movbe
       5.19%  stap     libc-2.29.so         [.] _int_malloc
       2.50%  stap     libstdc++.so.6.0.26  [.] std::_Rb_tree_insert_and_rebalance
       2.30%  stap     stap                 [.] match_node::build_no_more
       2.07%  stap     libc-2.29.so         [.] malloc
       1.66%  stap     stap                 [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::find
       1.66%  stap     stap                 [.] match_node::bind
       1.58%  stap     [kernel.vmlinux]     [k] prepare_exit_to_usermode
       1.17%  stap     [kernel.vmlinux]     [k] native_irq_return_iret
       0.87%  stap     stap                 [.] 0x0000000000032ec4
       0.77%  stap     libstdc++.so.6.0.26  [.] std::_Rb_tree_increment
       0.47%  stap     stap                 [.] std::vector<derived_probe_builder*, std::allocator<derived_probe_builder*> >::_M_realloc_insert<derived_probe_builder* const&>
       0.47%  stap     [kernel.vmlinux]     [k] get_page_from_freelist
       0.47%  stap     [kernel.vmlinux]     [k] swapgs_restore_regs_and_return_to_usermode
       0.47%  stap     [kernel.vmlinux]     [k] do_user_addr_fault
       0.46%  stap     [kernel.vmlinux]     [k] __pagevec_lru_add_fn
       0.46%  stap     stap                 [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::_M_emplace_unique<std::pair<match_key, match_node*> >
       0.42%  stap     libstdc++.so.6.0.26  [.] 0x00000000000c18fa
       0.40%  stap     [kernel.vmlinux]     [k] interrupt_entry
       0.40%  stap     [kernel.vmlinux]     [k] update_load_avg
       0.40%  stap     [kernel.vmlinux]     [k] __intel_pmu_disable_all
       0.40%  stap     [kernel.vmlinux]     [k] clear_page_erms
       0.39%  stap     [kernel.vmlinux]     [k] __mod_node_page_state
       0.39%  stap     [kernel.vmlinux]     [k] error_entry
       0.39%  stap     [kernel.vmlinux]     [k] sync_regs
       0.38%  stap     [kernel.vmlinux]     [k] __handle_mm_fault
       0.38%  stap     stap                 [.] derive_probes

  #
  # (Tip: System-wide collection from all CPUs: perf record -a)
  #
  [root@quaco testsuite]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-408hvumcnyn93a0auihnawew@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-report.txt | 17 +++++++++++++++++
 tools/perf/builtin-report.c              | 10 ++++++++++
 2 files changed, 27 insertions(+)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 987261d158d4..7315f155803f 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -438,6 +438,23 @@ OPTIONS
 
 	  perf report --time 0%-10%,30%-40%
 
+--switch-on EVENT_NAME::
+	Only consider events after this event is found.
+
+	This may be interesting to measure a workload only after some initialization
+	phase is over, i.e. insert a perf probe at that point and then using this
+	option with that probe.
+
+--switch-off EVENT_NAME::
+	Stop considering events after this event is found.
+
+--show-on-off-events::
+	Show the --switch-on/off events too. This has no effect in 'perf report' now
+	but probably we'll make the default not to show the switch-on/off events
+        on the --group mode and if there is only one event besides the off/on ones,
+	go straight to the histogram browser, just like 'perf report' with no events
+	explicitely specified does.
+
 --itrace::
 	Options for decoding instruction tracing data. The options are:
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index d4288dcce156..5e003d02821e 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -25,6 +25,7 @@
 #include "util/debug.h"
 #include "util/evlist.h"
 #include "util/evsel.h"
+#include "util/evswitch.h"
 #include "util/header.h"
 #include "util/session.h"
 #include "util/tool.h"
@@ -60,6 +61,7 @@
 struct report {
 	struct perf_tool	tool;
 	struct perf_session	*session;
+	struct evswitch		evswitch;
 	bool			use_tui, use_gtk, use_stdio;
 	bool			show_full_info;
 	bool			show_threads;
@@ -243,6 +245,9 @@ static int process_sample_event(struct perf_tool *tool,
 		return 0;
 	}
 
+	if (evswitch__discard(&rep->evswitch, evsel))
+		return 0;
+
 	if (machine__resolve(machine, &al, sample) < 0) {
 		pr_debug("problem processing %d event, skipping it.\n",
 			 event->header.type);
@@ -1189,6 +1194,7 @@ int cmd_report(int argc, const char **argv)
 	OPT_CALLBACK(0, "time-quantum", &symbol_conf.time_quantum, "time (ms|us|ns|s)",
 		     "Set time quantum for time sort key (default 100ms)",
 		     parse_time_quantum),
+	OPTS_EVSWITCH(&report.evswitch),
 	OPT_END()
 	};
 	struct perf_data data = {
@@ -1257,6 +1263,10 @@ int cmd_report(int argc, const char **argv)
 	if (session == NULL)
 		return -1;
 
+	ret = evswitch__init(&report.evswitch, session->evlist, stderr);
+	if (ret)
+		return ret;
+
 	if (zstd_init(&(session->zstd_data), 0) < 0)
 		pr_warning("Decompression initialization failed. Reported data may be incomplete.\n");
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 15/17] perf map: Use zalloc for map_groups
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 14/17] perf report: " Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 16/17] perf unwind: Fix libunwind when tid != pid Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 17/17] perf unwind: Remove unnecessary test Arnaldo Carvalho de Melo
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, John Keeping, Alexander Shishkin,
	Konstantin Khlebnikov, Peter Zijlstra, Arnaldo Carvalho de Melo

From: John Keeping <john@metanate.com>

In the next commit we will add new fields to map_groups and we need
these to be null if no value is assigned.  The simplest way to achieve
this is to request zeroed memory from the allocator.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: john keeping <john@metanate.com>
Link: http://lkml.kernel.org/r/20190815100146.28842-1-john@metanate.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/map.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 668410b1d426..44b556812e4b 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -636,7 +636,7 @@ bool map_groups__empty(struct map_groups *mg)
 
 struct map_groups *map_groups__new(struct machine *machine)
 {
-	struct map_groups *mg = malloc(sizeof(*mg));
+	struct map_groups *mg = zalloc(sizeof(*mg));
 
 	if (mg != NULL)
 		map_groups__init(mg, machine);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 16/17] perf unwind: Fix libunwind when tid != pid
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 15/17] perf map: Use zalloc for map_groups Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  2019-08-16 20:16 ` [PATCH 17/17] perf unwind: Remove unnecessary test Arnaldo Carvalho de Melo
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, John Keeping, Alexander Shishkin,
	Konstantin Khlebnikov, Peter Zijlstra, Arnaldo Carvalho de Melo

From: John Keeping <john@metanate.com>

Commit e5adfc3e7e77 ("perf map: Synthesize maps only for thread group
leader") changed the recording side so that we no longer get mmap events
for threads other than the thread group leader (when synthesising these
events for threads which exist before perf is started).

When a file recorded after this change is loaded, the lack of mmap
records mean that unwinding is not set up for any other threads.

This can be seen in a simple record/report scenario:

	perf record --call-graph=dwarf -t $TID
	perf report

If $TID is a process ID then the report will show call graphs, but if
$TID is a secondary thread the output is as if --call-graph=none was
specified.

Following the rationale in that commit, move the libunwind fields into
struct map_groups and update the libunwind functions to take this
instead of the struct thread.  This is only required for
unwind__finish_access which must now be called from map_groups__delete
and the others are changed for symmetry.

Note that unwind__get_entries keeps the thread argument since it is
required for symbol lookup and the libdw unwind provider uses the thread
ID.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: e5adfc3e7e77 ("perf map: Synthesize maps only for thread group leader")
Link: http://lkml.kernel.org/r/20190815100146.28842-2-john@metanate.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/map.c                    |  3 ++-
 tools/perf/util/map_groups.h             |  4 +++
 tools/perf/util/thread.c                 |  7 +++--
 tools/perf/util/thread.h                 |  4 ---
 tools/perf/util/unwind-libunwind-local.c | 18 ++++++-------
 tools/perf/util/unwind-libunwind.c       | 34 ++++++++++++------------
 tools/perf/util/unwind.h                 | 25 ++++++++---------
 7 files changed, 48 insertions(+), 47 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 44b556812e4b..27b7b102e4a2 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -647,6 +647,7 @@ struct map_groups *map_groups__new(struct machine *machine)
 void map_groups__delete(struct map_groups *mg)
 {
 	map_groups__exit(mg);
+	unwind__finish_access(mg);
 	free(mg);
 }
 
@@ -887,7 +888,7 @@ int map_groups__clone(struct thread *thread, struct map_groups *parent)
 		if (new == NULL)
 			goto out_unlock;
 
-		err = unwind__prepare_access(thread, new, NULL);
+		err = unwind__prepare_access(mg, new, NULL);
 		if (err)
 			goto out_unlock;
 
diff --git a/tools/perf/util/map_groups.h b/tools/perf/util/map_groups.h
index 5f25efa6d6bc..77252e14008f 100644
--- a/tools/perf/util/map_groups.h
+++ b/tools/perf/util/map_groups.h
@@ -31,6 +31,10 @@ struct map_groups {
 	struct maps	 maps;
 	struct machine	 *machine;
 	refcount_t	 refcnt;
+#ifdef HAVE_LIBUNWIND_SUPPORT
+	void				*addr_space;
+	struct unwind_libunwind_ops	*unwind_libunwind_ops;
+#endif
 };
 
 #define KMAP_NAME_LEN 256
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 590793cc5142..bbf7816cba31 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -105,7 +105,6 @@ void thread__delete(struct thread *thread)
 	}
 	up_write(&thread->comm_lock);
 
-	unwind__finish_access(thread);
 	nsinfo__zput(thread->nsinfo);
 	srccode_state_free(&thread->srccode_state);
 
@@ -252,7 +251,7 @@ static int ____thread__set_comm(struct thread *thread, const char *str,
 		list_add(&new->list, &thread->comm_list);
 
 		if (exec)
-			unwind__flush_access(thread);
+			unwind__flush_access(thread->mg);
 	}
 
 	thread->comm_set = true;
@@ -332,7 +331,7 @@ int thread__insert_map(struct thread *thread, struct map *map)
 {
 	int ret;
 
-	ret = unwind__prepare_access(thread, map, NULL);
+	ret = unwind__prepare_access(thread->mg, map, NULL);
 	if (ret)
 		return ret;
 
@@ -352,7 +351,7 @@ static int __thread__prepare_access(struct thread *thread)
 	down_read(&maps->lock);
 
 	for (map = maps__first(maps); map; map = map__next(map)) {
-		err = unwind__prepare_access(thread, map, &initialized);
+		err = unwind__prepare_access(thread->mg, map, &initialized);
 		if (err || initialized)
 			break;
 	}
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index e97ef6977eb9..bf06113be4f3 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -44,10 +44,6 @@ struct thread {
 	struct thread_stack	*ts;
 	struct nsinfo		*nsinfo;
 	struct srccode_state	srccode_state;
-#ifdef HAVE_LIBUNWIND_SUPPORT
-	void				*addr_space;
-	struct unwind_libunwind_ops	*unwind_libunwind_ops;
-#endif
 	bool			filter;
 	int			filter_entry_depth;
 };
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index 71a788921b62..ebdbb056510c 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -616,26 +616,26 @@ static unw_accessors_t accessors = {
 	.get_proc_name		= get_proc_name,
 };
 
-static int _unwind__prepare_access(struct thread *thread)
+static int _unwind__prepare_access(struct map_groups *mg)
 {
-	thread->addr_space = unw_create_addr_space(&accessors, 0);
-	if (!thread->addr_space) {
+	mg->addr_space = unw_create_addr_space(&accessors, 0);
+	if (!mg->addr_space) {
 		pr_err("unwind: Can't create unwind address space.\n");
 		return -ENOMEM;
 	}
 
-	unw_set_caching_policy(thread->addr_space, UNW_CACHE_GLOBAL);
+	unw_set_caching_policy(mg->addr_space, UNW_CACHE_GLOBAL);
 	return 0;
 }
 
-static void _unwind__flush_access(struct thread *thread)
+static void _unwind__flush_access(struct map_groups *mg)
 {
-	unw_flush_cache(thread->addr_space, 0, 0);
+	unw_flush_cache(mg->addr_space, 0, 0);
 }
 
-static void _unwind__finish_access(struct thread *thread)
+static void _unwind__finish_access(struct map_groups *mg)
 {
-	unw_destroy_addr_space(thread->addr_space);
+	unw_destroy_addr_space(mg->addr_space);
 }
 
 static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
@@ -660,7 +660,7 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 	 */
 	if (max_stack - 1 > 0) {
 		WARN_ONCE(!ui->thread, "WARNING: ui->thread is NULL");
-		addr_space = ui->thread->addr_space;
+		addr_space = ui->thread->mg->addr_space;
 
 		if (addr_space == NULL)
 			return -1;
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index c0811977d7d5..b843f9d0a9ea 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -11,13 +11,13 @@ struct unwind_libunwind_ops __weak *local_unwind_libunwind_ops;
 struct unwind_libunwind_ops __weak *x86_32_unwind_libunwind_ops;
 struct unwind_libunwind_ops __weak *arm64_unwind_libunwind_ops;
 
-static void unwind__register_ops(struct thread *thread,
+static void unwind__register_ops(struct map_groups *mg,
 			  struct unwind_libunwind_ops *ops)
 {
-	thread->unwind_libunwind_ops = ops;
+	mg->unwind_libunwind_ops = ops;
 }
 
-int unwind__prepare_access(struct thread *thread, struct map *map,
+int unwind__prepare_access(struct map_groups *mg, struct map *map,
 			   bool *initialized)
 {
 	const char *arch;
@@ -28,7 +28,7 @@ int unwind__prepare_access(struct thread *thread, struct map *map,
 	if (!dwarf_callchain_users)
 		return 0;
 
-	if (thread->addr_space) {
+	if (mg->addr_space) {
 		pr_debug("unwind: thread map already set, dso=%s\n",
 			 map->dso->name);
 		if (initialized)
@@ -37,14 +37,14 @@ int unwind__prepare_access(struct thread *thread, struct map *map,
 	}
 
 	/* env->arch is NULL for live-mode (i.e. perf top) */
-	if (!thread->mg->machine->env || !thread->mg->machine->env->arch)
+	if (!mg->machine->env || !mg->machine->env->arch)
 		goto out_register;
 
-	dso_type = dso__type(map->dso, thread->mg->machine);
+	dso_type = dso__type(map->dso, mg->machine);
 	if (dso_type == DSO__TYPE_UNKNOWN)
 		return 0;
 
-	arch = perf_env__arch(thread->mg->machine->env);
+	arch = perf_env__arch(mg->machine->env);
 
 	if (!strcmp(arch, "x86")) {
 		if (dso_type != DSO__TYPE_64BIT)
@@ -59,37 +59,37 @@ int unwind__prepare_access(struct thread *thread, struct map *map,
 		return 0;
 	}
 out_register:
-	unwind__register_ops(thread, ops);
+	unwind__register_ops(mg, ops);
 
-	err = thread->unwind_libunwind_ops->prepare_access(thread);
+	err = mg->unwind_libunwind_ops->prepare_access(mg);
 	if (initialized)
 		*initialized = err ? false : true;
 	return err;
 }
 
-void unwind__flush_access(struct thread *thread)
+void unwind__flush_access(struct map_groups *mg)
 {
 	if (!dwarf_callchain_users)
 		return;
 
-	if (thread->unwind_libunwind_ops)
-		thread->unwind_libunwind_ops->flush_access(thread);
+	if (mg->unwind_libunwind_ops)
+		mg->unwind_libunwind_ops->flush_access(mg);
 }
 
-void unwind__finish_access(struct thread *thread)
+void unwind__finish_access(struct map_groups *mg)
 {
 	if (!dwarf_callchain_users)
 		return;
 
-	if (thread->unwind_libunwind_ops)
-		thread->unwind_libunwind_ops->finish_access(thread);
+	if (mg->unwind_libunwind_ops)
+		mg->unwind_libunwind_ops->finish_access(mg);
 }
 
 int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 			 struct thread *thread,
 			 struct perf_sample *data, int max_stack)
 {
-	if (thread->unwind_libunwind_ops)
-		return thread->unwind_libunwind_ops->get_entries(cb, arg, thread, data, max_stack);
+	if (thread->mg->unwind_libunwind_ops)
+		return thread->mg->unwind_libunwind_ops->get_entries(cb, arg, thread, data, max_stack);
 	return 0;
 }
diff --git a/tools/perf/util/unwind.h b/tools/perf/util/unwind.h
index 8a44a1569a21..3a7d00c20d86 100644
--- a/tools/perf/util/unwind.h
+++ b/tools/perf/util/unwind.h
@@ -6,6 +6,7 @@
 #include <linux/types.h>
 
 struct map;
+struct map_groups;
 struct perf_sample;
 struct symbol;
 struct thread;
@@ -19,9 +20,9 @@ struct unwind_entry {
 typedef int (*unwind_entry_cb_t)(struct unwind_entry *entry, void *arg);
 
 struct unwind_libunwind_ops {
-	int (*prepare_access)(struct thread *thread);
-	void (*flush_access)(struct thread *thread);
-	void (*finish_access)(struct thread *thread);
+	int (*prepare_access)(struct map_groups *mg);
+	void (*flush_access)(struct map_groups *mg);
+	void (*finish_access)(struct map_groups *mg);
 	int (*get_entries)(unwind_entry_cb_t cb, void *arg,
 			   struct thread *thread,
 			   struct perf_sample *data, int max_stack);
@@ -46,20 +47,20 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 #endif
 
 int LIBUNWIND__ARCH_REG_ID(int regnum);
-int unwind__prepare_access(struct thread *thread, struct map *map,
+int unwind__prepare_access(struct map_groups *mg, struct map *map,
 			   bool *initialized);
-void unwind__flush_access(struct thread *thread);
-void unwind__finish_access(struct thread *thread);
+void unwind__flush_access(struct map_groups *mg);
+void unwind__finish_access(struct map_groups *mg);
 #else
-static inline int unwind__prepare_access(struct thread *thread __maybe_unused,
+static inline int unwind__prepare_access(struct map_groups *mg __maybe_unused,
 					 struct map *map __maybe_unused,
 					 bool *initialized __maybe_unused)
 {
 	return 0;
 }
 
-static inline void unwind__flush_access(struct thread *thread __maybe_unused) {}
-static inline void unwind__finish_access(struct thread *thread __maybe_unused) {}
+static inline void unwind__flush_access(struct map_groups *mg __maybe_unused) {}
+static inline void unwind__finish_access(struct map_groups *mg __maybe_unused) {}
 #endif
 #else
 static inline int
@@ -72,14 +73,14 @@ unwind__get_entries(unwind_entry_cb_t cb __maybe_unused,
 	return 0;
 }
 
-static inline int unwind__prepare_access(struct thread *thread __maybe_unused,
+static inline int unwind__prepare_access(struct map_groups *mg __maybe_unused,
 					 struct map *map __maybe_unused,
 					 bool *initialized __maybe_unused)
 {
 	return 0;
 }
 
-static inline void unwind__flush_access(struct thread *thread __maybe_unused) {}
-static inline void unwind__finish_access(struct thread *thread __maybe_unused) {}
+static inline void unwind__flush_access(struct map_groups *mg __maybe_unused) {}
+static inline void unwind__finish_access(struct map_groups *mg __maybe_unused) {}
 #endif /* HAVE_DWARF_UNWIND_SUPPORT */
 #endif /* __UNWIND_H */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 17/17] perf unwind: Remove unnecessary test
  2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2019-08-16 20:16 ` [PATCH 16/17] perf unwind: Fix libunwind when tid != pid Arnaldo Carvalho de Melo
@ 2019-08-16 20:16 ` Arnaldo Carvalho de Melo
  16 siblings, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-16 20:16 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, John Keeping, Alexander Shishkin,
	Konstantin Khlebnikov, Peter Zijlstra, Arnaldo Carvalho de Melo

From: John Keeping <john@metanate.com>

If dwarf_callchain_users is false, then unwind__prepare_access() will
not set unwind_libunwind_ops so the remaining test here is sufficient.

Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: john keeping <john@metanate.com>
Link: http://lkml.kernel.org/r/20190815100146.28842-3-john@metanate.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/unwind-libunwind.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index b843f9d0a9ea..6499b22b158b 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -69,18 +69,12 @@ int unwind__prepare_access(struct map_groups *mg, struct map *map,
 
 void unwind__flush_access(struct map_groups *mg)
 {
-	if (!dwarf_callchain_users)
-		return;
-
 	if (mg->unwind_libunwind_ops)
 		mg->unwind_libunwind_ops->flush_access(mg);
 }
 
 void unwind__finish_access(struct map_groups *mg)
 {
-	if (!dwarf_callchain_users)
-		return;
-
 	if (mg->unwind_libunwind_ops)
 		mg->unwind_libunwind_ops->finish_access(mg);
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-08-16 20:18 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-16 20:16 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 01/17] perf vendor events intel: Add Tremontx event file v1.02 Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 02/17] perf script: Allow specifying event to switch on processing of other events Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 03/17] perf script: Allow showing the --switch-on event Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 04/17] perf script: Allow specifying event to switch off processing of other events Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 05/17] perf evswitch: Move struct to a separate header to use in other tools Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 06/17] perf evswitch: Move switch logic " Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 07/17] perf evswitch: Add the names of on/off events Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 08/17] perf evswitch: Introduce OPTS_EVSWITCH() for cmd line processing Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 09/17] perf evswitch: Introduce init() method to set the on/off evsels from the command line Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 10/17] perf evswitch: Move enoent error message printing to separate function Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 11/17] perf evswitch: Add hint when not finding specified on/off events Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 12/17] perf trace: Add --switch-on/--switch-off events Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 13/17] perf top: " Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 14/17] perf report: " Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 15/17] perf map: Use zalloc for map_groups Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 16/17] perf unwind: Fix libunwind when tid != pid Arnaldo Carvalho de Melo
2019-08-16 20:16 ` [PATCH 17/17] perf unwind: Remove unnecessary test Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).