linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL] perf/core improvements and fixes
@ 2019-11-22 14:56 Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 01/26] perf map: Move maj/min/ino/ino_generation to separate struct Arnaldo Carvalho de Melo
                   ` (26 more replies)
  0 siblings, 27 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexey Budankov, Colin King, Hewenliang, Ian Rogers, Jin Yao,
	Steven Rostedt, Sudipm Mukherjee, Arnaldo Carvalho de Melo

Hi Ingo/Thomas,

	Please consider pulling,

Best regards,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 8f6ee51d772d0dab407d868449d2c5d9c8d2b6fc:

  Merge tag 'perf-core-for-mingo-5.5-20191119' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2019-11-19 12:59:03 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-5.5-20191122

for you to fetch changes up to 4584f084aa9d8033d5911935837dbee7b082d0e9:

  perf parse: Fix potential memory leak when handling tracepoint errors (2019-11-22 10:48:14 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

perf report:

  Jin Yao:

  - Allow entering the annotation view (symbol source/assembly +
    overhead/cycles/etc column) from the 'perf report --total-cycles'
    interface.

    E.g.:

      # perf record --all-cpus --branch-any --all-kernel
      ^C[ perf record: Woken up 5 times to write data ]
      #
      # perf evlist -v
      cycles: size: 120, { sample_period, sample_freq }: 4000,
      sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK,
      read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, mmap: 1, comm: 1, freq: 1, task: 1,
      precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1,
      bpf_event: 1, branch_sample_type: ANY
      #
      # perf report --total-cycles
      #
      # Samples: 78762 of event 'cycles'
      Sampled  Sampled Avg      Avg
      Cycles%  Cycles  Cycles%  Cycles                           [Program Block Range]     Shared Object
        1.72%    95.8K   0.00%     254                        [msr.h:105 -> msr.h:166]  [kernel.vmlinux]
        1.56%   107.6K   0.00%     618                [compiler.h:199 -> common.c:301]  [kernel.vmlinux]
        0.83%    46.3K   0.00%     409              [entry_64.S:153 -> entry_64.S:175]  [kernel.vmlinux]
        0.83%    46.1K   0.00%      83                  [jump_label.h:41 -> tsc.c:230]  [kernel.vmlinux]
        0.64%    36.9K   0.01%    1.4K            [hda_intel.c:904 -> hda_intel.c:916]   [snd_hda_intel]
        0.57%    30.2K   0.00%     282                      [file.c:710 -> file.c:730]  [kernel.vmlinux]
        0.48%    25.8K   0.00%      82              [spinlock.c:158 -> spinlock.c:160]  [kernel.vmlinux]
        0.45%    23.7K   0.00%     369  [tick-broadcast.c:585 -> tick-broadcast.c:586]  [kernel.vmlinux]
        0.44%    24.4K   0.00%      73                       [msr.h:236 -> tsc.c:1088]  [kernel.vmlinux]
        0.43%    22.7K   0.00%     144                [cpuidle.c:229 -> cpuidle.c:232]  [kernel.vmlinux]

    Then press 'A' or Enter on one of those lines, just like with 'perf top', say
    the top one: [msr.h:105 -> msr.h:166], then this shows up:

      Samples: 78K of event 'cycles', 4000 Hz, Event count (approx.): 78762
      native_write_msr  /lib/modules/5.4.0-rc8/build/vmlinux [Percent: local period]
      Percent│ IPC Cycle (Average IPC: 0.02, IPC Coverage: 50.0%)
             │
             │             Disassembly of section .text:
             │
             │             ffffffff8106c480 <native_write_msr>:
             │             __wrmsr():
             │             return EAX_EDX_VAL(val, low, high);
             │             }
             │
             │             static inline void notrace __wrmsr(unsigned int msr, u32 low, u32 high)
             │             {
             │             asm volatile("1: wrmsr\n"
       49.16 │0.02           mov   %edi,%ecx
             │0.02           mov   %esi,%eax
             │0.02           wrmsr
             │             arch_static_branch():
             │             #include <linux/stringify.h>
             │             #include <linux/types.h>
             │
             │             static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
             │             {
             │             asm_volatile_goto("1:"
        0.79 │0.02           nop
             │             native_write_msr():
             │             {
             │             __wrmsr(msr, low, high);
             │
             │             if (msr_tracepoint_active(__tracepoint_write_msr))
             │             do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
             │             }
       50.05 │0.02  254    ← retq
             │             do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
             │               shl   $0x20,%rdx
             │               mov   %esi,%esi
             │               or    %rdx,%rsi
             │               xor   %edx,%edx
             │             → jmpq  do_trace_write_msr

    We need to improve this to show the source code line numbers in the
    annotation view, so one can go from that program block to the annotation view
    and see those source code line numbers straight away.

auxtrace/Intel PT:

  Adrian Hunter:

  - Add support for AUX area sampling, requires new functionality that
    will land in 5.5, its already in tip.

    This includes kernel capability querying so that it fails gracefully
    with older kernels, duimping aux area samples in 'perf report -D' and
    'perf script'.

perf.data:

  Alexey Budankov:

  - Fix decompression of PERF_RECORD_COMPRESSED records.

core:

  Arnaldo Carvalho de Melo:

  - Use the 'dcacheline' cmp routine to find the right DSOs taking into
    account the 'maj', 'min', 'ino' and 'ino_generation', that got moved
    from 'struct map' to 'struct dso', where it belongs.

    This further reduces the size of 'struct map', there is still more
    work to do to maybe get it to max one cacheline.

libtraceevent:

  Hewenliang:

  - Fix memory leakage in copy_filter_type().

  Sudip Mukherjee:

  - Fix header installation.

perf parse:

  Ian Rogers :

  - Fix potential memory leak when handling tracepoint errors, found using
    LLVM's libFuzzer.

perf probe:

  Colin Ian King:

  - Fix spelling mistake "addrees" -> "address".

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Adrian Hunter (14):
      perf tools: Add kernel AUX area sampling definitions
      perf record: Add a function to test for kernel support for AUX area sampling
      perf auxtrace: Move perf_evsel__find_pmu()
      perf auxtrace: Add support for AUX area sample recording
      perf record: Add support for AUX area sampling
      perf record: Add aux-sample-size config term
      perf inject: Cut AUX area samples
      perf auxtrace: Add support for dumping AUX area samples
      perf session: Add facility to peek at all events
      perf auxtrace: Add support for queuing AUX area samples
      perf pmu: When using default config, record which bits of config were changed by the user
      perf intel-pt: Add support for recording AUX area samples
      perf intel-pt: Add support for decoding AUX area samples
      perf intel-bts: Does not support AUX area sampling

Alexey Budankov (1):
      perf session: Fix decompression of PERF_RECORD_COMPRESSED records

Arnaldo Carvalho de Melo (5):
      perf map: Move maj/min/ino/ino_generation to separate struct
      perf map: Pass a dso_id to map__new()
      perf map: Move comparision of map's dso_id to a separate function
      perf dsos: Remove unused dsos__find() method
      perf dso: Move dso_id from 'struct map' to 'struct dso'

Colin Ian King (1):
      perf probe: Fix spelling mistake "addrees" -> "address"

Hewenliang (1):
      libtraceevent: Fix memory leakage in copy_filter_type

Ian Rogers (1):
      perf parse: Fix potential memory leak when handling tracepoint errors

Jin Yao (2):
      perf util: Move block TUI function to ui browsers
      perf report: Jump to symbol source view from total cycles view

Sudip Mukherjee (1):
      libtraceevent: Fix header installation

 tools/include/uapi/linux/perf_event.h     |  10 +-
 tools/lib/traceevent/Makefile             |   8 +-
 tools/lib/traceevent/parse-filter.c       |   9 +-
 tools/perf/Documentation/intel-pt.txt     |  59 +++++-
 tools/perf/Documentation/perf-record.txt  |   9 +
 tools/perf/arch/x86/util/auxtrace.c       |   4 +
 tools/perf/arch/x86/util/intel-bts.c      |   5 +
 tools/perf/arch/x86/util/intel-pt.c       |  81 +++++++-
 tools/perf/builtin-inject.c               |  29 +++
 tools/perf/builtin-record.c               |  21 +-
 tools/perf/builtin-report.c               |  11 +-
 tools/perf/tests/attr/base-record         |   2 +-
 tools/perf/tests/attr/base-stat           |   2 +-
 tools/perf/tests/sample-parsing.c         |  16 +-
 tools/perf/ui/browsers/hists.c            |  78 +++++++-
 tools/perf/util/auxtrace.c                | 322 ++++++++++++++++++++++++++++--
 tools/perf/util/auxtrace.h                |  43 ++++
 tools/perf/util/block-info.c              |  71 +------
 tools/perf/util/block-info.h              |   3 +-
 tools/perf/util/dso.c                     |  24 ++-
 tools/perf/util/dso.h                     |  13 ++
 tools/perf/util/dsos.c                    |  97 +++++++--
 tools/perf/util/dsos.h                    |  14 +-
 tools/perf/util/event.h                   |   6 +
 tools/perf/util/evlist.h                  |   1 +
 tools/perf/util/evsel.c                   |  31 +++
 tools/perf/util/evsel_config.h            |  13 ++
 tools/perf/util/hist.h                    |  15 ++
 tools/perf/util/intel-pt.c                | 109 +++++++++-
 tools/perf/util/machine.c                 |  22 +-
 tools/perf/util/machine.h                 |   2 +
 tools/perf/util/map.c                     |  11 +-
 tools/perf/util/map.h                     |   9 +-
 tools/perf/util/parse-events.c            |  65 +++++-
 tools/perf/util/parse-events.h            |   1 +
 tools/perf/util/parse-events.l            |   1 +
 tools/perf/util/perf_event_attr_fprintf.c |   3 +-
 tools/perf/util/pmu.c                     |  10 +
 tools/perf/util/pmu.h                     |   2 +
 tools/perf/util/probe-finder.c            |   2 +-
 tools/perf/util/record.c                  |  31 +++
 tools/perf/util/record.h                  |   2 +
 tools/perf/util/session.c                 |  82 ++++++--
 tools/perf/util/session.h                 |   5 +
 tools/perf/util/sort.c                    |  24 +--
 tools/perf/util/synthetic-events.c        |  12 ++
 46 files changed, 1190 insertions(+), 200 deletions(-)

Test results:

The first ones are container based builds of tools/perf with and without libelf
support.  Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

Clearlinux is failing when building with libpython, but that is not a perf
regression, will try to remove one compiler warning that is causing the problem
when building some of the glue code files in the python files, outside perf.

Manjaro got fixed by adding the 'gettext' package, that provides a
library needed by bison but not present in its dependencies list, i.e. a
distro bug.

cooker is failing with:

  In file included from cpumap.c:4:
  In file included from /git/linux/tools/include/linux/refcount.h:41:
  In file included from /git/linux/tools/include/linux/atomic.h:5:
  In file included from /git/linux/tools/include/asm/atomic.h:6:
  In file included from /git/linux/tools/include/asm/../../arch/x86/include/asm/atomic.h:11:
  /git/linux/tools/arch/x86/include/asm/cmpxchg.h:12:2: error: unknown attribute 'error' ignored [-Werror,-Wunknown-attributes]
          __compiletime_error("Bad argument size for cmpxchg");
          ^
  /git/linux/tools/include/linux/compiler-gcc.h:20:54: note: expanded from macro '__compiletime_error'
  # define __compiletime_error(message) __attribute__((error(message)))
                                                       ^
    LD       /tmp/build/perf/fs/libapi-in.o

Still needs investigating, new image, just leaving it here for
documentation purposes, maybe related to it using the most recent gcc
and clang versions?

  # export PERF_TARBALL=http://192.168.124.1/perf/perf-5.4.0-rc7.tar.xz
  # dm 
   1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final)
   2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final)
   3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final)
   4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
   5 alpine:3.8                    : Ok   gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
   6 alpine:3.9                    : Ok   gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
   7 alpine:3.10                   : Ok   gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
   8 alpine:edge                   : Ok   gcc (Alpine 9.2.0) 9.2.0, Alpine clang version 9.0.0 (git://git.alpinelinux.org/aports 25c73ae7b95bdb42ae5f0ceac3b703e766582527) (based on LLVM 9.0.0)
   9 amazonlinux:1                 : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final)
  10 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
  11 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  12 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  13 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  14 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
  15 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
  16 centos:8                      : Ok   gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), clang version 7.0.1 (tags/RELEASE_701/final)
  17 clearlinux:latest             : Ok   gcc (Clear Linux OS for Intel Architecture) 9.2.1 20191101 gcc-9-branch@277702, clang version 9.0.0 (tags/RELEASE_900/final)
  18 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
  19 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final)
  20 debian:10                     : Ok   gcc (Debian 8.3.0-6) 8.3.0, clang version 7.0.1-8 (tags/RELEASE_701/final)
  21 debian:experimental           : Ok   gcc (Debian 9.2.1-9) 9.2.1 20191008, clang version 8.0.1-3+b1 (tags/RELEASE_801/final)
  22 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.3.0-19) 8.3.0
  23 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 8.3.0-19) 8.3.0
  24 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 8.3.0-19) 8.3.0
  25 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 8.3.0-19) 8.3.0
  26 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
  27 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.5.0 (tags/RELEASE_350/final)
  28 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final)
  29 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final)
  30 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  31 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final)
  32 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final)
  33 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final)
  34 fedora:28                     : Ok   gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final)
  35 fedora:29                     : Ok   gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29)
  36 fedora:30                     : Ok   gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1), clang version 8.0.0 (Fedora 8.0.0-3.fc30)
  37 fedora:30-x-ARC-glibc         : Ok   arc-linux-gcc (ARC HS GNU/Linux glibc toolchain 2019.03-rc1) 8.3.1 20190225
  38 fedora:30-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
  39 fedora:31                     : Ok   gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1), clang version 9.0.0 (Fedora 9.0.0-1.fc31)
  40 fedora:32                     : Ok   gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1), clang version 9.0.0 (Fedora 9.0.0-1.fc32)
  41 fedora:rawhide                : Ok   gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1), clang version 9.0.0 (Fedora 9.0.0-1.fc32)
  42 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 8.3.0-r1 p1.1) 8.3.0
  43 mageia:5                      : Ok   gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final)
  44 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final)
  45 mageia:7                      : Ok   gcc (Mageia 8.3.1-0.20190524.1.mga7) 8.3.1 20190524, clang version 8.0.0 (Mageia 8.0.0-1.mga7)
  46 manjaro:latest                : Ok   gcc (GCC) 9.2.0, clang version 9.0.0 (tags/RELEASE_900/final)
  47 openmandriva:cooker           : FAIL gcc (GCC) 9.2.1 20191109 (OpenMandriva), clang version 9.0.1 
  48 opensuse:15.0                 : Ok   gcc (SUSE Linux) 7.4.1 20190424 [gcc-7-branch revision 270538], clang version 5.0.1 (tags/RELEASE_501/final 312548)
  49 opensuse:15.1                 : Ok   gcc (SUSE Linux) 7.4.1 20190905 [gcc-7-branch revision 275407], clang version 7.0.1 (tags/RELEASE_701/final 349238)
  50 opensuse:15.2                 : Ok   gcc (SUSE Linux) 7.4.1 20190905 [gcc-7-branch revision 275407], clang version 7.0.1 (tags/RELEASE_701/final 349238)
  51 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5, clang version 3.8.0 (tags/RELEASE_380/final 262553)
  52 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 9.2.1 20190903 [gcc-9-branch revision 275330], clang version 9.0.0 (tags/RELEASE_900/final 372316)
  53 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
  54 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39.0.1), clang version 3.4.2 (tags/RELEASE_34/dot2-final)
  55 oraclelinux:8                 : Ok   gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3.0.1), clang version 7.0.1 (tags/RELEASE_701/final)
  56 ubuntu:12.04                  : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
  57 ubuntu:14.04                  : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, Ubuntu clang version 3.4-1ubuntu3 (tags/RELEASE_34/final) (based on LLVM 3.4)
  58 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
  59 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  60 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  61 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  62 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  63 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  64 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  65 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
  66 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1) 7.4.0
  67 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1) 7.4.0
  68 ubuntu:18.04-x-m68k           : Ok   m68k-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  69 ubuntu:18.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  70 ubuntu:18.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  71 ubuntu:18.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  72 ubuntu:18.04-x-riscv64        : Ok   riscv64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  73 ubuntu:18.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  74 ubuntu:18.04-x-sh4            : Ok   sh4-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  75 ubuntu:18.04-x-sparc64        : Ok   sparc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  76 ubuntu:18.10                  : Ok   gcc (Ubuntu 8.3.0-6ubuntu1~18.10.1) 8.3.0, clang version 7.0.0-3 (tags/RELEASE_700/final)
  77 ubuntu:19.04                  : Ok   gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, clang version 8.0.0-3 (tags/RELEASE_800/final)
  78 ubuntu:19.04-x-alpha          : Ok   alpha-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
  79 ubuntu:19.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 8.3.0-6ubuntu1) 8.3.0
  80 ubuntu:19.04-x-hppa           : Ok   hppa-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
  81 ubuntu:19.10                  : Ok   gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, clang version 9.0.0-2 (tags/RELEASE_900/final)
  #

  # uname -a
  Linux quaco 5.4.0-rc8 #1 SMP Mon Nov 18 06:15:31 -03 2019 x86_64 x86_64 x86_64 GNU/Linux
  # git log --oneline -1
  4584f084aa9d perf parse: Fix potential memory leak when handling tracepoint errors
  # perf version --build-options
  perf version 5.4.rc7.g4584f084aa9d
                   dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                   glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                    gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
           syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                  libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
               libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
               get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
                     aio: [ on  ]  # HAVE_AIO_SUPPORT
                    zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Breakpoint accounting                                 : Ok
  22: Watchpoint                                            :
  22.1: Read Only Watchpoint                                : Skip
  22.2: Write Only Watchpoint                               : Ok
  22.3: Read / Write Watchpoint                             : Ok
  22.4: Modify Watchpoint                                   : Ok
  23: Number of exit events of a simple workload            : Ok
  24: Software clock events period values                   : Ok
  25: Object code reading                                   : Ok
  26: Sample parsing                                        : Ok
  27: Use a dummy software event to keep tracking           : Ok
  28: Parse with no sample_id_all bit set                   : Ok
  29: Filter hist entries                                   : Ok
  30: Lookup mmap thread                                    : Ok
  31: Share thread mg                                       : Ok
  32: Sort output of hist entries                           : Ok
  33: Cumulate child hist entries                           : Ok
  34: Track with sched_switch                               : Ok
  35: Filter fds with revents mask in a fdarray             : Ok
  36: Add fd to a fdarray, making it autogrow               : Ok
  37: kmod_path__parse                                      : Ok
  38: Thread map                                            : Ok
  39: LLVM search and compile                               :
  39.1: Basic BPF llvm compile                              : Ok
  39.2: kbuild searching                                    : Ok
  39.3: Compile source for BPF prologue generation          : Ok
  39.4: Compile source for BPF relocation                   : Ok
  40: Session topology                                      : Ok
  41: BPF filter                                            :
  41.1: Basic BPF filtering                                 : Ok
  41.2: BPF pinning                                         : Ok
  41.3: BPF prologue generation                             : Ok
  41.4: BPF relocation checker                              : Ok
  42: Synthesize thread map                                 : Ok
  43: Remove thread map                                     : Ok
  44: Synthesize cpu map                                    : Ok
  45: Synthesize stat config                                : Ok
  46: Synthesize stat                                       : Ok
  47: Synthesize stat round                                 : Ok
  48: Synthesize attr update                                : Ok
  49: Event times                                           : Ok
  50: Read backward ring buffer                             : Ok
  51: Print cpu map                                         : Ok
  52: Probe SDT events                                      : Ok
  53: is_printable_array                                    : Ok
  54: Print bitmap                                          : Ok
  55: perf hooks                                            : Ok
  56: builtin clang support                                 : Skip (not compiled in)
  57: unit_number__scnprintf                                : Ok
  58: mem2node                                              : Ok
  59: time utils                                            : Ok
  60: map_groups__merge_in                                  : Ok
  61: x86 rdpmc                                             : Ok
  62: Convert perf time to TSC                              : Ok
  63: DWARF unwind                                          : Ok
  64: x86 instruction decoder - new instructions            : Ok
  65: Intel PT packet decoder                               : Ok
  66: x86 bp modify                                         : Ok
  67: probe libc's inet_pton & backtrace it with ping       : Ok
  68: Use vfs_getname probe to get syscall args filenames   : Ok
  69: Add vfs_getname probe to get syscall args filenames   : Ok
  70: Check open filename arg using perf trace + vfs_getname: Ok
  71: Zstd perf.data compression/decompression              : Ok

  $ make -C tools/perf build-test 
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
               make_no_slang_O: make NO_SLANG=1
                make_no_gtk2_O: make NO_GTK2=1
                 make_perf_o_O: make perf.o
                make_install_O: make install
             make_no_libnuma_O: make NO_LIBNUMA=1
              make_no_libbpf_O: make NO_LIBBPF=1
           make_no_backtrace_O: make NO_BACKTRACE=1
              make_no_libelf_O: make NO_LIBELF=1
                   make_pure_O: make
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
           make_no_libunwind_O: make NO_LIBUNWIND=1
                   make_help_O: make help
        make_with_babeltrace_O: make LIBBABELTRACE=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
           make_no_libbionic_O: make NO_LIBBIONIC=1
           make_no_libpython_O: make NO_LIBPYTHON=1
                 make_cscope_O: make cscope
                make_no_newt_O: make NO_NEWT=1
                 make_static_O: make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1
            make_no_demangle_O: make NO_DEMANGLE=1
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
         make_with_clangllvm_O: make LIBCLANGLLVM=1
                  make_debug_O: make DEBUG=1
              make_clean_all_O: make clean all
             make_no_libperl_O: make NO_LIBPERL=1
            make_install_bin_O: make install-bin
       make_util_pmu_bison_o_O: make util/pmu-bison.o
                    make_doc_O: make doc
         make_install_prefix_O: make install prefix=/tmp/krava
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
            make_no_auxtrace_O: make NO_AUXTRACE=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1
             make_util_map_o_O: make util/map.o
                   make_tags_O: make tags
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 01/26] perf map: Move maj/min/ino/ino_generation to separate struct
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 02/26] perf map: Pass a dso_id to map__new() Arnaldo Carvalho de Melo
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

And this patch highlights where these fields are being used: in the sort
order where it uses it to compare maps and classify samples taking into
account not just the DSO, but those DSO id fields.

I think these should be used to differentiate DSOs with the same name
but different 'struct dso_id' fields, i.e. these fields should move to
'struct dso' and then be used as part of the key when doing lookups for
DSOs, in addition to the DSO name.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8v5isitqy0dup47nnwkpc80f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c |  2 +-
 tools/perf/util/map.c       |  8 ++++----
 tools/perf/util/map.h       | 14 +++++++++++---
 tools/perf/util/sort.c      | 24 ++++++++++++------------
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 585805f51f15..04c197d3beea 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -771,7 +771,7 @@ static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp)
 				   map->prot & PROT_EXEC ? 'x' : '-',
 				   map->flags & MAP_SHARED ? 's' : 'p',
 				   map->pgoff,
-				   map->ino, map->dso->name);
+				   map->dso_id.ino, map->dso->name);
 	}
 
 	return printed;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 67e0f81416cb..4f50b1b2961f 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -162,10 +162,10 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 		vdso = is_vdso_map(filename);
 		no_dso = is_no_dso_memory(filename);
 
-		map->maj = d_maj;
-		map->min = d_min;
-		map->ino = ino;
-		map->ino_generation = ino_gen;
+		map->dso_id.maj = d_maj;
+		map->dso_id.min = d_min;
+		map->dso_id.ino = ino;
+		map->dso_id.ino_generation = ino_gen;
 		map->prot = prot;
 		map->flags = flags;
 		nsi = nsinfo__get(thread->nsinfo);
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 0a6c45f85cd9..70d87dcbe35d 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -18,6 +18,16 @@ struct map_groups;
 struct machine;
 struct evsel;
 
+/*
+ * Data about backing storage DSO, comes from PERF_RECORD_MMAP2 meta events
+ */
+struct dso_id {
+	u32	maj;
+	u32	min;
+	u64	ino;
+	u64	ino_generation;
+};
+
 struct map {
 	union {
 		struct rb_node	rb_node;
@@ -30,9 +40,6 @@ struct map {
 	u32			prot;
 	u64			pgoff;
 	u64			reloc;
-	u32			maj, min; /* only valid for MMAP2 record */
-	u64			ino;      /* only valid for MMAP2 record */
-	u64			ino_generation;/* only valid for MMAP2 record */
 
 	/* ip -> dso rip */
 	u64			(*map_ip)(struct map *, u64);
@@ -40,6 +47,7 @@ struct map {
 	u64			(*unmap_ip)(struct map *, u64);
 
 	struct dso		*dso;
+	struct dso_id		dso_id;
 	refcount_t		refcnt;
 	u32			flags;
 };
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 6b626e6b111e..bc589438cd12 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1212,17 +1212,17 @@ sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right)
 	if (!l_map) return -1;
 	if (!r_map) return 1;
 
-	if (l_map->maj > r_map->maj) return -1;
-	if (l_map->maj < r_map->maj) return 1;
+	if (l_map->dso_id.maj > r_map->dso_id.maj) return -1;
+	if (l_map->dso_id.maj < r_map->dso_id.maj) return 1;
 
-	if (l_map->min > r_map->min) return -1;
-	if (l_map->min < r_map->min) return 1;
+	if (l_map->dso_id.min > r_map->dso_id.min) return -1;
+	if (l_map->dso_id.min < r_map->dso_id.min) return 1;
 
-	if (l_map->ino > r_map->ino) return -1;
-	if (l_map->ino < r_map->ino) return 1;
+	if (l_map->dso_id.ino > r_map->dso_id.ino) return -1;
+	if (l_map->dso_id.ino < r_map->dso_id.ino) return 1;
 
-	if (l_map->ino_generation > r_map->ino_generation) return -1;
-	if (l_map->ino_generation < r_map->ino_generation) return 1;
+	if (l_map->dso_id.ino_generation > r_map->dso_id.ino_generation) return -1;
+	if (l_map->dso_id.ino_generation < r_map->dso_id.ino_generation) return 1;
 
 	/*
 	 * Addresses with no major/minor numbers are assumed to be
@@ -1234,8 +1234,8 @@ sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right)
 
 	if ((left->cpumode != PERF_RECORD_MISC_KERNEL) &&
 	    (!(l_map->flags & MAP_SHARED)) &&
-	    !l_map->maj && !l_map->min && !l_map->ino &&
-	    !l_map->ino_generation) {
+	    !l_map->dso_id.maj && !l_map->dso_id.min &&
+	    !l_map->dso_id.ino && !l_map->dso_id.ino_generation) {
 		/* userspace anonymous */
 
 		if (left->thread->pid_ > right->thread->pid_) return -1;
@@ -1271,8 +1271,8 @@ static int hist_entry__dcacheline_snprintf(struct hist_entry *he, char *bf,
 		if ((he->cpumode != PERF_RECORD_MISC_KERNEL) &&
 		     map && !(map->prot & PROT_EXEC) &&
 		    (map->flags & MAP_SHARED) &&
-		    (map->maj || map->min || map->ino ||
-		     map->ino_generation))
+		    (map->dso_id.maj || map->dso_id.min ||
+		     map->dso_id.ino || map->dso_id.ino_generation))
 			level = 's';
 		else if (!map)
 			level = 'X';
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 02/26] perf map: Pass a dso_id to map__new()
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 01/26] perf map: Move maj/min/ino/ino_generation to separate struct Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 03/26] perf map: Move comparision of map's dso_id to a separate function Arnaldo Carvalho de Melo
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Instead of the 4 fields, a step in the direction of moving this to
struct dso.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-gp5s1xgxacurmih5d1l94ymy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/machine.c | 15 ++++++++-------
 tools/perf/util/map.c     | 13 +++++++------
 tools/perf/util/map.h     |  3 +--
 3 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 71ee078d30f4..41b4263c073d 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1651,6 +1651,12 @@ int machine__process_mmap2_event(struct machine *machine,
 {
 	struct thread *thread;
 	struct map *map;
+	struct dso_id dso_id = {
+		.maj = event->mmap2.maj,
+		.min = event->mmap2.min,
+		.ino = event->mmap2.ino,
+		.ino_generation = event->mmap2.ino_generation,
+	};
 	int ret = 0;
 
 	if (dump_trace)
@@ -1671,10 +1677,7 @@ int machine__process_mmap2_event(struct machine *machine,
 
 	map = map__new(machine, event->mmap2.start,
 			event->mmap2.len, event->mmap2.pgoff,
-			event->mmap2.maj,
-			event->mmap2.min, event->mmap2.ino,
-			event->mmap2.ino_generation,
-			event->mmap2.prot,
+			&dso_id, event->mmap2.prot,
 			event->mmap2.flags,
 			event->mmap2.filename, thread);
 
@@ -1727,9 +1730,7 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
 
 	map = map__new(machine, event->mmap.start,
 			event->mmap.len, event->mmap.pgoff,
-			0, 0, 0, 0, prot, 0,
-			event->mmap.filename,
-			thread);
+			NULL, prot, 0, event->mmap.filename, thread);
 
 	if (map == NULL)
 		goto out_problem_map;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 4f50b1b2961f..812d663ebb57 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -144,8 +144,8 @@ void map__init(struct map *map, u64 start, u64 end, u64 pgoff, struct dso *dso)
 }
 
 struct map *map__new(struct machine *machine, u64 start, u64 len,
-		     u64 pgoff, u32 d_maj, u32 d_min, u64 ino,
-		     u64 ino_gen, u32 prot, u32 flags, char *filename,
+		     u64 pgoff, struct dso_id *id,
+		     u32 prot, u32 flags, char *filename,
 		     struct thread *thread)
 {
 	struct map *map = malloc(sizeof(*map));
@@ -162,10 +162,11 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 		vdso = is_vdso_map(filename);
 		no_dso = is_no_dso_memory(filename);
 
-		map->dso_id.maj = d_maj;
-		map->dso_id.min = d_min;
-		map->dso_id.ino = ino;
-		map->dso_id.ino_generation = ino_gen;
+		if (id)
+			map->dso_id = *id;
+		else
+			map->dso_id.min = map->dso_id.ino = map->dso_id.ino_generation = 0;
+
 		map->prot = prot;
 		map->flags = flags;
 		nsi = nsinfo__get(thread->nsinfo);
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 70d87dcbe35d..f962eb9035c7 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -117,8 +117,7 @@ struct thread;
 void map__init(struct map *map,
 	       u64 start, u64 end, u64 pgoff, struct dso *dso);
 struct map *map__new(struct machine *machine, u64 start, u64 len,
-		     u64 pgoff, u32 d_maj, u32 d_min, u64 ino,
-		     u64 ino_gen, u32 prot, u32 flags,
+		     u64 pgoff, struct dso_id *id, u32 prot, u32 flags,
 		     char *filename, struct thread *thread);
 struct map *map__new2(u64 start, struct dso *dso);
 void map__delete(struct map *map);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 03/26] perf map: Move comparision of map's dso_id to a separate function
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 01/26] perf map: Move maj/min/ino/ino_generation to separate struct Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 02/26] perf map: Pass a dso_id to map__new() Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 04/26] perf dsos: Remove unused dsos__find() method Arnaldo Carvalho de Melo
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We'll use it when doing DSO lookups using dso_ids.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-u2nr1oq03o0i29w2ay9jx03s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/dsos.c | 25 +++++++++++++++++++++++++
 tools/perf/util/map.h  |  2 ++
 tools/perf/util/sort.c | 16 ++++------------
 3 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index 3ea80d203587..ecf8d7346685 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -2,6 +2,7 @@
 #include "debug.h"
 #include "dsos.h"
 #include "dso.h"
+#include "map.h"
 #include "vdso.h"
 #include "namespaces.h"
 #include <libgen.h>
@@ -9,6 +10,30 @@
 #include <string.h>
 #include <symbol.h> // filename__read_build_id
 
+int dso_id__cmp(struct dso_id *a, struct dso_id *b)
+{
+	/*
+	 * The second is always dso->id, so zeroes if not set, assume passing
+	 * NULL for a means a zeroed id
+	 */
+	if (a == NULL)
+		return 0;
+
+	if (a->maj > b->maj) return -1;
+	if (a->maj < b->maj) return 1;
+
+	if (a->min > b->min) return -1;
+	if (a->min < b->min) return 1;
+
+	if (a->ino > b->ino) return -1;
+	if (a->ino < b->ino) return 1;
+
+	if (a->ino_generation > b->ino_generation) return -1;
+	if (a->ino_generation < b->ino_generation) return 1;
+
+	return 0;
+}
+
 bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
 {
 	bool have_build_id = false;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index f962eb9035c7..e1e573a28a55 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -28,6 +28,8 @@ struct dso_id {
 	u64	ino_generation;
 };
 
+int dso_id__cmp(struct dso_id *a, struct dso_id *b);
+
 struct map {
 	union {
 		struct rb_node	rb_node;
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index bc589438cd12..f1481002fafb 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1194,6 +1194,7 @@ sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right)
 {
 	u64 l, r;
 	struct map *l_map, *r_map;
+	int rc;
 
 	if (!left->mem_info)  return -1;
 	if (!right->mem_info) return 1;
@@ -1212,18 +1213,9 @@ sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right)
 	if (!l_map) return -1;
 	if (!r_map) return 1;
 
-	if (l_map->dso_id.maj > r_map->dso_id.maj) return -1;
-	if (l_map->dso_id.maj < r_map->dso_id.maj) return 1;
-
-	if (l_map->dso_id.min > r_map->dso_id.min) return -1;
-	if (l_map->dso_id.min < r_map->dso_id.min) return 1;
-
-	if (l_map->dso_id.ino > r_map->dso_id.ino) return -1;
-	if (l_map->dso_id.ino < r_map->dso_id.ino) return 1;
-
-	if (l_map->dso_id.ino_generation > r_map->dso_id.ino_generation) return -1;
-	if (l_map->dso_id.ino_generation < r_map->dso_id.ino_generation) return 1;
-
+	rc = dso_id__cmp(&l_map->dso_id, &r_map->dso_id);
+	if (rc)
+		return rc;
 	/*
 	 * Addresses with no major/minor numbers are assumed to be
 	 * anonymous in userspace.  Sort those on pid then address.
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 04/26] perf dsos: Remove unused dsos__find() method
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 03/26] perf map: Move comparision of map's dso_id to a separate function Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 05/26] perf dso: Move dso_id from 'struct map' to 'struct dso' Arnaldo Carvalho de Melo
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Not used anywhere, nuke it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-teqz0eqcw43mnt7i3me44esw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/dsos.c | 9 ---------
 tools/perf/util/dsos.h | 1 -
 2 files changed, 10 deletions(-)

diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index ecf8d7346685..1d38d6ac6e5a 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -159,15 +159,6 @@ struct dso *__dsos__find(struct dsos *dsos, const char *name, bool cmp_short)
 	return __dsos__findnew_by_longname(&dsos->root, name);
 }
 
-struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short)
-{
-	struct dso *dso;
-	down_read(&dsos->lock);
-	dso = __dsos__find(dsos, name, cmp_short);
-	up_read(&dsos->lock);
-	return dso;
-}
-
 static void dso__set_basename(struct dso *dso)
 {
 	char *base, *lname;
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index 32f1fbee0feb..fd7ba51fc965 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -24,7 +24,6 @@ void __dsos__add(struct dsos *dsos, struct dso *dso);
 void dsos__add(struct dsos *dsos, struct dso *dso);
 struct dso *__dsos__addnew(struct dsos *dsos, const char *name);
 struct dso *__dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
-struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
 struct dso *__dsos__findnew(struct dsos *dsos, const char *name);
 struct dso *dsos__findnew(struct dsos *dsos, const char *name);
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 05/26] perf dso: Move dso_id from 'struct map' to 'struct dso'
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 04/26] perf dsos: Remove unused dsos__find() method Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 06/26] perf session: Fix decompression of PERF_RECORD_COMPRESSED records Arnaldo Carvalho de Melo
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen

From: Arnaldo Carvalho de Melo <acme@redhat.com>

And take it into account when looking up DSOs when we have the dso_id
fields obtained from somewhere, like from PERF_RECORD_MMAP2 records.

Instances of struct map pointing to the same DSO pathname but with
anything in dso_id different are in fact different DSOs, so better have
different 'struct dso' instances to reflect that. At some point we may
want to get copies of the contents of the different objects if we want
to do correct annotation or other analysis.

With this we get 'struct map' 24 bytes leaner:

  $ pahole -C map ~/bin/perf
  struct map {
  	union {
  		struct rb_node     rb_node __attribute__((__aligned__(8))); /*     0    24 */
  		struct list_head   node;                 /*     0    16 */
  	} __attribute__((__aligned__(8)));               /*     0    24 */
  	u64                        start;                /*    24     8 */
  	u64                        end;                  /*    32     8 */
  	_Bool                      erange_warned:1;      /*    40: 0  1 */
  	_Bool                      priv:1;               /*    40: 1  1 */

  	/* XXX 6 bits hole, try to pack */
  	/* XXX 3 bytes hole, try to pack */

  	u32                        prot;                 /*    44     4 */
  	u64                        pgoff;                /*    48     8 */
  	u64                        reloc;                /*    56     8 */
  	/* --- cacheline 1 boundary (64 bytes) --- */
  	u64                        (*map_ip)(struct map *, u64); /*    64     8 */
  	u64                        (*unmap_ip)(struct map *, u64); /*    72     8 */
  	struct dso *               dso;                  /*    80     8 */
  	refcount_t                 refcnt;               /*    88     4 */
  	u32                        flags;                /*    92     4 */

  	/* size: 96, cachelines: 2, members: 13 */
  	/* sum members: 92, holes: 1, sum holes: 3 */
  	/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
  	/* forced alignments: 1 */
  	/* last cacheline: 32 bytes */
  } __attribute__((__aligned__(8)));
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-g4hxxmraplo7wfjmk384mfsb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c |  2 +-
 tools/perf/util/dso.c       | 24 +++++++---
 tools/perf/util/dso.h       | 13 ++++++
 tools/perf/util/dsos.c      | 87 +++++++++++++++++++++++++++----------
 tools/perf/util/dsos.h      | 13 +++---
 tools/perf/util/machine.c   |  7 ++-
 tools/perf/util/machine.h   |  2 +
 tools/perf/util/map.c       |  8 +---
 tools/perf/util/map.h       | 16 ++-----
 tools/perf/util/sort.c      | 10 ++---
 10 files changed, 118 insertions(+), 64 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 04c197d3beea..0b6157c02c88 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -771,7 +771,7 @@ static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp)
 				   map->prot & PROT_EXEC ? 'x' : '-',
 				   map->flags & MAP_SHARED ? 's' : 'p',
 				   map->pgoff,
-				   map->dso_id.ino, map->dso->name);
+				   map->dso->id.ino, map->dso->name);
 	}
 
 	return printed;
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 0f1b77275a86..91f21239608b 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1149,7 +1149,7 @@ struct dso *machine__findnew_kernel(struct machine *machine, const char *name,
 	return dso;
 }
 
-void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
+static void dso__set_long_name_id(struct dso *dso, const char *name, struct dso_id *id, bool name_allocated)
 {
 	struct rb_root *root = dso->root;
 
@@ -1162,8 +1162,8 @@ void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
 	if (root) {
 		rb_erase(&dso->rb_node, root);
 		/*
-		 * __dsos__findnew_link_by_longname() isn't guaranteed to add it
-		 * back, so a clean removal is required here.
+		 * __dsos__findnew_link_by_longname_id() isn't guaranteed to
+		 * add it back, so a clean removal is required here.
 		 */
 		RB_CLEAR_NODE(&dso->rb_node);
 		dso->root = NULL;
@@ -1174,7 +1174,12 @@ void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
 	dso->long_name_allocated = name_allocated;
 
 	if (root)
-		__dsos__findnew_link_by_longname(root, dso, NULL);
+		__dsos__findnew_link_by_longname_id(root, dso, NULL, id);
+}
+
+void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
+{
+	dso__set_long_name_id(dso, name, NULL, name_allocated);
 }
 
 void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated)
@@ -1215,13 +1220,15 @@ void dso__set_sorted_by_name(struct dso *dso)
 	dso->sorted_by_name = true;
 }
 
-struct dso *dso__new(const char *name)
+struct dso *dso__new_id(const char *name, struct dso_id *id)
 {
 	struct dso *dso = calloc(1, sizeof(*dso) + strlen(name) + 1);
 
 	if (dso != NULL) {
 		strcpy(dso->name, name);
-		dso__set_long_name(dso, dso->name, false);
+		if (id)
+			dso->id = *id;
+		dso__set_long_name_id(dso, dso->name, id, false);
 		dso__set_short_name(dso, dso->name, false);
 		dso->symbols = dso->symbol_names = RB_ROOT_CACHED;
 		dso->data.cache = RB_ROOT;
@@ -1252,6 +1259,11 @@ struct dso *dso__new(const char *name)
 	return dso;
 }
 
+struct dso *dso__new(const char *name)
+{
+	return dso__new_id(name, NULL);
+}
+
 void dso__delete(struct dso *dso)
 {
 	if (!RB_EMPTY_NODE(&dso->rb_node))
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 2f1fcbc6fead..2db64b79617a 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -122,6 +122,16 @@ enum dso_load_errno {
 #define DSO__DATA_CACHE_SIZE 4096
 #define DSO__DATA_CACHE_MASK ~(DSO__DATA_CACHE_SIZE - 1)
 
+/*
+ * Data about backing storage DSO, comes from PERF_RECORD_MMAP2 meta events
+ */
+struct dso_id {
+	u32	maj;
+	u32	min;
+	u64	ino;
+	u64	ino_generation;
+};
+
 struct dso_cache {
 	struct rb_node	rb_node;
 	u64 offset;
@@ -196,6 +206,7 @@ struct dso {
 		u64	 db_id;
 	};
 	struct nsinfo	*nsinfo;
+	struct dso_id	 id;
 	refcount_t	 refcnt;
 	char		 name[0];
 };
@@ -214,9 +225,11 @@ static inline void dso__set_loaded(struct dso *dso)
 	dso->loaded = true;
 }
 
+struct dso *dso__new_id(const char *name, struct dso_id *id);
 struct dso *dso__new(const char *name);
 void dso__delete(struct dso *dso);
 
+int dso__cmp_id(struct dso *a, struct dso *b);
 void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated);
 void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated);
 
diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index 1d38d6ac6e5a..591707c69c39 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -2,7 +2,6 @@
 #include "debug.h"
 #include "dsos.h"
 #include "dso.h"
-#include "map.h"
 #include "vdso.h"
 #include "namespaces.h"
 #include <libgen.h>
@@ -10,15 +9,8 @@
 #include <string.h>
 #include <symbol.h> // filename__read_build_id
 
-int dso_id__cmp(struct dso_id *a, struct dso_id *b)
+static int __dso_id__cmp(struct dso_id *a, struct dso_id *b)
 {
-	/*
-	 * The second is always dso->id, so zeroes if not set, assume passing
-	 * NULL for a means a zeroed id
-	 */
-	if (a == NULL)
-		return 0;
-
 	if (a->maj > b->maj) return -1;
 	if (a->maj < b->maj) return 1;
 
@@ -34,6 +26,23 @@ int dso_id__cmp(struct dso_id *a, struct dso_id *b)
 	return 0;
 }
 
+static int dso_id__cmp(struct dso_id *a, struct dso_id *b)
+{
+	/*
+	 * The second is always dso->id, so zeroes if not set, assume passing
+	 * NULL for a means a zeroed id
+	 */
+	if (a == NULL)
+		return 0;
+
+	return __dso_id__cmp(a, b);
+}
+
+int dso__cmp_id(struct dso *a, struct dso *b)
+{
+	return __dso_id__cmp(&a->id, &b->id);
+}
+
 bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
 {
 	bool have_build_id = false;
@@ -59,12 +68,30 @@ bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
 	return have_build_id;
 }
 
+static int __dso__cmp_long_name(const char *long_name, struct dso_id *id, struct dso *b)
+{
+	int rc = strcmp(long_name, b->long_name);
+	return rc ?: dso_id__cmp(id, &b->id);
+}
+
+static int __dso__cmp_short_name(const char *short_name, struct dso_id *id, struct dso *b)
+{
+	int rc = strcmp(short_name, b->short_name);
+	return rc ?: dso_id__cmp(id, &b->id);
+}
+
+static int dso__cmp_short_name(struct dso *a, struct dso *b)
+{
+	return __dso__cmp_short_name(a->short_name, &a->id, b);
+}
+
 /*
  * Find a matching entry and/or link current entry to RB tree.
  * Either one of the dso or name parameter must be non-NULL or the
  * function will not work.
  */
-struct dso *__dsos__findnew_link_by_longname(struct rb_root *root, struct dso *dso, const char *name)
+struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso *dso,
+						const char *name, struct dso_id *id)
 {
 	struct rb_node **p = &root->rb_node;
 	struct rb_node  *parent = NULL;
@@ -76,7 +103,7 @@ struct dso *__dsos__findnew_link_by_longname(struct rb_root *root, struct dso *d
 	 */
 	while (*p) {
 		struct dso *this = rb_entry(*p, struct dso, rb_node);
-		int rc = strcmp(name, this->long_name);
+		int rc = __dso__cmp_long_name(name, id, this);
 
 		parent = *p;
 		if (rc == 0) {
@@ -92,7 +119,7 @@ struct dso *__dsos__findnew_link_by_longname(struct rb_root *root, struct dso *d
 			 * In this case, the short name should be different.
 			 * Comparing the short names to differentiate the DSOs.
 			 */
-			rc = strcmp(dso->short_name, this->short_name);
+			rc = dso__cmp_short_name(dso, this);
 			if (rc == 0) {
 				pr_err("Duplicated dso name: %s\n", name);
 				return NULL;
@@ -115,7 +142,7 @@ struct dso *__dsos__findnew_link_by_longname(struct rb_root *root, struct dso *d
 void __dsos__add(struct dsos *dsos, struct dso *dso)
 {
 	list_add_tail(&dso->node, &dsos->head);
-	__dsos__findnew_link_by_longname(&dsos->root, dso, NULL);
+	__dsos__findnew_link_by_longname_id(&dsos->root, dso, NULL, &dso->id);
 	/*
 	 * It is now in the linked list, grab a reference, then garbage collect
 	 * this when needing memory, by looking at LRU dso instances in the
@@ -146,17 +173,27 @@ void dsos__add(struct dsos *dsos, struct dso *dso)
 	up_write(&dsos->lock);
 }
 
-struct dso *__dsos__find(struct dsos *dsos, const char *name, bool cmp_short)
+static struct dso *__dsos__findnew_by_longname_id(struct rb_root *root, const char *name, struct dso_id *id)
+{
+	return __dsos__findnew_link_by_longname_id(root, NULL, name, id);
+}
+
+static struct dso *__dsos__find_id(struct dsos *dsos, const char *name, struct dso_id *id, bool cmp_short)
 {
 	struct dso *pos;
 
 	if (cmp_short) {
 		list_for_each_entry(pos, &dsos->head, node)
-			if (strcmp(pos->short_name, name) == 0)
+			if (__dso__cmp_short_name(name, id, pos) == 0)
 				return pos;
 		return NULL;
 	}
-	return __dsos__findnew_by_longname(&dsos->root, name);
+	return __dsos__findnew_by_longname_id(&dsos->root, name, id);
+}
+
+struct dso *__dsos__find(struct dsos *dsos, const char *name, bool cmp_short)
+{
+	return __dsos__find_id(dsos, name, NULL, cmp_short);
 }
 
 static void dso__set_basename(struct dso *dso)
@@ -191,9 +228,9 @@ static void dso__set_basename(struct dso *dso)
 	dso__set_short_name(dso, base, true);
 }
 
-struct dso *__dsos__addnew(struct dsos *dsos, const char *name)
+static struct dso *__dsos__addnew_id(struct dsos *dsos, const char *name, struct dso_id *id)
 {
-	struct dso *dso = dso__new(name);
+	struct dso *dso = dso__new_id(name, id);
 
 	if (dso != NULL) {
 		__dsos__add(dsos, dso);
@@ -204,18 +241,22 @@ struct dso *__dsos__addnew(struct dsos *dsos, const char *name)
 	return dso;
 }
 
-struct dso *__dsos__findnew(struct dsos *dsos, const char *name)
+struct dso *__dsos__addnew(struct dsos *dsos, const char *name)
 {
-	struct dso *dso = __dsos__find(dsos, name, false);
+	return __dsos__addnew_id(dsos, name, NULL);
+}
 
-	return dso ? dso : __dsos__addnew(dsos, name);
+static struct dso *__dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id)
+{
+	struct dso *dso = __dsos__find_id(dsos, name, id, false);
+	return dso ? dso : __dsos__addnew_id(dsos, name, id);
 }
 
-struct dso *dsos__findnew(struct dsos *dsos, const char *name)
+struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id)
 {
 	struct dso *dso;
 	down_write(&dsos->lock);
-	dso = dso__get(__dsos__findnew(dsos, name));
+	dso = dso__get(__dsos__findnew_id(dsos, name, id));
 	up_write(&dsos->lock);
 	return dso;
 }
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index fd7ba51fc965..5dbec2bc6966 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -9,6 +9,7 @@
 #include "rwsem.h"
 
 struct dso;
+struct dso_id;
 
 /*
  * DSOs are put into both a list for fast iteration and rbtree for fast
@@ -24,15 +25,11 @@ void __dsos__add(struct dsos *dsos, struct dso *dso);
 void dsos__add(struct dsos *dsos, struct dso *dso);
 struct dso *__dsos__addnew(struct dsos *dsos, const char *name);
 struct dso *__dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
-struct dso *__dsos__findnew(struct dsos *dsos, const char *name);
-struct dso *dsos__findnew(struct dsos *dsos, const char *name);
 
-struct dso *__dsos__findnew_link_by_longname(struct rb_root *root, struct dso *dso, const char *name);
-
-static inline struct dso *__dsos__findnew_by_longname(struct rb_root *root, const char *name)
-{
-	return __dsos__findnew_link_by_longname(root, NULL, name);
-}
+struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id);
+ 
+struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso *dso,
+						const char *name, struct dso_id *id);
 
 bool __dsos__read_build_ids(struct list_head *head, bool with_hits);
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 41b4263c073d..e2a312c649f0 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2711,9 +2711,14 @@ u8 machine__addr_cpumode(struct machine *machine, u8 cpumode, u64 addr)
 	return addr_cpumode;
 }
 
+struct dso *machine__findnew_dso_id(struct machine *machine, const char *filename, struct dso_id *id)
+{
+	return dsos__findnew_id(&machine->dsos, filename, id);
+}
+
 struct dso *machine__findnew_dso(struct machine *machine, const char *filename)
 {
-	return dsos__findnew(&machine->dsos, filename);
+	return machine__findnew_dso_id(machine, filename, NULL);
 }
 
 char *machine__resolve_kernel_addr(void *vmachine, unsigned long long *addrp, char **modp)
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 1016978f575a..499be204830d 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -11,6 +11,7 @@
 struct addr_location;
 struct branch_stack;
 struct dso;
+struct dso_id;
 struct evsel;
 struct perf_sample;
 struct symbol;
@@ -202,6 +203,7 @@ int machine__nr_cpus_avail(struct machine *machine);
 struct thread *__machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid);
 struct thread *machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid);
 
+struct dso *machine__findnew_dso_id(struct machine *machine, const char *filename, struct dso_id *id);
 struct dso *machine__findnew_dso(struct machine *machine, const char *filename);
 
 size_t machine__fprintf(struct machine *machine, FILE *fp);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 812d663ebb57..744bfbaf35cf 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -161,12 +161,6 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 		anon = is_anon_memory(filename, flags);
 		vdso = is_vdso_map(filename);
 		no_dso = is_no_dso_memory(filename);
-
-		if (id)
-			map->dso_id = *id;
-		else
-			map->dso_id.min = map->dso_id.ino = map->dso_id.ino_generation = 0;
-
 		map->prot = prot;
 		map->flags = flags;
 		nsi = nsinfo__get(thread->nsinfo);
@@ -196,7 +190,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 			pgoff = 0;
 			dso = machine__findnew_vdso(machine, thread);
 		} else
-			dso = machine__findnew_dso(machine, filename);
+			dso = machine__findnew_dso_id(machine, filename, id);
 
 		if (dso == NULL)
 			goto out_delete;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index e1e573a28a55..5e8899883231 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -18,18 +18,6 @@ struct map_groups;
 struct machine;
 struct evsel;
 
-/*
- * Data about backing storage DSO, comes from PERF_RECORD_MMAP2 meta events
- */
-struct dso_id {
-	u32	maj;
-	u32	min;
-	u64	ino;
-	u64	ino_generation;
-};
-
-int dso_id__cmp(struct dso_id *a, struct dso_id *b);
-
 struct map {
 	union {
 		struct rb_node	rb_node;
@@ -49,7 +37,6 @@ struct map {
 	u64			(*unmap_ip)(struct map *, u64);
 
 	struct dso		*dso;
-	struct dso_id		dso_id;
 	refcount_t		refcnt;
 	u32			flags;
 };
@@ -118,6 +105,9 @@ struct thread;
 
 void map__init(struct map *map,
 	       u64 start, u64 end, u64 pgoff, struct dso *dso);
+
+struct dso_id;
+
 struct map *map__new(struct machine *machine, u64 start, u64 len,
 		     u64 pgoff, struct dso_id *id, u32 prot, u32 flags,
 		     char *filename, struct thread *thread);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index f1481002fafb..345b5ccc90f6 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1213,7 +1213,7 @@ sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right)
 	if (!l_map) return -1;
 	if (!r_map) return 1;
 
-	rc = dso_id__cmp(&l_map->dso_id, &r_map->dso_id);
+	rc = dso__cmp_id(l_map->dso, r_map->dso);
 	if (rc)
 		return rc;
 	/*
@@ -1226,8 +1226,8 @@ sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right)
 
 	if ((left->cpumode != PERF_RECORD_MISC_KERNEL) &&
 	    (!(l_map->flags & MAP_SHARED)) &&
-	    !l_map->dso_id.maj && !l_map->dso_id.min &&
-	    !l_map->dso_id.ino && !l_map->dso_id.ino_generation) {
+	    !l_map->dso->id.maj && !l_map->dso->id.min &&
+	    !l_map->dso->id.ino && !l_map->dso->id.ino_generation) {
 		/* userspace anonymous */
 
 		if (left->thread->pid_ > right->thread->pid_) return -1;
@@ -1263,8 +1263,8 @@ static int hist_entry__dcacheline_snprintf(struct hist_entry *he, char *bf,
 		if ((he->cpumode != PERF_RECORD_MISC_KERNEL) &&
 		     map && !(map->prot & PROT_EXEC) &&
 		    (map->flags & MAP_SHARED) &&
-		    (map->dso_id.maj || map->dso_id.min ||
-		     map->dso_id.ino || map->dso_id.ino_generation))
+		    (map->dso->id.maj || map->dso->id.min ||
+		     map->dso->id.ino || map->dso->id.ino_generation))
 			level = 's';
 		else if (!map)
 			level = 'X';
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 06/26] perf session: Fix decompression of PERF_RECORD_COMPRESSED records
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 05/26] perf dso: Move dso_id from 'struct map' to 'struct dso' Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 07/26] perf util: Move block TUI function to ui browsers Arnaldo Carvalho de Melo
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Alexey Budankov, Alexander Shishkin,
	Andi Kleen, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Alexey Budankov <alexey.budankov@linux.intel.com>

Avoid termination of trace loading in case the last record in the
decompressed buffer partly resides in the following mmaped
PERF_RECORD_COMPRESSED record.

In this case NULL value returned by fetch_mmaped_event() means to
proceed to the next mmaped record then decompress it and load compressed
events.

The issue can be reproduced like this:

  $ perf record -z -- some_long_running_workload
  $ perf report --stdio -vv
  decomp (B): 44519 to 163000
  decomp (B): 48119 to 174800
  decomp (B): 65527 to 131072
  fetch_mmaped_event: head=0x1ffe0 event->header_size=0x28, mmap_size=0x20000: fuzzed perf.data?
  Error:
  failed to process sample
  ...

Testing:

  71: Zstd perf.data compression/decompression              : Ok

  $ tools/perf/perf report -vv --stdio
  decomp (B): 59593 to 262160
  decomp (B): 4438 to 16512
  decomp (B): 285 to 880
  Looking at the vmlinux_path (8 entries long)
  Using vmlinux for symbols
  decomp (B): 57474 to 261248
  prefetch_event: head=0x3fc78 event->header_size=0x28, mmap_size=0x3fc80: fuzzed or compressed perf.data?
  decomp (B): 25 to 32
  decomp (B): 52 to 120
  ...

Fixes: 57fc032ad643 ("perf session: Avoid infinite loop when seeing invalid header.size")
Link: https://marc.info/?l=linux-kernel&m=156580812427554&w=2
Co-developed-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/cf782c34-f3f8-2f9f-d6ab-145cee0d5322@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 44 ++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f07b8ecb91bc..8454a650146b 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1958,8 +1958,8 @@ static int __perf_session__process_pipe_events(struct perf_session *session)
 }
 
 static union perf_event *
-fetch_mmaped_event(struct perf_session *session,
-		   u64 head, size_t mmap_size, char *buf)
+prefetch_event(char *buf, u64 head, size_t mmap_size,
+	       bool needs_swap, union perf_event *error)
 {
 	union perf_event *event;
 
@@ -1971,20 +1971,32 @@ fetch_mmaped_event(struct perf_session *session,
 		return NULL;
 
 	event = (union perf_event *)(buf + head);
+	if (needs_swap)
+		perf_event_header__bswap(&event->header);
 
-	if (session->header.needs_swap)
+	if (head + event->header.size <= mmap_size)
+		return event;
+
+	/* We're not fetching the event so swap back again */
+	if (needs_swap)
 		perf_event_header__bswap(&event->header);
 
-	if (head + event->header.size > mmap_size) {
-		/* We're not fetching the event so swap back again */
-		if (session->header.needs_swap)
-			perf_event_header__bswap(&event->header);
-		pr_debug("%s: head=%#" PRIx64 " event->header_size=%#x, mmap_size=%#zx: fuzzed perf.data?\n",
-			 __func__, head, event->header.size, mmap_size);
-		return ERR_PTR(-EINVAL);
-	}
+	pr_debug("%s: head=%#" PRIx64 " event->header_size=%#x, mmap_size=%#zx:"
+		 " fuzzed or compressed perf.data?\n",__func__, head, event->header.size, mmap_size);
 
-	return event;
+	return error;
+}
+
+static union perf_event *
+fetch_mmaped_event(u64 head, size_t mmap_size, char *buf, bool needs_swap)
+{
+	return prefetch_event(buf, head, mmap_size, needs_swap, ERR_PTR(-EINVAL));
+}
+
+static union perf_event *
+fetch_decomp_event(u64 head, size_t mmap_size, char *buf, bool needs_swap)
+{
+	return prefetch_event(buf, head, mmap_size, needs_swap, NULL);
 }
 
 static int __perf_session__process_decomp_events(struct perf_session *session)
@@ -1997,10 +2009,8 @@ static int __perf_session__process_decomp_events(struct perf_session *session)
 		return 0;
 
 	while (decomp->head < decomp->size && !session_done()) {
-		union perf_event *event = fetch_mmaped_event(session, decomp->head, decomp->size, decomp->data);
-
-		if (IS_ERR(event))
-			return PTR_ERR(event);
+		union perf_event *event = fetch_decomp_event(decomp->head, decomp->size, decomp->data,
+							     session->header.needs_swap);
 
 		if (!event)
 			break;
@@ -2100,7 +2110,7 @@ reader__process_events(struct reader *rd, struct perf_session *session,
 	}
 
 more:
-	event = fetch_mmaped_event(session, head, mmap_size, buf);
+	event = fetch_mmaped_event(head, mmap_size, buf, session->header.needs_swap);
 	if (IS_ERR(event))
 		return PTR_ERR(event);
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 07/26] perf util: Move block TUI function to ui browsers
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 06/26] perf session: Fix decompression of PERF_RECORD_COMPRESSED records Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 08/26] perf report: Jump to symbol source view from total cycles view Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Jin Yao, Alexander Shishkin, Andi Kleen,
	Jin Yao, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

It would be nice if we could jump to the assembler/source view (like the
normal perf report) from total cycles view.

This patch moves the block_hists_tui_browse from block-info.c to
ui/browsers/hists.c in order to reuse some browser codes (i.e
do_annotate) for implementing new annotation view.

 v2:
 ---
 Fix the 'make NO_SLANG=1' error. (Change 'int block_hists_tui_browse()'
 to 'static inline int block_hists_tui_browse()')

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191118140849.20714-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/browsers/hists.c | 55 ++++++++++++++++++++++++++++
 tools/perf/util/block-info.c   | 65 +---------------------------------
 tools/perf/util/hist.h         | 12 +++++++
 3 files changed, 68 insertions(+), 64 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 4d2d0acfd41a..87405dc4750c 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -3444,3 +3444,58 @@ int perf_evlist__tui_browse_hists(struct evlist *evlist, const char *help,
 					       warn_lost_event,
 					       annotation_opts);
 }
+
+static int block_hists_browser__title(struct hist_browser *browser, char *bf,
+				      size_t size)
+{
+	struct hists *hists = evsel__hists(browser->block_evsel);
+	const char *evname = perf_evsel__name(browser->block_evsel);
+	unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
+	int ret;
+
+	ret = scnprintf(bf, size, "# Samples: %lu", nr_samples);
+	if (evname)
+		scnprintf(bf + ret, size -  ret, " of event '%s'", evname);
+
+	return 0;
+}
+
+int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
+			   float min_percent)
+{
+	struct hists *hists = &bh->block_hists;
+	struct hist_browser *browser;
+	int key = -1;
+	static const char help[] =
+	" q             Quit \n";
+
+	browser = hist_browser__new(hists);
+	if (!browser)
+		return -1;
+
+	browser->block_evsel = evsel;
+	browser->title = block_hists_browser__title;
+	browser->min_pcnt = min_percent;
+
+	/* reset abort key so that it can get Ctrl-C as a key */
+	SLang_reset_tty();
+	SLang_init_tty(0, 0, 0);
+
+	while (1) {
+		key = hist_browser__run(browser, "? - help", true);
+
+		switch (key) {
+		case 'q':
+			goto out;
+		case '?':
+			ui_browser__help_window(&browser->b, help);
+			break;
+		default:
+			break;
+		}
+	}
+
+out:
+	hist_browser__delete(browser);
+	return 0;
+}
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 9abc201ebe63..5887f8f9149f 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -10,6 +10,7 @@
 #include "map.h"
 #include "srcline.h"
 #include "evlist.h"
+#include "hist.h"
 #include "ui/browsers/hists.h"
 
 static struct block_header_column {
@@ -439,70 +440,6 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 	return block_reports;
 }
 
-#ifdef HAVE_SLANG_SUPPORT
-static int block_hists_browser__title(struct hist_browser *browser, char *bf,
-				      size_t size)
-{
-	struct hists *hists = evsel__hists(browser->block_evsel);
-	const char *evname = perf_evsel__name(browser->block_evsel);
-	unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
-	int ret;
-
-	ret = scnprintf(bf, size, "# Samples: %lu", nr_samples);
-	if (evname)
-		scnprintf(bf + ret, size -  ret, " of event '%s'", evname);
-
-	return 0;
-}
-
-static int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
-				  float min_percent)
-{
-	struct hists *hists = &bh->block_hists;
-	struct hist_browser *browser;
-	int key = -1;
-	static const char help[] =
-	" q             Quit \n";
-
-	browser = hist_browser__new(hists);
-	if (!browser)
-		return -1;
-
-	browser->block_evsel = evsel;
-	browser->title = block_hists_browser__title;
-	browser->min_pcnt = min_percent;
-
-	/* reset abort key so that it can get Ctrl-C as a key */
-	SLang_reset_tty();
-	SLang_init_tty(0, 0, 0);
-
-	while (1) {
-		key = hist_browser__run(browser, "? - help", true);
-
-		switch (key) {
-		case 'q':
-			goto out;
-		case '?':
-			ui_browser__help_window(&browser->b, help);
-			break;
-		default:
-			break;
-		}
-	}
-
-out:
-	hist_browser__delete(browser);
-	return 0;
-}
-#else
-static int block_hists_tui_browse(struct block_hist *bh __maybe_unused,
-				  struct evsel *evsel __maybe_unused,
-				  float min_percent __maybe_unused)
-{
-	return 0;
-}
-#endif
-
 int report__browse_block_hists(struct block_hist *bh, float min_percent,
 			       struct evsel *evsel)
 {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 4d87c7b4c1b2..2aca8ce16b2c 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -449,6 +449,8 @@ enum rstype {
 	A_SOURCE
 };
 
+struct block_hist;
+
 #ifdef HAVE_SLANG_SUPPORT
 #include "../ui/keysyms.h"
 void attr_to_script(char *buf, struct perf_event_attr *attr);
@@ -474,6 +476,9 @@ void run_script(char *cmd);
 int res_sample_browse(struct res_sample *res_samples, int num_res,
 		      struct evsel *evsel, enum rstype rstype);
 void res_sample_init(void);
+
+int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
+			   float min_percent);
 #else
 static inline
 int perf_evlist__tui_browse_hists(struct evlist *evlist __maybe_unused,
@@ -518,6 +523,13 @@ static inline int res_sample_browse(struct res_sample *res_samples __maybe_unuse
 
 static inline void res_sample_init(void) {}
 
+static inline int block_hists_tui_browse(struct block_hist *bh __maybe_unused,
+					 struct evsel *evsel __maybe_unused,
+					 float min_percent __maybe_unused)
+{
+	return 0;
+}
+
 #define K_LEFT  -1000
 #define K_RIGHT -2000
 #define K_SWITCH_INPUT_DATA -3000
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 08/26] perf report: Jump to symbol source view from total cycles view
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 07/26] perf util: Move block TUI function to ui browsers Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 09/26] perf tools: Add kernel AUX area sampling definitions Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Jin Yao, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Andi Kleen, Jin Yao, Kan Liang,
	Peter Zijlstra

From: Jin Yao <yao.jin@linux.intel.com>

This patch supports jumping from tui total cycles view to symbol source
view.

For example,

  perf record -b ./div
  perf report --total-cycles

In total cycles view, we can select one entry and press 'a' or press
ENTER key to jump to symbol source view.

This patch also sets sort_order to NULL in cmd_report() which will use
the default branch sort order. The percent value in new annotate view
will be consistent with the percent in annotate view switched from perf
report (we observed the original percent gap with previous patches).

 v2:
 ---
 Fix the 'make NO_SLANG=1' error. (set __maybe_unused to
 annotation_opts in block_hists_tui_browse()).

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191118140849.20714-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c    |  9 ++++++---
 tools/perf/ui/browsers/hists.c | 25 +++++++++++++++++++++++--
 tools/perf/util/block-info.c   |  6 ++++--
 tools/perf/util/block-info.h   |  3 ++-
 tools/perf/util/hist.h         |  7 +++++--
 5 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 0b6157c02c88..ab0f6e516b03 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -493,7 +493,9 @@ static int perf_evlist__tui_block_hists_browse(struct evlist *evlist,
 
 	evlist__for_each_entry(evlist, pos) {
 		ret = report__browse_block_hists(&rep->block_reports[i++].hist,
-						 rep->min_percent, pos);
+						 rep->min_percent, pos,
+						 &rep->session->header.env,
+						 &rep->annotation_opts);
 		if (ret != 0)
 			return ret;
 	}
@@ -525,7 +527,8 @@ static int perf_evlist__tty_browse_hists(struct evlist *evlist,
 
 		if (rep->total_cycles_mode) {
 			report__browse_block_hists(&rep->block_reports[i++].hist,
-						   rep->min_percent, pos);
+						   rep->min_percent, pos,
+						   NULL, NULL);
 			continue;
 		}
 
@@ -1418,7 +1421,7 @@ int cmd_report(int argc, const char **argv)
 		if (sort__mode != SORT_MODE__BRANCH)
 			report.total_cycles_mode = false;
 		else
-			sort_order = "sym";
+			sort_order = NULL;
 	}
 
 	if (strcmp(input_name, "-") != 0)
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 87405dc4750c..d4d3558fdef4 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2385,7 +2385,11 @@ do_annotate(struct hist_browser *browser, struct popup_action *act)
 	if (!notes->src)
 		return 0;
 
-	evsel = hists_to_evsel(browser->hists);
+	if (browser->block_evsel)
+		evsel = browser->block_evsel;
+	else
+		evsel = hists_to_evsel(browser->hists);
+
 	err = map_symbol__tui_annotate(&act->ms, evsel, browser->hbt,
 				       browser->annotation_opts);
 	he = hist_browser__selected_entry(browser);
@@ -3461,11 +3465,13 @@ static int block_hists_browser__title(struct hist_browser *browser, char *bf,
 }
 
 int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
-			   float min_percent)
+			   float min_percent, struct perf_env *env,
+			   struct annotation_options *annotation_opts)
 {
 	struct hists *hists = &bh->block_hists;
 	struct hist_browser *browser;
 	int key = -1;
+	struct popup_action action;
 	static const char help[] =
 	" q             Quit \n";
 
@@ -3476,11 +3482,15 @@ int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
 	browser->block_evsel = evsel;
 	browser->title = block_hists_browser__title;
 	browser->min_pcnt = min_percent;
+	browser->env = env;
+	browser->annotation_opts = annotation_opts;
 
 	/* reset abort key so that it can get Ctrl-C as a key */
 	SLang_reset_tty();
 	SLang_init_tty(0, 0, 0);
 
+	memset(&action, 0, sizeof(action));
+
 	while (1) {
 		key = hist_browser__run(browser, "? - help", true);
 
@@ -3490,6 +3500,17 @@ int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
 		case '?':
 			ui_browser__help_window(&browser->b, help);
 			break;
+		case 'a':
+		case K_ENTER:
+			if (!browser->selection ||
+			    !browser->selection->sym) {
+				continue;
+			}
+
+			action.ms.map = browser->selection->map;
+			action.ms.sym = browser->selection->sym;
+			do_annotate(browser, &action);
+			continue;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 5887f8f9149f..c4b030bf6ec2 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -441,7 +441,8 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 }
 
 int report__browse_block_hists(struct block_hist *bh, float min_percent,
-			       struct evsel *evsel)
+			       struct evsel *evsel, struct perf_env *env,
+			       struct annotation_options *annotation_opts)
 {
 	int ret;
 
@@ -454,7 +455,8 @@ int report__browse_block_hists(struct block_hist *bh, float min_percent,
 		return 0;
 	case 1:
 		symbol_conf.report_individual_block = true;
-		ret = block_hists_tui_browse(bh, evsel, min_percent);
+		ret = block_hists_tui_browse(bh, evsel, min_percent,
+					     env, annotation_opts);
 		hists__delete_entries(&bh->block_hists);
 		return ret;
 	default:
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index e4d20bccd9b6..bef0d75e9819 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -71,7 +71,8 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 					       u64 total_cycles);
 
 int report__browse_block_hists(struct block_hist *bh, float min_percent,
-			       struct evsel *evsel);
+			       struct evsel *evsel, struct perf_env *env,
+			       struct annotation_options *annotation_opts);
 
 float block_info__total_cycles_percent(struct hist_entry *he);
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 2aca8ce16b2c..45286900aacb 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -478,7 +478,8 @@ int res_sample_browse(struct res_sample *res_samples, int num_res,
 void res_sample_init(void);
 
 int block_hists_tui_browse(struct block_hist *bh, struct evsel *evsel,
-			   float min_percent);
+			   float min_percent, struct perf_env *env,
+			   struct annotation_options *annotation_opts);
 #else
 static inline
 int perf_evlist__tui_browse_hists(struct evlist *evlist __maybe_unused,
@@ -525,7 +526,9 @@ static inline void res_sample_init(void) {}
 
 static inline int block_hists_tui_browse(struct block_hist *bh __maybe_unused,
 					 struct evsel *evsel __maybe_unused,
-					 float min_percent __maybe_unused)
+					 float min_percent __maybe_unused,
+					 struct perf_env *env __maybe_unused,
+					 struct annotation_options *annotation_opts __maybe_unused)
 {
 	return 0;
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 09/26] perf tools: Add kernel AUX area sampling definitions
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 08/26] perf report: Jump to symbol source view from total cycles view Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 10/26] perf record: Add a function to test for kernel support for AUX area sampling Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Arnaldo Carvalho de Melo,
	Jiri Olsa

From: Adrian Hunter <adrian.hunter@intel.com>

Add kernel AUX area sampling definitions, which brings perf_event.h into
line with the kernel version.

New sample type PERF_SAMPLE_AUX requests a sample of the AUX area
buffer.  New perf_event_attr member 'aux_sample_size' specifies the
desired size of the sample.

Also add support for parsing samples containing AUX area data i.e.
PERF_SAMPLE_AUX.

Committer notes:

I squashed the first two patches in this series to avoid breaking
automatic bisection, i.e. after applying only the original first patch
in this series we would have:

  # perf test -v parsing
  26: Sample parsing                                        :
  --- start ---
  test child forked, pid 17018
  sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating
  test child finished with -1
  ---- end ----
  Sample parsing: FAILED!
  #

With the two paches combined:

  # perf test parsing
  26: Sample parsing                                        : Ok
  #

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/include/uapi/linux/perf_event.h     | 10 ++++++++--
 tools/perf/tests/attr/base-record         |  2 +-
 tools/perf/tests/attr/base-stat           |  2 +-
 tools/perf/tests/sample-parsing.c         | 16 +++++++++++++++-
 tools/perf/util/event.h                   |  6 ++++++
 tools/perf/util/evsel.c                   | 13 +++++++++++++
 tools/perf/util/perf_event_attr_fprintf.c |  3 ++-
 tools/perf/util/session.c                 |  1 +
 tools/perf/util/synthetic-events.c        | 12 ++++++++++++
 9 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index bb7b271397a6..377d794d3105 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -141,8 +141,9 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_TRANSACTION			= 1U << 17,
 	PERF_SAMPLE_REGS_INTR			= 1U << 18,
 	PERF_SAMPLE_PHYS_ADDR			= 1U << 19,
+	PERF_SAMPLE_AUX				= 1U << 20,
 
-	PERF_SAMPLE_MAX = 1U << 20,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 21,		/* non-ABI */
 
 	__PERF_SAMPLE_CALLCHAIN_EARLY		= 1ULL << 63, /* non-ABI; internal use */
 };
@@ -300,6 +301,7 @@ enum perf_event_read_format {
 					/* add: sample_stack_user */
 #define PERF_ATTR_SIZE_VER4	104	/* add: sample_regs_intr */
 #define PERF_ATTR_SIZE_VER5	112	/* add: aux_watermark */
+#define PERF_ATTR_SIZE_VER6	120	/* add: aux_sample_size */
 
 /*
  * Hardware event_id to monitor via a performance monitoring event:
@@ -424,7 +426,9 @@ struct perf_event_attr {
 	 */
 	__u32	aux_watermark;
 	__u16	sample_max_stack;
-	__u16	__reserved_2;	/* align to __u64 */
+	__u16	__reserved_2;
+	__u32	aux_sample_size;
+	__u32	__reserved_3;
 };
 
 /*
@@ -864,6 +868,8 @@ enum perf_event_type {
 	 *	{ u64			abi; # enum perf_sample_regs_abi
 	 *	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_INTR
 	 *	{ u64			phys_addr;} && PERF_SAMPLE_PHYS_ADDR
+	 *	{ u64			size;
+	 *	  char			data[size]; } && PERF_SAMPLE_AUX
 	 * };
 	 */
 	PERF_RECORD_SAMPLE			= 9,
diff --git a/tools/perf/tests/attr/base-record b/tools/perf/tests/attr/base-record
index efd0157b9d22..645009c08b3c 100644
--- a/tools/perf/tests/attr/base-record
+++ b/tools/perf/tests/attr/base-record
@@ -5,7 +5,7 @@ group_fd=-1
 flags=0|8
 cpu=*
 type=0|1
-size=112
+size=120
 config=0
 sample_period=*
 sample_type=263
diff --git a/tools/perf/tests/attr/base-stat b/tools/perf/tests/attr/base-stat
index 4d0c2e42b64e..b0f42c34882e 100644
--- a/tools/perf/tests/attr/base-stat
+++ b/tools/perf/tests/attr/base-stat
@@ -5,7 +5,7 @@ group_fd=-1
 flags=0|8
 cpu=*
 type=0
-size=112
+size=120
 config=0
 sample_period=0
 sample_type=65536
diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
index 3a02426db9a6..2762e1155238 100644
--- a/tools/perf/tests/sample-parsing.c
+++ b/tools/perf/tests/sample-parsing.c
@@ -150,6 +150,15 @@ static bool samples_same(const struct perf_sample *s1,
 	if (type & PERF_SAMPLE_PHYS_ADDR)
 		COMP(phys_addr);
 
+	if (type & PERF_SAMPLE_AUX) {
+		COMP(aux_sample.size);
+		if (memcmp(s1->aux_sample.data, s2->aux_sample.data,
+			   s1->aux_sample.size)) {
+			pr_debug("Samples differ at 'aux_sample'\n");
+			return false;
+		}
+	}
+
 	return true;
 }
 
@@ -182,6 +191,7 @@ static int do_test(u64 sample_type, u64 sample_regs, u64 read_format)
 	u64 regs[64];
 	const u64 raw_data[] = {0x123456780a0b0c0dULL, 0x1102030405060708ULL};
 	const u64 data[] = {0x2211443366558877ULL, 0, 0xaabbccddeeff4321ULL};
+	const u64 aux_data[] = {0xa55a, 0, 0xeeddee, 0x0282028202820282};
 	struct perf_sample sample = {
 		.ip		= 101,
 		.pid		= 102,
@@ -218,6 +228,10 @@ static int do_test(u64 sample_type, u64 sample_regs, u64 read_format)
 			.regs	= regs,
 		},
 		.phys_addr	= 113,
+		.aux_sample	= {
+			.size	= sizeof(aux_data),
+			.data	= (void *)aux_data,
+		},
 	};
 	struct sample_read_value values[] = {{1, 5}, {9, 3}, {2, 7}, {6, 4},};
 	struct perf_sample sample_out;
@@ -317,7 +331,7 @@ int test__sample_parsing(struct test *test __maybe_unused, int subtest __maybe_u
 	 * were added.  Please actually update the test rather than just change
 	 * the condition below.
 	 */
-	if (PERF_SAMPLE_MAX > PERF_SAMPLE_PHYS_ADDR << 1) {
+	if (PERF_SAMPLE_MAX > PERF_SAMPLE_AUX << 1) {
 		pr_debug("sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating\n");
 		return -1;
 	}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index a0a0c91cde4a..85223159737c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -114,6 +114,11 @@ enum {
 
 #define MAX_INSN 16
 
+struct aux_sample {
+	u64 size;
+	void *data;
+};
+
 struct perf_sample {
 	u64 ip;
 	u32 pid, tid;
@@ -142,6 +147,7 @@ struct perf_sample {
 	struct regs_dump  intr_regs;
 	struct stack_dump user_stack;
 	struct sample_read read;
+	struct aux_sample aux_sample;
 };
 
 #define PERF_MEM_DATA_SRC_NONE \
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1bf60f325608..772f4879c492 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2209,6 +2209,19 @@ int perf_evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 		array++;
 	}
 
+	if (type & PERF_SAMPLE_AUX) {
+		OVERFLOW_CHECK_u64(array);
+		sz = *array++;
+
+		OVERFLOW_CHECK(array, sz, max_size);
+		/* Undo swap of data */
+		if (swapped)
+			mem_bswap_64((char *)array, sz);
+		data->aux_sample.size = sz;
+		data->aux_sample.data = (char *)array;
+		array = (void *)array + sz;
+	}
+
 	return 0;
 }
 
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index d4ad3f04923a..651203126c71 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -34,7 +34,7 @@ static void __p_sample_type(char *buf, size_t size, u64 value)
 		bit_name(PERIOD), bit_name(STREAM_ID), bit_name(RAW),
 		bit_name(BRANCH_STACK), bit_name(REGS_USER), bit_name(STACK_USER),
 		bit_name(IDENTIFIER), bit_name(REGS_INTR), bit_name(DATA_SRC),
-		bit_name(WEIGHT), bit_name(PHYS_ADDR),
+		bit_name(WEIGHT), bit_name(PHYS_ADDR), bit_name(AUX),
 		{ .name = NULL, }
 	};
 #undef bit_name
@@ -143,6 +143,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(sample_regs_intr, p_hex);
 	PRINT_ATTRf(aux_watermark, p_unsigned);
 	PRINT_ATTRf(sample_max_stack, p_unsigned);
+	PRINT_ATTRf(aux_sample_size, p_unsigned);
 
 	return ret;
 }
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 8454a650146b..dbdb47624dec 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -752,6 +752,7 @@ do { 						\
 	bswap_field_32(sample_stack_user);
 	bswap_field_32(aux_watermark);
 	bswap_field_16(sample_max_stack);
+	bswap_field_32(aux_sample_size);
 
 	/*
 	 * After read_format are bitfields. Check read_format because
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index cfa3c9f67141..48c3f8b9c852 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -1228,6 +1228,11 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 	if (type & PERF_SAMPLE_PHYS_ADDR)
 		result += sizeof(u64);
 
+	if (type & PERF_SAMPLE_AUX) {
+		result += sizeof(u64);
+		result += sample->aux_sample.size;
+	}
+
 	return result;
 }
 
@@ -1396,6 +1401,13 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type, u64 read_fo
 		array++;
 	}
 
+	if (type & PERF_SAMPLE_AUX) {
+		sz = sample->aux_sample.size;
+		*array++ = sz;
+		memcpy(array, sample->aux_sample.data, sz);
+		array = (void *)array + sz;
+	}
+
 	return 0;
 }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 10/26] perf record: Add a function to test for kernel support for AUX area sampling
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 09/26] perf tools: Add kernel AUX area sampling definitions Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 11/26] perf auxtrace: Move perf_evsel__find_pmu() Arnaldo Carvalho de Melo
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Architectures are expected to know if AUX area sampling is supported by
the hardware. Add a function perf_can_aux_sample() which will determine
whether the kernel supports it.

Committer notes:

I reported that this message was taking place on a kernel without the
required bits:

  # perf record --aux-sample -e '{intel_pt//u,branch-misses:u}'
  Error:
  The sys_perf_event_open() syscall returned with 7 (Argument list too long) for event (branch-misses:u).
  /bin/dmesg | grep -i perf may provide additional information.

Adrian sent a patch addressing it, with this explanation:

 ----
  perf_can_aux_sample_size() always returned true because it did not pass
  the attribute size to sys_perf_event_open, nor correctly check the
  return value and errno.
 ----

After applying it I get, later in the series, when --aux-sample is
added:

  # perf record --aux-sample -e '{intel_pt//u,branch-misses:u}'
  AUX area sampling is not supported by kernel

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.h |  1 +
 tools/perf/util/record.c | 31 +++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 13051409fd22..3655b9ebb147 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -176,6 +176,7 @@ void perf_evlist__set_id_pos(struct evlist *evlist);
 bool perf_can_sample_identifier(void);
 bool perf_can_record_switch_events(void);
 bool perf_can_record_cpu_wide(void);
+bool perf_can_aux_sample(void);
 void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
 			 struct callchain_param *callchain);
 int record_opts__config(struct record_opts *opts);
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 8579505c29a4..7def66168503 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -136,6 +136,37 @@ bool perf_can_record_cpu_wide(void)
 	return true;
 }
 
+/*
+ * Architectures are expected to know if AUX area sampling is supported by the
+ * hardware. Here we check for kernel support.
+ */
+bool perf_can_aux_sample(void)
+{
+	struct perf_event_attr attr = {
+		.size = sizeof(struct perf_event_attr),
+		.exclude_kernel = 1,
+		/*
+		 * Non-zero value causes the kernel to calculate the effective
+		 * attribute size up to that byte.
+		 */
+		.aux_sample_size = 1,
+	};
+	int fd;
+
+	fd = sys_perf_event_open(&attr, -1, 0, -1, 0);
+	/*
+	 * If the kernel attribute is big enough to contain aux_sample_size
+	 * then we assume that it is supported. We are relying on the kernel to
+	 * validate the attribute size before anything else that could be wrong.
+	 */
+	if (fd < 0 && errno == E2BIG)
+		return false;
+	if (fd >= 0)
+		close(fd);
+
+	return true;
+}
+
 void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
 			 struct callchain_param *callchain)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 11/26] perf auxtrace: Move perf_evsel__find_pmu()
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 10/26] perf record: Add a function to test for kernel support for AUX area sampling Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 12/26] perf auxtrace: Add support for AUX area sample recording Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Move perf_evsel__find_pmu() so it can be used without forward
declaration.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index c555c3ccd79d..263d1d9d8987 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -57,6 +57,18 @@
 #include "symbol/kallsyms.h"
 #include <internal/lib.h>
 
+static struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
+{
+	struct perf_pmu *pmu = NULL;
+
+	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+		if (pmu->type == evsel->core.attr.type)
+			break;
+	}
+
+	return pmu;
+}
+
 static bool auxtrace__dont_decode(struct perf_session *session)
 {
 	return !session->itrace_synth_opts ||
@@ -2180,18 +2192,6 @@ static int parse_addr_filter(struct evsel *evsel, const char *filter,
 	return err;
 }
 
-static struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
-{
-	struct perf_pmu *pmu = NULL;
-
-	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
-		if (pmu->type == evsel->core.attr.type)
-			break;
-	}
-
-	return pmu;
-}
-
 static int perf_evsel__nr_addr_filter(struct evsel *evsel)
 {
 	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 12/26] perf auxtrace: Add support for AUX area sample recording
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 11/26] perf auxtrace: Move perf_evsel__find_pmu() Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 13/26] perf record: Add support for AUX area sampling Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Add support for parsing and validating AUX area sample options. At
present, the only option is the sample size, but it is also necessary to
ensure that events are in a group with an AUX area event as the leader.

Committer note:

Add missing 'static inline' in front of auxtrace_parse_sample_options()
for when we don't HAVE_AUXTRACE_SUPPORT.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.c | 107 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/auxtrace.h |  17 ++++++
 tools/perf/util/pmu.h      |   1 +
 tools/perf/util/record.h   |   2 +
 4 files changed, 127 insertions(+)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 263d1d9d8987..51fbe01f8a11 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -69,6 +69,13 @@ static struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
 	return pmu;
 }
 
+static bool perf_evsel__is_aux_event(struct evsel *evsel)
+{
+	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
+
+	return pmu && pmu->auxtrace;
+}
+
 static bool auxtrace__dont_decode(struct perf_session *session)
 {
 	return !session->itrace_synth_opts ||
@@ -609,6 +616,106 @@ int auxtrace_parse_snapshot_options(struct auxtrace_record *itr,
 	return -EINVAL;
 }
 
+/*
+ * Event record size is 16-bit which results in a maximum size of about 64KiB.
+ * Allow about 4KiB for the rest of the sample record, to give a maximum
+ * AUX area sample size of 60KiB.
+ */
+#define MAX_AUX_SAMPLE_SIZE (60 * 1024)
+
+/* Arbitrary default size if no other default provided */
+#define DEFAULT_AUX_SAMPLE_SIZE (4 * 1024)
+
+static int auxtrace_validate_aux_sample_size(struct evlist *evlist,
+					     struct record_opts *opts)
+{
+	struct evsel *evsel;
+	bool has_aux_leader = false;
+	u32 sz;
+
+	evlist__for_each_entry(evlist, evsel) {
+		sz = evsel->core.attr.aux_sample_size;
+		if (perf_evsel__is_group_leader(evsel)) {
+			has_aux_leader = perf_evsel__is_aux_event(evsel);
+			if (sz) {
+				if (has_aux_leader)
+					pr_err("Cannot add AUX area sampling to an AUX area event\n");
+				else
+					pr_err("Cannot add AUX area sampling to a group leader\n");
+				return -EINVAL;
+			}
+		}
+		if (sz > MAX_AUX_SAMPLE_SIZE) {
+			pr_err("AUX area sample size %u too big, max. %d\n",
+			       sz, MAX_AUX_SAMPLE_SIZE);
+			return -EINVAL;
+		}
+		if (sz) {
+			if (!has_aux_leader) {
+				pr_err("Cannot add AUX area sampling because group leader is not an AUX area event\n");
+				return -EINVAL;
+			}
+			perf_evsel__set_sample_bit(evsel, AUX);
+			opts->auxtrace_sample_mode = true;
+		} else {
+			perf_evsel__reset_sample_bit(evsel, AUX);
+		}
+	}
+
+	if (!opts->auxtrace_sample_mode) {
+		pr_err("AUX area sampling requires an AUX area event group leader plus other events to which to add samples\n");
+		return -EINVAL;
+	}
+
+	if (!perf_can_aux_sample()) {
+		pr_err("AUX area sampling is not supported by kernel\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+int auxtrace_parse_sample_options(struct auxtrace_record *itr,
+				  struct evlist *evlist,
+				  struct record_opts *opts, const char *str)
+{
+	bool has_aux_leader = false;
+	struct evsel *evsel;
+	char *endptr;
+	unsigned long sz;
+
+	if (!str)
+		return 0;
+
+	if (!itr) {
+		pr_err("No AUX area event to sample\n");
+		return -EINVAL;
+	}
+
+	sz = strtoul(str, &endptr, 0);
+	if (*endptr || sz > UINT_MAX) {
+		pr_err("Bad AUX area sampling option: '%s'\n", str);
+		return -EINVAL;
+	}
+
+	if (!sz)
+		sz = itr->default_aux_sample_size;
+
+	if (!sz)
+		sz = DEFAULT_AUX_SAMPLE_SIZE;
+
+	/* Set aux_sample_size based on --aux-sample option */
+	evlist__for_each_entry(evlist, evsel) {
+		if (perf_evsel__is_group_leader(evsel)) {
+			has_aux_leader = perf_evsel__is_aux_event(evsel);
+		} else if (has_aux_leader) {
+			evsel->core.attr.aux_sample_size = sz;
+		}
+	}
+
+	return auxtrace_validate_aux_sample_size(evlist, opts);
+}
+
 struct auxtrace_record *__weak
 auxtrace_record__init(struct evlist *evlist __maybe_unused, int *err)
 {
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 3f4aa5427d76..ab48de13c353 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -313,6 +313,7 @@ struct auxtrace_mmap_params {
  * @reference: provide a 64-bit reference number for auxtrace_event
  * @read_finish: called after reading from an auxtrace mmap
  * @alignment: alignment (if any) for AUX area data
+ * @default_aux_sample_size: default sample size for --aux sample option
  */
 struct auxtrace_record {
 	int (*recording_options)(struct auxtrace_record *itr,
@@ -336,6 +337,7 @@ struct auxtrace_record {
 	u64 (*reference)(struct auxtrace_record *itr);
 	int (*read_finish)(struct auxtrace_record *itr, int idx);
 	unsigned int alignment;
+	unsigned int default_aux_sample_size;
 };
 
 /**
@@ -498,6 +500,9 @@ struct auxtrace_record *auxtrace_record__init(struct evlist *evlist,
 int auxtrace_parse_snapshot_options(struct auxtrace_record *itr,
 				    struct record_opts *opts,
 				    const char *str);
+int auxtrace_parse_sample_options(struct auxtrace_record *itr,
+				  struct evlist *evlist,
+				  struct record_opts *opts, const char *str);
 int auxtrace_record__options(struct auxtrace_record *itr,
 			     struct evlist *evlist,
 			     struct record_opts *opts);
@@ -648,6 +653,18 @@ int auxtrace_parse_snapshot_options(struct auxtrace_record *itr __maybe_unused,
 	return -EINVAL;
 }
 
+static inline
+int auxtrace_parse_sample_options(struct auxtrace_record *itr __maybe_unused,
+				  struct evlist *evlist __maybe_unused,
+				  struct record_opts *opts __maybe_unused,
+				  const char *str)
+{
+	if (!str)
+		return 0;
+	pr_err("AUX area tracing not supported\n");
+	return -EINVAL;
+}
+
 static inline
 int auxtrace__process_event(struct perf_session *session __maybe_unused,
 			    union perf_event *event __maybe_unused,
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 3e8cd31a89cc..2eb7a7001307 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -26,6 +26,7 @@ struct perf_pmu {
 	__u32 type;
 	bool selectable;
 	bool is_uncore;
+	bool auxtrace;
 	int max_precise;
 	struct perf_event_attr *default_config;
 	struct perf_cpu_map *cpus;
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 948bbcf9aef3..5421fd2ad383 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -32,6 +32,7 @@ struct record_opts {
 	bool	      full_auxtrace;
 	bool	      auxtrace_snapshot_mode;
 	bool	      auxtrace_snapshot_on_exit;
+	bool	      auxtrace_sample_mode;
 	bool	      record_namespaces;
 	bool	      record_switch_events;
 	bool	      all_kernel;
@@ -56,6 +57,7 @@ struct record_opts {
 	u64	      user_interval;
 	size_t	      auxtrace_snapshot_size;
 	const char    *auxtrace_snapshot_opts;
+	const char    *auxtrace_sample_opts;
 	bool	      sample_transaction;
 	unsigned      initial_delay;
 	bool	      use_clockid;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 13/26] perf record: Add support for AUX area sampling
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 12/26] perf auxtrace: Add support for AUX area sample recording Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:56 ` [PATCH 14/26] perf record: Add aux-sample-size config term Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Add a 'perf record' option '--aux-sample' to request AUX area sampling.
AUX area sampling uses an overwriting buffer much like snapshot mode, so
adjust the AUX buffer mmapping accordingly. To make it easy to queue
samples for decoding, synthesize an ID index.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-record.txt |  6 ++++++
 tools/perf/builtin-record.c              | 21 ++++++++++++++++++++-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index ebcba1f95513..e216d7b529c9 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -433,6 +433,12 @@ can be specified in a string that follows this option:
 In Snapshot Mode trace data is captured only when signal SIGUSR2 is received
 and on exit if the above 'e' option is given.
 
+--aux-sample[=OPTIONS]::
+Select AUX area sampling. At least one of the events selected by the -e option
+must be an AUX area event. Samples on other events will be created containing
+data from the AUX area. Optionally sample size may be specified, otherwise it
+defaults to 4KiB.
+
 --proc-map-timeout::
 When processing pre-existing threads /proc/XXX/mmap, it may take a long time,
 because the file may be huge. A time out is needed in such cases.
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7ab3110b4035..b5063d3b6fd0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -680,6 +680,11 @@ static int record__auxtrace_init(struct record *rec)
 	if (err)
 		return err;
 
+	err = auxtrace_parse_sample_options(rec->itr, rec->evlist, &rec->opts,
+					    rec->opts.auxtrace_sample_opts);
+	if (err)
+		return err;
+
 	return auxtrace_parse_filters(rec->evlist);
 }
 
@@ -752,6 +757,8 @@ static int record__mmap_evlist(struct record *rec,
 			       struct evlist *evlist)
 {
 	struct record_opts *opts = &rec->opts;
+	bool auxtrace_overwrite = opts->auxtrace_snapshot_mode ||
+				  opts->auxtrace_sample_mode;
 	char msg[512];
 
 	if (opts->affinity != PERF_AFFINITY_SYS)
@@ -759,7 +766,7 @@ static int record__mmap_evlist(struct record *rec,
 
 	if (evlist__mmap_ex(evlist, opts->mmap_pages,
 				 opts->auxtrace_mmap_pages,
-				 opts->auxtrace_snapshot_mode,
+				 auxtrace_overwrite,
 				 opts->nr_cblocks, opts->affinity,
 				 opts->mmap_flush, opts->comp_level) < 0) {
 		if (errno == EPERM) {
@@ -1046,6 +1053,7 @@ static int record__mmap_read_evlist(struct record *rec, struct evlist *evlist,
 		}
 
 		if (map->auxtrace_mmap.base && !rec->opts.auxtrace_snapshot_mode &&
+		    !rec->opts.auxtrace_sample_mode &&
 		    record__auxtrace_mmap_read(rec, map) != 0) {
 			rc = -1;
 			goto out;
@@ -1321,6 +1329,15 @@ static int record__synthesize(struct record *rec, bool tail)
 	if (err)
 		goto out;
 
+	/* Synthesize id_index before auxtrace_info */
+	if (rec->opts.auxtrace_sample_mode) {
+		err = perf_event__synthesize_id_index(tool,
+						      process_synthesized_event,
+						      session->evlist, machine);
+		if (err)
+			goto out;
+	}
+
 	if (rec->opts.full_auxtrace) {
 		err = perf_event__synthesize_auxtrace_info(rec->itr, tool,
 					session, process_synthesized_event);
@@ -2329,6 +2346,8 @@ static struct option __record_options[] = {
 	parse_clockid),
 	OPT_STRING_OPTARG('S', "snapshot", &record.opts.auxtrace_snapshot_opts,
 			  "opts", "AUX area tracing Snapshot Mode", ""),
+	OPT_STRING_OPTARG(0, "aux-sample", &record.opts.auxtrace_sample_opts,
+			  "opts", "sample AUX area", ""),
 	OPT_UINTEGER(0, "proc-map-timeout", &proc_map_timeout,
 			"per thread proc mmap processing timeout in ms"),
 	OPT_BOOLEAN(0, "namespaces", &record.opts.record_namespaces,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 14/26] perf record: Add aux-sample-size config term
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 13/26] perf record: Add support for AUX area sampling Arnaldo Carvalho de Melo
@ 2019-11-22 14:56 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 15/26] perf inject: Cut AUX area samples Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:56 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

To allow individual events to be selected for AUX area sampling, add
aux-sample-size config term. attr.aux_sample_size is updated by
auxtrace_parse_sample_options() so that the existing validation will see
the value. Any event that has a non-zero aux_sample_size will cause AUX
area sampling to be configured, irrespective of the --aux-sample option.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-record.txt |  3 +
 tools/perf/util/auxtrace.c               | 76 +++++++++++++++++++++++-
 tools/perf/util/evsel.c                  | 16 +++++
 tools/perf/util/evsel_config.h           | 11 ++++
 tools/perf/util/parse-events.c           | 14 +++++
 tools/perf/util/parse-events.h           |  1 +
 tools/perf/util/parse-events.l           |  1 +
 7 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index e216d7b529c9..b23a4012a606 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -62,6 +62,9 @@ OPTIONS
 		    like this: name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\'.
 	  - 'aux-output': Generate AUX records instead of events. This requires
 			  that an AUX area event is also provided.
+	  - 'aux-sample-size': Set sample size for AUX area sampling. If the
+	  '--aux-sample' option has been used, set aux-sample-size=0 to disable
+	  AUX area sampling for the event.
 
           See the linkperf:perf-list[1] man page for more parameters.
 
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 51fbe01f8a11..026585b67a3c 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -31,6 +31,7 @@
 #include "map.h"
 #include "pmu.h"
 #include "evsel.h"
+#include "evsel_config.h"
 #include "symbol.h"
 #include "util/synthetic-events.h"
 #include "thread_map.h"
@@ -76,6 +77,53 @@ static bool perf_evsel__is_aux_event(struct evsel *evsel)
 	return pmu && pmu->auxtrace;
 }
 
+/*
+ * Make a group from 'leader' to 'last', requiring that the events were not
+ * already grouped to a different leader.
+ */
+static int perf_evlist__regroup(struct evlist *evlist,
+				struct evsel *leader,
+				struct evsel *last)
+{
+	struct evsel *evsel;
+	bool grp;
+
+	if (!perf_evsel__is_group_leader(leader))
+		return -EINVAL;
+
+	grp = false;
+	evlist__for_each_entry(evlist, evsel) {
+		if (grp) {
+			if (!(evsel->leader == leader ||
+			     (evsel->leader == evsel &&
+			      evsel->core.nr_members <= 1)))
+				return -EINVAL;
+		} else if (evsel == leader) {
+			grp = true;
+		}
+		if (evsel == last)
+			break;
+	}
+
+	grp = false;
+	evlist__for_each_entry(evlist, evsel) {
+		if (grp) {
+			if (evsel->leader != leader) {
+				evsel->leader = leader;
+				if (leader->core.nr_members < 1)
+					leader->core.nr_members = 1;
+				leader->core.nr_members += 1;
+			}
+		} else if (evsel == leader) {
+			grp = true;
+		}
+		if (evsel == last)
+			break;
+	}
+
+	return 0;
+}
+
 static bool auxtrace__dont_decode(struct perf_session *session)
 {
 	return !session->itrace_synth_opts ||
@@ -679,13 +727,16 @@ int auxtrace_parse_sample_options(struct auxtrace_record *itr,
 				  struct evlist *evlist,
 				  struct record_opts *opts, const char *str)
 {
+	struct perf_evsel_config_term *term;
+	struct evsel *aux_evsel;
+	bool has_aux_sample_size = false;
 	bool has_aux_leader = false;
 	struct evsel *evsel;
 	char *endptr;
 	unsigned long sz;
 
 	if (!str)
-		return 0;
+		goto no_opt;
 
 	if (!itr) {
 		pr_err("No AUX area event to sample\n");
@@ -712,6 +763,29 @@ int auxtrace_parse_sample_options(struct auxtrace_record *itr,
 			evsel->core.attr.aux_sample_size = sz;
 		}
 	}
+no_opt:
+	aux_evsel = NULL;
+	/* Override with aux_sample_size from config term */
+	evlist__for_each_entry(evlist, evsel) {
+		if (perf_evsel__is_aux_event(evsel))
+			aux_evsel = evsel;
+		term = perf_evsel__get_config_term(evsel, AUX_SAMPLE_SIZE);
+		if (term) {
+			has_aux_sample_size = true;
+			evsel->core.attr.aux_sample_size = term->val.aux_sample_size;
+			/* If possible, group with the AUX event */
+			if (aux_evsel && evsel->core.attr.aux_sample_size)
+				perf_evlist__regroup(evlist, aux_evsel, evsel);
+		}
+	}
+
+	if (!str && !has_aux_sample_size)
+		return 0;
+
+	if (!itr) {
+		pr_err("No AUX area event to sample\n");
+		return -EINVAL;
+	}
 
 	return auxtrace_validate_aux_sample_size(evlist, opts);
 }
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 772f4879c492..ad7665a546cf 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -846,6 +846,9 @@ static void apply_config_terms(struct evsel *evsel,
 		case PERF_EVSEL__CONFIG_TERM_AUX_OUTPUT:
 			attr->aux_output = term->val.aux_output ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_AUX_SAMPLE_SIZE:
+			/* Already applied by auxtrace */
+			break;
 		default:
 			break;
 		}
@@ -905,6 +908,19 @@ static bool is_dummy_event(struct evsel *evsel)
 	       (evsel->core.attr.config == PERF_COUNT_SW_DUMMY);
 }
 
+struct perf_evsel_config_term *__perf_evsel__get_config_term(struct evsel *evsel,
+							     enum evsel_term_type type)
+{
+	struct perf_evsel_config_term *term, *found_term = NULL;
+
+	list_for_each_entry(term, &evsel->config_terms, list) {
+		if (term->type == type)
+			found_term = term;
+	}
+
+	return found_term;
+}
+
 /*
  * The enable_on_exec/disabled value strategy:
  *
diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h
index 8a7648037c18..6e654ede8fbe 100644
--- a/tools/perf/util/evsel_config.h
+++ b/tools/perf/util/evsel_config.h
@@ -25,6 +25,7 @@ enum evsel_term_type {
 	PERF_EVSEL__CONFIG_TERM_BRANCH,
 	PERF_EVSEL__CONFIG_TERM_PERCORE,
 	PERF_EVSEL__CONFIG_TERM_AUX_OUTPUT,
+	PERF_EVSEL__CONFIG_TERM_AUX_SAMPLE_SIZE,
 };
 
 struct perf_evsel_config_term {
@@ -44,7 +45,17 @@ struct perf_evsel_config_term {
 		unsigned long max_events;
 		bool	      percore;
 		bool	      aux_output;
+		u32	      aux_sample_size;
 	} val;
 	bool weak;
 };
+
+struct evsel;
+
+struct perf_evsel_config_term *__perf_evsel__get_config_term(struct evsel *evsel,
+							     enum evsel_term_type type);
+
+#define perf_evsel__get_config_term(evsel, type) \
+	__perf_evsel__get_config_term(evsel, PERF_EVSEL__CONFIG_TERM_ ## type)
+
 #endif // __PERF_EVSEL_CONFIG_H
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6bae9d6edc12..fc5e27bc8315 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -996,6 +996,7 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_DRV_CFG]		= "driver-config",
 	[PARSE_EVENTS__TERM_TYPE_PERCORE]		= "percore",
 	[PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT]		= "aux-output",
+	[PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE]	= "aux-sample-size",
 };
 
 static bool config_term_shrinked;
@@ -1126,6 +1127,15 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
+		CHECK_TYPE_VAL(NUM);
+		if (term->val.num > UINT_MAX) {
+			parse_events__handle_error(err, term->err_val,
+						strdup("too big"),
+						NULL);
+			return -EINVAL;
+		}
+		break;
 	default:
 		parse_events__handle_error(err, term->err_term,
 				strdup("unknown term"),
@@ -1177,6 +1187,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
 	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 	case PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT:
+	case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -1272,6 +1283,9 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT:
 			ADD_CONFIG_TERM(AUX_OUTPUT, aux_output, term->val.num ? 1 : 0);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
+			ADD_CONFIG_TERM(AUX_SAMPLE_SIZE, aux_sample_size, term->val.num);
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index ff367f248fe8..27596cbd0ba0 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -77,6 +77,7 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_DRV_CFG,
 	PARSE_EVENTS__TERM_TYPE_PERCORE,
 	PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT,
+	PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 7469497cd28e..7b1c8ee537cf 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -285,6 +285,7 @@ overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
 no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 percore			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); }
 aux-output		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); }
+aux-sample-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 15/26] perf inject: Cut AUX area samples
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2019-11-22 14:56 ` [PATCH 14/26] perf record: Add aux-sample-size config term Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 16/26] perf auxtrace: Add support for dumping " Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

After decoding AUX area samples, the AUX area data is no longer needed
(having been replaced by synthesized events) so cut it out.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-inject.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 1e5d28311e14..9664a72a089d 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -45,6 +45,7 @@ struct perf_inject {
 	u64			aux_id;
 	struct list_head	samples;
 	struct itrace_synth_opts itrace_synth_opts;
+	char			event_copy[PERF_SAMPLE_MAX_SIZE];
 };
 
 struct event_entry {
@@ -214,6 +215,28 @@ static int perf_event__drop_aux(struct perf_tool *tool,
 	return 0;
 }
 
+static union perf_event *
+perf_inject__cut_auxtrace_sample(struct perf_inject *inject,
+				 union perf_event *event,
+				 struct perf_sample *sample)
+{
+	size_t sz1 = sample->aux_sample.data - (void *)event;
+	size_t sz2 = event->header.size - sample->aux_sample.size - sz1;
+	union perf_event *ev = (union perf_event *)inject->event_copy;
+
+	if (sz1 > event->header.size || sz2 > event->header.size ||
+	    sz1 + sz2 > event->header.size ||
+	    sz1 < sizeof(struct perf_event_header) + sizeof(u64))
+		return event;
+
+	memcpy(ev, event, sz1);
+	memcpy((void *)ev + sz1, (void *)event + event->header.size - sz2, sz2);
+	ev->header.size = sz1 + sz2;
+	((u64 *)((void *)ev + sz1))[-1] = 0;
+
+	return ev;
+}
+
 typedef int (*inject_handler)(struct perf_tool *tool,
 			      union perf_event *event,
 			      struct perf_sample *sample,
@@ -226,6 +249,9 @@ static int perf_event__repipe_sample(struct perf_tool *tool,
 				     struct evsel *evsel,
 				     struct machine *machine)
 {
+	struct perf_inject *inject = container_of(tool, struct perf_inject,
+						  tool);
+
 	if (evsel && evsel->handler) {
 		inject_handler f = evsel->handler;
 		return f(tool, event, sample, evsel, machine);
@@ -233,6 +259,9 @@ static int perf_event__repipe_sample(struct perf_tool *tool,
 
 	build_id__mark_dso_hit(tool, event, sample, evsel, machine);
 
+	if (inject->itrace_synth_opts.set && sample->aux_sample.size)
+		event = perf_inject__cut_auxtrace_sample(inject, event, sample);
+
 	return perf_event__repipe_synth(tool, event);
 }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 16/26] perf auxtrace: Add support for dumping AUX area samples
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 15/26] perf inject: Cut AUX area samples Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 17/26] perf session: Add facility to peek at all events Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Add support for dumping AUX area samples i.e. via the perf script/report
 -D (--dump-raw-trace) option.

Committer notes:

Add __maybe_unused to the two args for auxtrace__dump_auxtrace_sample()
for when we don't HAVE_AUXTRACE_SUPPORT.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.c | 10 ++++++++++
 tools/perf/util/auxtrace.h | 11 +++++++++++
 tools/perf/util/session.c  |  9 +++++++--
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 026585b67a3c..4f5c5fe3516b 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -2417,6 +2417,16 @@ int auxtrace__process_event(struct perf_session *session, union perf_event *even
 	return session->auxtrace->process_event(session, event, sample, tool);
 }
 
+void auxtrace__dump_auxtrace_sample(struct perf_session *session,
+				    struct perf_sample *sample)
+{
+	if (!session->auxtrace || !session->auxtrace->dump_auxtrace_sample ||
+	    auxtrace__dont_decode(session))
+		return;
+
+	session->auxtrace->dump_auxtrace_sample(session, sample);
+}
+
 int auxtrace__flush_events(struct perf_session *session, struct perf_tool *tool)
 {
 	if (!session->auxtrace)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index ab48de13c353..4a8ac7de6e22 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -141,6 +141,7 @@ struct auxtrace_index {
  * struct auxtrace - session callbacks to allow AUX area data decoding.
  * @process_event: lets the decoder see all session events
  * @process_auxtrace_event: process a PERF_RECORD_AUXTRACE event
+ * @dump_auxtrace_sample: dump AUX area sample data
  * @flush_events: process any remaining data
  * @free_events: free resources associated with event processing
  * @free: free resources associated with the session
@@ -153,6 +154,8 @@ struct auxtrace {
 	int (*process_auxtrace_event)(struct perf_session *session,
 				      union perf_event *event,
 				      struct perf_tool *tool);
+	void (*dump_auxtrace_sample)(struct perf_session *session,
+				     struct perf_sample *sample);
 	int (*flush_events)(struct perf_session *session,
 			    struct perf_tool *tool);
 	void (*free_events)(struct perf_session *session);
@@ -555,6 +558,8 @@ int auxtrace_parse_filters(struct evlist *evlist);
 
 int auxtrace__process_event(struct perf_session *session, union perf_event *event,
 			    struct perf_sample *sample, struct perf_tool *tool);
+void auxtrace__dump_auxtrace_sample(struct perf_session *session,
+				    struct perf_sample *sample);
 int auxtrace__flush_events(struct perf_session *session, struct perf_tool *tool);
 void auxtrace__free_events(struct perf_session *session);
 void auxtrace__free(struct perf_session *session);
@@ -674,6 +679,12 @@ int auxtrace__process_event(struct perf_session *session __maybe_unused,
 	return 0;
 }
 
+static inline
+void auxtrace__dump_auxtrace_sample(struct perf_session *session __maybe_unused,
+				    struct perf_sample *sample __maybe_unused)
+{
+}
+
 static inline
 int auxtrace__flush_events(struct perf_session *session __maybe_unused,
 			   struct perf_tool *tool __maybe_unused)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index dbdb47624dec..ab4dae1efea3 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1496,8 +1496,13 @@ static int perf_session__deliver_event(struct perf_session *session,
 	if (ret > 0)
 		return 0;
 
-	return machines__deliver_event(&session->machines, session->evlist,
-				       event, &sample, tool, file_offset);
+	ret = machines__deliver_event(&session->machines, session->evlist,
+				      event, &sample, tool, file_offset);
+
+	if (dump_trace && sample.aux_sample.size)
+		auxtrace__dump_auxtrace_sample(session, &sample);
+
+	return ret;
 }
 
 static s64 perf_session__process_user_event(struct perf_session *session,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 17/26] perf session: Add facility to peek at all events
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 16/26] perf auxtrace: Add support for dumping " Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 18/26] perf auxtrace: Add support for queuing AUX area samples Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

AUX area samples are not limited in how far back in time the sample
could start. Consequently samples must be queued in advance to allow for
time-ordered processing. To achieve that, add
perf_session__peek_events() that walks and peeks at all the events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 28 ++++++++++++++++++++++++++++
 tools/perf/util/session.h |  5 +++++
 2 files changed, 33 insertions(+)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index ab4dae1efea3..d0d7d25b23e3 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1659,6 +1659,34 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 	return 0;
 }
 
+int perf_session__peek_events(struct perf_session *session, u64 offset,
+			      u64 size, peek_events_cb_t cb, void *data)
+{
+	u64 max_offset = offset + size;
+	char buf[PERF_SAMPLE_MAX_SIZE];
+	union perf_event *event;
+	int err;
+
+	do {
+		err = perf_session__peek_event(session, offset, buf,
+					       PERF_SAMPLE_MAX_SIZE, &event,
+					       NULL);
+		if (err)
+			return err;
+
+		err = cb(session, event, offset, data);
+		if (err)
+			return err;
+
+		offset += event->header.size;
+		if (event->header.type == PERF_RECORD_AUXTRACE)
+			offset += event->auxtrace.size;
+
+	} while (offset < max_offset);
+
+	return err;
+}
+
 static s64 perf_session__process_event(struct perf_session *session,
 				       union perf_event *event, u64 file_offset)
 {
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 8456e1d868fd..f76480166d38 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -64,6 +64,11 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 			     void *buf, size_t buf_sz,
 			     union perf_event **event_ptr,
 			     struct perf_sample *sample);
+typedef int (*peek_events_cb_t)(struct perf_session *session,
+				union perf_event *event, u64 offset,
+				void *data);
+int perf_session__peek_events(struct perf_session *session, u64 offset,
+			      u64 size, peek_events_cb_t cb, void *data);
 
 int perf_session__process_events(struct perf_session *session);
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 18/26] perf auxtrace: Add support for queuing AUX area samples
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (16 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 17/26] perf session: Add facility to peek at all events Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 19/26] perf pmu: When using default config, record which bits of config were changed by the user Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Add functions to queue AUX area samples in advance
(auxtrace_queue_data()) or individually (auxtrace_queues__add_sample())
or find out what queue a sample belongs on
(auxtrace_queues__sample_queue()).

auxtrace_queue_data() can also queue snapshot data which keeps snapshots
and samples ordered with respect to each other in case support for that
is desired.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.c | 107 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/auxtrace.h |  15 ++++++
 2 files changed, 122 insertions(+)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 4f5c5fe3516b..eb087e7df6f4 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1004,6 +1004,113 @@ struct auxtrace_buffer *auxtrace_buffer__next(struct auxtrace_queue *queue,
 	}
 }
 
+struct auxtrace_queue *auxtrace_queues__sample_queue(struct auxtrace_queues *queues,
+						     struct perf_sample *sample,
+						     struct perf_session *session)
+{
+	struct perf_sample_id *sid;
+	unsigned int idx;
+	u64 id;
+
+	id = sample->id;
+	if (!id)
+		return NULL;
+
+	sid = perf_evlist__id2sid(session->evlist, id);
+	if (!sid)
+		return NULL;
+
+	idx = sid->idx;
+
+	if (idx >= queues->nr_queues)
+		return NULL;
+
+	return &queues->queue_array[idx];
+}
+
+int auxtrace_queues__add_sample(struct auxtrace_queues *queues,
+				struct perf_session *session,
+				struct perf_sample *sample, u64 data_offset,
+				u64 reference)
+{
+	struct auxtrace_buffer buffer = {
+		.pid = -1,
+		.data_offset = data_offset,
+		.reference = reference,
+		.size = sample->aux_sample.size,
+	};
+	struct perf_sample_id *sid;
+	u64 id = sample->id;
+	unsigned int idx;
+
+	if (!id)
+		return -EINVAL;
+
+	sid = perf_evlist__id2sid(session->evlist, id);
+	if (!sid)
+		return -ENOENT;
+
+	idx = sid->idx;
+	buffer.tid = sid->tid;
+	buffer.cpu = sid->cpu;
+
+	return auxtrace_queues__add_buffer(queues, session, idx, &buffer, NULL);
+}
+
+struct queue_data {
+	bool samples;
+	bool events;
+};
+
+static int auxtrace_queue_data_cb(struct perf_session *session,
+				  union perf_event *event, u64 offset,
+				  void *data)
+{
+	struct queue_data *qd = data;
+	struct perf_sample sample;
+	int err;
+
+	if (qd->events && event->header.type == PERF_RECORD_AUXTRACE) {
+		if (event->header.size < sizeof(struct perf_record_auxtrace))
+			return -EINVAL;
+		offset += event->header.size;
+		return session->auxtrace->queue_data(session, NULL, event,
+						     offset);
+	}
+
+	if (!qd->samples || event->header.type != PERF_RECORD_SAMPLE)
+		return 0;
+
+	err = perf_evlist__parse_sample(session->evlist, event, &sample);
+	if (err)
+		return err;
+
+	if (!sample.aux_sample.size)
+		return 0;
+
+	offset += sample.aux_sample.data - (void *)event;
+
+	return session->auxtrace->queue_data(session, &sample, NULL, offset);
+}
+
+int auxtrace_queue_data(struct perf_session *session, bool samples, bool events)
+{
+	struct queue_data qd = {
+		.samples = samples,
+		.events = events,
+	};
+
+	if (auxtrace__dont_decode(session))
+		return 0;
+
+	if (!session->auxtrace || !session->auxtrace->queue_data)
+		return -EINVAL;
+
+	return perf_session__peek_events(session, session->header.data_offset,
+					 session->header.data_size,
+					 auxtrace_queue_data_cb, &qd);
+}
+
 void *auxtrace_buffer__get_data(struct auxtrace_buffer *buffer, int fd)
 {
 	size_t adj = buffer->data_offset & (page_size - 1);
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 4a8ac7de6e22..749d72cd9c7b 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -141,6 +141,8 @@ struct auxtrace_index {
  * struct auxtrace - session callbacks to allow AUX area data decoding.
  * @process_event: lets the decoder see all session events
  * @process_auxtrace_event: process a PERF_RECORD_AUXTRACE event
+ * @queue_data: queue an AUX sample or PERF_RECORD_AUXTRACE event for later
+ *              processing
  * @dump_auxtrace_sample: dump AUX area sample data
  * @flush_events: process any remaining data
  * @free_events: free resources associated with event processing
@@ -154,6 +156,9 @@ struct auxtrace {
 	int (*process_auxtrace_event)(struct perf_session *session,
 				      union perf_event *event,
 				      struct perf_tool *tool);
+	int (*queue_data)(struct perf_session *session,
+			  struct perf_sample *sample, union perf_event *event,
+			  u64 data_offset);
 	void (*dump_auxtrace_sample)(struct perf_session *session,
 				     struct perf_sample *sample);
 	int (*flush_events)(struct perf_session *session,
@@ -467,9 +472,19 @@ int auxtrace_queues__add_event(struct auxtrace_queues *queues,
 			       struct perf_session *session,
 			       union perf_event *event, off_t data_offset,
 			       struct auxtrace_buffer **buffer_ptr);
+struct auxtrace_queue *
+auxtrace_queues__sample_queue(struct auxtrace_queues *queues,
+			      struct perf_sample *sample,
+			      struct perf_session *session);
+int auxtrace_queues__add_sample(struct auxtrace_queues *queues,
+				struct perf_session *session,
+				struct perf_sample *sample, u64 data_offset,
+				u64 reference);
 void auxtrace_queues__free(struct auxtrace_queues *queues);
 int auxtrace_queues__process_index(struct auxtrace_queues *queues,
 				   struct perf_session *session);
+int auxtrace_queue_data(struct perf_session *session, bool samples,
+			bool events);
 struct auxtrace_buffer *auxtrace_buffer__next(struct auxtrace_queue *queue,
 					      struct auxtrace_buffer *buffer);
 void *auxtrace_buffer__get_data(struct auxtrace_buffer *buffer, int fd);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 19/26] perf pmu: When using default config, record which bits of config were changed by the user
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (17 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 18/26] perf auxtrace: Add support for queuing AUX area samples Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 20/26] perf intel-pt: Add support for recording AUX area samples Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Default config for a PMU is defined before selected events are parsed.
That allows the user-entered config to override the default config.

However that does not allow for changing the default config based on
other options.

For example, if the user chooses AUX area sampling mode, in the case of
Intel PT, the psb_period needs to be small for sampling, so there is a
need to set the default psb_period to 0 (2 KiB) in that case. However
that should not override a value set by the user. To allow for that,
when using default config, record which bits of config were changed by
the user.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c        |  2 ++
 tools/perf/util/evsel_config.h |  2 ++
 tools/perf/util/parse-events.c | 42 +++++++++++++++++++++++++++++++++-
 tools/perf/util/pmu.c          | 10 ++++++++
 tools/perf/util/pmu.h          |  1 +
 5 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ad7665a546cf..f4dea055b080 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -849,6 +849,8 @@ static void apply_config_terms(struct evsel *evsel,
 		case PERF_EVSEL__CONFIG_TERM_AUX_SAMPLE_SIZE:
 			/* Already applied by auxtrace */
 			break;
+		case PERF_EVSEL__CONFIG_TERM_CFG_CHG:
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h
index 6e654ede8fbe..1f8d2fe0b66e 100644
--- a/tools/perf/util/evsel_config.h
+++ b/tools/perf/util/evsel_config.h
@@ -26,6 +26,7 @@ enum evsel_term_type {
 	PERF_EVSEL__CONFIG_TERM_PERCORE,
 	PERF_EVSEL__CONFIG_TERM_AUX_OUTPUT,
 	PERF_EVSEL__CONFIG_TERM_AUX_SAMPLE_SIZE,
+	PERF_EVSEL__CONFIG_TERM_CFG_CHG,
 };
 
 struct perf_evsel_config_term {
@@ -46,6 +47,7 @@ struct perf_evsel_config_term {
 		bool	      percore;
 		bool	      aux_output;
 		u32	      aux_sample_size;
+		u64	      cfg_chg;
 	} val;
 	bool weak;
 };
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index fc5e27bc8315..6c313c4087ed 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1290,7 +1290,40 @@ do {								\
 			break;
 		}
 	}
-#undef ADD_EVSEL_CONFIG
+	return 0;
+}
+
+/*
+ * Add PERF_EVSEL__CONFIG_TERM_CFG_CHG where cfg_chg will have a bit set for
+ * each bit of attr->config that the user has changed.
+ */
+static int get_config_chgs(struct perf_pmu *pmu, struct list_head *head_config,
+			   struct list_head *head_terms)
+{
+	struct parse_events_term *term;
+	u64 bits = 0;
+	int type;
+
+	list_for_each_entry(term, head_config, list) {
+		switch (term->type_term) {
+		case PARSE_EVENTS__TERM_TYPE_USER:
+			type = perf_pmu__format_type(&pmu->format, term->config);
+			if (type != PERF_PMU_FORMAT_VALUE_CONFIG)
+				continue;
+			bits |= perf_pmu__format_bits(&pmu->format, term->config);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_CONFIG:
+			bits = ~(u64)0;
+			break;
+		default:
+			break;
+		}
+	}
+
+	if (bits)
+		ADD_CONFIG_TERM(CFG_CHG, cfg_chg, bits);
+
+#undef ADD_CONFIG_TERM
 	return 0;
 }
 
@@ -1419,6 +1452,13 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
 	if (get_config_terms(head_config, &config_terms))
 		return -ENOMEM;
 
+	/*
+	 * When using default config, record which bits of attr->config were
+	 * changed by the user.
+	 */
+	if (pmu->default_config && get_config_chgs(pmu, head_config, &config_terms))
+		return -ENOMEM;
+
 	if (perf_pmu__config(pmu, &attr, head_config, parse_state->error)) {
 		struct perf_evsel_config_term *pos, *tmp;
 
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index db1e57113f4b..e8d348988026 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -931,6 +931,16 @@ __u64 perf_pmu__format_bits(struct list_head *formats, const char *name)
 	return bits;
 }
 
+int perf_pmu__format_type(struct list_head *formats, const char *name)
+{
+	struct perf_pmu_format *format = pmu_find_format(formats, name);
+
+	if (!format)
+		return -1;
+
+	return format->value;
+}
+
 /*
  * Sets value based on the format definition (format parameter)
  * and unformated value (value parameter).
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 2eb7a7001307..6737e3d5d568 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -72,6 +72,7 @@ int perf_pmu__config_terms(struct list_head *formats,
 			   struct list_head *head_terms,
 			   bool zero, struct parse_events_error *error);
 __u64 perf_pmu__format_bits(struct list_head *formats, const char *name);
+int perf_pmu__format_type(struct list_head *formats, const char *name);
 int perf_pmu__check_alias(struct perf_pmu *pmu, struct list_head *head_terms,
 			  struct perf_pmu_info *info);
 struct list_head *perf_pmu__alias(struct perf_pmu *pmu,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 20/26] perf intel-pt: Add support for recording AUX area samples
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (18 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 19/26] perf pmu: When using default config, record which bits of config were changed by the user Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 21/26] perf intel-pt: Add support for decoding " Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Set up the default number of mmap pages, default sample size and default
psb_period for AUX area sampling. Add documentation also.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/intel-pt.txt | 59 ++++++++++++++++++-
 tools/perf/arch/x86/util/auxtrace.c   |  2 +
 tools/perf/arch/x86/util/intel-pt.c   | 81 ++++++++++++++++++++++++++-
 3 files changed, 139 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index e0d9e7dd4f17..2cf2d9e9d0da 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -434,6 +434,56 @@ pwr_evt		Enable power events.  The power events provide information about
 		"0" otherwise.
 
 
+AUX area sampling option
+------------------------
+
+To select Intel PT "sampling" the AUX area sampling option can be used:
+
+	--aux-sample
+
+Optionally it can be followed by the sample size in bytes e.g.
+
+	--aux-sample=8192
+
+In addition, the Intel PT event to sample must be defined e.g.
+
+	-e intel_pt//u
+
+Samples on other events will be created containing Intel PT data e.g. the
+following will create Intel PT samples on the branch-misses event, note the
+events must be grouped using {}:
+
+	perf record --aux-sample -e '{intel_pt//u,branch-misses:u}'
+
+An alternative to '--aux-sample' is to add the config term 'aux-sample-size' to
+events.  In this case, the grouping is implied e.g.
+
+	perf record -e intel_pt//u -e branch-misses/aux-sample-size=8192/u
+
+is the same as:
+
+	perf record -e '{intel_pt//u,branch-misses/aux-sample-size=8192/u}'
+
+but allows for also using an address filter e.g.:
+
+	perf record -e intel_pt//u --filter 'filter * @/bin/ls' -e branch-misses/aux-sample-size=8192/u -- ls
+
+It is important to select a sample size that is big enough to contain at least
+one PSB packet.  If not a warning will be displayed:
+
+	Intel PT sample size (%zu) may be too small for PSB period (%zu)
+
+The calculation used for that is: if sample_size <= psb_period + 256 display the
+warning.  When sampling is used, psb_period defaults to 0 (2KiB).
+
+The default sample size is 4KiB.
+
+The sample size is passed in aux_sample_size in struct perf_event_attr.  The
+sample size is limited by the maximum event size which is 64KiB.  It is
+difficult to know how big the event might be without the trace sample attached,
+but the tool validates that the sample size is not greater than 60KiB.
+
+
 new snapshot option
 -------------------
 
@@ -487,8 +537,8 @@ their mlock limit (which defaults to 64KiB but is not multiplied by the number
 of cpus).
 
 In full-trace mode, powers of two are allowed for buffer size, with a minimum
-size of 2 pages.  In snapshot mode, it is the same but the minimum size is
-1 page.
+size of 2 pages.  In snapshot mode or sampling mode, it is the same but the
+minimum size is 1 page.
 
 The mmap size and auxtrace mmap size are displayed if the -vv option is used e.g.
 
@@ -501,12 +551,17 @@ Intel PT modes of operation
 
 Intel PT can be used in 2 modes:
 	full-trace mode
+	sample mode
 	snapshot mode
 
 Full-trace mode traces continuously e.g.
 
 	perf record -e intel_pt//u uname
 
+Sample mode attaches a Intel PT sample to other events e.g.
+
+	perf record --aux-sample -e intel_pt//u -e branch-misses:u
+
 Snapshot mode captures the available data when a signal is sent e.g.
 
 	perf record -v -e intel_pt//u -S ./loopy 1000000000 &
diff --git a/tools/perf/arch/x86/util/auxtrace.c b/tools/perf/arch/x86/util/auxtrace.c
index 96f4a2c11893..092543cad324 100644
--- a/tools/perf/arch/x86/util/auxtrace.c
+++ b/tools/perf/arch/x86/util/auxtrace.c
@@ -26,6 +26,8 @@ struct auxtrace_record *auxtrace_record__init_intel(struct evlist *evlist,
 	bool found_bts = false;
 
 	intel_pt_pmu = perf_pmu__find(INTEL_PT_PMU_NAME);
+	if (intel_pt_pmu)
+		intel_pt_pmu->auxtrace = true;
 	intel_bts_pmu = perf_pmu__find(INTEL_BTS_PMU_NAME);
 
 	evlist__for_each_entry(evlist, evsel) {
diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index d6d26256915f..20df442fdf36 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -17,6 +17,7 @@
 #include "../../util/event.h"
 #include "../../util/evlist.h"
 #include "../../util/evsel.h"
+#include "../../util/evsel_config.h"
 #include "../../util/cpumap.h"
 #include "../../util/mmap.h"
 #include <subcmd/parse-options.h>
@@ -551,6 +552,43 @@ static int intel_pt_validate_config(struct perf_pmu *intel_pt_pmu,
 					evsel->core.attr.config);
 }
 
+static void intel_pt_config_sample_mode(struct perf_pmu *intel_pt_pmu,
+					struct evsel *evsel)
+{
+	struct perf_evsel_config_term *term;
+	u64 user_bits = 0, bits;
+
+	term = perf_evsel__get_config_term(evsel, CFG_CHG);
+	if (term)
+		user_bits = term->val.cfg_chg;
+
+	bits = perf_pmu__format_bits(&intel_pt_pmu->format, "psb_period");
+
+	/* Did user change psb_period */
+	if (bits & user_bits)
+		return;
+
+	/* Set psb_period to 0 */
+	evsel->core.attr.config &= ~bits;
+}
+
+static void intel_pt_min_max_sample_sz(struct evlist *evlist,
+				       size_t *min_sz, size_t *max_sz)
+{
+	struct evsel *evsel;
+
+	evlist__for_each_entry(evlist, evsel) {
+		size_t sz = evsel->core.attr.aux_sample_size;
+
+		if (!sz)
+			continue;
+		if (min_sz && (sz < *min_sz || !*min_sz))
+			*min_sz = sz;
+		if (max_sz && sz > *max_sz)
+			*max_sz = sz;
+	}
+}
+
 /*
  * Currently, there is not enough information to disambiguate different PEBS
  * events, so only allow one.
@@ -606,6 +644,11 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 		return -EINVAL;
 	}
 
+	if (opts->auxtrace_snapshot_mode && opts->auxtrace_sample_mode) {
+		pr_err("Snapshot mode (" INTEL_PT_PMU_NAME " PMU) and sample trace cannot be used together\n");
+		return -EINVAL;
+	}
+
 	if (opts->use_clockid) {
 		pr_err("Cannot use clockid (-k option) with " INTEL_PT_PMU_NAME "\n");
 		return -EINVAL;
@@ -617,6 +660,9 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 	if (!opts->full_auxtrace)
 		return 0;
 
+	if (opts->auxtrace_sample_mode)
+		intel_pt_config_sample_mode(intel_pt_pmu, intel_pt_evsel);
+
 	err = intel_pt_validate_config(intel_pt_pmu, intel_pt_evsel);
 	if (err)
 		return err;
@@ -666,6 +712,34 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 				    opts->auxtrace_snapshot_size, psb_period);
 	}
 
+	/* Set default sizes for sample mode */
+	if (opts->auxtrace_sample_mode) {
+		size_t psb_period = intel_pt_psb_period(intel_pt_pmu, evlist);
+		size_t min_sz = 0, max_sz = 0;
+
+		intel_pt_min_max_sample_sz(evlist, &min_sz, &max_sz);
+		if (!opts->auxtrace_mmap_pages && !privileged &&
+		    opts->mmap_pages == UINT_MAX)
+			opts->mmap_pages = KiB(256) / page_size;
+		if (!opts->auxtrace_mmap_pages) {
+			size_t sz = round_up(max_sz, page_size) / page_size;
+
+			opts->auxtrace_mmap_pages = roundup_pow_of_two(sz);
+		}
+		if (max_sz > opts->auxtrace_mmap_pages * (size_t)page_size) {
+			pr_err("Sample size %zu must not be greater than AUX area tracing mmap size %zu\n",
+			       max_sz,
+			       opts->auxtrace_mmap_pages * (size_t)page_size);
+			return -EINVAL;
+		}
+		pr_debug2("Intel PT min. sample size: %zu max. sample size: %zu\n",
+			  min_sz, max_sz);
+		if (psb_period &&
+		    min_sz <= psb_period + INTEL_PT_PSB_PERIOD_NEAR)
+			ui__warning("Intel PT sample size (%zu) may be too small for PSB period (%zu)\n",
+				    min_sz, psb_period);
+	}
+
 	/* Set default sizes for full trace mode */
 	if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) {
 		if (privileged) {
@@ -682,7 +756,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 		size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size;
 		size_t min_sz;
 
-		if (opts->auxtrace_snapshot_mode)
+		if (opts->auxtrace_snapshot_mode || opts->auxtrace_sample_mode)
 			min_sz = KiB(4);
 		else
 			min_sz = KiB(8);
@@ -1136,5 +1210,10 @@ struct auxtrace_record *intel_pt_recording_init(int *err)
 	ptr->itr.parse_snapshot_options = intel_pt_parse_snapshot_options;
 	ptr->itr.reference = intel_pt_reference;
 	ptr->itr.read_finish = intel_pt_read_finish;
+	/*
+	 * Decoding starts at a PSB packet. Minimum PSB period is 2K so 4K
+	 * should give at least 1 PSB per sample.
+	 */
+	ptr->itr.default_aux_sample_size = 4096;
 	return &ptr->itr;
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 21/26] perf intel-pt: Add support for decoding AUX area samples
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (19 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 20/26] perf intel-pt: Add support for recording AUX area samples Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 22/26] perf intel-bts: Does not support AUX area sampling Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Add support for dumping, queuing and decoding AUX area samples. Decoding
samples is the same as regular decoding, except in the case where there
are no timestamps, in which case buffers are decoded immediately before
the sample event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt.c | 109 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 106 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index a1c9eb6d4f40..409afc611be9 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -233,6 +233,16 @@ static void intel_pt_log_event(union perf_event *event)
 	perf_event__fprintf(event, f);
 }
 
+static void intel_pt_dump_sample(struct perf_session *session,
+				 struct perf_sample *sample)
+{
+	struct intel_pt *pt = container_of(session->auxtrace, struct intel_pt,
+					   auxtrace);
+
+	printf("\n");
+	intel_pt_dump(pt, sample->aux_sample.data, sample->aux_sample.size);
+}
+
 static int intel_pt_do_fix_overlap(struct intel_pt *pt, struct auxtrace_buffer *a,
 				   struct auxtrace_buffer *b)
 {
@@ -836,6 +846,18 @@ static bool intel_pt_have_tsc(struct intel_pt *pt)
 	return have_tsc;
 }
 
+static bool intel_pt_sampling_mode(struct intel_pt *pt)
+{
+	struct evsel *evsel;
+
+	evlist__for_each_entry(pt->session->evlist, evsel) {
+		if ((evsel->core.attr.sample_type & PERF_SAMPLE_AUX) &&
+		    evsel->core.attr.aux_sample_size)
+			return true;
+	}
+	return false;
+}
+
 static u64 intel_pt_ns_to_ticks(const struct intel_pt *pt, u64 ns)
 {
 	u64 quot, rem;
@@ -2320,6 +2342,56 @@ static int intel_pt_process_timeless_queues(struct intel_pt *pt, pid_t tid,
 	return 0;
 }
 
+static void intel_pt_sample_set_pid_tid_cpu(struct intel_pt_queue *ptq,
+					    struct auxtrace_queue *queue,
+					    struct perf_sample *sample)
+{
+	struct machine *m = ptq->pt->machine;
+
+	ptq->pid = sample->pid;
+	ptq->tid = sample->tid;
+	ptq->cpu = queue->cpu;
+
+	intel_pt_log("queue %u cpu %d pid %d tid %d\n",
+		     ptq->queue_nr, ptq->cpu, ptq->pid, ptq->tid);
+
+	thread__zput(ptq->thread);
+
+	if (ptq->tid == -1)
+		return;
+
+	if (ptq->pid == -1) {
+		ptq->thread = machine__find_thread(m, -1, ptq->tid);
+		if (ptq->thread)
+			ptq->pid = ptq->thread->pid_;
+		return;
+	}
+
+	ptq->thread = machine__findnew_thread(m, ptq->pid, ptq->tid);
+}
+
+static int intel_pt_process_timeless_sample(struct intel_pt *pt,
+					    struct perf_sample *sample)
+{
+	struct auxtrace_queue *queue;
+	struct intel_pt_queue *ptq;
+	u64 ts = 0;
+
+	queue = auxtrace_queues__sample_queue(&pt->queues, sample, pt->session);
+	if (!queue)
+		return -EINVAL;
+
+	ptq = queue->priv;
+	if (!ptq)
+		return 0;
+
+	ptq->stop = false;
+	ptq->time = sample->time;
+	intel_pt_sample_set_pid_tid_cpu(ptq, queue, sample);
+	intel_pt_run_decoder(ptq, &ts);
+	return 0;
+}
+
 static int intel_pt_lost(struct intel_pt *pt, struct perf_sample *sample)
 {
 	return intel_pt_synth_error(pt, INTEL_PT_ERR_LOST, sample->cpu,
@@ -2550,7 +2622,11 @@ static int intel_pt_process_event(struct perf_session *session,
 	}
 
 	if (pt->timeless_decoding) {
-		if (event->header.type == PERF_RECORD_EXIT) {
+		if (pt->sampling_mode) {
+			if (sample->aux_sample.size)
+				err = intel_pt_process_timeless_sample(pt,
+								       sample);
+		} else if (event->header.type == PERF_RECORD_EXIT) {
 			err = intel_pt_process_timeless_queues(pt,
 							       event->fork.tid,
 							       sample->time);
@@ -2676,6 +2752,28 @@ static int intel_pt_process_auxtrace_event(struct perf_session *session,
 	return 0;
 }
 
+static int intel_pt_queue_data(struct perf_session *session,
+			       struct perf_sample *sample,
+			       union perf_event *event, u64 data_offset)
+{
+	struct intel_pt *pt = container_of(session->auxtrace, struct intel_pt,
+					   auxtrace);
+	u64 timestamp;
+
+	if (event) {
+		return auxtrace_queues__add_event(&pt->queues, session, event,
+						  data_offset, NULL);
+	}
+
+	if (sample->time && sample->time != (u64)-1)
+		timestamp = perf_time_to_tsc(sample->time, &pt->tc);
+	else
+		timestamp = 0;
+
+	return auxtrace_queues__add_sample(&pt->queues, session, sample,
+					   data_offset, timestamp);
+}
+
 struct intel_pt_synth {
 	struct perf_tool dummy_tool;
 	struct perf_session *session;
@@ -3178,7 +3276,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 	if (pt->timeless_decoding && !pt->tc.time_mult)
 		pt->tc.time_mult = 1;
 	pt->have_tsc = intel_pt_have_tsc(pt);
-	pt->sampling_mode = false;
+	pt->sampling_mode = intel_pt_sampling_mode(pt);
 	pt->est_tsc = !pt->timeless_decoding;
 
 	pt->unknown_thread = thread__new(999999999, 999999999);
@@ -3205,6 +3303,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 
 	pt->auxtrace.process_event = intel_pt_process_event;
 	pt->auxtrace.process_auxtrace_event = intel_pt_process_auxtrace_event;
+	pt->auxtrace.queue_data = intel_pt_queue_data;
+	pt->auxtrace.dump_auxtrace_sample = intel_pt_dump_sample;
 	pt->auxtrace.flush_events = intel_pt_flush;
 	pt->auxtrace.free_events = intel_pt_free_events;
 	pt->auxtrace.free = intel_pt_free;
@@ -3282,7 +3382,10 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 
 	intel_pt_setup_pebs_events(pt);
 
-	err = auxtrace_queues__process_index(&pt->queues, session);
+	if (pt->sampling_mode || list_empty(&session->auxtrace_index))
+		err = auxtrace_queue_data(session, true, true);
+	else
+		err = auxtrace_queues__process_index(&pt->queues, session);
 	if (err)
 		goto err_delete_thread;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 22/26] perf intel-bts: Does not support AUX area sampling
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (20 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 21/26] perf intel-pt: Add support for decoding " Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 23/26] libtraceevent: Fix header installation Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Adrian Hunter, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Add an error message because Intel BTS does not support AUX area
sampling.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/util/auxtrace.c  | 2 ++
 tools/perf/arch/x86/util/intel-bts.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/tools/perf/arch/x86/util/auxtrace.c b/tools/perf/arch/x86/util/auxtrace.c
index 092543cad324..7abc9fd4cbec 100644
--- a/tools/perf/arch/x86/util/auxtrace.c
+++ b/tools/perf/arch/x86/util/auxtrace.c
@@ -29,6 +29,8 @@ struct auxtrace_record *auxtrace_record__init_intel(struct evlist *evlist,
 	if (intel_pt_pmu)
 		intel_pt_pmu->auxtrace = true;
 	intel_bts_pmu = perf_pmu__find(INTEL_BTS_PMU_NAME);
+	if (intel_bts_pmu)
+		intel_bts_pmu->auxtrace = true;
 
 	evlist__for_each_entry(evlist, evsel) {
 		if (intel_pt_pmu && evsel->core.attr.type == intel_pt_pmu->type)
diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c
index f7f68a50a5cd..27d9e214d068 100644
--- a/tools/perf/arch/x86/util/intel-bts.c
+++ b/tools/perf/arch/x86/util/intel-bts.c
@@ -113,6 +113,11 @@ static int intel_bts_recording_options(struct auxtrace_record *itr,
 	const struct perf_cpu_map *cpus = evlist->core.cpus;
 	bool privileged = perf_event_paranoid_check(-1);
 
+	if (opts->auxtrace_sample_mode) {
+		pr_err("Intel BTS does not support AUX area sampling\n");
+		return -EINVAL;
+	}
+
 	btsr->evlist = evlist;
 	btsr->snapshot_mode = opts->auxtrace_snapshot_mode;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 23/26] libtraceevent: Fix header installation
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (21 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 22/26] perf intel-bts: Does not support AUX area sampling Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 24/26] libtraceevent: Fix memory leakage in copy_filter_type Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Sudip Mukherjee, Steven Rostedt,
	linux-trace-devel, Arnaldo Carvalho de Melo

From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>

When we passed some location in DESTDIR, install_headers called
do_install with DESTDIR as part of the second argument.

But do_install is again using '$(DESTDIR_SQ)$2', so as a result the
headers were installed in a location $DESTDIR/$DESTDIR.

In my testing I passed DESTDIR=/home/sudip/test and the headers were
installed in: /home/sudip/test/home/sudip/test/usr/include/traceevent.

Lets remove DESTDIR from the second argument of do_install so that the
headers are installed in the correct location.

Signed-off-by: Sudipm Mukherjee <sudipm.mukherjee@gmail.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Sudipm Mukherjee <sudipm.mukherjee@gmail.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191114133719.309-1-sudipm.mukherjee@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/Makefile | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile
index 5315f3787f8d..cbb429f55062 100644
--- a/tools/lib/traceevent/Makefile
+++ b/tools/lib/traceevent/Makefile
@@ -232,10 +232,10 @@ install_pkgconfig:
 
 install_headers:
 	$(call QUIET_INSTALL, headers) \
-		$(call do_install,event-parse.h,$(DESTDIR)$(includedir_SQ),644); \
-		$(call do_install,event-utils.h,$(DESTDIR)$(includedir_SQ),644); \
-		$(call do_install,trace-seq.h,$(DESTDIR)$(includedir_SQ),644); \
-		$(call do_install,kbuffer.h,$(DESTDIR)$(includedir_SQ),644)
+		$(call do_install,event-parse.h,$(includedir_SQ),644); \
+		$(call do_install,event-utils.h,$(includedir_SQ),644); \
+		$(call do_install,trace-seq.h,$(includedir_SQ),644); \
+		$(call do_install,kbuffer.h,$(includedir_SQ),644)
 
 install: install_lib
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 24/26] libtraceevent: Fix memory leakage in copy_filter_type
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (22 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 23/26] libtraceevent: Fix header installation Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 25/26] perf probe: Fix spelling mistake "addrees" -> "address" Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Hewenliang, Steven Rostedt, Tzvetomir Stoyanov,
	Arnaldo Carvalho de Melo

From: Hewenliang <hewenliang4@huawei.com>

It is necessary to free the memory that we have allocated when error occurs.

Fixes: ef3072cd1d5c ("tools lib traceevent: Get rid of die in add_filter_type()")
Signed-off-by: Hewenliang <hewenliang4@huawei.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: http://lore.kernel.org/lkml/20191119014415.57210-1-hewenliang4@huawei.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/parse-filter.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 552592d153fb..f3cbf86e51ac 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1473,8 +1473,10 @@ static int copy_filter_type(struct tep_event_filter *filter,
 	if (strcmp(str, "TRUE") == 0 || strcmp(str, "FALSE") == 0) {
 		/* Add trivial event */
 		arg = allocate_arg();
-		if (arg == NULL)
+		if (arg == NULL) {
+			free(str);
 			return -1;
+		}
 
 		arg->type = TEP_FILTER_ARG_BOOLEAN;
 		if (strcmp(str, "TRUE") == 0)
@@ -1483,8 +1485,11 @@ static int copy_filter_type(struct tep_event_filter *filter,
 			arg->boolean.value = 0;
 
 		filter_type = add_filter_type(filter, event->id);
-		if (filter_type == NULL)
+		if (filter_type == NULL) {
+			free(str);
+			free_arg(arg);
 			return -1;
+		}
 
 		filter_type->filter = arg;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 25/26] perf probe: Fix spelling mistake "addrees" -> "address"
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (23 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 24/26] libtraceevent: Fix memory leakage in copy_filter_type Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-22 14:57 ` [PATCH 26/26] perf parse: Fix potential memory leak when handling tracepoint errors Arnaldo Carvalho de Melo
  2019-11-23  8:07 ` [GIT PULL] perf/core improvements and fixes Ingo Molnar
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Colin Ian King, Masami Hiramatsu,
	Alexander Shishkin, Jiri Olsa, Mark Rutland, Peter Zijlstra,
	kernel-janitors, Arnaldo Carvalho de Melo

From: Colin Ian King <colin.king@canonical.com>

There is a spelling mistake in a pr_warning message. Fix it.

Signed-off-by: Colin King <colin.king@canonical.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191121092623.374896-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/probe-finder.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 38d6cd22779f..c470c49a804f 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -812,7 +812,7 @@ static int verify_representive_line(struct probe_finder *pf, const char *fname,
 	if (strcmp(fname, __fname) || lineno == __lineno)
 		return 0;
 
-	pr_warning("This line is sharing the addrees with other lines.\n");
+	pr_warning("This line is sharing the address with other lines.\n");
 
 	if (pf->pev->point.function) {
 		/* Find best match function name and lines */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 26/26] perf parse: Fix potential memory leak when handling tracepoint errors
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (24 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 25/26] perf probe: Fix spelling mistake "addrees" -> "address" Arnaldo Carvalho de Melo
@ 2019-11-22 14:57 ` Arnaldo Carvalho de Melo
  2019-11-23  8:07 ` [GIT PULL] perf/core improvements and fixes Ingo Molnar
  26 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-11-22 14:57 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Ian Rogers, Alexander Shishkin, Andi Kleen,
	Jin Yao, Jiri Olsa, Mark Rutland, Peter Zijlstra,
	Stephane Eranian, clang-built-linux, Arnaldo Carvalho de Melo

From: Ian Rogers <irogers@google.com>

An error may be in place when tracepoint_error is called, use
parse_events__handle_error to avoid a memory leak and to capture the
first and last error. Error detected by LLVM's libFuzzer using the
following event:

$ perf stat -e 'msr/event/,f:e'
event syntax error: 'msr/event/,f:e'
                     \___ can't access trace events

Error:  No permissions to read /sys/kernel/debug/tracing/events/f/e
Hint:   Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing/'

Initial error:
event syntax error: 'msr/event/,f:e'
                                \___ no value assigned for term
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191120180925.21787-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6c313c4087ed..ed7c008b9c8b 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -511,6 +511,7 @@ int parse_events_add_cache(struct list_head *list, int *idx,
 static void tracepoint_error(struct parse_events_error *e, int err,
 			     const char *sys, const char *name)
 {
+	const char *str;
 	char help[BUFSIZ];
 
 	if (!e)
@@ -524,18 +525,18 @@ static void tracepoint_error(struct parse_events_error *e, int err,
 
 	switch (err) {
 	case EACCES:
-		e->str = strdup("can't access trace events");
+		str = "can't access trace events";
 		break;
 	case ENOENT:
-		e->str = strdup("unknown tracepoint");
+		str = "unknown tracepoint";
 		break;
 	default:
-		e->str = strdup("failed to add tracepoint");
+		str = "failed to add tracepoint";
 		break;
 	}
 
 	tracing_path__strerror_open_tp(err, help, sizeof(help), sys, name);
-	e->help = strdup(help);
+	parse_events__handle_error(e, 0, strdup(str), strdup(help));
 }
 
 static int add_tracepoint(struct list_head *list, int *idx,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [GIT PULL] perf/core improvements and fixes
  2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (25 preceding siblings ...)
  2019-11-22 14:57 ` [PATCH 26/26] perf parse: Fix potential memory leak when handling tracepoint errors Arnaldo Carvalho de Melo
@ 2019-11-23  8:07 ` Ingo Molnar
  26 siblings, 0 replies; 28+ messages in thread
From: Ingo Molnar @ 2019-11-23  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas Gleixner, Jiri Olsa, Namhyung Kim, Clark Williams,
	linux-kernel, linux-perf-users, Adrian Hunter, Alexey Budankov,
	Colin King, Hewenliang, Ian Rogers, Jin Yao, Steven Rostedt,
	Sudipm Mukherjee, Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo/Thomas,
> 
> 	Please consider pulling,
> 
> Best regards,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 8f6ee51d772d0dab407d868449d2c5d9c8d2b6fc:
> 
>   Merge tag 'perf-core-for-mingo-5.5-20191119' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2019-11-19 12:59:03 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-5.5-20191122
> 
> for you to fetch changes up to 4584f084aa9d8033d5911935837dbee7b082d0e9:
> 
>   perf parse: Fix potential memory leak when handling tracepoint errors (2019-11-22 10:48:14 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> perf report:
> 
>   Jin Yao:
> 
>   - Allow entering the annotation view (symbol source/assembly +
>     overhead/cycles/etc column) from the 'perf report --total-cycles'
>     interface.
> 
>     E.g.:
> 
>       # perf record --all-cpus --branch-any --all-kernel
>       ^C[ perf record: Woken up 5 times to write data ]
>       #
>       # perf evlist -v
>       cycles: size: 120, { sample_period, sample_freq }: 4000,
>       sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK,
>       read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, mmap: 1, comm: 1, freq: 1, task: 1,
>       precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1,
>       bpf_event: 1, branch_sample_type: ANY
>       #
>       # perf report --total-cycles
>       #
>       # Samples: 78762 of event 'cycles'
>       Sampled  Sampled Avg      Avg
>       Cycles%  Cycles  Cycles%  Cycles                           [Program Block Range]     Shared Object
>         1.72%    95.8K   0.00%     254                        [msr.h:105 -> msr.h:166]  [kernel.vmlinux]
>         1.56%   107.6K   0.00%     618                [compiler.h:199 -> common.c:301]  [kernel.vmlinux]
>         0.83%    46.3K   0.00%     409              [entry_64.S:153 -> entry_64.S:175]  [kernel.vmlinux]
>         0.83%    46.1K   0.00%      83                  [jump_label.h:41 -> tsc.c:230]  [kernel.vmlinux]
>         0.64%    36.9K   0.01%    1.4K            [hda_intel.c:904 -> hda_intel.c:916]   [snd_hda_intel]
>         0.57%    30.2K   0.00%     282                      [file.c:710 -> file.c:730]  [kernel.vmlinux]
>         0.48%    25.8K   0.00%      82              [spinlock.c:158 -> spinlock.c:160]  [kernel.vmlinux]
>         0.45%    23.7K   0.00%     369  [tick-broadcast.c:585 -> tick-broadcast.c:586]  [kernel.vmlinux]
>         0.44%    24.4K   0.00%      73                       [msr.h:236 -> tsc.c:1088]  [kernel.vmlinux]
>         0.43%    22.7K   0.00%     144                [cpuidle.c:229 -> cpuidle.c:232]  [kernel.vmlinux]
> 
>     Then press 'A' or Enter on one of those lines, just like with 'perf top', say
>     the top one: [msr.h:105 -> msr.h:166], then this shows up:
> 
>       Samples: 78K of event 'cycles', 4000 Hz, Event count (approx.): 78762
>       native_write_msr  /lib/modules/5.4.0-rc8/build/vmlinux [Percent: local period]
>       Percent│ IPC Cycle (Average IPC: 0.02, IPC Coverage: 50.0%)
>              │
>              │             Disassembly of section .text:
>              │
>              │             ffffffff8106c480 <native_write_msr>:
>              │             __wrmsr():
>              │             return EAX_EDX_VAL(val, low, high);
>              │             }
>              │
>              │             static inline void notrace __wrmsr(unsigned int msr, u32 low, u32 high)
>              │             {
>              │             asm volatile("1: wrmsr\n"
>        49.16 │0.02           mov   %edi,%ecx
>              │0.02           mov   %esi,%eax
>              │0.02           wrmsr
>              │             arch_static_branch():
>              │             #include <linux/stringify.h>
>              │             #include <linux/types.h>
>              │
>              │             static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
>              │             {
>              │             asm_volatile_goto("1:"
>         0.79 │0.02           nop
>              │             native_write_msr():
>              │             {
>              │             __wrmsr(msr, low, high);
>              │
>              │             if (msr_tracepoint_active(__tracepoint_write_msr))
>              │             do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
>              │             }
>        50.05 │0.02  254    ← retq
>              │             do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
>              │               shl   $0x20,%rdx
>              │               mov   %esi,%esi
>              │               or    %rdx,%rsi
>              │               xor   %edx,%edx
>              │             → jmpq  do_trace_write_msr
> 
>     We need to improve this to show the source code line numbers in the
>     annotation view, so one can go from that program block to the annotation view
>     and see those source code line numbers straight away.
> 
> auxtrace/Intel PT:
> 
>   Adrian Hunter:
> 
>   - Add support for AUX area sampling, requires new functionality that
>     will land in 5.5, its already in tip.
> 
>     This includes kernel capability querying so that it fails gracefully
>     with older kernels, duimping aux area samples in 'perf report -D' and
>     'perf script'.
> 
> perf.data:
> 
>   Alexey Budankov:
> 
>   - Fix decompression of PERF_RECORD_COMPRESSED records.
> 
> core:
> 
>   Arnaldo Carvalho de Melo:
> 
>   - Use the 'dcacheline' cmp routine to find the right DSOs taking into
>     account the 'maj', 'min', 'ino' and 'ino_generation', that got moved
>     from 'struct map' to 'struct dso', where it belongs.
> 
>     This further reduces the size of 'struct map', there is still more
>     work to do to maybe get it to max one cacheline.
> 
> libtraceevent:
> 
>   Hewenliang:
> 
>   - Fix memory leakage in copy_filter_type().
> 
>   Sudip Mukherjee:
> 
>   - Fix header installation.
> 
> perf parse:
> 
>   Ian Rogers :
> 
>   - Fix potential memory leak when handling tracepoint errors, found using
>     LLVM's libFuzzer.
> 
> perf probe:
> 
>   Colin Ian King:
> 
>   - Fix spelling mistake "addrees" -> "address".
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------

>  46 files changed, 1190 insertions(+), 200 deletions(-)

Pulled, thanks a lot Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2019-11-23  8:07 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-22 14:56 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 01/26] perf map: Move maj/min/ino/ino_generation to separate struct Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 02/26] perf map: Pass a dso_id to map__new() Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 03/26] perf map: Move comparision of map's dso_id to a separate function Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 04/26] perf dsos: Remove unused dsos__find() method Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 05/26] perf dso: Move dso_id from 'struct map' to 'struct dso' Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 06/26] perf session: Fix decompression of PERF_RECORD_COMPRESSED records Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 07/26] perf util: Move block TUI function to ui browsers Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 08/26] perf report: Jump to symbol source view from total cycles view Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 09/26] perf tools: Add kernel AUX area sampling definitions Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 10/26] perf record: Add a function to test for kernel support for AUX area sampling Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 11/26] perf auxtrace: Move perf_evsel__find_pmu() Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 12/26] perf auxtrace: Add support for AUX area sample recording Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 13/26] perf record: Add support for AUX area sampling Arnaldo Carvalho de Melo
2019-11-22 14:56 ` [PATCH 14/26] perf record: Add aux-sample-size config term Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 15/26] perf inject: Cut AUX area samples Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 16/26] perf auxtrace: Add support for dumping " Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 17/26] perf session: Add facility to peek at all events Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 18/26] perf auxtrace: Add support for queuing AUX area samples Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 19/26] perf pmu: When using default config, record which bits of config were changed by the user Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 20/26] perf intel-pt: Add support for recording AUX area samples Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 21/26] perf intel-pt: Add support for decoding " Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 22/26] perf intel-bts: Does not support AUX area sampling Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 23/26] libtraceevent: Fix header installation Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 24/26] libtraceevent: Fix memory leakage in copy_filter_type Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 25/26] perf probe: Fix spelling mistake "addrees" -> "address" Arnaldo Carvalho de Melo
2019-11-22 14:57 ` [PATCH 26/26] perf parse: Fix potential memory leak when handling tracepoint errors Arnaldo Carvalho de Melo
2019-11-23  8:07 ` [GIT PULL] perf/core improvements and fixes Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).