linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL 00/29] perf/core improvements and fixes
@ 2019-01-25 23:18 Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 01/29] perf symbols: Move symbol_conf to separate file Arnaldo Carvalho de Melo
                   ` (29 more replies)
  0 siblings, 30 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Davidlohr Bueso, Song Liu, Tony Jones,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit f575494d4a610278ea8597f2f798c8431b94e884:

  Merge tag 'perf-core-for-mingo-5.0-20190121' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2019-01-22 11:07:01 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-5.0-20190126

for you to fetch changes up to 76a06125dd57ed2c7559410168b543313fa0cc51:

  perf augmented_syscalls: Convert to bpf_map() (2019-01-25 15:12:11 +0100)

----------------------------------------------------------------
perf/core improvements and fixes:

BPF:

  Song Liu:

  - Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT

  Arnaldo Carvalho de Melo:

  - Add bpf_map() helper, to make BPF map declararions more compact and
    allow for BTF annotations to be made transparently.

perf script python:

  Tony Jones:

  - Remove explicit shebangs.

  - Fix the PYTHON=python3 builds.

Core:

  Davidlohr Bueso:

  - Update rbtree implementation, getting it closer to the kernel one.

  - Use cached rbtrees.

  Arnaldo Carvalho de Melo:

  - Remove some needless headers from .c and .h files fixing up the fallout,
    to reduce building time when changes are made to .h files

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (16):
      perf color: Add missing stdarg.g to color.h
      perf symbols: Move symbol_conf to separate file
      perf annotate: Remove lots of headers from annotate.h
      perf tools: Move branch structs to branch.h
      perf block-range: Add missing headers
      perf symbols: Remove include map.h from dso.h
      perf symbols: Remove some unnecessary includes from symbol.h
      perf namespaces: Remove namespaces.h from .h headers
      perf comm: Remove needless headers from comm.h
      perf callchain: No need to include perf.h
      perf bpf: Add bpf_map() helper
      perf bpf: Convert pid_map() to bpf_map()
      perf augmented_raw_syscalls: Use bpf_map()
      perf trace: Fixup etcsnoop example
      perf bpf examples: Convert etcsnoop to use bpf_map()
      perf augmented_syscalls: Convert to bpf_map()

Davidlohr Bueso (7):
      tools: Update rbtree implementation
      perf machine: Use cached rbtrees
      perf callchain: Use cached rbtrees
      perf util: Use cached rbtree for rblists
      perf symbols: Use cached rbtrees
      perf hist: Use cached rbtrees
      perf sched: Use cached rbtrees

Song Liu (1):
      perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT

Tony Jones (6):
      perf script python: Add trace_context extension module to sys.modules
      perf script python: Use PyBytes for attr in trace-event-python
      perf script python: Remove explicit shebang from setup.py
      perf script python: Remove explicit shebang from tests/attr.c
      perf script python: Remove explicit shebang from Python scripts
      perf script python: Add Python3 support to tests/attr.py

 tools/include/linux/rbtree.h                       |  52 +++++-
 tools/include/linux/rbtree_augmented.h             |  60 +++++--
 tools/lib/rbtree.c                                 | 178 +++++++++++++-----
 tools/perf/Makefile.perf                           |   4 +-
 tools/perf/builtin-annotate.c                      |   4 +-
 tools/perf/builtin-c2c.c                           |   6 +-
 tools/perf/builtin-diff.c                          |  10 +-
 tools/perf/builtin-probe.c                         |   1 +
 tools/perf/builtin-report.c                        |   3 +-
 tools/perf/builtin-sched.c                         |  45 ++---
 tools/perf/builtin-top.c                           |   2 +-
 tools/perf/builtin-trace.c                         |   3 +-
 tools/perf/examples/bpf/augmented_raw_syscalls.c   |  14 +-
 tools/perf/examples/bpf/augmented_syscalls.c       |   8 +-
 tools/perf/examples/bpf/etcsnoop.c                 |   8 +-
 tools/perf/include/bpf/bpf.h                       |  16 +-
 tools/perf/scripts/python/exported-sql-viewer.py   |   1 -
 tools/perf/scripts/python/sched-migration.py       |   2 -
 tools/perf/scripts/python/stat-cpi.py              |   1 -
 tools/perf/tests/attr.py                           |  33 ++--
 tools/perf/tests/hists_common.c                    |   8 +-
 tools/perf/tests/hists_cumulate.c                  |  14 +-
 tools/perf/tests/hists_link.c                      |   8 +-
 tools/perf/tests/hists_output.c                    |  32 ++--
 tools/perf/ui/browsers/hists.c                     |  16 +-
 tools/perf/ui/browsers/map.c                       |   1 +
 tools/perf/ui/gtk/annotate.c                       |   1 +
 tools/perf/ui/gtk/hists.c                          |   6 +-
 tools/perf/ui/stdio/hist.c                         |   3 +-
 tools/perf/util/annotate.c                         |   1 +
 tools/perf/util/annotate.h                         |  14 +-
 tools/perf/util/block-range.c                      |   2 +
 tools/perf/util/block-range.h                      |   6 +-
 tools/perf/util/bpf-event.c                        |  10 +-
 tools/perf/util/branch.h                           |  27 ++-
 tools/perf/util/build-id.c                         |  13 +-
 tools/perf/util/build-id.h                         |   3 +-
 tools/perf/util/callchain.h                        |   1 -
 tools/perf/util/color.h                            |   1 +
 tools/perf/util/comm.c                             |   1 +
 tools/perf/util/comm.h                             |   4 +-
 tools/perf/util/dso.c                              |  10 +-
 tools/perf/util/dso.h                              |  16 +-
 tools/perf/util/event.h                            |  21 +--
 tools/perf/util/hist.c                             | 199 ++++++++++++---------
 tools/perf/util/hist.h                             |  10 +-
 tools/perf/util/intlist.h                          |   2 +-
 tools/perf/util/jitdump.c                          |   1 +
 tools/perf/util/machine.c                          |  53 +++---
 tools/perf/util/machine.h                          |  12 +-
 tools/perf/util/map.c                              |   8 +-
 tools/perf/util/metricgroup.c                      |   2 +-
 tools/perf/util/probe-event.c                      |   4 +-
 tools/perf/util/probe-event.h                      |   5 +-
 tools/perf/util/probe-file.c                       |   1 +
 tools/perf/util/rb_resort.h                        |   8 +-
 tools/perf/util/rblist.c                           |  28 +--
 tools/perf/util/rblist.h                           |   2 +-
 .../util/scripting-engines/trace-event-python.c    |  15 +-
 tools/perf/util/setup.py                           |   2 -
 tools/perf/util/sort.h                             |   4 +-
 tools/perf/util/srcline.c                          |  43 +++--
 tools/perf/util/srcline.h                          |  13 +-
 tools/perf/util/stat-display.c                     |   1 +
 tools/perf/util/stat-shadow.c                      |   2 +-
 tools/perf/util/strlist.h                          |   2 +-
 tools/perf/util/symbol-minimal.c                   |   1 +
 tools/perf/util/symbol.c                           |  87 ++++-----
 tools/perf/util/symbol.h                           |  88 ++-------
 tools/perf/util/symbol_conf.h                      |  73 ++++++++
 tools/perf/util/symbol_fprintf.c                   |   3 +-
 tools/perf/util/thread.h                           |   1 +
 tools/perf/util/util.c                             |   1 +
 73 files changed, 810 insertions(+), 531 deletions(-)
 create mode 100644 tools/perf/util/symbol_conf.h

Test results:

The first ones are container based builds of tools/perf with and without libelf
support.  Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

  $ dm
   1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   5 alpine:3.8                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   6 alpine:edge                   : Ok   gcc (Alpine 8.2.0) 8.2.0
   7 amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
   8 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
   9 android-ndk:r12b-arm          : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  10 android-ndk:r15c-arm          : Ok   gcc (GCC) 8.2.1 20181215 (Red Hat 8.2.1-6)
  11 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  12 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
  13 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
  14 clearlinux:latest             : Ok   gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502
  15 debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  16 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u2) 4.9.2
  17 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
  18 debian:experimental           : Ok   gcc (Debian 8.2.0-13) 8.2.0
  19 debian:experimental-x-arm64   : Ok   gcc (Debian 8.2.0-13) 8.2.0
  20 debian:experimental-x-mips    : Ok   gcc (Debian 8.2.0-13) 8.2.0
  21 debian:experimental-x-mips64  : Ok   gcc (Debian 8.2.0-13) 8.2.0
  22 debian:experimental-x-mipsel  : Ok   gcc (Debian 8.2.0-13) 8.2.0
  23 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  24 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  25 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  26 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  27 fedora:24-x-ARC-uClibc        : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  28 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  29 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
  30 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)
  31 fedora:28                     : Ok   gcc (GCC) 8.2.1 20181105 (Red Hat 8.2.1-5)
  32 fedora:29                     : Ok   gcc (GCC) 8.2.1 20181105 (Red Hat 8.2.1-5)
  33 fedora:rawhide                : Ok   gcc (GCC) 8.2.1 20190109 (Red Hat 8.2.1-7)
  34 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 7.3.0-r3 p1.4) 7.3.0
  35 mageia:5                      : Ok   gcc (GCC) 4.9.2
  36 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0
  37 opensuse:13.2                 : Ok   gcc (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
  38 opensuse:15.0                 : Ok   gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
  39 opensuse:42.1                 : Ok   gcc (SUSE Linux) 4.8.5
  40 opensuse:42.2                 : Ok   gcc (SUSE Linux) 4.8.5
  41 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5
  42 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 8.2.1 20181108 [gcc-8-branch revision 265914]
  43 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
  44 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36.0.1)
  45 ubuntu:12.04.5                : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  46 ubuntu:14.04.4                : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
  47 ubuntu:14.04.4-x-linaro-arm64 : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
  48 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  49 ubuntu:16.04-x-arm            : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  50 ubuntu:16.04-x-arm64          : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  51 ubuntu:16.04-x-powerpc        : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  52 ubuntu:16.04-x-powerpc64      : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  53 ubuntu:16.04-x-powerpc64el    : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  54 ubuntu:16.04-x-s390           : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  55 ubuntu:17.10                  : Ok   gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
  56 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  57 ubuntu:18.04-x-arm            : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  58 ubuntu:18.04-x-arm64          : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  59 ubuntu:18.04-x-m68k           : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  60 ubuntu:18.04-x-powerpc        : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  61 ubuntu:18.04-x-powerpc64      : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  62 ubuntu:18.04-x-powerpc64el    : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  63 ubuntu:18.04-x-riscv64        : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  64 ubuntu:18.04-x-s390           : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  65 ubuntu:18.04-x-sh4            : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  66 ubuntu:18.04-x-sparc64        : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  67 ubuntu:18.10                  : Ok   gcc (Ubuntu 8.2.0-7ubuntu1) 8.2.0
  68 ubuntu:19.04                  : Ok   gcc (Ubuntu 8.2.0-12ubuntu1) 8.2.0
  69 ubuntu:19.04-x-alpha          : Ok   gcc (Ubuntu 8.2.0-12ubuntu1) 8.2.0
  70 ubuntu:19.04-x-hppa           : Ok   gcc (Ubuntu 8.2.0-12ubuntu1) 8.2.0
  $
    # uname -a
  Linux quaco 4.20.3-200.fc29.x86_64 #1 SMP Thu Jan 17 15:19:35 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  # git log --oneline -1
  76a06125dd57 perf augmented_syscalls: Convert to bpf_map()
  # perf version --build-options
  perf version 5.0.rc2.g76a061
                   dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                   glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                    gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
           syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                  libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
               libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
               get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Breakpoint accounting                                 : Ok
  22: Watchpoint                                            :
  22.1: Read Only Watchpoint                                : Skip
  22.2: Write Only Watchpoint                               : Ok
  22.3: Read / Write Watchpoint                             : Ok
  22.4: Modify Watchpoint                                   : Ok
  23: Number of exit events of a simple workload            : Ok
  24: Software clock events period values                   : Ok
  25: Object code reading                                   : Ok
  26: Sample parsing                                        : Ok
  27: Use a dummy software event to keep tracking           : Ok
  28: Parse with no sample_id_all bit set                   : Ok
  29: Filter hist entries                                   : Ok
  30: Lookup mmap thread                                    : Ok
  31: Share thread mg                                       : Ok
  32: Sort output of hist entries                           : Ok
  33: Cumulate child hist entries                           : Ok
  34: Track with sched_switch                               : Ok
  35: Filter fds with revents mask in a fdarray             : Ok
  36: Add fd to a fdarray, making it autogrow               : Ok
  37: kmod_path__parse                                      : Ok
  38: Thread map                                            : Ok
  39: LLVM search and compile                               :
  39.1: Basic BPF llvm compile                              : Ok
  39.2: kbuild searching                                    : Ok
  39.3: Compile source for BPF prologue generation          : Ok
  39.4: Compile source for BPF relocation                   : Ok
  40: Session topology                                      : Ok
  41: BPF filter                                            :
  41.1: Basic BPF filtering                                 : Ok
  41.2: BPF pinning                                         : Ok
  41.3: BPF prologue generation                             : Ok
  41.4: BPF relocation checker                              : Ok
  42: Synthesize thread map                                 : Ok
  43: Remove thread map                                     : Ok
  44: Synthesize cpu map                                    : Ok
  45: Synthesize stat config                                : Ok
  46: Synthesize stat                                       : Ok
  47: Synthesize stat round                                 : Ok
  48: Synthesize attr update                                : Ok
  49: Event times                                           : Ok
  50: Read backward ring buffer                             : Ok
  51: Print cpu map                                         : Ok
  52: Probe SDT events                                      : Ok
  53: is_printable_array                                    : Ok
  54: Print bitmap                                          : Ok
  55: perf hooks                                            : Ok
  56: builtin clang support                                 : Skip (not compiled in)
  57: unit_number__scnprintf                                : Ok
  58: mem2node                                              : Ok
  59: x86 rdpmc                                             : Ok
  60: Convert perf time to TSC                              : Ok
  61: DWARF unwind                                          : Ok
  62: x86 instruction decoder - new instructions            : Ok
  63: x86 bp modify                                         : Ok
  64: probe libc's inet_pton & backtrace it with ping       : Ok
  65: Use vfs_getname probe to get syscall args filenames   : Ok
  66: Add vfs_getname probe to get syscall args filenames   : Ok
  67: Check open filename arg using perf trace + vfs_getname: Ok

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
           make_no_libunwind_O: make NO_LIBUNWIND=1
            make_no_demangle_O: make NO_DEMANGLE=1
             make_no_libperl_O: make NO_LIBPERL=1
              make_no_libelf_O: make NO_LIBELF=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
                   make_pure_O: make
                make_no_newt_O: make NO_NEWT=1
             make_no_libnuma_O: make NO_LIBNUMA=1
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
              make_clean_all_O: make clean all
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
              make_no_libbpf_O: make NO_LIBBPF=1
         make_with_clangllvm_O: make LIBCLANGLLVM=1
             make_util_map_o_O: make util/map.o
                   make_tags_O: make tags
                 make_static_O: make LDFLAGS=-static
        make_with_babeltrace_O: make LIBBABELTRACE=1
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
            make_install_bin_O: make install-bin
                    make_doc_O: make doc
               make_no_slang_O: make NO_SLANG=1
                  make_debug_O: make DEBUG=1
         make_install_prefix_O: make install prefix=/tmp/krava
       make_util_pmu_bison_o_O: make util/pmu-bison.o
           make_no_libbionic_O: make NO_LIBBIONIC=1
            make_no_auxtrace_O: make NO_AUXTRACE=1
                make_no_gtk2_O: make NO_GTK2=1
           make_no_backtrace_O: make NO_BACKTRACE=1
           make_no_libpython_O: make NO_LIBPYTHON=1
                 make_perf_o_O: make perf.o
                   make_help_O: make help
                make_install_O: make install
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 01/29] perf symbols: Move symbol_conf to separate file
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 02/29] perf annotate: Remove lots of headers from annotate.h Arnaldo Carvalho de Melo
                   ` (28 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Namhyung Kim

From: Arnaldo Carvalho de Melo <acme@redhat.com>

So that we don't drag all the headers included in symbol.h when needing
to access symbol_conf in another header, such as annotate.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-rvo9dzflkneqmprb0dgbfybx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/symbol.h      | 64 +-----------------------------
 tools/perf/util/symbol_conf.h | 73 +++++++++++++++++++++++++++++++++++
 2 files changed, 74 insertions(+), 63 deletions(-)
 create mode 100644 tools/perf/util/symbol_conf.h

diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 14d9d438e7e2..277841e3dbd1 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -15,6 +15,7 @@
 #include "build-id.h"
 #include "event.h"
 #include "path.h"
+#include "symbol_conf.h"
 
 #ifdef HAVE_LIBELF_SUPPORT
 #include <libelf.h>
@@ -89,69 +90,6 @@ static inline size_t symbol__size(const struct symbol *sym)
 struct strlist;
 struct intlist;
 
-struct symbol_conf {
-	unsigned short	priv_size;
-	bool		try_vmlinux_path,
-			init_annotation,
-			force,
-			ignore_vmlinux,
-			ignore_vmlinux_buildid,
-			show_kernel_path,
-			use_modules,
-			allow_aliases,
-			sort_by_name,
-			show_nr_samples,
-			show_total_period,
-			use_callchain,
-			cumulate_callchain,
-			show_branchflag_count,
-			exclude_other,
-			show_cpu_utilization,
-			initialized,
-			kptr_restrict,
-			event_group,
-			demangle,
-			demangle_kernel,
-			filter_relative,
-			show_hist_headers,
-			branch_callstack,
-			has_filter,
-			show_ref_callgraph,
-			hide_unresolved,
-			raw_trace,
-			report_hierarchy,
-			inline_name;
-	const char	*vmlinux_name,
-			*kallsyms_name,
-			*source_prefix,
-			*field_sep,
-			*graph_function;
-	const char	*default_guest_vmlinux_name,
-			*default_guest_kallsyms,
-			*default_guest_modules;
-	const char	*guestmount;
-	const char	*dso_list_str,
-			*comm_list_str,
-			*pid_list_str,
-			*tid_list_str,
-			*sym_list_str,
-			*col_width_list_str,
-			*bt_stop_list_str;
-       struct strlist	*dso_list,
-			*comm_list,
-			*sym_list,
-			*dso_from_list,
-			*dso_to_list,
-			*sym_from_list,
-			*sym_to_list,
-			*bt_stop_list;
-	struct intlist	*pid_list,
-			*tid_list;
-	const char	*symfs;
-};
-
-extern struct symbol_conf symbol_conf;
-
 struct symbol_name_rb_node {
 	struct rb_node	rb_node;
 	struct symbol	sym;
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
new file mode 100644
index 000000000000..fffea68c1203
--- /dev/null
+++ b/tools/perf/util/symbol_conf.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_SYMBOL_CONF
+#define __PERF_SYMBOL_CONF 1
+
+#include <stdbool.h>
+
+struct strlist;
+struct intlist;
+
+struct symbol_conf {
+	unsigned short	priv_size;
+	bool		try_vmlinux_path,
+			init_annotation,
+			force,
+			ignore_vmlinux,
+			ignore_vmlinux_buildid,
+			show_kernel_path,
+			use_modules,
+			allow_aliases,
+			sort_by_name,
+			show_nr_samples,
+			show_total_period,
+			use_callchain,
+			cumulate_callchain,
+			show_branchflag_count,
+			exclude_other,
+			show_cpu_utilization,
+			initialized,
+			kptr_restrict,
+			event_group,
+			demangle,
+			demangle_kernel,
+			filter_relative,
+			show_hist_headers,
+			branch_callstack,
+			has_filter,
+			show_ref_callgraph,
+			hide_unresolved,
+			raw_trace,
+			report_hierarchy,
+			inline_name;
+	const char	*vmlinux_name,
+			*kallsyms_name,
+			*source_prefix,
+			*field_sep,
+			*graph_function;
+	const char	*default_guest_vmlinux_name,
+			*default_guest_kallsyms,
+			*default_guest_modules;
+	const char	*guestmount;
+	const char	*dso_list_str,
+			*comm_list_str,
+			*pid_list_str,
+			*tid_list_str,
+			*sym_list_str,
+			*col_width_list_str,
+			*bt_stop_list_str;
+       struct strlist	*dso_list,
+			*comm_list,
+			*sym_list,
+			*dso_from_list,
+			*dso_to_list,
+			*sym_from_list,
+			*sym_to_list,
+			*bt_stop_list;
+	struct intlist	*pid_list,
+			*tid_list;
+	const char	*symfs;
+};
+
+extern struct symbol_conf symbol_conf;
+
+#endif // __PERF_SYMBOL_CONF
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 02/29] perf annotate: Remove lots of headers from annotate.h
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 01/29] perf symbols: Move symbol_conf to separate file Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 03/29] perf tools: Move branch structs to branch.h Arnaldo Carvalho de Melo
                   ` (27 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Namhyung Kim

From: Arnaldo Carvalho de Melo <acme@redhat.com>

To reduce the chances changes trigger tons of rebuilds, more to come.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ytbykaku63862guk7muflcy4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/gtk/annotate.c   |  1 +
 tools/perf/util/annotate.h     | 14 +++++++++++---
 tools/perf/util/stat-display.c |  1 +
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 48428c9acd89..f22a7d142ada 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "gtk.h"
+#include "util/sort.h"
 #include "util/debug.h"
 #include "util/annotate.h"
 #include "util/evsel.h"
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index fb6463730ba4..95053cab41fe 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -4,16 +4,24 @@
 
 #include <stdbool.h>
 #include <stdint.h>
+#include <stdio.h>
 #include <linux/types.h>
-#include "symbol.h"
-#include "hist.h"
-#include "sort.h"
 #include <linux/list.h>
 #include <linux/rbtree.h>
 #include <pthread.h>
 #include <asm/bug.h>
+#include "symbol_conf.h"
 
+struct hist_browser_timer;
+struct hist_entry;
 struct ins_ops;
+struct map;
+struct map_symbol;
+struct addr_map_symbol;
+struct option;
+struct perf_sample;
+struct perf_evsel;
+struct symbol;
 
 struct ins {
 	const char     *name;
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 665ee374fc01..6d043c78f3c2 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -2,6 +2,7 @@
 #include <inttypes.h>
 #include <linux/time64.h>
 #include <math.h>
+#include "color.h"
 #include "evlist.h"
 #include "evsel.h"
 #include "stat.h"
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 03/29] perf tools: Move branch structs to branch.h
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 01/29] perf symbols: Move symbol_conf to separate file Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 02/29] perf annotate: Remove lots of headers from annotate.h Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 04/29] perf block-range: Add missing headers Arnaldo Carvalho de Melo
                   ` (26 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Namhyung Kim

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We already have it, move those there from events.h so that we untangle
the header dependencies a bit more.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-pnbkqo8jxbi49d4f3yd3b5w3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/branch.h | 27 ++++++++++++++++++++++++---
 tools/perf/util/event.h  | 21 +--------------------
 tools/perf/util/symbol.h |  1 +
 3 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
index 1e3c7c5cdc63..64f96b79f1d7 100644
--- a/tools/perf/util/branch.h
+++ b/tools/perf/util/branch.h
@@ -1,8 +1,31 @@
 #ifndef _PERF_BRANCH_H
 #define _PERF_BRANCH_H 1
 
+#include <stdio.h>
 #include <stdint.h>
-#include "../perf.h"
+#include <linux/perf_event.h>
+#include <linux/types.h>
+
+struct branch_flags {
+	u64 mispred:1;
+	u64 predicted:1;
+	u64 in_tx:1;
+	u64 abort:1;
+	u64 cycles:16;
+	u64 type:4;
+	u64 reserved:40;
+};
+
+struct branch_entry {
+	u64			from;
+	u64			to;
+	struct branch_flags	flags;
+};
+
+struct branch_stack {
+	u64			nr;
+	struct branch_entry	entries[0];
+};
 
 struct branch_type_stat {
 	bool	branch_to;
@@ -13,8 +36,6 @@ struct branch_type_stat {
 	u64	cross_2m;
 };
 
-struct branch_flags;
-
 void branch_type_count(struct branch_type_stat *st, struct branch_flags *flags,
 		       u64 from, u64 to);
 
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index dad32b81fe71..feba1aa819b4 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -161,26 +161,7 @@ struct ip_callchain {
 	u64 ips[0];
 };
 
-struct branch_flags {
-	u64 mispred:1;
-	u64 predicted:1;
-	u64 in_tx:1;
-	u64 abort:1;
-	u64 cycles:16;
-	u64 type:4;
-	u64 reserved:40;
-};
-
-struct branch_entry {
-	u64			from;
-	u64			to;
-	struct branch_flags	flags;
-};
-
-struct branch_stack {
-	u64			nr;
-	struct branch_entry	entries[0];
-};
+struct branch_stack;
 
 enum {
 	PERF_IP_FLAG_BRANCH		= 1ULL << 0,
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 277841e3dbd1..956d9916e364 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -12,6 +12,7 @@
 #include <stdio.h>
 #include <byteswap.h>
 #include <libgen.h>
+#include "branch.h"
 #include "build-id.h"
 #include "event.h"
 #include "path.h"
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 04/29] perf block-range: Add missing headers
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 03/29] perf tools: Move branch structs to branch.h Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 05/29] perf symbols: Remove include map.h from dso.h Arnaldo Carvalho de Melo
                   ` (25 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Namhyung Kim

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Some are being obtained indirectly and as we prune unnecessary includes,
this stops working, fix it by adding the headers for things used in
these file.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1p65lyeebc2ose0lbozvemda@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/block-range.c | 2 ++
 tools/perf/util/block-range.h | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/block-range.c b/tools/perf/util/block-range.c
index f1451c987eec..1be432657501 100644
--- a/tools/perf/util/block-range.c
+++ b/tools/perf/util/block-range.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "block-range.h"
 #include "annotate.h"
+#include <assert.h>
+#include <stdlib.h>
 
 struct {
 	struct rb_root root;
diff --git a/tools/perf/util/block-range.h b/tools/perf/util/block-range.h
index a5ba719d69fb..ec0fb534bf56 100644
--- a/tools/perf/util/block-range.h
+++ b/tools/perf/util/block-range.h
@@ -2,7 +2,11 @@
 #ifndef __PERF_BLOCK_RANGE_H
 #define __PERF_BLOCK_RANGE_H
 
-#include "symbol.h"
+#include <stdbool.h>
+#include <linux/rbtree.h>
+#include <linux/types.h>
+
+struct symbol;
 
 /*
  * struct block_range - non-overlapping parts of basic blocks
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 05/29] perf symbols: Remove include map.h from dso.h
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 04/29] perf block-range: Add missing headers Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 06/29] perf symbols: Remove some unnecessary includes from symbol.h Arnaldo Carvalho de Melo
                   ` (24 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Namhyung Kim

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Disentangling the dependency tree, to reduce build time.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-n2gcrfmh480rm44p7fra13vv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/browsers/map.c     | 1 +
 tools/perf/util/dso.h            | 4 +++-
 tools/perf/util/symbol_fprintf.c | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/map.c b/tools/perf/ui/browsers/map.c
index 5b8b8c637686..c70d9337405b 100644
--- a/tools/perf/ui/browsers/map.c
+++ b/tools/perf/ui/browsers/map.c
@@ -6,6 +6,7 @@
 #include <linux/bitops.h>
 #include "../../util/util.h"
 #include "../../util/debug.h"
+#include "../../util/map.h"
 #include "../../util/symbol.h"
 #include "../browser.h"
 #include "../helpline.h"
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index af2eda29660f..56001b82d448 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -7,12 +7,14 @@
 #include <linux/rbtree.h>
 #include <sys/types.h>
 #include <stdbool.h>
+#include <stdio.h>
 #include "rwsem.h"
 #include <linux/bitops.h>
-#include "map.h"
 #include "namespaces.h"
 #include "build-id.h"
 
+struct map;
+
 enum dso_binary_type {
 	DSO_BINARY_TYPE__KALLSYMS = 0,
 	DSO_BINARY_TYPE__GUEST_KALLSYMS,
diff --git a/tools/perf/util/symbol_fprintf.c b/tools/perf/util/symbol_fprintf.c
index ed0205cc7942..50472c75da59 100644
--- a/tools/perf/util/symbol_fprintf.c
+++ b/tools/perf/util/symbol_fprintf.c
@@ -3,6 +3,7 @@
 #include <inttypes.h>
 #include <stdio.h>
 
+#include "map.h"
 #include "symbol.h"
 
 size_t symbol__fprintf(struct symbol *sym, FILE *fp)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 06/29] perf symbols: Remove some unnecessary includes from symbol.h
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 05/29] perf symbols: Remove include map.h from dso.h Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 07/29] perf namespaces: Remove namespaces.h from .h headers Arnaldo Carvalho de Melo
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Namhyung Kim

From: Arnaldo Carvalho de Melo <acme@redhat.com>

And fixup the fallout in places like annotation and jitdump that were
using things like dirname() but weren't including libgen.h, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7ji9igrauh6ylt3p6i03bjlf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/annotate.c |  1 +
 tools/perf/util/jitdump.c  |  1 +
 tools/perf/util/symbol.h   | 10 ++++------
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 70de8f6b3aee..3d79add5f7ae 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -9,6 +9,7 @@
 
 #include <errno.h>
 #include <inttypes.h>
+#include <libgen.h>
 #include "util.h"
 #include "ui/ui.h"
 #include "sort.h"
diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index bf249552a9b0..eda28d3570bc 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -2,6 +2,7 @@
 #include <sys/sysmacros.h>
 #include <sys/types.h>
 #include <errno.h>
+#include <libgen.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 956d9916e364..2162a5d9b3ac 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -5,16 +5,10 @@
 #include <linux/types.h>
 #include <stdbool.h>
 #include <stdint.h>
-#include "map.h"
-#include "../perf.h"
 #include <linux/list.h>
 #include <linux/rbtree.h>
 #include <stdio.h>
-#include <byteswap.h>
-#include <libgen.h>
 #include "branch.h"
-#include "build-id.h"
-#include "event.h"
 #include "path.h"
 #include "symbol_conf.h"
 
@@ -26,6 +20,10 @@
 
 #include "dso.h"
 
+struct map;
+struct map_groups;
+struct option;
+
 /*
  * libelf 0.8.x and earlier do not support ELF_C_READ_MMAP;
  * for newer versions we can use mmap to reduce memory usage:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 07/29] perf namespaces: Remove namespaces.h from .h headers
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 06/29] perf symbols: Remove some unnecessary includes from symbol.h Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 08/29] perf comm: Remove needless headers from comm.h Arnaldo Carvalho de Melo
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Namhyung Kim

From: Arnaldo Carvalho de Melo <acme@redhat.com>

There we need just forward declarations, so remove it and add it just on
the .c files that actually touch the struct definitions.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wsjxzt99p83jubt6hu0med0f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-probe.c    | 1 +
 tools/perf/util/build-id.c    | 1 +
 tools/perf/util/build-id.h    | 3 ++-
 tools/perf/util/dso.c         | 1 +
 tools/perf/util/dso.h         | 2 +-
 tools/perf/util/probe-event.c | 1 +
 tools/perf/util/probe-event.h | 5 +++--
 tools/perf/util/probe-file.c  | 1 +
 tools/perf/util/thread.h      | 1 +
 tools/perf/util/util.c        | 1 +
 10 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 99de91698de1..46d3c2deeb40 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -32,6 +32,7 @@
 
 #include "perf.h"
 #include "builtin.h"
+#include "namespaces.h"
 #include "util/util.h"
 #include "util/strlist.h"
 #include "util/strfilter.h"
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 04b1d53e4bf9..07ef7fad689c 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -15,6 +15,7 @@
 #include <sys/types.h>
 #include "build-id.h"
 #include "event.h"
+#include "namespaces.h"
 #include "symbol.h"
 #include "thread.h"
 #include <linux/kernel.h>
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index f0c565164a97..93668f38f1ed 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -6,9 +6,10 @@
 #define SBUILD_ID_SIZE	(BUILD_ID_SIZE * 2 + 1)
 
 #include "tool.h"
-#include "namespaces.h"
 #include <linux/types.h>
 
+struct nsinfo;
+
 extern struct perf_tool build_id__mark_dso_hit_ops;
 struct dso;
 struct feat_fd;
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 62c8cf622607..8cbbb5be4a0b 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -9,6 +9,7 @@
 #include <errno.h>
 #include <fcntl.h>
 #include "compress.h"
+#include "namespaces.h"
 #include "path.h"
 #include "symbol.h"
 #include "srcline.h"
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 56001b82d448..2de54c0b26b2 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -10,9 +10,9 @@
 #include <stdio.h>
 #include "rwsem.h"
 #include <linux/bitops.h>
-#include "namespaces.h"
 #include "build-id.h"
 
+struct machine;
 struct map;
 
 enum dso_binary_type {
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 18a59fba97ff..736787a18b9e 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -35,6 +35,7 @@
 
 #include "util.h"
 #include "event.h"
+#include "namespaces.h"
 #include "strlist.h"
 #include "strfilter.h"
 #include "debug.h"
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 15a98c3a2a2f..05c8d571a901 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -4,8 +4,9 @@
 
 #include <linux/compiler.h>
 #include <stdbool.h>
-#include "intlist.h"
-#include "namespaces.h"
+
+struct intlist;
+struct nsinfo;
 
 /* Probe related configurations */
 struct probe_conf {
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 0b1195cad0e5..4062bc4412a9 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -20,6 +20,7 @@
 #include <sys/types.h>
 #include <sys/uio.h>
 #include <unistd.h>
+#include "namespaces.h"
 #include "util.h"
 #include "event.h"
 #include "strlist.h"
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 712dd48cc0ca..f3c939150f90 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -13,6 +13,7 @@
 #include <intlist.h>
 #include "rwsem.h"
 
+struct namespaces_event;
 struct thread_stack;
 struct unwind_libunwind_ops;
 
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 093352e93d50..320b0fef249a 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -2,6 +2,7 @@
 #include "../perf.h"
 #include "util.h"
 #include "debug.h"
+#include "namespaces.h"
 #include <api/fs/fs.h>
 #include <sys/mman.h>
 #include <sys/stat.h>
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 08/29] perf comm: Remove needless headers from comm.h
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 07/29] perf namespaces: Remove namespaces.h from .h headers Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 09/29] perf callchain: No need to include perf.h Arnaldo Carvalho de Melo
                   ` (21 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Namhyung Kim

From: Arnaldo Carvalho de Melo <acme@redhat.com>

There we don't need rbtree, only in comm.c, also ditch perf.h, not
needed at all.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-vr1jnwwujh99skrgldtimpmu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/comm.c | 1 +
 tools/perf/util/comm.h | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/comm.c b/tools/perf/util/comm.c
index 31279a7bd919..1066de92af12 100644
--- a/tools/perf/util/comm.c
+++ b/tools/perf/util/comm.c
@@ -6,6 +6,7 @@
 #include <stdio.h>
 #include <string.h>
 #include <linux/refcount.h>
+#include <linux/rbtree.h>
 #include "rwsem.h"
 
 struct comm_str {
diff --git a/tools/perf/util/comm.h b/tools/perf/util/comm.h
index 3e5c438fe85e..f35d8fbfa2dd 100644
--- a/tools/perf/util/comm.h
+++ b/tools/perf/util/comm.h
@@ -2,9 +2,9 @@
 #ifndef __PERF_COMM_H
 #define __PERF_COMM_H
 
-#include "../perf.h"
-#include <linux/rbtree.h>
 #include <linux/list.h>
+#include <linux/types.h>
+#include <stdbool.h>
 
 struct comm_str;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 09/29] perf callchain: No need to include perf.h
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 08/29] perf comm: Remove needless headers from comm.h Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 10/29] tools: Update rbtree implementation Arnaldo Carvalho de Melo
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Namhyung Kim

From: Arnaldo Carvalho de Melo <acme@redhat.com>

So ditch it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-bodhwdvcds9ahk26dy4w8m71@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/callchain.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 99d38ac019b8..8afe1622edf3 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -2,7 +2,6 @@
 #ifndef __PERF_CALLCHAIN_H
 #define __PERF_CALLCHAIN_H
 
-#include "../perf.h"
 #include <linux/list.h>
 #include <linux/rbtree.h>
 #include "event.h"
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 10/29] tools: Update rbtree implementation
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 09/29] perf callchain: No need to include perf.h Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 11/29] perf machine: Use cached rbtrees Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Arnaldo Carvalho de Melo, Jiri Olsa,
	Namhyung Kim

From: Davidlohr Bueso <dave@stgolabs.net>

There have been a number of changes in the kernel's rbrtee
implementation, including loose lockless searching guarantees and
rb_root_cached, which later patches will use as an optimization.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-2-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/include/linux/rbtree.h           |  52 +++++++-
 tools/include/linux/rbtree_augmented.h |  60 +++++++--
 tools/lib/rbtree.c                     | 178 +++++++++++++++++++------
 3 files changed, 229 insertions(+), 61 deletions(-)

diff --git a/tools/include/linux/rbtree.h b/tools/include/linux/rbtree.h
index 112582253dd0..8e9ed4786269 100644
--- a/tools/include/linux/rbtree.h
+++ b/tools/include/linux/rbtree.h
@@ -43,13 +43,28 @@ struct rb_root {
 	struct rb_node *rb_node;
 };
 
+/*
+ * Leftmost-cached rbtrees.
+ *
+ * We do not cache the rightmost node based on footprint
+ * size vs number of potential users that could benefit
+ * from O(1) rb_last(). Just not worth it, users that want
+ * this feature can always implement the logic explicitly.
+ * Furthermore, users that want to cache both pointers may
+ * find it a bit asymmetric, but that's ok.
+ */
+struct rb_root_cached {
+	struct rb_root rb_root;
+	struct rb_node *rb_leftmost;
+};
 
 #define rb_parent(r)   ((struct rb_node *)((r)->__rb_parent_color & ~3))
 
 #define RB_ROOT	(struct rb_root) { NULL, }
+#define RB_ROOT_CACHED (struct rb_root_cached) { {NULL, }, NULL }
 #define	rb_entry(ptr, type, member) container_of(ptr, type, member)
 
-#define RB_EMPTY_ROOT(root)  ((root)->rb_node == NULL)
+#define RB_EMPTY_ROOT(root)  (READ_ONCE((root)->rb_node) == NULL)
 
 /* 'empty' nodes are nodes that are known not to be inserted in an rbtree */
 #define RB_EMPTY_NODE(node)  \
@@ -68,6 +83,12 @@ extern struct rb_node *rb_prev(const struct rb_node *);
 extern struct rb_node *rb_first(const struct rb_root *);
 extern struct rb_node *rb_last(const struct rb_root *);
 
+extern void rb_insert_color_cached(struct rb_node *,
+				   struct rb_root_cached *, bool);
+extern void rb_erase_cached(struct rb_node *node, struct rb_root_cached *);
+/* Same as rb_first(), but O(1) */
+#define rb_first_cached(root) (root)->rb_leftmost
+
 /* Postorder iteration - always visit the parent after its children */
 extern struct rb_node *rb_first_postorder(const struct rb_root *);
 extern struct rb_node *rb_next_postorder(const struct rb_node *);
@@ -75,6 +96,8 @@ extern struct rb_node *rb_next_postorder(const struct rb_node *);
 /* Fast replacement of a single node without remove/rebalance/add/rebalance */
 extern void rb_replace_node(struct rb_node *victim, struct rb_node *new,
 			    struct rb_root *root);
+extern void rb_replace_node_cached(struct rb_node *victim, struct rb_node *new,
+				   struct rb_root_cached *root);
 
 static inline void rb_link_node(struct rb_node *node, struct rb_node *parent,
 				struct rb_node **rb_link)
@@ -90,12 +113,29 @@ static inline void rb_link_node(struct rb_node *node, struct rb_node *parent,
 	   ____ptr ? rb_entry(____ptr, type, member) : NULL; \
 	})
 
-
-/*
- * Handy for checking that we are not deleting an entry that is
- * already in a list, found in block/{blk-throttle,cfq-iosched}.c,
- * probably should be moved to lib/rbtree.c...
+/**
+ * rbtree_postorder_for_each_entry_safe - iterate in post-order over rb_root of
+ * given type allowing the backing memory of @pos to be invalidated
+ *
+ * @pos:	the 'type *' to use as a loop cursor.
+ * @n:		another 'type *' to use as temporary storage
+ * @root:	'rb_root *' of the rbtree.
+ * @field:	the name of the rb_node field within 'type'.
+ *
+ * rbtree_postorder_for_each_entry_safe() provides a similar guarantee as
+ * list_for_each_entry_safe() and allows the iteration to continue independent
+ * of changes to @pos by the body of the loop.
+ *
+ * Note, however, that it cannot handle other modifications that re-order the
+ * rbtree it is iterating over. This includes calling rb_erase() on @pos, as
+ * rb_erase() may rebalance the tree, causing us to miss some nodes.
  */
+#define rbtree_postorder_for_each_entry_safe(pos, n, root, field) \
+	for (pos = rb_entry_safe(rb_first_postorder(root), typeof(*pos), field); \
+	     pos && ({ n = rb_entry_safe(rb_next_postorder(&pos->field), \
+			typeof(*pos), field); 1; }); \
+	     pos = n)
+
 static inline void rb_erase_init(struct rb_node *n, struct rb_root *root)
 {
 	rb_erase(n, root);
diff --git a/tools/include/linux/rbtree_augmented.h b/tools/include/linux/rbtree_augmented.h
index 43be941db695..d008e1404580 100644
--- a/tools/include/linux/rbtree_augmented.h
+++ b/tools/include/linux/rbtree_augmented.h
@@ -44,7 +44,9 @@ struct rb_augment_callbacks {
 	void (*rotate)(struct rb_node *old, struct rb_node *new);
 };
 
-extern void __rb_insert_augmented(struct rb_node *node, struct rb_root *root,
+extern void __rb_insert_augmented(struct rb_node *node,
+				  struct rb_root *root,
+				  bool newleft, struct rb_node **leftmost,
 	void (*augment_rotate)(struct rb_node *old, struct rb_node *new));
 /*
  * Fixup the rbtree and update the augmented information when rebalancing.
@@ -60,7 +62,16 @@ static inline void
 rb_insert_augmented(struct rb_node *node, struct rb_root *root,
 		    const struct rb_augment_callbacks *augment)
 {
-	__rb_insert_augmented(node, root, augment->rotate);
+	__rb_insert_augmented(node, root, false, NULL, augment->rotate);
+}
+
+static inline void
+rb_insert_augmented_cached(struct rb_node *node,
+			   struct rb_root_cached *root, bool newleft,
+			   const struct rb_augment_callbacks *augment)
+{
+	__rb_insert_augmented(node, &root->rb_root,
+			      newleft, &root->rb_leftmost, augment->rotate);
 }
 
 #define RB_DECLARE_CALLBACKS(rbstatic, rbname, rbstruct, rbfield,	\
@@ -93,7 +104,9 @@ rbname ## _rotate(struct rb_node *rb_old, struct rb_node *rb_new)	\
 	old->rbaugmented = rbcompute(old);				\
 }									\
 rbstatic const struct rb_augment_callbacks rbname = {			\
-	rbname ## _propagate, rbname ## _copy, rbname ## _rotate	\
+	.propagate = rbname ## _propagate,				\
+	.copy = rbname ## _copy,					\
+	.rotate = rbname ## _rotate					\
 };
 
 
@@ -126,11 +139,11 @@ __rb_change_child(struct rb_node *old, struct rb_node *new,
 {
 	if (parent) {
 		if (parent->rb_left == old)
-			parent->rb_left = new;
+			WRITE_ONCE(parent->rb_left, new);
 		else
-			parent->rb_right = new;
+			WRITE_ONCE(parent->rb_right, new);
 	} else
-		root->rb_node = new;
+		WRITE_ONCE(root->rb_node, new);
 }
 
 extern void __rb_erase_color(struct rb_node *parent, struct rb_root *root,
@@ -138,12 +151,17 @@ extern void __rb_erase_color(struct rb_node *parent, struct rb_root *root,
 
 static __always_inline struct rb_node *
 __rb_erase_augmented(struct rb_node *node, struct rb_root *root,
+		     struct rb_node **leftmost,
 		     const struct rb_augment_callbacks *augment)
 {
-	struct rb_node *child = node->rb_right, *tmp = node->rb_left;
+	struct rb_node *child = node->rb_right;
+	struct rb_node *tmp = node->rb_left;
 	struct rb_node *parent, *rebalance;
 	unsigned long pc;
 
+	if (leftmost && node == *leftmost)
+		*leftmost = rb_next(node);
+
 	if (!tmp) {
 		/*
 		 * Case 1: node to erase has no more than 1 child (easy!)
@@ -170,6 +188,7 @@ __rb_erase_augmented(struct rb_node *node, struct rb_root *root,
 		tmp = parent;
 	} else {
 		struct rb_node *successor = child, *child2;
+
 		tmp = child->rb_left;
 		if (!tmp) {
 			/*
@@ -183,6 +202,7 @@ __rb_erase_augmented(struct rb_node *node, struct rb_root *root,
 			 */
 			parent = successor;
 			child2 = successor->rb_right;
+
 			augment->copy(node, successor);
 		} else {
 			/*
@@ -204,19 +224,23 @@ __rb_erase_augmented(struct rb_node *node, struct rb_root *root,
 				successor = tmp;
 				tmp = tmp->rb_left;
 			} while (tmp);
-			parent->rb_left = child2 = successor->rb_right;
-			successor->rb_right = child;
+			child2 = successor->rb_right;
+			WRITE_ONCE(parent->rb_left, child2);
+			WRITE_ONCE(successor->rb_right, child);
 			rb_set_parent(child, successor);
+
 			augment->copy(node, successor);
 			augment->propagate(parent, successor);
 		}
 
-		successor->rb_left = tmp = node->rb_left;
+		tmp = node->rb_left;
+		WRITE_ONCE(successor->rb_left, tmp);
 		rb_set_parent(tmp, successor);
 
 		pc = node->__rb_parent_color;
 		tmp = __rb_parent(pc);
 		__rb_change_child(node, successor, tmp, root);
+
 		if (child2) {
 			successor->__rb_parent_color = pc;
 			rb_set_parent_color(child2, parent, RB_BLACK);
@@ -237,9 +261,21 @@ static __always_inline void
 rb_erase_augmented(struct rb_node *node, struct rb_root *root,
 		   const struct rb_augment_callbacks *augment)
 {
-	struct rb_node *rebalance = __rb_erase_augmented(node, root, augment);
+	struct rb_node *rebalance = __rb_erase_augmented(node, root,
+							 NULL, augment);
 	if (rebalance)
 		__rb_erase_color(rebalance, root, augment->rotate);
 }
 
-#endif	/* _TOOLS_LINUX_RBTREE_AUGMENTED_H */
+static __always_inline void
+rb_erase_augmented_cached(struct rb_node *node, struct rb_root_cached *root,
+			  const struct rb_augment_callbacks *augment)
+{
+	struct rb_node *rebalance = __rb_erase_augmented(node, &root->rb_root,
+							 &root->rb_leftmost,
+							 augment);
+	if (rebalance)
+		__rb_erase_color(rebalance, &root->rb_root, augment->rotate);
+}
+
+#endif /* _TOOLS_LINUX_RBTREE_AUGMENTED_H */
diff --git a/tools/lib/rbtree.c b/tools/lib/rbtree.c
index 17c2b596f043..904adb70a4f0 100644
--- a/tools/lib/rbtree.c
+++ b/tools/lib/rbtree.c
@@ -22,6 +22,7 @@
 */
 
 #include <linux/rbtree_augmented.h>
+#include <linux/export.h>
 
 /*
  * red-black trees properties:  http://en.wikipedia.org/wiki/Rbtree
@@ -43,6 +44,30 @@
  *  parentheses and have some accompanying text comment.
  */
 
+/*
+ * Notes on lockless lookups:
+ *
+ * All stores to the tree structure (rb_left and rb_right) must be done using
+ * WRITE_ONCE(). And we must not inadvertently cause (temporary) loops in the
+ * tree structure as seen in program order.
+ *
+ * These two requirements will allow lockless iteration of the tree -- not
+ * correct iteration mind you, tree rotations are not atomic so a lookup might
+ * miss entire subtrees.
+ *
+ * But they do guarantee that any such traversal will only see valid elements
+ * and that it will indeed complete -- does not get stuck in a loop.
+ *
+ * It also guarantees that if the lookup returns an element it is the 'correct'
+ * one. But not returning an element does _NOT_ mean it's not present.
+ *
+ * NOTE:
+ *
+ * Stores to __rb_parent_color are not important for simple lookups so those
+ * are left undone as of now. Nor did I check for loops involving parent
+ * pointers.
+ */
+
 static inline void rb_set_black(struct rb_node *rb)
 {
 	rb->__rb_parent_color |= RB_BLACK;
@@ -70,22 +95,35 @@ __rb_rotate_set_parents(struct rb_node *old, struct rb_node *new,
 
 static __always_inline void
 __rb_insert(struct rb_node *node, struct rb_root *root,
+	    bool newleft, struct rb_node **leftmost,
 	    void (*augment_rotate)(struct rb_node *old, struct rb_node *new))
 {
 	struct rb_node *parent = rb_red_parent(node), *gparent, *tmp;
 
+	if (newleft)
+		*leftmost = node;
+
 	while (true) {
 		/*
-		 * Loop invariant: node is red
-		 *
-		 * If there is a black parent, we are done.
-		 * Otherwise, take some corrective action as we don't
-		 * want a red root or two consecutive red nodes.
+		 * Loop invariant: node is red.
 		 */
-		if (!parent) {
+		if (unlikely(!parent)) {
+			/*
+			 * The inserted node is root. Either this is the
+			 * first node, or we recursed at Case 1 below and
+			 * are no longer violating 4).
+			 */
 			rb_set_parent_color(node, NULL, RB_BLACK);
 			break;
-		} else if (rb_is_black(parent))
+		}
+
+		/*
+		 * If there is a black parent, we are done.
+		 * Otherwise, take some corrective action as,
+		 * per 4), we don't want a red root or two
+		 * consecutive red nodes.
+		 */
+		if(rb_is_black(parent))
 			break;
 
 		gparent = rb_red_parent(parent);
@@ -94,7 +132,7 @@ __rb_insert(struct rb_node *node, struct rb_root *root,
 		if (parent != tmp) {	/* parent == gparent->rb_left */
 			if (tmp && rb_is_red(tmp)) {
 				/*
-				 * Case 1 - color flips
+				 * Case 1 - node's uncle is red (color flips).
 				 *
 				 *       G            g
 				 *      / \          / \
@@ -117,7 +155,8 @@ __rb_insert(struct rb_node *node, struct rb_root *root,
 			tmp = parent->rb_right;
 			if (node == tmp) {
 				/*
-				 * Case 2 - left rotate at parent
+				 * Case 2 - node's uncle is black and node is
+				 * the parent's right child (left rotate at parent).
 				 *
 				 *      G             G
 				 *     / \           / \
@@ -128,8 +167,9 @@ __rb_insert(struct rb_node *node, struct rb_root *root,
 				 * This still leaves us in violation of 4), the
 				 * continuation into Case 3 will fix that.
 				 */
-				parent->rb_right = tmp = node->rb_left;
-				node->rb_left = parent;
+				tmp = node->rb_left;
+				WRITE_ONCE(parent->rb_right, tmp);
+				WRITE_ONCE(node->rb_left, parent);
 				if (tmp)
 					rb_set_parent_color(tmp, parent,
 							    RB_BLACK);
@@ -140,7 +180,8 @@ __rb_insert(struct rb_node *node, struct rb_root *root,
 			}
 
 			/*
-			 * Case 3 - right rotate at gparent
+			 * Case 3 - node's uncle is black and node is
+			 * the parent's left child (right rotate at gparent).
 			 *
 			 *        G           P
 			 *       / \         / \
@@ -148,8 +189,8 @@ __rb_insert(struct rb_node *node, struct rb_root *root,
 			 *     /                 \
 			 *    n                   U
 			 */
-			gparent->rb_left = tmp;  /* == parent->rb_right */
-			parent->rb_right = gparent;
+			WRITE_ONCE(gparent->rb_left, tmp); /* == parent->rb_right */
+			WRITE_ONCE(parent->rb_right, gparent);
 			if (tmp)
 				rb_set_parent_color(tmp, gparent, RB_BLACK);
 			__rb_rotate_set_parents(gparent, parent, root, RB_RED);
@@ -170,8 +211,9 @@ __rb_insert(struct rb_node *node, struct rb_root *root,
 			tmp = parent->rb_left;
 			if (node == tmp) {
 				/* Case 2 - right rotate at parent */
-				parent->rb_left = tmp = node->rb_right;
-				node->rb_right = parent;
+				tmp = node->rb_right;
+				WRITE_ONCE(parent->rb_left, tmp);
+				WRITE_ONCE(node->rb_right, parent);
 				if (tmp)
 					rb_set_parent_color(tmp, parent,
 							    RB_BLACK);
@@ -182,8 +224,8 @@ __rb_insert(struct rb_node *node, struct rb_root *root,
 			}
 
 			/* Case 3 - left rotate at gparent */
-			gparent->rb_right = tmp;  /* == parent->rb_left */
-			parent->rb_left = gparent;
+			WRITE_ONCE(gparent->rb_right, tmp); /* == parent->rb_left */
+			WRITE_ONCE(parent->rb_left, gparent);
 			if (tmp)
 				rb_set_parent_color(tmp, gparent, RB_BLACK);
 			__rb_rotate_set_parents(gparent, parent, root, RB_RED);
@@ -223,8 +265,9 @@ ____rb_erase_color(struct rb_node *parent, struct rb_root *root,
 				 *      / \         / \
 				 *     Sl  Sr      N   Sl
 				 */
-				parent->rb_right = tmp1 = sibling->rb_left;
-				sibling->rb_left = parent;
+				tmp1 = sibling->rb_left;
+				WRITE_ONCE(parent->rb_right, tmp1);
+				WRITE_ONCE(sibling->rb_left, parent);
 				rb_set_parent_color(tmp1, parent, RB_BLACK);
 				__rb_rotate_set_parents(parent, sibling, root,
 							RB_RED);
@@ -268,15 +311,31 @@ ____rb_erase_color(struct rb_node *parent, struct rb_root *root,
 				 *
 				 *   (p)           (p)
 				 *   / \           / \
-				 *  N   S    -->  N   Sl
+				 *  N   S    -->  N   sl
 				 *     / \             \
-				 *    sl  Sr            s
+				 *    sl  Sr            S
 				 *                       \
 				 *                        Sr
+				 *
+				 * Note: p might be red, and then both
+				 * p and sl are red after rotation(which
+				 * breaks property 4). This is fixed in
+				 * Case 4 (in __rb_rotate_set_parents()
+				 *         which set sl the color of p
+				 *         and set p RB_BLACK)
+				 *
+				 *   (p)            (sl)
+				 *   / \            /  \
+				 *  N   sl   -->   P    S
+				 *       \        /      \
+				 *        S      N        Sr
+				 *         \
+				 *          Sr
 				 */
-				sibling->rb_left = tmp1 = tmp2->rb_right;
-				tmp2->rb_right = sibling;
-				parent->rb_right = tmp2;
+				tmp1 = tmp2->rb_right;
+				WRITE_ONCE(sibling->rb_left, tmp1);
+				WRITE_ONCE(tmp2->rb_right, sibling);
+				WRITE_ONCE(parent->rb_right, tmp2);
 				if (tmp1)
 					rb_set_parent_color(tmp1, sibling,
 							    RB_BLACK);
@@ -296,8 +355,9 @@ ____rb_erase_color(struct rb_node *parent, struct rb_root *root,
 			 *        / \         / \
 			 *      (sl) sr      N  (sl)
 			 */
-			parent->rb_right = tmp2 = sibling->rb_left;
-			sibling->rb_left = parent;
+			tmp2 = sibling->rb_left;
+			WRITE_ONCE(parent->rb_right, tmp2);
+			WRITE_ONCE(sibling->rb_left, parent);
 			rb_set_parent_color(tmp1, sibling, RB_BLACK);
 			if (tmp2)
 				rb_set_parent(tmp2, parent);
@@ -309,8 +369,9 @@ ____rb_erase_color(struct rb_node *parent, struct rb_root *root,
 			sibling = parent->rb_left;
 			if (rb_is_red(sibling)) {
 				/* Case 1 - right rotate at parent */
-				parent->rb_left = tmp1 = sibling->rb_right;
-				sibling->rb_right = parent;
+				tmp1 = sibling->rb_right;
+				WRITE_ONCE(parent->rb_left, tmp1);
+				WRITE_ONCE(sibling->rb_right, parent);
 				rb_set_parent_color(tmp1, parent, RB_BLACK);
 				__rb_rotate_set_parents(parent, sibling, root,
 							RB_RED);
@@ -334,10 +395,11 @@ ____rb_erase_color(struct rb_node *parent, struct rb_root *root,
 					}
 					break;
 				}
-				/* Case 3 - right rotate at sibling */
-				sibling->rb_right = tmp1 = tmp2->rb_left;
-				tmp2->rb_left = sibling;
-				parent->rb_left = tmp2;
+				/* Case 3 - left rotate at sibling */
+				tmp1 = tmp2->rb_left;
+				WRITE_ONCE(sibling->rb_right, tmp1);
+				WRITE_ONCE(tmp2->rb_left, sibling);
+				WRITE_ONCE(parent->rb_left, tmp2);
 				if (tmp1)
 					rb_set_parent_color(tmp1, sibling,
 							    RB_BLACK);
@@ -345,9 +407,10 @@ ____rb_erase_color(struct rb_node *parent, struct rb_root *root,
 				tmp1 = sibling;
 				sibling = tmp2;
 			}
-			/* Case 4 - left rotate at parent + color flips */
-			parent->rb_left = tmp2 = sibling->rb_right;
-			sibling->rb_right = parent;
+			/* Case 4 - right rotate at parent + color flips */
+			tmp2 = sibling->rb_right;
+			WRITE_ONCE(parent->rb_left, tmp2);
+			WRITE_ONCE(sibling->rb_right, parent);
 			rb_set_parent_color(tmp1, sibling, RB_BLACK);
 			if (tmp2)
 				rb_set_parent(tmp2, parent);
@@ -378,22 +441,41 @@ static inline void dummy_copy(struct rb_node *old, struct rb_node *new) {}
 static inline void dummy_rotate(struct rb_node *old, struct rb_node *new) {}
 
 static const struct rb_augment_callbacks dummy_callbacks = {
-	dummy_propagate, dummy_copy, dummy_rotate
+	.propagate = dummy_propagate,
+	.copy = dummy_copy,
+	.rotate = dummy_rotate
 };
 
 void rb_insert_color(struct rb_node *node, struct rb_root *root)
 {
-	__rb_insert(node, root, dummy_rotate);
+	__rb_insert(node, root, false, NULL, dummy_rotate);
 }
 
 void rb_erase(struct rb_node *node, struct rb_root *root)
 {
 	struct rb_node *rebalance;
-	rebalance = __rb_erase_augmented(node, root, &dummy_callbacks);
+	rebalance = __rb_erase_augmented(node, root,
+					 NULL, &dummy_callbacks);
 	if (rebalance)
 		____rb_erase_color(rebalance, root, dummy_rotate);
 }
 
+void rb_insert_color_cached(struct rb_node *node,
+			    struct rb_root_cached *root, bool leftmost)
+{
+	__rb_insert(node, &root->rb_root, leftmost,
+		    &root->rb_leftmost, dummy_rotate);
+}
+
+void rb_erase_cached(struct rb_node *node, struct rb_root_cached *root)
+{
+	struct rb_node *rebalance;
+	rebalance = __rb_erase_augmented(node, &root->rb_root,
+					 &root->rb_leftmost, &dummy_callbacks);
+	if (rebalance)
+		____rb_erase_color(rebalance, &root->rb_root, dummy_rotate);
+}
+
 /*
  * Augmented rbtree manipulation functions.
  *
@@ -402,9 +484,10 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
  */
 
 void __rb_insert_augmented(struct rb_node *node, struct rb_root *root,
+			   bool newleft, struct rb_node **leftmost,
 	void (*augment_rotate)(struct rb_node *old, struct rb_node *new))
 {
-	__rb_insert(node, root, augment_rotate);
+	__rb_insert(node, root, newleft, leftmost, augment_rotate);
 }
 
 /*
@@ -498,15 +581,24 @@ void rb_replace_node(struct rb_node *victim, struct rb_node *new,
 {
 	struct rb_node *parent = rb_parent(victim);
 
+	/* Copy the pointers/colour from the victim to the replacement */
+	*new = *victim;
+
 	/* Set the surrounding nodes to point to the replacement */
-	__rb_change_child(victim, new, parent, root);
 	if (victim->rb_left)
 		rb_set_parent(victim->rb_left, new);
 	if (victim->rb_right)
 		rb_set_parent(victim->rb_right, new);
+	__rb_change_child(victim, new, parent, root);
+}
 
-	/* Copy the pointers/colour from the victim to the replacement */
-	*new = *victim;
+void rb_replace_node_cached(struct rb_node *victim, struct rb_node *new,
+			    struct rb_root_cached *root)
+{
+	rb_replace_node(victim, new, &root->rb_root);
+
+	if (root->rb_leftmost == victim)
+		root->rb_leftmost = new;
 }
 
 static struct rb_node *rb_left_deepest_node(const struct rb_node *node)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 11/29] perf machine: Use cached rbtrees
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 10/29] tools: Update rbtree implementation Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 12/29] perf callchain: " Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Arnaldo Carvalho de Melo, Jiri Olsa,
	Namhyung Kim

From: Davidlohr Bueso <dave@stgolabs.net>

At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something required for nearly every operation dealing with
machine->guests and threads->entries.

The conversion is straightforward, however, it's worth noticing that the
rb_erase_init() calls have been replaced by rb_erase_cached() which has
no _init() flavor, however, the node is explicitly cleared next anyway,
which was redundant until now.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-3-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c |  3 ++-
 tools/perf/util/build-id.c  | 12 ++++++---
 tools/perf/util/machine.c   | 53 ++++++++++++++++++++++---------------
 tools/perf/util/machine.h   | 12 ++++-----
 tools/perf/util/rb_resort.h |  6 ++---
 5 files changed, 50 insertions(+), 36 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 794a0de5ac1d..c9ceaf88759c 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -753,7 +753,8 @@ static int tasks_print(struct report *rep, FILE *fp)
 	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
 		struct threads *threads = &machine->threads[i];
 
-		for (nd = rb_first(&threads->entries); nd; nd = rb_next(nd)) {
+		for (nd = rb_first_cached(&threads->entries); nd;
+		     nd = rb_next(nd)) {
 			task = tasks + itask++;
 
 			task->thread = rb_entry(nd, struct thread, rb_node);
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 07ef7fad689c..2ff802068f06 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -364,7 +364,8 @@ int perf_session__write_buildid_table(struct perf_session *session,
 	if (err)
 		return err;
 
-	for (nd = rb_first(&session->machines.guests); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&session->machines.guests); nd;
+	     nd = rb_next(nd)) {
 		struct machine *pos = rb_entry(nd, struct machine, rb_node);
 		err = machine__write_buildid_table(pos, fd);
 		if (err)
@@ -397,7 +398,8 @@ int dsos__hit_all(struct perf_session *session)
 	if (err)
 		return err;
 
-	for (nd = rb_first(&session->machines.guests); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&session->machines.guests); nd;
+	     nd = rb_next(nd)) {
 		struct machine *pos = rb_entry(nd, struct machine, rb_node);
 
 		err = machine__hit_all_dsos(pos);
@@ -850,7 +852,8 @@ int perf_session__cache_build_ids(struct perf_session *session)
 
 	ret = machine__cache_build_ids(&session->machines.host);
 
-	for (nd = rb_first(&session->machines.guests); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&session->machines.guests); nd;
+	     nd = rb_next(nd)) {
 		struct machine *pos = rb_entry(nd, struct machine, rb_node);
 		ret |= machine__cache_build_ids(pos);
 	}
@@ -867,7 +870,8 @@ bool perf_session__read_build_ids(struct perf_session *session, bool with_hits)
 	struct rb_node *nd;
 	bool ret = machine__read_build_ids(&session->machines.host, with_hits);
 
-	for (nd = rb_first(&session->machines.guests); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&session->machines.guests); nd;
+	     nd = rb_next(nd)) {
 		struct machine *pos = rb_entry(nd, struct machine, rb_node);
 		ret |= machine__read_build_ids(pos, with_hits);
 	}
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index ae85106bb5bf..66f019fdc510 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -42,7 +42,7 @@ static void machine__threads_init(struct machine *machine)
 
 	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
 		struct threads *threads = &machine->threads[i];
-		threads->entries = RB_ROOT;
+		threads->entries = RB_ROOT_CACHED;
 		init_rwsem(&threads->lock);
 		threads->nr = 0;
 		INIT_LIST_HEAD(&threads->dead);
@@ -180,7 +180,7 @@ void machine__delete_threads(struct machine *machine)
 	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
 		struct threads *threads = &machine->threads[i];
 		down_write(&threads->lock);
-		nd = rb_first(&threads->entries);
+		nd = rb_first_cached(&threads->entries);
 		while (nd) {
 			struct thread *t = rb_entry(nd, struct thread, rb_node);
 
@@ -223,7 +223,7 @@ void machine__delete(struct machine *machine)
 void machines__init(struct machines *machines)
 {
 	machine__init(&machines->host, "", HOST_KERNEL_ID);
-	machines->guests = RB_ROOT;
+	machines->guests = RB_ROOT_CACHED;
 }
 
 void machines__exit(struct machines *machines)
@@ -235,9 +235,10 @@ void machines__exit(struct machines *machines)
 struct machine *machines__add(struct machines *machines, pid_t pid,
 			      const char *root_dir)
 {
-	struct rb_node **p = &machines->guests.rb_node;
+	struct rb_node **p = &machines->guests.rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct machine *pos, *machine = malloc(sizeof(*machine));
+	bool leftmost = true;
 
 	if (machine == NULL)
 		return NULL;
@@ -252,12 +253,14 @@ struct machine *machines__add(struct machines *machines, pid_t pid,
 		pos = rb_entry(parent, struct machine, rb_node);
 		if (pid < pos->pid)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 
 	rb_link_node(&machine->rb_node, parent, p);
-	rb_insert_color(&machine->rb_node, &machines->guests);
+	rb_insert_color_cached(&machine->rb_node, &machines->guests, leftmost);
 
 	return machine;
 }
@@ -268,7 +271,7 @@ void machines__set_comm_exec(struct machines *machines, bool comm_exec)
 
 	machines->host.comm_exec = comm_exec;
 
-	for (nd = rb_first(&machines->guests); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&machines->guests); nd; nd = rb_next(nd)) {
 		struct machine *machine = rb_entry(nd, struct machine, rb_node);
 
 		machine->comm_exec = comm_exec;
@@ -277,7 +280,7 @@ void machines__set_comm_exec(struct machines *machines, bool comm_exec)
 
 struct machine *machines__find(struct machines *machines, pid_t pid)
 {
-	struct rb_node **p = &machines->guests.rb_node;
+	struct rb_node **p = &machines->guests.rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct machine *machine;
 	struct machine *default_machine = NULL;
@@ -340,7 +343,7 @@ void machines__process_guests(struct machines *machines,
 {
 	struct rb_node *nd;
 
-	for (nd = rb_first(&machines->guests); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&machines->guests); nd; nd = rb_next(nd)) {
 		struct machine *pos = rb_entry(nd, struct machine, rb_node);
 		process(pos, data);
 	}
@@ -353,7 +356,8 @@ void machines__set_id_hdr_size(struct machines *machines, u16 id_hdr_size)
 
 	machines->host.id_hdr_size = id_hdr_size;
 
-	for (node = rb_first(&machines->guests); node; node = rb_next(node)) {
+	for (node = rb_first_cached(&machines->guests); node;
+	     node = rb_next(node)) {
 		machine = rb_entry(node, struct machine, rb_node);
 		machine->id_hdr_size = id_hdr_size;
 	}
@@ -466,9 +470,10 @@ static struct thread *____machine__findnew_thread(struct machine *machine,
 						  pid_t pid, pid_t tid,
 						  bool create)
 {
-	struct rb_node **p = &threads->entries.rb_node;
+	struct rb_node **p = &threads->entries.rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct thread *th;
+	bool leftmost = true;
 
 	th = threads__get_last_match(threads, machine, pid, tid);
 	if (th)
@@ -486,8 +491,10 @@ static struct thread *____machine__findnew_thread(struct machine *machine,
 
 		if (tid < th->tid)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 
 	if (!create)
@@ -496,7 +503,7 @@ static struct thread *____machine__findnew_thread(struct machine *machine,
 	th = thread__new(pid, tid);
 	if (th != NULL) {
 		rb_link_node(&th->rb_node, parent, p);
-		rb_insert_color(&th->rb_node, &threads->entries);
+		rb_insert_color_cached(&th->rb_node, &threads->entries, leftmost);
 
 		/*
 		 * We have to initialize map_groups separately
@@ -507,7 +514,7 @@ static struct thread *____machine__findnew_thread(struct machine *machine,
 		 * leader and that would screwed the rb tree.
 		 */
 		if (thread__init_map_groups(th, machine)) {
-			rb_erase_init(&th->rb_node, &threads->entries);
+			rb_erase_cached(&th->rb_node, &threads->entries);
 			RB_CLEAR_NODE(&th->rb_node);
 			thread__put(th);
 			return NULL;
@@ -798,7 +805,7 @@ size_t machines__fprintf_dsos(struct machines *machines, FILE *fp)
 	struct rb_node *nd;
 	size_t ret = __dsos__fprintf(&machines->host.dsos.head, fp);
 
-	for (nd = rb_first(&machines->guests); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&machines->guests); nd; nd = rb_next(nd)) {
 		struct machine *pos = rb_entry(nd, struct machine, rb_node);
 		ret += __dsos__fprintf(&pos->dsos.head, fp);
 	}
@@ -818,7 +825,7 @@ size_t machines__fprintf_dsos_buildid(struct machines *machines, FILE *fp,
 	struct rb_node *nd;
 	size_t ret = machine__fprintf_dsos_buildid(&machines->host, fp, skip, parm);
 
-	for (nd = rb_first(&machines->guests); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&machines->guests); nd; nd = rb_next(nd)) {
 		struct machine *pos = rb_entry(nd, struct machine, rb_node);
 		ret += machine__fprintf_dsos_buildid(pos, fp, skip, parm);
 	}
@@ -858,7 +865,8 @@ size_t machine__fprintf(struct machine *machine, FILE *fp)
 
 		ret = fprintf(fp, "Threads: %u\n", threads->nr);
 
-		for (nd = rb_first(&threads->entries); nd; nd = rb_next(nd)) {
+		for (nd = rb_first_cached(&threads->entries); nd;
+		     nd = rb_next(nd)) {
 			struct thread *pos = rb_entry(nd, struct thread, rb_node);
 
 			ret += thread__fprintf(pos, fp);
@@ -1161,7 +1169,7 @@ int machines__create_guest_kernel_maps(struct machines *machines)
 
 void machines__destroy_kernel_maps(struct machines *machines)
 {
-	struct rb_node *next = rb_first(&machines->guests);
+	struct rb_node *next = rb_first_cached(&machines->guests);
 
 	machine__destroy_kernel_maps(&machines->host);
 
@@ -1169,7 +1177,7 @@ void machines__destroy_kernel_maps(struct machines *machines)
 		struct machine *pos = rb_entry(next, struct machine, rb_node);
 
 		next = rb_next(&pos->rb_node);
-		rb_erase(&pos->rb_node, &machines->guests);
+		rb_erase_cached(&pos->rb_node, &machines->guests);
 		machine__delete(pos);
 	}
 }
@@ -1734,7 +1742,7 @@ static void __machine__remove_thread(struct machine *machine, struct thread *th,
 	BUG_ON(refcount_read(&th->refcnt) == 0);
 	if (lock)
 		down_write(&threads->lock);
-	rb_erase_init(&th->rb_node, &threads->entries);
+	rb_erase_cached(&th->rb_node, &threads->entries);
 	RB_CLEAR_NODE(&th->rb_node);
 	--threads->nr;
 	/*
@@ -2511,7 +2519,8 @@ int machine__for_each_thread(struct machine *machine,
 
 	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
 		threads = &machine->threads[i];
-		for (nd = rb_first(&threads->entries); nd; nd = rb_next(nd)) {
+		for (nd = rb_first_cached(&threads->entries); nd;
+		     nd = rb_next(nd)) {
 			thread = rb_entry(nd, struct thread, rb_node);
 			rc = fn(thread, priv);
 			if (rc != 0)
@@ -2538,7 +2547,7 @@ int machines__for_each_thread(struct machines *machines,
 	if (rc != 0)
 		return rc;
 
-	for (nd = rb_first(&machines->guests); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&machines->guests); nd; nd = rb_next(nd)) {
 		struct machine *machine = rb_entry(nd, struct machine, rb_node);
 
 		rc = machine__for_each_thread(machine, fn, priv);
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 4ecd380ce1b4..e2f3df45410a 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -29,11 +29,11 @@ struct vdso_info;
 #define THREADS__TABLE_SIZE	(1 << THREADS__TABLE_BITS)
 
 struct threads {
-	struct rb_root	  entries;
-	struct rw_semaphore lock;
-	unsigned int	  nr;
-	struct list_head  dead;
-	struct thread	  *last_match;
+	struct rb_root_cached  entries;
+	struct rw_semaphore    lock;
+	unsigned int	       nr;
+	struct list_head       dead;
+	struct thread	       *last_match;
 };
 
 struct machine {
@@ -140,7 +140,7 @@ typedef void (*machine__process_t)(struct machine *machine, void *data);
 
 struct machines {
 	struct machine host;
-	struct rb_root guests;
+	struct rb_root_cached guests;
 };
 
 void machines__init(struct machines *machines);
diff --git a/tools/perf/util/rb_resort.h b/tools/perf/util/rb_resort.h
index a920f702a74d..f272e181d3d6 100644
--- a/tools/perf/util/rb_resort.h
+++ b/tools/perf/util/rb_resort.h
@@ -144,8 +144,8 @@ struct __name##_sorted *__name = __name##_sorted__new
 				  __ilist->rblist.nr_entries)
 
 /* For 'struct machine->threads' */
-#define DECLARE_RESORT_RB_MACHINE_THREADS(__name, __machine, hash_bucket)	\
-	DECLARE_RESORT_RB(__name)(&__machine->threads[hash_bucket].entries,	\
-				  __machine->threads[hash_bucket].nr)
+#define DECLARE_RESORT_RB_MACHINE_THREADS(__name, __machine, hash_bucket)    \
+ DECLARE_RESORT_RB(__name)(&__machine->threads[hash_bucket].entries.rb_root, \
+			   __machine->threads[hash_bucket].nr)
 
 #endif /* _PERF_RESORT_RB_H_ */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 12/29] perf callchain: Use cached rbtrees
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 11/29] perf machine: Use cached rbtrees Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 13/29] perf util: Use cached rbtree for rblists Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Arnaldo Carvalho de Melo, Jiri Olsa,
	Namhyung Kim

From: Davidlohr Bueso <dave@stgolabs.net>

At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something required for nearly every in/srcline callchain node deletion
(in/srcline__tree_delete()).

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-4-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/dso.c     |  4 ++--
 tools/perf/util/dso.h     |  4 ++--
 tools/perf/util/srcline.c | 43 +++++++++++++++++++++++----------------
 tools/perf/util/srcline.h | 13 ++++++------
 4 files changed, 36 insertions(+), 28 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 8cbbb5be4a0b..a82bbf6e24e0 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1198,8 +1198,8 @@ struct dso *dso__new(const char *name)
 		dso__set_short_name(dso, dso->name, false);
 		dso->symbols = dso->symbol_names = RB_ROOT;
 		dso->data.cache = RB_ROOT;
-		dso->inlined_nodes = RB_ROOT;
-		dso->srclines = RB_ROOT;
+		dso->inlined_nodes = RB_ROOT_CACHED;
+		dso->srclines = RB_ROOT_CACHED;
 		dso->data.fd = -1;
 		dso->data.status = DSO_DATA_STATUS_UNKNOWN;
 		dso->symtab_type = DSO_BINARY_TYPE__NOT_FOUND;
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 2de54c0b26b2..40edfb375a8b 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -143,8 +143,8 @@ struct dso {
 	struct rb_root	 *root;		/* root of rbtree that rb_node is in */
 	struct rb_root	 symbols;
 	struct rb_root	 symbol_names;
-	struct rb_root	 inlined_nodes;
-	struct rb_root	 srclines;
+	struct rb_root_cached inlined_nodes;
+	struct rb_root_cached srclines;
 	struct {
 		u64		addr;
 		struct symbol	*symbol;
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index dc86597d0cc4..00f215580b5a 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -594,11 +594,12 @@ struct srcline_node {
 	struct rb_node		rb_node;
 };
 
-void srcline__tree_insert(struct rb_root *tree, u64 addr, char *srcline)
+void srcline__tree_insert(struct rb_root_cached *tree, u64 addr, char *srcline)
 {
-	struct rb_node **p = &tree->rb_node;
+	struct rb_node **p = &tree->rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct srcline_node *i, *node;
+	bool leftmost = true;
 
 	node = zalloc(sizeof(struct srcline_node));
 	if (!node) {
@@ -614,16 +615,18 @@ void srcline__tree_insert(struct rb_root *tree, u64 addr, char *srcline)
 		i = rb_entry(parent, struct srcline_node, rb_node);
 		if (addr < i->addr)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 	rb_link_node(&node->rb_node, parent, p);
-	rb_insert_color(&node->rb_node, tree);
+	rb_insert_color_cached(&node->rb_node, tree, leftmost);
 }
 
-char *srcline__tree_find(struct rb_root *tree, u64 addr)
+char *srcline__tree_find(struct rb_root_cached *tree, u64 addr)
 {
-	struct rb_node *n = tree->rb_node;
+	struct rb_node *n = tree->rb_root.rb_node;
 
 	while (n) {
 		struct srcline_node *i = rb_entry(n, struct srcline_node,
@@ -640,15 +643,15 @@ char *srcline__tree_find(struct rb_root *tree, u64 addr)
 	return NULL;
 }
 
-void srcline__tree_delete(struct rb_root *tree)
+void srcline__tree_delete(struct rb_root_cached *tree)
 {
 	struct srcline_node *pos;
-	struct rb_node *next = rb_first(tree);
+	struct rb_node *next = rb_first_cached(tree);
 
 	while (next) {
 		pos = rb_entry(next, struct srcline_node, rb_node);
 		next = rb_next(&pos->rb_node);
-		rb_erase(&pos->rb_node, tree);
+		rb_erase_cached(&pos->rb_node, tree);
 		free_srcline(pos->srcline);
 		zfree(&pos);
 	}
@@ -682,28 +685,32 @@ void inline_node__delete(struct inline_node *node)
 	free(node);
 }
 
-void inlines__tree_insert(struct rb_root *tree, struct inline_node *inlines)
+void inlines__tree_insert(struct rb_root_cached *tree,
+			  struct inline_node *inlines)
 {
-	struct rb_node **p = &tree->rb_node;
+	struct rb_node **p = &tree->rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	const u64 addr = inlines->addr;
 	struct inline_node *i;
+	bool leftmost = true;
 
 	while (*p != NULL) {
 		parent = *p;
 		i = rb_entry(parent, struct inline_node, rb_node);
 		if (addr < i->addr)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 	rb_link_node(&inlines->rb_node, parent, p);
-	rb_insert_color(&inlines->rb_node, tree);
+	rb_insert_color_cached(&inlines->rb_node, tree, leftmost);
 }
 
-struct inline_node *inlines__tree_find(struct rb_root *tree, u64 addr)
+struct inline_node *inlines__tree_find(struct rb_root_cached *tree, u64 addr)
 {
-	struct rb_node *n = tree->rb_node;
+	struct rb_node *n = tree->rb_root.rb_node;
 
 	while (n) {
 		struct inline_node *i = rb_entry(n, struct inline_node,
@@ -720,15 +727,15 @@ struct inline_node *inlines__tree_find(struct rb_root *tree, u64 addr)
 	return NULL;
 }
 
-void inlines__tree_delete(struct rb_root *tree)
+void inlines__tree_delete(struct rb_root_cached *tree)
 {
 	struct inline_node *pos;
-	struct rb_node *next = rb_first(tree);
+	struct rb_node *next = rb_first_cached(tree);
 
 	while (next) {
 		pos = rb_entry(next, struct inline_node, rb_node);
 		next = rb_next(&pos->rb_node);
-		rb_erase(&pos->rb_node, tree);
+		rb_erase_cached(&pos->rb_node, tree);
 		inline_node__delete(pos);
 	}
 }
diff --git a/tools/perf/util/srcline.h b/tools/perf/util/srcline.h
index 5762212dc342..b11a0aaaa676 100644
--- a/tools/perf/util/srcline.h
+++ b/tools/perf/util/srcline.h
@@ -19,11 +19,11 @@ void free_srcline(char *srcline);
 char *get_srcline_split(struct dso *dso, u64 addr, unsigned *line);
 
 /* insert the srcline into the DSO, which will take ownership */
-void srcline__tree_insert(struct rb_root *tree, u64 addr, char *srcline);
+void srcline__tree_insert(struct rb_root_cached *tree, u64 addr, char *srcline);
 /* find previously inserted srcline */
-char *srcline__tree_find(struct rb_root *tree, u64 addr);
+char *srcline__tree_find(struct rb_root_cached *tree, u64 addr);
 /* delete all srclines within the tree */
-void srcline__tree_delete(struct rb_root *tree);
+void srcline__tree_delete(struct rb_root_cached *tree);
 
 #define SRCLINE_UNKNOWN  ((char *) "??:0")
 
@@ -46,10 +46,11 @@ struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr,
 void inline_node__delete(struct inline_node *node);
 
 /* insert the inline node list into the DSO, which will take ownership */
-void inlines__tree_insert(struct rb_root *tree, struct inline_node *inlines);
+void inlines__tree_insert(struct rb_root_cached *tree,
+			  struct inline_node *inlines);
 /* find previously inserted inline node list */
-struct inline_node *inlines__tree_find(struct rb_root *tree, u64 addr);
+struct inline_node *inlines__tree_find(struct rb_root_cached *tree, u64 addr);
 /* delete all nodes within the tree of inline_node s */
-void inlines__tree_delete(struct rb_root *tree);
+void inlines__tree_delete(struct rb_root_cached *tree);
 
 #endif /* PERF_SRCLINE_H */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 13/29] perf util: Use cached rbtree for rblists
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 12/29] perf callchain: " Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 14/29] perf symbols: Use cached rbtrees Arnaldo Carvalho de Melo
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Arnaldo Carvalho de Melo, Jiri Olsa,
	Namhyung Kim

From: Davidlohr Bueso <dave@stgolabs.net>

At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something required for any of the strlist or intlist traversals
(XXX_for_each_entry()). There are a number of users in perf of these
(particularly strlists), including probes, and buildid.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-5-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intlist.h     |  2 +-
 tools/perf/util/metricgroup.c |  2 +-
 tools/perf/util/rb_resort.h   |  2 +-
 tools/perf/util/rblist.c      | 28 ++++++++++++++++++----------
 tools/perf/util/rblist.h      |  2 +-
 tools/perf/util/stat-shadow.c |  2 +-
 tools/perf/util/strlist.h     |  2 +-
 7 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/intlist.h b/tools/perf/util/intlist.h
index 85bab8735fa9..5c19ee001299 100644
--- a/tools/perf/util/intlist.h
+++ b/tools/perf/util/intlist.h
@@ -45,7 +45,7 @@ static inline unsigned int intlist__nr_entries(const struct intlist *ilist)
 /* For intlist iteration */
 static inline struct int_node *intlist__first(struct intlist *ilist)
 {
-	struct rb_node *rn = rb_first(&ilist->rblist.entries);
+	struct rb_node *rn = rb_first_cached(&ilist->rblist.entries);
 	return rn ? rb_entry(rn, struct int_node, rb_node) : NULL;
 }
 static inline struct int_node *intlist__next(struct int_node *in)
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index a28f9b5cc4ff..8529cbd3955b 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -352,7 +352,7 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
 	else if (metrics && !raw)
 		printf("\nMetrics:\n\n");
 
-	for (node = rb_first(&groups.entries); node; node = next) {
+	for (node = rb_first_cached(&groups.entries); node; node = next) {
 		struct mep *me = container_of(node, struct mep, nd);
 
 		if (metricgroups)
diff --git a/tools/perf/util/rb_resort.h b/tools/perf/util/rb_resort.h
index f272e181d3d6..376e86cb4c3c 100644
--- a/tools/perf/util/rb_resort.h
+++ b/tools/perf/util/rb_resort.h
@@ -140,7 +140,7 @@ struct __name##_sorted *__name = __name##_sorted__new
 
 /* For 'struct intlist' */
 #define DECLARE_RESORT_RB_INTLIST(__name, __ilist)				\
-	DECLARE_RESORT_RB(__name)(&__ilist->rblist.entries,			\
+	DECLARE_RESORT_RB(__name)(&__ilist->rblist.entries.rb_root,		\
 				  __ilist->rblist.nr_entries)
 
 /* For 'struct machine->threads' */
diff --git a/tools/perf/util/rblist.c b/tools/perf/util/rblist.c
index 0efc3258c648..11e07fab20dc 100644
--- a/tools/perf/util/rblist.c
+++ b/tools/perf/util/rblist.c
@@ -13,8 +13,9 @@
 
 int rblist__add_node(struct rblist *rblist, const void *new_entry)
 {
-	struct rb_node **p = &rblist->entries.rb_node;
+	struct rb_node **p = &rblist->entries.rb_root.rb_node;
 	struct rb_node *parent = NULL, *new_node;
+	bool leftmost = true;
 
 	while (*p != NULL) {
 		int rc;
@@ -24,8 +25,10 @@ int rblist__add_node(struct rblist *rblist, const void *new_entry)
 		rc = rblist->node_cmp(parent, new_entry);
 		if (rc > 0)
 			p = &(*p)->rb_left;
-		else if (rc < 0)
+		else if (rc < 0) {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 		else
 			return -EEXIST;
 	}
@@ -35,7 +38,7 @@ int rblist__add_node(struct rblist *rblist, const void *new_entry)
 		return -ENOMEM;
 
 	rb_link_node(new_node, parent, p);
-	rb_insert_color(new_node, &rblist->entries);
+	rb_insert_color_cached(new_node, &rblist->entries, leftmost);
 	++rblist->nr_entries;
 
 	return 0;
@@ -43,7 +46,7 @@ int rblist__add_node(struct rblist *rblist, const void *new_entry)
 
 void rblist__remove_node(struct rblist *rblist, struct rb_node *rb_node)
 {
-	rb_erase(rb_node, &rblist->entries);
+	rb_erase_cached(rb_node, &rblist->entries);
 	--rblist->nr_entries;
 	rblist->node_delete(rblist, rb_node);
 }
@@ -52,8 +55,9 @@ static struct rb_node *__rblist__findnew(struct rblist *rblist,
 					 const void *entry,
 					 bool create)
 {
-	struct rb_node **p = &rblist->entries.rb_node;
+	struct rb_node **p = &rblist->entries.rb_root.rb_node;
 	struct rb_node *parent = NULL, *new_node = NULL;
+	bool leftmost = true;
 
 	while (*p != NULL) {
 		int rc;
@@ -63,8 +67,10 @@ static struct rb_node *__rblist__findnew(struct rblist *rblist,
 		rc = rblist->node_cmp(parent, entry);
 		if (rc > 0)
 			p = &(*p)->rb_left;
-		else if (rc < 0)
+		else if (rc < 0) {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 		else
 			return parent;
 	}
@@ -73,7 +79,8 @@ static struct rb_node *__rblist__findnew(struct rblist *rblist,
 		new_node = rblist->node_new(rblist, entry);
 		if (new_node) {
 			rb_link_node(new_node, parent, p);
-			rb_insert_color(new_node, &rblist->entries);
+			rb_insert_color_cached(new_node,
+					       &rblist->entries, leftmost);
 			++rblist->nr_entries;
 		}
 	}
@@ -94,7 +101,7 @@ struct rb_node *rblist__findnew(struct rblist *rblist, const void *entry)
 void rblist__init(struct rblist *rblist)
 {
 	if (rblist != NULL) {
-		rblist->entries	 = RB_ROOT;
+		rblist->entries	 = RB_ROOT_CACHED;
 		rblist->nr_entries = 0;
 	}
 
@@ -103,7 +110,7 @@ void rblist__init(struct rblist *rblist)
 
 void rblist__exit(struct rblist *rblist)
 {
-	struct rb_node *pos, *next = rb_first(&rblist->entries);
+	struct rb_node *pos, *next = rb_first_cached(&rblist->entries);
 
 	while (next) {
 		pos = next;
@@ -124,7 +131,8 @@ struct rb_node *rblist__entry(const struct rblist *rblist, unsigned int idx)
 {
 	struct rb_node *node;
 
-	for (node = rb_first(&rblist->entries); node; node = rb_next(node)) {
+	for (node = rb_first_cached(&rblist->entries); node;
+	     node = rb_next(node)) {
 		if (!idx--)
 			return node;
 	}
diff --git a/tools/perf/util/rblist.h b/tools/perf/util/rblist.h
index 76df15c27f5f..14b232a4d0b6 100644
--- a/tools/perf/util/rblist.h
+++ b/tools/perf/util/rblist.h
@@ -20,7 +20,7 @@
  */
 
 struct rblist {
-	struct rb_root entries;
+	struct rb_root_cached entries;
 	unsigned int   nr_entries;
 
 	int (*node_cmp)(struct rb_node *rbn, const void *entry);
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 3c22c58b3e90..83d8094be4fe 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -168,7 +168,7 @@ static void reset_stat(struct runtime_stat *st)
 	struct rb_node *pos, *next;
 
 	rblist = &st->value_list;
-	next = rb_first(&rblist->entries);
+	next = rb_first_cached(&rblist->entries);
 	while (next) {
 		pos = next;
 		next = rb_next(pos);
diff --git a/tools/perf/util/strlist.h b/tools/perf/util/strlist.h
index d58f1e08b170..7e82c71dcc42 100644
--- a/tools/perf/util/strlist.h
+++ b/tools/perf/util/strlist.h
@@ -57,7 +57,7 @@ static inline unsigned int strlist__nr_entries(const struct strlist *slist)
 /* For strlist iteration */
 static inline struct str_node *strlist__first(struct strlist *slist)
 {
-	struct rb_node *rn = rb_first(&slist->rblist.entries);
+	struct rb_node *rn = rb_first_cached(&slist->rblist.entries);
 	return rn ? rb_entry(rn, struct str_node, rb_node) : NULL;
 }
 static inline struct str_node *strlist__next(struct str_node *sn)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 14/29] perf symbols: Use cached rbtrees
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 13/29] perf util: Use cached rbtree for rblists Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 15/29] perf hist: " Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Arnaldo Carvalho de Melo, Jiri Olsa,
	Namhyung Kim

From: Davidlohr Bueso <dave@stgolabs.net>

At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node).

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-6-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-annotate.c    |  2 +-
 tools/perf/util/dso.c            |  4 +-
 tools/perf/util/dso.h            |  6 +--
 tools/perf/util/map.c            |  8 +--
 tools/perf/util/probe-event.c    |  3 +-
 tools/perf/util/symbol.c         | 87 +++++++++++++++++---------------
 tools/perf/util/symbol.h         | 13 ++---
 tools/perf/util/symbol_fprintf.c |  2 +-
 8 files changed, 67 insertions(+), 58 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 93d679eaf1f4..cc3da5564300 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -227,7 +227,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 		 * the DSO?
 		 */
 		if (al->sym != NULL) {
-			rb_erase(&al->sym->rb_node,
+			rb_erase_cached(&al->sym->rb_node,
 				 &al->map->dso->symbols);
 			symbol__delete(al->sym);
 			dso__reset_find_symbol_cache(al->map->dso);
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index a82bbf6e24e0..d5273baa7753 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1196,7 +1196,7 @@ struct dso *dso__new(const char *name)
 		strcpy(dso->name, name);
 		dso__set_long_name(dso, dso->name, false);
 		dso__set_short_name(dso, dso->name, false);
-		dso->symbols = dso->symbol_names = RB_ROOT;
+		dso->symbols = dso->symbol_names = RB_ROOT_CACHED;
 		dso->data.cache = RB_ROOT;
 		dso->inlined_nodes = RB_ROOT_CACHED;
 		dso->srclines = RB_ROOT_CACHED;
@@ -1468,7 +1468,7 @@ size_t dso__fprintf(struct dso *dso, FILE *fp)
 	ret += fprintf(fp, "%sloaded, ", dso__loaded(dso) ? "" : "NOT ");
 	ret += dso__fprintf_buildid(dso, fp);
 	ret += fprintf(fp, ")\n");
-	for (nd = rb_first(&dso->symbols); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&dso->symbols); nd; nd = rb_next(nd)) {
 		struct symbol *pos = rb_entry(nd, struct symbol, rb_node);
 		ret += symbol__fprintf(pos, fp);
 	}
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 40edfb375a8b..bb417c54c25a 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -141,8 +141,8 @@ struct dso {
 	struct list_head node;
 	struct rb_node	 rb_node;	/* rbtree node sorted by long name */
 	struct rb_root	 *root;		/* root of rbtree that rb_node is in */
-	struct rb_root	 symbols;
-	struct rb_root	 symbol_names;
+	struct rb_root_cached symbols;
+	struct rb_root_cached symbol_names;
 	struct rb_root_cached inlined_nodes;
 	struct rb_root_cached srclines;
 	struct {
@@ -236,7 +236,7 @@ bool dso__loaded(const struct dso *dso);
 
 static inline bool dso__has_symbols(const struct dso *dso)
 {
-	return !RB_EMPTY_ROOT(&dso->symbols);
+	return !RB_EMPTY_ROOT(&dso->symbols.rb_root);
 }
 
 bool dso__sorted_by_name(const struct dso *dso);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 6751301a755c..cf407cc9d915 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -286,8 +286,8 @@ void map__put(struct map *map)
 
 void map__fixup_start(struct map *map)
 {
-	struct rb_root *symbols = &map->dso->symbols;
-	struct rb_node *nd = rb_first(symbols);
+	struct rb_root_cached *symbols = &map->dso->symbols;
+	struct rb_node *nd = rb_first_cached(symbols);
 	if (nd != NULL) {
 		struct symbol *sym = rb_entry(nd, struct symbol, rb_node);
 		map->start = sym->start;
@@ -296,8 +296,8 @@ void map__fixup_start(struct map *map)
 
 void map__fixup_end(struct map *map)
 {
-	struct rb_root *symbols = &map->dso->symbols;
-	struct rb_node *nd = rb_last(symbols);
+	struct rb_root_cached *symbols = &map->dso->symbols;
+	struct rb_node *nd = rb_last(&symbols->rb_root);
 	if (nd != NULL) {
 		struct symbol *sym = rb_entry(nd, struct symbol, rb_node);
 		map->end = sym->end;
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 736787a18b9e..4008f0cc36e5 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -3529,7 +3529,8 @@ int show_available_funcs(const char *target, struct nsinfo *nsi,
 	/* Show all (filtered) symbols */
 	setup_pager();
 
-	for (nd = rb_first(&map->dso->symbol_names); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&map->dso->symbol_names); nd;
+	     nd = rb_next(nd)) {
 		struct symbol_name_rb_node *pos = rb_entry(nd, struct symbol_name_rb_node, rb_node);
 
 		if (strfilter__compare(_filter, pos->sym.name))
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 48efad6d0f90..bcbcbd610460 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -163,7 +163,7 @@ static int choose_best_symbol(struct symbol *syma, struct symbol *symb)
 	return arch__choose_best_symbol(syma, symb);
 }
 
-void symbols__fixup_duplicate(struct rb_root *symbols)
+void symbols__fixup_duplicate(struct rb_root_cached *symbols)
 {
 	struct rb_node *nd;
 	struct symbol *curr, *next;
@@ -171,7 +171,7 @@ void symbols__fixup_duplicate(struct rb_root *symbols)
 	if (symbol_conf.allow_aliases)
 		return;
 
-	nd = rb_first(symbols);
+	nd = rb_first_cached(symbols);
 
 	while (nd) {
 		curr = rb_entry(nd, struct symbol, rb_node);
@@ -186,20 +186,20 @@ void symbols__fixup_duplicate(struct rb_root *symbols)
 			continue;
 
 		if (choose_best_symbol(curr, next) == SYMBOL_A) {
-			rb_erase(&next->rb_node, symbols);
+			rb_erase_cached(&next->rb_node, symbols);
 			symbol__delete(next);
 			goto again;
 		} else {
 			nd = rb_next(&curr->rb_node);
-			rb_erase(&curr->rb_node, symbols);
+			rb_erase_cached(&curr->rb_node, symbols);
 			symbol__delete(curr);
 		}
 	}
 }
 
-void symbols__fixup_end(struct rb_root *symbols)
+void symbols__fixup_end(struct rb_root_cached *symbols)
 {
-	struct rb_node *nd, *prevnd = rb_first(symbols);
+	struct rb_node *nd, *prevnd = rb_first_cached(symbols);
 	struct symbol *curr, *prev;
 
 	if (prevnd == NULL)
@@ -282,25 +282,27 @@ void symbol__delete(struct symbol *sym)
 	free(((void *)sym) - symbol_conf.priv_size);
 }
 
-void symbols__delete(struct rb_root *symbols)
+void symbols__delete(struct rb_root_cached *symbols)
 {
 	struct symbol *pos;
-	struct rb_node *next = rb_first(symbols);
+	struct rb_node *next = rb_first_cached(symbols);
 
 	while (next) {
 		pos = rb_entry(next, struct symbol, rb_node);
 		next = rb_next(&pos->rb_node);
-		rb_erase(&pos->rb_node, symbols);
+		rb_erase_cached(&pos->rb_node, symbols);
 		symbol__delete(pos);
 	}
 }
 
-void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel)
+void __symbols__insert(struct rb_root_cached *symbols,
+		       struct symbol *sym, bool kernel)
 {
-	struct rb_node **p = &symbols->rb_node;
+	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	const u64 ip = sym->start;
 	struct symbol *s;
+	bool leftmost = true;
 
 	if (kernel) {
 		const char *name = sym->name;
@@ -318,26 +320,28 @@ void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel)
 		s = rb_entry(parent, struct symbol, rb_node);
 		if (ip < s->start)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 	rb_link_node(&sym->rb_node, parent, p);
-	rb_insert_color(&sym->rb_node, symbols);
+	rb_insert_color_cached(&sym->rb_node, symbols, leftmost);
 }
 
-void symbols__insert(struct rb_root *symbols, struct symbol *sym)
+void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym)
 {
 	__symbols__insert(symbols, sym, false);
 }
 
-static struct symbol *symbols__find(struct rb_root *symbols, u64 ip)
+static struct symbol *symbols__find(struct rb_root_cached *symbols, u64 ip)
 {
 	struct rb_node *n;
 
 	if (symbols == NULL)
 		return NULL;
 
-	n = symbols->rb_node;
+	n = symbols->rb_root.rb_node;
 
 	while (n) {
 		struct symbol *s = rb_entry(n, struct symbol, rb_node);
@@ -353,9 +357,9 @@ static struct symbol *symbols__find(struct rb_root *symbols, u64 ip)
 	return NULL;
 }
 
-static struct symbol *symbols__first(struct rb_root *symbols)
+static struct symbol *symbols__first(struct rb_root_cached *symbols)
 {
-	struct rb_node *n = rb_first(symbols);
+	struct rb_node *n = rb_first_cached(symbols);
 
 	if (n)
 		return rb_entry(n, struct symbol, rb_node);
@@ -363,9 +367,9 @@ static struct symbol *symbols__first(struct rb_root *symbols)
 	return NULL;
 }
 
-static struct symbol *symbols__last(struct rb_root *symbols)
+static struct symbol *symbols__last(struct rb_root_cached *symbols)
 {
-	struct rb_node *n = rb_last(symbols);
+	struct rb_node *n = rb_last(&symbols->rb_root);
 
 	if (n)
 		return rb_entry(n, struct symbol, rb_node);
@@ -383,11 +387,12 @@ static struct symbol *symbols__next(struct symbol *sym)
 	return NULL;
 }
 
-static void symbols__insert_by_name(struct rb_root *symbols, struct symbol *sym)
+static void symbols__insert_by_name(struct rb_root_cached *symbols, struct symbol *sym)
 {
-	struct rb_node **p = &symbols->rb_node;
+	struct rb_node **p = &symbols->rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct symbol_name_rb_node *symn, *s;
+	bool leftmost = true;
 
 	symn = container_of(sym, struct symbol_name_rb_node, sym);
 
@@ -396,19 +401,21 @@ static void symbols__insert_by_name(struct rb_root *symbols, struct symbol *sym)
 		s = rb_entry(parent, struct symbol_name_rb_node, rb_node);
 		if (strcmp(sym->name, s->sym.name) < 0)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 	rb_link_node(&symn->rb_node, parent, p);
-	rb_insert_color(&symn->rb_node, symbols);
+	rb_insert_color_cached(&symn->rb_node, symbols, leftmost);
 }
 
-static void symbols__sort_by_name(struct rb_root *symbols,
-				  struct rb_root *source)
+static void symbols__sort_by_name(struct rb_root_cached *symbols,
+				  struct rb_root_cached *source)
 {
 	struct rb_node *nd;
 
-	for (nd = rb_first(source); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(source); nd; nd = rb_next(nd)) {
 		struct symbol *pos = rb_entry(nd, struct symbol, rb_node);
 		symbols__insert_by_name(symbols, pos);
 	}
@@ -431,7 +438,7 @@ int symbol__match_symbol_name(const char *name, const char *str,
 		return arch__compare_symbol_names(name, str);
 }
 
-static struct symbol *symbols__find_by_name(struct rb_root *symbols,
+static struct symbol *symbols__find_by_name(struct rb_root_cached *symbols,
 					    const char *name,
 					    enum symbol_tag_include includes)
 {
@@ -441,7 +448,7 @@ static struct symbol *symbols__find_by_name(struct rb_root *symbols,
 	if (symbols == NULL)
 		return NULL;
 
-	n = symbols->rb_node;
+	n = symbols->rb_root.rb_node;
 
 	while (n) {
 		int cmp;
@@ -644,7 +651,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 {
 	struct symbol *sym;
 	struct dso *dso = arg;
-	struct rb_root *root = &dso->symbols;
+	struct rb_root_cached *root = &dso->symbols;
 
 	if (!symbol_type__filter(type))
 		return 0;
@@ -681,14 +688,14 @@ static int map_groups__split_kallsyms_for_kcore(struct map_groups *kmaps, struct
 	struct map *curr_map;
 	struct symbol *pos;
 	int count = 0;
-	struct rb_root old_root = dso->symbols;
-	struct rb_root *root = &dso->symbols;
-	struct rb_node *next = rb_first(root);
+	struct rb_root_cached old_root = dso->symbols;
+	struct rb_root_cached *root = &dso->symbols;
+	struct rb_node *next = rb_first_cached(root);
 
 	if (!kmaps)
 		return -1;
 
-	*root = RB_ROOT;
+	*root = RB_ROOT_CACHED;
 
 	while (next) {
 		char *module;
@@ -696,8 +703,8 @@ static int map_groups__split_kallsyms_for_kcore(struct map_groups *kmaps, struct
 		pos = rb_entry(next, struct symbol, rb_node);
 		next = rb_next(&pos->rb_node);
 
-		rb_erase_init(&pos->rb_node, &old_root);
-
+		rb_erase_cached(&pos->rb_node, &old_root);
+		RB_CLEAR_NODE(&pos->rb_node);
 		module = strchr(pos->name, '\t');
 		if (module)
 			*module = '\0';
@@ -734,8 +741,8 @@ static int map_groups__split_kallsyms(struct map_groups *kmaps, struct dso *dso,
 	struct map *curr_map = initial_map;
 	struct symbol *pos;
 	int count = 0, moved = 0;
-	struct rb_root *root = &dso->symbols;
-	struct rb_node *next = rb_first(root);
+	struct rb_root_cached *root = &dso->symbols;
+	struct rb_node *next = rb_first_cached(root);
 	int kernel_range = 0;
 	bool x86_64;
 
@@ -849,7 +856,7 @@ static int map_groups__split_kallsyms(struct map_groups *kmaps, struct dso *dso,
 		}
 add_symbol:
 		if (curr_map != initial_map) {
-			rb_erase(&pos->rb_node, root);
+			rb_erase_cached(&pos->rb_node, root);
 			symbols__insert(&curr_map->dso->symbols, pos);
 			++moved;
 		} else
@@ -857,7 +864,7 @@ static int map_groups__split_kallsyms(struct map_groups *kmaps, struct dso *dso,
 
 		continue;
 discard_symbol:
-		rb_erase(&pos->rb_node, root);
+		rb_erase_cached(&pos->rb_node, root);
 		symbol__delete(pos);
 	}
 
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 2162a5d9b3ac..56e2bcb907cc 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -68,7 +68,7 @@ struct symbol {
 };
 
 void symbol__delete(struct symbol *sym);
-void symbols__delete(struct rb_root *symbols);
+void symbols__delete(struct rb_root_cached *symbols);
 
 /* symbols__for_each_entry - iterate over symbols (rb_root)
  *
@@ -77,7 +77,7 @@ void symbols__delete(struct rb_root *symbols);
  * @nd: the 'struct rb_node *' to use as a temporary storage
  */
 #define symbols__for_each_entry(symbols, pos, nd)			\
-	for (nd = rb_first(symbols);					\
+	for (nd = rb_first_cached(symbols);					\
 	     nd && (pos = rb_entry(nd, struct symbol, rb_node));	\
 	     nd = rb_next(nd))
 
@@ -247,10 +247,11 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss);
 
 char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
 
-void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel);
-void symbols__insert(struct rb_root *symbols, struct symbol *sym);
-void symbols__fixup_duplicate(struct rb_root *symbols);
-void symbols__fixup_end(struct rb_root *symbols);
+void __symbols__insert(struct rb_root_cached *symbols, struct symbol *sym,
+		       bool kernel);
+void symbols__insert(struct rb_root_cached *symbols, struct symbol *sym);
+void symbols__fixup_duplicate(struct rb_root_cached *symbols);
+void symbols__fixup_end(struct rb_root_cached *symbols);
 void map_groups__fixup_end(struct map_groups *mg);
 
 typedef int (*mapfn_t)(u64 start, u64 len, u64 pgoff, void *data);
diff --git a/tools/perf/util/symbol_fprintf.c b/tools/perf/util/symbol_fprintf.c
index 50472c75da59..02e89b02c2ce 100644
--- a/tools/perf/util/symbol_fprintf.c
+++ b/tools/perf/util/symbol_fprintf.c
@@ -65,7 +65,7 @@ size_t dso__fprintf_symbols_by_name(struct dso *dso,
 	struct rb_node *nd;
 	struct symbol_name_rb_node *pos;
 
-	for (nd = rb_first(&dso->symbol_names); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&dso->symbol_names); nd; nd = rb_next(nd)) {
 		pos = rb_entry(nd, struct symbol_name_rb_node, rb_node);
 		fprintf(fp, "%s\n", pos->sym.name);
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 15/29] perf hist: Use cached rbtrees
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 14/29] perf symbols: Use cached rbtrees Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 16/29] perf sched: " Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Arnaldo Carvalho de Melo, Jiri Olsa,
	Namhyung Kim

From: Davidlohr Bueso <dave@stgolabs.net>

At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something heavily required for histograms. Specifically, the following
are converted to rb_root_cached, and users accordingly:

hist::entries_in_array
hist::entries_in
hist::entries
hist::entries_collapsed
hist_entry::hroot_in
hist_entry::hroot_out

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-7-dave@stgolabs.net
[ Added some missing conversions to rb_first_cached() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-annotate.c     |   2 +-
 tools/perf/builtin-c2c.c          |   6 +-
 tools/perf/builtin-diff.c         |  10 +-
 tools/perf/builtin-top.c          |   2 +-
 tools/perf/tests/hists_common.c   |   8 +-
 tools/perf/tests/hists_cumulate.c |  14 +--
 tools/perf/tests/hists_link.c     |   8 +-
 tools/perf/tests/hists_output.c   |  32 ++---
 tools/perf/ui/browsers/hists.c    |  16 +--
 tools/perf/ui/gtk/hists.c         |   6 +-
 tools/perf/ui/stdio/hist.c        |   3 +-
 tools/perf/util/hist.c            | 199 +++++++++++++++++-------------
 tools/perf/util/hist.h            |  10 +-
 tools/perf/util/sort.h            |   4 +-
 14 files changed, 173 insertions(+), 147 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index cc3da5564300..20db635347e5 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -305,7 +305,7 @@ static void hists__find_annotations(struct hists *hists,
 				    struct perf_evsel *evsel,
 				    struct perf_annotate *ann)
 {
-	struct rb_node *nd = rb_first(&hists->entries), *next;
+	struct rb_node *nd = rb_first_cached(&hists->entries), *next;
 	int key = K_RIGHT;
 
 	while (nd) {
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f2863496dede..72ec0ae7d8ad 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2088,7 +2088,7 @@ static int resort_hitm_cb(struct hist_entry *he)
 
 static int hists__iterate_cb(struct hists *hists, hists__resort_cb_t cb)
 {
-	struct rb_node *next = rb_first(&hists->entries);
+	struct rb_node *next = rb_first_cached(&hists->entries);
 	int ret = 0;
 
 	while (next) {
@@ -2215,7 +2215,7 @@ static void print_pareto(FILE *out)
 	if (WARN_ONCE(ret, "failed to setup sort entries\n"))
 		return;
 
-	nd = rb_first(&c2c.hists.hists.entries);
+	nd = rb_first_cached(&c2c.hists.hists.entries);
 
 	for (; nd; nd = rb_next(nd)) {
 		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
@@ -2283,7 +2283,7 @@ static void perf_c2c__hists_fprintf(FILE *out, struct perf_session *session)
 static void c2c_browser__update_nr_entries(struct hist_browser *hb)
 {
 	u64 nr_entries = 0;
-	struct rb_node *nd = rb_first(&hb->hists->entries);
+	struct rb_node *nd = rb_first_cached(&hb->hists->entries);
 
 	while (nd) {
 		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 39db2ee32d48..751e1971456b 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -429,7 +429,7 @@ get_pair_fmt(struct hist_entry *he, struct diff_hpp_fmt *dfmt)
 
 static void hists__baseline_only(struct hists *hists)
 {
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *next;
 
 	if (hists__has(hists, need_collapse))
@@ -437,13 +437,13 @@ static void hists__baseline_only(struct hists *hists)
 	else
 		root = hists->entries_in;
 
-	next = rb_first(root);
+	next = rb_first_cached(root);
 	while (next != NULL) {
 		struct hist_entry *he = rb_entry(next, struct hist_entry, rb_node_in);
 
 		next = rb_next(&he->rb_node_in);
 		if (!hist_entry__next_pair(he)) {
-			rb_erase(&he->rb_node_in, root);
+			rb_erase_cached(&he->rb_node_in, root);
 			hist_entry__delete(he);
 		}
 	}
@@ -451,7 +451,7 @@ static void hists__baseline_only(struct hists *hists)
 
 static void hists__precompute(struct hists *hists)
 {
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *next;
 
 	if (hists__has(hists, need_collapse))
@@ -459,7 +459,7 @@ static void hists__precompute(struct hists *hists)
 	else
 		root = hists->entries_in;
 
-	next = rb_first(root);
+	next = rb_first_cached(root);
 	while (next != NULL) {
 		struct hist_entry *he, *pair;
 		struct data__file *d;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5a486d4de56e..57b1d7495d02 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -367,7 +367,7 @@ static void perf_top__prompt_symbol(struct perf_top *top, const char *msg)
 	if (p)
 		*p = 0;
 
-	next = rb_first(&hists->entries);
+	next = rb_first_cached(&hists->entries);
 	while (next) {
 		n = rb_entry(next, struct hist_entry, rb_node);
 		if (n->ms.sym && !strcmp(buf, n->ms.sym->name)) {
diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index b889a28fd80b..6b0649c8b33e 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -161,7 +161,7 @@ struct machine *setup_fake_machine(struct machines *machines)
 void print_hists_in(struct hists *hists)
 {
 	int i = 0;
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *node;
 
 	if (hists__has(hists, need_collapse))
@@ -170,7 +170,7 @@ void print_hists_in(struct hists *hists)
 		root = hists->entries_in;
 
 	pr_info("----- %s --------\n", __func__);
-	node = rb_first(root);
+	node = rb_first_cached(root);
 	while (node) {
 		struct hist_entry *he;
 
@@ -191,13 +191,13 @@ void print_hists_in(struct hists *hists)
 void print_hists_out(struct hists *hists)
 {
 	int i = 0;
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *node;
 
 	root = &hists->entries;
 
 	pr_info("----- %s --------\n", __func__);
-	node = rb_first(root);
+	node = rb_first_cached(root);
 	while (node) {
 		struct hist_entry *he;
 
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index 65fe02bebbee..4ca417d1a06b 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -125,8 +125,8 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 static void del_hist_entries(struct hists *hists)
 {
 	struct hist_entry *he;
-	struct rb_root *root_in;
-	struct rb_root *root_out;
+	struct rb_root_cached *root_in;
+	struct rb_root_cached *root_out;
 	struct rb_node *node;
 
 	if (hists__has(hists, need_collapse))
@@ -136,12 +136,12 @@ static void del_hist_entries(struct hists *hists)
 
 	root_out = &hists->entries;
 
-	while (!RB_EMPTY_ROOT(root_out)) {
-		node = rb_first(root_out);
+	while (!RB_EMPTY_ROOT(&root_out->rb_root)) {
+		node = rb_first_cached(root_out);
 
 		he = rb_entry(node, struct hist_entry, rb_node);
-		rb_erase(node, root_out);
-		rb_erase(&he->rb_node_in, root_in);
+		rb_erase_cached(node, root_out);
+		rb_erase_cached(&he->rb_node_in, root_in);
 		hist_entry__delete(he);
 	}
 }
@@ -198,7 +198,7 @@ static int do_test(struct hists *hists, struct result *expected, size_t nr_expec
 		print_hists_out(hists);
 	}
 
-	root = &hists->entries;
+	root = &hists->entries.rb_root;
 	for (node = rb_first(root), i = 0;
 	     node && (he = rb_entry(node, struct hist_entry, rb_node));
 	     node = rb_next(node), i++) {
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 9a9d06cb0222..af633db63f4d 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -142,7 +142,7 @@ static int find_sample(struct sample *samples, size_t nr_samples,
 static int __validate_match(struct hists *hists)
 {
 	size_t count = 0;
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *node;
 
 	/*
@@ -153,7 +153,7 @@ static int __validate_match(struct hists *hists)
 	else
 		root = hists->entries_in;
 
-	node = rb_first(root);
+	node = rb_first_cached(root);
 	while (node) {
 		struct hist_entry *he;
 
@@ -192,7 +192,7 @@ static int __validate_link(struct hists *hists, int idx)
 	size_t count = 0;
 	size_t count_pair = 0;
 	size_t count_dummy = 0;
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *node;
 
 	/*
@@ -205,7 +205,7 @@ static int __validate_link(struct hists *hists, int idx)
 	else
 		root = hists->entries_in;
 
-	node = rb_first(root);
+	node = rb_first_cached(root);
 	while (node) {
 		struct hist_entry *he;
 
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index faacb4f41460..6c4bc68a77ee 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -91,8 +91,8 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 static void del_hist_entries(struct hists *hists)
 {
 	struct hist_entry *he;
-	struct rb_root *root_in;
-	struct rb_root *root_out;
+	struct rb_root_cached *root_in;
+	struct rb_root_cached *root_out;
 	struct rb_node *node;
 
 	if (hists__has(hists, need_collapse))
@@ -102,12 +102,12 @@ static void del_hist_entries(struct hists *hists)
 
 	root_out = &hists->entries;
 
-	while (!RB_EMPTY_ROOT(root_out)) {
-		node = rb_first(root_out);
+	while (!RB_EMPTY_ROOT(&root_out->rb_root)) {
+		node = rb_first_cached(root_out);
 
 		he = rb_entry(node, struct hist_entry, rb_node);
-		rb_erase(node, root_out);
-		rb_erase(&he->rb_node_in, root_in);
+		rb_erase_cached(node, root_out);
+		rb_erase_cached(&he->rb_node_in, root_in);
 		hist_entry__delete(he);
 	}
 }
@@ -126,7 +126,7 @@ static int test1(struct perf_evsel *evsel, struct machine *machine)
 	int err;
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *node;
 
 	field_order = NULL;
@@ -162,7 +162,7 @@ static int test1(struct perf_evsel *evsel, struct machine *machine)
 	}
 
 	root = &hists->entries;
-	node = rb_first(root);
+	node = rb_first_cached(root);
 	he = rb_entry(node, struct hist_entry, rb_node);
 	TEST_ASSERT_VAL("Invalid hist entry",
 			!strcmp(COMM(he), "perf") && !strcmp(DSO(he), "perf") &&
@@ -228,7 +228,7 @@ static int test2(struct perf_evsel *evsel, struct machine *machine)
 	int err;
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *node;
 
 	field_order = "overhead,cpu";
@@ -262,7 +262,7 @@ static int test2(struct perf_evsel *evsel, struct machine *machine)
 	}
 
 	root = &hists->entries;
-	node = rb_first(root);
+	node = rb_first_cached(root);
 	he = rb_entry(node, struct hist_entry, rb_node);
 	TEST_ASSERT_VAL("Invalid hist entry",
 			CPU(he) == 1 && PID(he) == 100 && he->stat.period == 300);
@@ -284,7 +284,7 @@ static int test3(struct perf_evsel *evsel, struct machine *machine)
 	int err;
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *node;
 
 	field_order = "comm,overhead,dso";
@@ -316,7 +316,7 @@ static int test3(struct perf_evsel *evsel, struct machine *machine)
 	}
 
 	root = &hists->entries;
-	node = rb_first(root);
+	node = rb_first_cached(root);
 	he = rb_entry(node, struct hist_entry, rb_node);
 	TEST_ASSERT_VAL("Invalid hist entry",
 			!strcmp(COMM(he), "bash") && !strcmp(DSO(he), "bash") &&
@@ -358,7 +358,7 @@ static int test4(struct perf_evsel *evsel, struct machine *machine)
 	int err;
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *node;
 
 	field_order = "dso,sym,comm,overhead,dso";
@@ -394,7 +394,7 @@ static int test4(struct perf_evsel *evsel, struct machine *machine)
 	}
 
 	root = &hists->entries;
-	node = rb_first(root);
+	node = rb_first_cached(root);
 	he = rb_entry(node, struct hist_entry, rb_node);
 	TEST_ASSERT_VAL("Invalid hist entry",
 			!strcmp(DSO(he), "perf") && !strcmp(SYM(he), "cmd_record") &&
@@ -460,7 +460,7 @@ static int test5(struct perf_evsel *evsel, struct machine *machine)
 	int err;
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *node;
 
 	field_order = "cpu,pid,comm,dso,sym";
@@ -497,7 +497,7 @@ static int test5(struct perf_evsel *evsel, struct machine *machine)
 	}
 
 	root = &hists->entries;
-	node = rb_first(root);
+	node = rb_first_cached(root);
 	he = rb_entry(node, struct hist_entry, rb_node);
 
 	TEST_ASSERT_VAL("Invalid hist entry",
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index b32f505a7608..85790a4d1842 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -49,7 +49,7 @@ static int hist_browser__get_folding(struct hist_browser *browser)
 	struct hists *hists = browser->hists;
 	int unfolded_rows = 0;
 
-	for (nd = rb_first(&hists->entries);
+	for (nd = rb_first_cached(&hists->entries);
 	     (nd = hists__filter_entries(nd, browser->min_pcnt)) != NULL;
 	     nd = rb_hierarchy_next(nd)) {
 		struct hist_entry *he =
@@ -267,7 +267,7 @@ static int hierarchy_count_rows(struct hist_browser *hb, struct hist_entry *he,
 	if (he->has_no_entry)
 		return 1;
 
-	node = rb_first(&he->hroot_out);
+	node = rb_first_cached(&he->hroot_out);
 	while (node) {
 		float percent;
 
@@ -372,7 +372,7 @@ static void hist_entry__init_have_children(struct hist_entry *he)
 		he->has_children = !RB_EMPTY_ROOT(&he->sorted_chain);
 		callchain__init_have_children(&he->sorted_chain);
 	} else {
-		he->has_children = !RB_EMPTY_ROOT(&he->hroot_out);
+		he->has_children = !RB_EMPTY_ROOT(&he->hroot_out.rb_root);
 	}
 
 	he->init_have_children = true;
@@ -508,7 +508,7 @@ static int hierarchy_set_folding(struct hist_browser *hb, struct hist_entry *he,
 	struct hist_entry *child;
 	int n = 0;
 
-	for (nd = rb_first(&he->hroot_out); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&he->hroot_out); nd; nd = rb_next(nd)) {
 		child = rb_entry(nd, struct hist_entry, rb_node);
 		percent = hist_entry__get_percent_limit(child);
 		if (!child->filtered && percent >= hb->min_pcnt)
@@ -566,7 +566,7 @@ __hist_browser__set_folding(struct hist_browser *browser, bool unfold)
 	struct rb_node *nd;
 	struct hist_entry *he;
 
-	nd = rb_first(&browser->hists->entries);
+	nd = rb_first_cached(&browser->hists->entries);
 	while (nd) {
 		he = rb_entry(nd, struct hist_entry, rb_node);
 
@@ -1738,7 +1738,7 @@ static void ui_browser__hists_init_top(struct ui_browser *browser)
 		struct hist_browser *hb;
 
 		hb = container_of(browser, struct hist_browser, b);
-		browser->top = rb_first(&hb->hists->entries);
+		browser->top = rb_first_cached(&hb->hists->entries);
 	}
 }
 
@@ -2649,7 +2649,7 @@ add_socket_opt(struct hist_browser *browser, struct popup_action *act,
 static void hist_browser__update_nr_entries(struct hist_browser *hb)
 {
 	u64 nr_entries = 0;
-	struct rb_node *nd = rb_first(&hb->hists->entries);
+	struct rb_node *nd = rb_first_cached(&hb->hists->entries);
 
 	if (hb->min_pcnt == 0 && !symbol_conf.report_hierarchy) {
 		hb->nr_non_filtered_entries = hb->hists->nr_non_filtered_entries;
@@ -2669,7 +2669,7 @@ static void hist_browser__update_percent_limit(struct hist_browser *hb,
 					       double percent)
 {
 	struct hist_entry *he;
-	struct rb_node *nd = rb_first(&hb->hists->entries);
+	struct rb_node *nd = rb_first_cached(&hb->hists->entries);
 	u64 total = hists__total_period(hb->hists);
 	u64 min_callchain_hits = total * (percent / 100);
 
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 4ab663ec3e5e..74414ccd8c84 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -353,7 +353,7 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
 
 	g_object_unref(GTK_TREE_MODEL(store));
 
-	for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&hists->entries); nd; nd = rb_next(nd)) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 		GtkTreeIter iter;
 		u64 total = hists__total_period(h->hists);
@@ -401,7 +401,7 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
 }
 
 static void perf_gtk__add_hierarchy_entries(struct hists *hists,
-					    struct rb_root *root,
+					    struct rb_root_cached *root,
 					    GtkTreeStore *store,
 					    GtkTreeIter *parent,
 					    struct perf_hpp *hpp,
@@ -415,7 +415,7 @@ static void perf_gtk__add_hierarchy_entries(struct hists *hists,
 	u64 total = hists__total_period(hists);
 	int size;
 
-	for (node = rb_first(root); node; node = rb_next(node)) {
+	for (node = rb_first_cached(root); node; node = rb_next(node)) {
 		GtkTreeIter iter;
 		float percent;
 		char *bf;
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 74c4ae1f0a05..2a9fa4dcbc2a 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -788,7 +788,8 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
 
 	indent = hists__overhead_width(hists) + 4;
 
-	for (nd = rb_first(&hists->entries); nd; nd = __rb_hierarchy_next(nd, HMD_FORCE_CHILD)) {
+	for (nd = rb_first_cached(&hists->entries); nd;
+	     nd = __rb_hierarchy_next(nd, HMD_FORCE_CHILD)) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 		float percent;
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 8aad8330e392..9e7a8e044a0a 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -209,7 +209,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
 
 void hists__output_recalc_col_len(struct hists *hists, int max_rows)
 {
-	struct rb_node *next = rb_first(&hists->entries);
+	struct rb_node *next = rb_first_cached(&hists->entries);
 	struct hist_entry *n;
 	int row = 0;
 
@@ -296,7 +296,7 @@ static bool hists__decay_entry(struct hists *hists, struct hist_entry *he)
 
 	if (!he->leaf) {
 		struct hist_entry *child;
-		struct rb_node *node = rb_first(&he->hroot_out);
+		struct rb_node *node = rb_first_cached(&he->hroot_out);
 		while (node) {
 			child = rb_entry(node, struct hist_entry, rb_node);
 			node = rb_next(node);
@@ -311,8 +311,8 @@ static bool hists__decay_entry(struct hists *hists, struct hist_entry *he)
 
 static void hists__delete_entry(struct hists *hists, struct hist_entry *he)
 {
-	struct rb_root *root_in;
-	struct rb_root *root_out;
+	struct rb_root_cached *root_in;
+	struct rb_root_cached *root_out;
 
 	if (he->parent_he) {
 		root_in  = &he->parent_he->hroot_in;
@@ -325,8 +325,8 @@ static void hists__delete_entry(struct hists *hists, struct hist_entry *he)
 		root_out = &hists->entries;
 	}
 
-	rb_erase(&he->rb_node_in, root_in);
-	rb_erase(&he->rb_node, root_out);
+	rb_erase_cached(&he->rb_node_in, root_in);
+	rb_erase_cached(&he->rb_node, root_out);
 
 	--hists->nr_entries;
 	if (!he->filtered)
@@ -337,7 +337,7 @@ static void hists__delete_entry(struct hists *hists, struct hist_entry *he)
 
 void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel)
 {
-	struct rb_node *next = rb_first(&hists->entries);
+	struct rb_node *next = rb_first_cached(&hists->entries);
 	struct hist_entry *n;
 
 	while (next) {
@@ -353,7 +353,7 @@ void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel)
 
 void hists__delete_entries(struct hists *hists)
 {
-	struct rb_node *next = rb_first(&hists->entries);
+	struct rb_node *next = rb_first_cached(&hists->entries);
 	struct hist_entry *n;
 
 	while (next) {
@@ -435,8 +435,8 @@ static int hist_entry__init(struct hist_entry *he,
 	}
 	INIT_LIST_HEAD(&he->pairs.node);
 	thread__get(he->thread);
-	he->hroot_in  = RB_ROOT;
-	he->hroot_out = RB_ROOT;
+	he->hroot_in  = RB_ROOT_CACHED;
+	he->hroot_out = RB_ROOT_CACHED;
 
 	if (!symbol_conf.report_hierarchy)
 		he->leaf = true;
@@ -513,8 +513,9 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
 	int64_t cmp;
 	u64 period = entry->stat.period;
 	u64 weight = entry->stat.weight;
+	bool leftmost = true;
 
-	p = &hists->entries_in->rb_node;
+	p = &hists->entries_in->rb_root.rb_node;
 
 	while (*p != NULL) {
 		parent = *p;
@@ -557,8 +558,10 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
 
 		if (cmp < 0)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 
 	he = hist_entry__new(entry, sample_self);
@@ -570,7 +573,7 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
 	hists->nr_entries++;
 
 	rb_link_node(&he->rb_node_in, parent, p);
-	rb_insert_color(&he->rb_node_in, hists->entries_in);
+	rb_insert_color_cached(&he->rb_node_in, hists->entries_in, leftmost);
 out:
 	if (sample_self)
 		he_stat__add_cpumode_period(&he->stat, al->cpumode, period);
@@ -1279,16 +1282,17 @@ static void hist_entry__apply_hierarchy_filters(struct hist_entry *he)
 }
 
 static struct hist_entry *hierarchy_insert_entry(struct hists *hists,
-						 struct rb_root *root,
+						 struct rb_root_cached *root,
 						 struct hist_entry *he,
 						 struct hist_entry *parent_he,
 						 struct perf_hpp_list *hpp_list)
 {
-	struct rb_node **p = &root->rb_node;
+	struct rb_node **p = &root->rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct hist_entry *iter, *new;
 	struct perf_hpp_fmt *fmt;
 	int64_t cmp;
+	bool leftmost = true;
 
 	while (*p != NULL) {
 		parent = *p;
@@ -1308,8 +1312,10 @@ static struct hist_entry *hierarchy_insert_entry(struct hists *hists,
 
 		if (cmp < 0)
 			p = &parent->rb_left;
-		else
+		else {
 			p = &parent->rb_right;
+			leftmost = false;
+		}
 	}
 
 	new = hist_entry__new(he, true);
@@ -1343,12 +1349,12 @@ static struct hist_entry *hierarchy_insert_entry(struct hists *hists,
 	}
 
 	rb_link_node(&new->rb_node_in, parent, p);
-	rb_insert_color(&new->rb_node_in, root);
+	rb_insert_color_cached(&new->rb_node_in, root, leftmost);
 	return new;
 }
 
 static int hists__hierarchy_insert_entry(struct hists *hists,
-					 struct rb_root *root,
+					 struct rb_root_cached *root,
 					 struct hist_entry *he)
 {
 	struct perf_hpp_list_node *node;
@@ -1395,13 +1401,14 @@ static int hists__hierarchy_insert_entry(struct hists *hists,
 }
 
 static int hists__collapse_insert_entry(struct hists *hists,
-					struct rb_root *root,
+					struct rb_root_cached *root,
 					struct hist_entry *he)
 {
-	struct rb_node **p = &root->rb_node;
+	struct rb_node **p = &root->rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct hist_entry *iter;
 	int64_t cmp;
+	bool leftmost = true;
 
 	if (symbol_conf.report_hierarchy)
 		return hists__hierarchy_insert_entry(hists, root, he);
@@ -1432,19 +1439,21 @@ static int hists__collapse_insert_entry(struct hists *hists,
 
 		if (cmp < 0)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 	hists->nr_entries++;
 
 	rb_link_node(&he->rb_node_in, parent, p);
-	rb_insert_color(&he->rb_node_in, root);
+	rb_insert_color_cached(&he->rb_node_in, root, leftmost);
 	return 1;
 }
 
-struct rb_root *hists__get_rotate_entries_in(struct hists *hists)
+struct rb_root_cached *hists__get_rotate_entries_in(struct hists *hists)
 {
-	struct rb_root *root;
+	struct rb_root_cached *root;
 
 	pthread_mutex_lock(&hists->lock);
 
@@ -1467,7 +1476,7 @@ static void hists__apply_filters(struct hists *hists, struct hist_entry *he)
 
 int hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
 {
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *next;
 	struct hist_entry *n;
 	int ret;
@@ -1479,7 +1488,7 @@ int hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
 
 	root = hists__get_rotate_entries_in(hists);
 
-	next = rb_first(root);
+	next = rb_first_cached(root);
 
 	while (next) {
 		if (session_done())
@@ -1487,7 +1496,7 @@ int hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
 		n = rb_entry(next, struct hist_entry, rb_node_in);
 		next = rb_next(&n->rb_node_in);
 
-		rb_erase(&n->rb_node_in, root);
+		rb_erase_cached(&n->rb_node_in, root);
 		ret = hists__collapse_insert_entry(hists, &hists->entries_collapsed, n);
 		if (ret < 0)
 			return -1;
@@ -1558,7 +1567,7 @@ static void hierarchy_recalc_total_periods(struct hists *hists)
 	struct rb_node *node;
 	struct hist_entry *he;
 
-	node = rb_first(&hists->entries);
+	node = rb_first_cached(&hists->entries);
 
 	hists->stats.total_period = 0;
 	hists->stats.total_non_filtered_period = 0;
@@ -1578,13 +1587,14 @@ static void hierarchy_recalc_total_periods(struct hists *hists)
 	}
 }
 
-static void hierarchy_insert_output_entry(struct rb_root *root,
+static void hierarchy_insert_output_entry(struct rb_root_cached *root,
 					  struct hist_entry *he)
 {
-	struct rb_node **p = &root->rb_node;
+	struct rb_node **p = &root->rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct hist_entry *iter;
 	struct perf_hpp_fmt *fmt;
+	bool leftmost = true;
 
 	while (*p != NULL) {
 		parent = *p;
@@ -1592,12 +1602,14 @@ static void hierarchy_insert_output_entry(struct rb_root *root,
 
 		if (hist_entry__sort(he, iter) > 0)
 			p = &parent->rb_left;
-		else
+		else {
 			p = &parent->rb_right;
+			leftmost = false;
+		}
 	}
 
 	rb_link_node(&he->rb_node, parent, p);
-	rb_insert_color(&he->rb_node, root);
+	rb_insert_color_cached(&he->rb_node, root, leftmost);
 
 	/* update column width of dynamic entry */
 	perf_hpp_list__for_each_sort_list(he->hpp_list, fmt) {
@@ -1608,16 +1620,16 @@ static void hierarchy_insert_output_entry(struct rb_root *root,
 
 static void hists__hierarchy_output_resort(struct hists *hists,
 					   struct ui_progress *prog,
-					   struct rb_root *root_in,
-					   struct rb_root *root_out,
+					   struct rb_root_cached *root_in,
+					   struct rb_root_cached *root_out,
 					   u64 min_callchain_hits,
 					   bool use_callchain)
 {
 	struct rb_node *node;
 	struct hist_entry *he;
 
-	*root_out = RB_ROOT;
-	node = rb_first(root_in);
+	*root_out = RB_ROOT_CACHED;
+	node = rb_first_cached(root_in);
 
 	while (node) {
 		he = rb_entry(node, struct hist_entry, rb_node_in);
@@ -1660,15 +1672,16 @@ static void hists__hierarchy_output_resort(struct hists *hists,
 	}
 }
 
-static void __hists__insert_output_entry(struct rb_root *entries,
+static void __hists__insert_output_entry(struct rb_root_cached *entries,
 					 struct hist_entry *he,
 					 u64 min_callchain_hits,
 					 bool use_callchain)
 {
-	struct rb_node **p = &entries->rb_node;
+	struct rb_node **p = &entries->rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct hist_entry *iter;
 	struct perf_hpp_fmt *fmt;
+	bool leftmost = true;
 
 	if (use_callchain) {
 		if (callchain_param.mode == CHAIN_GRAPH_REL) {
@@ -1689,12 +1702,14 @@ static void __hists__insert_output_entry(struct rb_root *entries,
 
 		if (hist_entry__sort(he, iter) > 0)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 
 	rb_link_node(&he->rb_node, parent, p);
-	rb_insert_color(&he->rb_node, entries);
+	rb_insert_color_cached(&he->rb_node, entries, leftmost);
 
 	perf_hpp_list__for_each_sort_list(&perf_hpp_list, fmt) {
 		if (perf_hpp__is_dynamic_entry(fmt) &&
@@ -1706,7 +1721,7 @@ static void __hists__insert_output_entry(struct rb_root *entries,
 static void output_resort(struct hists *hists, struct ui_progress *prog,
 			  bool use_callchain, hists__resort_cb_t cb)
 {
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *next;
 	struct hist_entry *n;
 	u64 callchain_total;
@@ -1736,8 +1751,8 @@ static void output_resort(struct hists *hists, struct ui_progress *prog,
 	else
 		root = hists->entries_in;
 
-	next = rb_first(root);
-	hists->entries = RB_ROOT;
+	next = rb_first_cached(root);
+	hists->entries = RB_ROOT_CACHED;
 
 	while (next) {
 		n = rb_entry(next, struct hist_entry, rb_node_in);
@@ -1798,7 +1813,7 @@ struct rb_node *rb_hierarchy_last(struct rb_node *node)
 	struct hist_entry *he = rb_entry(node, struct hist_entry, rb_node);
 
 	while (can_goto_child(he, HMD_NORMAL)) {
-		node = rb_last(&he->hroot_out);
+		node = rb_last(&he->hroot_out.rb_root);
 		he = rb_entry(node, struct hist_entry, rb_node);
 	}
 	return node;
@@ -1809,7 +1824,7 @@ struct rb_node *__rb_hierarchy_next(struct rb_node *node, enum hierarchy_move_di
 	struct hist_entry *he = rb_entry(node, struct hist_entry, rb_node);
 
 	if (can_goto_child(he, hmd))
-		node = rb_first(&he->hroot_out);
+		node = rb_first_cached(&he->hroot_out);
 	else
 		node = rb_next(node);
 
@@ -1847,7 +1862,7 @@ bool hist_entry__has_hierarchy_children(struct hist_entry *he, float limit)
 	if (he->leaf)
 		return false;
 
-	node = rb_first(&he->hroot_out);
+	node = rb_first_cached(&he->hroot_out);
 	child = rb_entry(node, struct hist_entry, rb_node);
 
 	while (node && child->filtered) {
@@ -1965,7 +1980,7 @@ static void hists__filter_by_type(struct hists *hists, int type, filter_fn_t fil
 	hists__reset_filter_stats(hists);
 	hists__reset_col_len(hists);
 
-	for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(&hists->entries); nd; nd = rb_next(nd)) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 
 		if (filter(hists, h))
@@ -1975,13 +1990,15 @@ static void hists__filter_by_type(struct hists *hists, int type, filter_fn_t fil
 	}
 }
 
-static void resort_filtered_entry(struct rb_root *root, struct hist_entry *he)
+static void resort_filtered_entry(struct rb_root_cached *root,
+				  struct hist_entry *he)
 {
-	struct rb_node **p = &root->rb_node;
+	struct rb_node **p = &root->rb_root.rb_node;
 	struct rb_node *parent = NULL;
 	struct hist_entry *iter;
-	struct rb_root new_root = RB_ROOT;
+	struct rb_root_cached new_root = RB_ROOT_CACHED;
 	struct rb_node *nd;
+	bool leftmost = true;
 
 	while (*p != NULL) {
 		parent = *p;
@@ -1989,22 +2006,24 @@ static void resort_filtered_entry(struct rb_root *root, struct hist_entry *he)
 
 		if (hist_entry__sort(he, iter) > 0)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 
 	rb_link_node(&he->rb_node, parent, p);
-	rb_insert_color(&he->rb_node, root);
+	rb_insert_color_cached(&he->rb_node, root, leftmost);
 
 	if (he->leaf || he->filtered)
 		return;
 
-	nd = rb_first(&he->hroot_out);
+	nd = rb_first_cached(&he->hroot_out);
 	while (nd) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 
 		nd = rb_next(nd);
-		rb_erase(&h->rb_node, &he->hroot_out);
+		rb_erase_cached(&h->rb_node, &he->hroot_out);
 
 		resort_filtered_entry(&new_root, h);
 	}
@@ -2015,14 +2034,14 @@ static void resort_filtered_entry(struct rb_root *root, struct hist_entry *he)
 static void hists__filter_hierarchy(struct hists *hists, int type, const void *arg)
 {
 	struct rb_node *nd;
-	struct rb_root new_root = RB_ROOT;
+	struct rb_root_cached new_root = RB_ROOT_CACHED;
 
 	hists->stats.nr_non_filtered_samples = 0;
 
 	hists__reset_filter_stats(hists);
 	hists__reset_col_len(hists);
 
-	nd = rb_first(&hists->entries);
+	nd = rb_first_cached(&hists->entries);
 	while (nd) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 		int ret;
@@ -2066,12 +2085,12 @@ static void hists__filter_hierarchy(struct hists *hists, int type, const void *a
 	 * resort output after applying a new filter since filter in a lower
 	 * hierarchy can change periods in a upper hierarchy.
 	 */
-	nd = rb_first(&hists->entries);
+	nd = rb_first_cached(&hists->entries);
 	while (nd) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 
 		nd = rb_next(nd);
-		rb_erase(&h->rb_node, &hists->entries);
+		rb_erase_cached(&h->rb_node, &hists->entries);
 
 		resort_filtered_entry(&new_root, h);
 	}
@@ -2140,18 +2159,19 @@ void hists__inc_nr_samples(struct hists *hists, bool filtered)
 static struct hist_entry *hists__add_dummy_entry(struct hists *hists,
 						 struct hist_entry *pair)
 {
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node **p;
 	struct rb_node *parent = NULL;
 	struct hist_entry *he;
 	int64_t cmp;
+	bool leftmost = true;
 
 	if (hists__has(hists, need_collapse))
 		root = &hists->entries_collapsed;
 	else
 		root = hists->entries_in;
 
-	p = &root->rb_node;
+	p = &root->rb_root.rb_node;
 
 	while (*p != NULL) {
 		parent = *p;
@@ -2164,8 +2184,10 @@ static struct hist_entry *hists__add_dummy_entry(struct hists *hists,
 
 		if (cmp < 0)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			leftmost = false;
+		}
 	}
 
 	he = hist_entry__new(pair, true);
@@ -2175,7 +2197,7 @@ static struct hist_entry *hists__add_dummy_entry(struct hists *hists,
 		if (symbol_conf.cumulate_callchain)
 			memset(he->stat_acc, 0, sizeof(he->stat));
 		rb_link_node(&he->rb_node_in, parent, p);
-		rb_insert_color(&he->rb_node_in, root);
+		rb_insert_color_cached(&he->rb_node_in, root, leftmost);
 		hists__inc_stats(hists, he);
 		he->dummy = true;
 	}
@@ -2184,15 +2206,16 @@ static struct hist_entry *hists__add_dummy_entry(struct hists *hists,
 }
 
 static struct hist_entry *add_dummy_hierarchy_entry(struct hists *hists,
-						    struct rb_root *root,
+						    struct rb_root_cached *root,
 						    struct hist_entry *pair)
 {
 	struct rb_node **p;
 	struct rb_node *parent = NULL;
 	struct hist_entry *he;
 	struct perf_hpp_fmt *fmt;
+	bool leftmost = true;
 
-	p = &root->rb_node;
+	p = &root->rb_root.rb_node;
 	while (*p != NULL) {
 		int64_t cmp = 0;
 
@@ -2209,14 +2232,16 @@ static struct hist_entry *add_dummy_hierarchy_entry(struct hists *hists,
 
 		if (cmp < 0)
 			p = &parent->rb_left;
-		else
+		else {
 			p = &parent->rb_right;
+			leftmost = false;
+		}
 	}
 
 	he = hist_entry__new(pair, true);
 	if (he) {
 		rb_link_node(&he->rb_node_in, parent, p);
-		rb_insert_color(&he->rb_node_in, root);
+		rb_insert_color_cached(&he->rb_node_in, root, leftmost);
 
 		he->dummy = true;
 		he->hists = hists;
@@ -2233,9 +2258,9 @@ static struct hist_entry *hists__find_entry(struct hists *hists,
 	struct rb_node *n;
 
 	if (hists__has(hists, need_collapse))
-		n = hists->entries_collapsed.rb_node;
+		n = hists->entries_collapsed.rb_root.rb_node;
 	else
-		n = hists->entries_in->rb_node;
+		n = hists->entries_in->rb_root.rb_node;
 
 	while (n) {
 		struct hist_entry *iter = rb_entry(n, struct hist_entry, rb_node_in);
@@ -2252,10 +2277,10 @@ static struct hist_entry *hists__find_entry(struct hists *hists,
 	return NULL;
 }
 
-static struct hist_entry *hists__find_hierarchy_entry(struct rb_root *root,
+static struct hist_entry *hists__find_hierarchy_entry(struct rb_root_cached *root,
 						      struct hist_entry *he)
 {
-	struct rb_node *n = root->rb_node;
+	struct rb_node *n = root->rb_root.rb_node;
 
 	while (n) {
 		struct hist_entry *iter;
@@ -2280,13 +2305,13 @@ static struct hist_entry *hists__find_hierarchy_entry(struct rb_root *root,
 	return NULL;
 }
 
-static void hists__match_hierarchy(struct rb_root *leader_root,
-				   struct rb_root *other_root)
+static void hists__match_hierarchy(struct rb_root_cached *leader_root,
+				   struct rb_root_cached *other_root)
 {
 	struct rb_node *nd;
 	struct hist_entry *pos, *pair;
 
-	for (nd = rb_first(leader_root); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(leader_root); nd; nd = rb_next(nd)) {
 		pos  = rb_entry(nd, struct hist_entry, rb_node_in);
 		pair = hists__find_hierarchy_entry(other_root, pos);
 
@@ -2302,7 +2327,7 @@ static void hists__match_hierarchy(struct rb_root *leader_root,
  */
 void hists__match(struct hists *leader, struct hists *other)
 {
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *nd;
 	struct hist_entry *pos, *pair;
 
@@ -2317,7 +2342,7 @@ void hists__match(struct hists *leader, struct hists *other)
 	else
 		root = leader->entries_in;
 
-	for (nd = rb_first(root); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(root); nd; nd = rb_next(nd)) {
 		pos  = rb_entry(nd, struct hist_entry, rb_node_in);
 		pair = hists__find_entry(other, pos);
 
@@ -2328,13 +2353,13 @@ void hists__match(struct hists *leader, struct hists *other)
 
 static int hists__link_hierarchy(struct hists *leader_hists,
 				 struct hist_entry *parent,
-				 struct rb_root *leader_root,
-				 struct rb_root *other_root)
+				 struct rb_root_cached *leader_root,
+				 struct rb_root_cached *other_root)
 {
 	struct rb_node *nd;
 	struct hist_entry *pos, *leader;
 
-	for (nd = rb_first(other_root); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(other_root); nd; nd = rb_next(nd)) {
 		pos = rb_entry(nd, struct hist_entry, rb_node_in);
 
 		if (hist_entry__has_pairs(pos)) {
@@ -2377,7 +2402,7 @@ static int hists__link_hierarchy(struct hists *leader_hists,
  */
 int hists__link(struct hists *leader, struct hists *other)
 {
-	struct rb_root *root;
+	struct rb_root_cached *root;
 	struct rb_node *nd;
 	struct hist_entry *pos, *pair;
 
@@ -2393,7 +2418,7 @@ int hists__link(struct hists *leader, struct hists *other)
 	else
 		root = other->entries_in;
 
-	for (nd = rb_first(root); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(root); nd; nd = rb_next(nd)) {
 		pos = rb_entry(nd, struct hist_entry, rb_node_in);
 
 		if (!hist_entry__has_pairs(pos)) {
@@ -2566,10 +2591,10 @@ int perf_hist_config(const char *var, const char *value)
 int __hists__init(struct hists *hists, struct perf_hpp_list *hpp_list)
 {
 	memset(hists, 0, sizeof(*hists));
-	hists->entries_in_array[0] = hists->entries_in_array[1] = RB_ROOT;
+	hists->entries_in_array[0] = hists->entries_in_array[1] = RB_ROOT_CACHED;
 	hists->entries_in = &hists->entries_in_array[0];
-	hists->entries_collapsed = RB_ROOT;
-	hists->entries = RB_ROOT;
+	hists->entries_collapsed = RB_ROOT_CACHED;
+	hists->entries = RB_ROOT_CACHED;
 	pthread_mutex_init(&hists->lock, NULL);
 	hists->socket_filter = -1;
 	hists->hpp_list = hpp_list;
@@ -2577,14 +2602,14 @@ int __hists__init(struct hists *hists, struct perf_hpp_list *hpp_list)
 	return 0;
 }
 
-static void hists__delete_remaining_entries(struct rb_root *root)
+static void hists__delete_remaining_entries(struct rb_root_cached *root)
 {
 	struct rb_node *node;
 	struct hist_entry *he;
 
-	while (!RB_EMPTY_ROOT(root)) {
-		node = rb_first(root);
-		rb_erase(node, root);
+	while (!RB_EMPTY_ROOT(&root->rb_root)) {
+		node = rb_first_cached(root);
+		rb_erase_cached(node, root);
 
 		he = rb_entry(node, struct hist_entry, rb_node_in);
 		hist_entry__delete(he);
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 664b5eda8d51..08267af7439c 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -70,10 +70,10 @@ struct thread;
 struct dso;
 
 struct hists {
-	struct rb_root		entries_in_array[2];
-	struct rb_root		*entries_in;
-	struct rb_root		entries;
-	struct rb_root		entries_collapsed;
+	struct rb_root_cached	entries_in_array[2];
+	struct rb_root_cached	*entries_in;
+	struct rb_root_cached	entries;
+	struct rb_root_cached	entries_collapsed;
 	u64			nr_entries;
 	u64			nr_non_filtered_entries;
 	u64			callchain_period;
@@ -230,7 +230,7 @@ static __pure inline bool hists__has_callchains(struct hists *hists)
 int hists__init(void);
 int __hists__init(struct hists *hists, struct perf_hpp_list *hpp_list);
 
-struct rb_root *hists__get_rotate_entries_in(struct hists *hists);
+struct rb_root_cached *hists__get_rotate_entries_in(struct hists *hists);
 
 struct perf_hpp {
 	char *buf;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 130fe37fe2df..dd6312876492 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -145,8 +145,8 @@ struct hist_entry {
 	union {
 		/* this is for hierarchical entry structure */
 		struct {
-			struct rb_root	hroot_in;
-			struct rb_root  hroot_out;
+			struct rb_root_cached	hroot_in;
+			struct rb_root_cached   hroot_out;
 		};				/* non-leaf entries */
 		struct rb_root	sorted_chain;	/* leaf entry has callchains */
 	};
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 16/29] perf sched: Use cached rbtrees
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 15/29] perf hist: " Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 17/29] perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Jiri Olsa, Namhyung Kim,
	Arnaldo Carvalho de Melo

From: Davidlohr Bueso <dave@stgolabs.net>

At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something heavily required for perf-sched.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-8-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 45 +++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 2e0f0c65964a..640558e9352e 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -213,7 +213,7 @@ struct perf_sched {
 	u64		 all_runtime;
 	u64		 all_count;
 	u64		 cpu_last_switched[MAX_CPUS];
-	struct rb_root	 atom_root, sorted_atom_root, merged_atom_root;
+	struct rb_root_cached atom_root, sorted_atom_root, merged_atom_root;
 	struct list_head sort_list, cmp_pid;
 	bool force;
 	bool skip_merge;
@@ -271,7 +271,7 @@ struct evsel_runtime {
 struct idle_thread_runtime {
 	struct thread_runtime	tr;
 	struct thread		*last_thread;
-	struct rb_root		sorted_root;
+	struct rb_root_cached	sorted_root;
 	struct callchain_root	callchain;
 	struct callchain_cursor	cursor;
 };
@@ -950,10 +950,10 @@ thread_lat_cmp(struct list_head *list, struct work_atoms *l, struct work_atoms *
 }
 
 static struct work_atoms *
-thread_atoms_search(struct rb_root *root, struct thread *thread,
+thread_atoms_search(struct rb_root_cached *root, struct thread *thread,
 			 struct list_head *sort_list)
 {
-	struct rb_node *node = root->rb_node;
+	struct rb_node *node = root->rb_root.rb_node;
 	struct work_atoms key = { .thread = thread };
 
 	while (node) {
@@ -976,10 +976,11 @@ thread_atoms_search(struct rb_root *root, struct thread *thread,
 }
 
 static void
-__thread_latency_insert(struct rb_root *root, struct work_atoms *data,
+__thread_latency_insert(struct rb_root_cached *root, struct work_atoms *data,
 			 struct list_head *sort_list)
 {
-	struct rb_node **new = &(root->rb_node), *parent = NULL;
+	struct rb_node **new = &(root->rb_root.rb_node), *parent = NULL;
+	bool leftmost = true;
 
 	while (*new) {
 		struct work_atoms *this;
@@ -992,12 +993,14 @@ __thread_latency_insert(struct rb_root *root, struct work_atoms *data,
 
 		if (cmp > 0)
 			new = &((*new)->rb_left);
-		else
+		else {
 			new = &((*new)->rb_right);
+			leftmost = false;
+		}
 	}
 
 	rb_link_node(&data->node, parent, new);
-	rb_insert_color(&data->node, root);
+	rb_insert_color_cached(&data->node, root, leftmost);
 }
 
 static int thread_atoms_insert(struct perf_sched *sched, struct thread *thread)
@@ -1447,15 +1450,15 @@ static int sort_dimension__add(const char *tok, struct list_head *list)
 static void perf_sched__sort_lat(struct perf_sched *sched)
 {
 	struct rb_node *node;
-	struct rb_root *root = &sched->atom_root;
+	struct rb_root_cached *root = &sched->atom_root;
 again:
 	for (;;) {
 		struct work_atoms *data;
-		node = rb_first(root);
+		node = rb_first_cached(root);
 		if (!node)
 			break;
 
-		rb_erase(node, root);
+		rb_erase_cached(node, root);
 		data = rb_entry(node, struct work_atoms, node);
 		__thread_latency_insert(&sched->sorted_atom_root, data, &sched->sort_list);
 	}
@@ -2762,12 +2765,12 @@ static size_t callchain__fprintf_folded(FILE *fp, struct callchain_node *node)
 	return ret;
 }
 
-static size_t timehist_print_idlehist_callchain(struct rb_root *root)
+static size_t timehist_print_idlehist_callchain(struct rb_root_cached *root)
 {
 	size_t ret = 0;
 	FILE *fp = stdout;
 	struct callchain_node *chain;
-	struct rb_node *rb_node = rb_first(root);
+	struct rb_node *rb_node = rb_first_cached(root);
 
 	printf("  %16s  %8s  %s\n", "Idle time (msec)", "Count", "Callchains");
 	printf("  %.16s  %.8s  %.50s\n", graph_dotted_line, graph_dotted_line,
@@ -2868,7 +2871,7 @@ static void timehist_print_summary(struct perf_sched *sched,
 			if (itr == NULL)
 				continue;
 
-			callchain_param.sort(&itr->sorted_root, &itr->callchain,
+			callchain_param.sort(&itr->sorted_root.rb_root, &itr->callchain,
 					     0, &callchain_param);
 
 			printf("  CPU %2d:", i);
@@ -3074,11 +3077,12 @@ static void print_bad_events(struct perf_sched *sched)
 	}
 }
 
-static void __merge_work_atoms(struct rb_root *root, struct work_atoms *data)
+static void __merge_work_atoms(struct rb_root_cached *root, struct work_atoms *data)
 {
-	struct rb_node **new = &(root->rb_node), *parent = NULL;
+	struct rb_node **new = &(root->rb_root.rb_node), *parent = NULL;
 	struct work_atoms *this;
 	const char *comm = thread__comm_str(data->thread), *this_comm;
+	bool leftmost = true;
 
 	while (*new) {
 		int cmp;
@@ -3092,6 +3096,7 @@ static void __merge_work_atoms(struct rb_root *root, struct work_atoms *data)
 			new = &((*new)->rb_left);
 		} else if (cmp < 0) {
 			new = &((*new)->rb_right);
+			leftmost = false;
 		} else {
 			this->num_merged++;
 			this->total_runtime += data->total_runtime;
@@ -3109,7 +3114,7 @@ static void __merge_work_atoms(struct rb_root *root, struct work_atoms *data)
 
 	data->num_merged++;
 	rb_link_node(&data->node, parent, new);
-	rb_insert_color(&data->node, root);
+	rb_insert_color_cached(&data->node, root, leftmost);
 }
 
 static void perf_sched__merge_lat(struct perf_sched *sched)
@@ -3120,8 +3125,8 @@ static void perf_sched__merge_lat(struct perf_sched *sched)
 	if (sched->skip_merge)
 		return;
 
-	while ((node = rb_first(&sched->atom_root))) {
-		rb_erase(node, &sched->atom_root);
+	while ((node = rb_first_cached(&sched->atom_root))) {
+		rb_erase_cached(node, &sched->atom_root);
 		data = rb_entry(node, struct work_atoms, node);
 		__merge_work_atoms(&sched->merged_atom_root, data);
 	}
@@ -3143,7 +3148,7 @@ static int perf_sched__lat(struct perf_sched *sched)
 	printf("  Task                  |   Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at       |\n");
 	printf(" -----------------------------------------------------------------------------------------------------------------\n");
 
-	next = rb_first(&sched->sorted_atom_root);
+	next = rb_first_cached(&sched->sorted_atom_root);
 
 	while (next) {
 		struct work_atoms *work_list;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 17/29] perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 16/29] perf sched: " Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 18/29] perf script python: Add trace_context extension module to sys.modules Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Song Liu,
	Arnaldo Carvalho de Melo, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Peter Zijlstra, kernel-team

From: Song Liu <songliubraving@fb.com>

Added missing machine->id_hdr_size to event->header.size. Also fixed
size of PERF_RECORD_KSYMBOL by removing extra bytes for name.

Committer notes:

We need to malloc that extra machine->id_hdr_size at the start of
perf_event__synthesize_bpf_events() and also need to cast the event to
(void *) otherwise we segfault, fix it.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Fixes: 7b612e291a5a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs")
Link: http://lkml.kernel.org/r/20190122210218.358664-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/bpf-event.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 01e1dc1bb7fb..796ef793f4ce 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -7,6 +7,7 @@
 #include "bpf-event.h"
 #include "debug.h"
 #include "symbol.h"
+#include "machine.h"
 
 #define ptr_to_u64(ptr)    ((__u64)(unsigned long)(ptr))
 
@@ -149,7 +150,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
 		*ksymbol_event = (struct ksymbol_event){
 			.header = {
 				.type = PERF_RECORD_KSYMBOL,
-				.size = sizeof(struct ksymbol_event),
+				.size = offsetof(struct ksymbol_event, name),
 			},
 			.addr = prog_addrs[i],
 			.len = prog_lens[i],
@@ -178,6 +179,9 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
 
 		ksymbol_event->header.size += PERF_ALIGN(name_len + 1,
 							 sizeof(u64));
+
+		memset((void *)event + event->header.size, 0, machine->id_hdr_size);
+		event->header.size += machine->id_hdr_size;
 		err = perf_tool__process_synth_event(tool, event,
 						     machine, process);
 	}
@@ -194,6 +198,8 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
 			.id = info.id,
 		};
 		memcpy(bpf_event->tag, prog_tags[i], BPF_TAG_SIZE);
+		memset((void *)event + event->header.size, 0, machine->id_hdr_size);
+		event->header.size += machine->id_hdr_size;
 		err = perf_tool__process_synth_event(tool, event,
 						     machine, process);
 	}
@@ -217,7 +223,7 @@ int perf_event__synthesize_bpf_events(struct perf_tool *tool,
 	int err;
 	int fd;
 
-	event = malloc(sizeof(event->bpf_event) + KSYM_NAME_LEN);
+	event = malloc(sizeof(event->bpf_event) + KSYM_NAME_LEN + machine->id_hdr_size);
 	if (!event)
 		return -1;
 	while (true) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 18/29] perf script python: Add trace_context extension module to sys.modules
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (16 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 17/29] perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 19/29] perf script python: Use PyBytes for attr in trace-event-python Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Tony Jones,
	Jiri Olsa, Arnaldo Carvalho de Melo, Jaroslav Škarvada,
	Jonathan Corbet, Ravi Bangoria, Seeteena Thoufeek

From: Tony Jones <tonyj@suse.de>

In Python3, the result of PyModule_Create (called from
scripts/python/Perf-Trace-Util/Context.c) is not automatically added to
sys.modules.  See: https://bugs.python.org/issue4592

Below is the observed behavior without the fix:

  # ldd /usr/bin/perf | grep -i python
	libpython3.6m.so.1.0 => /usr/lib64/libpython3.6m.so.1.0 (0x00007f8e1dfb2000)

  # perf record /bin/false
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.015 MB perf.data (17 samples) ]

  # perf script -g python | cat
  generated Python script: perf-script.py

  # perf script -s ./perf-script.py
  Traceback (most recent call last):
    File "./perf-script.py", line 18, in <module>
      from perf_trace_context import *
  ModuleNotFoundError: No module named 'perf_trace_context'
  Error running python script ./perf-script.py
  #

Committer notes:

To build with python3 use:

  $ make -C tools/perf PYTHON=python3

Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jaroslav Škarvada <jskarvad@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Fixes: 66dfdff03d19 ("perf tools: Add Python 3 support")
Link: http://lkml.kernel.org/r/20190124005229.16146-2-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/scripting-engines/trace-event-python.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 87ef16a1b17e..40300d8e80a7 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1494,6 +1494,7 @@ static void _free_command_line(wchar_t **command_line, int num)
 static int python_start_script(const char *script, int argc, const char **argv)
 {
 	struct tables *tables = &tables_global;
+	PyMODINIT_FUNC (*initfunc)(void);
 #if PY_MAJOR_VERSION < 3
 	const char **command_line;
 #else
@@ -1504,24 +1505,25 @@ static int python_start_script(const char *script, int argc, const char **argv)
 	FILE *fp;
 
 #if PY_MAJOR_VERSION < 3
+	initfunc = initperf_trace_context;
 	command_line = malloc((argc + 1) * sizeof(const char *));
 	command_line[0] = script;
 	for (i = 1; i < argc + 1; i++)
 		command_line[i] = argv[i - 1];
 #else
+	initfunc = PyInit_perf_trace_context;
 	command_line = malloc((argc + 1) * sizeof(wchar_t *));
 	command_line[0] = Py_DecodeLocale(script, NULL);
 	for (i = 1; i < argc + 1; i++)
 		command_line[i] = Py_DecodeLocale(argv[i - 1], NULL);
 #endif
 
+	PyImport_AppendInittab("perf_trace_context", initfunc);
 	Py_Initialize();
 
 #if PY_MAJOR_VERSION < 3
-	initperf_trace_context();
 	PySys_SetArgv(argc + 1, (char **)command_line);
 #else
-	PyInit_perf_trace_context();
 	PySys_SetArgv(argc + 1, command_line);
 #endif
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 19/29] perf script python: Use PyBytes for attr in trace-event-python
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (17 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 18/29] perf script python: Add trace_context extension module to sys.modules Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 20/29] perf script python: Remove explicit shebang from setup.py Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Tony Jones,
	Jiri Olsa, Arnaldo Carvalho de Melo, Jaroslav Škarvada,
	Jonathan Corbet, Ravi Bangoria, Seeteena Thoufeek

From: Tony Jones <tonyj@suse.de>

With Python3.  PyUnicode_FromStringAndSize is unsafe to call on attr and will
return NULL.  Use _PyBytes_FromStringAndSize (as with raw_buf).

Below is the observed behavior without the fix.  Note it is first necessary
to apply the prior fix (Add trace_context extension module to sys,modules):

  # ldd /usr/bin/perf | grep -i python
          libpython3.6m.so.1.0 => /usr/lib64/libpython3.6m.so.1.0 (0x00007f8e1dfb2000)

  # perf record -e raw_syscalls:sys_enter /bin/false
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.018 MB perf.data (21 samples) ]

  # perf script -g python | cat
  generated Python script: perf-script.py

  # perf script -s ./perf-script.py
  in trace_begin
  Segmentation fault (core dumped)

Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jaroslav Škarvada <jskarvad@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Fixes: 66dfdff03d19 ("perf tools: Add Python 3 support")
Link: http://lkml.kernel.org/r/20190124005229.16146-3-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/scripting-engines/trace-event-python.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 40300d8e80a7..777c97b13905 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -733,8 +733,7 @@ static PyObject *get_perf_sample_dict(struct perf_sample *sample,
 		Py_FatalError("couldn't create Python dictionary");
 
 	pydict_set_item_string_decref(dict, "ev_name", _PyUnicode_FromString(perf_evsel__name(evsel)));
-	pydict_set_item_string_decref(dict, "attr", _PyUnicode_FromStringAndSize(
-			(const char *)&evsel->attr, sizeof(evsel->attr)));
+	pydict_set_item_string_decref(dict, "attr", _PyBytes_FromStringAndSize((const char *)&evsel->attr, sizeof(evsel->attr)));
 
 	pydict_set_item_string_decref(dict_sample, "pid",
 			_PyLong_FromLong(sample->pid));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 20/29] perf script python: Remove explicit shebang from setup.py
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (18 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 19/29] perf script python: Use PyBytes for attr in trace-event-python Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 21/29] perf script python: Remove explicit shebang from tests/attr.c Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Tony Jones,
	Jiri Olsa, Jonathan Corbet, Ravi Bangoria, Seeteena Thoufeek,
	Arnaldo Carvalho de Melo

From: Tony Jones <tonyj@suse.de>

Makefile.perf invokes setup.py via an explicit invocation of python
(PYTHON_WORD) so there is therefore no need for an explicit shebang.

Also most distros follow pep-0394 which recommends that /usr/bin/python
refer only to v2 and so may not exist on the system (if PYTHON=python3).

Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20190124005229.16146-4-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/setup.py | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/perf/util/setup.py b/tools/perf/util/setup.py
index 64d1f36dee99..d3ffc18424b5 100644
--- a/tools/perf/util/setup.py
+++ b/tools/perf/util/setup.py
@@ -1,5 +1,3 @@
-#!/usr/bin/python
-
 from os import getenv
 from subprocess import Popen, PIPE
 from re import sub
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 21/29] perf script python: Remove explicit shebang from tests/attr.c
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (19 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 20/29] perf script python: Remove explicit shebang from setup.py Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 22/29] perf script python: Remove explicit shebang from Python scripts Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Tony Jones,
	Jiri Olsa, Jonathan Corbet, Ravi Bangoria, Seeteena Thoufeek,
	Arnaldo Carvalho de Melo

From: Tony Jones <tonyj@suse.de>

tests/attr.c invokes attr.py via an explicit invocation of Python
($PYTHON) so there is therefore no need for an explicit shebang.

Also most distros follow pep-0394 which recommends that /usr/bin/python
refer only to v2 and so may not exist on the system (if PYTHON=python3).

Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20190124005229.16146-5-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/attr.py | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/perf/tests/attr.py b/tools/perf/tests/attr.py
index 44090a9a19f3..3e07eee33b10 100644
--- a/tools/perf/tests/attr.py
+++ b/tools/perf/tests/attr.py
@@ -1,4 +1,3 @@
-#! /usr/bin/python
 # SPDX-License-Identifier: GPL-2.0
 
 import os
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 22/29] perf script python: Remove explicit shebang from Python scripts
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (20 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 21/29] perf script python: Remove explicit shebang from tests/attr.c Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 23/29] perf script python: Add Python3 support to tests/attr.py Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Tony Jones,
	Jiri Olsa, Adrian Hunter, Frederic Weisbecker, Jonathan Corbet,
	Ravi Bangoria, Seeteena Thoufeek, Arnaldo Carvalho de Melo

From: Tony Jones <tonyj@suse.de>

The scripts in scripts/python are intended to be run from 'perf script'
and the Python version used is dictated by how perf was built (PYTHON=).

Also most distros follow pep-0394 which recommends that /usr/bin/python
refer to Python2 and so may not exist on the system (if PYTHON=python3).

- Remove the explicit shebang
- Install the scripts as mode 644

Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20190124005229.16146-6-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.perf                         | 4 ++--
 tools/perf/scripts/python/exported-sql-viewer.py | 1 -
 tools/perf/scripts/python/sched-migration.py     | 2 --
 tools/perf/scripts/python/stat-cpi.py            | 1 -
 4 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 0ee6795d82cc..09df1c8a4ec9 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -863,8 +863,8 @@ ifndef NO_LIBPYTHON
 	$(call QUIET_INSTALL, python-scripts) \
 		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python/Perf-Trace-Util/lib/Perf/Trace'; \
 		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python/bin'; \
-		$(INSTALL) scripts/python/Perf-Trace-Util/lib/Perf/Trace/* -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python/Perf-Trace-Util/lib/Perf/Trace'; \
-		$(INSTALL) scripts/python/*.py -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python'; \
+		$(INSTALL) scripts/python/Perf-Trace-Util/lib/Perf/Trace/* -m 644 -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python/Perf-Trace-Util/lib/Perf/Trace'; \
+		$(INSTALL) scripts/python/*.py -m 644 -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python'; \
 		$(INSTALL) scripts/python/bin/* -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python/bin'
 endif
 	$(call QUIET_INSTALL, perf_completion-script) \
diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index f278ce5ebab7..c3091401df91 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -1,4 +1,3 @@
-#!/usr/bin/python2
 # SPDX-License-Identifier: GPL-2.0
 # exported-sql-viewer.py: view data from sql database
 # Copyright (c) 2014-2018, Intel Corporation.
diff --git a/tools/perf/scripts/python/sched-migration.py b/tools/perf/scripts/python/sched-migration.py
index 3473e7f66081..3984bf51f3c5 100644
--- a/tools/perf/scripts/python/sched-migration.py
+++ b/tools/perf/scripts/python/sched-migration.py
@@ -1,5 +1,3 @@
-#!/usr/bin/python
-#
 # Cpu task migration overview toy
 #
 # Copyright (C) 2010 Frederic Weisbecker <fweisbec@gmail.com>
diff --git a/tools/perf/scripts/python/stat-cpi.py b/tools/perf/scripts/python/stat-cpi.py
index 8410672efb8b..a81ad8835a74 100644
--- a/tools/perf/scripts/python/stat-cpi.py
+++ b/tools/perf/scripts/python/stat-cpi.py
@@ -1,4 +1,3 @@
-#!/usr/bin/env python
 # SPDX-License-Identifier: GPL-2.0
 
 data    = {}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 23/29] perf script python: Add Python3 support to tests/attr.py
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (21 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 22/29] perf script python: Remove explicit shebang from Python scripts Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 24/29] perf bpf: Add bpf_map() helper Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Tony Jones,
	Jiri Olsa, Arnaldo Carvalho de Melo, Jonathan Corbet,
	Ravi Bangoria, Seeteena Thoufeek

From: Tony Jones <tonyj@suse.de>

Support both Python 2 and Python 3 in tests/attr.py

The use of "except as" syntax implies the minimum supported Python2 version is
now v2.6

Committer testing:

  $ make -C tools/perf PYTHON3=python install-bin

Before:

  # perf test attr
  16: Setup struct perf_event_attr                          : FAILED!
  48: Synthesize attr update                                : Ok
  [root@quaco ~]# perf test -v attr
  16: Setup struct perf_event_attr                          :
  --- start ---
  test child forked, pid 3121
    File "/home/acme/libexec/perf-core/tests/attr.py", line 324
      except Unsup, obj:
                ^
  SyntaxError: invalid syntax
  test child finished with -1
  ---- end ----
  Setup struct perf_event_attr: FAILED!
  48: Synthesize attr update                                :
  --- start ---
  test child forked, pid 3124
  test child finished with 0
  ---- end ----
  Synthesize attr update: Ok
  #

After:

   # perf test attr
  16: Setup struct perf_event_attr                          : Ok
  48: Synthesize attr update                                : Ok
  #

Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20190124005229.16146-7-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/attr.py | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/tools/perf/tests/attr.py b/tools/perf/tests/attr.py
index 3e07eee33b10..cb39ac46bc73 100644
--- a/tools/perf/tests/attr.py
+++ b/tools/perf/tests/attr.py
@@ -1,5 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 
+from __future__ import print_function
+
 import os
 import sys
 import glob
@@ -7,7 +9,11 @@ import optparse
 import tempfile
 import logging
 import shutil
-import ConfigParser
+
+try:
+    import configparser
+except ImportError:
+    import ConfigParser as configparser
 
 def data_equal(a, b):
     # Allow multiple values in assignment separated by '|'
@@ -99,20 +105,20 @@ class Event(dict):
     def equal(self, other):
         for t in Event.terms:
             log.debug("      [%s] %s %s" % (t, self[t], other[t]));
-            if not self.has_key(t) or not other.has_key(t):
+            if t not in self or t not in other:
                 return False
             if not data_equal(self[t], other[t]):
                 return False
         return True
 
     def optional(self):
-        if self.has_key('optional') and self['optional'] == '1':
+        if 'optional' in self and self['optional'] == '1':
             return True
         return False
 
     def diff(self, other):
         for t in Event.terms:
-            if not self.has_key(t) or not other.has_key(t):
+            if t not in self or t not in other:
                 continue
             if not data_equal(self[t], other[t]):
                 log.warning("expected %s=%s, got %s" % (t, self[t], other[t]))
@@ -133,7 +139,7 @@ class Event(dict):
 #   - expected values assignments
 class Test(object):
     def __init__(self, path, options):
-        parser = ConfigParser.SafeConfigParser()
+        parser = configparser.SafeConfigParser()
         parser.read(path)
 
         log.warning("running '%s'" % path)
@@ -192,7 +198,7 @@ class Test(object):
         return True
 
     def load_events(self, path, events):
-        parser_event = ConfigParser.SafeConfigParser()
+        parser_event = configparser.SafeConfigParser()
         parser_event.read(path)
 
         # The event record section header contains 'event' word,
@@ -206,7 +212,7 @@ class Test(object):
             # Read parent event if there's any
             if (':' in section):
                 base = section[section.index(':') + 1:]
-                parser_base = ConfigParser.SafeConfigParser()
+                parser_base = configparser.SafeConfigParser()
                 parser_base.read(self.test_dir + '/' + base)
                 base_items = parser_base.items('event')
 
@@ -321,9 +327,9 @@ def run_tests(options):
     for f in glob.glob(options.test_dir + '/' + options.test):
         try:
             Test(f, options).run()
-        except Unsup, obj:
+        except Unsup as obj:
             log.warning("unsupp  %s" % obj.getMsg())
-        except Notest, obj:
+        except Notest as obj:
             log.warning("skipped %s" % obj.getMsg())
 
 def setup_log(verbose):
@@ -362,7 +368,7 @@ def main():
     parser.add_option("-p", "--perf",
                       action="store", type="string", dest="perf")
     parser.add_option("-v", "--verbose",
-                      action="count", dest="verbose")
+                      default=0, action="count", dest="verbose")
 
     options, args = parser.parse_args()
     if args:
@@ -372,7 +378,7 @@ def main():
     setup_log(options.verbose)
 
     if not options.test_dir:
-        print 'FAILED no -d option specified'
+        print('FAILED no -d option specified')
         sys.exit(-1)
 
     if not options.test:
@@ -381,8 +387,8 @@ def main():
     try:
         run_tests(options)
 
-    except Fail, obj:
-        print "FAILED %s" % obj.getMsg();
+    except Fail as obj:
+        print("FAILED %s" % obj.getMsg())
         sys.exit(-1)
 
     sys.exit(0)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 24/29] perf bpf: Add bpf_map() helper
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (22 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 23/29] perf script python: Add Python3 support to tests/attr.py Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 25/29] perf bpf: Convert pid_map() to bpf_map() Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Daniel Borkmann,
	Jiri Olsa, Luis Cláudio Gonçalves, Martin KaFai Lau,
	Namhyung Kim, Wang Nan, Yonghong Song

From: Arnaldo Carvalho de Melo <acme@redhat.com>

To make the declaration of maps more compact, the following patches will
make use of it.

Standardizing on it will allow to add the BTF details, i.e.
BPF_ANNOTATE_KV_PAIR() (tools/testing/selftests/bpf/bpf_helpers.h)
transparently.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-h3q9rxxkbzetgnbro5rclqft@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/include/bpf/bpf.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/include/bpf/bpf.h b/tools/perf/include/bpf/bpf.h
index e667577207dc..0482028c1d11 100644
--- a/tools/perf/include/bpf/bpf.h
+++ b/tools/perf/include/bpf/bpf.h
@@ -18,6 +18,14 @@ struct bpf_map {
         unsigned int numa_node;
 };
 
+#define bpf_map(name, _type, type_key, type_val, _max_entries)	\
+struct bpf_map SEC("maps") name = {				\
+	.type	     = BPF_MAP_TYPE_##_type,			\
+	.key_size    = sizeof(type_key),			\
+	.value_size  = sizeof(type_val),			\
+	.max_entries = _max_entries,				\
+}
+
 /*
  * FIXME: this should receive .max_entries as a parameter, as careful
  *	  tuning of these limits is needed to avoid hitting limits that
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 25/29] perf bpf: Convert pid_map() to bpf_map()
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (23 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 24/29] perf bpf: Add bpf_map() helper Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 26/29] perf augmented_raw_syscalls: Use bpf_map() Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Daniel Borkmann,
	Jiri Olsa, Luis Cláudio Gonçalves, Martin KaFai Lau,
	Namhyung Kim, Wang Nan, Yonghong Song

From: Arnaldo Carvalho de Melo <acme@redhat.com>

First user, pid_t as the type, lets see how this goes with the BTF
routines.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-56eplvf86r69wt3p35nh805z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/include/bpf/bpf.h | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/tools/perf/include/bpf/bpf.h b/tools/perf/include/bpf/bpf.h
index 0482028c1d11..5df7ed9d9020 100644
--- a/tools/perf/include/bpf/bpf.h
+++ b/tools/perf/include/bpf/bpf.h
@@ -34,13 +34,7 @@ struct bpf_map SEC("maps") name = {				\
  *	  For the current need, 'perf trace --filter-pids', 64 should
  *	  be good enough, but this surely needs to be revisited.
  */
-#define pid_map(name, value_type)		\
-struct bpf_map SEC("maps") name = {		\
-	.type	     = BPF_MAP_TYPE_HASH,	\
-	.key_size    = sizeof(pid_t),		\
-	.value_size  = sizeof(value_type),	\
-	.max_entries = 64,			\
-}
+#define pid_map(name, value_type) bpf_map(name, HASH, pid_t, value_type, 64)
 
 static int (*bpf_map_update_elem)(struct bpf_map *map, void *key, void *value, u64 flags) = (void *)BPF_FUNC_map_update_elem;
 static void *(*bpf_map_lookup_elem)(struct bpf_map *map, void *key) = (void *)BPF_FUNC_map_lookup_elem;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 26/29] perf augmented_raw_syscalls: Use bpf_map()
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (24 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 25/29] perf bpf: Convert pid_map() to bpf_map() Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 27/29] perf trace: Fixup etcsnoop example Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Daniel Borkmann,
	Jiri Olsa, Luis Cláudio Gonçalves, Martin KaFai Lau,
	Namhyung Kim, Wang Nan, Yonghong Song

From: Arnaldo Carvalho de Melo <acme@redhat.com>

To make the code more compact and also paving the way to have the BTF
annotation to be done transparently.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-pjlf38sv3i1hbn5vzkr4y3ol@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/examples/bpf/augmented_raw_syscalls.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 9e9d4c66e53c..f9b2161e1ca4 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -18,23 +18,13 @@
 #include <pid_filter.h>
 
 /* bpf-output associated map */
-struct bpf_map SEC("maps") __augmented_syscalls__ = {
-	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
-	.key_size = sizeof(int),
-	.value_size = sizeof(u32),
-	.max_entries = __NR_CPUS__,
-};
+bpf_map(__augmented_syscalls__, PERF_EVENT_ARRAY, int, u32, __NR_CPUS__);
 
 struct syscall {
 	bool	enabled;
 };
 
-struct bpf_map SEC("maps") syscalls = {
-	.type	     = BPF_MAP_TYPE_ARRAY,
-	.key_size    = sizeof(int),
-	.value_size  = sizeof(struct syscall),
-	.max_entries = 512,
-};
+bpf_map(syscalls, ARRAY, int, struct syscall, 512);
 
 struct syscall_enter_args {
 	unsigned long long common_tp_fields;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 27/29] perf trace: Fixup etcsnoop example
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (25 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 26/29] perf augmented_raw_syscalls: Use bpf_map() Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 28/29] perf bpf examples: Convert etcsnoop to use bpf_map() Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa,
	Luis Cláudio Gonçalves, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Where we don't have "raw_syscalls:sys_enter", so we need to look for a
"*syscalls:sys_enter*" to initialize the offsets for the
__augmented_syscalls__ evsel, which is the case with etcsnoop, that was
segfaulting, fixed:

  # trace -e /home/acme/git/perf/tools/perf/examples/bpf/etcsnoop.c
     0.000 (         ): gnome-shell/2105 openat(dfd: CWD, filename: "/etc/localtime")                       ...
   631.834 (         ): cat/6521 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) ...
   632.637 (         ): bash/6521 openat(dfd: CWD, filename: "/etc/passwd")                          ...
  ^C#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: b9b6a2ea2baf ("perf trace: Do not hardcode the size of the tracepoint common_ fields")
Link: https://lkml.kernel.org/n/tip-0tjwcit8qitsmh4nyvf2b0jo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index ed4583128b9c..1447993e1ee3 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3854,7 +3854,8 @@ int cmd_trace(int argc, const char **argv)
 				goto init_augmented_syscall_tp;
 			}
 
-			if (strcmp(perf_evsel__name(evsel), "raw_syscalls:sys_enter") == 0) {
+			if (trace.syscalls.events.augmented->priv == NULL &&
+			    strstr(perf_evsel__name(evsel), "syscalls:sys_enter")) {
 				struct perf_evsel *augmented = trace.syscalls.events.augmented;
 				if (perf_evsel__init_augmented_syscall_tp(augmented, evsel) ||
 				    perf_evsel__init_augmented_syscall_tp_args(augmented))
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 28/29] perf bpf examples: Convert etcsnoop to use bpf_map()
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (26 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 27/29] perf trace: Fixup etcsnoop example Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-25 23:18 ` [PATCH 29/29] perf augmented_syscalls: Convert to bpf_map() Arnaldo Carvalho de Melo
  2019-01-26  9:52 ` [GIT PULL 00/29] perf/core improvements and fixes Ingo Molnar
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa,
	Luis Cláudio Gonçalves, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Making the code more compact, end result is the same:

  # trace -e /home/acme/git/perf/tools/perf/examples/bpf/etcsnoop.c
     0.000 (         ): sed/7385 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) ...
  2727.723 (         ): cat/7389 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) ...
  2728.543 (         ): cat/7389 openat(dfd: CWD, filename: "/etc/passwd")                          ...
  ^C

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-znhgz24p0daux2kay200ovc1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/examples/bpf/etcsnoop.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/perf/examples/bpf/etcsnoop.c b/tools/perf/examples/bpf/etcsnoop.c
index 550e69c2e8d1..e81b535346c0 100644
--- a/tools/perf/examples/bpf/etcsnoop.c
+++ b/tools/perf/examples/bpf/etcsnoop.c
@@ -21,12 +21,8 @@
 
 #include <stdio.h>
 
-struct bpf_map SEC("maps") __augmented_syscalls__ = {
-       .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
-       .key_size = sizeof(int),
-       .value_size = sizeof(u32),
-       .max_entries = __NR_CPUS__,
-};
+/* bpf-output associated map */
+bpf_map(__augmented_syscalls__, PERF_EVENT_ARRAY, int, u32, __NR_CPUS__);
 
 struct augmented_filename {
 	int	size;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 29/29] perf augmented_syscalls: Convert to bpf_map()
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (27 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 28/29] perf bpf examples: Convert etcsnoop to use bpf_map() Arnaldo Carvalho de Melo
@ 2019-01-25 23:18 ` Arnaldo Carvalho de Melo
  2019-01-26  9:52 ` [GIT PULL 00/29] perf/core improvements and fixes Ingo Molnar
  29 siblings, 0 replies; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-25 23:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa,
	Luis Cláudio Gonçalves, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

To make the code more compact, end result is the same:

  # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_syscalls.c sleep 1
     0.000 ( 0.008 ms): sleep/9663 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) = 3
     0.022 ( 0.005 ms): sleep/9663 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY|CLOEXEC) = 3
     0.226 ( 0.006 ms): sleep/9663 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY|CLOEXEC) = 3
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-23z08bgizqnbc3qdsyl7jyyg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/examples/bpf/augmented_syscalls.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/perf/examples/bpf/augmented_syscalls.c b/tools/perf/examples/bpf/augmented_syscalls.c
index b7dba114e36c..524fdb8534b3 100644
--- a/tools/perf/examples/bpf/augmented_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_syscalls.c
@@ -19,12 +19,8 @@
 #include <stdio.h>
 #include <linux/socket.h>
 
-struct bpf_map SEC("maps") __augmented_syscalls__ = {
-       .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
-       .key_size = sizeof(int),
-       .value_size = sizeof(u32),
-       .max_entries = __NR_CPUS__,
-};
+/* bpf-output associated map */
+bpf_map(__augmented_syscalls__, PERF_EVENT_ARRAY, int, u32, __NR_CPUS__);
 
 struct syscall_exit_args {
 	unsigned long long common_tp_fields;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [GIT PULL 00/29] perf/core improvements and fixes
  2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (28 preceding siblings ...)
  2019-01-25 23:18 ` [PATCH 29/29] perf augmented_syscalls: Convert to bpf_map() Arnaldo Carvalho de Melo
@ 2019-01-26  9:52 ` Ingo Molnar
  29 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2019-01-26  9:52 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Song Liu, Tony Jones, Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit f575494d4a610278ea8597f2f798c8431b94e884:
> 
>   Merge tag 'perf-core-for-mingo-5.0-20190121' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2019-01-22 11:07:01 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-5.0-20190126
> 
> for you to fetch changes up to 76a06125dd57ed2c7559410168b543313fa0cc51:
> 
>   perf augmented_syscalls: Convert to bpf_map() (2019-01-25 15:12:11 +0100)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> BPF:
> 
>   Song Liu:
> 
>   - Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT
> 
>   Arnaldo Carvalho de Melo:
> 
>   - Add bpf_map() helper, to make BPF map declararions more compact and
>     allow for BTF annotations to be made transparently.
> 
> perf script python:
> 
>   Tony Jones:
> 
>   - Remove explicit shebangs.
> 
>   - Fix the PYTHON=python3 builds.
> 
> Core:
> 
>   Davidlohr Bueso:
> 
>   - Update rbtree implementation, getting it closer to the kernel one.
> 
>   - Use cached rbtrees.
> 
>   Arnaldo Carvalho de Melo:
> 
>   - Remove some needless headers from .c and .h files fixing up the fallout,
>     to reduce building time when changes are made to .h files
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (16):
>       perf color: Add missing stdarg.g to color.h
>       perf symbols: Move symbol_conf to separate file
>       perf annotate: Remove lots of headers from annotate.h
>       perf tools: Move branch structs to branch.h
>       perf block-range: Add missing headers
>       perf symbols: Remove include map.h from dso.h
>       perf symbols: Remove some unnecessary includes from symbol.h
>       perf namespaces: Remove namespaces.h from .h headers
>       perf comm: Remove needless headers from comm.h
>       perf callchain: No need to include perf.h
>       perf bpf: Add bpf_map() helper
>       perf bpf: Convert pid_map() to bpf_map()
>       perf augmented_raw_syscalls: Use bpf_map()
>       perf trace: Fixup etcsnoop example
>       perf bpf examples: Convert etcsnoop to use bpf_map()
>       perf augmented_syscalls: Convert to bpf_map()
> 
> Davidlohr Bueso (7):
>       tools: Update rbtree implementation
>       perf machine: Use cached rbtrees
>       perf callchain: Use cached rbtrees
>       perf util: Use cached rbtree for rblists
>       perf symbols: Use cached rbtrees
>       perf hist: Use cached rbtrees
>       perf sched: Use cached rbtrees
> 
> Song Liu (1):
>       perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT
> 
> Tony Jones (6):
>       perf script python: Add trace_context extension module to sys.modules
>       perf script python: Use PyBytes for attr in trace-event-python
>       perf script python: Remove explicit shebang from setup.py
>       perf script python: Remove explicit shebang from tests/attr.c
>       perf script python: Remove explicit shebang from Python scripts
>       perf script python: Add Python3 support to tests/attr.py
> 
>  tools/include/linux/rbtree.h                       |  52 +++++-
>  tools/include/linux/rbtree_augmented.h             |  60 +++++--
>  tools/lib/rbtree.c                                 | 178 +++++++++++++-----
>  tools/perf/Makefile.perf                           |   4 +-
>  tools/perf/builtin-annotate.c                      |   4 +-
>  tools/perf/builtin-c2c.c                           |   6 +-
>  tools/perf/builtin-diff.c                          |  10 +-
>  tools/perf/builtin-probe.c                         |   1 +
>  tools/perf/builtin-report.c                        |   3 +-
>  tools/perf/builtin-sched.c                         |  45 ++---
>  tools/perf/builtin-top.c                           |   2 +-
>  tools/perf/builtin-trace.c                         |   3 +-
>  tools/perf/examples/bpf/augmented_raw_syscalls.c   |  14 +-
>  tools/perf/examples/bpf/augmented_syscalls.c       |   8 +-
>  tools/perf/examples/bpf/etcsnoop.c                 |   8 +-
>  tools/perf/include/bpf/bpf.h                       |  16 +-
>  tools/perf/scripts/python/exported-sql-viewer.py   |   1 -
>  tools/perf/scripts/python/sched-migration.py       |   2 -
>  tools/perf/scripts/python/stat-cpi.py              |   1 -
>  tools/perf/tests/attr.py                           |  33 ++--
>  tools/perf/tests/hists_common.c                    |   8 +-
>  tools/perf/tests/hists_cumulate.c                  |  14 +-
>  tools/perf/tests/hists_link.c                      |   8 +-
>  tools/perf/tests/hists_output.c                    |  32 ++--
>  tools/perf/ui/browsers/hists.c                     |  16 +-
>  tools/perf/ui/browsers/map.c                       |   1 +
>  tools/perf/ui/gtk/annotate.c                       |   1 +
>  tools/perf/ui/gtk/hists.c                          |   6 +-
>  tools/perf/ui/stdio/hist.c                         |   3 +-
>  tools/perf/util/annotate.c                         |   1 +
>  tools/perf/util/annotate.h                         |  14 +-
>  tools/perf/util/block-range.c                      |   2 +
>  tools/perf/util/block-range.h                      |   6 +-
>  tools/perf/util/bpf-event.c                        |  10 +-
>  tools/perf/util/branch.h                           |  27 ++-
>  tools/perf/util/build-id.c                         |  13 +-
>  tools/perf/util/build-id.h                         |   3 +-
>  tools/perf/util/callchain.h                        |   1 -
>  tools/perf/util/color.h                            |   1 +
>  tools/perf/util/comm.c                             |   1 +
>  tools/perf/util/comm.h                             |   4 +-
>  tools/perf/util/dso.c                              |  10 +-
>  tools/perf/util/dso.h                              |  16 +-
>  tools/perf/util/event.h                            |  21 +--
>  tools/perf/util/hist.c                             | 199 ++++++++++++---------
>  tools/perf/util/hist.h                             |  10 +-
>  tools/perf/util/intlist.h                          |   2 +-
>  tools/perf/util/jitdump.c                          |   1 +
>  tools/perf/util/machine.c                          |  53 +++---
>  tools/perf/util/machine.h                          |  12 +-
>  tools/perf/util/map.c                              |   8 +-
>  tools/perf/util/metricgroup.c                      |   2 +-
>  tools/perf/util/probe-event.c                      |   4 +-
>  tools/perf/util/probe-event.h                      |   5 +-
>  tools/perf/util/probe-file.c                       |   1 +
>  tools/perf/util/rb_resort.h                        |   8 +-
>  tools/perf/util/rblist.c                           |  28 +--
>  tools/perf/util/rblist.h                           |   2 +-
>  .../util/scripting-engines/trace-event-python.c    |  15 +-
>  tools/perf/util/setup.py                           |   2 -
>  tools/perf/util/sort.h                             |   4 +-
>  tools/perf/util/srcline.c                          |  43 +++--
>  tools/perf/util/srcline.h                          |  13 +-
>  tools/perf/util/stat-display.c                     |   1 +
>  tools/perf/util/stat-shadow.c                      |   2 +-
>  tools/perf/util/strlist.h                          |   2 +-
>  tools/perf/util/symbol-minimal.c                   |   1 +
>  tools/perf/util/symbol.c                           |  87 ++++-----
>  tools/perf/util/symbol.h                           |  88 ++-------
>  tools/perf/util/symbol_conf.h                      |  73 ++++++++
>  tools/perf/util/symbol_fprintf.c                   |   3 +-
>  tools/perf/util/thread.h                           |   1 +
>  tools/perf/util/util.c                             |   1 +
>  73 files changed, 810 insertions(+), 531 deletions(-)
>  create mode 100644 tools/perf/util/symbol_conf.h

Pulled, thanks a lot Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [GIT PULL 00/29] perf/core improvements and fixes
  2019-01-03 12:45 Arnaldo Carvalho de Melo
@ 2019-01-03 13:07 ` Ingo Molnar
  0 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2019-01-03 13:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
	Andi Kleen, Colin King, Joe Mario, Namhyung Kim,
	Ondřej Lysoněk, Stanislav Fomichev,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 883f4def8b77e6870ce42be279564cca0256c611:
> 
>   Merge tag 'perf-core-for-mingo-4.21-20181218' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2018-12-20 18:51:47 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.21-20190103
> 
> for you to fetch changes up to b25756df5b28cd7b6e91200fc5012e7c76e8ec69:
> 
>   perf session: Add comment for perf_session__register_idle_thread() (2019-01-02 11:05:06 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> perf c2c:
> 
>   Jiri Olsa:
> 
>   - Change the default coalesce setup to from '--coalesce pid,iaddr' to just '--coalesce iaddr'.
> 
>   - Increase the HITM ratio limit for displayed cachelines.
> 
> perf script:
> 
>   Andi Kleen:
> 
>   - Fix LBR skid dump problems in brstackinsn.
> 
> perf trace:
> 
>   Arnaldo Carvalho de Melo:
> 
>   - Check if the raw_syscalls:sys_{enter,exit} are setup before setting tp filter.
> 
>   - Do not hardcode the size of the tracepoint common_ fields.
> 
>   - Beautify USBDEFFS_ ioctl commands.
> 
>   Colin Ian King:
> 
>   - Use correct SECCOMP prefix spelling, "SECOMP_*" -> "SECCOMP_*".
> 
> perf python:
> 
>   Jiri Olsa:
> 
>   - Do not force closing original perf descriptor in evlist.get_pollfd().
> 
> tools misc:
> 
>   Jiri Olsa:
> 
>   - Allow overriding CFLAGS and LDFLAGS.
> 
> perf build:
> 
>   Stanislav Fomichev:
> 
>   - Don't unconditionally link the libbfd feature test to -liberty and -lz
> 
> thread-stack:
> 
>   Adrian Hunter:
> 
>   - Fix processing for the idle task, having a stack per cpu.
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Adrian Hunter (8):
>       perf thread-stack: Simplify some code in thread_stack__process()
>       perf thread-stack: Tidy thread_stack__bottom() usage
>       perf thread-stack: Avoid direct reference to the thread's stack
>       perf thread-stack: Allow for a thread stack array
>       perf thread-stack: Factor out thread_stack__init()
>       perf thread-stack: Allocate an array of thread stacks
>       perf thread-stack: Fix thread stack processing for the idle task
>       perf session: Add comment for perf_session__register_idle_thread()
> 
> Andi Kleen (1):
>       perf script: Fix LBR skid dump problems in brstackinsn
> 
> Arnaldo Carvalho de Melo (11):
>       perf trace: Check if the raw_syscalls:sys_{enter,exit} are setup before setting tp filter
>       perf beauty mmap: PROT_WRITE should come before PROT_EXEC
>       perf trace: Do not hardcode the size of the tracepoint common_ fields
>       perf trace: Rename thread_thread->paths to thread_trace->files
>       perf trace: Move the files table resizing to outside set_pathname()
>       perf trace: Store the major number for a file when storing its pathname
>       tools headers uapi: Grab a copy of usbdevice_fs.h
>       perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands
>       perf trace: Wire up ioctl's USBDEBFS_ cmd table generator
>       perf trace beauty: Export function to get the files for a thread
>       perf trace beauty ioctl: Beautify USBDEVFS_ commands
> 
> Colin Ian King (1):
>       perf trace: Use correct SECCOMP prefix spelling, "SECOMP_*" -> "SECCOMP_*"
> 
> Jiri Olsa (7):
>       perf python: Do not force closing original perf descriptor in evlist.get_pollfd()
>       perf c2c: Change the default coalesce setup
>       perf c2c: Increase the HITM ratio limit for displayed cachelines
>       tools power x86_energy_perf_policy: Override CFLAGS assignments and add LDFLAGS to build command
>       tools thermal tmon: Allow overriding CFLAGS assignments
>       tools power turbostat: Override CFLAGS assignments and add LDFLAGS to build command
>       tools gpio: Allow overriding CFLAGS
> 
> Stanislav Fomichev (1):
>       perf build: Don't unconditionally link the libbfd feature test to -liberty and -lz
> 
>  tools/build/Makefile.feature                       |   4 +-
>  tools/build/feature/Makefile                       |  10 +-
>  tools/gpio/Makefile                                |   2 +-
>  tools/include/uapi/linux/usbdevice_fs.h            | 201 ++++++++++++++++++
>  tools/perf/Makefile.config                         |  44 ++--
>  tools/perf/Makefile.perf                           |   8 +
>  tools/perf/builtin-c2c.c                           |   4 +-
>  tools/perf/builtin-script.c                        |  21 +-
>  tools/perf/builtin-trace.c                         | 142 +++++++++----
>  tools/perf/check-headers.sh                        |   1 +
>  tools/perf/trace/beauty/beauty.h                   |   7 +
>  tools/perf/trace/beauty/ioctl.c                    |  22 ++
>  tools/perf/trace/beauty/mmap.c                     |   2 +-
>  tools/perf/trace/beauty/seccomp.c                  |   4 +-
>  tools/perf/trace/beauty/usbdevfs_ioctl.sh          |  19 ++
>  tools/perf/util/dump-insn.c                        |   8 +
>  tools/perf/util/dump-insn.h                        |   2 +
>  tools/perf/util/intel-bts.c                        |   4 +-
>  .../util/intel-pt-decoder/intel-pt-insn-decoder.c  |   8 +
>  tools/perf/util/intel-pt.c                         |   6 +-
>  tools/perf/util/python.c                           |   3 +-
>  tools/perf/util/session.c                          |   7 +
>  tools/perf/util/thread-stack.c                     | 227 +++++++++++++++------
>  tools/perf/util/thread-stack.h                     |   8 +-
>  tools/power/x86/turbostat/Makefile                 |   8 +-
>  tools/power/x86/x86_energy_perf_policy/Makefile    |   6 +-
>  tools/thermal/tmon/Makefile                        |   8 +-
>  27 files changed, 620 insertions(+), 166 deletions(-)
>  create mode 100644 tools/include/uapi/linux/usbdevice_fs.h
>  create mode 100755 tools/perf/trace/beauty/usbdevfs_ioctl.sh

Pulled, thanks a lot Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [GIT PULL 00/29] perf/core improvements and fixes
@ 2019-01-03 12:45 Arnaldo Carvalho de Melo
  2019-01-03 13:07 ` Ingo Molnar
  0 siblings, 1 reply; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-01-03 12:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen, Colin King,
	Joe Mario, Namhyung Kim, Ondřej Lysoněk,
	Stanislav Fomichev, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 883f4def8b77e6870ce42be279564cca0256c611:

  Merge tag 'perf-core-for-mingo-4.21-20181218' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2018-12-20 18:51:47 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.21-20190103

for you to fetch changes up to b25756df5b28cd7b6e91200fc5012e7c76e8ec69:

  perf session: Add comment for perf_session__register_idle_thread() (2019-01-02 11:05:06 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

perf c2c:

  Jiri Olsa:

  - Change the default coalesce setup to from '--coalesce pid,iaddr' to just '--coalesce iaddr'.

  - Increase the HITM ratio limit for displayed cachelines.

perf script:

  Andi Kleen:

  - Fix LBR skid dump problems in brstackinsn.

perf trace:

  Arnaldo Carvalho de Melo:

  - Check if the raw_syscalls:sys_{enter,exit} are setup before setting tp filter.

  - Do not hardcode the size of the tracepoint common_ fields.

  - Beautify USBDEFFS_ ioctl commands.

  Colin Ian King:

  - Use correct SECCOMP prefix spelling, "SECOMP_*" -> "SECCOMP_*".

perf python:

  Jiri Olsa:

  - Do not force closing original perf descriptor in evlist.get_pollfd().

tools misc:

  Jiri Olsa:

  - Allow overriding CFLAGS and LDFLAGS.

perf build:

  Stanislav Fomichev:

  - Don't unconditionally link the libbfd feature test to -liberty and -lz

thread-stack:

  Adrian Hunter:

  - Fix processing for the idle task, having a stack per cpu.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Adrian Hunter (8):
      perf thread-stack: Simplify some code in thread_stack__process()
      perf thread-stack: Tidy thread_stack__bottom() usage
      perf thread-stack: Avoid direct reference to the thread's stack
      perf thread-stack: Allow for a thread stack array
      perf thread-stack: Factor out thread_stack__init()
      perf thread-stack: Allocate an array of thread stacks
      perf thread-stack: Fix thread stack processing for the idle task
      perf session: Add comment for perf_session__register_idle_thread()

Andi Kleen (1):
      perf script: Fix LBR skid dump problems in brstackinsn

Arnaldo Carvalho de Melo (11):
      perf trace: Check if the raw_syscalls:sys_{enter,exit} are setup before setting tp filter
      perf beauty mmap: PROT_WRITE should come before PROT_EXEC
      perf trace: Do not hardcode the size of the tracepoint common_ fields
      perf trace: Rename thread_thread->paths to thread_trace->files
      perf trace: Move the files table resizing to outside set_pathname()
      perf trace: Store the major number for a file when storing its pathname
      tools headers uapi: Grab a copy of usbdevice_fs.h
      perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands
      perf trace: Wire up ioctl's USBDEBFS_ cmd table generator
      perf trace beauty: Export function to get the files for a thread
      perf trace beauty ioctl: Beautify USBDEVFS_ commands

Colin Ian King (1):
      perf trace: Use correct SECCOMP prefix spelling, "SECOMP_*" -> "SECCOMP_*"

Jiri Olsa (7):
      perf python: Do not force closing original perf descriptor in evlist.get_pollfd()
      perf c2c: Change the default coalesce setup
      perf c2c: Increase the HITM ratio limit for displayed cachelines
      tools power x86_energy_perf_policy: Override CFLAGS assignments and add LDFLAGS to build command
      tools thermal tmon: Allow overriding CFLAGS assignments
      tools power turbostat: Override CFLAGS assignments and add LDFLAGS to build command
      tools gpio: Allow overriding CFLAGS

Stanislav Fomichev (1):
      perf build: Don't unconditionally link the libbfd feature test to -liberty and -lz

 tools/build/Makefile.feature                       |   4 +-
 tools/build/feature/Makefile                       |  10 +-
 tools/gpio/Makefile                                |   2 +-
 tools/include/uapi/linux/usbdevice_fs.h            | 201 ++++++++++++++++++
 tools/perf/Makefile.config                         |  44 ++--
 tools/perf/Makefile.perf                           |   8 +
 tools/perf/builtin-c2c.c                           |   4 +-
 tools/perf/builtin-script.c                        |  21 +-
 tools/perf/builtin-trace.c                         | 142 +++++++++----
 tools/perf/check-headers.sh                        |   1 +
 tools/perf/trace/beauty/beauty.h                   |   7 +
 tools/perf/trace/beauty/ioctl.c                    |  22 ++
 tools/perf/trace/beauty/mmap.c                     |   2 +-
 tools/perf/trace/beauty/seccomp.c                  |   4 +-
 tools/perf/trace/beauty/usbdevfs_ioctl.sh          |  19 ++
 tools/perf/util/dump-insn.c                        |   8 +
 tools/perf/util/dump-insn.h                        |   2 +
 tools/perf/util/intel-bts.c                        |   4 +-
 .../util/intel-pt-decoder/intel-pt-insn-decoder.c  |   8 +
 tools/perf/util/intel-pt.c                         |   6 +-
 tools/perf/util/python.c                           |   3 +-
 tools/perf/util/session.c                          |   7 +
 tools/perf/util/thread-stack.c                     | 227 +++++++++++++++------
 tools/perf/util/thread-stack.h                     |   8 +-
 tools/power/x86/turbostat/Makefile                 |   8 +-
 tools/power/x86/x86_energy_perf_policy/Makefile    |   6 +-
 tools/thermal/tmon/Makefile                        |   8 +-
 27 files changed, 620 insertions(+), 166 deletions(-)
 create mode 100644 tools/include/uapi/linux/usbdevice_fs.h
 create mode 100755 tools/perf/trace/beauty/usbdevfs_ioctl.sh

Test results:

The first ones are container based builds of tools/perf with and without libelf
support.  Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

  $ dm
   1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   5 alpine:3.8                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   6 alpine:edge                   : Ok   gcc (Alpine 8.2.0) 8.2.0
   7 amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
   8 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
   9 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  10 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  11 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  12 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
  13 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
  14 clearlinux:latest             : Ok   gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502
  15 debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  16 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u1) 4.9.2
  17 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
  18 debian:experimental           : Ok   gcc (Debian 8.2.0-10) 8.2.0
  19 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.2.0-10) 8.2.0
  20 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 8.2.0-10) 8.2.0
  21 debian:experimental           : Ok   gcc (Debian 8.2.0-13) 8.2.0
  22 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
  23 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
  24 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 8.2.0-11) 8.2.0
  25 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
  26 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  27 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  28 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  29 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  30 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  31 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  32 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
  33 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)
  34 fedora:28                     : Ok   gcc (GCC) 8.2.1 20181105 (Red Hat 8.2.1-5)
  35 fedora:29                     : Ok   gcc (GCC) 8.2.1 20181105 (Red Hat 8.2.1-5)
  36 fedora:rawhide                : Ok   gcc (GCC) 8.2.1 20181105 (Red Hat 8.2.1-5)
  37 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 7.3.0-r3 p1.4) 7.3.0
  38 mageia:5                      : Ok   gcc (GCC) 4.9.2
  39 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0
  40 opensuse:13.2                 : Ok   gcc (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
  41 opensuse:15.0                 : Ok   gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
  42 opensuse:42.1                 : Ok   gcc (SUSE Linux) 4.8.5
  43 opensuse:42.2                 : Ok   gcc (SUSE Linux) 4.8.5
  44 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5
  45 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 8.2.1 20181108 [gcc-8-branch revision 265914]
  46 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
  47 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36.0.1)
  48 ubuntu:12.04.5                : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  49 ubuntu:14.04.4                : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
  50 ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
  51 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  52 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  53 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  54 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  55 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  56 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  57 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  58 ubuntu:17.10                  : Ok   gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
  59 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  60 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
  61 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
  62 ubuntu:18.04-x-m68k           : Ok   m68k-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  63 ubuntu:18.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  64 ubuntu:18.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  65 ubuntu:18.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  66 ubuntu:18.04-x-riscv64        : Ok   riscv64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  67 ubuntu:18.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  68 ubuntu:18.04-x-sh4            : Ok   sh4-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  69 ubuntu:18.04-x-sparc64        : Ok   sparc64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  70 ubuntu:18.10                  : Ok   gcc (Ubuntu 8.2.0-7ubuntu1) 8.2.0
  71 ubuntu:19.04                  : Ok   gcc (Ubuntu 8.2.0-12ubuntu1) 8.2.0
  72 ubuntu:19.04-x-alpha          : Ok   alpha-linux-gnu-gcc (Ubuntu 8.2.0-11ubuntu1) 8.2.0
  73 ubuntu:19.04-x-hppa           : Ok   hppa-linux-gnu-gcc (Ubuntu 8.2.0-11ubuntu1) 8.2.0
  $ 

  # uname -a
  Linux quaco 4.19.8-rt6+ #2 SMP PREEMPT RT Wed Dec 19 12:58:37 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
  # git log --oneline -1
  b25756df5b28 perf session: Add comment for perf_session__register_idle_thread()
  # perf version --build-options
  perf version 4.20.rc7.gb25756
                   dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                   glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                    gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
           syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                  libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
               libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
               get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Breakpoint accounting                                 : Ok
  22: Watchpoint                                            :
  22.1: Read Only Watchpoint                                : Skip
  22.2: Write Only Watchpoint                               : Ok
  22.3: Read / Write Watchpoint                             : Ok
  22.4: Modify Watchpoint                                   : Ok
  23: Number of exit events of a simple workload            : Ok
  24: Software clock events period values                   : Ok
  25: Object code reading                                   : Ok
  26: Sample parsing                                        : Ok
  27: Use a dummy software event to keep tracking           : Ok
  28: Parse with no sample_id_all bit set                   : Ok
  29: Filter hist entries                                   : Ok
  30: Lookup mmap thread                                    : Ok
  31: Share thread mg                                       : Ok
  32: Sort output of hist entries                           : Ok
  33: Cumulate child hist entries                           : Ok
  34: Track with sched_switch                               : Ok
  35: Filter fds with revents mask in a fdarray             : Ok
  36: Add fd to a fdarray, making it autogrow               : Ok
  37: kmod_path__parse                                      : Ok
  38: Thread map                                            : Ok
  39: LLVM search and compile                               :
  39.1: Basic BPF llvm compile                              : Ok
  39.2: kbuild searching                                    : Ok
  39.3: Compile source for BPF prologue generation          : Ok
  39.4: Compile source for BPF relocation                   : Ok
  40: Session topology                                      : Ok
  41: BPF filter                                            :
  41.1: Basic BPF filtering                                 : Ok
  41.2: BPF pinning                                         : Ok
  41.3: BPF prologue generation                             : Ok
  41.4: BPF relocation checker                              : Ok
  42: Synthesize thread map                                 : Ok
  43: Remove thread map                                     : Ok
  44: Synthesize cpu map                                    : Ok
  45: Synthesize stat config                                : Ok
  46: Synthesize stat                                       : Ok
  47: Synthesize stat round                                 : Ok
  48: Synthesize attr update                                : Ok
  49: Event times                                           : Ok
  50: Read backward ring buffer                             : Ok
  51: Print cpu map                                         : Ok
  52: Probe SDT events                                      : Ok
  53: is_printable_array                                    : Ok
  54: Print bitmap                                          : Ok
  55: perf hooks                                            : Ok
  56: builtin clang support                                 : Skip (not compiled in)
  57: unit_number__scnprintf                                : Ok
  58: mem2node                                              : Ok
  59: x86 rdpmc                                             : Ok
  60: Convert perf time to TSC                              : Ok
  61: DWARF unwind                                          : Ok
  62: x86 instruction decoder - new instructions            : Ok
  63: x86 bp modify                                         : Ok
  64: probe libc's inet_pton & backtrace it with ping       : Ok
  65: Use vfs_getname probe to get syscall args filenames   : Ok
  66: Add vfs_getname probe to get syscall args filenames   : Ok
  67: Check open filename arg using perf trace + vfs_getname: Ok

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
           make_no_libpython_O: make NO_LIBPYTHON=1
              make_clean_all_O: make clean all
                 make_perf_o_O: make perf.o
                make_no_newt_O: make NO_NEWT=1
           make_no_libunwind_O: make NO_LIBUNWIND=1
              make_no_libelf_O: make NO_LIBELF=1
             make_no_libnuma_O: make NO_LIBNUMA=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
               make_no_slang_O: make NO_SLANG=1
                   make_pure_O: make
                   make_help_O: make help
                 make_cscope_O: make cscope
         make_with_clangllvm_O: make LIBCLANGLLVM=1
             make_util_map_o_O: make util/map.o
            make_no_demangle_O: make NO_DEMANGLE=1
         make_install_prefix_O: make install prefix=/tmp/krava
           make_no_backtrace_O: make NO_BACKTRACE=1
                    make_doc_O: make doc
                make_install_O: make install
       make_util_pmu_bison_o_O: make util/pmu-bison.o
            make_no_auxtrace_O: make NO_AUXTRACE=1
                  make_debug_O: make DEBUG=1
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
            make_no_libaudit_O: make NO_LIBAUDIT=1
           make_no_libbionic_O: make NO_LIBBIONIC=1
             make_no_libperl_O: make NO_LIBPERL=1
        make_with_babeltrace_O: make LIBBABELTRACE=1
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
            make_install_bin_O: make install-bin
                 make_static_O: make LDFLAGS=-static
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
                   make_tags_O: make tags
                make_no_gtk2_O: make NO_GTK2=1
              make_no_libbpf_O: make NO_LIBBPF=1
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [GIT PULL 00/29] perf/core improvements and fixes
  2016-12-20 17:03 Arnaldo Carvalho de Melo
@ 2016-12-20 19:15 ` Ingo Molnar
  0 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2016-12-20 19:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Alexander Shishkin,
	Alexei Starovoitov, Andi Kleen, Benjamin Herrenschmidt,
	Chris Riyder, Daniel Borkmann, David Ahern, Davidlohr Bueso,
	Jiri Olsa, Joe Stringer, Kan Liang, Kim Phillips, Kyle McMartin,
	linuxppc-dev, Markus Trippelsdorf, Masami Hiramatsu,
	Michael Ellerman, Minchan Kim, Namhyung Kim, Naveen N . Rao,
	netdev, Nicholas Piggin, Paul Mackerras, Peter Zijlstra,
	Ravi Bangoria, Taeung Song, Wang Nan, Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
>         Please consider pulling, I had most of this queued before your first
> pull req to Linus for 4.10, most are fixes, with 'perf sched timehist --idle'
> as a followup new feature to the 'perf sched timehist' command introduced in
> this window.
> 	
> 	One other thing that delayed this was the samples/bpf/ switch to
> tools/lib/bpf/ that involved fixing up merge clashes with net.git and also
> to properly test it, after more rounds than antecipated, but all seems ok
> now and would be good to get this merge issues past us ASAP.
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit e7aa8c2eb11ba69b1b69099c3c7bd6be3087b0ba:
> 
>   Merge tag 'docs-4.10' of git://git.lwn.net/linux (2016-12-12 21:58:13 -0800)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161220
> 
> for you to fetch changes up to 9899694a7f67714216665b87318eb367e2c5c901:
> 
>   samples/bpf: Move open_raw_sock to separate header (2016-12-20 12:00:40 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> New features:
> 
> - Introduce 'perf sched timehist --idle', to analyse processes
>   going to/from idle state (Namhyung Kim)
> 
> Fixes:
> 
> - Allow 'perf record -u user' to continue when facing races with threads
>   going away after having scanned them via /proc (Jiri Olsa)
> 
> - Fix 'perf mem' --all-user/--all-kernel options (Jiri Olsa)
> 
> - Support jumps with multiple arguments (Ravi Bangoria)
> 
> - Fix jumps to before the function where they are located (Ravi
> Bangoria)
> 
> - Fix lock-pi help string (Davidlohr Bueso)
> 
> - Fix build of 'perf trace' in odd systems such as a RHEL PPC one (Jiri Olsa)
> 
> - Do not overwrite valid build id in 'perf diff' (Kan Liang)
> 
> - Don't throw error for zero length symbols, allowing the use of the TUI
>   in PowerPC, where such symbols became more common recently (Ravi Bangoria)
> 
> Infrastructure:
> 
> - Switch of samples/bpf/ to use tools/lib/bpf, removing libbpf
>   duplication (Joe Stringer)
> 
> - Move headers check into bash script (Jiri Olsa)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (3):
>       perf tools: Remove some needless __maybe_unused
>       samples/bpf: Make perf_event_read() static
>       samples/bpf: Be consistent with bpf_load_program bpf_insn parameter
> 
> Davidlohr Bueso (1):
>       perf bench futex: Fix lock-pi help string
> 
> Jiri Olsa (7):
>       perf tools: Move headers check into bash script
>       perf mem: Fix --all-user/--all-kernel options
>       perf evsel: Use variable instead of repeating lengthy FD macro
>       perf thread_map: Add thread_map__remove function
>       perf evsel: Allow to ignore missing pid
>       perf record: Force ignore_missing_thread for uid option
>       perf trace: Check if MAP_32BIT is defined (again)
> 
> Joe Stringer (8):
>       tools lib bpf: Sync {tools,}/include/uapi/linux/bpf.h
>       tools lib bpf: use __u32 from linux/types.h
>       tools lib bpf: Add flags to bpf_create_map()
>       samples/bpf: Make samples more libbpf-centric
>       samples/bpf: Switch over to libbpf
>       tools lib bpf: Add bpf_prog_{attach,detach}
>       samples/bpf: Remove perf_event_open() declaration
>       samples/bpf: Move open_raw_sock to separate header
> 
> Kan Liang (1):
>       perf diff: Do not overwrite valid build id
> 
> Namhyung Kim (6):
>       perf sched timehist: Split is_idle_sample()
>       perf sched timehist: Introduce struct idle_time_data
>       perf sched timehist: Save callchain when entering idle
>       perf sched timehist: Skip non-idle events when necessary
>       perf sched timehist: Add -I/--idle-hist option
>       perf sched timehist: Show callchains for idle stat
> 
> Ravi Bangoria (3):
>       perf annotate: Support jump instruction with target as second operand
>       perf annotate: Fix jump target outside of function address range
>       perf annotate: Don't throw error for zero length symbols
> 
>  samples/bpf/Makefile                              |  70 +--
>  samples/bpf/README.rst                            |   4 +-
>  samples/bpf/bpf_load.c                            |  21 +-
>  samples/bpf/bpf_load.h                            |   3 +
>  samples/bpf/fds_example.c                         |  13 +-
>  samples/bpf/lathist_user.c                        |   2 +-
>  samples/bpf/libbpf.c                              | 176 -------
>  samples/bpf/libbpf.h                              |  28 +-
>  samples/bpf/lwt_len_hist_user.c                   |   6 +-
>  samples/bpf/offwaketime_user.c                    |   8 +-
>  samples/bpf/sampleip_user.c                       |   7 +-
>  samples/bpf/sock_example.c                        |  14 +-
>  samples/bpf/sock_example.h                        |  35 ++
>  samples/bpf/sockex1_user.c                        |   7 +-
>  samples/bpf/sockex2_user.c                        |   5 +-
>  samples/bpf/sockex3_user.c                        |   5 +-
>  samples/bpf/spintest_user.c                       |   8 +-
>  samples/bpf/tc_l2_redirect_user.c                 |   4 +-
>  samples/bpf/test_cgrp2_array_pin.c                |   4 +-
>  samples/bpf/test_cgrp2_attach.c                   |  12 +-
>  samples/bpf/test_cgrp2_attach2.c                  |   8 +-
>  samples/bpf/test_cgrp2_sock.c                     |   7 +-
>  samples/bpf/test_current_task_under_cgroup_user.c |   8 +-
>  samples/bpf/test_lru_dist.c                       |  32 +-
>  samples/bpf/test_probe_write_user_user.c          |   2 +-
>  samples/bpf/trace_event_user.c                    |  23 +-
>  samples/bpf/trace_output_user.c                   |   7 +-
>  samples/bpf/tracex2_user.c                        |  10 +-
>  samples/bpf/tracex3_user.c                        |   4 +-
>  samples/bpf/tracex4_user.c                        |   4 +-
>  samples/bpf/tracex6_user.c                        |   5 +-
>  samples/bpf/xdp1_user.c                           |   2 +-
>  samples/bpf/xdp_tx_iptunnel_user.c                |   6 +-
>  tools/include/uapi/linux/bpf.h                    | 593 +++++++++++++---------
>  tools/lib/bpf/bpf.c                               |  30 +-
>  tools/lib/bpf/bpf.h                               |   9 +-
>  tools/lib/bpf/libbpf.c                            |   3 +-
>  tools/perf/Documentation/perf-sched.txt           |   4 +
>  tools/perf/Makefile.perf                          |  94 +---
>  tools/perf/bench/futex-lock-pi.c                  |   2 +-
>  tools/perf/builtin-c2c.c                          |  13 +-
>  tools/perf/builtin-mem.c                          |   4 +-
>  tools/perf/builtin-record.c                       |   3 +
>  tools/perf/builtin-report.c                       |   2 +-
>  tools/perf/builtin-sched.c                        | 261 ++++++++--
>  tools/perf/builtin-stat.c                         |   6 +-
>  tools/perf/check-headers.sh                       |  59 +++
>  tools/perf/perf.h                                 |   1 +
>  tools/perf/tests/builtin-test.c                   |   4 +
>  tools/perf/tests/tests.h                          |   1 +
>  tools/perf/tests/thread-map.c                     |  44 ++
>  tools/perf/trace/beauty/mmap.c                    |   2 +
>  tools/perf/ui/browsers/annotate.c                 |   5 +-
>  tools/perf/util/annotate.c                        |  23 +-
>  tools/perf/util/annotate.h                        |   5 +-
>  tools/perf/util/evsel.c                           |  61 ++-
>  tools/perf/util/evsel.h                           |   1 +
>  tools/perf/util/symbol.c                          |   3 +-
>  tools/perf/util/thread_map.c                      |  22 +
>  tools/perf/util/thread_map.h                      |   1 +
>  60 files changed, 1075 insertions(+), 731 deletions(-)
>  delete mode 100644 samples/bpf/libbpf.c
>  create mode 100644 samples/bpf/sock_example.h
>  create mode 100755 tools/perf/check-headers.sh

Pulled, thanks a lot Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [GIT PULL 00/29] perf/core improvements and fixes
@ 2016-12-20 17:03 Arnaldo Carvalho de Melo
  2016-12-20 19:15 ` Ingo Molnar
  0 siblings, 1 reply; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-12-20 17:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Shishkin, Alexei Starovoitov, Andi Kleen,
	Benjamin Herrenschmidt, Chris Riyder, Daniel Borkmann,
	David Ahern, Davidlohr Bueso, Jiri Olsa, Joe Stringer, Kan Liang,
	Kim Phillips, Kyle McMartin, linuxppc-dev, Markus Trippelsdorf,
	Masami Hiramatsu, Michael Ellerman, Minchan Kim, Namhyung Kim,
	Naveen N . Rao, netdev, Nicholas Piggin, Paul Mackerras,
	Peter Zijlstra, Ravi Bangoria, Taeung Song, Wang Nan,
	Arnaldo Carvalho de Melo

Hi Ingo,

        Please consider pulling, I had most of this queued before your first
pull req to Linus for 4.10, most are fixes, with 'perf sched timehist --idle'
as a followup new feature to the 'perf sched timehist' command introduced in
this window.
	
	One other thing that delayed this was the samples/bpf/ switch to
tools/lib/bpf/ that involved fixing up merge clashes with net.git and also
to properly test it, after more rounds than antecipated, but all seems ok
now and would be good to get this merge issues past us ASAP.

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit e7aa8c2eb11ba69b1b69099c3c7bd6be3087b0ba:

  Merge tag 'docs-4.10' of git://git.lwn.net/linux (2016-12-12 21:58:13 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161220

for you to fetch changes up to 9899694a7f67714216665b87318eb367e2c5c901:

  samples/bpf: Move open_raw_sock to separate header (2016-12-20 12:00:40 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Introduce 'perf sched timehist --idle', to analyse processes
  going to/from idle state (Namhyung Kim)

Fixes:

- Allow 'perf record -u user' to continue when facing races with threads
  going away after having scanned them via /proc (Jiri Olsa)

- Fix 'perf mem' --all-user/--all-kernel options (Jiri Olsa)

- Support jumps with multiple arguments (Ravi Bangoria)

- Fix jumps to before the function where they are located (Ravi
Bangoria)

- Fix lock-pi help string (Davidlohr Bueso)

- Fix build of 'perf trace' in odd systems such as a RHEL PPC one (Jiri Olsa)

- Do not overwrite valid build id in 'perf diff' (Kan Liang)

- Don't throw error for zero length symbols, allowing the use of the TUI
  in PowerPC, where such symbols became more common recently (Ravi Bangoria)

Infrastructure:

- Switch of samples/bpf/ to use tools/lib/bpf, removing libbpf
  duplication (Joe Stringer)

- Move headers check into bash script (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (3):
      perf tools: Remove some needless __maybe_unused
      samples/bpf: Make perf_event_read() static
      samples/bpf: Be consistent with bpf_load_program bpf_insn parameter

Davidlohr Bueso (1):
      perf bench futex: Fix lock-pi help string

Jiri Olsa (7):
      perf tools: Move headers check into bash script
      perf mem: Fix --all-user/--all-kernel options
      perf evsel: Use variable instead of repeating lengthy FD macro
      perf thread_map: Add thread_map__remove function
      perf evsel: Allow to ignore missing pid
      perf record: Force ignore_missing_thread for uid option
      perf trace: Check if MAP_32BIT is defined (again)

Joe Stringer (8):
      tools lib bpf: Sync {tools,}/include/uapi/linux/bpf.h
      tools lib bpf: use __u32 from linux/types.h
      tools lib bpf: Add flags to bpf_create_map()
      samples/bpf: Make samples more libbpf-centric
      samples/bpf: Switch over to libbpf
      tools lib bpf: Add bpf_prog_{attach,detach}
      samples/bpf: Remove perf_event_open() declaration
      samples/bpf: Move open_raw_sock to separate header

Kan Liang (1):
      perf diff: Do not overwrite valid build id

Namhyung Kim (6):
      perf sched timehist: Split is_idle_sample()
      perf sched timehist: Introduce struct idle_time_data
      perf sched timehist: Save callchain when entering idle
      perf sched timehist: Skip non-idle events when necessary
      perf sched timehist: Add -I/--idle-hist option
      perf sched timehist: Show callchains for idle stat

Ravi Bangoria (3):
      perf annotate: Support jump instruction with target as second operand
      perf annotate: Fix jump target outside of function address range
      perf annotate: Don't throw error for zero length symbols

 samples/bpf/Makefile                              |  70 +--
 samples/bpf/README.rst                            |   4 +-
 samples/bpf/bpf_load.c                            |  21 +-
 samples/bpf/bpf_load.h                            |   3 +
 samples/bpf/fds_example.c                         |  13 +-
 samples/bpf/lathist_user.c                        |   2 +-
 samples/bpf/libbpf.c                              | 176 -------
 samples/bpf/libbpf.h                              |  28 +-
 samples/bpf/lwt_len_hist_user.c                   |   6 +-
 samples/bpf/offwaketime_user.c                    |   8 +-
 samples/bpf/sampleip_user.c                       |   7 +-
 samples/bpf/sock_example.c                        |  14 +-
 samples/bpf/sock_example.h                        |  35 ++
 samples/bpf/sockex1_user.c                        |   7 +-
 samples/bpf/sockex2_user.c                        |   5 +-
 samples/bpf/sockex3_user.c                        |   5 +-
 samples/bpf/spintest_user.c                       |   8 +-
 samples/bpf/tc_l2_redirect_user.c                 |   4 +-
 samples/bpf/test_cgrp2_array_pin.c                |   4 +-
 samples/bpf/test_cgrp2_attach.c                   |  12 +-
 samples/bpf/test_cgrp2_attach2.c                  |   8 +-
 samples/bpf/test_cgrp2_sock.c                     |   7 +-
 samples/bpf/test_current_task_under_cgroup_user.c |   8 +-
 samples/bpf/test_lru_dist.c                       |  32 +-
 samples/bpf/test_probe_write_user_user.c          |   2 +-
 samples/bpf/trace_event_user.c                    |  23 +-
 samples/bpf/trace_output_user.c                   |   7 +-
 samples/bpf/tracex2_user.c                        |  10 +-
 samples/bpf/tracex3_user.c                        |   4 +-
 samples/bpf/tracex4_user.c                        |   4 +-
 samples/bpf/tracex6_user.c                        |   5 +-
 samples/bpf/xdp1_user.c                           |   2 +-
 samples/bpf/xdp_tx_iptunnel_user.c                |   6 +-
 tools/include/uapi/linux/bpf.h                    | 593 +++++++++++++---------
 tools/lib/bpf/bpf.c                               |  30 +-
 tools/lib/bpf/bpf.h                               |   9 +-
 tools/lib/bpf/libbpf.c                            |   3 +-
 tools/perf/Documentation/perf-sched.txt           |   4 +
 tools/perf/Makefile.perf                          |  94 +---
 tools/perf/bench/futex-lock-pi.c                  |   2 +-
 tools/perf/builtin-c2c.c                          |  13 +-
 tools/perf/builtin-mem.c                          |   4 +-
 tools/perf/builtin-record.c                       |   3 +
 tools/perf/builtin-report.c                       |   2 +-
 tools/perf/builtin-sched.c                        | 261 ++++++++--
 tools/perf/builtin-stat.c                         |   6 +-
 tools/perf/check-headers.sh                       |  59 +++
 tools/perf/perf.h                                 |   1 +
 tools/perf/tests/builtin-test.c                   |   4 +
 tools/perf/tests/tests.h                          |   1 +
 tools/perf/tests/thread-map.c                     |  44 ++
 tools/perf/trace/beauty/mmap.c                    |   2 +
 tools/perf/ui/browsers/annotate.c                 |   5 +-
 tools/perf/util/annotate.c                        |  23 +-
 tools/perf/util/annotate.h                        |   5 +-
 tools/perf/util/evsel.c                           |  61 ++-
 tools/perf/util/evsel.h                           |   1 +
 tools/perf/util/symbol.c                          |   3 +-
 tools/perf/util/thread_map.c                      |  22 +
 tools/perf/util/thread_map.h                      |   1 +
 60 files changed, 1075 insertions(+), 731 deletions(-)
 delete mode 100644 samples/bpf/libbpf.c
 create mode 100644 samples/bpf/sock_example.h
 create mode 100755 tools/perf/check-headers.sh

  [root@jouet ~]# uname -a
  Linux jouet 4.9.0+ #1 SMP Mon Dec 19 15:44:52 BRT 2016 x86_64 x86_64 x86_64 GNU/Linux
  [root@jouet ~]# perf test
   1: vmlinux symtab matches kallsyms            : Ok
   2: Detect openat syscall event                : Ok
   3: Detect openat syscall event on all cpus    : Ok
   4: Read samples using the mmap interface      : Ok
   5: Parse event definition strings             : Ok
   6: PERF_RECORD_* events & perf_sample fields  : Ok
   7: Parse perf pmu format                      : Ok
   8: DSO data read                              : Ok
   9: DSO data cache                             : Ok
  10: DSO data reopen                            : Ok
  11: Roundtrip evsel->name                      : Ok
  12: Parse sched tracepoints fields             : Ok
  13: syscalls:sys_enter_openat event fields     : Ok
  14: Setup struct perf_event_attr               : Ok
  15: Match and link multiple hists              : Ok
  16: 'import perf' in python                    : Ok
  17: Breakpoint overflow signal handler         : Ok
  18: Breakpoint overflow sampling               : Ok
  19: Number of exit events of a simple workload : Ok
  20: Software clock events period values        : Ok
  21: Object code reading                        : Ok
  22: Sample parsing                             : Ok
  23: Use a dummy software event to keep tracking: Ok
  24: Parse with no sample_id_all bit set        : Ok
  25: Filter hist entries                        : Ok
  26: Lookup mmap thread                         : Ok
  27: Share thread mg                            : Ok
  28: Sort output of hist entries                : Ok
  29: Cumulate child hist entries                : Ok
  30: Track with sched_switch                    : Ok
  31: Filter fds with revents mask in a fdarray  : Ok
  32: Add fd to a fdarray, making it autogrow    : Ok
  33: kmod_path__parse                           : Ok
  34: Thread map                                 : Ok
  35: LLVM search and compile                    :
  35.1: Basic BPF llvm compile                    : Ok
  35.2: kbuild searching                          : Ok
  35.3: Compile source for BPF prologue generation: Ok
  35.4: Compile source for BPF relocation         : Ok
  36: Session topology                           : Ok
  37: BPF filter                                 :
  37.1: Basic BPF filtering                      : Ok
  37.2: BPF prologue generation                  : Ok
  37.3: BPF relocation checker                   : Ok
  38: Synthesize thread map                      : Ok
  39: Remove thread map                          : Ok
  40: Synthesize cpu map                         : Ok
  41: Synthesize stat config                     : Ok
  42: Synthesize stat                            : Ok
  43: Synthesize stat round                      : Ok
  44: Synthesize attr update                     : Ok
  45: Event times                                : Ok
  46: Read backward ring buffer                  : Ok
  47: Print cpu map                              : Ok
  48: Probe SDT events                           : Ok
  49: is_printable_array                         : Ok
  50: Print bitmap                               : Ok
  51: perf hooks                                 : Ok
  52: builtin clang support                      : Skip (not compiled in)
  53: x86 rdpmc                                  : Ok
  54: Convert perf time to TSC                   : Ok
  55: DWARF unwind                               : Ok
  56: x86 instruction decoder - new instructions : Ok
  57: Intel cqm nmi context read                 : Skip
  [root@jouet ~]# 

  # time dm
   1 alpine:3.4: Ok
   2 android-ndk:r12b-arm: Ok
   3 archlinux:latest: Ok
   4 centos:5: Ok
   5 centos:6: Ok
   6 centos:7: Ok
   7 debian:7: Ok
   8 debian:8: Ok
   9 debian:experimental: Ok
  10 debian:experimental-x-mips64: Ok
  11 fedora:20: Ok
  12 fedora:21: Ok
  13 fedora:22: Ok
  14 fedora:23: Ok
  15 fedora:24: Ok
  16 fedora:24-x-ARC-uClibc: Ok
  17 fedora:25: Ok
  18 fedora:rawhide: Ok
  19 mageia:5: Ok
  20 opensuse:13.2: Ok
  21 opensuse:42.1: Ok
  22 opensuse:tumbleweed: Ok
  23 ubuntu:12.04.5: Ok
  24 ubuntu:14.04.4-x-linaro-arm64: Ok
  25 ubuntu:16.04: Ok
  26 ubuntu:16.04-x-arm: Ok
  27 ubuntu:16.04-x-arm64: Ok
  28 ubuntu:16.04-x-powerpc: Ok
  29 ubuntu:16.04-x-powerpc64: Ok
  30 ubuntu:16.04-x-powerpc64el: Ok
  31 ubuntu:16.04-x-s390: Ok
  32 ubuntu:16.10: Ok
  #

  $ make -C tools/perf build-test
  make: Entering directory `/home/acme/git/linux/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
                make_minimal_O: cd . && make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
              make_clean_all_O: cd . && make clean all
            make_no_demangle_O: cd . && make NO_DEMANGLE=1
            make_no_auxtrace_O: cd . && make NO_AUXTRACE=1
         make_with_clangllvm_O: cd . && make LIBCLANGLLVM=1
                 make_static_O: cd . && make LDFLAGS=-static
                 make_cscope_O: cd . && make cscope
                  make_no_ui_O: cd . && make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
   make_install_prefix_slash_O: cd . && make install prefix=/tmp/krava/
       make_util_pmu_bison_o_O: cd . && make util/pmu-bison.o
        make_with_babeltrace_O: cd . && make LIBBABELTRACE=1
             make_no_libnuma_O: cd . && make NO_LIBNUMA=1
                make_no_gtk2_O: cd . && make NO_GTK2=1
           make_no_libbionic_O: cd . && make NO_LIBBIONIC=1
                  make_debug_O: cd . && make DEBUG=1
                   make_tags_O: cd . && make tags
             make_no_libperl_O: cd . && make NO_LIBPERL=1
               make_no_slang_O: cd . && make NO_SLANG=1
             make_util_map_o_O: cd . && make util/map.o
           make_no_backtrace_O: cd . && make NO_BACKTRACE=1
                    make_doc_O: cd . && make doc
         make_install_prefix_O: cd . && make install prefix=/tmp/krava
             make_no_scripts_O: cd . && make NO_LIBPYTHON=1 NO_LIBPERL=1
                 make_perf_o_O: cd . && make perf.o
            make_install_bin_O: cd . && make install-bin
              make_no_libbpf_O: cd . && make NO_LIBBPF=1
           make_no_libunwind_O: cd . && make NO_LIBUNWIND=1
                   make_help_O: cd . && make help
            make_no_libaudit_O: cd . && make NO_LIBAUDIT=1
              make_no_libelf_O: cd . && make NO_LIBELF=1
           make_no_libpython_O: cd . && make NO_LIBPYTHON=1
  make_no_libdw_dwarf_unwind_O: cd . && make NO_LIBDW_DWARF_UNWIND=1
                make_no_newt_O: cd . && make NO_NEWT=1
                   make_pure_O: cd . && make
                make_install_O: cd . && make install
  OK
  make: Leaving directory `/home/acme/git/linux/tools/perf'
  $

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [GIT PULL 00/29] perf/core improvements and fixes
  2016-02-03 16:33 Arnaldo Carvalho de Melo
@ 2016-02-04  7:59 ` Ingo Molnar
  0 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2016-02-04  7:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Andi Kleen, David Ahern, Jiri Olsa, Namhyung Kim,
	Peter Zijlstra, Stephane Eranian, Wang Nan,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> The following changes since commit 8eb22c984eb9eeaed47130f5679acd367af69678:
> 
>   Merge tag 'perf-core-for-mingo-3' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-02-03 11:02:37 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo
> 
> for you to fetch changes up to b62e8dfcda8cb133c062c0e1207afea2476eb7fd:
> 
>   perf hists browser: Add 'L' hotkey to change percent limit (2016-02-03 12:24:22 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> New features:
> 
> - Add 'L' hotkey to dynamicly set the percent threshold for histogram
>   entries and callchains, i.e. dynamicly do what the --percent-limit
>   command line option to 'top' and 'report' does. (Namhyung Kim)
> 
> Infrastructure:
> 
> - Per hists field and sort lists, that will be used, for instance,
>   in the c2c tool (Jiri Olsa)
> 
> Documentation:
> 
> - Update documentation of --sort and --perf-limit options
>   for 'perf report' (Namhyung Kim)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (1):
>       perf top: Move UI initialization ahead of sort setup
> 
> Jiri Olsa (25):
>       perf hists: Factor output_resort from hists__output_resort
>       perf hists: Introduce perf_evsel__output_resort function
>       perf hists: Add _idx fields into struct perf_hpp_fmt
>       perf hists: Use struct perf_hpp_fmt::idx in perf_hpp__reset_width
>       perf hists: Add 'equal' method to perf_hpp_fmt struct
>       perf hists: Add 'hpp__equal' callback function
>       perf hists: Make hpp setup function generic
>       perf report: Move UI initialization ahead of sort setup
>       perf hists: Allocate output sort field
>       perf hists: Remove perf_hpp__column_(disable|enable)
>       perf hists: Properly release format fields
>       perf hists: Separate sort fields parsing into setup_sort_list function
>       perf hists: Separate output fields parsing into setup_output_list function
>       perf hists: Introduce struct perf_hpp_list
>       perf hists: Introduce perf_hpp_list__init function
>       perf hists: Add perf_hpp_list register helpers
>       perf hists: Pass perf_hpp_list all the way through setup_output_list
>       perf hists: Introduce perf_hpp_list__for_each_format macro
>       perf hists: Introduce perf_hpp_list__for_each_format_safe macro
>       perf hists: Introduce perf_hpp_list__for_each_sort_list macro
>       perf hists: Introduce perf_hpp_list__for_each_sort_list_safe macro
>       perf hists: Add struct perf_hpp_list argument to helper functions
>       perf tools: Add hpp_list into struct hists object
>       perf hists: Introduce hists__for_each_format macro
>       perf hists: Introduce hists__for_each_sort_list macro
> 
> Namhyung Kim (3):
>       perf report: Update documentation of --sort option
>       perf report: Update documention of --percent-limit option
>       perf hists browser: Add 'L' hotkey to change percent limit
> 
>  tools/perf/Documentation/perf-report.txt |  37 +++---
>  tools/perf/builtin-annotate.c            |   2 +-
>  tools/perf/builtin-report.c              |  20 +--
>  tools/perf/builtin-top.c                 |  24 ++--
>  tools/perf/tests/hists_cumulate.c        |   2 +-
>  tools/perf/tests/hists_filter.c          |   2 +-
>  tools/perf/tests/hists_output.c          |  10 +-
>  tools/perf/ui/browsers/hists.c           |  63 +++++++++-
>  tools/perf/ui/gtk/hists.c                |   6 +-
>  tools/perf/ui/hist.c                     | 170 +++++++++++++------------
>  tools/perf/ui/stdio/hist.c               |   8 +-
>  tools/perf/util/hist.c                   |  47 +++++--
>  tools/perf/util/hist.h                   |  65 +++++++---
>  tools/perf/util/sort.c                   | 205 +++++++++++++++++++++----------
>  14 files changed, 425 insertions(+), 236 deletions(-)

Pulled, thanks a lot Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [GIT PULL 00/29] perf/core improvements and fixes
@ 2016-02-03 16:33 Arnaldo Carvalho de Melo
  2016-02-04  7:59 ` Ingo Molnar
  0 siblings, 1 reply; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-03 16:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Andi Kleen, David Ahern,
	Jiri Olsa, Namhyung Kim, Peter Zijlstra, Stephane Eranian,
	Wang Nan, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

The following changes since commit 8eb22c984eb9eeaed47130f5679acd367af69678:

  Merge tag 'perf-core-for-mingo-3' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-02-03 11:02:37 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo

for you to fetch changes up to b62e8dfcda8cb133c062c0e1207afea2476eb7fd:

  perf hists browser: Add 'L' hotkey to change percent limit (2016-02-03 12:24:22 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Add 'L' hotkey to dynamicly set the percent threshold for histogram
  entries and callchains, i.e. dynamicly do what the --percent-limit
  command line option to 'top' and 'report' does. (Namhyung Kim)

Infrastructure:

- Per hists field and sort lists, that will be used, for instance,
  in the c2c tool (Jiri Olsa)

Documentation:

- Update documentation of --sort and --perf-limit options
  for 'perf report' (Namhyung Kim)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (1):
      perf top: Move UI initialization ahead of sort setup

Jiri Olsa (25):
      perf hists: Factor output_resort from hists__output_resort
      perf hists: Introduce perf_evsel__output_resort function
      perf hists: Add _idx fields into struct perf_hpp_fmt
      perf hists: Use struct perf_hpp_fmt::idx in perf_hpp__reset_width
      perf hists: Add 'equal' method to perf_hpp_fmt struct
      perf hists: Add 'hpp__equal' callback function
      perf hists: Make hpp setup function generic
      perf report: Move UI initialization ahead of sort setup
      perf hists: Allocate output sort field
      perf hists: Remove perf_hpp__column_(disable|enable)
      perf hists: Properly release format fields
      perf hists: Separate sort fields parsing into setup_sort_list function
      perf hists: Separate output fields parsing into setup_output_list function
      perf hists: Introduce struct perf_hpp_list
      perf hists: Introduce perf_hpp_list__init function
      perf hists: Add perf_hpp_list register helpers
      perf hists: Pass perf_hpp_list all the way through setup_output_list
      perf hists: Introduce perf_hpp_list__for_each_format macro
      perf hists: Introduce perf_hpp_list__for_each_format_safe macro
      perf hists: Introduce perf_hpp_list__for_each_sort_list macro
      perf hists: Introduce perf_hpp_list__for_each_sort_list_safe macro
      perf hists: Add struct perf_hpp_list argument to helper functions
      perf tools: Add hpp_list into struct hists object
      perf hists: Introduce hists__for_each_format macro
      perf hists: Introduce hists__for_each_sort_list macro

Namhyung Kim (3):
      perf report: Update documentation of --sort option
      perf report: Update documention of --percent-limit option
      perf hists browser: Add 'L' hotkey to change percent limit

 tools/perf/Documentation/perf-report.txt |  37 +++---
 tools/perf/builtin-annotate.c            |   2 +-
 tools/perf/builtin-report.c              |  20 +--
 tools/perf/builtin-top.c                 |  24 ++--
 tools/perf/tests/hists_cumulate.c        |   2 +-
 tools/perf/tests/hists_filter.c          |   2 +-
 tools/perf/tests/hists_output.c          |  10 +-
 tools/perf/ui/browsers/hists.c           |  63 +++++++++-
 tools/perf/ui/gtk/hists.c                |   6 +-
 tools/perf/ui/hist.c                     | 170 +++++++++++++------------
 tools/perf/ui/stdio/hist.c               |   8 +-
 tools/perf/util/hist.c                   |  47 +++++--
 tools/perf/util/hist.h                   |  65 +++++++---
 tools/perf/util/sort.c                   | 205 +++++++++++++++++++++----------
 14 files changed, 425 insertions(+), 236 deletions(-)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [GIT PULL 00/29] perf/core improvements and fixes
  2013-11-05  7:52   ` Ingo Molnar
@ 2013-11-05  8:05     ` Ingo Molnar
  0 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2013-11-05  8:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Namhyung Kim
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen, Andrew Morton, Corey Ashford, David Ahern,
	Frederic Weisbecker, Howard Cochran, Jiri Olsa, Masami Hiramatsu,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	Stephane Eranian, Steven Rostedt, Xenia Ragiadakou,
	Yoshihiro YUNOMAE, Arnaldo Carvalho de Melo


* Ingo Molnar <mingo@kernel.org> wrote:

> Now it displays the spammy auto-detection all the time:
> 
> comet:~/tip/tools/perf> make
>   BUILD:   Doing 'make -j12' parallel build
> 
> Auto-detecting system features:
> ...                     backtrace: [ on  ]
> ...                         dwarf: [ on  ]
> ...                fortify-source: [ on  ]
> ...                         glibc: [ on  ]
> ...                          gtk2: [ on  ]
> ...                  gtk2-infobar: [ on  ]
> ...                      libaudit: [ on  ]
> ...                        libbfd: [ on  ]
> ...                        libelf: [ on  ]
> ...             libelf-getphdrnum: [ on  ]
> ...                   libelf-mmap: [ on  ]
> ...                       libnuma: [ on  ]
> ...                       libperl: [ on  ]
> ...                     libpython: [ on  ]
> ...             libpython-version: [ on  ]
> ...                      libslang: [ on  ]
> ...                     libunwind: [ on  ]
> ...                       on-exit: [ on  ]
> ...                stackprotector: [ on  ]
> ...            stackprotector-all: [ on  ]
> 
> maybe a new testcase fails on my box, but is not properly displayed?

Hm, a 'make clean' cleaned up the condition and it now works as expected. 

So what happened is that GCC got upgraded from 4.8.1 to 4.8.2 and 
test-all.c failed to build:

  comet:~/tip/tools/perf/config/feature-checks> make -i
  make: *** No rule to make target `/usr/lib/gcc/x86_64-redhat-linux/4.8.1/include/limits.h', needed by `test-all'.  Stop.

but the individual testcases got built successfully and the perf build was 
fine and correct - it's just that the test-all.c .d file never got updated 
due to the broken dependency.

I think Namhyung warned about this during the feature-detection 
discussion, and I dismissed the concern - but Namhyung was right, it's a 
real problem.

I think to fix this we should automatically clean out potentially stale .d 
files automatically after testcase build failures (be it test-all.c or any 
other testcase)?

That would still preserve all the good roles of these dependencies and it 
would preserve the build speed and detection of system library changes.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [GIT PULL 00/29] perf/core improvements and fixes
  2013-11-04 20:16 ` Ingo Molnar
@ 2013-11-05  7:52   ` Ingo Molnar
  2013-11-05  8:05     ` Ingo Molnar
  0 siblings, 1 reply; 41+ messages in thread
From: Ingo Molnar @ 2013-11-05  7:52 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen, Andrew Morton, Corey Ashford, David Ahern,
	Frederic Weisbecker, Howard Cochran, Jiri Olsa, Masami Hiramatsu,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	Stephane Eranian, Steven Rostedt, Xenia Ragiadakou,
	Yoshihiro YUNOMAE, Arnaldo Carvalho de Melo


hm, probably one of these commits:

8a0c4c2843d3 perf tools: Fix libunwind build and feature detection for 32-bit build
28e962b9d79f perf tools: Fix 32-bit cross build
74af377bc25d perf tools: Fix non-debug build
9402802a416c perf tools: Fix test_on_exit for 32-bit build

broke the suppression of the auto-detection messages on repeat invocations 
of 'make', if all testcases pass.

'make' used to display this on repeat invocations:

 comet:~/tip/tools/perf> make
   BUILD:   Doing 'make -j12' parallel build
 comet:~/tip/tools/perf> make
   BUILD:   Doing 'make -j12' parallel build
 comet:~/tip/tools/perf> make
   BUILD:   Doing 'make -j12' parallel build

Now it displays the spammy auto-detection all the time:

comet:~/tip/tools/perf> make
  BUILD:   Doing 'make -j12' parallel build

Auto-detecting system features:
...                     backtrace: [ on  ]
...                         dwarf: [ on  ]
...                fortify-source: [ on  ]
...                         glibc: [ on  ]
...                          gtk2: [ on  ]
...                  gtk2-infobar: [ on  ]
...                      libaudit: [ on  ]
...                        libbfd: [ on  ]
...                        libelf: [ on  ]
...             libelf-getphdrnum: [ on  ]
...                   libelf-mmap: [ on  ]
...                       libnuma: [ on  ]
...                       libperl: [ on  ]
...                     libpython: [ on  ]
...             libpython-version: [ on  ]
...                      libslang: [ on  ]
...                     libunwind: [ on  ]
...                       on-exit: [ on  ]
...                stackprotector: [ on  ]
...            stackprotector-all: [ on  ]

maybe a new testcase fails on my box, but is not properly displayed?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [GIT PULL 00/29] perf/core improvements and fixes
  2013-11-04 17:58 Arnaldo Carvalho de Melo
@ 2013-11-04 20:16 ` Ingo Molnar
  2013-11-05  7:52   ` Ingo Molnar
  0 siblings, 1 reply; 41+ messages in thread
From: Ingo Molnar @ 2013-11-04 20:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen, Andrew Morton, Corey Ashford, David Ahern,
	Frederic Weisbecker, Howard Cochran, Jiri Olsa, Masami Hiramatsu,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	Stephane Eranian, Steven Rostedt, Xenia Ragiadakou,
	Yoshihiro YUNOMAE, Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@infradead.org> wrote:

> From: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> 
> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> The following changes since commit 46d525eae2a076adfde92dca1db12d9a3b8ad8bb:
> 
>   perf test: Update command line callchain attribute tests (2013-11-01 10:42:57 -0300)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-urgent-for-mingo

I guess this wanted to be perf-core-for-mingo?

Pulled those bits, thanks Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [GIT PULL 00/29] perf/core improvements and fixes
@ 2013-11-04 17:58 Arnaldo Carvalho de Melo
  2013-11-04 20:16 ` Ingo Molnar
  0 siblings, 1 reply; 41+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-04 17:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen, Andrew Morton, Corey Ashford, David Ahern,
	Frederic Weisbecker, Howard Cochran, Jiri Olsa, Masami Hiramatsu,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	Stephane Eranian, Steven Rostedt, Xenia Ragiadakou,
	Yoshihiro YUNOMAE, Arnaldo Carvalho de Melo

From: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>

Hi Ingo,

	Please consider pulling,

- Arnaldo

The following changes since commit 46d525eae2a076adfde92dca1db12d9a3b8ad8bb:

  perf test: Update command line callchain attribute tests (2013-11-01 10:42:57 -0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-urgent-for-mingo

for you to fetch changes up to 46d525eae2a076adfde92dca1db12d9a3b8ad8bb:

  perf test: Update command line callchain attribute tests (2013-11-01 10:42:57 -0300)

----------------------------------------------------------------
perf/urgent fixes:

. Fix command line callchain attribute tests to handle the new
  -g/--call-chain semantics.

. Remove cast of non-variadic function to variadic, fixing perf output
  on armhf arch. Fix from Michael Hudson-Doyle.

. Fix 32-bit building of 'perf bench', from Wei Yang.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2019-01-26  9:52 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-25 23:18 [GIT PULL 00/29] perf/core improvements and fixes Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 01/29] perf symbols: Move symbol_conf to separate file Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 02/29] perf annotate: Remove lots of headers from annotate.h Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 03/29] perf tools: Move branch structs to branch.h Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 04/29] perf block-range: Add missing headers Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 05/29] perf symbols: Remove include map.h from dso.h Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 06/29] perf symbols: Remove some unnecessary includes from symbol.h Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 07/29] perf namespaces: Remove namespaces.h from .h headers Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 08/29] perf comm: Remove needless headers from comm.h Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 09/29] perf callchain: No need to include perf.h Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 10/29] tools: Update rbtree implementation Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 11/29] perf machine: Use cached rbtrees Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 12/29] perf callchain: " Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 13/29] perf util: Use cached rbtree for rblists Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 14/29] perf symbols: Use cached rbtrees Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 15/29] perf hist: " Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 16/29] perf sched: " Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 17/29] perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 18/29] perf script python: Add trace_context extension module to sys.modules Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 19/29] perf script python: Use PyBytes for attr in trace-event-python Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 20/29] perf script python: Remove explicit shebang from setup.py Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 21/29] perf script python: Remove explicit shebang from tests/attr.c Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 22/29] perf script python: Remove explicit shebang from Python scripts Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 23/29] perf script python: Add Python3 support to tests/attr.py Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 24/29] perf bpf: Add bpf_map() helper Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 25/29] perf bpf: Convert pid_map() to bpf_map() Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 26/29] perf augmented_raw_syscalls: Use bpf_map() Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 27/29] perf trace: Fixup etcsnoop example Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 28/29] perf bpf examples: Convert etcsnoop to use bpf_map() Arnaldo Carvalho de Melo
2019-01-25 23:18 ` [PATCH 29/29] perf augmented_syscalls: Convert to bpf_map() Arnaldo Carvalho de Melo
2019-01-26  9:52 ` [GIT PULL 00/29] perf/core improvements and fixes Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2019-01-03 12:45 Arnaldo Carvalho de Melo
2019-01-03 13:07 ` Ingo Molnar
2016-12-20 17:03 Arnaldo Carvalho de Melo
2016-12-20 19:15 ` Ingo Molnar
2016-02-03 16:33 Arnaldo Carvalho de Melo
2016-02-04  7:59 ` Ingo Molnar
2013-11-04 17:58 Arnaldo Carvalho de Melo
2013-11-04 20:16 ` Ingo Molnar
2013-11-05  7:52   ` Ingo Molnar
2013-11-05  8:05     ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).