* [GIT PULL 00/28] perf/core improvements and fixes
@ 2018-11-22  3:35 Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 01/28] perf bpf: Add unistd.h to the headers accessible to bpf proggies Arnaldo Carvalho de Melo
                   ` (27 more replies)
  0 siblings, 28 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Shishkin,
	Alexei Starovoitov, Andi Kleen, Andrew Morton, Anton Blanchard,
	Ben Gainey, Ben Hutchings, Borislav Petkov, Daniel Borkmann,
	Dave Kleikamp, David Ahern, David Aldridge, Davidlohr Bueso,
	Edward Cree, Eric Saint-Etienne, Gustavo Luiz Duarte,
	Jason Baron, Jin Yao, Jiri Olsa, Kan Liang, Martin KaFai Lau,
	Milian Wolff, Namhyung Kim, Peter Zijlstra, Pu Wen,
	Ravi Bangoria, Rob Gardner, Stephane Eranian, Thomas Gleixner,
	Thomas Richter, Wang Nan, Yonghong Song, yuzhoujian,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling. Some of these are from before the trip to
Vancouver, others were easier to process before I continue with the
backlog. This took a bit more time than I anticipated, as several patches
broke the build in various places and that had to be fixed. This has
tip/perf/urgent merged.

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit b1a9d7b0190119dad5b9b7841751b5a7586bbc8b:

  Merge tag 'perf-urgent-for-mingo-4.20-20181121' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2018-11-21 15:57:21 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.21-20181122

for you to fetch changes up to f4a0742b3cc1d03b2ff448017b8c714a77e5a261:

  perf pmu: Move *_cpuid_str() weak functions to header.c (2018-11-21 22:39:59 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

- Start using BPF maps in 'perf trace' for filters in the augmented syscalls
  code, keeping the existing code for tracepoint filters so that we can switch
  back and forth while getting everything BPFied (Arnaldo Carvalho de Melo)

- Suppress potential format-truncation warning in the PMU code (Ben Hutchings)

- Introduce 'perf bench epoll', with "wait" and "ctl" benchmarks (Davidlohr Bueso)

- Fix slowness due to -ffunction-sections by sorting the maps by name, thus
  avoiding the use of rb_first/next to traverse all entries looking for a map
  name, which with -ffunction-sections gets to thousands of maps (Eric Saint-Etienne)

- Separate jvmti cmlr check (Jiri Olsa)

- Allow using the stepping when figuring out which JSON files to use for an x86
  processor, so that Cascadelake server, which has the same cpuid as some other
  processor and differs only in the stepping, can be supported (Kan Liang)

- Share code and output format for uregs and iregs 'perf script' output (Milian Wolff)

- Use perf_evsel__is_clocki() for clock events in 'perf stat' (Ravi Bangoria)
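
To illustrate the map lookup change in the list above, here is a minimal
userspace sketch, assuming a simplified stand-in for perf's map structures.
The names struct named_map, sort_maps_by_name() and find_map_by_name() are
hypothetical and are not the functions added to tools/perf/util/map.c. The
idea is just to keep an array of map pointers sorted by name, so that a
lookup becomes a binary search instead of a walk over every entry:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct map: only the name matters here. */
struct named_map {
	const char *name;
};

/* qsort() comparator: the array elements are pointers to maps. */
static int cmp_map_ptr_by_name(const void *a, const void *b)
{
	const struct named_map *ma = *(const struct named_map * const *)a;
	const struct named_map *mb = *(const struct named_map * const *)b;

	return strcmp(ma->name, mb->name);
}

/* bsearch() comparator: the key is a plain string, the element a map pointer. */
static int cmp_key_to_map(const void *key, const void *elem)
{
	const struct named_map *m = *(const struct named_map * const *)elem;

	return strcmp(key, m->name);
}

/* Sort once, whenever the set of maps changes. */
static void sort_maps_by_name(struct named_map **maps, size_t nr)
{
	qsort(maps, nr, sizeof(*maps), cmp_map_ptr_by_name);
}

/* O(log n) lookup on the sorted array; returns NULL when there is no match. */
static struct named_map *find_map_by_name(struct named_map **maps, size_t nr,
					   const char *name)
{
	struct named_map **found = bsearch(name, maps, nr, sizeof(*maps),
					   cmp_key_to_map);

	return found ? *found : NULL;
}
```

With thousands of per-function sections, each lookup then costs a handful of
string comparisons rather than one per map.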

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (15):
      perf bpf: Add unistd.h to the headers accessible to bpf proggies
      perf augmented_syscalls: Filter on a hard coded pid
      perf augmented_syscalls: Remove needless linux/socket.h include
      perf bpf: Add defines for map insertion/lookup
      perf bpf: Add simple pid_filter class accessible to BPF proggies
      perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter
      perf augmented_syscalls: Use pid_filter
      perf evlist: Rename perf_evlist__set_filter* to perf_evlist__set_tp_filter*
      perf trace: Add "_from_option" suffix to trace__set_filter()
      perf trace: See if there is a map named "filtered_pids"
      perf trace: Fill in BPF "filtered_pids" map when present
      perf augmented_syscalls: Remove example hardcoded set of filtered pids
      Revert "perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter"
      perf bpf: Reduce the hardcoded .max_entries for pid_maps
      tools build feature: Check if eventfd() is available

Ben Hutchings (1):
      perf pmu: Suppress potential format-truncation warning

Davidlohr Bueso (3):
      perf bench: Move HAVE_PTHREAD_ATTR_SETAFFINITY_NP into bench.h
      perf bench: Add epoll parallel epoll_wait benchmark
      perf bench: Add epoll_ctl(2) benchmark

Eric Saint-Etienne (1):
      perf symbols: Fix slowness due to -ffunction-section

Jiri Olsa (1):
      perf jvmti: Separate jvmti cmlr check

Kan Liang (3):
      perf vendor events: Add stepping in CPUID string for x86
      perf vendor events: Add JSON metrics for Cascadelake server
      perf pmu: Move *_cpuid_str() weak functions to header.c

Milian Wolff (2):
      perf script: Add newline after uregs output
      perf script: Share code and output format for uregs and iregs output

Pu Wen (1):
      perf tools: Add Hygon Dhyana support

Ravi Bangoria (1):
      perf stat: Use perf_evsel__is_clocki() for clock events

 tools/build/Makefile.feature                       |     1 +
 tools/build/feature/Makefile                       |     8 +
 tools/build/feature/test-all.c                     |     5 +
 tools/build/feature/test-eventfd.c                 |     9 +
 tools/build/feature/test-jvmti-cmlr.c              |    11 +
 tools/build/feature/test-jvmti.c                   |     1 -
 tools/perf/Documentation/perf-bench.txt            |    10 +
 tools/perf/Makefile.config                         |    12 +-
 tools/perf/Makefile.perf                           |     3 +
 tools/perf/arch/x86/util/header.c                  |    66 +-
 tools/perf/arch/x86/util/kvm-stat.c                |     2 +-
 tools/perf/bench/Build                             |     3 +
 tools/perf/bench/bench.h                           |    14 +
 tools/perf/bench/epoll-ctl.c                       |   413 +
 tools/perf/bench/epoll-wait.c                      |   540 +
 tools/perf/bench/futex.h                           |    12 -
 tools/perf/builtin-bench.c                         |    13 +
 tools/perf/builtin-script.c                        |    38 +-
 tools/perf/builtin-trace.c                         |    92 +-
 tools/perf/examples/bpf/augmented_raw_syscalls.c   |    10 +-
 tools/perf/include/bpf/bpf.h                       |    19 +
 tools/perf/include/bpf/pid_filter.h                |    21 +
 tools/perf/include/bpf/unistd.h                    |    10 +
 tools/perf/jvmti/libjvmti.c                        |    12 +
 .../pmu-events/arch/x86/cascadelakex/cache.json    | 10172 +++++++++++++++++++
 .../arch/x86/cascadelakex/clx-metrics.json         |   164 +
 .../arch/x86/cascadelakex/floating-point.json      |    85 +
 .../pmu-events/arch/x86/cascadelakex/frontend.json |   482 +
 .../pmu-events/arch/x86/cascadelakex/memory.json   |  9909 ++++++++++++++++++
 .../pmu-events/arch/x86/cascadelakex/other.json    |  8908 ++++++++++++++++
 .../pmu-events/arch/x86/cascadelakex/pipeline.json |   969 ++
 .../arch/x86/cascadelakex/uncore-memory.json       |   117 +
 .../arch/x86/cascadelakex/uncore-other.json        |   255 +
 .../arch/x86/cascadelakex/virtual-memory.json      |   285 +
 tools/perf/pmu-events/arch/x86/mapfile.csv         |     3 +-
 tools/perf/util/evlist.c                           |    10 +-
 tools/perf/util/evlist.h                           |     6 +-
 tools/perf/util/header.c                           |    39 +
 tools/perf/util/map.c                              |    27 +
 tools/perf/util/map.h                              |     2 +
 tools/perf/util/pmu.c                              |    47 +-
 tools/perf/util/stat-shadow.c                      |     3 +-
 tools/perf/util/symbol.c                           |    15 +-
 43 files changed, 32711 insertions(+), 112 deletions(-)
 create mode 100644 tools/build/feature/test-eventfd.c
 create mode 100644 tools/build/feature/test-jvmti-cmlr.c
 create mode 100644 tools/perf/bench/epoll-ctl.c
 create mode 100644 tools/perf/bench/epoll-wait.c
 create mode 100644 tools/perf/include/bpf/pid_filter.h
 create mode 100644 tools/perf/include/bpf/unistd.h
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/cache.json
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/floating-point.json
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/uncore-memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/uncore-other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/virtual-memory.json

Test results:

The first ones are container (docker) based builds of tools/perf with
and without libelf support.  Where clang is available, it is also used
to build perf with and without libelf.  When clang and its devel
libraries are installed, perf is also built with LIBCLANGLLVM=1
(built-in clang), using both gcc and clang.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from an HTTP server and
building them inside the container, to make it easier to build in a container
cluster. Those will come back later.

Several are cross builds (the ones with -x-ARCH and the android one), and those
may not have all the features built, due to the lack of multi-arch devel
packages, which are available and used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications, intercepting the
sys_perf_event_open syscall to check that the perf_event_attr fields are set
up as expected, among a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. The plan is to run them on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if you are interested in helping to put this in place.

  # dm
   1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   5 alpine:3.8                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   6 alpine:edge                   : Ok   gcc (Alpine 6.4.0) 6.4.0
   7 amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
   8 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
   9 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  10 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  11 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  12 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
  13 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
  14 clearlinux:latest             : Ok   gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502
  15 debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  16 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u1) 4.9.2
  17 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
  18 debian:experimental           : Ok   gcc (Debian 8.2.0-9) 8.2.0
  19 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.2.0-9) 8.2.0
  20 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0
  21 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 8.2.0-9) 8.2.0
  22 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0
  23 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  24 fedora:21                     : Ok   gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
  25 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  26 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  27 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  28 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  29 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  30 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
  31 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)
  32 fedora:28                     : Ok   gcc (GCC) 8.2.1 20181011 (Red Hat 8.2.1-4)
  33 fedora:29                     : Ok   gcc (GCC) 8.2.1 20181011 (Red Hat 8.2.1-4)
  34 fedora:rawhide                : Ok   gcc (GCC) 8.2.1 20181011 (Red Hat 8.2.1-4)
  35 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 7.3.0-r3 p1.4) 7.3.0
  36 mageia:5                      : Ok   gcc (GCC) 4.9.2
  37 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0
  38 opensuse:13.2                 : Ok   gcc (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
  39 opensuse:42.1                 : Ok   gcc (SUSE Linux) 4.8.5
  40 opensuse:42.2                 : Ok   gcc (SUSE Linux) 4.8.5
  41 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5
  42 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
  43 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
  44 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36.0.1)
  45 ubuntu:12.04.5                : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  46 ubuntu:14.04.4                : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
  47 ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
  48 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  49 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  50 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  51 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  52 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  53 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  54 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  55 ubuntu:16.10                  : Ok   gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
  56 ubuntu:17.10                  : Ok   gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
  57 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  58 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
  59 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
  60 ubuntu:18.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  61 ubuntu:18.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  62 ubuntu:18.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  63 ubuntu:18.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  64 ubuntu:18.04-x-sparc64        : Ok   sparc64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
  65 ubuntu:18.10                  : Ok   gcc (Ubuntu 8.2.0-7ubuntu1) 8.2.0

  # uname -a
  Linux seventh 4.19.0-rc8-00014-gc0cff31be705 #1 SMP Wed Oct 17 09:00:22 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
  # git log --oneline -1
  f4a0742b3cc1 perf pmu: Move *_cpuid_str() weak functions to header.c
  # perf version --build-options
  perf version 4.20.rc3.gf4a074
                   dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                   glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                    gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
           syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                  libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
               libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
               get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Breakpoint accounting                                 : Ok
  22: Watchpoint                                            :
  22.1: Read Only Watchpoint                                : Skip
  22.2: Write Only Watchpoint                               : Ok
  22.3: Read / Write Watchpoint                             : Ok
  22.4: Modify Watchpoint                                   : Ok
  23: Number of exit events of a simple workload            : Ok
  24: Software clock events period values                   : Ok
  25: Object code reading                                   : Ok
  26: Sample parsing                                        : Ok
  27: Use a dummy software event to keep tracking           : Ok
  28: Parse with no sample_id_all bit set                   : Ok
  29: Filter hist entries                                   : Ok
  30: Lookup mmap thread                                    : Ok
  31: Share thread mg                                       : Ok
  32: Sort output of hist entries                           : Ok
  33: Cumulate child hist entries                           : Ok
  34: Track with sched_switch                               : Ok
  35: Filter fds with revents mask in a fdarray             : Ok
  36: Add fd to a fdarray, making it autogrow               : Ok
  37: kmod_path__parse                                      : Ok
  38: Thread map                                            : Ok
  39: LLVM search and compile                               :
  39.1: Basic BPF llvm compile                              : Ok
  39.2: kbuild searching                                    : Ok
  39.3: Compile source for BPF prologue generation          : Ok
  39.4: Compile source for BPF relocation                   : Ok
  40: Session topology                                      : Ok
  41: BPF filter                                            :
  41.1: Basic BPF filtering                                 : Ok
  41.2: BPF pinning                                         : Ok
  41.3: BPF prologue generation                             : Ok
  41.4: BPF relocation checker                              : Ok
  42: Synthesize thread map                                 : Ok
  43: Remove thread map                                     : Ok
  44: Synthesize cpu map                                    : Ok
  45: Synthesize stat config                                : Ok
  46: Synthesize stat                                       : Ok
  47: Synthesize stat round                                 : Ok
  48: Synthesize attr update                                : Ok
  49: Event times                                           : Ok
  50: Read backward ring buffer                             : Ok
  51: Print cpu map                                         : Ok
  52: Probe SDT events                                      : Ok
  53: is_printable_array                                    : Ok
  54: Print bitmap                                          : Ok
  55: perf hooks                                            : Ok
  56: builtin clang support                                 : Skip (not compiled in)
  57: unit_number__scnprintf                                : Ok
  58: mem2node                                              : Ok
  59: x86 rdpmc                                             : Ok
  60: Convert perf time to TSC                              : Ok
  61: DWARF unwind                                          : Ok
  62: x86 instruction decoder - new instructions            : Ok
  63: x86 bp modify                                         : Ok
  64: probe libc's inet_pton & backtrace it with ping       : Ok
  65: Check open filename arg using perf trace + vfs_getname: Ok
  66: Use vfs_getname probe to get syscall args filenames   : Ok
  67: Add vfs_getname probe to get syscall args filenames   : Ok

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
                   make_pure_O: make
            make_no_demangle_O: make NO_DEMANGLE=1
                  make_debug_O: make DEBUG=1
                make_no_newt_O: make NO_NEWT=1
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
        make_with_babeltrace_O: make LIBBABELTRACE=1
                make_no_gtk2_O: make NO_GTK2=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
            make_install_bin_O: make install-bin
             make_util_map_o_O: make util/map.o
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
                    make_doc_O: make doc
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
           make_no_libbionic_O: make NO_LIBBIONIC=1
                   make_tags_O: make tags
         make_with_clangllvm_O: make LIBCLANGLLVM=1
       make_util_pmu_bison_o_O: make util/pmu-bison.o
              make_no_libelf_O: make NO_LIBELF=1
           make_no_libpython_O: make NO_LIBPYTHON=1
           make_no_backtrace_O: make NO_BACKTRACE=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
              make_clean_all_O: make clean all
                make_install_O: make install
                 make_perf_o_O: make perf.o
            make_no_auxtrace_O: make NO_AUXTRACE=1
              make_no_libbpf_O: make NO_LIBBPF=1
               make_no_slang_O: make NO_SLANG=1
         make_install_prefix_O: make install prefix=/tmp/krava
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
             make_no_libperl_O: make NO_LIBPERL=1
           make_no_libunwind_O: make NO_LIBUNWIND=1
                   make_help_O: make help
                 make_static_O: make LDFLAGS=-static
             make_no_libnuma_O: make NO_LIBNUMA=1
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $


* [PATCH 01/28] perf bpf: Add unistd.h to the headers accessible to bpf proggies
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 02/28] perf augmented_syscalls: Filter on a hard coded pid Arnaldo Carvalho de Melo
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Start with a getpid() function wrapping BPF_FUNC_get_current_pid_tgid;
the idea is to mimic the system headers.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zo8hv22onidep7tm785dzxfk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/include/bpf/unistd.h | 10 ++++++++++
 1 file changed, 10 insertions(+)
 create mode 100644 tools/perf/include/bpf/unistd.h

diff --git a/tools/perf/include/bpf/unistd.h b/tools/perf/include/bpf/unistd.h
new file mode 100644
index 000000000000..ca7877f9a976
--- /dev/null
+++ b/tools/perf/include/bpf/unistd.h
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: LGPL-2.1
+
+#include <bpf.h>
+
+static int (*bpf_get_current_pid_tgid)(void) = (void *)BPF_FUNC_get_current_pid_tgid;
+
+static pid_t getpid(void)
+{
+	return bpf_get_current_pid_tgid();
+}
-- 
2.14.5
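
A note on what the wrapper above returns, as a plain userspace sketch rather
than BPF code. The kernel helper behind BPF_FUNC_get_current_pid_tgid returns
a 64-bit value with the tgid (what userspace getpid() reports) in the upper
32 bits and the kernel pid (the thread id) in the lower 32 bits. Since the
function pointer in this patch is declared as returning int, getpid() here
appears to end up with the lower half. The helper names tgid_of() and
pid_of() below are made up for illustration:

```c
#include <stdint.h>

/*
 * Userspace model of the value returned by the BPF helper
 * bpf_get_current_pid_tgid(): tgid in bits 63..32, pid in bits 31..0.
 * tgid_of() and pid_of() are hypothetical names used only in this sketch.
 */
static inline uint32_t tgid_of(uint64_t pid_tgid)
{
	return (uint32_t)(pid_tgid >> 32);	/* process id as seen by userspace */
}

static inline uint32_t pid_of(uint64_t pid_tgid)
{
	return (uint32_t)pid_tgid;		/* kernel pid, i.e. the thread id */
}
```

For a single threaded process both halves are the same number, so a filter
on either one behaves identically; they differ only for the non-leader
threads of a multithreaded process.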



* [PATCH 02/28] perf augmented_syscalls: Filter on a hard coded pid
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 01/28] perf bpf: Add unistd.h to the headers accessible to bpf proggies Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 03/28] perf augmented_syscalls: Remove needless linux/socket.h include Arnaldo Carvalho de Melo
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Just to show where we'll hook pid based filters, and what we use to
obtain the current pid, using a BPF getpid() equivalent.

Now we need to replace that hardcoded PID with a BPF hash map, so that we
can start by filtering 'perf trace's own PID, then implement the
--filter-pid functionality, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-oshrcgcekiyhd0whwisxfvtv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/examples/bpf/augmented_raw_syscalls.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 90a19336310b..2feb00018f79 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -15,6 +15,7 @@
  */
 
 #include <stdio.h>
+#include <unistd.h>
 #include <linux/socket.h>
 
 /* bpf-output associated map */
@@ -56,6 +57,9 @@ int sys_enter(struct syscall_enter_args *args)
 	unsigned int len = sizeof(augmented_args);
 	const void *filename_arg = NULL;
 
+	if (getpid() == 2971)
+		return 0;
+
 	probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
 	/*
 	 * Yonghong and Edward Cree sayz:
@@ -125,7 +129,7 @@ int sys_enter(struct syscall_enter_args *args)
 SEC("raw_syscalls:sys_exit")
 int sys_exit(struct syscall_exit_args *args)
 {
-	return 1; /* 0 as soon as we start copying data returned by the kernel, e.g. 'read' */
+	return getpid() != 2971;
 }
 
 license(GPL);
-- 
2.14.5



* [PATCH 03/28] perf augmented_syscalls: Remove needless linux/socket.h include
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 01/28] perf bpf: Add unistd.h to the headers accessible to bpf proggies Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 02/28] perf augmented_syscalls: Filter on a hard coded pid Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 04/28] perf bpf: Add defines for map insertion/lookup Arnaldo Carvalho de Melo
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Leftover from when we started augmented_raw_syscalls.c from
tools/perf/examples/bpf/augmented_syscalls.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: e58a0322dbac ("perf examples bpf: Start augmenting raw_syscalls:sys_{start,exit}")
Link: https://lkml.kernel.org/n/tip-pmts9ls2skh8n3zisb4txudd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/examples/bpf/augmented_raw_syscalls.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 2feb00018f79..ec109c12ff24 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -16,7 +16,6 @@
 
 #include <stdio.h>
 #include <unistd.h>
-#include <linux/socket.h>
 
 /* bpf-output associated map */
 struct bpf_map SEC("maps") __augmented_syscalls__ = {
-- 
2.14.5



* [PATCH 04/28] perf bpf: Add defines for map insertion/lookup
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 03/28] perf augmented_syscalls: Remove needless linux/socket.h include Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 05/28] perf bpf: Add simple pid_filter class accessible to BPF proggies Arnaldo Carvalho de Melo
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Start with pid_map(), a helper that declares a basic hash map using a
pid as the key.
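
To make the expansion concrete, here is a minimal user-space sketch of
what pid_map() declares. The SEC() stand-in, the BPF_MAP_TYPE_HASH
define and the abbreviated struct bpf_map are assumptions made only so
that the macro compiles outside perf's BPF build environment; the macro
body itself is the one from the patch below.

```c
#include <assert.h>     /* only for the checks in the usage note */
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>  /* pid_t */

/* Stand-ins (assumptions) so this builds without perf's bpf.h. */
#define BPF_MAP_TYPE_HASH 1     /* same value as in the kernel UAPI enum */
#define SEC(NAME)               /* the real macro sets an ELF section */

/* Abbreviated: only the fields that pid_map() initializes. */
struct bpf_map {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
};

/* Same macro body as in the patch. */
#define pid_map(name, value_type)		\
struct bpf_map SEC("maps") name = {		\
	.type	     = BPF_MAP_TYPE_HASH,	\
	.key_size    = sizeof(pid_t),		\
	.value_size  = sizeof(value_type),	\
	.max_entries = 512,			\
}

/* Declares a hash map keyed by pid_t with a bool payload. */
pid_map(pids_filtered, bool);
```

After that declaration, pids_filtered.key_size is sizeof(pid_t),
pids_filtered.value_size is sizeof(bool) and the map holds at most 512
entries.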

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-gdwvq53wltvq6b3g5tdmh0cw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/include/bpf/bpf.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/include/bpf/bpf.h b/tools/perf/include/bpf/bpf.h
index 52b6d87fe822..04ecd425a237 100644
--- a/tools/perf/include/bpf/bpf.h
+++ b/tools/perf/include/bpf/bpf.h
@@ -18,6 +18,17 @@ struct bpf_map {
         unsigned int numa_node;
 };
 
+#define pid_map(name, value_type)		\
+struct bpf_map SEC("maps") name = {		\
+	.type	     = BPF_MAP_TYPE_HASH,	\
+	.key_size    = sizeof(pid_t),		\
+	.value_size  = sizeof(value_type),	\
+	.max_entries = 512,			\
+}
+
+static int (*bpf_map_update_elem)(struct bpf_map *map, void *key, void *value, u64 flags) = (void *)BPF_FUNC_map_update_elem;
+static void *(*bpf_map_lookup_elem)(struct bpf_map *map, void *key) = (void *)BPF_FUNC_map_lookup_elem;
+
 #define SEC(NAME) __attribute__((section(NAME),  used))
 
 #define probe(function, vars) \
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 05/28] perf bpf: Add simple pid_filter class accessible to BPF proggies
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 04/28] perf bpf: Add defines for map insertion/lookup Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 06/28] perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter Arnaldo Carvalho de Melo
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

It will be used in augmented_raw_syscalls.c to implement 'perf trace
--filter-pids'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-9sybmz4vchlbpqwx2am13h9e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/include/bpf/pid_filter.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100644 tools/perf/include/bpf/pid_filter.h

diff --git a/tools/perf/include/bpf/pid_filter.h b/tools/perf/include/bpf/pid_filter.h
new file mode 100644
index 000000000000..6e61c4bdf548
--- /dev/null
+++ b/tools/perf/include/bpf/pid_filter.h
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: LGPL-2.1
+
+#ifndef _PERF_BPF_PID_FILTER_
+#define _PERF_BPF_PID_FILTER_
+
+#include <bpf.h>
+
+#define pid_filter(name) pid_map(name, bool)
+
+static int pid_filter__add(struct bpf_map *pids, pid_t pid)
+{
+	bool value = true;
+	return bpf_map_update_elem(pids, &pid, &value, BPF_NOEXIST);
+}
+
+static bool pid_filter__has(struct bpf_map *pids, pid_t pid)
+{
+	return bpf_map_lookup_elem(pids, &pid) != NULL;
+}
+
+#endif // _PERF_BPF_PID_FILTER_
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 06/28] perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 05/28] perf bpf: Add simple pid_filter class accessible to BPF proggies Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 07/28] perf augmented_syscalls: Use pid_filter Arnaldo Carvalho de Melo
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

When testing system wide tracing without filtering the syscalls made by
'perf trace' itself, we get into a feedback loop. For now, drop these
two syscalls, which are the ones 'perf trace' makes in its loop for
writing out the syscalls it intercepts, to help with testing until that
filtering is in place.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-rkbu536af66dbsfx51sr8yof@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/examples/bpf/augmented_raw_syscalls.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index ec109c12ff24..7d729319618c 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -43,7 +43,9 @@ struct augmented_filename {
 	char		value[256];
 };
 
+#define SYS_WRITE 1
 #define SYS_OPEN 2
+#define SYS_POLL 7
 #define SYS_OPENAT 257
 
 SEC("raw_syscalls:sys_enter")
@@ -101,6 +103,8 @@ int sys_enter(struct syscall_enter_args *args)
 	 * 	 after the ctx memory access to prevent their down stream merging.
 	 */
 	switch (augmented_args.args.syscall_nr) {
+	case SYS_WRITE:
+	case SYS_POLL:	 return 0;
 	case SYS_OPEN:	 filename_arg = (const void *)args->args[0];
 			__asm__ __volatile__("": : :"memory");
 			 break;
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 07/28] perf augmented_syscalls: Use pid_filter
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 06/28] perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 08/28] perf evlist: Rename perf_evlist__set_filter* to perf_evlist__set_tp_filter* Arnaldo Carvalho de Melo
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

This is just to test filtering a bunch of pids. Now it's time to get
that hooked up in 'perf trace': right after we load the BPF program, if
we find a "pids_filtered" map defined, we'll populate it with the
filtered pids.
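
The "init only once" trick used in the patch below relies on
BPF_NOEXIST making an update fail when the key is already there. Here
is a minimal user-space sketch of that behaviour. The mock_* names, the
array-backed map and the nr_full_inits counter are hypothetical and
only stand in for the real BPF hash map; the pids are a subset of the
hard coded ones in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define MOCK_BPF_NOEXIST 1	/* mirrors BPF_NOEXIST: create only if absent */
#define MOCK_MAX_ENTRIES 512

/* Hypothetical stand-in for a BPF hash map keyed by pid. */
struct mock_pid_map {
	pid_t  keys[MOCK_MAX_ENTRIES];
	size_t nr;
};

static bool mock_map_has(const struct mock_pid_map *map, pid_t pid)
{
	for (size_t i = 0; i < map->nr; i++)
		if (map->keys[i] == pid)
			return true;
	return false;
}

/* Returns 0 on success, -EEXIST if the key exists and NOEXIST was asked. */
static int mock_map_update(struct mock_pid_map *map, pid_t pid, int flags)
{
	assert(map != NULL);
	if (mock_map_has(map, pid))
		return flags == MOCK_BPF_NOEXIST ? -EEXIST : 0;
	if (map->nr == MOCK_MAX_ENTRIES)
		return -E2BIG;
	map->keys[map->nr++] = pid;
	return 0;
}

static struct mock_pid_map pids_filtered;
static int nr_full_inits;	/* sketch-only bookkeeping */

static void pid_filter__init(void)
{
	/* The first pid doubles as the "already initialized" marker. */
	if (mock_map_update(&pids_filtered, 2971, MOCK_BPF_NOEXIST))
		return;
	nr_full_inits++;
	mock_map_update(&pids_filtered, 20016, MOCK_BPF_NOEXIST);
	mock_map_update(&pids_filtered, 883, MOCK_BPF_NOEXIST);
}
```

Calling pid_filter__init() on every event is then cheap after the
first call: the first update fails and the function returns before
touching the remaining pids.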

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-1i9s27wqqdhafk3fappow84x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/examples/bpf/augmented_raw_syscalls.c | 34 ++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 7d729319618c..5fed1eff889d 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -16,6 +16,7 @@
 
 #include <stdio.h>
 #include <unistd.h>
+#include <pid_filter.h>
 
 /* bpf-output associated map */
 struct bpf_map SEC("maps") __augmented_syscalls__ = {
@@ -48,6 +49,29 @@ struct augmented_filename {
 #define SYS_POLL 7
 #define SYS_OPENAT 257
 
+pid_filter(pids_filtered);
+
+static void pid_filter__init(void)
+{
+	/*
+	 * Filter a bunch of pids: gnome-shell, kvm, firefox threads,
+	 * avahi-daemon, etc, just for testing as we go along.
+	 *
+	 * These will come from 'perf trace --filter-pids' in a explicit way
+	 * and also it will filter out itself, to avoid the feedback loop:
+	 * syscalls 'perf trace' does gets caught, reported, causing new
+	 * syscalls to get emitted, rinse repeat forever.
+	 */
+	if (pid_filter__add(&pids_filtered, 2971))
+		return; /* pid_filter__init() was already called, bail out */
+	pid_filter__add(&pids_filtered, 20016);
+	pid_filter__add(&pids_filtered, 12018);
+	pid_filter__add(&pids_filtered, 2310);
+	pid_filter__add(&pids_filtered, 3759);
+	pid_filter__add(&pids_filtered, 25978);
+	pid_filter__add(&pids_filtered, 883);
+}
+
 SEC("raw_syscalls:sys_enter")
 int sys_enter(struct syscall_enter_args *args)
 {
@@ -57,8 +81,14 @@ int sys_enter(struct syscall_enter_args *args)
 	} augmented_args;
 	unsigned int len = sizeof(augmented_args);
 	const void *filename_arg = NULL;
+	/*
+ 	 * We still don't have a "main()" called first and only once
+ 	 * call it always, it will exit as soon as it realizes the
+ 	 * first hard coded filtered pid was already added.
+ 	 */
+	pid_filter__init();
 
-	if (getpid() == 2971)
+	if (pid_filter__has(&pids_filtered, getpid()))
 		return 0;
 
 	probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
@@ -132,7 +162,7 @@ int sys_enter(struct syscall_enter_args *args)
 SEC("raw_syscalls:sys_exit")
 int sys_exit(struct syscall_exit_args *args)
 {
-	return getpid() != 2971;
+	return !pid_filter__has(&pids_filtered, getpid());
 }
 
 license(GPL);
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 08/28] perf evlist: Rename perf_evlist__set_filter* to perf_evlist__set_tp_filter*
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 07/28] perf augmented_syscalls: Use pid_filter Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 09/28] perf trace: Add "_from_option" suffix to trace__set_filter() Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

To better reflect that this is a tracepoint filter, as opposed to, for
instance, map based BPF filters.
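
For contrast with the map based approach, a tracepoint pid filter is an
expression string handed to the kernel. Here is a rough sketch of
building one. The "common_pid != N" terms joined by "&&" reflect my
understanding of what perf_evlist__set_tp_filter_pids() generates, which
is not shown in this patch, so treat the exact format as an assumption;
build_tp_pid_filter() is a hypothetical helper, not a perf function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: writes "common_pid != A && common_pid != B ..."
 * into buf. Returns 0 on success, -1 if buf is too small.
 */
static int build_tp_pid_filter(char *buf, size_t size,
			       const pid_t *pids, size_t npids)
{
	size_t used = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < npids; i++) {
		int n = snprintf(buf + used, size - used, "%scommon_pid != %d",
				 i ? " && " : "", (int)pids[i]);

		/* snprintf() reports the untruncated length. */
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}

	return 0;
}
```

With the pids 2469 and 1663 this produces the string
"common_pid != 2469 && common_pid != 1663".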

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-9138svli6ddcphrr3ymy9oy3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c |  4 ++--
 tools/perf/util/evlist.c   | 10 +++++-----
 tools/perf/util/evlist.h   |  6 +++---
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 835619476370..d86ba17fcf44 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2586,7 +2586,7 @@ static int trace__set_filter_loop_pids(struct trace *trace)
 		thread = parent;
 	}
 
-	return perf_evlist__set_filter_pids(trace->evlist, nr, pids);
+	return perf_evlist__set_tp_filter_pids(trace->evlist, nr, pids);
 }
 
 static int trace__run(struct trace *trace, int argc, const char **argv)
@@ -2702,7 +2702,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	 * we fork the workload in perf_evlist__prepare_workload.
 	 */
 	if (trace->filter_pids.nr > 0)
-		err = perf_evlist__set_filter_pids(evlist, trace->filter_pids.nr, trace->filter_pids.entries);
+		err = perf_evlist__set_tp_filter_pids(evlist, trace->filter_pids.nr, trace->filter_pids.entries);
 	else if (thread_map__pid(evlist->threads, 0) == -1)
 		err = trace__set_filter_loop_pids(trace);
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 668d2a9ef0f4..36526d229315 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1176,7 +1176,7 @@ int perf_evlist__apply_filters(struct perf_evlist *evlist, struct perf_evsel **e
 	return err;
 }
 
-int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter)
+int perf_evlist__set_tp_filter(struct perf_evlist *evlist, const char *filter)
 {
 	struct perf_evsel *evsel;
 	int err = 0;
@@ -1193,7 +1193,7 @@ int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter)
 	return err;
 }
 
-int perf_evlist__set_filter_pids(struct perf_evlist *evlist, size_t npids, pid_t *pids)
+int perf_evlist__set_tp_filter_pids(struct perf_evlist *evlist, size_t npids, pid_t *pids)
 {
 	char *filter;
 	int ret = -1;
@@ -1214,15 +1214,15 @@ int perf_evlist__set_filter_pids(struct perf_evlist *evlist, size_t npids, pid_t
 		}
 	}
 
-	ret = perf_evlist__set_filter(evlist, filter);
+	ret = perf_evlist__set_tp_filter(evlist, filter);
 out_free:
 	free(filter);
 	return ret;
 }
 
-int perf_evlist__set_filter_pid(struct perf_evlist *evlist, pid_t pid)
+int perf_evlist__set_tp_filter_pid(struct perf_evlist *evlist, pid_t pid)
 {
-	return perf_evlist__set_filter_pids(evlist, 1, &pid);
+	return perf_evlist__set_tp_filter_pids(evlist, 1, &pid);
 }
 
 bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 9919eed6d15b..d108d167eb36 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -98,9 +98,9 @@ void __perf_evlist__reset_sample_bit(struct perf_evlist *evlist,
 #define perf_evlist__reset_sample_bit(evlist, bit) \
 	__perf_evlist__reset_sample_bit(evlist, PERF_SAMPLE_##bit)
 
-int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter);
-int perf_evlist__set_filter_pid(struct perf_evlist *evlist, pid_t pid);
-int perf_evlist__set_filter_pids(struct perf_evlist *evlist, size_t npids, pid_t *pids);
+int perf_evlist__set_tp_filter(struct perf_evlist *evlist, const char *filter);
+int perf_evlist__set_tp_filter_pid(struct perf_evlist *evlist, pid_t pid);
+int perf_evlist__set_tp_filter_pids(struct perf_evlist *evlist, size_t npids, pid_t *pids);
 
 struct perf_evsel *
 perf_evlist__find_tracepoint_by_id(struct perf_evlist *evlist, int id);
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 09/28] perf trace: Add "_from_option" suffix to trace__set_filter()
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 08/28] perf evlist: Rename perf_evlist__set_filter* to perf_evlist__set_tp_filter* Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 10/28] perf trace: See if there is a map named "filtered_pids" Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We'll need the trace__set_filter_pids() name for a new function that
sets pid filters for both tracepoints and BPF maps.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-mdkck6hf3fnd21rz2766280q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index d86ba17fcf44..8f966c7a7d0d 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3104,8 +3104,8 @@ static int trace__set_duration(const struct option *opt, const char *str,
 	return 0;
 }
 
-static int trace__set_filter_pids(const struct option *opt, const char *str,
-				  int unset __maybe_unused)
+static int trace__set_filter_pids_from_option(const struct option *opt, const char *str,
+					      int unset __maybe_unused)
 {
 	int ret = -1;
 	size_t i;
@@ -3363,7 +3363,7 @@ int cmd_trace(int argc, const char **argv)
 	OPT_STRING('t', "tid", &trace.opts.target.tid, "tid",
 		    "trace events on existing thread id"),
 	OPT_CALLBACK(0, "filter-pids", &trace, "CSV list of pids",
-		     "pids to filter (by the kernel)", trace__set_filter_pids),
+		     "pids to filter (by the kernel)", trace__set_filter_pids_from_option),
 	OPT_BOOLEAN('a', "all-cpus", &trace.opts.target.system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('C', "cpu", &trace.opts.target.cpu_list, "cpu",
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 10/28] perf trace: See if there is a map named "filtered_pids"
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 09/28] perf trace: Add "_from_option" suffix to trace__set_filter() Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 11/28] perf trace: Fill in BPF "filtered_pids" map when present Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Look up the first map named "pids_filtered" and, if augmenting
syscalls, i.e. if a BPF event is present and "__augmented_syscalls__"
is present, fill in that map with the pids to filter, be it the
feedback loop ones (perf trace's pid, its parent if it is "sshd", with
more to be auto-filtered in the future) or the ones explicitly stated
on the tool command line via --filter-pids.

The code to actually fill in the map comes next.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-rhzytmw7qpe6lqyjxi1ded9t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 8f966c7a7d0d..c423a78b5ecd 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -18,6 +18,7 @@
 
 #include <traceevent/event-parse.h>
 #include <api/fs/tracing_path.h>
+#include <bpf/bpf.h>
 #include "builtin.h"
 #include "util/cgroup.h"
 #include "util/color.h"
@@ -99,6 +100,7 @@ struct trace {
 	struct {
 		size_t		nr;
 		pid_t		*entries;
+		struct bpf_map  *map;
 	}			filter_pids;
 	double			duration_filter;
 	double			runtime_ms;
@@ -3315,6 +3317,25 @@ static int trace__parse_cgroups(const struct option *opt, const char *str, int u
 	return 0;
 }
 
+static struct bpf_map *bpf__find_map_by_name(const char *name)
+{
+	struct bpf_object *obj, *tmp;
+
+	bpf_object__for_each_safe(obj, tmp) {
+		struct bpf_map *map = bpf_object__find_map_by_name(obj, name);
+		if (map)
+			return map;
+
+	}
+
+	return NULL;
+}
+
+static void trace__set_bpf_map_filtered_pids(struct trace *trace)
+{
+	trace->filter_pids.map = bpf__find_map_by_name("pids_filtered");
+}
+
 int cmd_trace(int argc, const char **argv)
 {
 	const char *trace_usage[] = {
@@ -3451,8 +3472,10 @@ int cmd_trace(int argc, const char **argv)
 		goto out;
 	}
 
-	if (evsel)
+	if (evsel) {
 		trace.syscalls.events.augmented = evsel;
+		trace__set_bpf_map_filtered_pids(&trace);
+	}
 
 	err = bpf__setup_stdout(trace.evlist);
 	if (err) {
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 11/28] perf trace: Fill in BPF "filtered_pids" map when present
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 10/28] perf trace: See if there is a map named "filtered_pids" Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 12/28] perf augmented_syscalls: Remove example hardcoded set of filtered pids Arnaldo Carvalho de Melo
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

This makes the augmented_syscalls support the --filter-pids and
auto-filtered feedback loop pids just like when working without BPF,
i.e. with just raw_syscalls:sys_{enter,exit} and tracepoint filters.
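
The decision made by the new trace__set_filter_pids() can be sketched
as follows. All sketch_* names and the counters are hypothetical and
only model which pid set ends up in the tracepoint filter and in the
BPF map; the real calls are the ones in the patch below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, reduced view of the state the real function consults. */
struct sketch_trace {
	size_t nr_explicit;	/* pids given via --filter-pids */
	bool   system_wide;	/* no target thread was specified */
	bool   has_bpf_map;	/* a "pids_filtered" map was found */
	/* sketch-only bookkeeping */
	size_t tp_filtered;	/* pids placed in the tracepoint filter */
	size_t map_filtered;	/* pids inserted in the BPF map */
};

/* Apply the same pid set to the tracepoint filter and, if any, the map. */
static int sketch_apply(struct sketch_trace *t, size_t npids)
{
	t->tp_filtered = npids;
	if (t->has_bpf_map)
		t->map_filtered = npids;
	return 0;
}

static int sketch_set_filter_pids(struct sketch_trace *t, size_t nr_loop_pids)
{
	assert(t != NULL);

	if (t->nr_explicit > 0)		/* --filter-pids takes precedence */
		return sketch_apply(t, t->nr_explicit);
	if (t->system_wide)		/* else filter the feedback loop pids */
		return sketch_apply(t, nr_loop_pids);
	return 0;			/* specific target: nothing to filter */
}
```

The point of the sketch is that both filtering mechanisms always
receive the same pid set, with the BPF map being optional.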

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zc5n453sxxm0tz1zfwwelyti@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 61 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 48 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index c423a78b5ecd..8e3c3f74a3a4 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2567,9 +2567,27 @@ static int trace__set_ev_qualifier_filter(struct trace *trace)
 	goto out;
 }
 
+static int bpf_map__set_filter_pids(struct bpf_map *map __maybe_unused,
+				    size_t npids __maybe_unused, pid_t *pids __maybe_unused)
+{
+	int err = 0;
+#ifdef HAVE_LIBBPF_SUPPORT
+	bool value = true;
+	int map_fd = bpf_map__fd(map);
+	size_t i;
+
+	for (i = 0; i < npids; ++i) {
+		err = bpf_map_update_elem(map_fd, &pids[i], &value, BPF_ANY);
+		if (err)
+			break;
+	}
+#endif
+	return err;
+}
+
 static int trace__set_filter_loop_pids(struct trace *trace)
 {
-	unsigned int nr = 1;
+	unsigned int nr = 1, err;
 	pid_t pids[32] = {
 		getpid(),
 	};
@@ -2588,7 +2606,34 @@ static int trace__set_filter_loop_pids(struct trace *trace)
 		thread = parent;
 	}
 
-	return perf_evlist__set_tp_filter_pids(trace->evlist, nr, pids);
+	err = perf_evlist__set_tp_filter_pids(trace->evlist, nr, pids);
+	if (!err && trace->filter_pids.map)
+		err = bpf_map__set_filter_pids(trace->filter_pids.map, nr, pids);
+
+	return err;
+}
+
+static int trace__set_filter_pids(struct trace *trace)
+{
+	int err = 0;
+	/*
+	 * Better not use !target__has_task() here because we need to cover the
+	 * case where no threads were specified in the command line, but a
+	 * workload was, and in that case we will fill in the thread_map when
+	 * we fork the workload in perf_evlist__prepare_workload.
+	 */
+	if (trace->filter_pids.nr > 0) {
+		err = perf_evlist__set_tp_filter_pids(trace->evlist, trace->filter_pids.nr,
+						      trace->filter_pids.entries);
+		if (!err && trace->filter_pids.map) {
+			err = bpf_map__set_filter_pids(trace->filter_pids.map, trace->filter_pids.nr,
+						       trace->filter_pids.entries);
+		}
+	} else if (thread_map__pid(trace->evlist->threads, 0) == -1) {
+		err = trace__set_filter_loop_pids(trace);
+	}
+
+	return err;
 }
 
 static int trace__run(struct trace *trace, int argc, const char **argv)
@@ -2697,17 +2742,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		goto out_error_open;
 	}
 
-	/*
-	 * Better not use !target__has_task() here because we need to cover the
-	 * case where no threads were specified in the command line, but a
-	 * workload was, and in that case we will fill in the thread_map when
-	 * we fork the workload in perf_evlist__prepare_workload.
-	 */
-	if (trace->filter_pids.nr > 0)
-		err = perf_evlist__set_tp_filter_pids(evlist, trace->filter_pids.nr, trace->filter_pids.entries);
-	else if (thread_map__pid(evlist->threads, 0) == -1)
-		err = trace__set_filter_loop_pids(trace);
-
+	err = trace__set_filter_pids(trace);
 	if (err < 0)
 		goto out_error_mem;
 
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 12/28] perf augmented_syscalls: Remove example hardcoded set of filtered pids
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 11/28] perf trace: Fill in BPF "filtered_pids" map when present Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 13/28] Revert "perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter" Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Now that 'perf trace' fills in the "pids_filtered" BPF map, remove the
set of filtered pids used as an example to test that feature.

That feature works like this:

Starting a system wide 'strace' like 'perf trace' augmented session we
noticed that lots of events take place for a pid, which ends up being
the feedback loop of perf trace's syscalls being processed by the
'gnome-terminal' process:

  # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c
     0.391 ( 0.002 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f750bc, count: 8176) = 453
     0.394 ( 0.001 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f75280, count: 7724) = -1 EAGAIN Resource temporarily unavailable
     0.438 ( 0.001 ms): gnome-terminal/2469 read(fd: 4<anon_inode:[eventfd]>, buf: 0x7fffc696aeb0, count: 16) = 8
     0.519 ( 0.001 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f75280, count: 7724) = 114
     0.522 ( 0.001 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f752f1, count: 7611) = -1 EAGAIN Resource temporarily unavailable
  ^C

So we can use --filter-pids to get rid of that one. In this case the
functionality is implemented via the "pids_filtered" BPF map that
tools/perf/examples/bpf/augmented_raw_syscalls.c creates: the 'perf trace'
BPF loader notices it and creates an associated "struct bpf_map", which
'perf trace' then populates:

  # perf trace --filter-pids 2469 -e tools/perf/examples/bpf/augmented_raw_syscalls.c
     0.020 ( 0.002 ms): gnome-shell/1663 epoll_pwait(epfd: 12<anon_inode:[eventpoll]>, events: 0x7ffd8f3ef960, maxevents: 32, sigsetsize: 8) = 1
     0.025 ( 0.002 ms): gnome-shell/1663 read(fd: 24</dev/input/event4>, buf: 0x560c01bb8240, count: 8112) = 48
     0.029 ( 0.001 ms): gnome-shell/1663 read(fd: 24</dev/input/event4>, buf: 0x560c01bb8258, count: 8088) = -1 EAGAIN Resource temporarily unavailable
     0.032 ( 0.001 ms): gnome-shell/1663 read(fd: 24</dev/input/event4>, buf: 0x560c01bb8240, count: 8112) = -1 EAGAIN Resource temporarily unavailable
     0.040 ( 0.003 ms): gnome-shell/1663 recvmsg(fd: 46<socket:[35893]>, msg: 0x7ffd8f3ef950) = -1 EAGAIN Resource temporarily unavailable
    21.529 ( 0.002 ms): gnome-shell/1663 epoll_pwait(epfd: 5<anon_inode:[eventpoll]>, events: 0x7ffd8f3ef960, maxevents: 32, sigsetsize: 8) = 1
    21.533 ( 0.004 ms): gnome-shell/1663 recvmsg(fd: 82<socket:[42826]>, msg: 0x7ffd8f3ef7b0, flags: DONTWAIT|CMSG_CLOEXEC) = 236
    21.581 ( 0.006 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7ffd8f3ef060) = 0
    21.605 ( 0.020 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_CREATE, arg: 0x7ffd8f3eeea0) = 0
    21.626 ( 0.119 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_SET_DOMAIN, arg: 0x7ffd8f3eee94) = 0
    21.746 ( 0.081 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_PWRITE, arg: 0x7ffd8f3eeea0) = 0
  ^C

Oops, yet another gnome process that is involved with the output that
'perf trace' generates, let's filter that out too:

  # perf trace --filter-pids 2469,1663 -e tools/perf/examples/bpf/augmented_raw_syscalls.c
         ? (         ): wpa_supplicant/1366  ... [continued]: select()) = 0 Timeout
     0.006 ( 0.002 ms): wpa_supplicant/1366 clock_gettime(which_clock: BOOTTIME, tp: 0x7fffe5b1e430) = 0
     0.011 ( 0.001 ms): wpa_supplicant/1366 clock_gettime(which_clock: BOOTTIME, tp: 0x7fffe5b1e3e0) = 0
     0.014 ( 0.001 ms): wpa_supplicant/1366 clock_gettime(which_clock: BOOTTIME, tp: 0x7fffe5b1e430) = 0
         ? (         ): gmain/1791  ... [continued]: poll()) = 0 Timeout
     0.017 (         ): wpa_supplicant/1366 select(n: 6, inp: 0x55646fed3ad0, outp: 0x55646fed3b60, exp: 0x55646fed3bf0, tvp: 0x7fffe5b1e4a0) ...
   157.879 ( 0.019 ms): gmain/1791 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: , mask: 16789454) = -1 ENOENT No such file or directory
         ? (         ): cupsd/1001  ... [continued]: epoll_pwait()) = 0
         ? (         ): gsd-color/1908  ... [continued]: poll()) = 0 Timeout
   499.615 (         ): cupsd/1001 epoll_pwait(epfd: 4<anon_inode:[eventpoll]>, events: 0x557a21166500, maxevents: 4096, timeout: 1000, sigsetsize: 8) ...
   586.593 ( 0.004 ms): gsd-color/1908 recvmsg(fd: 3<socket:[38074]>, msg: 0x7ffdef34e800) = -1 EAGAIN Resource temporarily unavailable
         ? (         ): fwupd/2230  ... [continued]: poll()) = 0 Timeout
         ? (         ): rtkit-daemon/906  ... [continued]: poll()) = 0 Timeout
         ? (         ): rtkit-daemon/907  ... [continued]: poll()) = 1
   724.603 ( 0.007 ms): rtkit-daemon/907 read(fd: 6<anon_inode:[eventfd]>, buf: 0x7f05ff768d08, count: 8) = 8
         ? (         ): ssh/5461  ... [continued]: select()) = 1
   810.431 ( 0.002 ms): ssh/5461 clock_gettime(which_clock: BOOTTIME, tp: 0x7ffd7f39f870) = 0
   ^C

Several syscall exit events for syscalls in flight when 'perf trace' started, etc. Saner :-)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-c3tu5yg204p5mvr9kvwew07n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/examples/bpf/augmented_raw_syscalls.c | 27 ------------------------
 1 file changed, 27 deletions(-)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 5fed1eff889d..3f26e705b86c 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -51,27 +51,6 @@ struct augmented_filename {
 
 pid_filter(pids_filtered);
 
-static void pid_filter__init(void)
-{
-	/*
-	 * Filter a bunch of pids: gnome-shell, kvm, firefox threads,
-	 * avahi-daemon, etc, just for testing as we go along.
-	 *
-	 * These will come from 'perf trace --filter-pids' in a explicit way
-	 * and also it will filter out itself, to avoid the feedback loop:
-	 * syscalls 'perf trace' does gets caught, reported, causing new
-	 * syscalls to get emitted, rinse repeat forever.
-	 */
-	if (pid_filter__add(&pids_filtered, 2971))
-		return; /* pid_filter__init() was already called, bail out */
-	pid_filter__add(&pids_filtered, 20016);
-	pid_filter__add(&pids_filtered, 12018);
-	pid_filter__add(&pids_filtered, 2310);
-	pid_filter__add(&pids_filtered, 3759);
-	pid_filter__add(&pids_filtered, 25978);
-	pid_filter__add(&pids_filtered, 883);
-}
-
 SEC("raw_syscalls:sys_enter")
 int sys_enter(struct syscall_enter_args *args)
 {
@@ -81,12 +60,6 @@ int sys_enter(struct syscall_enter_args *args)
 	} augmented_args;
 	unsigned int len = sizeof(augmented_args);
 	const void *filename_arg = NULL;
-	/*
- 	 * We still don't have a "main()" called first and only once
- 	 * call it always, it will exit as soon as it realizes the
- 	 * first hard coded filtered pid was already added.
- 	 */
-	pid_filter__init();
 
 	if (pid_filter__has(&pids_filtered, getpid()))
 		return 0;
-- 
2.14.5



* [PATCH 13/28] Revert "perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter"
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 12/28] perf augmented_syscalls: Remove example hardcoded set of filtered pids Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 14/28] perf script: Add newline after uregs output Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Now that we have the "filtered_pids" logic in place, there is no need
for this rough filter to avoid the feedback loop from 'perf trace's own
syscalls, so revert it.

This reverts commit 7ed71f124284359676b6496ae7db724fee9da753.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-88vh02cnkam0vv5f9vp02o3h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/examples/bpf/augmented_raw_syscalls.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 3f26e705b86c..74ce7574073d 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -44,9 +44,7 @@ struct augmented_filename {
 	char		value[256];
 };
 
-#define SYS_WRITE 1
 #define SYS_OPEN 2
-#define SYS_POLL 7
 #define SYS_OPENAT 257
 
 pid_filter(pids_filtered);
@@ -106,8 +104,6 @@ int sys_enter(struct syscall_enter_args *args)
 	 * 	 after the ctx memory access to prevent their down stream merging.
 	 */
 	switch (augmented_args.args.syscall_nr) {
-	case SYS_WRITE:
-	case SYS_POLL:	 return 0;
 	case SYS_OPEN:	 filename_arg = (const void *)args->args[0];
 			__asm__ __volatile__("": : :"memory");
 			 break;
-- 
2.14.5



* [PATCH 14/28] perf script: Add newline after uregs output
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 13/28] Revert "perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter" Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 15/28] perf bpf: Reduce the hardcoded .max_entries for pid_maps Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Milian Wolff,
	Jiri Olsa, Arnaldo Carvalho de Melo

From: Milian Wolff <milian.wolff@kdab.com>

This change makes it much easier to distinguish between consecutive
samples by keeping the empty line between them, like we see when we do
not enable uregs output.

Before:

  cpp-inlining 28298 [-01] 54837.342780:    3068085 cycles:pp:
              7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
              ...
   ABI:2    AX:0x0    BX:0x40f56cf6    CX:0x294a3ae7    ...
  cpp-inlining 28298 [-01] 54837.344493:    2881929 cycles:pp:
              7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
              ...
   ABI:2    AX:0x40d440c7    BX:0x40d440c7    CX:0x4d45e5da    ...

After:

  cpp-inlining 28298 [-01] 54837.342780:    3068085 cycles:pp:
              7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
              ...
   ABI:2    AX:0x0    BX:0x40f56cf6    CX:0x294a3ae7    ...

  cpp-inlining 28298 [-01] 54837.344493:    2881929 cycles:pp:
              7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
              ...
   ABI:2    AX:0x40d440c7    BX:0x40d440c7    CX:0x4d45e5da    ...

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181107093705.16346-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index b5bc85bd0bbe..daf73832743e 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -603,6 +603,8 @@ static int perf_sample__fprintf_uregs(struct perf_sample *sample,
 		printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
 	}
 
+	fprintf(fp, "\n");
+
 	return printed;
 }
 
-- 
2.14.5



* [PATCH 15/28] perf bpf: Reduce the hardcoded .max_entries for pid_maps
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 14/28] perf script: Add newline after uregs output Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:35 ` [PATCH 16/28] perf script: Share code and output format for uregs and iregs output Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Alexei Starovoitov,
	Daniel Borkmann, David Ahern, Edward Cree, Jiri Olsa,
	Martin KaFai Lau, Namhyung Kim, Wang Nan, Yonghong Song

From: Arnaldo Carvalho de Melo <acme@redhat.com>

While working on augmented syscalls I ran into this error:

  # trace -vv --filter-pids 2469,1663 -e tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
  <SNIP>
  libbpf: map 0 is "__augmented_syscalls__"
  libbpf: map 1 is "__bpf_stdout__"
  libbpf: map 2 is "pids_filtered"
  libbpf: map 3 is "syscalls"
  libbpf: collecting relocating info for: '.text'
  libbpf: relo for 13 value 84 name 133
  libbpf: relocation: insn_idx=3
  libbpf: relocation: find map 3 (pids_filtered) for insn 3
  libbpf: collecting relocating info for: 'raw_syscalls:sys_enter'
  libbpf: relo for 8 value 0 name 0
  libbpf: relocation: insn_idx=1
  libbpf: relo for 8 value 0 name 0
  libbpf: relocation: insn_idx=3
  libbpf: relo for 9 value 28 name 178
  libbpf: relocation: insn_idx=36
  libbpf: relocation: find map 1 (__augmented_syscalls__) for insn 36
  libbpf: collecting relocating info for: 'raw_syscalls:sys_exit'
  libbpf: relo for 8 value 0 name 0
  libbpf: relocation: insn_idx=0
  libbpf: relo for 8 value 0 name 0
  libbpf: relocation: insn_idx=2
  bpf: config program 'raw_syscalls:sys_enter'
  bpf: config program 'raw_syscalls:sys_exit'
  libbpf: create map __bpf_stdout__: fd=3
  libbpf: create map __augmented_syscalls__: fd=4
  libbpf: create map syscalls: fd=5
  libbpf: create map pids_filtered: fd=6
  libbpf: added 13 insn from .text to prog raw_syscalls:sys_enter
  libbpf: added 13 insn from .text to prog raw_syscalls:sys_exit
  libbpf: load bpf program failed: Operation not permitted
  libbpf: failed to load program 'raw_syscalls:sys_exit'
  libbpf: failed to load object 'tools/perf/examples/bpf/augmented_raw_syscalls.c'
  bpf: load objects failed: err=-4009: (Incorrect kernel version)
  event syntax error: 'tools/perf/examples/bpf/augmented_raw_syscalls.c'
                       \___ Failed to load program for unknown reason

  (add -v to see detail)
  Run 'perf list' for a list of valid events

   Usage: perf trace [<options>] [<command>]
      or: perf trace [<options>] -- <command> [<options>]
      or: perf trace record [<options>] [<command>]
      or: perf trace record [<options>] -- <command> [<options>]

      -e, --event <event>   event/syscall selector. use 'perf list' to list available events

If I then try to use strace (perf trace'ing 'perf trace' needs some more work
before it's possible) to get a bit more info I get:

  # strace -e bpf trace --filter-pids 2469,1663 -e tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=4, map_flags=0, inner_map_fd=0, map_name="__bpf_stdout__", map_ifindex=0}, 72) = 3
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=4, map_flags=0, inner_map_fd=0, map_name="__augmented_sys", map_ifindex=0}, 72) = 4
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=1, max_entries=500, map_flags=0, inner_map_fd=0, map_name="syscalls", map_ifindex=0}, 72) = 5
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=512, map_flags=0, inner_map_fd=0, map_name="pids_filtered", map_ifindex=0}, 72) = 6
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=57, insns=0x1223f50, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_enter", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = 7
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=18, insns=0x1224120, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_exit", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = -1 EPERM (Operation not permitted)
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=18, insns=0x1224120, license="GPL", log_level=1, log_size=262144, log_buf="", kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_exit", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = -1 EPERM (Operation not permitted)
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=18, insns=0x1224120, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_exit", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = -1 EPERM (Operation not permitted)
  event syntax error: 'tools/perf/examples/bpf/augmented_raw_syscalls.c'
                       \___ Failed to load program for unknown reason
  <SNIP similar output as without 'strace'>
  #

I managed to create the maps, etc, but then installing the "sys_exit" hook into
the "raw_syscalls:sys_exit" tracepoint somehow gets -EPERMed...

I then go and try reducing the size of this new table:

  +++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
  @@ -47,6 +47,17 @@ struct augmented_filename {
   #define SYS_OPEN 2
   #define SYS_OPENAT 257

  +struct syscall {
  +       bool    filtered;
  +};
  +
  +struct bpf_map SEC("maps") syscalls = {
  +       .type        = BPF_MAP_TYPE_ARRAY,
  +       .key_size    = sizeof(int),
  +       .value_size  = sizeof(struct syscall),
  +       .max_entries = 500,
  +};

And after reducing that .max_entries a tad, it works. So yeah, the "unknown
reason" should be related to the number of bytes all this is taking. Reduce the
default for pid_map()s so that we can have a "syscalls" map with enough slots
for all syscalls in most arches. And take note of this error message, so that
it can be improved :-)
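
The FIXME added by this patch suggests passing .max_entries as a
parameter. A minimal sketch of that idea follows. It assumes a
hypothetical pid_map_sized() macro and a stand-in struct
(sketch_bpf_map) so that it builds outside the BPF toolchain; none of
these names exist in tools/perf, and the byte estimate ignores whatever
per-entry overhead the kernel adds:

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Stand-in for the hash map type constant; not the real <linux/bpf.h> enum. */
enum { SKETCH_MAP_TYPE_HASH = 1 };

/* Stand-in for struct bpf_map from tools/perf/include/bpf/bpf.h. */
struct sketch_bpf_map {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
};

/* Hypothetical variant of pid_map() that takes the number of slots. */
#define pid_map_sized(name, value_type, nr_entries)	\
struct sketch_bpf_map name = {				\
	.type	     = SKETCH_MAP_TYPE_HASH,		\
	.key_size    = sizeof(pid_t),			\
	.value_size  = sizeof(value_type),		\
	.max_entries = (nr_entries),			\
}

/* 64 slots, the default this patch picks for 'perf trace --filter-pids'. */
pid_map_sized(pids_filtered, bool, 64);

/* Rough key + value payload in bytes; kernel bookkeeping is not modeled. */
static unsigned long sketch_map_payload_bytes(const struct sketch_bpf_map *map)
{
	assert(map->max_entries > 0);
	return (unsigned long)map->max_entries * (map->key_size + map->value_size);
}
```

With the size at the call site, each map can be tuned on its own
instead of sharing one hardcoded default.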

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-yjzhak8asumz9e9hts2dgplp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/include/bpf/bpf.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/include/bpf/bpf.h b/tools/perf/include/bpf/bpf.h
index 04ecd425a237..bd5d7b4d7760 100644
--- a/tools/perf/include/bpf/bpf.h
+++ b/tools/perf/include/bpf/bpf.h
@@ -18,12 +18,20 @@ struct bpf_map {
         unsigned int numa_node;
 };
 
+/*
+ * FIXME: this should receive .max_entries as a parameter, as careful
+ *	  tuning of these limits is needed to avoid hitting limits that
+ *	  prevents other BPF constructs, such as tracepoint handlers,
+ *	  to get installed, with cryptic messages from libbpf, etc.
+ *	  For the current need, 'perf trace --filter-pids', 64 should
+ *	  be good enough, but this surely needs to be revisited.
+ */
 #define pid_map(name, value_type)		\
 struct bpf_map SEC("maps") name = {		\
 	.type	     = BPF_MAP_TYPE_HASH,	\
 	.key_size    = sizeof(pid_t),		\
 	.value_size  = sizeof(value_type),	\
-	.max_entries = 512,			\
+	.max_entries = 64,			\
 }
 
 static int (*bpf_map_update_elem)(struct bpf_map *map, void *key, void *value, u64 flags) = (void *)BPF_FUNC_map_update_elem;
-- 
2.14.5



* [PATCH 16/28] perf script: Share code and output format for uregs and iregs output
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 15/28] perf bpf: Reduce the hardcoded .max_entries for pid_maps Arnaldo Carvalho de Melo
@ 2018-11-22  3:35 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 17/28] perf bench: Move HAVE_PTHREAD_ATTR_SETAFFINITY_NP into bench.h Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Milian Wolff,
	Arnaldo Carvalho de Melo

From: Milian Wolff <milian.wolff@kdab.com>

The iregs output was missing the newline at end as well as the leading
ABI output. This made it hard to compare the iregs and uregs values.
Instead, use a single function to output the register values and use it
for both, iregs and uregs, to ensure the output is consistent.

Before:

  perf  7049 [-01]  1343.354347:          1 cycles:ppp:
        ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
    AX:0x80000000    BX:0x0    CX:0x0    DX:0x7    SI:0xf    DI:0x286    BP:0xffff95bc8213a460    SP:0xffffacbf0ba97d18    IP:0xffffffffa7bc21cd FLAGS:0x28e    CS:0x10    SS:0x18    R8:0x2    R9:0x21440   R10:0x33816fb3b8c   R11:0x1   R12:0xffff95bc8213a460   R13:0xffff95bc8213a400   R14:0xffff95bc8213a400   R15:0x1  ABI:2    AX:0xffffffffffffffda    BX:0xffffffffffffffff    CX:0x7f84ad85798b    DX:0x560209699d50    SI:0x7ffe2c7a6820    DI:0x7ffe2c7a8c9b    BP:0x7ffe2c7a20d0    SP:0x7ffe2c7a2058    IP:0x7f84ad85798b FLAGS:0x206    CS:0x33    SS:0x2b    R8:0x7ffe2c7a2030    R9:0x7f84ae55f010   R10:0x8   R11:0x206   R12:0xffffffffffffffff   R13:0xffffffffffffffff   R14:0xffffffffffffffff   R15:0xffffffffffffffff

  perf  7049 [-01]  1343.354363:          1 cycles:ppp:
        ...

After:

  perf  7049 [-01]  1343.354347:          1 cycles:ppp:
        ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
        ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
    ABI:2    AX:0x80000000    BX:0x0    CX:0x0    DX:0x7    SI:0xf    DI:0x286    BP:0xffff95bc8213a460    SP:0xffffacbf0ba97d18    IP:0xffffffffa7bc21cd FLAGS:0x28e    CS:0x10    SS:0x18    R8:0x2    R9:0x21440   R10:0x33816fb3b8c   R11:0x1   R12:0xffff95bc8213a460   R13:0xffff95bc8213a400   R14:0xffff95bc8213a400   R15:0x1
    ABI:2    AX:0xffffffffffffffda    BX:0xffffffffffffffff    CX:0x7f84ad85798b    DX:0x560209699d50    SI:0x7ffe2c7a6820    DI:0x7ffe2c7a8c9b    BP:0x7ffe2c7a20d0    SP:0x7ffe2c7a2058    IP:0x7f84ad85798b FLAGS:0x206    CS:0x33    SS:0x2b    R8:0x7ffe2c7a2030    R9:0x7f84ae55f010   R10:0x8   R11:0x206   R12:0xffffffffffffffff   R13:0xffffffffffffffff   R14:0xffffffffffffffff   R15:0xffffffffffffffff

  perf  7049 [-01]  1343.354363:          1 cycles:ppp:
        ...
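
The single-helper approach can be sketched outside of perf roughly as
below. This is an illustration under assumptions, not the code from
builtin-script.c: sketch_reg_name() and sketch_format_regs() are
hypothetical names, the register table only covers four registers, the
output goes to a buffer instead of a FILE, and the field widths are
simplified. What it keeps is the shape of the fix: one routine that
emits the ABI, walks the set bits of the mask, and ends the line.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for perf_reg_name(); covers four registers only. */
static const char *sketch_reg_name(unsigned int bit)
{
	static const char * const names[] = { "AX", "BX", "CX", "DX" };

	return bit < sizeof(names) / sizeof(names[0]) ? names[bit] : "??";
}

/*
 * Format one register dump into buf. vals[] holds one value per set bit
 * in mask, in ascending bit order. Returns the number of characters the
 * full line needs, which is its length when it fits in buf.
 */
static int sketch_format_regs(char *buf, size_t size, const uint64_t *vals,
			      uint64_t mask, uint64_t abi)
{
	unsigned int bit, i = 0;
	int printed;

	assert(buf != NULL && size > 0);

	/* Leading ABI field, the part the old iregs output was missing. */
	printed = snprintf(buf, size, "ABI:%" PRIu64 " ", abi);

	for (bit = 0; bit < 64 && printed < (int)size; bit++) {
		if (!(mask & (UINT64_C(1) << bit)))
			continue;
		printed += snprintf(buf + printed, size - printed,
				    "%s:0x%" PRIx64 " ",
				    sketch_reg_name(bit), vals[i++]);
	}

	/* Trailing newline, so consecutive dumps stay on separate lines. */
	if (printed < (int)size)
		printed += snprintf(buf + printed, size - printed, "\n");

	return printed;
}
```

Calling it once for the interrupt registers and once for the user
registers yields two lines with the same layout, which is what makes
the two dumps easy to compare.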

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181107223437.9071-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 40 +++++++++++++++++-----------------------
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index daf73832743e..04913136bac9 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -566,30 +566,10 @@ static int perf_session__check_output_opt(struct perf_session *session)
 	return 0;
 }
 
-static int perf_sample__fprintf_iregs(struct perf_sample *sample,
-				      struct perf_event_attr *attr, FILE *fp)
-{
-	struct regs_dump *regs = &sample->intr_regs;
-	uint64_t mask = attr->sample_regs_intr;
-	unsigned i = 0, r;
-	int printed = 0;
-
-	if (!regs)
-		return 0;
-
-	for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
-		u64 val = regs->regs[i++];
-		printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
-	}
-
-	return printed;
-}
-
-static int perf_sample__fprintf_uregs(struct perf_sample *sample,
-				      struct perf_event_attr *attr, FILE *fp)
+static int perf_sample__fprintf_regs(struct regs_dump *regs, uint64_t mask,
+				     FILE *fp
+)
 {
-	struct regs_dump *regs = &sample->user_regs;
-	uint64_t mask = attr->sample_regs_user;
 	unsigned i = 0, r;
 	int printed = 0;
 
@@ -608,6 +588,20 @@ static int perf_sample__fprintf_uregs(struct perf_sample *sample,
 	return printed;
 }
 
+static int perf_sample__fprintf_iregs(struct perf_sample *sample,
+				      struct perf_event_attr *attr, FILE *fp)
+{
+	return perf_sample__fprintf_regs(&sample->intr_regs,
+					 attr->sample_regs_intr, fp);
+}
+
+static int perf_sample__fprintf_uregs(struct perf_sample *sample,
+				      struct perf_event_attr *attr, FILE *fp)
+{
+	return perf_sample__fprintf_regs(&sample->user_regs,
+					 attr->sample_regs_user, fp);
+}
+
 static int perf_sample__fprintf_start(struct perf_sample *sample,
 				      struct thread *thread,
 				      struct perf_evsel *evsel,
-- 
2.14.5



* [PATCH 17/28] perf bench: Move HAVE_PTHREAD_ATTR_SETAFFINITY_NP into bench.h
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2018-11-22  3:35 ` [PATCH 16/28] perf script: Share code and output format for uregs and iregs output Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 18/28] tools build feature: Check if eventfd() is available Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Andrew Morton, Jason Baron,
	Arnaldo Carvalho de Melo

From: Davidlohr Bueso <dave@stgolabs.net>

Both the futex and epoll benchmarks need this call, and its absence can
cause build failures on systems that don't have
pthread_attr_setaffinity_np().

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Link: http://lkml.kernel.org/r/20181109210719.pr7ohayuwqmfp2wl@linux-r8p5
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/bench.h | 11 +++++++++++
 tools/perf/bench/futex.h | 12 ------------
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 6c9fcd757f31..8299c76046cd 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -48,4 +48,15 @@ int bench_futex_lock_pi(int argc, const char **argv);
 extern int bench_format;
 extern unsigned int bench_repeat;
 
+#ifndef HAVE_PTHREAD_ATTR_SETAFFINITY_NP
+#include <pthread.h>
+#include <linux/compiler.h>
+static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr __maybe_unused,
+					      size_t cpusetsize __maybe_unused,
+					      cpu_set_t *cpuset __maybe_unused)
+{
+	return 0;
+}
+#endif
+
 #endif
diff --git a/tools/perf/bench/futex.h b/tools/perf/bench/futex.h
index db4853f209c7..31b53cc7d5bc 100644
--- a/tools/perf/bench/futex.h
+++ b/tools/perf/bench/futex.h
@@ -86,16 +86,4 @@ futex_cmp_requeue(u_int32_t *uaddr, u_int32_t val, u_int32_t *uaddr2, int nr_wak
 	return futex(uaddr, FUTEX_CMP_REQUEUE, nr_wake, nr_requeue, uaddr2,
 		 val, opflags);
 }
-
-#ifndef HAVE_PTHREAD_ATTR_SETAFFINITY_NP
-#include <pthread.h>
-#include <linux/compiler.h>
-static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr __maybe_unused,
-					      size_t cpusetsize __maybe_unused,
-					      cpu_set_t *cpuset __maybe_unused)
-{
-	return 0;
-}
-#endif
-
 #endif /* _FUTEX_H */
-- 
2.14.5



* [PATCH 18/28] tools build feature: Check if eventfd() is available
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (16 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 17/28] perf bench: Move HAVE_PTHREAD_ATTR_SETAFFINITY_NP into bench.h Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 19/28] perf bench: Add epoll parallel epoll_wait benchmark Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Andrew Morton,
	David Ahern, Davidlohr Bueso, Jason Baron, Jiri Olsa,
	Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

A new 'perf bench epoll' will use this, so add a feature test for this
API to allow disabling it on older systems.

This is just a simple program that, if successfully compiled, means that
the feature is present, at least at the library level. In a build that
sets the output directory to /tmp/build/perf (using O=/tmp/build/perf),
we end up with:

  $ ls -la /tmp/build/perf/feature/test-eventfd*
  -rwxrwxr-x. 1 acme acme 8176 Nov 21 15:58 /tmp/build/perf/feature/test-eventfd.bin
  -rw-rw-r--. 1 acme acme  588 Nov 21 15:58 /tmp/build/perf/feature/test-eventfd.d
  -rw-rw-r--. 1 acme acme    0 Nov 21 15:58 /tmp/build/perf/feature/test-eventfd.make.output
  $ ldd /tmp/build/perf/feature/test-eventfd.bin
	  linux-vdso.so.1 (0x00007fff3bf3f000)
	  libc.so.6 => /lib64/libc.so.6 (0x00007fa984061000)
	  /lib64/ld-linux-x86-64.so.2 (0x00007fa984417000)
  $ grep eventfd -A 2 -B 2 /tmp/build/perf/FEATURE-DUMP
  feature-dwarf=1
  feature-dwarf_getlocations=1
  feature-eventfd=1
  feature-fortify-source=1
  feature-sync-compare-and-swap=1
  $

The main thing here is that in the end we'll have -DHAVE_EVENTFD in
CFLAGS, and then the 'perf bench' entry needing that API can be
selectively pruned.
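
How a -DHAVE_EVENTFD define can prune an entry is sketched below. The
table, the struct and the function names are hypothetical and only
mirror the idea; they are not the actual 'perf bench' tables. The
summary strings are the ones 'perf bench' prints for these collections.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a 'perf bench' collection entry. */
struct sketch_collection {
	const char *name;
	const char *summary;
};

static const struct sketch_collection sketch_collections[] = {
	{ "futex", "Futex stressing benchmarks" },
#ifdef HAVE_EVENTFD
	/* Only built when the feature test passed (-DHAVE_EVENTFD). */
	{ "epoll", "Epoll stressing benchmarks" },
#endif
};

/* 1 when this translation unit was built with -DHAVE_EVENTFD, else 0. */
static int sketch_have_eventfd(void)
{
#ifdef HAVE_EVENTFD
	return 1;
#else
	return 0;
#endif
}

/* Look up a collection by name; NULL when unknown or compiled out. */
static const struct sketch_collection *sketch_find_collection(const char *name)
{
	size_t i;

	assert(name != NULL);

	for (i = 0; i < sizeof(sketch_collections) / sizeof(sketch_collections[0]); i++) {
		if (!strcmp(sketch_collections[i].name, name))
			return &sketch_collections[i];
	}

	return NULL;
}
```

Built without the define, a lookup for "epoll" finds nothing, so the
entry disappears from the tool without any runtime check.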

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-wkeldwob7dpx6jvtuzl8164k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/build/Makefile.feature       | 1 +
 tools/build/feature/Makefile       | 4 ++++
 tools/build/feature/test-all.c     | 5 +++++
 tools/build/feature/test-eventfd.c | 9 +++++++++
 tools/perf/Makefile.config         | 5 ++++-
 5 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 tools/build/feature/test-eventfd.c

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index d74bb9414d7c..8a123834a2a3 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -31,6 +31,7 @@ FEATURE_TESTS_BASIC :=                  \
         backtrace                       \
         dwarf                           \
         dwarf_getlocations              \
+        eventfd                         \
         fortify-source                  \
         sync-compare-and-swap           \
         get_current_dir_name            \
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 304b984f11b9..325087a0429c 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -5,6 +5,7 @@ FILES=                                          \
          test-bionic.bin                        \
          test-dwarf.bin                         \
          test-dwarf_getlocations.bin            \
+         test-eventfd.bin                       \
          test-fortify-source.bin                \
          test-sync-compare-and-swap.bin         \
          test-get_current_dir_name.bin          \
@@ -102,6 +103,9 @@ $(OUTPUT)test-bionic.bin:
 $(OUTPUT)test-libelf.bin:
 	$(BUILD) -lelf
 
+$(OUTPUT)test-eventfd.bin:
+	$(BUILD)
+
 $(OUTPUT)test-get_current_dir_name.bin:
 	$(BUILD)
 
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index 56722bfe6bdd..58f01b950195 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -50,6 +50,10 @@
 # include "test-dwarf_getlocations.c"
 #undef main
 
+#define main main_test_eventfd
+# include "test-eventfd.c"
+#undef main
+
 #define main main_test_libelf_getphdrnum
 # include "test-libelf-getphdrnum.c"
 #undef main
@@ -182,6 +186,7 @@ int main(int argc, char *argv[])
 	main_test_glibc();
 	main_test_dwarf();
 	main_test_dwarf_getlocations();
+	main_test_eventfd();
 	main_test_libelf_getphdrnum();
 	main_test_libelf_gelf_getnote();
 	main_test_libelf_getshdrstrndx();
diff --git a/tools/build/feature/test-eventfd.c b/tools/build/feature/test-eventfd.c
new file mode 100644
index 000000000000..f4de7ef00ccb
--- /dev/null
+++ b/tools/build/feature/test-eventfd.c
@@ -0,0 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018, Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com>
+
+#include <sys/eventfd.h>
+
+int main(void)
+{
+	return eventfd(0, EFD_NONBLOCK);
+}
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index a0e8c23f9125..376d1f78be04 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -299,11 +299,14 @@ ifndef NO_BIONIC
   endif
 endif
 
+ifeq ($(feature-eventfd), 1)
+  CFLAGS += -DHAVE_EVENTFD
+endif
+
 ifeq ($(feature-get_current_dir_name), 1)
   CFLAGS += -DHAVE_GET_CURRENT_DIR_NAME
 endif
 
-
 ifdef NO_LIBELF
   NO_DWARF := 1
   NO_DEMANGLE := 1
-- 
2.14.5



* [PATCH 19/28] perf bench: Add epoll parallel epoll_wait benchmark
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (17 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 18/28] tools build feature: Check if eventfd() is available Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 20/28] perf bench: Add epoll_ctl(2) benchmark Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Andrew Morton, Jason Baron,
	Arnaldo Carvalho de Melo

From: Davidlohr Bueso <dave@stgolabs.net>

This program benchmarks concurrent epoll_wait(2) for file descriptors
that are monitored with EPOLLIN along various semantics, by a
single epoll instance. Such conditions can be found when using
single/combined or multiple queuing when load balancing.

Each thread has a number of private, nonblocking file descriptors,
referred to as fdmap. A writer thread will constantly be writing to the
fdmaps of all threads, minimizing each thread's chances of epoll_wait
not finding any ready read events and blocking, as this is not what we
want to stress. Full details at the start of the C file.
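
As a rough illustration of the mechanism being stressed (a sketch, not
code taken from this patch), the writer/worker interaction for a single
file descriptor looks roughly like the following. wait_one_event() is a
hypothetical name, and the sketch is single-threaded and uses
level-triggered EPOLLIN only:

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): make one eventfd readable,
 * wait for it through an epoll instance and drain it.
 * Returns the counter value read, or -1 on any failure.
 */
int64_t wait_one_event(void)
{
	struct epoll_event ev = { .events = EPOLLIN };
	uint64_t val = 1;
	int64_t ret = -1;
	int epfd, fd;

	epfd = epoll_create(1);
	if (epfd < 0)
		return -1;

	fd = eventfd(0, EFD_NONBLOCK);
	if (fd < 0)
		goto out_epfd;

	ev.data.fd = fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
		goto out_fd;

	/* "writer" side: bump the eventfd counter so the fd becomes readable */
	if (write(fd, &val, sizeof(val)) != (ssize_t)sizeof(val))
		goto out_fd;

	/* "worker" side: ask for one event per call, as the benchmark does */
	if (epoll_wait(epfd, &ev, 1, -1) != 1)
		goto out_fd;

	/* drain the counter; an eventfd read returns the accumulated value */
	val = 0;
	if (read(ev.data.fd, &val, sizeof(val)) == (ssize_t)sizeof(val))
		ret = (int64_t)val;
out_fd:
	close(fd);
out_epfd:
	close(epfd);
	return ret;
}
```

The benchmark itself runs the epoll_wait(2) + read(2) pair in a loop from
each worker thread, while a separate writer thread keeps the eventfds
readable.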

Committer testing:

  # perf bench
  Usage:
	perf bench [<common options>] <collection> <benchmark> [<options>]

        # List of all available benchmark collections:

         sched: Scheduler and IPC benchmarks
           mem: Memory access benchmarks
          numa: NUMA scheduling and MM benchmarks
         futex: Futex stressing benchmarks
         epoll: Epoll stressing benchmarks
           all: All benchmarks

  # perf bench epoll

        # List of available benchmarks for collection 'epoll':

          wait: Benchmark epoll concurrent epoll_waits
           all: Run all futex benchmarks

  # perf bench epoll wait
  # Running 'epoll/wait' benchmark:
  Run summary [PID 19295]: 3 threads monitoring on 64 file-descriptors for 8 secs.

  [thread  0] fdmap: 0xdaa650 ... 0xdaa74c [ 328241 ops/sec ]
  [thread  1] fdmap: 0xdaa900 ... 0xdaa9fc [ 351695 ops/sec ]
  [thread  2] fdmap: 0xdaabb0 ... 0xdaacac [ 381423 ops/sec ]

  Averaged 353786 operations/sec (+- 4.35%), total secs = 8
  #

Committer notes:

Fix the build on debian:experimental-x-mips, debian:experimental-x-mipsel
and others:

    CC       /tmp/build/perf/bench/epoll-wait.o
  bench/epoll-wait.c: In function 'writerfn':
  bench/epoll-wait.c:399:12: error: format '%ld' expects argument of type 'long int', but argument 2 has type 'size_t' {aka 'unsigned int'} [-Werror=format=]
    printinfo("exiting writer-thread (total full-loops: %ld)\n", iter);
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~
  bench/epoll-wait.c:86:31: note: in definition of macro 'printinfo'
    do { if (__verbose) { printf(fmt, ## arg); fflush(stdout); } } while (0)
                                 ^~~
  cc1: all warnings being treated as errors
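
For reference, a minimal sketch of format strings that build cleanly on
both 32-bit and 64-bit targets. format_loops() and format_rlimit() are
hypothetical helpers used only for illustration, they are not functions
from this patch:

```c
#include <inttypes.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: format a loop counter of type size_t portably. */
int format_loops(char *buf, size_t len, size_t iter)
{
	/* %zu matches size_t regardless of whether it is 32 or 64 bits wide */
	return snprintf(buf, len, "total full-loops: %zu", iter);
}

/* Hypothetical helper: format an rlim_t, whose width differs across libcs. */
int format_rlimit(char *buf, size_t len, rlim_t lim)
{
	/* cast to a fixed-width type and use the matching inttypes.h macro */
	return snprintf(buf, len, "RLIMIT_NOFILE: %" PRIu64, (uint64_t)lim);
}
```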

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Link: http://lkml.kernel.org/r/20181106152226.20883-2-dave@stgolabs.net
Link: http://lkml.kernel.org/r/20181106182349.thdkpvshkna5vd7o@linux-r8p5
[ Applied above fixup as per Davidlohr's request ]
[ Use inttypes.h to print rlim_t fields, fixing the build on Alpine Linux / musl libc ]
[ Check if eventfd() is available, i.e. if HAVE_EVENTFD is defined ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-bench.txt |   7 +
 tools/perf/bench/Build                  |   2 +
 tools/perf/bench/bench.h                |   2 +
 tools/perf/bench/epoll-wait.c           | 540 ++++++++++++++++++++++++++++++++
 tools/perf/builtin-bench.c              |  12 +
 5 files changed, 563 insertions(+)
 create mode 100644 tools/perf/bench/epoll-wait.c

diff --git a/tools/perf/Documentation/perf-bench.txt b/tools/perf/Documentation/perf-bench.txt
index 34750fc32714..3a6b2e73b2e8 100644
--- a/tools/perf/Documentation/perf-bench.txt
+++ b/tools/perf/Documentation/perf-bench.txt
@@ -58,6 +58,9 @@ SUBSYSTEM
 'futex'::
 	Futex stressing benchmarks.
 
+'epoll'::
+	Eventpoll (epoll) stressing benchmarks.
+
 'all'::
 	All benchmark subsystems.
 
@@ -203,6 +206,10 @@ Suite for evaluating requeue calls.
 *lock-pi*::
 Suite for evaluating futex lock_pi calls.
 
+SUITES FOR 'epoll'
+~~~~~~~~~~~~~~~~~~
+*wait*::
+Suite for evaluating concurrent epoll_wait calls.
 
 SEE ALSO
 --------
diff --git a/tools/perf/bench/Build b/tools/perf/bench/Build
index eafce1a130a1..2bb79b542d53 100644
--- a/tools/perf/bench/Build
+++ b/tools/perf/bench/Build
@@ -7,6 +7,8 @@ perf-y += futex-wake-parallel.o
 perf-y += futex-requeue.o
 perf-y += futex-lock-pi.o
 
+perf-y += epoll-wait.o
+
 perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-lib.o
 perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-asm.o
 perf-$(CONFIG_X86_64) += mem-memset-x86-64-asm.o
diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 8299c76046cd..6e1f091ced96 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -38,6 +38,8 @@ int bench_futex_requeue(int argc, const char **argv);
 /* pi futexes */
 int bench_futex_lock_pi(int argc, const char **argv);
 
+int bench_epoll_wait(int argc, const char **argv);
+
 #define BENCH_FORMAT_DEFAULT_STR	"default"
 #define BENCH_FORMAT_DEFAULT		0
 #define BENCH_FORMAT_SIMPLE_STR		"simple"
diff --git a/tools/perf/bench/epoll-wait.c b/tools/perf/bench/epoll-wait.c
new file mode 100644
index 000000000000..5a11534e96a0
--- /dev/null
+++ b/tools/perf/bench/epoll-wait.c
@@ -0,0 +1,540 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifdef HAVE_EVENTFD
+/*
+ * Copyright (C) 2018 Davidlohr Bueso.
+ *
+ * This program benchmarks concurrent epoll_wait(2) monitoring multiple
+ * file descriptors under one or two load balancing models. The first,
+ * and default, is the single/combined queueing (which refers to a single
+ * epoll instance for N worker threads):
+ *
+ *                          |---> [worker A]
+ *                          |---> [worker B]
+ *        [combined queue]  .---> [worker C]
+ *                          |---> [worker D]
+ *                          |---> [worker E]
+ *
+ * While the second model, enabled via --multiq option, uses multiple
+ * queueing (which refers to one epoll instance per worker). For example,
+ * short lived tcp connections in a high throughput httpd server will
+ * distribute the accept()'ing connections across CPUs. In this case each
+ * worker does a limited amount of processing.
+ *
+ *             [queue A]  ---> [worker]
+ *             [queue B]  ---> [worker]
+ *             [queue C]  ---> [worker]
+ *             [queue D]  ---> [worker]
+ *             [queue E]  ---> [worker]
+ *
+ * Naturally, the single queue will enforce more concurrency on the epoll
+ * instance, and can therefore scale poorly compared to multiple queues.
+ * However, this is raw benchmark data and must be taken with a grain of
+ * salt when choosing how to make use of sys_epoll.
+ *
+ * Each thread has a number of private, nonblocking file descriptors,
+ * referred to as fdmap. A writer thread will constantly be writing to
+ * the fdmaps of all threads, minimizing each thread's chances of
+ * epoll_wait not finding any ready read events and blocking as this
+ * is not what we want to stress. The size of the fdmap can be adjusted
+ * by the user; enlarging the value will increase the chances of
+ * epoll_wait(2) blocking as the lineal writer thread will take "longer",
+ * at least at a high level.
+ *
+ * Note that because fds are private to each thread, this workload does
+ * not stress scenarios where multiple tasks are awoken per ready IO; ie:
+ * EPOLLEXCLUSIVE semantics.
+ *
+ * The end result/metric is throughput: number of ops/second where an
+ * operation consists of:
+ *
+ *   epoll_wait(2) + [others]
+ *
+ *        ... where [others] is the cost of re-adding the fd (EPOLLET),
+ *            or rearming it (EPOLLONESHOT).
+ *
+ *
+ * The purpose of this program is to be useful for measuring
+ * kernel related changes to sys_epoll, and not for comparing different
+ * IO polling methods, for example. Hence everything is very ad hoc and
+ * outputs raw microbenchmark numbers. Also, this uses eventfd; similar
+ * tools tend to use pipes or sockets, but the result is the same.
+ */
+
+/* For the CLR_() macros */
+#include <string.h>
+#include <pthread.h>
+
+#include <errno.h>
+#include <inttypes.h>
+#include <signal.h>
+#include <stdlib.h>
+#include <linux/compiler.h>
+#include <linux/kernel.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <sys/epoll.h>
+#include <sys/eventfd.h>
+#include <sys/types.h>
+
+#include "../util/stat.h"
+#include <subcmd/parse-options.h>
+#include "bench.h"
+#include "cpumap.h"
+
+#include <err.h>
+
+#define printinfo(fmt, arg...) \
+	do { if (__verbose) { printf(fmt, ## arg); fflush(stdout); } } while (0)
+
+static unsigned int nthreads = 0;
+static unsigned int nsecs    = 8;
+struct timeval start, end, runtime;
+static bool wdone, done, __verbose, randomize, nonblocking;
+
+/*
+ * epoll related shared variables.
+ */
+
+/* Maximum number of nesting allowed inside epoll sets */
+#define EPOLL_MAXNESTS 4
+
+static int epollfd;
+static int *epollfdp;
+static bool noaffinity;
+static unsigned int nested = 0;
+static bool et; /* edge-trigger */
+static bool oneshot;
+static bool multiq; /* use an epoll instance per thread */
+
+/* amount of fds to monitor, per thread */
+static unsigned int nfds = 64;
+
+static pthread_mutex_t thread_lock;
+static unsigned int threads_starting;
+static struct stats throughput_stats;
+static pthread_cond_t thread_parent, thread_worker;
+
+struct worker {
+	int tid;
+	int epollfd; /* for --multiq */
+	pthread_t thread;
+	unsigned long ops;
+	int *fdmap;
+};
+
+static const struct option options[] = {
+	/* general benchmark options */
+	OPT_UINTEGER('t', "threads", &nthreads, "Specify amount of threads"),
+	OPT_UINTEGER('r', "runtime", &nsecs, "Specify runtime (in seconds)"),
+	OPT_UINTEGER('f', "nfds",    &nfds,  "Specify amount of file descriptors to monitor for each thread"),
+	OPT_BOOLEAN( 'n', "noaffinity",  &noaffinity,   "Disables CPU affinity"),
+	OPT_BOOLEAN('R', "randomize", &randomize,   "Enable random write behaviour (default is lineal)"),
+	OPT_BOOLEAN( 'v', "verbose", &__verbose, "Verbose mode"),
+
+	/* epoll specific options */
+	OPT_BOOLEAN( 'm', "multiq",  &multiq,   "Use multiple epoll instances (one per thread)"),
+	OPT_BOOLEAN( 'B', "nonblocking", &nonblocking, "Nonblocking epoll_wait(2) behaviour"),
+	OPT_UINTEGER( 'N', "nested",  &nested,   "Nesting level epoll hierarchy (default is 0, no nesting)"),
+	OPT_BOOLEAN( 'S', "oneshot",  &oneshot,   "Use EPOLLONESHOT semantics"),
+	OPT_BOOLEAN( 'E', "edge",  &et,   "Use Edge-triggered interface (default is LT)"),
+
+	OPT_END()
+};
+
+static const char * const bench_epoll_wait_usage[] = {
+	"perf bench epoll wait <options>",
+	NULL
+};
+
+
+/*
+ * Arrange the N elements of ARRAY in random order.
+ * Only effective if N is much smaller than RAND_MAX;
+ * if this may not be the case, use a better random
+ * number generator. -- Ben Pfaff.
+ */
+static void shuffle(void *array, size_t n, size_t size)
+{
+	char *carray = array;
+	void *aux;
+	size_t i;
+
+	if (n <= 1)
+		return;
+
+	aux = calloc(1, size);
+	if (!aux)
+		err(EXIT_FAILURE, "calloc");
+
+	for (i = 1; i < n; ++i) {
+		size_t j =   i + rand() / (RAND_MAX / (n - i) + 1);
+		j *= size;
+
+		memcpy(aux, &carray[j], size);
+		memcpy(&carray[j], &carray[i*size], size);
+		memcpy(&carray[i*size], aux, size);
+	}
+
+	free(aux);
+}
+
+
+static void *workerfn(void *arg)
+{
+	int fd, ret, r;
+	struct worker *w = (struct worker *) arg;
+	unsigned long ops = w->ops;
+	struct epoll_event ev;
+	uint64_t val;
+	int to = nonblocking? 0 : -1;
+	int efd = multiq ? w->epollfd : epollfd;
+
+	pthread_mutex_lock(&thread_lock);
+	threads_starting--;
+	if (!threads_starting)
+		pthread_cond_signal(&thread_parent);
+	pthread_cond_wait(&thread_worker, &thread_lock);
+	pthread_mutex_unlock(&thread_lock);
+
+	do {
+		/*
+		 * Block indefinitely waiting for the IN event.
+		 * In order to stress the epoll_wait(2) syscall,
+		 * call it event per event, instead of a larger
+		 * batch (max)limit.
+		 */
+		do {
+			ret = epoll_wait(efd, &ev, 1, to);
+		} while (ret < 0 && errno == EINTR);
+		if (ret < 0)
+			err(EXIT_FAILURE, "epoll_wait");
+
+		fd = ev.data.fd;
+
+		do {
+			r = read(fd, &val, sizeof(val));
+		} while (!done && (r < 0 && errno == EAGAIN));
+
+		if (et) {
+			ev.events = EPOLLIN | EPOLLET;
+			ret = epoll_ctl(efd, EPOLL_CTL_ADD, fd, &ev);
+		}
+
+		if (oneshot) {
+			/* rearm the file descriptor with a new event mask */
+			ev.events |= EPOLLIN | EPOLLONESHOT;
+			ret = epoll_ctl(efd, EPOLL_CTL_MOD, fd, &ev);
+		}
+
+		ops++;
+	}  while (!done);
+
+	if (multiq)
+		close(w->epollfd);
+
+	w->ops = ops;
+	return NULL;
+}
+
+static void nest_epollfd(struct worker *w)
+{
+	unsigned int i;
+	struct epoll_event ev;
+	int efd = multiq ? w->epollfd : epollfd;
+
+	if (nested > EPOLL_MAXNESTS)
+		nested = EPOLL_MAXNESTS;
+
+	epollfdp = calloc(nested, sizeof(*epollfdp));
+	if (!epollfdp)
+		err(EXIT_FAILURE, "calloc");
+
+	for (i = 0; i < nested; i++) {
+		epollfdp[i] = epoll_create(1);
+		if (epollfdp[i] < 0)
+			err(EXIT_FAILURE, "epoll_create");
+	}
+
+	ev.events = EPOLLHUP; /* anything */
+	ev.data.u64 = i; /* any number */
+
+	for (i = nested - 1; i; i--) {
+		if (epoll_ctl(epollfdp[i - 1], EPOLL_CTL_ADD,
+			      epollfdp[i], &ev) < 0)
+			err(EXIT_FAILURE, "epoll_ctl");
+	}
+
+	if (epoll_ctl(efd, EPOLL_CTL_ADD, *epollfdp, &ev) < 0)
+		err(EXIT_FAILURE, "epoll_ctl");
+}
+
+static void toggle_done(int sig __maybe_unused,
+			siginfo_t *info __maybe_unused,
+			void *uc __maybe_unused)
+{
+	/* inform all threads that we're done for the day */
+	done = true;
+	gettimeofday(&end, NULL);
+	timersub(&end, &start, &runtime);
+}
+
+static void print_summary(void)
+{
+	unsigned long avg = avg_stats(&throughput_stats);
+	double stddev = stddev_stats(&throughput_stats);
+
+	printf("\nAveraged %ld operations/sec (+- %.2f%%), total secs = %d\n",
+	       avg, rel_stddev_stats(stddev, avg),
+	       (int) runtime.tv_sec);
+}
+
+static int do_threads(struct worker *worker, struct cpu_map *cpu)
+{
+	pthread_attr_t thread_attr, *attrp = NULL;
+	cpu_set_t cpuset;
+	unsigned int i, j;
+	int ret, events = EPOLLIN;
+
+	if (oneshot)
+		events |= EPOLLONESHOT;
+	if (et)
+		events |= EPOLLET;
+
+	printinfo("starting worker/consumer %sthreads%s\n",
+		  noaffinity ?  "":"CPU affinity ",
+		  nonblocking ? " (nonblocking)":"");
+	if (!noaffinity)
+		pthread_attr_init(&thread_attr);
+
+	for (i = 0; i < nthreads; i++) {
+		struct worker *w = &worker[i];
+
+		if (multiq) {
+			w->epollfd = epoll_create(1);
+			if (w->epollfd < 0)
+				err(EXIT_FAILURE, "epoll_create");
+
+			if (nested)
+				nest_epollfd(w);
+		}
+
+		w->tid = i;
+		w->fdmap = calloc(nfds, sizeof(int));
+		if (!w->fdmap)
+			return 1;
+
+		for (j = 0; j < nfds; j++) {
+			int efd = multiq ? w->epollfd : epollfd;
+			struct epoll_event ev;
+
+			w->fdmap[j] = eventfd(0, EFD_NONBLOCK);
+			if (w->fdmap[j] < 0)
+				err(EXIT_FAILURE, "eventfd");
+
+			ev.data.fd = w->fdmap[j];
+			ev.events = events;
+
+			ret = epoll_ctl(efd, EPOLL_CTL_ADD,
+					w->fdmap[j], &ev);
+			if (ret < 0)
+				err(EXIT_FAILURE, "epoll_ctl");
+		}
+
+		if (!noaffinity) {
+			CPU_ZERO(&cpuset);
+			CPU_SET(cpu->map[i % cpu->nr], &cpuset);
+
+			ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset);
+			if (ret)
+				err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
+
+			attrp = &thread_attr;
+		}
+
+		ret = pthread_create(&w->thread, attrp, workerfn,
+				     (void *)(struct worker *) w);
+		if (ret)
+			err(EXIT_FAILURE, "pthread_create");
+	}
+
+	if (!noaffinity)
+		pthread_attr_destroy(&thread_attr);
+
+	return ret;
+}
+
+static void *writerfn(void *p)
+{
+	struct worker *worker = p;
+	size_t i, j, iter;
+	const uint64_t val = 1;
+	ssize_t sz;
+	struct timespec ts = { .tv_sec = 0,
+			       .tv_nsec = 500 };
+
+	printinfo("starting writer-thread: doing %s writes ...\n",
+		  randomize? "random":"lineal");
+
+	for (iter = 0; !wdone; iter++) {
+		if (randomize) {
+			shuffle((void *)worker, nthreads, sizeof(*worker));
+		}
+
+		for (i = 0; i < nthreads; i++) {
+			struct worker *w = &worker[i];
+
+			if (randomize) {
+				shuffle((void *)w->fdmap, nfds, sizeof(int));
+			}
+
+			for (j = 0; j < nfds; j++) {
+				do {
+					sz = write(w->fdmap[j], &val, sizeof(val));
+				} while (!wdone && (sz < 0 && errno == EAGAIN));
+			}
+		}
+
+		nanosleep(&ts, NULL);
+	}
+
+	printinfo("exiting writer-thread (total full-loops: %zu)\n", iter);
+	return NULL;
+}
+
+static int cmpworker(const void *p1, const void *p2)
+{
+
+	struct worker *w1 = (struct worker *) p1;
+	struct worker *w2 = (struct worker *) p2;
+	return w1->tid > w2->tid;
+}
+
+int bench_epoll_wait(int argc, const char **argv)
+{
+	int ret = 0;
+	struct sigaction act;
+	unsigned int i;
+	struct worker *worker = NULL;
+	struct cpu_map *cpu;
+	pthread_t wthread;
+	struct rlimit rl, prevrl;
+
+	argc = parse_options(argc, argv, options, bench_epoll_wait_usage, 0);
+	if (argc) {
+		usage_with_options(bench_epoll_wait_usage, options);
+		exit(EXIT_FAILURE);
+	}
+
+	sigfillset(&act.sa_mask);
+	act.sa_sigaction = toggle_done;
+	sigaction(SIGINT, &act, NULL);
+
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		goto errmem;
+
+	/* a single, main epoll instance */
+	if (!multiq) {
+		epollfd = epoll_create(1);
+		if (epollfd < 0)
+			err(EXIT_FAILURE, "epoll_create");
+
+		/*
+		 * Deal with nested epolls, if any.
+		 */
+		if (nested)
+			nest_epollfd(NULL);
+	}
+
+	printinfo("Using %s queue model\n", multiq ? "multi" : "single");
+	printinfo("Nesting level(s): %d\n", nested);
+
+	/* default to the number of CPUs and leave one for the writer pthread */
+	if (!nthreads)
+		nthreads = cpu->nr - 1;
+
+	worker = calloc(nthreads, sizeof(*worker));
+	if (!worker) {
+		goto errmem;
+	}
+
+	if (getrlimit(RLIMIT_NOFILE, &prevrl))
+		err(EXIT_FAILURE, "getrlimit");
+	rl.rlim_cur = rl.rlim_max = nfds * nthreads * 2 + 50;
+	printinfo("Setting RLIMIT_NOFILE rlimit from %" PRIu64 " to: %" PRIu64 "\n",
+		  (uint64_t)prevrl.rlim_max, (uint64_t)rl.rlim_max);
+	if (setrlimit(RLIMIT_NOFILE, &rl) < 0)
+		err(EXIT_FAILURE, "setrlimit");
+
+	printf("Run summary [PID %d]: %d threads monitoring%s on "
+	       "%d file-descriptors for %d secs.\n\n",
+	       getpid(), nthreads, oneshot ? " (EPOLLONESHOT semantics)": "", nfds, nsecs);
+
+	init_stats(&throughput_stats);
+	pthread_mutex_init(&thread_lock, NULL);
+	pthread_cond_init(&thread_parent, NULL);
+	pthread_cond_init(&thread_worker, NULL);
+
+	threads_starting = nthreads;
+
+	gettimeofday(&start, NULL);
+
+	do_threads(worker, cpu);
+
+	pthread_mutex_lock(&thread_lock);
+	while (threads_starting)
+		pthread_cond_wait(&thread_parent, &thread_lock);
+	pthread_cond_broadcast(&thread_worker);
+	pthread_mutex_unlock(&thread_lock);
+
+	/*
+	 * At this point the workers should be blocked waiting for read events
+	 * to become ready. Launch the writer which will constantly be writing
+	 * to each thread's fdmap.
+	 */
+	ret = pthread_create(&wthread, NULL, writerfn,
+			     (void *)(struct worker *) worker);
+	if (ret)
+		err(EXIT_FAILURE, "pthread_create");
+
+	sleep(nsecs);
+	toggle_done(0, NULL, NULL);
+	printinfo("main thread: toggling done\n");
+
+	sleep(1); /* meh */
+	wdone = true;
+	ret = pthread_join(wthread, NULL);
+	if (ret)
+		err(EXIT_FAILURE, "pthread_join");
+
+	/* cleanup & report results */
+	pthread_cond_destroy(&thread_parent);
+	pthread_cond_destroy(&thread_worker);
+	pthread_mutex_destroy(&thread_lock);
+
+	/* sort the array back before reporting */
+	if (randomize)
+		qsort(worker, nthreads, sizeof(struct worker), cmpworker);
+
+	for (i = 0; i < nthreads; i++) {
+		unsigned long t = worker[i].ops/runtime.tv_sec;
+
+		update_stats(&throughput_stats, t);
+
+		if (nfds == 1)
+			printf("[thread %2d] fdmap: %p [ %04ld ops/sec ]\n",
+			       worker[i].tid, &worker[i].fdmap[0], t);
+		else
+			printf("[thread %2d] fdmap: %p ... %p [ %04ld ops/sec ]\n",
+			       worker[i].tid, &worker[i].fdmap[0],
+			       &worker[i].fdmap[nfds-1], t);
+	}
+
+	print_summary();
+
+	close(epollfd);
+	return ret;
+errmem:
+	err(EXIT_FAILURE, "calloc");
+}
+#endif // HAVE_EVENTFD
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index 17a6bcd01aa6..55efd23c3efb 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -14,6 +14,7 @@
  *  mem   ... memory access performance
  *  numa  ... NUMA scheduling and MM performance
  *  futex ... Futex performance
+ *  epoll ... Event poll performance
  */
 #include "perf.h"
 #include "util/util.h"
@@ -67,6 +68,14 @@ static struct bench futex_benchmarks[] = {
 	{ NULL,		NULL,						NULL			}
 };
 
+#ifdef HAVE_EVENTFD
+static struct bench epoll_benchmarks[] = {
+	{ "wait",	"Benchmark epoll concurrent epoll_waits",       bench_epoll_wait	},
+	{ "all",	"Run all futex benchmarks",			NULL			},
+	{ NULL,		NULL,						NULL			}
+};
+#endif // HAVE_EVENTFD
+
 struct collection {
 	const char	*name;
 	const char	*summary;
@@ -80,6 +89,9 @@ static struct collection collections[] = {
 	{ "numa",	"NUMA scheduling and MM benchmarks",		numa_benchmarks		},
 #endif
 	{"futex",       "Futex stressing benchmarks",                   futex_benchmarks        },
+#ifdef HAVE_EVENTFD
+	{"epoll",       "Epoll stressing benchmarks",                   epoll_benchmarks        },
+#endif
 	{ "all",	"All benchmarks",				NULL			},
 	{ NULL,		NULL,						NULL			}
 };
-- 
2.14.5


* [PATCH 20/28] perf bench: Add epoll_ctl(2) benchmark
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (18 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 19/28] perf bench: Add epoll parallel epoll_wait benchmark Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 21/28] perf tools: Add Hygon Dhyana support Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Davidlohr Bueso,
	Davidlohr Bueso, Andrew Morton, Jason Baron,
	Arnaldo Carvalho de Melo

From: Davidlohr Bueso <dave@stgolabs.net>

Benchmark the various operations allowed for epoll_ctl(2).  The idea is
to concurrently stress a single epoll instance doing add/mod/del
operations.
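
To make the three operations concrete, here is a minimal single-threaded
sketch of one add/mod/del cycle. It is not the benchmark's code:
epoll_ctl_cycles() is a hypothetical helper, and switching the event mask
to EPOLLIN | EPOLLET for the MOD step is an arbitrary choice made for
illustration:

```c
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: run 'rounds' ADD/MOD/DEL cycles for one eventfd
 * on one epoll instance. Returns the number of completed cycles, or -1
 * if setting up the file descriptors fails.
 */
long epoll_ctl_cycles(long rounds)
{
	struct epoll_event ev = { .events = EPOLLIN };
	long completed = 0;
	int epfd, fd;

	epfd = epoll_create(1);
	if (epfd < 0)
		return -1;

	fd = eventfd(0, EFD_NONBLOCK);
	if (fd < 0) {
		close(epfd);
		return -1;
	}
	ev.data.fd = fd;

	while (completed < rounds) {
		/* register the fd with the epoll instance */
		ev.events = EPOLLIN;
		if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
			break;

		/* change the event mask of the already registered fd */
		ev.events = EPOLLIN | EPOLLET;
		if (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) < 0)
			break;

		/* remove it again; the event argument is ignored for DEL */
		if (epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL) < 0)
			break;

		completed++;
	}

	close(fd);
	close(epfd);
	return completed;
}
```

The benchmark runs these operations concurrently from several threads
against a shared epoll instance, which is where the contention comes from.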

Committer testing:

  # perf bench epoll ctl
  # Running 'epoll/ctl' benchmark:
  Run summary [PID 20344]: 4 threads doing epoll_ctl ops 64 file-descriptors for 8 secs.

  [thread  0] fdmap: 0x21a46b0 ... 0x21a47ac [ add: 1680960 ops; mod: 1680960 ops; del: 1680960 ops ]
  [thread  1] fdmap: 0x21a4960 ... 0x21a4a5c [ add: 1685440 ops; mod: 1685440 ops; del: 1685440 ops ]
  [thread  2] fdmap: 0x21a4c10 ... 0x21a4d0c [ add: 1674368 ops; mod: 1674368 ops; del: 1674368 ops ]
  [thread  3] fdmap: 0x21a4ec0 ... 0x21a4fbc [ add: 1677568 ops; mod: 1677568 ops; del: 1677568 ops ]

  Averaged 1679584 ADD operations (+- 0.14%)
  Averaged 1679584 MOD operations (+- 0.14%)
  Averaged 1679584 DEL operations (+- 0.14%)
  #

Let's measure those calls with 'perf trace' to get a glimpse of what this
benchmark is doing in terms of syscalls:

  # perf trace -m32768 -s perf bench epoll ctl
  # Running 'epoll/ctl' benchmark:
  Run summary [PID 20405]: 4 threads doing epoll_ctl ops 64 file-descriptors for 8 secs.

  [thread  0] fdmap: 0x21764e0 ... 0x21765dc [ add: 1100480 ops; mod: 1100480 ops; del: 1100480 ops ]
  [thread  1] fdmap: 0x2176790 ... 0x217688c [ add: 1250176 ops; mod: 1250176 ops; del: 1250176 ops ]
  [thread  2] fdmap: 0x2176a40 ... 0x2176b3c [ add: 1022464 ops; mod: 1022464 ops; del: 1022464 ops ]
  [thread  3] fdmap: 0x2176cf0 ... 0x2176dec [ add: 705472 ops; mod: 705472 ops; del: 705472 ops ]

  Averaged 1019648 ADD operations (+- 11.27%)
  Averaged 1019648 MOD operations (+- 11.27%)
  Averaged 1019648 DEL operations (+- 11.27%)

  Summary of events:

  epoll-ctl (20405), 1264 events, 0.0%

   syscall            calls    total       min       avg       max      stddev
                               (msec)    (msec)    (msec)    (msec)        (%)
   --------------- -------- --------- --------- --------- ---------     ------
   eventfd2             256     9.514     0.001     0.037     5.243     68.00%
   clone                  4     1.245     0.204     0.311     0.531     24.13%
   mprotect              66     0.345     0.002     0.005     0.021      7.43%
   openat                45     0.313     0.004     0.007     0.073     21.93%
   mmap                  88     0.302     0.002     0.003     0.013      5.02%
   futex                  4     0.160     0.002     0.040     0.140     83.43%
   sched_setaffinity      4     0.124     0.005     0.031     0.070     49.39%
   read                  44     0.103     0.001     0.002     0.013     15.54%
   fstat                 40     0.052     0.001     0.001     0.003      5.43%
   close                 39     0.039     0.001     0.001     0.001      1.48%
   stat                   9     0.034     0.003     0.004     0.006      7.30%
   access                 3     0.023     0.007     0.008     0.008      4.25%
   open                   2     0.021     0.008     0.011     0.013     22.60%
   getdents               4     0.019     0.001     0.005     0.009     37.15%
   write                  2     0.013     0.004     0.007     0.009     38.48%
   munmap                 1     0.010     0.010     0.010     0.010      0.00%
   brk                    3     0.006     0.001     0.002     0.003     26.34%
   rt_sigprocmask         2     0.004     0.001     0.002     0.003     43.95%
   rt_sigaction           3     0.004     0.001     0.001     0.002     16.07%
   prlimit64              3     0.004     0.001     0.001     0.001      5.39%
   prctl                  1     0.003     0.003     0.003     0.003      0.00%
   epoll_create           1     0.003     0.003     0.003     0.003      0.00%
   lseek                  2     0.002     0.001     0.001     0.001     11.42%
   sched_getaffinity      1     0.002     0.002     0.002     0.002      0.00%
   arch_prctl             1     0.002     0.002     0.002     0.002      0.00%
   set_tid_address        1     0.001     0.001     0.001     0.001      0.00%
   getpid                 1     0.001     0.001     0.001     0.001      0.00%
   set_robust_list        1     0.001     0.001     0.001     0.001      0.00%
   execve                 1     0.000     0.000     0.000     0.000      0.00%

 epoll-ctl (20406), 1245480 events, 14.6%

   syscall            calls    total       min       avg       max      stddev
                               (msec)    (msec)    (msec)    (msec)        (%)
   --------------- -------- --------- --------- --------- ---------     ------
   epoll_ctl         619511  1034.927     0.001     0.002     6.691      0.67%
   nanosleep           3226   616.114     0.006     0.191    10.376      7.57%
   futex                  2    11.336     0.002     5.668    11.334     99.97%
   set_robust_list        1     0.001     0.001     0.001     0.001      0.00%
   clone                  1     0.000     0.000     0.000     0.000      0.00%

 epoll-ctl (20407), 1243151 events, 14.5%

   syscall            calls    total       min       avg       max      stddev
                               (msec)    (msec)    (msec)    (msec)        (%)
   --------------- -------- --------- --------- --------- ---------     ------
   epoll_ctl         618350  1042.181     0.001     0.002     2.512      0.40%
   nanosleep           3220   366.261     0.012     0.114    18.162      9.59%
   futex                  4     5.463     0.001     1.366     5.427     99.12%
   set_robust_list        1     0.002     0.002     0.002     0.002      0.00%

 epoll-ctl (20408), 1801690 events, 21.1%

   syscall            calls    total       min       avg       max      stddev
                               (msec)    (msec)    (msec)    (msec)        (%)
   --------------- -------- --------- --------- --------- ---------     ------
   epoll_ctl         896174  1540.581     0.001     0.002     6.987      0.74%
   nanosleep           4667   783.393     0.006     0.168    10.419      7.10%
   futex                  2     4.682     0.002     2.341     4.681     99.93%
   set_robust_list        1     0.002     0.002     0.002     0.002      0.00%
   clone                  1     0.000     0.000     0.000     0.000      0.00%

 epoll-ctl (20409), 4254890 events, 49.8%

   syscall            calls    total       min       avg       max      stddev
                               (msec)    (msec)    (msec)    (msec)        (%)
   --------------- -------- --------- --------- --------- ---------     ------
   epoll_ctl        2116416  3768.097     0.001     0.002     9.956      0.41%
   nanosleep          11023  1141.778     0.006     0.104     9.447      4.95%
   futex                  3     0.037     0.002     0.012     0.029     70.50%
   set_robust_list        1     0.008     0.008     0.008     0.008      0.00%
   madvise                1     0.005     0.005     0.005     0.005      0.00%
   clone                  1     0.000     0.000     0.000     0.000      0.00%
  #

Committer notes:

Fix build on fedora:24-x-ARC-uClibc, debian:experimental-x-mips,
debian:experimental-x-mipsel, ubuntu:16.04-x-arm and ubuntu:16.04-x-powerpc

    CC       /tmp/build/perf/bench/epoll-ctl.o
  bench/epoll-ctl.c: In function 'init_fdmaps':
  bench/epoll-ctl.c:214:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
    for (i = 0; i < nfds; i+=inc) {
                  ^
  bench/epoll-ctl.c: In function 'bench_epoll_ctl':
  bench/epoll-ctl.c:377:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
    for (i = 0; i < nthreads; i++) {
                  ^
  bench/epoll-ctl.c:388:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
    for (i = 0; i < nthreads; i++) {
                  ^
  cc1: all warnings being treated as errors
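
One common way to avoid this class of -Wsign-compare error is to give the
loop index the same signedness as its bound. The sketch below assumes that
approach; count_steps() is a hypothetical function written for
illustration and is not how the patch necessarily resolved it:

```c
/*
 * Hypothetical helper: count how many iterations a strided loop over
 * 'nfds' entries performs. Both the index and the bound are unsigned,
 * so the comparison does not mix signed and unsigned operands.
 */
unsigned int count_steps(unsigned int nfds, unsigned int inc)
{
	unsigned int i, steps = 0;

	/* a zero stride would never terminate */
	if (!inc)
		return 0;

	for (i = 0; i < nfds; i += inc)
		steps++;

	return steps;
}
```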

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Link: http://lkml.kernel.org/r/20181106152226.20883-3-dave@stgolabs.net
[ Use inttypes.h to print rlim_t fields, fixing the build on Alpine Linux / musl libc ]
[ Check if eventfd() is available, i.e. if HAVE_EVENTFD is defined ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-bench.txt |   3 +
 tools/perf/bench/Build                  |   1 +
 tools/perf/bench/bench.h                |   1 +
 tools/perf/bench/epoll-ctl.c            | 413 ++++++++++++++++++++++++++++++++
 tools/perf/builtin-bench.c              |   1 +
 5 files changed, 419 insertions(+)
 create mode 100644 tools/perf/bench/epoll-ctl.c

diff --git a/tools/perf/Documentation/perf-bench.txt b/tools/perf/Documentation/perf-bench.txt
index 3a6b2e73b2e8..0921a3c67381 100644
--- a/tools/perf/Documentation/perf-bench.txt
+++ b/tools/perf/Documentation/perf-bench.txt
@@ -211,6 +211,9 @@ SUITES FOR 'epoll'
 *wait*::
 Suite for evaluating concurrent epoll_wait calls.
 
+*ctl*::
+Suite for evaluating multiple epoll_ctl calls.
+
 SEE ALSO
 --------
 linkperf:perf[1]
diff --git a/tools/perf/bench/Build b/tools/perf/bench/Build
index 2bb79b542d53..e4e321b6f883 100644
--- a/tools/perf/bench/Build
+++ b/tools/perf/bench/Build
@@ -8,6 +8,7 @@ perf-y += futex-requeue.o
 perf-y += futex-lock-pi.o
 
 perf-y += epoll-wait.o
+perf-y += epoll-ctl.o
 
 perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-lib.o
 perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-asm.o
diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 6e1f091ced96..fddb3ced9db6 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -39,6 +39,7 @@ int bench_futex_requeue(int argc, const char **argv);
 int bench_futex_lock_pi(int argc, const char **argv);
 
 int bench_epoll_wait(int argc, const char **argv);
+int bench_epoll_ctl(int argc, const char **argv);
 
 #define BENCH_FORMAT_DEFAULT_STR	"default"
 #define BENCH_FORMAT_DEFAULT		0
diff --git a/tools/perf/bench/epoll-ctl.c b/tools/perf/bench/epoll-ctl.c
new file mode 100644
index 000000000000..0c0a6e824934
--- /dev/null
+++ b/tools/perf/bench/epoll-ctl.c
@@ -0,0 +1,413 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 Davidlohr Bueso.
+ *
+ * Benchmark the various operations allowed for epoll_ctl(2).
+ * The idea is to concurrently stress a single epoll instance
+ */
+#ifdef HAVE_EVENTFD
+/* For the CLR_() macros */
+#include <string.h>
+#include <pthread.h>
+
+#include <errno.h>
+#include <inttypes.h>
+#include <signal.h>
+#include <stdlib.h>
+#include <linux/compiler.h>
+#include <linux/kernel.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <sys/epoll.h>
+#include <sys/eventfd.h>
+
+#include "../util/stat.h"
+#include <subcmd/parse-options.h>
+#include "bench.h"
+#include "cpumap.h"
+
+#include <err.h>
+
+#define printinfo(fmt, arg...) \
+	do { if (__verbose) printf(fmt, ## arg); } while (0)
+
+static unsigned int nthreads = 0;
+static unsigned int nsecs    = 8;
+struct timeval start, end, runtime;
+static bool done, __verbose, randomize;
+
+/*
+ * epoll related shared variables.
+ */
+
+/* Maximum number of nesting allowed inside epoll sets */
+#define EPOLL_MAXNESTS 4
+
+enum {
+	OP_EPOLL_ADD,
+	OP_EPOLL_MOD,
+	OP_EPOLL_DEL,
+	EPOLL_NR_OPS,
+};
+
+static int epollfd;
+static int *epollfdp;
+static bool noaffinity;
+static unsigned int nested = 0;
+
+/* amount of fds to monitor, per thread */
+static unsigned int nfds = 64;
+
+static pthread_mutex_t thread_lock;
+static unsigned int threads_starting;
+static struct stats all_stats[EPOLL_NR_OPS];
+static pthread_cond_t thread_parent, thread_worker;
+
+struct worker {
+	int tid;
+	pthread_t thread;
+	unsigned long ops[EPOLL_NR_OPS];
+	int *fdmap;
+};
+
+static const struct option options[] = {
+	OPT_UINTEGER('t', "threads", &nthreads, "Specify amount of threads"),
+	OPT_UINTEGER('r', "runtime", &nsecs,    "Specify runtime (in seconds)"),
+	OPT_UINTEGER('f', "nfds", &nfds, "Specify amount of file descriptors to monitor for each thread"),
+	OPT_BOOLEAN( 'n', "noaffinity",  &noaffinity,   "Disables CPU affinity"),
+	OPT_UINTEGER( 'N', "nested",  &nested,   "Nesting level epoll hierarchy (default is 0, no nesting)"),
+	OPT_BOOLEAN( 'R', "randomize", &randomize,   "Perform random operations on random fds"),
+	OPT_BOOLEAN( 'v', "verbose",  &__verbose,   "Verbose mode"),
+	OPT_END()
+};
+
+static const char * const bench_epoll_ctl_usage[] = {
+	"perf bench epoll ctl <options>",
+	NULL
+};
+
+static void toggle_done(int sig __maybe_unused,
+			siginfo_t *info __maybe_unused,
+			void *uc __maybe_unused)
+{
+	/* inform all threads that we're done for the day */
+	done = true;
+	gettimeofday(&end, NULL);
+	timersub(&end, &start, &runtime);
+}
+
+static void nest_epollfd(void)
+{
+	unsigned int i;
+	struct epoll_event ev;
+
+	if (nested > EPOLL_MAXNESTS)
+		nested = EPOLL_MAXNESTS;
+	printinfo("Nesting level(s): %d\n", nested);
+
+	epollfdp = calloc(nested, sizeof(int));
+	if (!epollfdp)
+		err(EXIT_FAILURE, "calloc");
+
+	for (i = 0; i < nested; i++) {
+		epollfdp[i] = epoll_create(1);
+		if (epollfdp[i] < 0)
+			err(EXIT_FAILURE, "epoll_create");
+	}
+
+	ev.events = EPOLLHUP; /* anything */
+	ev.data.u64 = i; /* any number */
+
+	for (i = nested - 1; i; i--) {
+		if (epoll_ctl(epollfdp[i - 1], EPOLL_CTL_ADD,
+			      epollfdp[i], &ev) < 0)
+			err(EXIT_FAILURE, "epoll_ctl");
+	}
+
+	if (epoll_ctl(epollfd, EPOLL_CTL_ADD, *epollfdp, &ev) < 0)
+		err(EXIT_FAILURE, "epoll_ctl");
+}
+
+static inline void do_epoll_op(struct worker *w, int op, int fd)
+{
+	int error;
+	struct epoll_event ev;
+
+	ev.events = EPOLLIN;
+	ev.data.u64 = fd;
+
+	switch (op) {
+	case OP_EPOLL_ADD:
+		error = epoll_ctl(epollfd, EPOLL_CTL_ADD, fd, &ev);
+		break;
+	case OP_EPOLL_MOD:
+		ev.events = EPOLLOUT;
+		error = epoll_ctl(epollfd, EPOLL_CTL_MOD, fd, &ev);
+		break;
+	case OP_EPOLL_DEL:
+		error = epoll_ctl(epollfd, EPOLL_CTL_DEL, fd, NULL);
+		break;
+	default:
+		error = 1;
+		break;
+	}
+
+	if (!error)
+		w->ops[op]++;
+}
+
+static inline void do_random_epoll_op(struct worker *w)
+{
+	unsigned long rnd1 = random(), rnd2 = random();
+	int op, fd;
+
+	fd = w->fdmap[rnd1 % nfds];
+	op = rnd2 % EPOLL_NR_OPS;
+
+	do_epoll_op(w, op, fd);
+}
+
+static void *workerfn(void *arg)
+{
+	unsigned int i;
+	struct worker *w = (struct worker *) arg;
+	struct timespec ts = { .tv_sec = 0,
+			       .tv_nsec = 250 };
+
+	pthread_mutex_lock(&thread_lock);
+	threads_starting--;
+	if (!threads_starting)
+		pthread_cond_signal(&thread_parent);
+	pthread_cond_wait(&thread_worker, &thread_lock);
+	pthread_mutex_unlock(&thread_lock);
+
+	/* Let 'em loose */
+	do {
+		/* random */
+		if (randomize) {
+			do_random_epoll_op(w);
+		} else {
+			for (i = 0; i < nfds; i++) {
+				do_epoll_op(w, OP_EPOLL_ADD, w->fdmap[i]);
+				do_epoll_op(w, OP_EPOLL_MOD, w->fdmap[i]);
+				do_epoll_op(w, OP_EPOLL_DEL, w->fdmap[i]);
+			}
+		}
+
+		nanosleep(&ts, NULL);
+	}  while (!done);
+
+	return NULL;
+}
+
+static void init_fdmaps(struct worker *w, int pct)
+{
+	unsigned int i;
+	int inc;
+	struct epoll_event ev;
+
+	if (!pct)
+		return;
+
+	inc = 100/pct;
+	for (i = 0; i < nfds; i+=inc) {
+		ev.data.fd = w->fdmap[i];
+		ev.events = EPOLLIN;
+
+		if (epoll_ctl(epollfd, EPOLL_CTL_ADD, w->fdmap[i], &ev) < 0)
+			err(EXIT_FAILURE, "epoll_ctl");
+	}
+}
+
+static int do_threads(struct worker *worker, struct cpu_map *cpu)
+{
+	pthread_attr_t thread_attr, *attrp = NULL;
+	cpu_set_t cpuset;
+	unsigned int i, j;
+	int ret;
+
+	if (!noaffinity)
+		pthread_attr_init(&thread_attr);
+
+	for (i = 0; i < nthreads; i++) {
+		struct worker *w = &worker[i];
+
+		w->tid = i;
+		w->fdmap = calloc(nfds, sizeof(int));
+		if (!w->fdmap)
+			return 1;
+
+		for (j = 0; j < nfds; j++) {
+			w->fdmap[j] = eventfd(0, EFD_NONBLOCK);
+			if (w->fdmap[j] < 0)
+				err(EXIT_FAILURE, "eventfd");
+		}
+
+		/*
+		 * Lets add 50% of the fdmap to the epoll instance, and
+		 * do it before any threads are started; otherwise there is
+		 * an initial bias of the call failing  (mod and del ops).
+		 */
+		if (randomize)
+			init_fdmaps(w, 50);
+
+		if (!noaffinity) {
+			CPU_ZERO(&cpuset);
+			CPU_SET(cpu->map[i % cpu->nr], &cpuset);
+
+			ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset);
+			if (ret)
+				err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
+
+			attrp = &thread_attr;
+		}
+
+		ret = pthread_create(&w->thread, attrp, workerfn,
+				     (void *)(struct worker *) w);
+		if (ret)
+			err(EXIT_FAILURE, "pthread_create");
+	}
+
+	if (!noaffinity)
+		pthread_attr_destroy(&thread_attr);
+
+	return ret;
+}
+
+static void print_summary(void)
+{
+	int i;
+	unsigned long avg[EPOLL_NR_OPS];
+	double stddev[EPOLL_NR_OPS];
+
+	for (i = 0; i < EPOLL_NR_OPS; i++) {
+		avg[i] = avg_stats(&all_stats[i]);
+		stddev[i] = stddev_stats(&all_stats[i]);
+	}
+
+	printf("\nAveraged %ld ADD operations (+- %.2f%%)\n",
+	       avg[OP_EPOLL_ADD], rel_stddev_stats(stddev[OP_EPOLL_ADD],
+						   avg[OP_EPOLL_ADD]));
+	printf("Averaged %ld MOD operations (+- %.2f%%)\n",
+	       avg[OP_EPOLL_MOD], rel_stddev_stats(stddev[OP_EPOLL_MOD],
+						   avg[OP_EPOLL_MOD]));
+	printf("Averaged %ld DEL operations (+- %.2f%%)\n",
+	       avg[OP_EPOLL_DEL], rel_stddev_stats(stddev[OP_EPOLL_DEL],
+						   avg[OP_EPOLL_DEL]));
+}
+
+int bench_epoll_ctl(int argc, const char **argv)
+{
+	int j, ret = 0;
+	struct sigaction act;
+	struct worker *worker = NULL;
+	struct cpu_map *cpu;
+	struct rlimit rl, prevrl;
+	unsigned int i;
+
+	argc = parse_options(argc, argv, options, bench_epoll_ctl_usage, 0);
+	if (argc) {
+		usage_with_options(bench_epoll_ctl_usage, options);
+		exit(EXIT_FAILURE);
+	}
+
+	sigfillset(&act.sa_mask);
+	act.sa_sigaction = toggle_done;
+	sigaction(SIGINT, &act, NULL);
+
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		goto errmem;
+
+	/* a single, main epoll instance */
+	epollfd = epoll_create(1);
+	if (epollfd < 0)
+		err(EXIT_FAILURE, "epoll_create");
+
+	/*
+	 * Deal with nested epolls, if any.
+	 */
+	if (nested)
+		nest_epollfd();
+
+	/* default to the number of CPUs */
+	if (!nthreads)
+		nthreads = cpu->nr;
+
+	worker = calloc(nthreads, sizeof(*worker));
+	if (!worker)
+		goto errmem;
+
+	if (getrlimit(RLIMIT_NOFILE, &prevrl))
+	    err(EXIT_FAILURE, "getrlimit");
+	rl.rlim_cur = rl.rlim_max = nfds * nthreads * 2 + 50;
+	printinfo("Setting RLIMIT_NOFILE rlimit from %" PRIu64 " to: %" PRIu64 "\n",
+		  (uint64_t)prevrl.rlim_max, (uint64_t)rl.rlim_max);
+	if (setrlimit(RLIMIT_NOFILE, &rl) < 0)
+		err(EXIT_FAILURE, "setrlimit");
+
+	printf("Run summary [PID %d]: %d threads doing epoll_ctl ops "
+	       "%d file-descriptors for %d secs.\n\n",
+	       getpid(), nthreads, nfds, nsecs);
+
+	for (i = 0; i < EPOLL_NR_OPS; i++)
+		init_stats(&all_stats[i]);
+
+	pthread_mutex_init(&thread_lock, NULL);
+	pthread_cond_init(&thread_parent, NULL);
+	pthread_cond_init(&thread_worker, NULL);
+
+	threads_starting = nthreads;
+
+	gettimeofday(&start, NULL);
+
+	do_threads(worker, cpu);
+
+	pthread_mutex_lock(&thread_lock);
+	while (threads_starting)
+		pthread_cond_wait(&thread_parent, &thread_lock);
+	pthread_cond_broadcast(&thread_worker);
+	pthread_mutex_unlock(&thread_lock);
+
+	sleep(nsecs);
+	toggle_done(0, NULL, NULL);
+	printinfo("main thread: toggling done\n");
+
+	for (i = 0; i < nthreads; i++) {
+		ret = pthread_join(worker[i].thread, NULL);
+		if (ret)
+			err(EXIT_FAILURE, "pthread_join");
+	}
+
+	/* cleanup & report results */
+	pthread_cond_destroy(&thread_parent);
+	pthread_cond_destroy(&thread_worker);
+	pthread_mutex_destroy(&thread_lock);
+
+	for (i = 0; i < nthreads; i++) {
+		unsigned long t[EPOLL_NR_OPS];
+
+		for (j = 0; j < EPOLL_NR_OPS; j++) {
+			t[j] = worker[i].ops[j];
+			update_stats(&all_stats[j], t[j]);
+		}
+
+		if (nfds == 1)
+			printf("[thread %2d] fdmap: %p [ add: %04ld ops; mod: %04ld ops; del: %04ld ops ]\n",
+			       worker[i].tid, &worker[i].fdmap[0],
+			       t[OP_EPOLL_ADD], t[OP_EPOLL_MOD], t[OP_EPOLL_DEL]);
+		else
+			printf("[thread %2d] fdmap: %p ... %p [ add: %04ld ops; mod: %04ld ops; del: %04ld ops ]\n",
+			       worker[i].tid, &worker[i].fdmap[0],
+			       &worker[i].fdmap[nfds-1],
+			       t[OP_EPOLL_ADD], t[OP_EPOLL_MOD], t[OP_EPOLL_DEL]);
+	}
+
+	print_summary();
+
+	close(epollfd);
+	return ret;
+errmem:
+	err(EXIT_FAILURE, "calloc");
+}
+#endif // HAVE_EVENTFD
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index 55efd23c3efb..334c77ffc1d9 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -71,6 +71,7 @@ static struct bench futex_benchmarks[] = {
 #ifdef HAVE_EVENTFD
 static struct bench epoll_benchmarks[] = {
 	{ "wait",	"Benchmark epoll concurrent epoll_waits",       bench_epoll_wait	},
+	{ "ctl",	"Benchmark epoll concurrent epoll_ctls",        bench_epoll_ctl		},
 	{ "all",	"Run all futex benchmarks",			NULL			},
 	{ NULL,		NULL,						NULL			}
 };
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 21/28] perf tools: Add Hygon Dhyana support
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (19 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 20/28] perf bench: Add epoll_ctl(2) benchmark Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 22/28] perf pmu: Suppress potential format-truncation warning Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Pu Wen,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Peter Zijlstra,
	Thomas Gleixner, Arnaldo Carvalho de Melo

From: Pu Wen <puwen@hygon.cn>

The perf tool is useful for performance analysis on the Hygon Dhyana
platform, but it currently has no Hygon support for analyzing KVM guest
OS data. Add Hygon Dhyana support by checking the vendor string so that
Hygon shares the AMD code path.
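
As a rough illustration of the vendor-string check described above, the
sketch below maps a CPUID string to the exit-reason ISA name. The helper
name exit_reasons_isa_for() and the sample vendor strings are assumptions
made for this example only; the actual change is the one-line diff in
this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: pick the virtualization ISA name from a CPUID
 * vendor string. Hygon Dhyana takes the same (SVM) path as AMD.
 */
static const char *exit_reasons_isa_for(const char *cpuid)
{
	assert(cpuid != NULL);

	if (strstr(cpuid, "Intel"))
		return "VMX";
	if (strstr(cpuid, "AMD") || strstr(cpuid, "Hygon"))
		return "SVM";
	return NULL;	/* unsupported vendor */
}
```

A substring test keeps the check independent of the exact vendor string
layout, at the cost of also accepting any string that merely contains
one of the vendor names.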

Signed-off-by: Pu Wen <puwen@hygon.cn>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1542008451-31735-1-git-send-email-puwen@hygon.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/util/kvm-stat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/kvm-stat.c b/tools/perf/arch/x86/util/kvm-stat.c
index b32409a0e546..081353d7b095 100644
--- a/tools/perf/arch/x86/util/kvm-stat.c
+++ b/tools/perf/arch/x86/util/kvm-stat.c
@@ -156,7 +156,7 @@ int cpu_isa_init(struct perf_kvm_stat *kvm, const char *cpuid)
 	if (strstr(cpuid, "Intel")) {
 		kvm->exit_reasons = vmx_exit_reasons;
 		kvm->exit_reasons_isa = "VMX";
-	} else if (strstr(cpuid, "AMD")) {
+	} else if (strstr(cpuid, "AMD") || strstr(cpuid, "Hygon")) {
 		kvm->exit_reasons = svm_exit_reasons;
 		kvm->exit_reasons_isa = "SVM";
 	} else
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 22/28] perf pmu: Suppress potential format-truncation warning
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (20 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 21/28] perf tools: Add Hygon Dhyana support Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 23/28] perf stat: Use perf_evsel__is_clock() for clock events Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Ben Hutchings,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Peter Zijlstra,
	stable, Arnaldo Carvalho de Melo

From: Ben Hutchings <ben@decadent.org.uk>

Depending on which functions are inlined in util/pmu.c, the snprintf()
calls in perf_pmu__parse_{scale,unit,per_pkg,snapshot}() might trigger a
warning:

  util/pmu.c: In function 'pmu_aliases':
  util/pmu.c:178:31: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=]
    snprintf(path, PATH_MAX, "%s/%s.unit", dir, name);
                               ^~

I found this when trying to build perf from Linux 3.16 with gcc 8.
However I can reproduce the problem in mainline if I force
__perf_pmu__new_alias() to be inlined.

Suppress this by using scnprintf() as has been done elsewhere in perf.
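
For readers unfamiliar with scnprintf(), which is not part of the C
standard library: the sketch below shows only how its return value
differs from snprintf(). The name sketch_scnprintf() and the handling of
a negative vsnprintf() result are assumptions for this example, not the
helper that perf actually uses.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for scnprintf(): formats like snprintf(), but
 * returns the number of characters actually stored in buf (excluding
 * the terminating NUL) rather than the untruncated length.
 */
static int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && fmt != NULL);

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0) {
		/* Assumption for this sketch: treat errors as "nothing written". */
		buf[0] = '\0';
		return 0;
	}

	return (size_t)n < size ? n : (int)(size - 1);
}
```

With an 8-byte buffer, formatting "abcdef" and "ghij" as "%s/%s" stores
7 characters and returns 7, whereas snprintf() would return 11.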

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20181111184524.fux4taownc6ndbx6@decadent.org.uk
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 7e49baad304d..7348eea0248f 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -145,7 +145,7 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
 	int fd, ret = -1;
 	char path[PATH_MAX];
 
-	snprintf(path, PATH_MAX, "%s/%s.scale", dir, name);
+	scnprintf(path, PATH_MAX, "%s/%s.scale", dir, name);
 
 	fd = open(path, O_RDONLY);
 	if (fd == -1)
@@ -175,7 +175,7 @@ static int perf_pmu__parse_unit(struct perf_pmu_alias *alias, char *dir, char *n
 	ssize_t sret;
 	int fd;
 
-	snprintf(path, PATH_MAX, "%s/%s.unit", dir, name);
+	scnprintf(path, PATH_MAX, "%s/%s.unit", dir, name);
 
 	fd = open(path, O_RDONLY);
 	if (fd == -1)
@@ -205,7 +205,7 @@ perf_pmu__parse_per_pkg(struct perf_pmu_alias *alias, char *dir, char *name)
 	char path[PATH_MAX];
 	int fd;
 
-	snprintf(path, PATH_MAX, "%s/%s.per-pkg", dir, name);
+	scnprintf(path, PATH_MAX, "%s/%s.per-pkg", dir, name);
 
 	fd = open(path, O_RDONLY);
 	if (fd == -1)
@@ -223,7 +223,7 @@ static int perf_pmu__parse_snapshot(struct perf_pmu_alias *alias,
 	char path[PATH_MAX];
 	int fd;
 
-	snprintf(path, PATH_MAX, "%s/%s.snapshot", dir, name);
+	scnprintf(path, PATH_MAX, "%s/%s.snapshot", dir, name);
 
 	fd = open(path, O_RDONLY);
 	if (fd == -1)
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 23/28] perf stat: Use perf_evsel__is_clock() for clock events
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (21 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 22/28] perf pmu: Suppress potential format-truncation warning Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 24/28] perf vendor events: Add stepping in CPUID string for x86 Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Ravi Bangoria,
	Alexander Shishkin, Anton Blanchard, Jin Yao, Namhyung Kim,
	Thomas Richter, yuzhoujian, Arnaldo Carvalho de Melo

From: Ravi Bangoria <ravi.bangoria@linux.ibm.com>

We already have a function to check whether a given event is either
SW_CPU_CLOCK or SW_TASK_CLOCK. Use it.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: yuzhoujian@didichuxing.com
Link: http://lkml.kernel.org/r/20181115095533.16930-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 8ad32763cfff..f0a8cec55c47 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -212,8 +212,7 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
 
 	count *= counter->scale;
 
-	if (perf_evsel__match(counter, SOFTWARE, SW_TASK_CLOCK) ||
-	    perf_evsel__match(counter, SOFTWARE, SW_CPU_CLOCK))
+	if (perf_evsel__is_clock(counter))
 		update_runtime_stat(st, STAT_NSECS, 0, cpu, count);
 	else if (perf_evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
 		update_runtime_stat(st, STAT_CYCLES, ctx, cpu, count);
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 24/28] perf vendor events: Add stepping in CPUID string for x86
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (22 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 23/28] perf stat: Use perf_evsel__is_clock() for clock events Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 26/28] perf jvmti: Separate jvmti cmlr check Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Kan Liang,
	Andi Kleen, Namhyung Kim, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Kan Liang <kan.liang@linux.intel.com>

The perf tools cannot find the proper event list for the Cascadelake
server, because the Cascadelake server and the Skylake server have the
same CPU model number, which the perf tools use to find the event list.

The stepping for Skylake server is up to 4.

The stepping for Cascadelake server starts from 5.

The stepping can be used to distinguish between them.

The stepping is added in get_cpuid_str().

The stepping information for Skylake server is updated in mapfile.csv.

An x86-specific strcmp_cpuid_str() function is added to handle the two
CPUID formats in mapfile.csv, "vendor-family-model-stepping" and
"vendor-family-model":

- If a cpuid-regular-expression from mapfile.csv uses the new stepping
  format, the cpuid-string generated on the machine must include the
  stepping. Otherwise, it is a mismatch.

- If the cpuid-regular-expression uses the old non-stepping format,
  the stepping in the cpuid-string is ignored.

Scripts that set the "PERF_CPUID" environment string without a stepping
on Skylake server will break. Users must fix such scripts.
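
The two matching rules can be tried out with the condensed sketch
below. It follows the logic of strcmp_cpuid_str() in this patch, but
the names has_stepping() and cpuid_matches(), and the boolean return
convention, are assumptions made for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/* vendor-family-model-stepping contains exactly three '-' separators. */
static bool has_stepping(const char *id)
{
	int dashes = 0;

	for (; *id; id++)
		if (*id == '-')
			dashes++;
	return dashes == 3;
}

/* Hypothetical helper: true when the mapfile pattern matches the CPUID. */
static bool cpuid_matches(const char *pattern, const char *id)
{
	regex_t re;
	regmatch_t m[1];
	bool pat_full, id_full, found;
	size_t want;

	assert(pattern != NULL && id != NULL);
	pat_full = has_stepping(pattern);
	id_full = has_stepping(id);

	/* A stepping-aware pattern requires a stepping in the CPUID string. */
	if (pat_full && !id_full)
		return false;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return false;
	found = regexec(&re, id, 1, m, 0) == 0;
	regfree(&re);
	if (!found)
		return false;

	/* Old-format pattern: ignore the trailing "-stepping" of the CPUID. */
	if (!pat_full && id_full)
		want = (size_t)(strrchr(id, '-') - id);
	else
		want = strlen(id);

	/* Only accept a match that covers the whole relevant string. */
	return (size_t)(m[0].rm_eo - m[0].rm_so) == want;
}
```

With the mapfile entry "GenuineIntel-6-55-[01234]", the string
"GenuineIntel-6-55-4" matches and "GenuineIntel-6-55-5" does not, while
an old-format entry such as "GenuineIntel-6-2A" matches both
"GenuineIntel-6-2A" and "GenuineIntel-6-2A-7". Note that a bracket
range written as [0-4] would add a fourth '-' and defeat the dash count
in this sketch.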

Committer notes:

Fixed this build error on centos:6 and debian:7:

  arch/x86/util/header.c: In function 'is_full_cpuid':
  arch/x86/util/header.c:82:39: error: declaration of 'cpuid' shadows a global declaration [-Werror=shadow]
  arch/x86/util/header.c:12:1: error: shadowed declaration is here [-Werror=shadow]
  arch/x86/util/header.c: In function 'strcmp_cpuid_str':
  arch/x86/util/header.c:98:56: error: declaration of 'cpuid' shadows a global declaration [-Werror=shadow]
  arch/x86/util/header.c:12:1: error: shadowed declaration is here [-Werror=shadow]
  cc1: all warnings being treated as errors

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181114212416.15665-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/util/header.c          | 66 +++++++++++++++++++++++++++++-
 tools/perf/pmu-events/arch/x86/mapfile.csv |  2 +-
 tools/perf/util/pmu.c                      |  2 +-
 3 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/tools/perf/arch/x86/util/header.c b/tools/perf/arch/x86/util/header.c
index fb0d71afee8b..af9a9f2600be 100644
--- a/tools/perf/arch/x86/util/header.c
+++ b/tools/perf/arch/x86/util/header.c
@@ -4,6 +4,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <regex.h>
 
 #include "../../util/header.h"
 
@@ -70,9 +71,72 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	char *buf = malloc(128);
 
-	if (buf && __get_cpuid(buf, 128, "%s-%u-%X$") < 0) {
+	if (buf && __get_cpuid(buf, 128, "%s-%u-%X-%X$") < 0) {
 		free(buf);
 		return NULL;
 	}
 	return buf;
 }
+
+/* Full CPUID format for x86 is vendor-family-model-stepping */
+static bool is_full_cpuid(const char *id)
+{
+	const char *tmp = id;
+	int count = 0;
+
+	while ((tmp = strchr(tmp, '-')) != NULL) {
+		count++;
+		tmp++;
+	}
+
+	if (count == 3)
+		return true;
+
+	return false;
+}
+
+int strcmp_cpuid_str(const char *mapcpuid, const char *id)
+{
+	regex_t re;
+	regmatch_t pmatch[1];
+	int match;
+	bool full_mapcpuid = is_full_cpuid(mapcpuid);
+	bool full_cpuid = is_full_cpuid(id);
+
+	/*
+	 * Full CPUID format is required to identify a platform.
+	 * Error out if the cpuid string is incomplete.
+	 */
+	if (full_mapcpuid && !full_cpuid) {
+		pr_info("Invalid CPUID %s. Full CPUID is required, "
+			"vendor-family-model-stepping\n", id);
+		return 1;
+	}
+
+	if (regcomp(&re, mapcpuid, REG_EXTENDED) != 0) {
+		/* Warn unable to generate match particular string. */
+		pr_info("Invalid regular expression %s\n", mapcpuid);
+		return 1;
+	}
+
+	match = !regexec(&re, id, 1, pmatch, 0);
+	regfree(&re);
+	if (match) {
+		size_t match_len = (pmatch[0].rm_eo - pmatch[0].rm_so);
+		size_t cpuid_len;
+
+		/* If the full CPUID format isn't required,
+		 * ignoring the stepping.
+		 */
+		if (!full_mapcpuid && full_cpuid)
+			cpuid_len = strrchr(id, '-') - id;
+		else
+			cpuid_len = strlen(id);
+
+		/* Verify the entire string matched. */
+		if (match_len == cpuid_len)
+			return 0;
+	}
+
+	return 1;
+}
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 7e3cce3bcf3b..183a42c99251 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -31,4 +31,4 @@ GenuineIntel-6-2A,v15,sandybridge,core
 GenuineIntel-6-2C,v2,westmereep-dp,core
 GenuineIntel-6-25,v2,westmereep-sp,core
 GenuineIntel-6-2F,v2,westmereex,core
-GenuineIntel-6-55,v1,skylakex,core
+GenuineIntel-6-55-[01234],v1,skylakex,core
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 7348eea0248f..c660625d7d4b 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -670,7 +670,7 @@ char * __weak get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
  * cpuid string generated on this platform.
  * Otherwise return non-zero.
  */
-int strcmp_cpuid_str(const char *mapcpuid, const char *cpuid)
+int __weak strcmp_cpuid_str(const char *mapcpuid, const char *cpuid)
 {
 	regex_t re;
 	regmatch_t pmatch[1];
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 26/28] perf jvmti: Separate jvmti cmlr check
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (23 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 24/28] perf vendor events: Add stepping in CPUID string for x86 Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 27/28] perf symbols: Fix slowness due to -ffunction-sections Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Jiri Olsa,
	Alexander Shishkin, Ben Gainey, Gustavo Luiz Duarte,
	Namhyung Kim, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

The Compiled Method Load Record (cmlr) is a JDK-specific interface for
accessing JVM stack info. Because of it, the jvmti agent code does not
compile with a JDK that does not support it.

Separate the jvmti cmlr check into its own feature check, and add the
HAVE_JVMTI_CMLR macro to indicate cmlr support.

Mark the cmlr code in jvmti/libjvmti.c with HAVE_JVMTI_CMLR, so we can
compile it on systems without cmlr support.

This change makes jvmti compile with the java-1.8.0-ibm package. Line
number support is missing in that case, but the rest works.

Add the NO_JVMTI_CMLR compile variable for testing.
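
The guard-plus-stub pattern used here can be sketched in isolation as
follows. Everything in the sketch is hypothetical: the macro
HAVE_SKETCH_LINES, the struct and the function name only stand in for
HAVE_JVMTI_CMLR and get_line_numbers(), and the "full" branch is a
placeholder rather than the real cmlr processing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type standing in for the JDK-specific data. */
struct sketch_line {
	unsigned long pc;
	int line;
};

#ifdef HAVE_SKETCH_LINES	/* hypothetical; would come from -DHAVE_SKETCH_LINES */
/* Full version: count the entries that carry a valid line number. */
static int sketch_get_lines(const struct sketch_line *tab, size_t nr,
			    int *nr_lines)
{
	size_t i;

	assert(nr_lines != NULL);
	for (i = 0; i < nr; i++)
		if (tab[i].line > 0)
			(*nr_lines)++;
	return 0;
}
#else
/* Stub: same signature, reports success, leaves *nr_lines untouched. */
static int sketch_get_lines(const struct sketch_line *tab, size_t nr,
			    int *nr_lines)
{
	(void)tab;
	(void)nr;
	assert(nr_lines != NULL);
	return 0;
}
#endif
```

Because both branches share one signature, callers need no #ifdef of
their own; without the feature they simply see zero line entries.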

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Gustavo Luiz Duarte <gduarte@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20181121154341.21521-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/build/feature/Makefile          |  4 ++++
 tools/build/feature/test-jvmti-cmlr.c | 11 +++++++++++
 tools/build/feature/test-jvmti.c      |  1 -
 tools/perf/Makefile.config            |  7 +++++++
 tools/perf/Makefile.perf              |  3 +++
 tools/perf/jvmti/libjvmti.c           | 12 ++++++++++++
 6 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 tools/build/feature/test-jvmti-cmlr.c

diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 325087a0429c..38c22e122cb0 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -55,6 +55,7 @@ FILES=                                          \
          test-sdt.bin                           \
          test-cxx.bin                           \
          test-jvmti.bin				\
+         test-jvmti-cmlr.bin			\
          test-sched_getcpu.bin			\
          test-setns.bin				\
          test-libopencsd.bin			\
@@ -267,6 +268,9 @@ $(OUTPUT)test-cxx.bin:
 $(OUTPUT)test-jvmti.bin:
 	$(BUILD)
 
+$(OUTPUT)test-jvmti-cmlr.bin:
+	$(BUILD)
+
 $(OUTPUT)test-llvm.bin:
 	$(BUILDXX) -std=gnu++11 				\
 		-I$(shell $(LLVM_CONFIG) --includedir) 		\
diff --git a/tools/build/feature/test-jvmti-cmlr.c b/tools/build/feature/test-jvmti-cmlr.c
new file mode 100644
index 000000000000..c27b5b71a0f6
--- /dev/null
+++ b/tools/build/feature/test-jvmti-cmlr.c
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <jvmti.h>
+#include <jvmticmlr.h>
+
+int main(void)
+{
+	jvmtiCompiledMethodLoadInlineRecord	rec __attribute__((unused));
+	jvmtiCompiledMethodLoadRecordHeader	hdr __attribute__((unused));
+	PCStackInfo				p   __attribute__((unused));
+	return 0;
+}
diff --git a/tools/build/feature/test-jvmti.c b/tools/build/feature/test-jvmti.c
index 5cf31192f204..799916d2e3e3 100644
--- a/tools/build/feature/test-jvmti.c
+++ b/tools/build/feature/test-jvmti.c
@@ -1,6 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <jvmti.h>
-#include <jvmticmlr.h>
 
 int main(void)
 {
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 376d1f78be04..e110010e7faa 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -855,6 +855,13 @@ ifndef NO_JVMTI
   $(call feature_check,jvmti)
   ifeq ($(feature-jvmti), 1)
     $(call detected_var,JDIR)
+    ifndef NO_JVMTI_CMLR
+      FEATURE_CHECK_CFLAGS-jvmti-cmlr := $(FEATURE_CHECK_CFLAGS-jvmti)
+      $(call feature_check,jvmti-cmlr)
+      ifeq ($(feature-jvmti-cmlr), 1)
+        CFLAGS += -DHAVE_JVMTI_CMLR
+      endif
+    endif # NO_JVMTI_CMLR
   else
     $(warning No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel)
     NO_JVMTI := 1
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index d95655489f7e..239e7b3270f4 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -95,6 +95,9 @@ include ../scripts/utilities.mak
 #
 # Define NO_JVMTI if you do not want jvmti agent built
 #
+# Define NO_JVMTI_CMLR (debug only) if you do not want to process CMLR
+# data for java source lines.
+#
 # Define LIBCLANGLLVM if you DO want builtin clang and llvm support.
 # When selected, pass LLVM_CONFIG=/path/to/llvm-config to `make' if
 # llvm-config is not in $PATH.
diff --git a/tools/perf/jvmti/libjvmti.c b/tools/perf/jvmti/libjvmti.c
index 6add3e982614..aea7b1fe85aa 100644
--- a/tools/perf/jvmti/libjvmti.c
+++ b/tools/perf/jvmti/libjvmti.c
@@ -6,7 +6,9 @@
 #include <stdlib.h>
 #include <err.h>
 #include <jvmti.h>
+#ifdef HAVE_JVMTI_CMLR
 #include <jvmticmlr.h>
+#endif
 #include <limits.h>
 
 #include "jvmti_agent.h"
@@ -27,6 +29,7 @@ static void print_error(jvmtiEnv *jvmti, const char *msg, jvmtiError ret)
 	}
 }
 
+#ifdef HAVE_JVMTI_CMLR
 static jvmtiError
 do_get_line_numbers(jvmtiEnv *jvmti, void *pc, jmethodID m, jint bci,
 		    jvmti_line_info_t *tab, jint *nr)
@@ -125,6 +128,15 @@ get_line_numbers(jvmtiEnv *jvmti, const void *compile_info, jvmti_line_info_t **
 	*nr_lines = lines_total;
 	return JVMTI_ERROR_NONE;
 }
+#else /* HAVE_JVMTI_CMLR */
+
+static jvmtiError
+get_line_numbers(jvmtiEnv *jvmti __maybe_unused, const void *compile_info __maybe_unused,
+		 jvmti_line_info_t **tab __maybe_unused, int *nr_lines __maybe_unused)
+{
+	return JVMTI_ERROR_NONE;
+}
+#endif /* HAVE_JVMTI_CMLR */
 
 static void
 copy_class_filename(const char * class_sign, const char * file_name, char * result, size_t max_length)
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 27/28] perf symbols: Fix slowness due to -ffunction-sections
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (24 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 26/28] perf jvmti: Separate jvmti cmlr check Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  3:36 ` [PATCH 28/28] perf pmu: Move *_cpuid_str() weak functions to header.c Arnaldo Carvalho de Melo
  2018-11-22  6:54 ` [GIT PULL 00/28] perf/core improvements and fixes Ingo Molnar
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Eric Saint-Etienne, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Peter Zijlstra, Arnaldo Carvalho de Melo

From: Eric Saint-Etienne <eric.saint.etienne@oracle.com>

Perf can take minutes to parse an image when -ffunction-section is used.
This is especially true of the kernel image when it is compiled this
way, which is the arm64 default since the patchset "Enable deadcode
elimination at link time".

Perf organizes maps using an rbtree. Whenever perf finds a new symbol, it
first searches this rbtree for the map it belongs to, by strcmp()'ing
section names.  When it finds the map with the right name, it uses it to
add the symbol. With a usual image there aren't so many maps, but when
using -ffunction-section there's basically one map per function.  With
the kernel image that's north of 40,000 maps. For most symbols perf has
to walk the entire rbtree before eventually creating a new map and
adding it.  Consequently perf spends most of its time browsing an rbtree
that keeps getting larger.

This performance fix introduces a secondary rbtree that indexes maps
based on the section name.
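
As a rough illustration of the idea (not the code in this patch), a
name-keyed index can be sketched as below. It assumes a plain, unbalanced
binary search tree instead of perf's rbtree so that it stays
self-contained, so unlike the rbtree it does not guarantee logarithmic
depth. The names name_node, name_index_insert() and name_index_find()
are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a map; only the name used as key matters here. */
struct name_node {
	const char *name;		/* not copied: caller keeps it alive */
	struct name_node *left, *right;
};

/*
 * Insert a node keyed by name. Returns the node present in the tree for
 * that name (the new one, or the existing one on a duplicate), or NULL
 * if allocation fails.
 */
struct name_node *name_index_insert(struct name_node **root, const char *name)
{
	struct name_node **p = root;

	while (*p != NULL) {
		int rc = strcmp(name, (*p)->name);

		if (rc < 0)
			p = &(*p)->left;
		else if (rc > 0)
			p = &(*p)->right;
		else
			return *p;	/* already indexed under this name */
	}

	*p = calloc(1, sizeof(**p));
	if (*p != NULL)
		(*p)->name = name;
	return *p;
}

/* Descend by name comparison instead of scanning every entry. */
struct name_node *name_index_find(struct name_node *root, const char *name)
{
	while (root != NULL) {
		int rc = strcmp(name, root->name);

		if (rc == 0)
			return root;
		root = rc < 0 ? root->left : root->right;
	}
	return NULL;
}
```

Each lookup compares against one node per tree level rather than against
every map, which is where the saving over the linear walk comes from.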

Signed-off-by: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Reviewed-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: David Aldridge <david.aldridge@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1542822679-25591-1-git-send-email-eric.saint.etienne@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/map.c    | 27 +++++++++++++++++++++++++++
 tools/perf/util/map.h    |  2 ++
 tools/perf/util/symbol.c | 15 +++++++++++++--
 3 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 354e54550d2b..781eed8e3265 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -21,6 +21,7 @@
 #include "unwind.h"
 
 static void __maps__insert(struct maps *maps, struct map *map);
+static void __maps__insert_name(struct maps *maps, struct map *map);
 
 static inline int is_anon_memory(const char *filename, u32 flags)
 {
@@ -496,6 +497,7 @@ u64 map__objdump_2mem(struct map *map, u64 ip)
 static void maps__init(struct maps *maps)
 {
 	maps->entries = RB_ROOT;
+	maps->names = RB_ROOT;
 	init_rwsem(&maps->lock);
 }
 
@@ -664,6 +666,7 @@ size_t map_groups__fprintf(struct map_groups *mg, FILE *fp)
 static void __map_groups__insert(struct map_groups *mg, struct map *map)
 {
 	__maps__insert(&mg->maps, map);
+	__maps__insert_name(&mg->maps, map);
 	map->groups = mg;
 }
 
@@ -824,10 +827,34 @@ static void __maps__insert(struct maps *maps, struct map *map)
 	map__get(map);
 }
 
+static void __maps__insert_name(struct maps *maps, struct map *map)
+{
+	struct rb_node **p = &maps->names.rb_node;
+	struct rb_node *parent = NULL;
+	struct map *m;
+	int rc;
+
+	while (*p != NULL) {
+		parent = *p;
+		m = rb_entry(parent, struct map, rb_node_name);
+		rc = strcmp(m->dso->short_name, map->dso->short_name);
+		if (rc < 0)
+			p = &(*p)->rb_left;
+		else if (rc  > 0)
+			p = &(*p)->rb_right;
+		else
+			return;
+	}
+	rb_link_node(&map->rb_node_name, parent, p);
+	rb_insert_color(&map->rb_node_name, &maps->names);
+	map__get(map);
+}
+
 void maps__insert(struct maps *maps, struct map *map)
 {
 	down_write(&maps->lock);
 	__maps__insert(maps, map);
+	__maps__insert_name(maps, map);
 	up_write(&maps->lock);
 }
 
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index e0f327b51e66..5c792c90fc4c 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -25,6 +25,7 @@ struct map {
 		struct rb_node	rb_node;
 		struct list_head node;
 	};
+	struct rb_node          rb_node_name;
 	u64			start;
 	u64			end;
 	bool			erange_warned;
@@ -57,6 +58,7 @@ struct kmap {
 
 struct maps {
 	struct rb_root	 entries;
+	struct rb_root	 names;
 	struct rw_semaphore lock;
 };
 
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index d188b7588152..dcce74bae6de 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1680,11 +1680,22 @@ struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
 {
 	struct maps *maps = &mg->maps;
 	struct map *map;
+	struct rb_node *node;
 
 	down_read(&maps->lock);
 
-	for (map = maps__first(maps); map; map = map__next(map)) {
-		if (map->dso && strcmp(map->dso->short_name, name) == 0)
+	for (node = maps->names.rb_node; node; ) {
+		int rc;
+
+		map = rb_entry(node, struct map, rb_node_name);
+
+		rc = strcmp(map->dso->short_name, name);
+		if (rc < 0)
+			node = node->rb_left;
+		else if (rc > 0)
+			node = node->rb_right;
+		else
+
 			goto out_unlock;
 	}
 
-- 
2.14.5



* [PATCH 28/28] perf pmu: Move *_cpuid_str() weak functions to header.c
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (25 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 27/28] perf symbols: Fix slowness due to -ffunction-section Arnaldo Carvalho de Melo
@ 2018-11-22  3:36 ` Arnaldo Carvalho de Melo
  2018-11-22  6:54 ` [GIT PULL 00/28] perf/core improvements and fixes Ingo Molnar
  27 siblings, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-22  3:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users, Kan Liang,
	Arnaldo Carvalho de Melo

From: Kan Liang <kan.liang@linux.intel.com>

The weak functions, strcmp_cpuid_str() and get_cpuid_str(), are defined
in pmu.c.

Most of the cpuid related functions, including *_cpuid_str()'s
declaration and platform specific definition, are in header.c/h.

To make the declaration and definition of all cpuid related functions in
a consistent place, move the weak functions to header.c.

There is no functional change.
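
For readers unfamiliar with the default strcmp_cpuid_str(), the
whole-string check it performs can be sketched in isolation as below.
This is a minimal standalone version using the same POSIX regex calls;
full_match() is a hypothetical name, and it drops the pr_info() warning
the real function prints when the pattern does not compile.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper: return 0 when pattern matches the whole of text,
 * non-zero otherwise (including when pattern is not a valid regex).
 */
int full_match(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t pmatch[1];
	int matched;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return 1;

	matched = (regexec(&re, text, 1, pmatch, 0) == 0);
	regfree(&re);

	/* regexec() accepts a partial match, so compare the matched length. */
	if (matched &&
	    (size_t)(pmatch[0].rm_eo - pmatch[0].rm_so) == strlen(text))
		return 0;

	return 1;
}
```

The length comparison is what rejects a pattern that only matches a
prefix of the cpuid string.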

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Link: http://lkml.kernel.org/r/20181121164939.13482-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 39 +++++++++++++++++++++++++++++++++++++++
 tools/perf/util/pmu.c    | 39 ---------------------------------------
 2 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 4fd45be95a43..e31f52845e77 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -987,6 +987,45 @@ static int write_group_desc(struct feat_fd *ff,
 	return 0;
 }
 
+/*
+ * Return the CPU id as a raw string.
+ *
+ * Each architecture should provide a more precise id string that
+ * can be use to match the architecture's "mapfile".
+ */
+char * __weak get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
+{
+	return NULL;
+}
+
+/* Return zero when the cpuid from the mapfile.csv matches the
+ * cpuid string generated on this platform.
+ * Otherwise return non-zero.
+ */
+int __weak strcmp_cpuid_str(const char *mapcpuid, const char *cpuid)
+{
+	regex_t re;
+	regmatch_t pmatch[1];
+	int match;
+
+	if (regcomp(&re, mapcpuid, REG_EXTENDED) != 0) {
+		/* Warn unable to generate match particular string. */
+		pr_info("Invalid regular expression %s\n", mapcpuid);
+		return 1;
+	}
+
+	match = !regexec(&re, cpuid, 1, pmatch, 0);
+	regfree(&re);
+	if (match) {
+		size_t match_len = (pmatch[0].rm_eo - pmatch[0].rm_so);
+
+		/* Verify the entire string matched. */
+		if (match_len == strlen(cpuid))
+			return 0;
+	}
+	return 1;
+}
+
 /*
  * default get_cpuid(): nothing gets recorded
  * actual implementation must be in arch/$(SRCARCH)/util/header.c
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index c660625d7d4b..11a234740632 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -655,45 +655,6 @@ static int is_arm_pmu_core(const char *name)
 	return 0;
 }
 
-/*
- * Return the CPU id as a raw string.
- *
- * Each architecture should provide a more precise id string that
- * can be use to match the architecture's "mapfile".
- */
-char * __weak get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
-{
-	return NULL;
-}
-
-/* Return zero when the cpuid from the mapfile.csv matches the
- * cpuid string generated on this platform.
- * Otherwise return non-zero.
- */
-int __weak strcmp_cpuid_str(const char *mapcpuid, const char *cpuid)
-{
-	regex_t re;
-	regmatch_t pmatch[1];
-	int match;
-
-	if (regcomp(&re, mapcpuid, REG_EXTENDED) != 0) {
-		/* Warn unable to generate match particular string. */
-		pr_info("Invalid regular expression %s\n", mapcpuid);
-		return 1;
-	}
-
-	match = !regexec(&re, cpuid, 1, pmatch, 0);
-	regfree(&re);
-	if (match) {
-		size_t match_len = (pmatch[0].rm_eo - pmatch[0].rm_so);
-
-		/* Verify the entire string matched. */
-		if (match_len == strlen(cpuid))
-			return 0;
-	}
-	return 1;
-}
-
 static char *perf_pmu__getcpuid(struct perf_pmu *pmu)
 {
 	char *cpuid;
-- 
2.14.5



* Re: [GIT PULL 00/28] perf/core improvements and fixes
  2018-11-22  3:35 [GIT PULL 00/28] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (26 preceding siblings ...)
  2018-11-22  3:36 ` [PATCH 28/28] perf pmu: Move *_cpuid_str() weak functions to header.c Arnaldo Carvalho de Melo
@ 2018-11-22  6:54 ` Ingo Molnar
  27 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2018-11-22  6:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
	Alexander Shishkin, Alexei Starovoitov, Andi Kleen,
	Andrew Morton, Anton Blanchard, Ben Gainey, Ben Hutchings,
	Borislav Petkov, Daniel Borkmann, Dave Kleikamp, David Ahern,
	David Aldridge, Davidlohr Bueso, Edward Cree, Eric Saint-Etienne,
	Gustavo Luiz Duarte, Jason Baron, Jin Yao, Jiri Olsa, Kan Liang,
	Martin KaFai Lau, Milian Wolff, Namhyung Kim, Peter Zijlstra,
	Pu Wen, Ravi Bangoria, Rob Gardner, Stephane Eranian,
	Thomas Gleixner, Thomas Richter, Wang Nan, Yonghong Song,
	yuzhoujian, Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling, some from before the trip to Vancouver,
> some that were easier to process before I continue with the backlog.
> Took a bit more time than I anticipated due to fixing build breakage in
> various places due to multiple patches.  This has tip/perf/urgent
> merged.
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit b1a9d7b0190119dad5b9b7841751b5a7586bbc8b:
> 
>   Merge tag 'perf-urgent-for-mingo-4.20-20181121' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2018-11-21 15:57:21 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.21-20181122
> 
> for you to fetch changes up to f4a0742b3cc1d03b2ff448017b8c714a77e5a261:
> 
>   perf pmu: Move *_cpuid_str() weak functions to header.c (2018-11-21 22:39:59 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> - Start using BPF maps in 'perf trace' for filters in the augmented syscalls
>   code, keeping the existing code for tracepoint filters so that we can switch
>   back and forth while getting everything BPFied (Arnaldo Carvalho de Melo)
> 
> - Suppress potential format-truncation warning in the PMU code (Ben Hutchings)
> 
> - Introduce 'perf bench epoll', with "wait" and "ctl" benchmarks (Davidlohr Bueso)
> 
> - Fix slowness due to -ffunction-section by sorting the maps by name, thus
>   avoiding the use of rb_first/next to traverse all entries looking for a map
>   name, which with -ffunction-section gets to thousands of maps (Eric Saint-Etienne)
> 
> - Separate jvmti cmlr check (Jiri Olsa)
> 
> - Allow using the stepping when figuring out which JSON files to use for an x86
>   processor, so that Cascadelake server can be supported, which has the same
>   cpuid as some other processor, differing only in the stepping (Kan Liang)
> 
> - Share code and output format for uregs and iregs 'perf script' output (Milian Wolff)
> 
> - Use perf_evsel__is_clocki() for clock events in 'perf stat' (Ravi Bangoria)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (15):
>       perf bpf: Add unistd.h to the headers accessible to bpf proggies
>       perf augmented_syscalls: Filter on a hard coded pid
>       perf augmented_syscalls: Remove needless linux/socket.h include
>       perf bpf: Add defines for map insertion/lookup
>       perf bpf: Add simple pid_filter class accessible to BPF proggies
>       perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter
>       perf augmented_syscalls: Use pid_filter
>       perf evlist: Rename perf_evlist__set_filter* to perf_evlist__set_tp_filter*
>       perf trace: Add "_from_option" suffix to trace__set_filter()
>       perf trace: See if there is a map named "filtered_pids"
>       perf trace: Fill in BPF "filtered_pids" map when present
>       perf augmented_syscalls: Remove example hardcoded set of filtered pids
>       Revert "perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid filter"
>       perf bpf: Reduce the hardcoded .max_entries for pid_maps
>       tools build feature: Check if eventfd() is available
> 
> Ben Hutchings (1):
>       perf pmu: Suppress potential format-truncation warning
> 
> Davidlohr Bueso (3):
>       perf bench: Move HAVE_PTHREAD_ATTR_SETAFFINITY_NP into bench.h
>       perf bench: Add epoll parallel epoll_wait benchmark
>       perf bench: Add epoll_ctl(2) benchmark
> 
> Eric Saint-Etienne (1):
>       perf symbols: Fix slowness due to -ffunction-section
> 
> Jiri Olsa (1):
>       perf jvmti: Separate jvmti cmlr check
> 
> Kan Liang (3):
>       perf vendor events: Add stepping in CPUID string for x86
>       perf vendor events: Add JSON metrics for Cascadelake server
>       perf pmu: Move *_cpuid_str() weak functions to header.c
> 
> Milian Wolff (2):
>       perf script: Add newline after uregs output
>       perf script: Share code and output format for uregs and iregs output
> 
> Pu Wen (1):
>       perf tools: Add Hygon Dhyana support
> 
> Ravi Bangoria (1):
>       perf stat: Use perf_evsel__is_clocki() for clock events
> 
>  tools/build/Makefile.feature                       |     1 +
>  tools/build/feature/Makefile                       |     8 +
>  tools/build/feature/test-all.c                     |     5 +
>  tools/build/feature/test-eventfd.c                 |     9 +
>  tools/build/feature/test-jvmti-cmlr.c              |    11 +
>  tools/build/feature/test-jvmti.c                   |     1 -
>  tools/perf/Documentation/perf-bench.txt            |    10 +
>  tools/perf/Makefile.config                         |    12 +-
>  tools/perf/Makefile.perf                           |     3 +
>  tools/perf/arch/x86/util/header.c                  |    66 +-
>  tools/perf/arch/x86/util/kvm-stat.c                |     2 +-
>  tools/perf/bench/Build                             |     3 +
>  tools/perf/bench/bench.h                           |    14 +
>  tools/perf/bench/epoll-ctl.c                       |   413 +
>  tools/perf/bench/epoll-wait.c                      |   540 +
>  tools/perf/bench/futex.h                           |    12 -
>  tools/perf/builtin-bench.c                         |    13 +
>  tools/perf/builtin-script.c                        |    38 +-
>  tools/perf/builtin-trace.c                         |    92 +-
>  tools/perf/examples/bpf/augmented_raw_syscalls.c   |    10 +-
>  tools/perf/include/bpf/bpf.h                       |    19 +
>  tools/perf/include/bpf/pid_filter.h                |    21 +
>  tools/perf/include/bpf/unistd.h                    |    10 +
>  tools/perf/jvmti/libjvmti.c                        |    12 +
>  .../pmu-events/arch/x86/cascadelakex/cache.json    | 10172 +++++++++++++++++++
>  .../arch/x86/cascadelakex/clx-metrics.json         |   164 +
>  .../arch/x86/cascadelakex/floating-point.json      |    85 +
>  .../pmu-events/arch/x86/cascadelakex/frontend.json |   482 +
>  .../pmu-events/arch/x86/cascadelakex/memory.json   |  9909 ++++++++++++++++++
>  .../pmu-events/arch/x86/cascadelakex/other.json    |  8908 ++++++++++++++++
>  .../pmu-events/arch/x86/cascadelakex/pipeline.json |   969 ++
>  .../arch/x86/cascadelakex/uncore-memory.json       |   117 +
>  .../arch/x86/cascadelakex/uncore-other.json        |   255 +
>  .../arch/x86/cascadelakex/virtual-memory.json      |   285 +
>  tools/perf/pmu-events/arch/x86/mapfile.csv         |     3 +-
>  tools/perf/util/evlist.c                           |    10 +-
>  tools/perf/util/evlist.h                           |     6 +-
>  tools/perf/util/header.c                           |    39 +
>  tools/perf/util/map.c                              |    27 +
>  tools/perf/util/map.h                              |     2 +
>  tools/perf/util/pmu.c                              |    47 +-
>  tools/perf/util/stat-shadow.c                      |     3 +-
>  tools/perf/util/symbol.c                           |    15 +-
>  43 files changed, 32711 insertions(+), 112 deletions(-)
>  create mode 100644 tools/build/feature/test-eventfd.c
>  create mode 100644 tools/build/feature/test-jvmti-cmlr.c
>  create mode 100644 tools/perf/bench/epoll-ctl.c
>  create mode 100644 tools/perf/bench/epoll-wait.c
>  create mode 100644 tools/perf/include/bpf/pid_filter.h
>  create mode 100644 tools/perf/include/bpf/unistd.h
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/cache.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/floating-point.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/memory.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/other.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/uncore-memory.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/uncore-other.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/cascadelakex/virtual-memory.json

Pulled, thanks a lot Arnaldo!

	Ingo
