* [GIT PULL 00/35] perf/core improvements and fixes
@ 2017-12-28 14:29 Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 01/35] perf stat: Define a structure for per-thread shadow stats Arnaldo Carvalho de Melo
` (35 more replies)
0 siblings, 36 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
Adrian Hunter, Alexander Shishkin, Andi Kleen, bhargavb,
Cheng Jian, David Ahern, David S . Miller, Ganapatrao Kulkarni,
Greg Kroah-Hartman, Heiko Carstens, Hendrik Brueckner, Jin Yao,
Jiri Olsa, Jonathan Hermann, Kan Liang, Kim Phillips, Li Bin,
linux-arm-kernel, linux-rt-users, linux s390 list,
Martin Schwidefsky, Masami Hiramatsu, Mengting Zhang,
Michael Petlan, Namhyung Kim, Naveen N . Rao, Paul Clarke,
Peter Zijlstra, Pravin Shedge, Ravi Bangoria, Steven Rostedt,
Thomas Gleixner, Thomas Richter, Wang Nan, Will Deacon,
Arnaldo Carvalho de Melo
Hi Ingo,
Please consider pulling,
- Arnaldo
Test results at the end of this message, as usual.
The following changes since commit faaf95677f33dac910b6cbe917cabea43c8c1616:
Merge branch 'perf/urgent' into perf/core, to pick up fixes (2017-12-18 18:13:00 +0100)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.16-20171227
for you to fetch changes up to 5d4fd9c8b83b36d34521b3af361a5726899045bf:
perf tools: Auto-complete for events with ':' (2017-12-27 12:16:00 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
- Allow system wide 'perf stat --per-thread', sorting the result (Jin Yao)
E.g.:
[root@jouet ~]# perf stat --per-thread --metrics IPC
^C
Performance counter stats for 'system wide':
make-22229 23,012,094,032 inst_retired.any # 0.8 IPC
cc1-22419 692,027,497 inst_retired.any # 0.8 IPC
gcc-22418 328,231,855 inst_retired.any # 0.9 IPC
cc1-22509 220,853,647 inst_retired.any # 0.8 IPC
gcc-22486 199,874,810 inst_retired.any # 1.0 IPC
as-22466 177,896,365 inst_retired.any # 0.9 IPC
cc1-22465 150,732,374 inst_retired.any # 0.8 IPC
gcc-22508 112,555,593 inst_retired.any # 0.9 IPC
cc1-22487 108,964,079 inst_retired.any # 0.7 IPC
qemu-system-x86-2697 21,330,550 inst_retired.any # 0.3 IPC
systemd-journal-551 20,642,951 inst_retired.any # 0.4 IPC
docker-containe-17651 9,552,892 inst_retired.any # 0.5 IPC
dockerd-current-9809 7,528,586 inst_retired.any # 0.5 IPC
make-22153 12,504,194,380 inst_retired.any # 0.8 IPC
python2-22429 12,081,290,954 inst_retired.any # 0.8 IPC
<SNIP>
python2-22429 15,026,328,103 cpu_clk_unhalted.thread
cc1-22419 826,660,193 cpu_clk_unhalted.thread
gcc-22418 365,321,295 cpu_clk_unhalted.thread
cc1-22509 279,169,362 cpu_clk_unhalted.thread
gcc-22486 210,156,950 cpu_clk_unhalted.thread
<SNIP>
5.638075538 seconds time elapsed
[root@jouet ~]#
- Improve shell auto-completion of perf events (Jin Yao)
- Fix symbol fixup issues in arm64 due to ELF type (Kim Phillips)
- Ignore threads when they vanish after procfs based enumeration and
before we try to use them with sys_perf_event_open(), i.e. just remove
them from the thread_map and continue with the rest. This makes, among
other cases, the previous new feature (perf stat --per-thread for system
wide, albeit that not seeming to be the motivation for this patch) more
robust. (Mengting Zhang)
- Generate s390 syscall table from asm/unistd.h, doing like x86,
removing the dependency on audit-libs to do this id->string translation,
speeding up the support for newly introducted syscalls (Hendrik Brueckner)
- Fix 'perf test' on filesystems where readdir() returns d_type == DT_UNKNOWN,
such as XFS (Jiri Olsa)
- Fix PERF_SAMPLE_RAW_DATA endianity handling for cross-arch tracepoint
processing (Jiri Olsa)
- Add __return suffix for return events in 'perf probe', streamlining
entry/exit tracing (Masami Hiramatsu)
- Improve support for versioned symbols in 'perf probe" (Masami Hiramatsu)
- Clarify error message about invalid 'perf probe' event names (Masami Hiramatsu)
- Fix check open filename arg using 'perf trace' in a 'perf test' entry for
systems using glibc >= 2.26, such as some ARM and s390 distros (Michael Petlan)
- Make method for obtaining the (normalized) architecture id for a
perf.data file or for the running system used by the annotation routines
generally available, next user will be for generating per arch errno
string tables to allow for pretty printing errno codes recorded in a
perf.data file in architecture A to be properly decoded on hardware
archictecture B. (Arnaldo Carvalho de Melo)
- Remove duplicate includes, found using scripts/checkincludes.pl (Pravin Shedge)
- s390 needs -fPIC, enable it, also revert a patch that supposedly did
that but instead enabled -fPIC for x86 (Hendrik Brueckner, Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Arnaldo Carvalho de Melo (4):
perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate()
perf annotate: Use perf_env when obtaining the arch name
perf env: Adopt perf_env__arch() from the annotate code
Revert "perf s390: Always build with -fPIC"
Hendrik Brueckner (4):
tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h
perf s390: Generate system call table from asm/unistd.h
perf trace: Use generated syscall table on s390 too
perf s390: Always build with -fPIC
Jin Yao (14):
perf stat: Define a structure for per-thread shadow stats
perf stat: Extend rbtree to support per-thread shadow stats
perf stat: Create the runtime_stat init/exit function
perf stat: Update per-thread shadow stats
perf stat: Print per-thread shadow stats
perf stat: Remove a set of shadow stats static variables
perf stat: Allocate shadow stats buffer for threads
perf stat: Update or print per-thread stats
perf thread_map: Enumerate all threads from /proc
perf stat: Remove --per-thread pid/tid limitation
perf stat: Resort '--per-thread' result
perf tool: Improve bash command line auto-complete for multiple events with comma
perf tools: Return all events as auto-completions after comma
perf tools: Auto-complete for events with ':'
Jiri Olsa (3):
perf utils: Move is_directory() to path.h
perf test: Handle properly readdir DT_UNKNOWN
perf evsel: Fix swap for samples with raw data
Kim Phillips (1):
perf probe arm64: Fix symbol fixup issues due to ELF type
Masami Hiramatsu (6):
perf probe: Add warning message if there is unexpected event name
perf probe: Cut off the version suffix from event name
perf probe: Add __return suffix for return events
perf probe: Find versioned symbols from map
perf string: Add {strdup,strpbrk}_esc()
perf probe: Support escaped character in parser
Mengting Zhang (1):
perf evsel: Enable ignore_missing_thread for pid option
Michael Petlan (1):
perf test shell: Fix check open filename arg using 'perf trace'
Pravin Shedge (1):
perf perf: Remove duplicate includes
tools/arch/s390/include/uapi/asm/unistd.h | 412 ++++++++++++++++++++
tools/perf/Documentation/perf-probe.txt | 18 +-
tools/perf/Makefile.config | 11 +-
tools/perf/arch/arm64/util/Build | 1 +
tools/perf/arch/arm64/util/sym-handling.c | 22 ++
tools/perf/arch/common.c | 44 +--
tools/perf/arch/common.h | 1 -
tools/perf/arch/powerpc/util/sym-handling.c | 8 +
tools/perf/arch/s390/Makefile | 21 ++
tools/perf/arch/s390/entry/syscalls/mksyscalltbl | 36 ++
tools/perf/bench/futex-hash.c | 1 -
tools/perf/builtin-c2c.c | 3 -
tools/perf/builtin-record.c | 5 +-
tools/perf/builtin-script.c | 20 +-
tools/perf/builtin-stat.c | 168 +++++++--
tools/perf/builtin-top.c | 2 +-
tools/perf/check-headers.sh | 1 +
tools/perf/perf-completion.sh | 47 ++-
tools/perf/tests/builtin-test.c | 10 +-
tools/perf/tests/parse-events.c | 1 -
tools/perf/tests/shell/trace+probe_vfs_getname.sh | 7 +-
tools/perf/tests/thread-map.c | 2 +-
tools/perf/ui/browsers/annotate.c | 4 +-
tools/perf/ui/gtk/annotate.c | 2 +-
tools/perf/util/annotate.c | 26 +-
tools/perf/util/annotate.h | 2 +-
tools/perf/util/auxtrace.c | 3 -
tools/perf/util/env.c | 47 +++
tools/perf/util/env.h | 2 +
tools/perf/util/evlist.c | 3 +-
tools/perf/util/evsel.c | 80 +++-
tools/perf/util/evsel.h | 3 +-
tools/perf/util/header.c | 2 -
tools/perf/util/metricgroup.c | 2 -
tools/perf/util/path.c | 14 +
tools/perf/util/path.h | 3 +
tools/perf/util/probe-event.c | 85 +++--
tools/perf/util/python-ext-sources | 1 +
.../util/scripting-engines/trace-event-python.c | 1 -
tools/perf/util/stat-shadow.c | 416 ++++++++++++---------
tools/perf/util/stat.c | 15 +-
tools/perf/util/stat.h | 63 +++-
tools/perf/util/string.c | 46 +++
tools/perf/util/string2.h | 2 +
tools/perf/util/symbol.c | 5 +
tools/perf/util/symbol.h | 1 +
tools/perf/util/syscalltbl.c | 4 +
tools/perf/util/target.h | 7 +
tools/perf/util/thread_map.c | 5 +-
tools/perf/util/thread_map.h | 2 +-
tools/perf/util/unwind-libunwind.c | 4 +-
51 files changed, 1328 insertions(+), 363 deletions(-)
create mode 100644 tools/arch/s390/include/uapi/asm/unistd.h
create mode 100644 tools/perf/arch/arm64/util/sym-handling.c
create mode 100755 tools/perf/arch/s390/entry/syscalls/mksyscalltbl
Test results:
The first ones are container (docker) based builds of tools/perf with and
without libelf support. Where clang is available, it is also used to build
perf with/without libelf.
The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.
Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The second column is the time it takes on a i5-7500 CPU @ 3.40GHz, with
a 240 GB SSD from Sandisk. Take it with a grain of salt because we do
the build with clang as well when availalbe.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.
# dm
1 38.08 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0
2 44.14 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822
3 39.23 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0
4 39.94 alpine:edge : Ok gcc (Alpine 6.4.0) 6.4.0
5 34.36 amazonlinux:1 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
6 39.75 amazonlinux:2 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
7 28.21 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
8 26.06 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
9 20.89 centos:5 : Ok gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
10 33.98 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
11 38.71 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
12 32.67 debian:7 : Ok gcc (Debian 4.7.2-5) 4.7.2
13 35.71 debian:8 : Ok gcc (Debian 4.9.2-10) 4.9.2
14 60.76 debian:9 : Ok gcc (Debian 6.3.0-18) 6.3.0 20170516
15 63.80 debian:experimental : Ok gcc (Debian 7.2.0-18) 7.2.0
16 37.26 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
17 36.71 debian:experimental-x-mips : Ok mips-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
18 33.56 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 7.2.0-11) 7.2.0
19 37.09 debian:experimental-x-mipsel : Ok mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
20 37.44 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
21 38.19 fedora:21 : Ok gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
22 37.92 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
23 39.25 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
24 39.44 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
25 34.11 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
26 76.13 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
27 80.30 fedora:26 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
28 75.38 fedora:27 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
29 78.37 fedora:rawhide : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-4)
30 42.54 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 6.4.0 p1.1) 6.4.0
31 44.86 mageia:5 : Ok gcc (GCC) 4.9.2
32 45.95 mageia:6 : Ok gcc (Mageia 5.4.0-5.mga6) 5.4.0
33 44.47 opensuse:42.1 : Ok gcc (SUSE Linux) 4.8.5
34 46.53 opensuse:42.2 : Ok gcc (SUSE Linux) 4.8.5
35 45.51 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5
36 89.91 opensuse:tumbleweed : Ok gcc (SUSE Linux) 7.2.1 20171020 [gcc-7-branch revision 253932]
37 40.36 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
38 42.95 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
39 37.95 ubuntu:12.04.5 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
40 37.70 ubuntu:14.04.4 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
41 37.09 ubuntu:14.04.4-x-linaro-arm64 : Ok aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
42 69.99 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
43 38.08 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
44 36.04 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
45 34.35 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
46 35.10 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.1) 5.4.0 20160609
47 34.80 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
48 34.28 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
49 66.92 ubuntu:16.10 : Ok gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
50 66.80 ubuntu:17.04 : Ok gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406
51 74.57 ubuntu:17.10 : Ok gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
52 73.70 ubuntu:18.04 : Ok gcc (Ubuntu 7.2.0-18ubuntu2) 7.2.0
#
# uname -a
Linux jouet 4.15.0-rc3+ #3 SMP Wed Dec 13 10:14:18 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : Ok
9: Parse perf pmu format : Ok
10: DSO data read : Ok
11: DSO data cache : Ok
12: DSO data reopen : Ok
13: Roundtrip evsel->name : Ok
14: Parse sched tracepoints fields : Ok
15: syscalls:sys_enter_openat event fields : Ok
16: Setup struct perf_event_attr : Ok
17: Match and link multiple hists : Ok
18: 'import perf' in python : Ok
19: Breakpoint overflow signal handler : Ok
20: Breakpoint overflow sampling : Ok
21: Number of exit events of a simple workload : Ok
22: Software clock events period values : Ok
23: Object code reading : Ok
24: Sample parsing : Ok
25: Use a dummy software event to keep tracking : Ok
26: Parse with no sample_id_all bit set : Ok
27: Filter hist entries : Ok
28: Lookup mmap thread : Ok
29: Share thread mg : Ok
30: Sort output of hist entries : Ok
31: Cumulate child hist entries : Ok
32: Track with sched_switch : Ok
33: Filter fds with revents mask in a fdarray : Ok
34: Add fd to a fdarray, making it autogrow : Ok
35: kmod_path__parse : Ok
36: Thread map : Ok
37: LLVM search and compile :
37.1: Basic BPF llvm compile : Ok
37.2: kbuild searching : Ok
37.3: Compile source for BPF prologue generation : Ok
37.4: Compile source for BPF relocation : Ok
38: Session topology : Ok
39: BPF filter :
39.1: Basic BPF filtering : Ok
39.2: BPF pinning : Ok
39.3: BPF prologue generation : Ok
39.4: BPF relocation checker : Ok
40: Synthesize thread map : Ok
41: Remove thread map : Ok
42: Synthesize cpu map : Ok
43: Synthesize stat config : Ok
44: Synthesize stat : Ok
45: Synthesize stat round : Ok
46: Synthesize attr update : Ok
47: Event times : Ok
48: Read backward ring buffer : Ok
49: Print cpu map : Ok
50: Probe SDT events : Ok
51: is_printable_array : Ok
52: Print bitmap : Ok
53: perf hooks : Ok
54: builtin clang support : Skip (not compiled in)
55: unit_number__scnprintf : Ok
56: x86 rdpmc : Ok
57: Convert perf time to TSC : Ok
58: DWARF unwind : Ok
59: x86 instruction decoder - new instructions : Ok
60: Use vfs_getname probe to get syscall args filenames : Ok
61: probe libc's inet_pton & backtrace it with ping : Ok
62: Check open filename arg using perf trace + vfs_getname: Ok
63: Add vfs_getname probe to get syscall args filenames : Ok
#
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_tags_O: make tags
make_no_libelf_O: make NO_LIBELF=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_pure_O: make
make_perf_o_O: make perf.o
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_util_map_o_O: make util/map.o
make_debug_O: make DEBUG=1
make_help_O: make help
make_no_newt_O: make NO_NEWT=1
make_doc_O: make doc
make_no_libperl_O: make NO_LIBPERL=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_slang_O: make NO_SLANG=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_install_bin_O: make install-bin
make_install_O: make install
make_no_gtk2_O: make NO_GTK2=1
make_static_O: make LDFLAGS=-static
make_cscope_O: make cscope
make_no_libbpf_O: make NO_LIBBPF=1
make_no_demangle_O: make NO_DEMANGLE=1
make_clean_all_O: make clean all
OK
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH 01/35] perf stat: Define a structure for per-thread shadow stats
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2017-12-28 14:29 ` Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 02/35] perf stat: Extend rbtree to support " Arnaldo Carvalho de Melo
` (34 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
Perf has a set of static variables to record the runtime shadow metrics
stats.
While if we want to record the runtime shadow stats for per-thread, it
will be the limitation. This patch creates a structure and the next
patches will use this structure to update the runtime shadow stats for
per-thread.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/stat-shadow.c | 11 -----------
tools/perf/util/stat.h | 43 ++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 42 insertions(+), 12 deletions(-)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 57ec22513971..93aac2788056 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -9,17 +9,6 @@
#include "expr.h"
#include "metricgroup.h"
-enum {
- CTX_BIT_USER = 1 << 0,
- CTX_BIT_KERNEL = 1 << 1,
- CTX_BIT_HV = 1 << 2,
- CTX_BIT_HOST = 1 << 3,
- CTX_BIT_IDLE = 1 << 4,
- CTX_BIT_MAX = 1 << 5,
-};
-
-#define NUM_CTX CTX_BIT_MAX
-
/*
* AGGR_GLOBAL: Use CPU 0
* AGGR_SOCKET: Use first CPU of socket
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index eefca5c981fd..c685c41f1fb9 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -5,6 +5,7 @@
#include <linux/types.h>
#include <stdio.h>
#include "xyarray.h"
+#include "rblist.h"
struct stats
{
@@ -43,6 +44,47 @@ enum aggr_mode {
AGGR_UNSET,
};
+enum {
+ CTX_BIT_USER = 1 << 0,
+ CTX_BIT_KERNEL = 1 << 1,
+ CTX_BIT_HV = 1 << 2,
+ CTX_BIT_HOST = 1 << 3,
+ CTX_BIT_IDLE = 1 << 4,
+ CTX_BIT_MAX = 1 << 5,
+};
+
+#define NUM_CTX CTX_BIT_MAX
+
+enum stat_type {
+ STAT_NONE = 0,
+ STAT_NSECS,
+ STAT_CYCLES,
+ STAT_STALLED_CYCLES_FRONT,
+ STAT_STALLED_CYCLES_BACK,
+ STAT_BRANCHES,
+ STAT_CACHEREFS,
+ STAT_L1_DCACHE,
+ STAT_L1_ICACHE,
+ STAT_LL_CACHE,
+ STAT_ITLB_CACHE,
+ STAT_DTLB_CACHE,
+ STAT_CYCLES_IN_TX,
+ STAT_TRANSACTION,
+ STAT_ELISION,
+ STAT_TOPDOWN_TOTAL_SLOTS,
+ STAT_TOPDOWN_SLOTS_ISSUED,
+ STAT_TOPDOWN_SLOTS_RETIRED,
+ STAT_TOPDOWN_FETCH_BUBBLES,
+ STAT_TOPDOWN_RECOVERY_BUBBLES,
+ STAT_SMI_NUM,
+ STAT_APERF,
+ STAT_MAX
+};
+
+struct runtime_stat {
+ struct rblist value_list;
+};
+
struct perf_stat_config {
enum aggr_mode aggr_mode;
bool scale;
@@ -92,7 +134,6 @@ struct perf_stat_output_ctx {
bool force_header;
};
-struct rblist;
void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
double avg, int cpu,
struct perf_stat_output_ctx *out,
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 02/35] perf stat: Extend rbtree to support per-thread shadow stats
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 01/35] perf stat: Define a structure for per-thread shadow stats Arnaldo Carvalho de Melo
@ 2017-12-28 14:29 ` Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 03/35] perf stat: Create the runtime_stat init/exit function Arnaldo Carvalho de Melo
` (33 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
Previously the rbtree was used to link generic metrics.
This patches adds new ctx/type/stat into rbtree keys because we will use
this rbtree to maintain shadow metrics to replace original a couple of
static arrays for supporting per-thread shadow stats.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/stat-shadow.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 93aac2788056..528be3e8d13b 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -45,7 +45,10 @@ struct stats walltime_nsecs_stats;
struct saved_value {
struct rb_node rb_node;
struct perf_evsel *evsel;
+ enum stat_type type;
+ int ctx;
int cpu;
+ struct runtime_stat *stat;
struct stats stats;
};
@@ -58,6 +61,30 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
if (a->cpu != b->cpu)
return a->cpu - b->cpu;
+
+ /*
+ * Previously the rbtree was used to link generic metrics.
+ * The keys were evsel/cpu. Now the rbtree is extended to support
+ * per-thread shadow stats. For shadow stats case, the keys
+ * are cpu/type/ctx/stat (evsel is NULL). For generic metrics
+ * case, the keys are still evsel/cpu (type/ctx/stat are 0 or NULL).
+ */
+ if (a->type != b->type)
+ return a->type - b->type;
+
+ if (a->ctx != b->ctx)
+ return a->ctx - b->ctx;
+
+ if (a->evsel == NULL && b->evsel == NULL) {
+ if (a->stat == b->stat)
+ return 0;
+
+ if ((char *)a->stat < (char *)b->stat)
+ return -1;
+
+ return 1;
+ }
+
if (a->evsel == b->evsel)
return 0;
if ((char *)a->evsel < (char *)b->evsel)
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 03/35] perf stat: Create the runtime_stat init/exit function
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 01/35] perf stat: Define a structure for per-thread shadow stats Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 02/35] perf stat: Extend rbtree to support " Arnaldo Carvalho de Melo
@ 2017-12-28 14:29 ` Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 04/35] perf stat: Update per-thread shadow stats Arnaldo Carvalho de Melo
` (32 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
It mainly initializes and releases the rblist which is defined in struct
runtime_stat.
For the original rblist 'runtime_saved_values', it's still kept there
for keeping the patch bisectable.
The rblist 'runtime_saved_values' will be removed in later patch at
switching time.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-4-git-send-email-yao.jin@linux.intel.com
[ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/stat-shadow.c | 17 +++++++++++++++++
tools/perf/util/stat.h | 3 +++
2 files changed, 20 insertions(+)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 528be3e8d13b..07cfbf613bdc 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -40,6 +40,7 @@ static struct stats runtime_aperf_stats[NUM_CTX][MAX_NR_CPUS];
static struct rblist runtime_saved_values;
static bool have_frontend_stalled;
+struct runtime_stat rt_stat;
struct stats walltime_nsecs_stats;
struct saved_value {
@@ -134,6 +135,21 @@ static struct saved_value *saved_value_lookup(struct perf_evsel *evsel,
return NULL;
}
+void runtime_stat__init(struct runtime_stat *st)
+{
+ struct rblist *rblist = &st->value_list;
+
+ rblist__init(rblist);
+ rblist->node_cmp = saved_value_cmp;
+ rblist->node_new = saved_value_new;
+ rblist->node_delete = saved_value_delete;
+}
+
+void runtime_stat__exit(struct runtime_stat *st)
+{
+ rblist__exit(&st->value_list);
+}
+
void perf_stat__init_shadow_stats(void)
{
have_frontend_stalled = pmu_have_event("cpu", "stalled-cycles-frontend");
@@ -141,6 +157,7 @@ void perf_stat__init_shadow_stats(void)
runtime_saved_values.node_cmp = saved_value_cmp;
runtime_saved_values.node_new = saved_value_new;
runtime_saved_values.node_delete = saved_value_delete;
+ runtime_stat__init(&rt_stat);
}
static int evsel_context(struct perf_evsel *evsel)
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index c685c41f1fb9..f20240037377 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -117,12 +117,15 @@ bool __perf_evsel_stat__is(struct perf_evsel *evsel,
void perf_stat_evsel_id_init(struct perf_evsel *evsel);
+extern struct runtime_stat rt_stat;
extern struct stats walltime_nsecs_stats;
typedef void (*print_metric_t)(void *ctx, const char *color, const char *unit,
const char *fmt, double val);
typedef void (*new_line_t )(void *ctx);
+void runtime_stat__init(struct runtime_stat *st);
+void runtime_stat__exit(struct runtime_stat *st);
void perf_stat__init_shadow_stats(void);
void perf_stat__reset_shadow_stats(void);
void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 04/35] perf stat: Update per-thread shadow stats
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (2 preceding siblings ...)
2017-12-28 14:29 ` [PATCH 03/35] perf stat: Create the runtime_stat init/exit function Arnaldo Carvalho de Melo
@ 2017-12-28 14:29 ` Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 05/35] perf stat: Print " Arnaldo Carvalho de Melo
` (31 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
The functions perf_stat__update_shadow_stats() is called to update the
shadow stats on a set of static variables.
But the static variables are the limitations to be extended to support
per-thread shadow stats.
This patch lets the perf_stat__update_shadow_stats() support to update
the shadow stats on a input parameter 'st' and uses
update_runtime_stat() to update the stats. It will not directly update
the static variables as before.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-5-git-send-email-yao.jin@linux.intel.com
[ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-script.c | 3 +-
tools/perf/builtin-stat.c | 3 +-
tools/perf/util/stat-shadow.c | 86 +++++++++++++++++++++++++++++--------------
tools/perf/util/stat.c | 8 ++--
tools/perf/util/stat.h | 2 +-
5 files changed, 68 insertions(+), 34 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 39d8b55f0db3..81b395040298 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1548,7 +1548,8 @@ static void perf_sample__fprint_metric(struct perf_script *script,
val = sample->period * evsel->scale;
perf_stat__update_shadow_stats(evsel,
val,
- sample->cpu);
+ sample->cpu,
+ &rt_stat);
evsel_script(evsel)->val = val;
if (evsel_script(evsel->leader)->gnum == evsel->leader->nr_members) {
for_each_group_member (ev2, evsel->leader) {
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a027b4712e48..3f4a2c21b824 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1214,7 +1214,8 @@ static void aggr_update_shadow(void)
val += perf_counts(counter->counts, cpu, 0)->val;
}
perf_stat__update_shadow_stats(counter, val,
- first_shadow_cpu(counter, id));
+ first_shadow_cpu(counter, id),
+ &rt_stat);
}
}
}
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 07cfbf613bdc..4b28c40de927 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -116,19 +116,29 @@ static void saved_value_delete(struct rblist *rblist __maybe_unused,
static struct saved_value *saved_value_lookup(struct perf_evsel *evsel,
int cpu,
- bool create)
+ bool create,
+ enum stat_type type,
+ int ctx,
+ struct runtime_stat *st)
{
+ struct rblist *rblist;
struct rb_node *nd;
struct saved_value dm = {
.cpu = cpu,
.evsel = evsel,
+ .type = type,
+ .ctx = ctx,
+ .stat = st,
};
- nd = rblist__find(&runtime_saved_values, &dm);
+
+ rblist = &st->value_list;
+
+ nd = rblist__find(rblist, &dm);
if (nd)
return container_of(nd, struct saved_value, rb_node);
if (create) {
- rblist__add_node(&runtime_saved_values, &dm);
- nd = rblist__find(&runtime_saved_values, &dm);
+ rblist__add_node(rblist, &dm);
+ nd = rblist__find(rblist, &dm);
if (nd)
return container_of(nd, struct saved_value, rb_node);
}
@@ -217,13 +227,24 @@ void perf_stat__reset_shadow_stats(void)
}
}
+static void update_runtime_stat(struct runtime_stat *st,
+ enum stat_type type,
+ int ctx, int cpu, u64 count)
+{
+ struct saved_value *v = saved_value_lookup(NULL, cpu, true,
+ type, ctx, st);
+
+ if (v)
+ update_stats(&v->stats, count);
+}
+
/*
* Update various tracking values we maintain to print
* more semantic information such as miss/hit ratios,
* instruction rates, etc:
*/
void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
- int cpu)
+ int cpu, struct runtime_stat *st)
{
int ctx = evsel_context(counter);
@@ -231,50 +252,58 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
if (perf_evsel__match(counter, SOFTWARE, SW_TASK_CLOCK) ||
perf_evsel__match(counter, SOFTWARE, SW_CPU_CLOCK))
- update_stats(&runtime_nsecs_stats[cpu], count);
+ update_runtime_stat(st, STAT_NSECS, 0, cpu, count);
else if (perf_evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
- update_stats(&runtime_cycles_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_CYCLES, ctx, cpu, count);
else if (perf_stat_evsel__is(counter, CYCLES_IN_TX))
- update_stats(&runtime_cycles_in_tx_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_CYCLES_IN_TX, ctx, cpu, count);
else if (perf_stat_evsel__is(counter, TRANSACTION_START))
- update_stats(&runtime_transaction_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_TRANSACTION, ctx, cpu, count);
else if (perf_stat_evsel__is(counter, ELISION_START))
- update_stats(&runtime_elision_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_ELISION, ctx, cpu, count);
else if (perf_stat_evsel__is(counter, TOPDOWN_TOTAL_SLOTS))
- update_stats(&runtime_topdown_total_slots[ctx][cpu], count);
+ update_runtime_stat(st, STAT_TOPDOWN_TOTAL_SLOTS,
+ ctx, cpu, count);
else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_ISSUED))
- update_stats(&runtime_topdown_slots_issued[ctx][cpu], count);
+ update_runtime_stat(st, STAT_TOPDOWN_SLOTS_ISSUED,
+ ctx, cpu, count);
else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_RETIRED))
- update_stats(&runtime_topdown_slots_retired[ctx][cpu], count);
+ update_runtime_stat(st, STAT_TOPDOWN_SLOTS_RETIRED,
+ ctx, cpu, count);
else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_BUBBLES))
- update_stats(&runtime_topdown_fetch_bubbles[ctx][cpu], count);
+ update_runtime_stat(st, STAT_TOPDOWN_FETCH_BUBBLES,
+ ctx, cpu, count);
else if (perf_stat_evsel__is(counter, TOPDOWN_RECOVERY_BUBBLES))
- update_stats(&runtime_topdown_recovery_bubbles[ctx][cpu], count);
+ update_runtime_stat(st, STAT_TOPDOWN_RECOVERY_BUBBLES,
+ ctx, cpu, count);
else if (perf_evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
- update_stats(&runtime_stalled_cycles_front_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_STALLED_CYCLES_FRONT,
+ ctx, cpu, count);
else if (perf_evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_BACKEND))
- update_stats(&runtime_stalled_cycles_back_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_STALLED_CYCLES_BACK,
+ ctx, cpu, count);
else if (perf_evsel__match(counter, HARDWARE, HW_BRANCH_INSTRUCTIONS))
- update_stats(&runtime_branches_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_BRANCHES, ctx, cpu, count);
else if (perf_evsel__match(counter, HARDWARE, HW_CACHE_REFERENCES))
- update_stats(&runtime_cacherefs_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_CACHEREFS, ctx, cpu, count);
else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_L1D))
- update_stats(&runtime_l1_dcache_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_L1_DCACHE, ctx, cpu, count);
else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_L1I))
- update_stats(&runtime_ll_cache_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_L1_ICACHE, ctx, cpu, count);
else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_LL))
- update_stats(&runtime_ll_cache_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_LL_CACHE, ctx, cpu, count);
else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_DTLB))
- update_stats(&runtime_dtlb_cache_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_DTLB_CACHE, ctx, cpu, count);
else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_ITLB))
- update_stats(&runtime_itlb_cache_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_ITLB_CACHE, ctx, cpu, count);
else if (perf_stat_evsel__is(counter, SMI_NUM))
- update_stats(&runtime_smi_num_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_SMI_NUM, ctx, cpu, count);
else if (perf_stat_evsel__is(counter, APERF))
- update_stats(&runtime_aperf_stats[ctx][cpu], count);
+ update_runtime_stat(st, STAT_APERF, ctx, cpu, count);
if (counter->collect_stat) {
- struct saved_value *v = saved_value_lookup(counter, cpu, true);
+ struct saved_value *v = saved_value_lookup(counter, cpu, true,
+ STAT_NONE, 0, st);
update_stats(&v->stats, count);
}
}
@@ -694,7 +723,8 @@ static void generic_metric(const char *metric_expr,
stats = &walltime_nsecs_stats;
scale = 1e-9;
} else {
- v = saved_value_lookup(metric_events[i], cpu, false);
+ v = saved_value_lookup(metric_events[i], cpu, false,
+ STAT_NONE, 0, &rt_stat);
if (!v)
break;
stats = &v->stats;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 151e9efd7286..78abfd40b135 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -278,9 +278,11 @@ process_counter_values(struct perf_stat_config *config, struct perf_evsel *evsel
perf_evsel__compute_deltas(evsel, cpu, thread, count);
perf_counts_values__scale(count, config->scale, NULL);
if (config->aggr_mode == AGGR_NONE)
- perf_stat__update_shadow_stats(evsel, count->val, cpu);
+ perf_stat__update_shadow_stats(evsel, count->val, cpu,
+ &rt_stat);
if (config->aggr_mode == AGGR_THREAD)
- perf_stat__update_shadow_stats(evsel, count->val, 0);
+ perf_stat__update_shadow_stats(evsel, count->val, 0,
+ &rt_stat);
break;
case AGGR_GLOBAL:
aggr->val += count->val;
@@ -362,7 +364,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
/*
* Save the full runtime - to allow normalization during printout:
*/
- perf_stat__update_shadow_stats(counter, *count, 0);
+ perf_stat__update_shadow_stats(counter, *count, 0, &rt_stat);
return 0;
}
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index f20240037377..bb9902ad3a79 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -129,7 +129,7 @@ void runtime_stat__exit(struct runtime_stat *st);
void perf_stat__init_shadow_stats(void);
void perf_stat__reset_shadow_stats(void);
void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
- int cpu);
+ int cpu, struct runtime_stat *st);
struct perf_stat_output_ctx {
void *ctx;
print_metric_t print_metric;
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 05/35] perf stat: Print per-thread shadow stats
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (3 preceding siblings ...)
2017-12-28 14:29 ` [PATCH 04/35] perf stat: Update per-thread shadow stats Arnaldo Carvalho de Melo
@ 2017-12-28 14:29 ` Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 06/35] perf stat: Remove a set of shadow stats static variables Arnaldo Carvalho de Melo
` (30 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
The function perf_stat__print_shadow_stats() is called to print the
shadow stats on a set of static variables.
But the static variables are the limitations to support
per-thread shadow stats.
This patch lets the perf_stat__print_shadow_stats() support
to print the shadow stats from a input parameter 'st'.
It will not directly get value from static variable. Instead,
it now uses runtime_stat_avg() and runtime_stat_n() to get and
compute the values.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-6-git-send-email-yao.jin@linux.intel.com
[ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-script.c | 3 +-
tools/perf/builtin-stat.c | 23 +++--
tools/perf/util/stat-shadow.c | 209 ++++++++++++++++++++++++++----------------
tools/perf/util/stat.h | 3 +-
4 files changed, 151 insertions(+), 87 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 81b395040298..fac6f053e4da 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1557,7 +1557,8 @@ static void perf_sample__fprint_metric(struct perf_script *script,
evsel_script(ev2)->val,
sample->cpu,
&ctx,
- NULL);
+ NULL,
+ &rt_stat);
}
evsel_script(evsel->leader)->gnum = 0;
}
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 3f4a2c21b824..097a694d16f2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1097,7 +1097,8 @@ static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
}
static void printout(int id, int nr, struct perf_evsel *counter, double uval,
- char *prefix, u64 run, u64 ena, double noise)
+ char *prefix, u64 run, u64 ena, double noise,
+ struct runtime_stat *st)
{
struct perf_stat_output_ctx out;
struct outstate os = {
@@ -1190,7 +1191,7 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
perf_stat__print_shadow_stats(counter, uval,
first_shadow_cpu(counter, id),
- &out, &metric_events);
+ &out, &metric_events, st);
if (!csv_output && !metric_only) {
print_noise(counter, noise);
print_running(run, ena);
@@ -1335,7 +1336,8 @@ static void print_aggr(char *prefix)
fprintf(output, "%s", prefix);
uval = val * counter->scale;
- printout(id, nr, counter, uval, prefix, run, ena, 1.0);
+ printout(id, nr, counter, uval, prefix, run, ena, 1.0,
+ &rt_stat);
if (!metric_only)
fputc('\n', output);
}
@@ -1365,7 +1367,8 @@ static void print_aggr_thread(struct perf_evsel *counter, char *prefix)
fprintf(output, "%s", prefix);
uval = val * counter->scale;
- printout(thread, 0, counter, uval, prefix, run, ena, 1.0);
+ printout(thread, 0, counter, uval, prefix, run, ena, 1.0,
+ &rt_stat);
fputc('\n', output);
}
}
@@ -1402,7 +1405,8 @@ static void print_counter_aggr(struct perf_evsel *counter, char *prefix)
fprintf(output, "%s", prefix);
uval = cd.avg * counter->scale;
- printout(-1, 0, counter, uval, prefix, cd.avg_running, cd.avg_enabled, cd.avg);
+ printout(-1, 0, counter, uval, prefix, cd.avg_running, cd.avg_enabled,
+ cd.avg, &rt_stat);
if (!metric_only)
fprintf(output, "\n");
}
@@ -1441,7 +1445,8 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
fprintf(output, "%s", prefix);
uval = val * counter->scale;
- printout(cpu, 0, counter, uval, prefix, run, ena, 1.0);
+ printout(cpu, 0, counter, uval, prefix, run, ena, 1.0,
+ &rt_stat);
fputc('\n', output);
}
@@ -1473,7 +1478,8 @@ static void print_no_aggr_metric(char *prefix)
run = perf_counts(counter->counts, cpu, 0)->run;
uval = val * counter->scale;
- printout(cpu, 0, counter, uval, prefix, run, ena, 1.0);
+ printout(cpu, 0, counter, uval, prefix, run, ena, 1.0,
+ &rt_stat);
}
fputc('\n', stat_config.output);
}
@@ -1529,7 +1535,8 @@ static void print_metric_headers(const char *prefix, bool no_indent)
perf_stat__print_shadow_stats(counter, 0,
0,
&out,
- &metric_events);
+ &metric_events,
+ &rt_stat);
}
fputc('\n', stat_config.output);
}
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 4b28c40de927..a95c4fe991aa 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -424,15 +424,40 @@ void perf_stat__collect_metric_expr(struct perf_evlist *evsel_list)
}
}
+static double runtime_stat_avg(struct runtime_stat *st,
+ enum stat_type type, int ctx, int cpu)
+{
+ struct saved_value *v;
+
+ v = saved_value_lookup(NULL, cpu, false, type, ctx, st);
+ if (!v)
+ return 0.0;
+
+ return avg_stats(&v->stats);
+}
+
+static double runtime_stat_n(struct runtime_stat *st,
+ enum stat_type type, int ctx, int cpu)
+{
+ struct saved_value *v;
+
+ v = saved_value_lookup(NULL, cpu, false, type, ctx, st);
+ if (!v)
+ return 0.0;
+
+ return v->stats.n;
+}
+
static void print_stalled_cycles_frontend(int cpu,
struct perf_evsel *evsel, double avg,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
{
double total, ratio = 0.0;
const char *color;
int ctx = evsel_context(evsel);
- total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
if (total)
ratio = avg / total * 100.0;
@@ -448,13 +473,14 @@ static void print_stalled_cycles_frontend(int cpu,
static void print_stalled_cycles_backend(int cpu,
struct perf_evsel *evsel, double avg,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
{
double total, ratio = 0.0;
const char *color;
int ctx = evsel_context(evsel);
- total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
if (total)
ratio = avg / total * 100.0;
@@ -467,13 +493,14 @@ static void print_stalled_cycles_backend(int cpu,
static void print_branch_misses(int cpu,
struct perf_evsel *evsel,
double avg,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
{
double total, ratio = 0.0;
const char *color;
int ctx = evsel_context(evsel);
- total = avg_stats(&runtime_branches_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_BRANCHES, ctx, cpu);
if (total)
ratio = avg / total * 100.0;
@@ -486,13 +513,15 @@ static void print_branch_misses(int cpu,
static void print_l1_dcache_misses(int cpu,
struct perf_evsel *evsel,
double avg,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
+
{
double total, ratio = 0.0;
const char *color;
int ctx = evsel_context(evsel);
- total = avg_stats(&runtime_l1_dcache_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_L1_DCACHE, ctx, cpu);
if (total)
ratio = avg / total * 100.0;
@@ -505,13 +534,15 @@ static void print_l1_dcache_misses(int cpu,
static void print_l1_icache_misses(int cpu,
struct perf_evsel *evsel,
double avg,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
+
{
double total, ratio = 0.0;
const char *color;
int ctx = evsel_context(evsel);
- total = avg_stats(&runtime_l1_icache_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_L1_ICACHE, ctx, cpu);
if (total)
ratio = avg / total * 100.0;
@@ -523,13 +554,14 @@ static void print_l1_icache_misses(int cpu,
static void print_dtlb_cache_misses(int cpu,
struct perf_evsel *evsel,
double avg,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
{
double total, ratio = 0.0;
const char *color;
int ctx = evsel_context(evsel);
- total = avg_stats(&runtime_dtlb_cache_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_DTLB_CACHE, ctx, cpu);
if (total)
ratio = avg / total * 100.0;
@@ -541,13 +573,14 @@ static void print_dtlb_cache_misses(int cpu,
static void print_itlb_cache_misses(int cpu,
struct perf_evsel *evsel,
double avg,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
{
double total, ratio = 0.0;
const char *color;
int ctx = evsel_context(evsel);
- total = avg_stats(&runtime_itlb_cache_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_ITLB_CACHE, ctx, cpu);
if (total)
ratio = avg / total * 100.0;
@@ -559,13 +592,14 @@ static void print_itlb_cache_misses(int cpu,
static void print_ll_cache_misses(int cpu,
struct perf_evsel *evsel,
double avg,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
{
double total, ratio = 0.0;
const char *color;
int ctx = evsel_context(evsel);
- total = avg_stats(&runtime_ll_cache_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_LL_CACHE, ctx, cpu);
if (total)
ratio = avg / total * 100.0;
@@ -623,68 +657,72 @@ static double sanitize_val(double x)
return x;
}
-static double td_total_slots(int ctx, int cpu)
+static double td_total_slots(int ctx, int cpu, struct runtime_stat *st)
{
- return avg_stats(&runtime_topdown_total_slots[ctx][cpu]);
+ return runtime_stat_avg(st, STAT_TOPDOWN_TOTAL_SLOTS, ctx, cpu);
}
-static double td_bad_spec(int ctx, int cpu)
+static double td_bad_spec(int ctx, int cpu, struct runtime_stat *st)
{
double bad_spec = 0;
double total_slots;
double total;
- total = avg_stats(&runtime_topdown_slots_issued[ctx][cpu]) -
- avg_stats(&runtime_topdown_slots_retired[ctx][cpu]) +
- avg_stats(&runtime_topdown_recovery_bubbles[ctx][cpu]);
- total_slots = td_total_slots(ctx, cpu);
+ total = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_ISSUED, ctx, cpu) -
+ runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED, ctx, cpu) +
+ runtime_stat_avg(st, STAT_TOPDOWN_RECOVERY_BUBBLES, ctx, cpu);
+
+ total_slots = td_total_slots(ctx, cpu, st);
if (total_slots)
bad_spec = total / total_slots;
return sanitize_val(bad_spec);
}
-static double td_retiring(int ctx, int cpu)
+static double td_retiring(int ctx, int cpu, struct runtime_stat *st)
{
double retiring = 0;
- double total_slots = td_total_slots(ctx, cpu);
- double ret_slots = avg_stats(&runtime_topdown_slots_retired[ctx][cpu]);
+ double total_slots = td_total_slots(ctx, cpu, st);
+ double ret_slots = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED,
+ ctx, cpu);
if (total_slots)
retiring = ret_slots / total_slots;
return retiring;
}
-static double td_fe_bound(int ctx, int cpu)
+static double td_fe_bound(int ctx, int cpu, struct runtime_stat *st)
{
double fe_bound = 0;
- double total_slots = td_total_slots(ctx, cpu);
- double fetch_bub = avg_stats(&runtime_topdown_fetch_bubbles[ctx][cpu]);
+ double total_slots = td_total_slots(ctx, cpu, st);
+ double fetch_bub = runtime_stat_avg(st, STAT_TOPDOWN_FETCH_BUBBLES,
+ ctx, cpu);
if (total_slots)
fe_bound = fetch_bub / total_slots;
return fe_bound;
}
-static double td_be_bound(int ctx, int cpu)
+static double td_be_bound(int ctx, int cpu, struct runtime_stat *st)
{
- double sum = (td_fe_bound(ctx, cpu) +
- td_bad_spec(ctx, cpu) +
- td_retiring(ctx, cpu));
+ double sum = (td_fe_bound(ctx, cpu, st) +
+ td_bad_spec(ctx, cpu, st) +
+ td_retiring(ctx, cpu, st));
if (sum == 0)
return 0;
return sanitize_val(1.0 - sum);
}
static void print_smi_cost(int cpu, struct perf_evsel *evsel,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
{
double smi_num, aperf, cycles, cost = 0.0;
int ctx = evsel_context(evsel);
const char *color = NULL;
- smi_num = avg_stats(&runtime_smi_num_stats[ctx][cpu]);
- aperf = avg_stats(&runtime_aperf_stats[ctx][cpu]);
- cycles = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+ smi_num = runtime_stat_avg(st, STAT_SMI_NUM, ctx, cpu);
+ aperf = runtime_stat_avg(st, STAT_APERF, ctx, cpu);
+ cycles = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
if ((cycles == 0) || (aperf == 0))
return;
@@ -704,7 +742,8 @@ static void generic_metric(const char *metric_expr,
const char *metric_name,
double avg,
int cpu,
- struct perf_stat_output_ctx *out)
+ struct perf_stat_output_ctx *out,
+ struct runtime_stat *st)
{
print_metric_t print_metric = out->print_metric;
struct parse_ctx pctx;
@@ -724,7 +763,7 @@ static void generic_metric(const char *metric_expr,
scale = 1e-9;
} else {
v = saved_value_lookup(metric_events[i], cpu, false,
- STAT_NONE, 0, &rt_stat);
+ STAT_NONE, 0, st);
if (!v)
break;
stats = &v->stats;
@@ -752,7 +791,8 @@ static void generic_metric(const char *metric_expr,
void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
double avg, int cpu,
struct perf_stat_output_ctx *out,
- struct rblist *metric_events)
+ struct rblist *metric_events,
+ struct runtime_stat *st)
{
void *ctxp = out->ctx;
print_metric_t print_metric = out->print_metric;
@@ -763,7 +803,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
int num = 1;
if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
- total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
+
if (total) {
ratio = avg / total;
print_metric(ctxp, NULL, "%7.2f ",
@@ -771,8 +812,13 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
} else {
print_metric(ctxp, NULL, NULL, "insn per cycle", 0);
}
- total = avg_stats(&runtime_stalled_cycles_front_stats[ctx][cpu]);
- total = max(total, avg_stats(&runtime_stalled_cycles_back_stats[ctx][cpu]));
+
+ total = runtime_stat_avg(st, STAT_STALLED_CYCLES_FRONT,
+ ctx, cpu);
+
+ total = max(total, runtime_stat_avg(st,
+ STAT_STALLED_CYCLES_BACK,
+ ctx, cpu));
if (total && avg) {
out->new_line(ctxp);
@@ -785,8 +831,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
"stalled cycles per insn", 0);
}
} else if (perf_evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
- if (runtime_branches_stats[ctx][cpu].n != 0)
- print_branch_misses(cpu, evsel, avg, out);
+ if (runtime_stat_n(st, STAT_BRANCHES, ctx, cpu) != 0)
+ print_branch_misses(cpu, evsel, avg, out, st);
else
print_metric(ctxp, NULL, NULL, "of all branches", 0);
} else if (
@@ -794,8 +840,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
evsel->attr.config == ( PERF_COUNT_HW_CACHE_L1D |
((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
- if (runtime_l1_dcache_stats[ctx][cpu].n != 0)
- print_l1_dcache_misses(cpu, evsel, avg, out);
+
+ if (runtime_stat_n(st, STAT_L1_DCACHE, ctx, cpu) != 0)
+ print_l1_dcache_misses(cpu, evsel, avg, out, st);
else
print_metric(ctxp, NULL, NULL, "of all L1-dcache hits", 0);
} else if (
@@ -803,8 +850,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
evsel->attr.config == ( PERF_COUNT_HW_CACHE_L1I |
((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
- if (runtime_l1_icache_stats[ctx][cpu].n != 0)
- print_l1_icache_misses(cpu, evsel, avg, out);
+
+ if (runtime_stat_n(st, STAT_L1_ICACHE, ctx, cpu) != 0)
+ print_l1_icache_misses(cpu, evsel, avg, out, st);
else
print_metric(ctxp, NULL, NULL, "of all L1-icache hits", 0);
} else if (
@@ -812,8 +860,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
evsel->attr.config == ( PERF_COUNT_HW_CACHE_DTLB |
((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
- if (runtime_dtlb_cache_stats[ctx][cpu].n != 0)
- print_dtlb_cache_misses(cpu, evsel, avg, out);
+
+ if (runtime_stat_n(st, STAT_DTLB_CACHE, ctx, cpu) != 0)
+ print_dtlb_cache_misses(cpu, evsel, avg, out, st);
else
print_metric(ctxp, NULL, NULL, "of all dTLB cache hits", 0);
} else if (
@@ -821,8 +870,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
evsel->attr.config == ( PERF_COUNT_HW_CACHE_ITLB |
((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
- if (runtime_itlb_cache_stats[ctx][cpu].n != 0)
- print_itlb_cache_misses(cpu, evsel, avg, out);
+
+ if (runtime_stat_n(st, STAT_ITLB_CACHE, ctx, cpu) != 0)
+ print_itlb_cache_misses(cpu, evsel, avg, out, st);
else
print_metric(ctxp, NULL, NULL, "of all iTLB cache hits", 0);
} else if (
@@ -830,27 +880,28 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
evsel->attr.config == ( PERF_COUNT_HW_CACHE_LL |
((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
- if (runtime_ll_cache_stats[ctx][cpu].n != 0)
- print_ll_cache_misses(cpu, evsel, avg, out);
+
+ if (runtime_stat_n(st, STAT_LL_CACHE, ctx, cpu) != 0)
+ print_ll_cache_misses(cpu, evsel, avg, out, st);
else
print_metric(ctxp, NULL, NULL, "of all LL-cache hits", 0);
} else if (perf_evsel__match(evsel, HARDWARE, HW_CACHE_MISSES)) {
- total = avg_stats(&runtime_cacherefs_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_CACHEREFS, ctx, cpu);
if (total)
ratio = avg * 100 / total;
- if (runtime_cacherefs_stats[ctx][cpu].n != 0)
+ if (runtime_stat_n(st, STAT_CACHEREFS, ctx, cpu) != 0)
print_metric(ctxp, NULL, "%8.3f %%",
"of all cache refs", ratio);
else
print_metric(ctxp, NULL, NULL, "of all cache refs", 0);
} else if (perf_evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) {
- print_stalled_cycles_frontend(cpu, evsel, avg, out);
+ print_stalled_cycles_frontend(cpu, evsel, avg, out, st);
} else if (perf_evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND)) {
- print_stalled_cycles_backend(cpu, evsel, avg, out);
+ print_stalled_cycles_backend(cpu, evsel, avg, out, st);
} else if (perf_evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
- total = avg_stats(&runtime_nsecs_stats[cpu]);
+ total = runtime_stat_avg(st, STAT_NSECS, 0, cpu);
if (total) {
ratio = avg / total;
@@ -859,7 +910,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
print_metric(ctxp, NULL, NULL, "Ghz", 0);
}
} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX)) {
- total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
+
if (total)
print_metric(ctxp, NULL,
"%7.2f%%", "transactional cycles",
@@ -868,8 +920,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
print_metric(ctxp, NULL, NULL, "transactional cycles",
0);
} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX_CP)) {
- total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
- total2 = avg_stats(&runtime_cycles_in_tx_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
+ total2 = runtime_stat_avg(st, STAT_CYCLES_IN_TX, ctx, cpu);
+
if (total2 < avg)
total2 = avg;
if (total)
@@ -878,19 +931,21 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
else
print_metric(ctxp, NULL, NULL, "aborted cycles", 0);
} else if (perf_stat_evsel__is(evsel, TRANSACTION_START)) {
- total = avg_stats(&runtime_cycles_in_tx_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_CYCLES_IN_TX,
+ ctx, cpu);
if (avg)
ratio = total / avg;
- if (runtime_cycles_in_tx_stats[ctx][cpu].n != 0)
+ if (runtime_stat_n(st, STAT_CYCLES_IN_TX, ctx, cpu) != 0)
print_metric(ctxp, NULL, "%8.0f",
"cycles / transaction", ratio);
else
print_metric(ctxp, NULL, NULL, "cycles / transaction",
- 0);
+ 0);
} else if (perf_stat_evsel__is(evsel, ELISION_START)) {
- total = avg_stats(&runtime_cycles_in_tx_stats[ctx][cpu]);
+ total = runtime_stat_avg(st, STAT_CYCLES_IN_TX,
+ ctx, cpu);
if (avg)
ratio = total / avg;
@@ -904,28 +959,28 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
else
print_metric(ctxp, NULL, NULL, "CPUs utilized", 0);
} else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_BUBBLES)) {
- double fe_bound = td_fe_bound(ctx, cpu);
+ double fe_bound = td_fe_bound(ctx, cpu, st);
if (fe_bound > 0.2)
color = PERF_COLOR_RED;
print_metric(ctxp, color, "%8.1f%%", "frontend bound",
fe_bound * 100.);
} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_RETIRED)) {
- double retiring = td_retiring(ctx, cpu);
+ double retiring = td_retiring(ctx, cpu, st);
if (retiring > 0.7)
color = PERF_COLOR_GREEN;
print_metric(ctxp, color, "%8.1f%%", "retiring",
retiring * 100.);
} else if (perf_stat_evsel__is(evsel, TOPDOWN_RECOVERY_BUBBLES)) {
- double bad_spec = td_bad_spec(ctx, cpu);
+ double bad_spec = td_bad_spec(ctx, cpu, st);
if (bad_spec > 0.1)
color = PERF_COLOR_RED;
print_metric(ctxp, color, "%8.1f%%", "bad speculation",
bad_spec * 100.);
} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_ISSUED)) {
- double be_bound = td_be_bound(ctx, cpu);
+ double be_bound = td_be_bound(ctx, cpu, st);
const char *name = "backend bound";
static int have_recovery_bubbles = -1;
@@ -938,19 +993,19 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
if (be_bound > 0.2)
color = PERF_COLOR_RED;
- if (td_total_slots(ctx, cpu) > 0)
+ if (td_total_slots(ctx, cpu, st) > 0)
print_metric(ctxp, color, "%8.1f%%", name,
be_bound * 100.);
else
print_metric(ctxp, NULL, NULL, name, 0);
} else if (evsel->metric_expr) {
generic_metric(evsel->metric_expr, evsel->metric_events, evsel->name,
- evsel->metric_name, avg, cpu, out);
- } else if (runtime_nsecs_stats[cpu].n != 0) {
+ evsel->metric_name, avg, cpu, out, st);
+ } else if (runtime_stat_n(st, STAT_NSECS, 0, cpu) != 0) {
char unit = 'M';
char unit_buf[10];
- total = avg_stats(&runtime_nsecs_stats[cpu]);
+ total = runtime_stat_avg(st, STAT_NSECS, 0, cpu);
if (total)
ratio = 1000.0 * avg / total;
@@ -961,7 +1016,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
snprintf(unit_buf, sizeof(unit_buf), "%c/sec", unit);
print_metric(ctxp, NULL, "%8.3f", unit_buf, ratio);
} else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
- print_smi_cost(cpu, evsel, out);
+ print_smi_cost(cpu, evsel, out, st);
} else {
num = 0;
}
@@ -974,7 +1029,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
out->new_line(ctxp);
generic_metric(mexp->metric_expr, mexp->metric_events,
evsel->name, mexp->metric_name,
- avg, cpu, out);
+ avg, cpu, out, st);
}
}
if (num == 0)
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index bb9902ad3a79..76b322a2d293 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -140,7 +140,8 @@ struct perf_stat_output_ctx {
void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
double avg, int cpu,
struct perf_stat_output_ctx *out,
- struct rblist *metric_events);
+ struct rblist *metric_events,
+ struct runtime_stat *st);
void perf_stat__collect_metric_expr(struct perf_evlist *);
int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw);
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 06/35] perf stat: Remove a set of shadow stats static variables
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (4 preceding siblings ...)
2017-12-28 14:29 ` [PATCH 05/35] perf stat: Print " Arnaldo Carvalho de Melo
@ 2017-12-28 14:29 ` Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 07/35] perf stat: Allocate shadow stats buffer for threads Arnaldo Carvalho de Melo
` (29 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
In previous patches, we have reconstructed the code and let it not
access the static variables directly.
This patch removes these static variables.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-7-git-send-email-yao.jin@linux.intel.com
[ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/stat-shadow.c | 68 ++++++++++---------------------------------
tools/perf/util/stat.h | 1 +
2 files changed, 16 insertions(+), 53 deletions(-)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index a95c4fe991aa..594d14a02b67 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -16,28 +16,6 @@
* AGGR_NONE: Use matching CPU
* AGGR_THREAD: Not supported?
*/
-static struct stats runtime_nsecs_stats[MAX_NR_CPUS];
-static struct stats runtime_cycles_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_stalled_cycles_front_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_stalled_cycles_back_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_branches_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_cacherefs_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_l1_dcache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_l1_icache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_ll_cache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_itlb_cache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_dtlb_cache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_cycles_in_tx_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_transaction_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_elision_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_total_slots[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_slots_issued[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_slots_retired[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_fetch_bubbles[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_recovery_bubbles[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_smi_num_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_aperf_stats[NUM_CTX][MAX_NR_CPUS];
-static struct rblist runtime_saved_values;
static bool have_frontend_stalled;
struct runtime_stat rt_stat;
@@ -163,10 +141,6 @@ void runtime_stat__exit(struct runtime_stat *st)
void perf_stat__init_shadow_stats(void)
{
have_frontend_stalled = pmu_have_event("cpu", "stalled-cycles-frontend");
- rblist__init(&runtime_saved_values);
- runtime_saved_values.node_cmp = saved_value_cmp;
- runtime_saved_values.node_new = saved_value_new;
- runtime_saved_values.node_delete = saved_value_delete;
runtime_stat__init(&rt_stat);
}
@@ -188,36 +162,13 @@ static int evsel_context(struct perf_evsel *evsel)
return ctx;
}
-void perf_stat__reset_shadow_stats(void)
+static void reset_stat(struct runtime_stat *st)
{
+ struct rblist *rblist;
struct rb_node *pos, *next;
- memset(runtime_nsecs_stats, 0, sizeof(runtime_nsecs_stats));
- memset(runtime_cycles_stats, 0, sizeof(runtime_cycles_stats));
- memset(runtime_stalled_cycles_front_stats, 0, sizeof(runtime_stalled_cycles_front_stats));
- memset(runtime_stalled_cycles_back_stats, 0, sizeof(runtime_stalled_cycles_back_stats));
- memset(runtime_branches_stats, 0, sizeof(runtime_branches_stats));
- memset(runtime_cacherefs_stats, 0, sizeof(runtime_cacherefs_stats));
- memset(runtime_l1_dcache_stats, 0, sizeof(runtime_l1_dcache_stats));
- memset(runtime_l1_icache_stats, 0, sizeof(runtime_l1_icache_stats));
- memset(runtime_ll_cache_stats, 0, sizeof(runtime_ll_cache_stats));
- memset(runtime_itlb_cache_stats, 0, sizeof(runtime_itlb_cache_stats));
- memset(runtime_dtlb_cache_stats, 0, sizeof(runtime_dtlb_cache_stats));
- memset(runtime_cycles_in_tx_stats, 0,
- sizeof(runtime_cycles_in_tx_stats));
- memset(runtime_transaction_stats, 0,
- sizeof(runtime_transaction_stats));
- memset(runtime_elision_stats, 0, sizeof(runtime_elision_stats));
- memset(&walltime_nsecs_stats, 0, sizeof(walltime_nsecs_stats));
- memset(runtime_topdown_total_slots, 0, sizeof(runtime_topdown_total_slots));
- memset(runtime_topdown_slots_retired, 0, sizeof(runtime_topdown_slots_retired));
- memset(runtime_topdown_slots_issued, 0, sizeof(runtime_topdown_slots_issued));
- memset(runtime_topdown_fetch_bubbles, 0, sizeof(runtime_topdown_fetch_bubbles));
- memset(runtime_topdown_recovery_bubbles, 0, sizeof(runtime_topdown_recovery_bubbles));
- memset(runtime_smi_num_stats, 0, sizeof(runtime_smi_num_stats));
- memset(runtime_aperf_stats, 0, sizeof(runtime_aperf_stats));
-
- next = rb_first(&runtime_saved_values.entries);
+ rblist = &st->value_list;
+ next = rb_first(&rblist->entries);
while (next) {
pos = next;
next = rb_next(pos);
@@ -227,6 +178,17 @@ void perf_stat__reset_shadow_stats(void)
}
}
+void perf_stat__reset_shadow_stats(void)
+{
+ reset_stat(&rt_stat);
+ memset(&walltime_nsecs_stats, 0, sizeof(walltime_nsecs_stats));
+}
+
+void perf_stat__reset_shadow_per_stat(struct runtime_stat *st)
+{
+ reset_stat(st);
+}
+
static void update_runtime_stat(struct runtime_stat *st,
enum stat_type type,
int ctx, int cpu, u64 count)
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 76b322a2d293..cfe4fb899633 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -128,6 +128,7 @@ void runtime_stat__init(struct runtime_stat *st);
void runtime_stat__exit(struct runtime_stat *st);
void perf_stat__init_shadow_stats(void);
void perf_stat__reset_shadow_stats(void);
+void perf_stat__reset_shadow_per_stat(struct runtime_stat *st);
void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
int cpu, struct runtime_stat *st);
struct perf_stat_output_ctx {
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 07/35] perf stat: Allocate shadow stats buffer for threads
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (5 preceding siblings ...)
2017-12-28 14:29 ` [PATCH 06/35] perf stat: Remove a set of shadow stats static variables Arnaldo Carvalho de Melo
@ 2017-12-28 14:29 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 08/35] perf stat: Update or print per-thread stats Arnaldo Carvalho de Melo
` (28 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
After perf_evlist__create_maps() being executed, we can get all threads
from /proc. And via thread_map__nr(), we can also get the number of
threads.
With the number of threads, the patch allocates a buffer which will
record the shadow stats for these threads.
The buffer pointer is saved in stat_config.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-8-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-stat.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
tools/perf/util/stat.h | 2 ++
2 files changed, 47 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 097a694d16f2..4c492ac3ac07 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -214,8 +214,13 @@ static inline void diff_timespec(struct timespec *r, struct timespec *a,
static void perf_stat__reset_stats(void)
{
+ int i;
+
perf_evlist__reset_stats(evsel_list);
perf_stat__reset_shadow_stats();
+
+ for (i = 0; i < stat_config.stats_num; i++)
+ perf_stat__reset_shadow_per_stat(&stat_config.stats[i]);
}
static int create_perf_stat_counter(struct perf_evsel *evsel)
@@ -2495,6 +2500,35 @@ int process_cpu_map_event(struct perf_tool *tool,
return set_maps(st);
}
+static int runtime_stat_new(struct perf_stat_config *config, int nthreads)
+{
+ int i;
+
+ config->stats = calloc(nthreads, sizeof(struct runtime_stat));
+ if (!config->stats)
+ return -1;
+
+ config->stats_num = nthreads;
+
+ for (i = 0; i < nthreads; i++)
+ runtime_stat__init(&config->stats[i]);
+
+ return 0;
+}
+
+static void runtime_stat_delete(struct perf_stat_config *config)
+{
+ int i;
+
+ if (!config->stats)
+ return;
+
+ for (i = 0; i < config->stats_num; i++)
+ runtime_stat__exit(&config->stats[i]);
+
+ free(config->stats);
+}
+
static const char * const stat_report_usage[] = {
"perf stat report [<options>]",
NULL,
@@ -2750,8 +2784,15 @@ int cmd_stat(int argc, const char **argv)
* Initialize thread_map with comm names,
* so we could print it out on output.
*/
- if (stat_config.aggr_mode == AGGR_THREAD)
+ if (stat_config.aggr_mode == AGGR_THREAD) {
thread_map__read_comms(evsel_list->threads);
+ if (target.system_wide) {
+ if (runtime_stat_new(&stat_config,
+ thread_map__nr(evsel_list->threads))) {
+ goto out;
+ }
+ }
+ }
if (interval && interval < 100) {
if (interval < 10) {
@@ -2841,5 +2882,8 @@ int cmd_stat(int argc, const char **argv)
sysfs__write_int(FREEZE_ON_SMI_PATH, 0);
perf_evlist__delete(evsel_list);
+
+ runtime_stat_delete(&stat_config);
+
return status;
}
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index cfe4fb899633..2ed95dc72784 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -90,6 +90,8 @@ struct perf_stat_config {
bool scale;
FILE *output;
unsigned int interval;
+ struct runtime_stat *stats;
+ int stats_num;
};
void update_stats(struct stats *stats, u64 val);
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 08/35] perf stat: Update or print per-thread stats
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (6 preceding siblings ...)
2017-12-28 14:29 ` [PATCH 07/35] perf stat: Allocate shadow stats buffer for threads Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 09/35] perf thread_map: Enumerate all threads from /proc Arnaldo Carvalho de Melo
` (27 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
If the stats pointer in stat_config structure is not null, it will
update the per-thread stats or print the per-thread stats on this
buffer.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-9-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-stat.c | 9 +++++++--
tools/perf/util/stat.c | 11 ++++++++---
2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 4c492ac3ac07..f4129a5fbb01 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1372,8 +1372,13 @@ static void print_aggr_thread(struct perf_evsel *counter, char *prefix)
fprintf(output, "%s", prefix);
uval = val * counter->scale;
- printout(thread, 0, counter, uval, prefix, run, ena, 1.0,
- &rt_stat);
+
+ if (stat_config.stats)
+ printout(thread, 0, counter, uval, prefix, run, ena,
+ 1.0, &stat_config.stats[thread]);
+ else
+ printout(thread, 0, counter, uval, prefix, run, ena,
+ 1.0, &rt_stat);
fputc('\n', output);
}
}
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 78abfd40b135..32235657c1ac 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -280,9 +280,14 @@ process_counter_values(struct perf_stat_config *config, struct perf_evsel *evsel
if (config->aggr_mode == AGGR_NONE)
perf_stat__update_shadow_stats(evsel, count->val, cpu,
&rt_stat);
- if (config->aggr_mode == AGGR_THREAD)
- perf_stat__update_shadow_stats(evsel, count->val, 0,
- &rt_stat);
+ if (config->aggr_mode == AGGR_THREAD) {
+ if (config->stats)
+ perf_stat__update_shadow_stats(evsel,
+ count->val, 0, &config->stats[thread]);
+ else
+ perf_stat__update_shadow_stats(evsel,
+ count->val, 0, &rt_stat);
+ }
break;
case AGGR_GLOBAL:
aggr->val += count->val;
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 09/35] perf thread_map: Enumerate all threads from /proc
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (7 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 08/35] perf stat: Update or print per-thread stats Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 10/35] perf stat: Remove --per-thread pid/tid limitation Arnaldo Carvalho de Melo
` (26 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
This patch calls thread_map__new_all_cpus() to enumerate all threads
from /proc if per-thread flag is enabled.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-10-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/tests/thread-map.c | 2 +-
tools/perf/util/evlist.c | 3 ++-
tools/perf/util/thread_map.c | 5 ++++-
tools/perf/util/thread_map.h | 2 +-
4 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/tools/perf/tests/thread-map.c b/tools/perf/tests/thread-map.c
index dbcb6a19b375..4de1939b58ba 100644
--- a/tools/perf/tests/thread-map.c
+++ b/tools/perf/tests/thread-map.c
@@ -105,7 +105,7 @@ int test__thread_map_remove(struct test *test __maybe_unused, int subtest __mayb
TEST_ASSERT_VAL("failed to allocate map string",
asprintf(&str, "%d,%d", getpid(), getppid()) >= 0);
- threads = thread_map__new_str(str, NULL, 0);
+ threads = thread_map__new_str(str, NULL, 0, false);
TEST_ASSERT_VAL("failed to allocate thread_map",
threads);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3570355bcf39..f0a5e09c4071 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1105,7 +1105,8 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
struct cpu_map *cpus;
struct thread_map *threads;
- threads = thread_map__new_str(target->pid, target->tid, target->uid);
+ threads = thread_map__new_str(target->pid, target->tid, target->uid,
+ target->per_thread);
if (!threads)
return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 2b653853eec2..3e1038f6491c 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
}
struct thread_map *thread_map__new_str(const char *pid, const char *tid,
- uid_t uid)
+ uid_t uid, bool per_thread)
{
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,6 +331,9 @@ struct thread_map *thread_map__new_str(const char *pid, const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
+ if (per_thread)
+ return thread_map__new_all_cpus();
+
return thread_map__new_by_tid_str(tid);
}
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 07a765fb22bb..0a806b99e73c 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
void thread_map__put(struct thread_map *map);
struct thread_map *thread_map__new_str(const char *pid,
- const char *tid, uid_t uid);
+ const char *tid, uid_t uid, bool per_thread);
struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 10/35] perf stat: Remove --per-thread pid/tid limitation
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (8 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 09/35] perf thread_map: Enumerate all threads from /proc Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 11/35] perf stat: Resort '--per-thread' result Arnaldo Carvalho de Melo
` (25 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
Currently, if we execute 'perf stat --per-thread' without specifying
pid/tid, perf will return error.
root@skl:/tmp# perf stat --per-thread
The --per-thread option is only available when monitoring via -p -t options.
-p, --pid <pid> stat events on existing process id
-t, --tid <tid> stat events on existing thread id
This patch removes this limitation. If no pid/tid specified, it returns
all threads (get threads from /proc).
Note that it doesn't support cpu_list yet so if it's a cpu_list case,
then skip.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-11-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-stat.c | 23 +++++++++++++++--------
tools/perf/util/target.h | 7 +++++++
2 files changed, 22 insertions(+), 8 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f4129a5fbb01..ee708ba6f79a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -277,7 +277,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
attr->enable_on_exec = 1;
}
- if (target__has_cpu(&target))
+ if (target__has_cpu(&target) && !target__has_per_thread(&target))
return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
return perf_evsel__open_per_thread(evsel, evsel_list->threads);
@@ -340,7 +340,7 @@ static int read_counter(struct perf_evsel *counter)
int nthreads = thread_map__nr(evsel_list->threads);
int ncpus, cpu, thread;
- if (target__has_cpu(&target))
+ if (target__has_cpu(&target) && !target__has_per_thread(&target))
ncpus = perf_evsel__nr_cpus(counter);
else
ncpus = 1;
@@ -2743,12 +2743,16 @@ int cmd_stat(int argc, const char **argv)
run_count = 1;
}
- if ((stat_config.aggr_mode == AGGR_THREAD) && !target__has_task(&target)) {
- fprintf(stderr, "The --per-thread option is only available "
- "when monitoring via -p -t options.\n");
- parse_options_usage(NULL, stat_options, "p", 1);
- parse_options_usage(NULL, stat_options, "t", 1);
- goto out;
+ if ((stat_config.aggr_mode == AGGR_THREAD) &&
+ !target__has_task(&target)) {
+ if (!target.system_wide || target.cpu_list) {
+ fprintf(stderr, "The --per-thread option is only "
+ "available when monitoring via -p -t -a "
+ "options or only --per-thread.\n");
+ parse_options_usage(NULL, stat_options, "p", 1);
+ parse_options_usage(NULL, stat_options, "t", 1);
+ goto out;
+ }
}
/*
@@ -2772,6 +2776,9 @@ int cmd_stat(int argc, const char **argv)
target__validate(&target);
+ if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
+ target.per_thread = true;
+
if (perf_evlist__create_maps(evsel_list, &target) < 0) {
if (target__has_task(&target)) {
pr_err("Problems finding threads of monitor\n");
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 446aa7a56f25..6ef01a83b24e 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -64,6 +64,11 @@ static inline bool target__none(struct target *target)
return !target__has_task(target) && !target__has_cpu(target);
}
+static inline bool target__has_per_thread(struct target *target)
+{
+ return target->system_wide && target->per_thread;
+}
+
static inline bool target__uses_dummy_map(struct target *target)
{
bool use_dummy = false;
@@ -73,6 +78,8 @@ static inline bool target__uses_dummy_map(struct target *target)
else if (target__has_task(target) ||
(!target__has_cpu(target) && !target->uses_mmap))
use_dummy = true;
+ else if (target__has_per_thread(target))
+ use_dummy = true;
return use_dummy;
}
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 11/35] perf stat: Resort '--per-thread' result
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (9 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 10/35] perf stat: Remove --per-thread pid/tid limitation Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 12/35] perf utils: Move is_directory() to path.h Arnaldo Carvalho de Melo
` (24 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
There are many threads reported if we enable '--per-thread'
globally.
1. Most of the threads are not counted or counting value 0.
This patch removes these threads.
2. We also resort the threads in display according to the
counting value. It's useful for user to see the hottest
threads easily.
For example, the new results would be:
root@skl:/tmp# perf stat --per-thread
^C
Performance counter stats for 'system wide':
perf-24165 4.302433 cpu-clock (msec) # 0.001 CPUs utilized
vmstat-23127 1.562215 cpu-clock (msec) # 0.000 CPUs utilized
irqbalance-2780 0.827851 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23111 0.278308 cpu-clock (msec) # 0.000 CPUs utilized
thermald-2841 0.230880 cpu-clock (msec) # 0.000 CPUs utilized
sshd-23058 0.207306 cpu-clock (msec) # 0.000 CPUs utilized
kworker/0:2-19991 0.133983 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:1-18249 0.125636 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 0.085533 cpu-clock (msec) # 0.000 CPUs utilized
kworker/u16:2-23146 0.077139 cpu-clock (msec) # 0.000 CPUs utilized
gmain-2700 0.041789 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1-15354 0.028370 cpu-clock (msec) # 0.000 CPUs utilized
kworker/6:0-17528 0.023895 cpu-clock (msec) # 0.000 CPUs utilized
kworker/4:1H-1887 0.013209 cpu-clock (msec) # 0.000 CPUs utilized
kworker/5:2-31362 0.011627 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/0-11 0.010892 cpu-clock (msec) # 0.000 CPUs utilized
kworker/3:2-12870 0.010220 cpu-clock (msec) # 0.000 CPUs utilized
ksoftirqd/0-7 0.008869 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/1-14 0.008476 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/7-50 0.002944 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/3-26 0.002893 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/4-32 0.002759 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/2-20 0.002429 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/6-44 0.001491 cpu-clock (msec) # 0.000 CPUs utilized
watchdog/5-38 0.001477 cpu-clock (msec) # 0.000 CPUs utilized
rcu_sched-8 10 context-switches # 0.117 M/sec
kworker/u16:1-18249 7 context-switches # 0.056 M/sec
sshd-23111 4 context-switches # 0.014 M/sec
vmstat-23127 4 context-switches # 0.003 M/sec
perf-24165 4 context-switches # 0.930 K/sec
kworker/0:2-19991 3 context-switches # 0.022 M/sec
kworker/u16:2-23146 3 context-switches # 0.039 M/sec
kworker/4:1-15354 2 context-switches # 0.070 M/sec
kworker/6:0-17528 2 context-switches # 0.084 M/sec
sshd-23058 2 context-switches # 0.010 M/sec
ksoftirqd/0-7 1 context-switches # 0.113 M/sec
watchdog/0-11 1 context-switches # 0.092 M/sec
watchdog/1-14 1 context-switches # 0.118 M/sec
watchdog/2-20 1 context-switches # 0.412 M/sec
watchdog/3-26 1 context-switches # 0.346 M/sec
watchdog/4-32 1 context-switches # 0.362 M/sec
watchdog/5-38 1 context-switches # 0.677 M/sec
watchdog/6-44 1 context-switches # 0.671 M/sec
watchdog/7-50 1 context-switches # 0.340 M/sec
kworker/4:1H-1887 1 context-switches # 0.076 M/sec
thermald-2841 1 context-switches # 0.004 M/sec
gmain-2700 1 context-switches # 0.024 M/sec
irqbalance-2780 1 context-switches # 0.001 M/sec
kworker/3:2-12870 1 context-switches # 0.098 M/sec
kworker/5:2-31362 1 context-switches # 0.086 M/sec
kworker/u16:1-18249 2 cpu-migrations # 0.016 M/sec
kworker/u16:2-23146 2 cpu-migrations # 0.026 M/sec
rcu_sched-8 1 cpu-migrations # 0.012 M/sec
sshd-23058 1 cpu-migrations # 0.005 M/sec
perf-24165 8,833,385 cycles # 2.053 GHz
vmstat-23127 1,702,699 cycles # 1.090 GHz
irqbalance-2780 739,847 cycles # 0.894 GHz
sshd-23111 269,506 cycles # 0.968 GHz
thermald-2841 204,556 cycles # 0.886 GHz
sshd-23058 158,780 cycles # 0.766 GHz
kworker/0:2-19991 112,981 cycles # 0.843 GHz
kworker/u16:1-18249 100,926 cycles # 0.803 GHz
rcu_sched-8 74,024 cycles # 0.865 GHz
kworker/u16:2-23146 55,984 cycles # 0.726 GHz
gmain-2700 34,278 cycles # 0.820 GHz
kworker/4:1-15354 20,665 cycles # 0.728 GHz
kworker/6:0-17528 16,445 cycles # 0.688 GHz
kworker/5:2-31362 9,492 cycles # 0.816 GHz
watchdog/3-26 8,695 cycles # 3.006 GHz
kworker/4:1H-1887 8,238 cycles # 0.624 GHz
watchdog/4-32 7,580 cycles # 2.747 GHz
kworker/3:2-12870 7,306 cycles # 0.715 GHz
watchdog/2-20 7,274 cycles # 2.995 GHz
watchdog/0-11 6,988 cycles # 0.642 GHz
ksoftirqd/0-7 6,376 cycles # 0.719 GHz
watchdog/1-14 5,340 cycles # 0.630 GHz
watchdog/5-38 4,061 cycles # 2.749 GHz
watchdog/6-44 3,976 cycles # 2.667 GHz
watchdog/7-50 3,418 cycles # 1.161 GHz
vmstat-23127 2,511,699 instructions # 1.48 insn per cycle
perf-24165 1,829,908 instructions # 0.21 insn per cycle
irqbalance-2780 1,190,204 instructions # 1.61 insn per cycle
thermald-2841 143,544 instructions # 0.70 insn per cycle
sshd-23111 128,138 instructions # 0.48 insn per cycle
sshd-23058 57,654 instructions # 0.36 insn per cycle
rcu_sched-8 44,063 instructions # 0.60 insn per cycle
kworker/u16:1-18249 42,551 instructions # 0.42 insn per cycle
kworker/0:2-19991 25,873 instructions # 0.23 insn per cycle
kworker/u16:2-23146 21,407 instructions # 0.38 insn per cycle
gmain-2700 13,691 instructions # 0.40 insn per cycle
kworker/4:1-15354 12,964 instructions # 0.63 insn per cycle
kworker/6:0-17528 10,034 instructions # 0.61 insn per cycle
kworker/5:2-31362 5,203 instructions # 0.55 insn per cycle
kworker/3:2-12870 4,866 instructions # 0.67 insn per cycle
kworker/4:1H-1887 3,586 instructions # 0.44 insn per cycle
ksoftirqd/0-7 3,463 instructions # 0.54 insn per cycle
watchdog/0-11 3,135 instructions # 0.45 insn per cycle
watchdog/1-14 3,135 instructions # 0.59 insn per cycle
watchdog/2-20 3,135 instructions # 0.43 insn per cycle
watchdog/3-26 3,135 instructions # 0.36 insn per cycle
watchdog/4-32 3,135 instructions # 0.41 insn per cycle
watchdog/5-38 3,135 instructions # 0.77 insn per cycle
watchdog/6-44 3,135 instructions # 0.79 insn per cycle
watchdog/7-50 3,135 instructions # 0.92 insn per cycle
vmstat-23127 539,181 branches # 345.139 M/sec
perf-24165 375,364 branches # 87.245 M/sec
irqbalance-2780 262,092 branches # 316.593 M/sec
thermald-2841 31,611 branches # 136.915 M/sec
sshd-23111 21,874 branches # 78.596 M/sec
sshd-23058 10,682 branches # 51.528 M/sec
rcu_sched-8 8,693 branches # 101.633 M/sec
kworker/u16:1-18249 7,891 branches # 62.808 M/sec
kworker/0:2-19991 5,761 branches # 42.998 M/sec
kworker/u16:2-23146 4,099 branches # 53.138 M/sec
kworker/4:1-15354 2,755 branches # 97.110 M/sec
gmain-2700 2,638 branches # 63.127 M/sec
kworker/6:0-17528 2,216 branches # 92.739 M/sec
kworker/5:2-31362 1,132 branches # 97.360 M/sec
kworker/3:2-12870 1,081 branches # 105.773 M/sec
kworker/4:1H-1887 725 branches # 54.887 M/sec
ksoftirqd/0-7 707 branches # 79.716 M/sec
watchdog/0-11 652 branches # 59.860 M/sec
watchdog/1-14 652 branches # 76.923 M/sec
watchdog/2-20 652 branches # 268.423 M/sec
watchdog/3-26 652 branches # 225.372 M/sec
watchdog/4-32 652 branches # 236.318 M/sec
watchdog/5-38 652 branches # 441.435 M/sec
watchdog/6-44 652 branches # 437.290 M/sec
watchdog/7-50 652 branches # 221.467 M/sec
vmstat-23127 8,960 branch-misses # 1.66% of all branches
irqbalance-2780 3,047 branch-misses # 1.16% of all branches
perf-24165 2,876 branch-misses # 0.77% of all branches
sshd-23111 1,843 branch-misses # 8.43% of all branches
thermald-2841 1,444 branch-misses # 4.57% of all branches
sshd-23058 1,379 branch-misses # 12.91% of all branches
kworker/u16:1-18249 982 branch-misses # 12.44% of all branches
rcu_sched-8 893 branch-misses # 10.27% of all branches
kworker/u16:2-23146 578 branch-misses # 14.10% of all branches
kworker/0:2-19991 376 branch-misses # 6.53% of all branches
gmain-2700 280 branch-misses # 10.61% of all branches
kworker/6:0-17528 196 branch-misses # 8.84% of all branches
kworker/4:1-15354 187 branch-misses # 6.79% of all branches
kworker/5:2-31362 123 branch-misses # 10.87% of all branches
watchdog/0-11 95 branch-misses # 14.57% of all branches
watchdog/4-32 89 branch-misses # 13.65% of all branches
kworker/3:2-12870 80 branch-misses # 7.40% of all branches
watchdog/3-26 61 branch-misses # 9.36% of all branches
kworker/4:1H-1887 60 branch-misses # 8.28% of all branches
watchdog/2-20 52 branch-misses # 7.98% of all branches
ksoftirqd/0-7 47 branch-misses # 6.65% of all branches
watchdog/1-14 46 branch-misses # 7.06% of all branches
watchdog/7-50 13 branch-misses # 1.99% of all branches
watchdog/5-38 8 branch-misses # 1.23% of all branches
watchdog/6-44 7 branch-misses # 1.07% of all branches
3.695150786 seconds time elapsed
root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C
Performance counter stats for 'system wide':
vmstat-23127 2,000,783 inst_retired.any # 1.5 IPC
thermald-2841 1,472,670 inst_retired.any # 1.3 IPC
sshd-23111 977,374 inst_retired.any # 1.2 IPC
perf-24163 483,779 inst_retired.any # 0.2 IPC
gmain-2700 341,213 inst_retired.any # 0.9 IPC
sshd-23058 148,891 inst_retired.any # 0.8 IPC
rtkit-daemon-3288 71,210 inst_retired.any # 0.7 IPC
kworker/u16:1-18249 39,562 inst_retired.any # 0.3 IPC
rcu_sched-8 14,474 inst_retired.any # 0.8 IPC
kworker/0:2-19991 7,659 inst_retired.any # 0.2 IPC
kworker/4:1-15354 6,714 inst_retired.any # 0.8 IPC
rtkit-daemon-3289 4,839 inst_retired.any # 0.3 IPC
kworker/6:0-17528 3,321 inst_retired.any # 0.6 IPC
kworker/5:2-31362 3,215 inst_retired.any # 0.5 IPC
kworker/7:2-23145 3,173 inst_retired.any # 0.7 IPC
kworker/4:1H-1887 1,719 inst_retired.any # 0.3 IPC
watchdog/0-11 1,479 inst_retired.any # 0.3 IPC
watchdog/1-14 1,479 inst_retired.any # 0.3 IPC
watchdog/2-20 1,479 inst_retired.any # 0.4 IPC
watchdog/3-26 1,479 inst_retired.any # 0.4 IPC
watchdog/4-32 1,479 inst_retired.any # 0.3 IPC
watchdog/5-38 1,479 inst_retired.any # 0.3 IPC
watchdog/6-44 1,479 inst_retired.any # 0.7 IPC
watchdog/7-50 1,479 inst_retired.any # 0.7 IPC
kworker/u16:2-23146 1,408 inst_retired.any # 0.5 IPC
perf-24163 2,249,872 cpu_clk_unhalted.thread
vmstat-23127 1,352,455 cpu_clk_unhalted.thread
thermald-2841 1,161,140 cpu_clk_unhalted.thread
sshd-23111 807,827 cpu_clk_unhalted.thread
gmain-2700 375,535 cpu_clk_unhalted.thread
sshd-23058 194,071 cpu_clk_unhalted.thread
kworker/u16:1-18249 114,306 cpu_clk_unhalted.thread
rtkit-daemon-3288 103,547 cpu_clk_unhalted.thread
kworker/0:2-19991 46,550 cpu_clk_unhalted.thread
rcu_sched-8 18,855 cpu_clk_unhalted.thread
rtkit-daemon-3289 17,549 cpu_clk_unhalted.thread
kworker/4:1-15354 8,812 cpu_clk_unhalted.thread
kworker/5:2-31362 6,812 cpu_clk_unhalted.thread
kworker/4:1H-1887 5,270 cpu_clk_unhalted.thread
kworker/6:0-17528 5,111 cpu_clk_unhalted.thread
kworker/7:2-23145 4,667 cpu_clk_unhalted.thread
watchdog/0-11 4,663 cpu_clk_unhalted.thread
watchdog/1-14 4,663 cpu_clk_unhalted.thread
watchdog/4-32 4,626 cpu_clk_unhalted.thread
watchdog/5-38 4,403 cpu_clk_unhalted.thread
watchdog/3-26 3,936 cpu_clk_unhalted.thread
watchdog/2-20 3,850 cpu_clk_unhalted.thread
kworker/u16:2-23146 2,654 cpu_clk_unhalted.thread
watchdog/6-44 2,017 cpu_clk_unhalted.thread
watchdog/7-50 2,017 cpu_clk_unhalted.thread
vmstat-23127 2,000,783 inst_retired.any # 0.7 CPI
thermald-2841 1,472,670 inst_retired.any # 0.8 CPI
sshd-23111 977,374 inst_retired.any # 0.8 CPI
perf-24163 495,037 inst_retired.any # 4.7 CPI
gmain-2700 341,213 inst_retired.any # 1.1 CPI
sshd-23058 148,891 inst_retired.any # 1.3 CPI
rtkit-daemon-3288 71,210 inst_retired.any # 1.5 CPI
kworker/u16:1-18249 39,562 inst_retired.any # 2.9 CPI
rcu_sched-8 14,474 inst_retired.any # 1.3 CPI
kworker/0:2-19991 7,659 inst_retired.any # 6.1 CPI
kworker/4:1-15354 6,714 inst_retired.any # 1.3 CPI
rtkit-daemon-3289 4,839 inst_retired.any # 3.6 CPI
kworker/6:0-17528 3,321 inst_retired.any # 1.5 CPI
kworker/5:2-31362 3,215 inst_retired.any # 2.1 CPI
kworker/7:2-23145 3,173 inst_retired.any # 1.5 CPI
kworker/4:1H-1887 1,719 inst_retired.any # 3.1 CPI
watchdog/0-11 1,479 inst_retired.any # 3.2 CPI
watchdog/1-14 1,479 inst_retired.any # 3.2 CPI
watchdog/2-20 1,479 inst_retired.any # 2.6 CPI
watchdog/3-26 1,479 inst_retired.any # 2.7 CPI
watchdog/4-32 1,479 inst_retired.any # 3.1 CPI
watchdog/5-38 1,479 inst_retired.any # 3.0 CPI
watchdog/6-44 1,479 inst_retired.any # 1.4 CPI
watchdog/7-50 1,479 inst_retired.any # 1.4 CPI
kworker/u16:2-23146 1,408 inst_retired.any # 1.9 CPI
perf-24163 2,302,323 cycles
vmstat-23127 1,352,455 cycles
thermald-2841 1,161,140 cycles
sshd-23111 807,827 cycles
gmain-2700 375,535 cycles
sshd-23058 194,071 cycles
kworker/u16:1-18249 114,306 cycles
rtkit-daemon-3288 103,547 cycles
kworker/0:2-19991 46,550 cycles
rcu_sched-8 18,855 cycles
rtkit-daemon-3289 17,549 cycles
kworker/4:1-15354 8,812 cycles
kworker/5:2-31362 6,812 cycles
kworker/4:1H-1887 5,270 cycles
kworker/6:0-17528 5,111 cycles
kworker/7:2-23145 4,667 cycles
watchdog/0-11 4,663 cycles
watchdog/1-14 4,663 cycles
watchdog/4-32 4,626 cycles
watchdog/5-38 4,403 cycles
watchdog/3-26 3,936 cycles
watchdog/2-20 3,850 cycles
kworker/u16:2-23146 2,654 cycles
watchdog/6-44 2,017 cycles
watchdog/7-50 2,017 cycles
2.175726600 seconds time elapsed
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-stat.c | 77 ++++++++++++++++++++++++++++++++++++++++-------
tools/perf/util/stat.h | 9 ++++++
2 files changed, 75 insertions(+), 11 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index ee708ba6f79a..58d501d1f5fd 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1351,13 +1351,24 @@ static void print_aggr(char *prefix)
}
}
-static void print_aggr_thread(struct perf_evsel *counter, char *prefix)
+static int cmp_val(const void *a, const void *b)
{
- FILE *output = stat_config.output;
- int nthreads = thread_map__nr(counter->threads);
- int ncpus = cpu_map__nr(counter->cpus);
- int cpu, thread;
+ return ((struct perf_aggr_thread_value *)b)->val -
+ ((struct perf_aggr_thread_value *)a)->val;
+}
+
+static struct perf_aggr_thread_value *sort_aggr_thread(
+ struct perf_evsel *counter,
+ int nthreads, int ncpus,
+ int *ret)
+{
+ int cpu, thread, i = 0;
double uval;
+ struct perf_aggr_thread_value *buf;
+
+ buf = calloc(nthreads, sizeof(struct perf_aggr_thread_value));
+ if (!buf)
+ return NULL;
for (thread = 0; thread < nthreads; thread++) {
u64 ena = 0, run = 0, val = 0;
@@ -1368,19 +1379,63 @@ static void print_aggr_thread(struct perf_evsel *counter, char *prefix)
run += perf_counts(counter->counts, cpu, thread)->run;
}
+ uval = val * counter->scale;
+
+ /*
+ * Skip value 0 when enabling --per-thread globally,
+ * otherwise too many 0 output.
+ */
+ if (uval == 0.0 && target__has_per_thread(&target))
+ continue;
+
+ buf[i].counter = counter;
+ buf[i].id = thread;
+ buf[i].uval = uval;
+ buf[i].val = val;
+ buf[i].run = run;
+ buf[i].ena = ena;
+ i++;
+ }
+
+ qsort(buf, i, sizeof(struct perf_aggr_thread_value), cmp_val);
+
+ if (ret)
+ *ret = i;
+
+ return buf;
+}
+
+static void print_aggr_thread(struct perf_evsel *counter, char *prefix)
+{
+ FILE *output = stat_config.output;
+ int nthreads = thread_map__nr(counter->threads);
+ int ncpus = cpu_map__nr(counter->cpus);
+ int thread, sorted_threads, id;
+ struct perf_aggr_thread_value *buf;
+
+ buf = sort_aggr_thread(counter, nthreads, ncpus, &sorted_threads);
+ if (!buf) {
+ perror("cannot sort aggr thread");
+ return;
+ }
+
+ for (thread = 0; thread < sorted_threads; thread++) {
if (prefix)
fprintf(output, "%s", prefix);
- uval = val * counter->scale;
-
+ id = buf[thread].id;
if (stat_config.stats)
- printout(thread, 0, counter, uval, prefix, run, ena,
- 1.0, &stat_config.stats[thread]);
+ printout(id, 0, buf[thread].counter, buf[thread].uval,
+ prefix, buf[thread].run, buf[thread].ena, 1.0,
+ &stat_config.stats[id]);
else
- printout(thread, 0, counter, uval, prefix, run, ena,
- 1.0, &rt_stat);
+ printout(id, 0, buf[thread].counter, buf[thread].uval,
+ prefix, buf[thread].run, buf[thread].ena, 1.0,
+ &rt_stat);
fputc('\n', output);
}
+
+ free(buf);
}
struct caggr_data {
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 2ed95dc72784..dbc6f7134f61 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -111,6 +111,15 @@ static inline void init_stats(struct stats *stats)
struct perf_evsel;
struct perf_evlist;
+struct perf_aggr_thread_value {
+ struct perf_evsel *counter;
+ int id;
+ double uval;
+ u64 val;
+ u64 run;
+ u64 ena;
+};
+
bool __perf_evsel_stat__is(struct perf_evsel *evsel,
enum perf_stat_evsel_id id);
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 12/35] perf utils: Move is_directory() to path.h
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (10 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 11/35] perf stat: Resort '--per-thread' result Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 13/35] perf test: Handle properly readdir DT_UNKNOWN Arnaldo Carvalho de Melo
` (23 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jiri Olsa, David Ahern,
Michael Petlan, Namhyung Kim, Peter Zijlstra,
Arnaldo Carvalho de Melo
From: Jiri Olsa <jolsa@kernel.org>
So that it can be used more widely, like in the next patch, when it will
be used to fix a bug in 'perf test' handling of dirent.d_type ==
DT_UNKNOWN.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171206174535.25380-1-jolsa@kernel.org
[ Split from a larger patch, removed needless includes in path.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-script.c | 14 +-------------
tools/perf/util/path.c | 14 ++++++++++++++
tools/perf/util/path.h | 3 +++
3 files changed, 18 insertions(+), 13 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index fac6f053e4da..77e47cf39f2c 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -26,6 +26,7 @@
#include "util/string2.h"
#include "util/thread-stack.h"
#include "util/time-utils.h"
+#include "util/path.h"
#include "print_binary.h"
#include <linux/bitmap.h>
#include <linux/kernel.h>
@@ -2401,19 +2402,6 @@ static int parse_output_fields(const struct option *opt __maybe_unused,
return rc;
}
-/* Helper function for filesystems that return a dent->d_type DT_UNKNOWN */
-static int is_directory(const char *base_path, const struct dirent *dent)
-{
- char path[PATH_MAX];
- struct stat st;
-
- sprintf(path, "%s/%s", base_path, dent->d_name);
- if (stat(path, &st))
- return 0;
-
- return S_ISDIR(st.st_mode);
-}
-
#define for_each_lang(scripts_path, scripts_dir, lang_dirent) \
while ((lang_dirent = readdir(scripts_dir)) != NULL) \
if ((lang_dirent->d_type == DT_DIR || \
diff --git a/tools/perf/util/path.c b/tools/perf/util/path.c
index 933f5c6bffb4..ca56ba2dd3da 100644
--- a/tools/perf/util/path.c
+++ b/tools/perf/util/path.c
@@ -18,6 +18,7 @@
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
+#include <dirent.h>
#include <unistd.h>
static char bad_path[] = "/bad-path/";
@@ -77,3 +78,16 @@ bool is_regular_file(const char *file)
return S_ISREG(st.st_mode);
}
+
+/* Helper function for filesystems that return a dent->d_type DT_UNKNOWN */
+bool is_directory(const char *base_path, const struct dirent *dent)
+{
+ char path[PATH_MAX];
+ struct stat st;
+
+ sprintf(path, "%s/%s", base_path, dent->d_name);
+ if (stat(path, &st))
+ return false;
+
+ return S_ISDIR(st.st_mode);
+}
diff --git a/tools/perf/util/path.h b/tools/perf/util/path.h
index 14a254ada7eb..f014f905df50 100644
--- a/tools/perf/util/path.h
+++ b/tools/perf/util/path.h
@@ -2,9 +2,12 @@
#ifndef _PERF_PATH_H
#define _PERF_PATH_H
+struct dirent;
+
int path__join(char *bf, size_t size, const char *path1, const char *path2);
int path__join3(char *bf, size_t size, const char *path1, const char *path2, const char *path3);
bool is_regular_file(const char *file);
+bool is_directory(const char *base_path, const struct dirent *dent);
#endif /* _PERF_PATH_H */
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 13/35] perf test: Handle properly readdir DT_UNKNOWN
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (11 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 12/35] perf utils: Move is_directory() to path.h Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 14/35] perf perf: Remove duplicate includes Arnaldo Carvalho de Melo
` (22 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jiri Olsa, David Ahern,
Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jiri Olsa <jolsa@kernel.org>
Some system can return DT_UNKNOWN in readdir's struct dirent::d_type and
we must handle it properly. In this case we can directly check if the
entity we found is directory and skip it.
Reported-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171206174535.25380-1-jolsa@kernel.org
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/tests/builtin-test.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 766573e236e4..fafa014240cd 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -411,9 +411,9 @@ static const char *shell_test__description(char *description, size_t size,
return description ? trim(description + 1) : NULL;
}
-#define for_each_shell_test(dir, ent) \
+#define for_each_shell_test(dir, base, ent) \
while ((ent = readdir(dir)) != NULL) \
- if (ent->d_type == DT_REG && ent->d_name[0] != '.')
+ if (!is_directory(base, ent))
static const char *shell_tests__dir(char *path, size_t size)
{
@@ -452,7 +452,7 @@ static int shell_tests__max_desc_width(void)
if (!dir)
return -1;
- for_each_shell_test(dir, ent) {
+ for_each_shell_test(dir, path, ent) {
char bf[256];
const char *desc = shell_test__description(bf, sizeof(bf), path, ent->d_name);
@@ -504,7 +504,7 @@ static int run_shell_tests(int argc, const char *argv[], int i, int width)
if (!dir)
return -1;
- for_each_shell_test(dir, ent) {
+ for_each_shell_test(dir, st.dir, ent) {
int curr = i++;
char desc[256];
struct test test = {
@@ -614,7 +614,7 @@ static int perf_test__list_shell(int argc, const char **argv, int i)
if (!dir)
return -1;
- for_each_shell_test(dir, ent) {
+ for_each_shell_test(dir, path, ent) {
int curr = i++;
char bf[256];
struct test t = {
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 14/35] perf perf: Remove duplicate includes
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (12 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 13/35] perf test: Handle properly readdir DT_UNKNOWN Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 15/35] tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h Arnaldo Carvalho de Melo
` (21 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Pravin Shedge, David S . Miller,
Greg Kroah-Hartman, Jiri Olsa, Peter Zijlstra, Thomas Gleixner,
Arnaldo Carvalho de Melo
From: Pravin Shedge <pravin.shedge4linux@gmail.com>
These duplicate includes have been found with scripts/checkincludes.pl
but they have been removed manually to avoid removing false positives.
Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1512582204-6493-1-git-send-email-pravin.shedge4linux@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/bench/futex-hash.c | 1 -
tools/perf/builtin-c2c.c | 3 ---
tools/perf/builtin-record.c | 1 -
tools/perf/builtin-stat.c | 1 -
tools/perf/tests/parse-events.c | 1 -
tools/perf/util/auxtrace.c | 3 ---
| 2 --
tools/perf/util/metricgroup.c | 2 --
tools/perf/util/scripting-engines/trace-event-python.c | 1 -
9 files changed, 15 deletions(-)
diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 2defb6df7fd0..9aa3a674829b 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -27,7 +27,6 @@
#include "cpumap.h"
#include <err.h>
-#include <sys/time.h>
static unsigned int nthreads = 0;
static unsigned int nsecs = 10;
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f1da9b0833c0..c0debc3f79b6 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -27,13 +27,10 @@
#include "sort.h"
#include "tool.h"
#include "data.h"
-#include "sort.h"
#include "event.h"
#include "evlist.h"
#include "evsel.h"
-#include <asm/bug.h>
#include "ui/browsers/hists.h"
-#include "evlist.h"
#include "thread.h"
struct c2c_hists {
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0a5749ef8b94..98da8cb8de93 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -51,7 +51,6 @@
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>
-#include <asm/bug.h>
#include <linux/time64.h>
struct switch_output {
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 58d501d1f5fd..98bf9d32f222 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -63,7 +63,6 @@
#include "util/group.h"
#include "util/session.h"
#include "util/tool.h"
-#include "util/group.h"
#include "util/string2.h"
#include "util/metricgroup.h"
#include "asm/bug.h"
diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index f0679613bd18..18b06444f230 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -13,7 +13,6 @@
#include <unistd.h>
#include <linux/kernel.h>
#include <linux/hw_breakpoint.h>
-#include <api/fs/fs.h>
#include <api/fs/tracing_path.h>
#define PERF_TP_SAMPLE_TYPE (PERF_SAMPLE_RAW | PERF_SAMPLE_TIME | \
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index a33491416400..c76687e42344 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -31,9 +31,6 @@
#include <sys/param.h>
#include <stdlib.h>
#include <stdio.h>
-#include <string.h>
-#include <limits.h>
-#include <errno.h>
#include <linux/list.h>
#include "../perf.h"
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 5890e08e0754..ca73aa7be708 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -15,9 +15,7 @@
#include <linux/bitops.h>
#include <linux/stringify.h>
#include <sys/stat.h>
-#include <sys/types.h>
#include <sys/utsname.h>
-#include <unistd.h>
#include "evlist.h"
#include "evsel.h"
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index e48410c99b39..1ddc3d1d0147 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -20,12 +20,10 @@
#include "pmu.h"
#include "expr.h"
#include "rblist.h"
-#include "pmu.h"
#include <string.h>
#include <stdbool.h>
#include <errno.h>
#include "pmu-events/pmu-events.h"
-#include "strbuf.h"
#include "strlist.h"
#include <assert.h>
#include <ctype.h>
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index c7187f067d31..c1848b543f27 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -43,7 +43,6 @@
#include "../db-export.h"
#include "../thread-stack.h"
#include "../trace-event.h"
-#include "../machine.h"
#include "../call-path.h"
#include "thread_map.h"
#include "cpumap.h"
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 15/35] tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (13 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 14/35] perf perf: Remove duplicate includes Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 16/35] perf s390: Generate system call table from asm/unistd.h Arnaldo Carvalho de Melo
` (20 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Hendrik Brueckner, Jiri Olsa,
Michael Petlan, linux-s390, Arnaldo Carvalho de Melo
From: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Will be used for generating the syscall id/string translation table.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-vjfbfvgjrnqnbdluqd7leo98@git.kernel.org
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/arch/s390/include/uapi/asm/unistd.h | 412 ++++++++++++++++++++++++++++++
| 1 +
2 files changed, 413 insertions(+)
create mode 100644 tools/arch/s390/include/uapi/asm/unistd.h
diff --git a/tools/arch/s390/include/uapi/asm/unistd.h b/tools/arch/s390/include/uapi/asm/unistd.h
new file mode 100644
index 000000000000..725120939051
--- /dev/null
+++ b/tools/arch/s390/include/uapi/asm/unistd.h
@@ -0,0 +1,412 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * S390 version
+ *
+ * Derived from "include/asm-i386/unistd.h"
+ */
+
+#ifndef _UAPI_ASM_S390_UNISTD_H_
+#define _UAPI_ASM_S390_UNISTD_H_
+
+/*
+ * This file contains the system call numbers.
+ */
+
+#define __NR_exit 1
+#define __NR_fork 2
+#define __NR_read 3
+#define __NR_write 4
+#define __NR_open 5
+#define __NR_close 6
+#define __NR_restart_syscall 7
+#define __NR_creat 8
+#define __NR_link 9
+#define __NR_unlink 10
+#define __NR_execve 11
+#define __NR_chdir 12
+#define __NR_mknod 14
+#define __NR_chmod 15
+#define __NR_lseek 19
+#define __NR_getpid 20
+#define __NR_mount 21
+#define __NR_umount 22
+#define __NR_ptrace 26
+#define __NR_alarm 27
+#define __NR_pause 29
+#define __NR_utime 30
+#define __NR_access 33
+#define __NR_nice 34
+#define __NR_sync 36
+#define __NR_kill 37
+#define __NR_rename 38
+#define __NR_mkdir 39
+#define __NR_rmdir 40
+#define __NR_dup 41
+#define __NR_pipe 42
+#define __NR_times 43
+#define __NR_brk 45
+#define __NR_signal 48
+#define __NR_acct 51
+#define __NR_umount2 52
+#define __NR_ioctl 54
+#define __NR_fcntl 55
+#define __NR_setpgid 57
+#define __NR_umask 60
+#define __NR_chroot 61
+#define __NR_ustat 62
+#define __NR_dup2 63
+#define __NR_getppid 64
+#define __NR_getpgrp 65
+#define __NR_setsid 66
+#define __NR_sigaction 67
+#define __NR_sigsuspend 72
+#define __NR_sigpending 73
+#define __NR_sethostname 74
+#define __NR_setrlimit 75
+#define __NR_getrusage 77
+#define __NR_gettimeofday 78
+#define __NR_settimeofday 79
+#define __NR_symlink 83
+#define __NR_readlink 85
+#define __NR_uselib 86
+#define __NR_swapon 87
+#define __NR_reboot 88
+#define __NR_readdir 89
+#define __NR_mmap 90
+#define __NR_munmap 91
+#define __NR_truncate 92
+#define __NR_ftruncate 93
+#define __NR_fchmod 94
+#define __NR_getpriority 96
+#define __NR_setpriority 97
+#define __NR_statfs 99
+#define __NR_fstatfs 100
+#define __NR_socketcall 102
+#define __NR_syslog 103
+#define __NR_setitimer 104
+#define __NR_getitimer 105
+#define __NR_stat 106
+#define __NR_lstat 107
+#define __NR_fstat 108
+#define __NR_lookup_dcookie 110
+#define __NR_vhangup 111
+#define __NR_idle 112
+#define __NR_wait4 114
+#define __NR_swapoff 115
+#define __NR_sysinfo 116
+#define __NR_ipc 117
+#define __NR_fsync 118
+#define __NR_sigreturn 119
+#define __NR_clone 120
+#define __NR_setdomainname 121
+#define __NR_uname 122
+#define __NR_adjtimex 124
+#define __NR_mprotect 125
+#define __NR_sigprocmask 126
+#define __NR_create_module 127
+#define __NR_init_module 128
+#define __NR_delete_module 129
+#define __NR_get_kernel_syms 130
+#define __NR_quotactl 131
+#define __NR_getpgid 132
+#define __NR_fchdir 133
+#define __NR_bdflush 134
+#define __NR_sysfs 135
+#define __NR_personality 136
+#define __NR_afs_syscall 137 /* Syscall for Andrew File System */
+#define __NR_getdents 141
+#define __NR_flock 143
+#define __NR_msync 144
+#define __NR_readv 145
+#define __NR_writev 146
+#define __NR_getsid 147
+#define __NR_fdatasync 148
+#define __NR__sysctl 149
+#define __NR_mlock 150
+#define __NR_munlock 151
+#define __NR_mlockall 152
+#define __NR_munlockall 153
+#define __NR_sched_setparam 154
+#define __NR_sched_getparam 155
+#define __NR_sched_setscheduler 156
+#define __NR_sched_getscheduler 157
+#define __NR_sched_yield 158
+#define __NR_sched_get_priority_max 159
+#define __NR_sched_get_priority_min 160
+#define __NR_sched_rr_get_interval 161
+#define __NR_nanosleep 162
+#define __NR_mremap 163
+#define __NR_query_module 167
+#define __NR_poll 168
+#define __NR_nfsservctl 169
+#define __NR_prctl 172
+#define __NR_rt_sigreturn 173
+#define __NR_rt_sigaction 174
+#define __NR_rt_sigprocmask 175
+#define __NR_rt_sigpending 176
+#define __NR_rt_sigtimedwait 177
+#define __NR_rt_sigqueueinfo 178
+#define __NR_rt_sigsuspend 179
+#define __NR_pread64 180
+#define __NR_pwrite64 181
+#define __NR_getcwd 183
+#define __NR_capget 184
+#define __NR_capset 185
+#define __NR_sigaltstack 186
+#define __NR_sendfile 187
+#define __NR_getpmsg 188
+#define __NR_putpmsg 189
+#define __NR_vfork 190
+#define __NR_pivot_root 217
+#define __NR_mincore 218
+#define __NR_madvise 219
+#define __NR_getdents64 220
+#define __NR_readahead 222
+#define __NR_setxattr 224
+#define __NR_lsetxattr 225
+#define __NR_fsetxattr 226
+#define __NR_getxattr 227
+#define __NR_lgetxattr 228
+#define __NR_fgetxattr 229
+#define __NR_listxattr 230
+#define __NR_llistxattr 231
+#define __NR_flistxattr 232
+#define __NR_removexattr 233
+#define __NR_lremovexattr 234
+#define __NR_fremovexattr 235
+#define __NR_gettid 236
+#define __NR_tkill 237
+#define __NR_futex 238
+#define __NR_sched_setaffinity 239
+#define __NR_sched_getaffinity 240
+#define __NR_tgkill 241
+/* Number 242 is reserved for tux */
+#define __NR_io_setup 243
+#define __NR_io_destroy 244
+#define __NR_io_getevents 245
+#define __NR_io_submit 246
+#define __NR_io_cancel 247
+#define __NR_exit_group 248
+#define __NR_epoll_create 249
+#define __NR_epoll_ctl 250
+#define __NR_epoll_wait 251
+#define __NR_set_tid_address 252
+#define __NR_fadvise64 253
+#define __NR_timer_create 254
+#define __NR_timer_settime 255
+#define __NR_timer_gettime 256
+#define __NR_timer_getoverrun 257
+#define __NR_timer_delete 258
+#define __NR_clock_settime 259
+#define __NR_clock_gettime 260
+#define __NR_clock_getres 261
+#define __NR_clock_nanosleep 262
+/* Number 263 is reserved for vserver */
+#define __NR_statfs64 265
+#define __NR_fstatfs64 266
+#define __NR_remap_file_pages 267
+#define __NR_mbind 268
+#define __NR_get_mempolicy 269
+#define __NR_set_mempolicy 270
+#define __NR_mq_open 271
+#define __NR_mq_unlink 272
+#define __NR_mq_timedsend 273
+#define __NR_mq_timedreceive 274
+#define __NR_mq_notify 275
+#define __NR_mq_getsetattr 276
+#define __NR_kexec_load 277
+#define __NR_add_key 278
+#define __NR_request_key 279
+#define __NR_keyctl 280
+#define __NR_waitid 281
+#define __NR_ioprio_set 282
+#define __NR_ioprio_get 283
+#define __NR_inotify_init 284
+#define __NR_inotify_add_watch 285
+#define __NR_inotify_rm_watch 286
+#define __NR_migrate_pages 287
+#define __NR_openat 288
+#define __NR_mkdirat 289
+#define __NR_mknodat 290
+#define __NR_fchownat 291
+#define __NR_futimesat 292
+#define __NR_unlinkat 294
+#define __NR_renameat 295
+#define __NR_linkat 296
+#define __NR_symlinkat 297
+#define __NR_readlinkat 298
+#define __NR_fchmodat 299
+#define __NR_faccessat 300
+#define __NR_pselect6 301
+#define __NR_ppoll 302
+#define __NR_unshare 303
+#define __NR_set_robust_list 304
+#define __NR_get_robust_list 305
+#define __NR_splice 306
+#define __NR_sync_file_range 307
+#define __NR_tee 308
+#define __NR_vmsplice 309
+#define __NR_move_pages 310
+#define __NR_getcpu 311
+#define __NR_epoll_pwait 312
+#define __NR_utimes 313
+#define __NR_fallocate 314
+#define __NR_utimensat 315
+#define __NR_signalfd 316
+#define __NR_timerfd 317
+#define __NR_eventfd 318
+#define __NR_timerfd_create 319
+#define __NR_timerfd_settime 320
+#define __NR_timerfd_gettime 321
+#define __NR_signalfd4 322
+#define __NR_eventfd2 323
+#define __NR_inotify_init1 324
+#define __NR_pipe2 325
+#define __NR_dup3 326
+#define __NR_epoll_create1 327
+#define __NR_preadv 328
+#define __NR_pwritev 329
+#define __NR_rt_tgsigqueueinfo 330
+#define __NR_perf_event_open 331
+#define __NR_fanotify_init 332
+#define __NR_fanotify_mark 333
+#define __NR_prlimit64 334
+#define __NR_name_to_handle_at 335
+#define __NR_open_by_handle_at 336
+#define __NR_clock_adjtime 337
+#define __NR_syncfs 338
+#define __NR_setns 339
+#define __NR_process_vm_readv 340
+#define __NR_process_vm_writev 341
+#define __NR_s390_runtime_instr 342
+#define __NR_kcmp 343
+#define __NR_finit_module 344
+#define __NR_sched_setattr 345
+#define __NR_sched_getattr 346
+#define __NR_renameat2 347
+#define __NR_seccomp 348
+#define __NR_getrandom 349
+#define __NR_memfd_create 350
+#define __NR_bpf 351
+#define __NR_s390_pci_mmio_write 352
+#define __NR_s390_pci_mmio_read 353
+#define __NR_execveat 354
+#define __NR_userfaultfd 355
+#define __NR_membarrier 356
+#define __NR_recvmmsg 357
+#define __NR_sendmmsg 358
+#define __NR_socket 359
+#define __NR_socketpair 360
+#define __NR_bind 361
+#define __NR_connect 362
+#define __NR_listen 363
+#define __NR_accept4 364
+#define __NR_getsockopt 365
+#define __NR_setsockopt 366
+#define __NR_getsockname 367
+#define __NR_getpeername 368
+#define __NR_sendto 369
+#define __NR_sendmsg 370
+#define __NR_recvfrom 371
+#define __NR_recvmsg 372
+#define __NR_shutdown 373
+#define __NR_mlock2 374
+#define __NR_copy_file_range 375
+#define __NR_preadv2 376
+#define __NR_pwritev2 377
+#define __NR_s390_guarded_storage 378
+#define __NR_statx 379
+#define __NR_s390_sthyi 380
+#define NR_syscalls 381
+
+/*
+ * There are some system calls that are not present on 64 bit, some
+ * have a different name although they do the same (e.g. __NR_chown32
+ * is __NR_chown on 64 bit).
+ */
+#ifndef __s390x__
+
+#define __NR_time 13
+#define __NR_lchown 16
+#define __NR_setuid 23
+#define __NR_getuid 24
+#define __NR_stime 25
+#define __NR_setgid 46
+#define __NR_getgid 47
+#define __NR_geteuid 49
+#define __NR_getegid 50
+#define __NR_setreuid 70
+#define __NR_setregid 71
+#define __NR_getrlimit 76
+#define __NR_getgroups 80
+#define __NR_setgroups 81
+#define __NR_fchown 95
+#define __NR_ioperm 101
+#define __NR_setfsuid 138
+#define __NR_setfsgid 139
+#define __NR__llseek 140
+#define __NR__newselect 142
+#define __NR_setresuid 164
+#define __NR_getresuid 165
+#define __NR_setresgid 170
+#define __NR_getresgid 171
+#define __NR_chown 182
+#define __NR_ugetrlimit 191 /* SuS compliant getrlimit */
+#define __NR_mmap2 192
+#define __NR_truncate64 193
+#define __NR_ftruncate64 194
+#define __NR_stat64 195
+#define __NR_lstat64 196
+#define __NR_fstat64 197
+#define __NR_lchown32 198
+#define __NR_getuid32 199
+#define __NR_getgid32 200
+#define __NR_geteuid32 201
+#define __NR_getegid32 202
+#define __NR_setreuid32 203
+#define __NR_setregid32 204
+#define __NR_getgroups32 205
+#define __NR_setgroups32 206
+#define __NR_fchown32 207
+#define __NR_setresuid32 208
+#define __NR_getresuid32 209
+#define __NR_setresgid32 210
+#define __NR_getresgid32 211
+#define __NR_chown32 212
+#define __NR_setuid32 213
+#define __NR_setgid32 214
+#define __NR_setfsuid32 215
+#define __NR_setfsgid32 216
+#define __NR_fcntl64 221
+#define __NR_sendfile64 223
+#define __NR_fadvise64_64 264
+#define __NR_fstatat64 293
+
+#else
+
+#define __NR_select 142
+#define __NR_getrlimit 191 /* SuS compliant getrlimit */
+#define __NR_lchown 198
+#define __NR_getuid 199
+#define __NR_getgid 200
+#define __NR_geteuid 201
+#define __NR_getegid 202
+#define __NR_setreuid 203
+#define __NR_setregid 204
+#define __NR_getgroups 205
+#define __NR_setgroups 206
+#define __NR_fchown 207
+#define __NR_setresuid 208
+#define __NR_getresuid 209
+#define __NR_setresgid 210
+#define __NR_getresgid 211
+#define __NR_chown 212
+#define __NR_setuid 213
+#define __NR_setgid 214
+#define __NR_setfsuid 215
+#define __NR_setfsgid 216
+#define __NR_newfstatat 293
+
+#endif
+
+#endif /* _UAPI_ASM_S390_UNISTD_H_ */
--git a/tools/perf/check-headers.sh b/tools/perf/check-headers.sh
index ea602cd1b43a..f81ca508700c 100755
--- a/tools/perf/check-headers.sh
+++ b/tools/perf/check-headers.sh
@@ -33,6 +33,7 @@ arch/s390/include/uapi/asm/kvm.h
arch/s390/include/uapi/asm/kvm_perf.h
arch/s390/include/uapi/asm/ptrace.h
arch/s390/include/uapi/asm/sie.h
+arch/s390/include/uapi/asm/unistd.h
arch/arm/include/uapi/asm/kvm.h
arch/arm64/include/uapi/asm/kvm.h
include/asm-generic/bitops/arch_hweight.h
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 16/35] perf s390: Generate system call table from asm/unistd.h
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (14 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 15/35] tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 17/35] perf trace: Use generated syscall table on s390 too Arnaldo Carvalho de Melo
` (19 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Hendrik Brueckner, Jiri Olsa,
Michael Petlan, linux-s390, Arnaldo Carvalho de Melo
From: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
This should speed up accessing new system calls introduced with
the kernel rather than waiting for libaudit updates to include
them.
Committer testing:
$ rm -rf /tmp/build/perf
$ mkdir /tmp/build/perf
$ make srctree=/home/acme/git/perf -C tools/perf/arch/s390 OUTPUT=/tmp/build/perf/ archheaders
make: Entering directory '/home/acme/git/perf/tools/perf/arch/s390'
/bin/sh '/home/acme/git/perf/tools/perf/arch/s390/entry/syscalls//mksyscalltbl' 'cc' /home/acme/git/perf/tools/arch/s390/include/uapi/asm/unistd.h > /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
make: Leaving directory '/home/acme/git/perf/tools/perf/arch/s390'
$ head -5 /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
static const char *syscalltbl_s390_64[] = {
[1] = "exit",
[2] = "fork",
[3] = "read",
[4] = "write",
$ tail -5 /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
[378] = "s390_guarded_storage",
[379] = "statx",
[380] = "s390_sthyi",
};
#define SYSCALLTBL_S390_64_MAX_ID 380
$
Now to plug this into 'perf trace' proper.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-h5km60rdg3rqxvsys85q50l3@git.kernel.org
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/arch/s390/Makefile | 21 ++++++++++++++
tools/perf/arch/s390/entry/syscalls/mksyscalltbl | 36 ++++++++++++++++++++++++
2 files changed, 57 insertions(+)
create mode 100755 tools/perf/arch/s390/entry/syscalls/mksyscalltbl
diff --git a/tools/perf/arch/s390/Makefile b/tools/perf/arch/s390/Makefile
index 09ba923debe8..48228de415d0 100644
--- a/tools/perf/arch/s390/Makefile
+++ b/tools/perf/arch/s390/Makefile
@@ -3,3 +3,24 @@ PERF_HAVE_DWARF_REGS := 1
endif
HAVE_KVM_STAT_SUPPORT := 1
PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
+
+#
+# Syscall table generation for perf
+#
+
+out := $(OUTPUT)arch/s390/include/generated/asm
+header := $(out)/syscalls_64.c
+sysdef := $(srctree)/tools/arch/s390/include/uapi/asm/unistd.h
+sysprf := $(srctree)/tools/perf/arch/s390/entry/syscalls/
+systbl := $(sysprf)/mksyscalltbl
+
+# Create output directory if not already present
+_dummy := $(shell [ -d '$(out)' ] || mkdir -p '$(out)')
+
+$(header): $(sysdef) $(systbl)
+ $(Q)$(SHELL) '$(systbl)' '$(CC)' $(sysdef) > $@
+
+clean::
+ $(call QUIET_CLEAN, s390) $(RM) $(header)
+
+archheaders: $(header)
diff --git a/tools/perf/arch/s390/entry/syscalls/mksyscalltbl b/tools/perf/arch/s390/entry/syscalls/mksyscalltbl
new file mode 100755
index 000000000000..7fa0d0abd419
--- /dev/null
+++ b/tools/perf/arch/s390/entry/syscalls/mksyscalltbl
@@ -0,0 +1,36 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Generate system call table for perf
+#
+#
+# Copyright IBM Corp. 2017
+# Author(s): Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
+#
+
+gcc=$1
+input=$2
+
+if ! test -r $input; then
+ echo "Could not read input file" >&2
+ exit 1
+fi
+
+create_table()
+{
+ local max_nr
+
+ echo 'static const char *syscalltbl_s390_64[] = {'
+ while read sc nr; do
+ printf '\t[%d] = "%s",\n' $nr $sc
+ max_nr=$nr
+ done
+ echo '};'
+ echo "#define SYSCALLTBL_S390_64_MAX_ID $max_nr"
+}
+
+
+$gcc -m64 -E -dM -x c $input \
+ |sed -ne 's/^#define __NR_//p' \
+ |sort -t' ' -k2 -nu \
+ |create_table
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 17/35] perf trace: Use generated syscall table on s390 too
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (15 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 16/35] perf s390: Generate system call table from asm/unistd.h Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 18/35] perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate() Arnaldo Carvalho de Melo
` (18 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Hendrik Brueckner, Jiri Olsa,
Michael Petlan, linux-s390, Arnaldo Carvalho de Melo
From: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
This should speed up accessing new system calls introduced with the
kernel rather than waiting for libaudit updates to include them.
It also enables users to specify wildcards, for example, perf trace -e
'open*', just like was already possible on x86.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-htplh3nbrivi7g3cffbh4fsu@git.kernel.org
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Makefile.config | 10 +++++++++-
tools/perf/util/syscalltbl.c | 4 ++++
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 79b117a03fd7..6f73c2316740 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -22,6 +22,7 @@ include $(srctree)/tools/scripts/Makefile.arch
$(call detected_var,SRCARCH)
NO_PERF_REGS := 1
+NO_SYSCALL_TABLE := 1
# Additional ARCH settings for ppc
ifeq ($(SRCARCH),powerpc)
@@ -33,7 +34,8 @@ endif
ifeq ($(SRCARCH),x86)
$(call detected,CONFIG_X86)
ifeq (${IS_64_BIT}, 1)
- CFLAGS += -DHAVE_ARCH_X86_64_SUPPORT -DHAVE_SYSCALL_TABLE -I$(OUTPUT)arch/x86/include/generated
+ NO_SYSCALL_TABLE := 0
+ CFLAGS += -DHAVE_ARCH_X86_64_SUPPORT -I$(OUTPUT)arch/x86/include/generated
ARCH_INCLUDE = ../../arch/x86/lib/memcpy_64.S ../../arch/x86/lib/memset_64.S
LIBUNWIND_LIBS = -lunwind-x86_64 -lunwind -llzma
$(call detected,CONFIG_X86_64)
@@ -56,12 +58,18 @@ endif
ifeq ($(ARCH),s390)
NO_PERF_REGS := 0
+ NO_SYSCALL_TABLE := 0
+ CFLAGS += -I$(OUTPUT)arch/s390/include/generated
endif
ifeq ($(NO_PERF_REGS),0)
$(call detected,CONFIG_PERF_REGS)
endif
+ifneq ($(NO_SYSCALL_TABLE),1)
+ CFLAGS += -DHAVE_SYSCALL_TABLE
+endif
+
# So far there's only x86 and arm libdw unwind support merged in perf.
# Disable it on all other architectures in case libdw unwind
# support is detected in system. Add supported architectures
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 6eea7cff3d4e..303bdb84ab5a 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -26,6 +26,10 @@
#include <asm/syscalls_64.c>
const int syscalltbl_native_max_id = SYSCALLTBL_x86_64_MAX_ID;
static const char **syscalltbl_native = syscalltbl_x86_64;
+#elif defined(__s390x__)
+#include <asm/syscalls_64.c>
+const int syscalltbl_native_max_id = SYSCALLTBL_S390_64_MAX_ID;
+static const char **syscalltbl_native = syscalltbl_s390_64;
#endif
struct syscall {
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 18/35] perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate()
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (16 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 17/35] perf trace: Use generated syscall table on s390 too Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 19/35] perf annotate: Use perf_env when obtaining the arch name Arnaldo Carvalho de Melo
` (17 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
Adrian Hunter, David Ahern, Hendrik Brueckner, Jiri Olsa,
Michael Petlan, Namhyung Kim, Thomas Richter, Wang Nan
From: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce its function signature, since we get this from 'evsel' which
is already one of its arguments.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-070eap7t6uicg9c3w086xy2z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-top.c | 2 +-
tools/perf/ui/browsers/annotate.c | 4 +---
tools/perf/ui/gtk/annotate.c | 2 +-
tools/perf/util/annotate.c | 7 ++++---
tools/perf/util/annotate.h | 2 +-
tools/perf/util/evsel.c | 6 +++---
tools/perf/util/evsel.h | 2 +-
7 files changed, 12 insertions(+), 13 deletions(-)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 540461f5e345..c6ccda52117d 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -138,7 +138,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
return err;
}
- err = symbol__annotate(sym, map, evsel, 0, NULL, NULL);
+ err = symbol__annotate(sym, map, evsel, 0, NULL);
if (err == 0) {
out_assign:
top->sym_filter_entry = he;
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 03b7363a49c9..286427975112 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1116,9 +1116,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
if (perf_evsel__is_group_event(evsel))
nr_pcnt = evsel->nr_members;
- err = symbol__annotate(sym, map, evsel,
- sizeof(struct browser_line), &browser.arch,
- perf_evsel__env_cpuid(evsel));
+ err = symbol__annotate(sym, map, evsel, sizeof(struct browser_line), &browser.arch);
if (err) {
char msg[BUFSIZ];
symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index cdb5ecf91666..aeeaf15029f0 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -169,7 +169,7 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
if (map->dso->annotate_warned)
return -1;
- err = symbol__annotate(sym, map, evsel, 0, NULL, NULL);
+ err = symbol__annotate(sym, map, evsel, 0, NULL);
if (err) {
char msg[BUFSIZ];
symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index facad1e279a8..bc34b28373f4 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1622,13 +1622,14 @@ void symbol__calc_percent(struct symbol *sym, struct perf_evsel *evsel)
int symbol__annotate(struct symbol *sym, struct map *map,
struct perf_evsel *evsel, size_t privsize,
- struct arch **parch, char *cpuid)
+ struct arch **parch)
{
struct annotate_args args = {
.privsize = privsize,
.map = map,
.evsel = evsel,
};
+ struct perf_env *env = perf_evsel__env(evsel);
const char *arch_name = NULL;
struct arch *arch;
int err;
@@ -1648,7 +1649,7 @@ int symbol__annotate(struct symbol *sym, struct map *map,
*parch = arch;
if (arch->init) {
- err = arch->init(arch, cpuid);
+ err = arch->init(arch, env ? env->cpuid : NULL);
if (err) {
pr_err("%s: failed to initialize %s arch priv area\n", __func__, arch->name);
return err;
@@ -1999,7 +2000,7 @@ int symbol__tty_annotate(struct symbol *sym, struct map *map,
struct dso *dso = map->dso;
struct rb_root source_line = RB_ROOT;
- if (symbol__annotate(sym, map, evsel, 0, NULL, NULL) < 0)
+ if (symbol__annotate(sym, map, evsel, 0, NULL) < 0)
return -1;
symbol__calc_percent(sym, evsel);
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 6d7289e88fa3..ce427445671f 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -179,7 +179,7 @@ void symbol__annotate_zero_histograms(struct symbol *sym);
int symbol__annotate(struct symbol *sym, struct map *map,
struct perf_evsel *evsel, size_t privsize,
- struct arch **parch, char *cpuid);
+ struct arch **parch);
enum symbol_disassemble_errno {
SYMBOL_ANNOTATE_ERRNO__SUCCESS = 0,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 95853c51c0ca..541897049c6c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2842,9 +2842,9 @@ char *perf_evsel__env_arch(struct perf_evsel *evsel)
return NULL;
}
-char *perf_evsel__env_cpuid(struct perf_evsel *evsel)
+struct perf_env *perf_evsel__env(struct perf_evsel *evsel)
{
- if (evsel && evsel->evlist && evsel->evlist->env)
- return evsel->evlist->env->cpuid;
+ if (evsel && evsel->evlist)
+ return evsel->evlist->env;
return NULL;
}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index c3663a70c9b9..0e961ce60a9c 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -447,6 +447,6 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
attr__fprintf_f attr__fprintf, void *priv);
char *perf_evsel__env_arch(struct perf_evsel *evsel);
-char *perf_evsel__env_cpuid(struct perf_evsel *evsel);
+struct perf_env *perf_evsel__env(struct perf_evsel *evsel);
#endif /* __PERF_EVSEL_H */
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 19/35] perf annotate: Use perf_env when obtaining the arch name
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (17 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 18/35] perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate() Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 20/35] perf env: Adopt perf_env__arch() from the annotate code Arnaldo Carvalho de Melo
` (16 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
Adrian Hunter, David Ahern, Hendrik Brueckner, Jiri Olsa,
Michael Petlan, Namhyung Kim, Thomas Richter, Wang Nan
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Paving the way to reuse these routines in other areas, like when
generating errno tables.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-rh1qv051vb8gfdcswskrn53h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/annotate.c | 17 ++++++++---------
tools/perf/util/evsel.c | 7 -------
tools/perf/util/evsel.h | 1 -
3 files changed, 8 insertions(+), 17 deletions(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index bc34b28373f4..eac45ccd5c32 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1420,16 +1420,19 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
return 0;
}
-static const char *annotate__norm_arch(const char *arch_name)
+static const char *perf_env__arch(struct perf_env *env)
{
struct utsname uts;
+ char *arch_name;
- if (!arch_name) { /* Assume we are annotating locally. */
+ if (!env) { /* Assume local operation */
if (uname(&uts) < 0)
return NULL;
arch_name = uts.machine;
- }
- return normalize_arch((char *)arch_name);
+ } else
+ arch_name = env->arch;
+
+ return normalize_arch(arch_name);
}
static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
@@ -1630,14 +1633,10 @@ int symbol__annotate(struct symbol *sym, struct map *map,
.evsel = evsel,
};
struct perf_env *env = perf_evsel__env(evsel);
- const char *arch_name = NULL;
+ const char *arch_name = perf_env__arch(env);
struct arch *arch;
int err;
- if (evsel)
- arch_name = perf_evsel__env_arch(evsel);
-
- arch_name = annotate__norm_arch(arch_name);
if (!arch_name)
return -1;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 541897049c6c..4718f0a460df 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2835,13 +2835,6 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
perf_evsel__name(evsel));
}
-char *perf_evsel__env_arch(struct perf_evsel *evsel)
-{
- if (evsel && evsel->evlist && evsel->evlist->env)
- return evsel->evlist->env->arch;
- return NULL;
-}
-
struct perf_env *perf_evsel__env(struct perf_evsel *evsel)
{
if (evsel && evsel->evlist)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 0e961ce60a9c..846e41644525 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -446,7 +446,6 @@ typedef int (*attr__fprintf_f)(FILE *, const char *, const char *, void *);
int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
attr__fprintf_f attr__fprintf, void *priv);
-char *perf_evsel__env_arch(struct perf_evsel *evsel);
struct perf_env *perf_evsel__env(struct perf_evsel *evsel);
#endif /* __PERF_EVSEL_H */
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 20/35] perf env: Adopt perf_env__arch() from the annotate code
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (18 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 19/35] perf annotate: Use perf_env when obtaining the arch name Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 21/35] perf probe: Add warning message if there is unexpected event name Arnaldo Carvalho de Melo
` (15 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
Adrian Hunter, David Ahern, Hendrik Brueckner, Jiri Olsa,
Michael Petlan, Namhyung Kim, Thomas Richter, Wang Nan
From: Arnaldo Carvalho de Melo <acme@redhat.com>
And use it in the libunwind case, with both passing a valid perf_env to
extract the arch to be normalized from and passing NULL with the same
semantic as in the annotate code: to get it from uname() uts.machine.
Now the code to generate per arch errno translation tables (int/string)
can use it to decode perf.data files recorded in a different arch than
that where 'perf trace' (or any other analysis tool) runs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-p2epffgash69w38kvj3ntpc9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/arch/common.c | 44 +++--------------------------------
tools/perf/arch/common.h | 1 -
tools/perf/util/annotate.c | 16 -------------
tools/perf/util/env.c | 47 ++++++++++++++++++++++++++++++++++++++
tools/perf/util/env.h | 2 ++
tools/perf/util/unwind-libunwind.c | 4 ++--
6 files changed, 54 insertions(+), 60 deletions(-)
diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index 8c0cfeb55f8e..c6f373508a4f 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -1,12 +1,10 @@
// SPDX-License-Identifier: GPL-2.0
#include <stdio.h>
-#include <sys/utsname.h>
#include "common.h"
+#include "../util/env.h"
#include "../util/util.h"
#include "../util/debug.h"
-#include "sane_ctype.h"
-
const char *const arm_triplets[] = {
"arm-eabi-",
"arm-linux-androideabi-",
@@ -120,55 +118,19 @@ static int lookup_triplets(const char *const *triplets, const char *name)
return -1;
}
-/*
- * Return architecture name in a normalized form.
- * The conversion logic comes from the Makefile.
- */
-const char *normalize_arch(char *arch)
-{
- if (!strcmp(arch, "x86_64"))
- return "x86";
- if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
- return "x86";
- if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
- return "sparc";
- if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
- return "arm64";
- if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
- return "arm";
- if (!strncmp(arch, "s390", 4))
- return "s390";
- if (!strncmp(arch, "parisc", 6))
- return "parisc";
- if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
- return "powerpc";
- if (!strncmp(arch, "mips", 4))
- return "mips";
- if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
- return "sh";
-
- return arch;
-}
-
static int perf_env__lookup_binutils_path(struct perf_env *env,
const char *name, const char **path)
{
int idx;
- const char *arch, *cross_env;
- struct utsname uts;
+ const char *arch = perf_env__arch(env), *cross_env;
const char *const *path_list;
char *buf = NULL;
- arch = normalize_arch(env->arch);
-
- if (uname(&uts) < 0)
- goto out;
-
/*
* We don't need to try to find objdump path for native system.
* Just use default binutils path (e.g.: "objdump").
*/
- if (!strcmp(normalize_arch(uts.machine), arch))
+ if (!strcmp(perf_env__arch(NULL), arch))
goto out;
cross_env = getenv("CROSS_COMPILE");
diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h
index a1546509ad24..2d875baa92e6 100644
--- a/tools/perf/arch/common.h
+++ b/tools/perf/arch/common.h
@@ -7,6 +7,5 @@
extern const char *objdump_path;
int perf_env__lookup_objdump(struct perf_env *env);
-const char *normalize_arch(char *arch);
#endif /* ARCH_PERF_COMMON_H */
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index eac45ccd5c32..68e687d1bf99 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -26,7 +26,6 @@
#include <pthread.h>
#include <linux/bitops.h>
#include <linux/kernel.h>
-#include <sys/utsname.h>
#include "sane_ctype.h"
@@ -1420,21 +1419,6 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
return 0;
}
-static const char *perf_env__arch(struct perf_env *env)
-{
- struct utsname uts;
- char *arch_name;
-
- if (!env) { /* Assume local operation */
- if (uname(&uts) < 0)
- return NULL;
- arch_name = uts.machine;
- } else
- arch_name = env->arch;
-
- return normalize_arch(arch_name);
-}
-
static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
{
struct map *map = args->map;
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 6276b340f893..6d311868d850 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -1,8 +1,10 @@
// SPDX-License-Identifier: GPL-2.0
#include "cpumap.h"
#include "env.h"
+#include "sane_ctype.h"
#include "util.h"
#include <errno.h>
+#include <sys/utsname.h>
struct perf_env perf_env;
@@ -93,3 +95,48 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
free(cache->map);
free(cache->size);
}
+
+/*
+ * Return architecture name in a normalized form.
+ * The conversion logic comes from the Makefile.
+ */
+static const char *normalize_arch(char *arch)
+{
+ if (!strcmp(arch, "x86_64"))
+ return "x86";
+ if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
+ return "x86";
+ if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
+ return "sparc";
+ if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
+ return "arm64";
+ if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
+ return "arm";
+ if (!strncmp(arch, "s390", 4))
+ return "s390";
+ if (!strncmp(arch, "parisc", 6))
+ return "parisc";
+ if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
+ return "powerpc";
+ if (!strncmp(arch, "mips", 4))
+ return "mips";
+ if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
+ return "sh";
+
+ return arch;
+}
+
+const char *perf_env__arch(struct perf_env *env)
+{
+ struct utsname uts;
+ char *arch_name;
+
+ if (!env) { /* Assume local operation */
+ if (uname(&uts) < 0)
+ return NULL;
+ arch_name = uts.machine;
+ } else
+ arch_name = env->arch;
+
+ return normalize_arch(arch_name);
+}
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 1eb35b190b34..bf970f57dce0 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -65,4 +65,6 @@ int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]);
int perf_env__read_cpu_topology_map(struct perf_env *env);
void cpu_cache_level__free(struct cpu_cache_level *cache);
+
+const char *perf_env__arch(struct perf_env *env);
#endif /* __PERF_ENV_H */
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 647a1e6b4c7b..b029a5e9ae49 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -3,7 +3,7 @@
#include "thread.h"
#include "session.h"
#include "debug.h"
-#include "arch/common.h"
+#include "env.h"
struct unwind_libunwind_ops __weak *local_unwind_libunwind_ops;
struct unwind_libunwind_ops __weak *x86_32_unwind_libunwind_ops;
@@ -39,7 +39,7 @@ int unwind__prepare_access(struct thread *thread, struct map *map,
if (dso_type == DSO__TYPE_UNKNOWN)
return 0;
- arch = normalize_arch(thread->mg->machine->env->arch);
+ arch = perf_env__arch(thread->mg->machine->env);
if (!strcmp(arch, "x86")) {
if (dso_type != DSO__TYPE_64BIT)
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 21/35] perf probe: Add warning message if there is unexpected event name
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (19 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 20/35] perf env: Adopt perf_env__arch() from the annotate code Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 22/35] perf probe: Cut off the version suffix from " Arnaldo Carvalho de Melo
` (14 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Masami Hiramatsu, Paul Clarke,
bhargavb, linux-rt-users, Arnaldo Carvalho de Melo
From: Masami Hiramatsu <mhiramat@kernel.org>
This improve the error message so that user can know event-name error
before writing new events to kprobe-events interface.
E.g.
======
#./perf probe -x /lib64/libc-2.25.so malloc_get_state*
Internal error: "malloc_get_state@GLIBC_2" is an invalid event name.
Error: Failed to add events.
======
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275040665.24652.5188568529237584489.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/probe-event.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index b7aaf9b2294d..262d5da86623 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2625,6 +2625,14 @@ static int get_new_event_name(char *buf, size_t len, const char *base,
out:
free(nbase);
+
+ /* Final validation */
+ if (ret >= 0 && !is_c_func_name(buf)) {
+ pr_warning("Internal error: \"%s\" is an invalid event name.\n",
+ buf);
+ ret = -EINVAL;
+ }
+
return ret;
}
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 22/35] perf probe: Cut off the version suffix from event name
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (20 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 21/35] perf probe: Add warning message if there is unexpected event name Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 23/35] perf probe: Add __return suffix for return events Arnaldo Carvalho de Melo
` (13 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Masami Hiramatsu, Paul Clarke,
linux-rt-users, Arnaldo Carvalho de Melo
From: Masami Hiramatsu <mhiramat@kernel.org>
Cut off the version suffix (e.g. @GLIBC_2.2.5 etc.) from automatic
generated event name. This fixes wildcard event adding like below case;
=====
# perf probe -x /lib64/libc-2.25.so malloc*
Internal error: "malloc_get_state@GLIBC_2" is wrong event name.
Error: Failed to add events.
=====
This failure was caused by a versioned suffix symbol.
With this fix, perf probe automatically cuts the suffix after @ as
below.
=====
# ./perf probe -x /lib64/libc-2.25.so malloc*
Added new events:
probe_libc:malloc_printerr (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc_consolidate (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc_check (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc_hook_ini (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc_trim (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc_usable_size (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc_stats (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc_info (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:mallochook (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc_get_state (on malloc* in /usr/lib64/libc-2.25.so)
probe_libc:malloc_set_state (on malloc* in /usr/lib64/libc-2.25.so)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc_set_state -aR sleep 1
=====
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: bhargavb <bhargavaramudu@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/None
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/probe-event.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 262d5da86623..7e582547ac07 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2584,8 +2584,8 @@ static int get_new_event_name(char *buf, size_t len, const char *base,
if (!nbase)
return -ENOMEM;
- /* Cut off the dot suffixes (e.g. .const, .isra)*/
- p = strchr(nbase, '.');
+ /* Cut off the dot suffixes (e.g. .const, .isra) and version suffixes */
+ p = strpbrk(nbase, ".@");
if (p && p != nbase)
*p = '\0';
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 23/35] perf probe: Add __return suffix for return events
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (21 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 22/35] perf probe: Cut off the version suffix from " Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 24/35] perf probe: Find versioned symbols from map Arnaldo Carvalho de Melo
` (12 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Masami Hiramatsu, Paul Clarke,
bhargavb, linux-rt-users, Arnaldo Carvalho de Melo
From: Masami Hiramatsu <mhiramat@kernel.org>
Add __return suffix for function return events automatically. Without
this, user have to give --force option and will see the number suffix
for each event like "function_1", which is not easy to recognize.
Instead, this adds __return suffix to it automatically. E.g.
=====
# ./perf probe -x /lib64/libc-2.25.so 'malloc*%return'
Added new events:
probe_libc:malloc_printerr__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc_consolidate__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc_check__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc_hook_ini__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc_trim__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc_usable_size__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc_stats__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc_info__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:mallochook__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc_get_state__return (on malloc*%return in /usr/lib64/libc-2.25.so)
probe_libc:malloc_set_state__return (on malloc*%return in /usr/lib64/libc-2.25.so)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc_set_state__return -aR sleep 1
=====
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275046418.24652.6696011972866498489.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Documentation/perf-probe.txt | 2 +-
tools/perf/util/probe-event.c | 9 +++++----
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt
index d7e4869905f1..f96382692f42 100644
--- a/tools/perf/Documentation/perf-probe.txt
+++ b/tools/perf/Documentation/perf-probe.txt
@@ -170,7 +170,7 @@ Probe points are defined by following syntax.
or,
sdt_PROVIDER:SDTEVENT
-'EVENT' specifies the name of new event, if omitted, it will be set the name of the probed function. You can also specify a group name by 'GROUP', if omitted, set 'probe' is used for kprobe and 'probe_<bin>' is used for uprobe.
+'EVENT' specifies the name of new event, if omitted, it will be set the name of the probed function, and for return probes, a "\_\_return" suffix is automatically added to the function name. You can also specify a group name by 'GROUP', if omitted, set 'probe' is used for kprobe and 'probe_<bin>' is used for uprobe.
Note that using existing group name can conflict with other events. Especially, using the group name reserved for kernel modules can hide embedded events in the
modules.
'FUNC' specifies a probed function name, and it may have one of the following options; '+OFFS' is the offset from function entry address in bytes, ':RLN' is the relative-line number from function entry line, and '%return' means that it probes function return. And ';PTN' means lazy matching pattern (see LAZY MATCHING). Note that ';PTN' must be the end of the probe point definition. In addition, '@SRC' specifies a source file which has that function.
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 7e582547ac07..a68141d360b0 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2573,7 +2573,8 @@ int show_perf_probe_events(struct strfilter *filter)
}
static int get_new_event_name(char *buf, size_t len, const char *base,
- struct strlist *namelist, bool allow_suffix)
+ struct strlist *namelist, bool ret_event,
+ bool allow_suffix)
{
int i, ret;
char *p, *nbase;
@@ -2590,7 +2591,7 @@ static int get_new_event_name(char *buf, size_t len, const char *base,
*p = '\0';
/* Try no suffix number */
- ret = e_snprintf(buf, len, "%s", nbase);
+ ret = e_snprintf(buf, len, "%s%s", nbase, ret_event ? "__return" : "");
if (ret < 0) {
pr_debug("snprintf() failed: %d\n", ret);
goto out;
@@ -2689,8 +2690,8 @@ static int probe_trace_event__set_name(struct probe_trace_event *tev,
group = PERFPROBE_GROUP;
/* Get an unused new event name */
- ret = get_new_event_name(buf, 64, event,
- namelist, allow_suffix);
+ ret = get_new_event_name(buf, 64, event, namelist,
+ tev->point.retprobe, allow_suffix);
if (ret < 0)
return ret;
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 24/35] perf probe: Find versioned symbols from map
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (22 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 23/35] perf probe: Add __return suffix for return events Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 25/35] perf string: Add {strdup,strpbrk}_esc() Arnaldo Carvalho de Melo
` (11 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Masami Hiramatsu, Paul Clarke,
bhargavb, linux-rt-users, Arnaldo Carvalho de Melo
From: Masami Hiramatsu <mhiramat@kernel.org>
Commit d80406453ad4 ("perf symbols: Allow user probes on versioned
symbols") allows user to find default versioned symbols (with "@@") in
map. However, it did not enable normal versioned symbol (with "@") for
perf-probe. E.g.
=====
# ./perf probe -x /lib64/libc-2.25.so malloc_get_state
Failed to find symbol malloc_get_state in /usr/lib64/libc-2.25.so
Error: Failed to add events.
=====
This solves above issue by improving perf-probe symbol search function,
as below.
=====
# ./perf probe -x /lib64/libc-2.25.so malloc_get_state
Added new event:
probe_libc:malloc_get_state (on malloc_get_state in /usr/lib64/libc-2.25.so)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc_get_state -aR sleep 1
# ./perf probe -l
probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
=====
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275049269.24652.1639103455496216255.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/arch/powerpc/util/sym-handling.c | 8 ++++++++
tools/perf/util/probe-event.c | 20 ++++++++++++++++++--
tools/perf/util/symbol.c | 5 +++++
tools/perf/util/symbol.h | 1 +
4 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/tools/perf/arch/powerpc/util/sym-handling.c b/tools/perf/arch/powerpc/util/sym-handling.c
index 9c4e23d8c8ce..53d83d7e6a09 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -64,6 +64,14 @@ int arch__compare_symbol_names_n(const char *namea, const char *nameb,
return strncmp(namea, nameb, n);
}
+
+const char *arch__normalize_symbol_name(const char *name)
+{
+ /* Skip over initial dot */
+ if (name && *name == '.')
+ name++;
+ return name;
+}
#endif
#if defined(_CALL_ELF) && _CALL_ELF == 2
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index a68141d360b0..0d6c66d51939 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2801,16 +2801,32 @@ static int find_probe_functions(struct map *map, char *name,
int found = 0;
struct symbol *sym;
struct rb_node *tmp;
+ const char *norm, *ver;
+ char *buf = NULL;
if (map__load(map) < 0)
return 0;
map__for_each_symbol(map, sym, tmp) {
- if (strglobmatch(sym->name, name)) {
+ norm = arch__normalize_symbol_name(sym->name);
+ if (!norm)
+ continue;
+
+ /* We don't care about default symbol or not */
+ ver = strchr(norm, '@');
+ if (ver) {
+ buf = strndup(norm, ver - norm);
+ if (!buf)
+ return -ENOMEM;
+ norm = buf;
+ }
+ if (strglobmatch(norm, name)) {
found++;
if (syms && found < probe_conf.max_probes)
syms[found - 1] = sym;
}
+ if (buf)
+ zfree(&buf);
}
return found;
@@ -2856,7 +2872,7 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
* same name but different addresses, this lists all the symbols.
*/
num_matched_functions = find_probe_functions(map, pp->function, syms);
- if (num_matched_functions == 0) {
+ if (num_matched_functions <= 0) {
pr_err("Failed to find symbol %s in %s\n", pp->function,
pev->target ? : "kernel");
ret = -ENOENT;
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 1b67a8639dfe..cc065d4bfafc 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -94,6 +94,11 @@ static int prefix_underscores_count(const char *str)
return tail - str;
}
+const char * __weak arch__normalize_symbol_name(const char *name)
+{
+ return name;
+}
+
int __weak arch__compare_symbol_names(const char *namea, const char *nameb)
{
return strcmp(namea, nameb);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index a4f0075b4e5c..0563f33c1eb3 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -349,6 +349,7 @@ bool elf__needs_adjust_symbols(GElf_Ehdr ehdr);
void arch__sym_update(struct symbol *s, GElf_Sym *sym);
#endif
+const char *arch__normalize_symbol_name(const char *name);
#define SYMBOL_A 0
#define SYMBOL_B 1
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 25/35] perf string: Add {strdup,strpbrk}_esc()
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (23 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 24/35] perf probe: Find versioned symbols from map Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 26/35] perf probe: Support escaped character in parser Arnaldo Carvalho de Melo
` (10 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Masami Hiramatsu, Paul Clarke,
bhargavb, linux-rt-users, Arnaldo Carvalho de Melo
From: Masami Hiramatsu <mhiramat@kernel.org>
To support the special characters escaped by '\' in 'perf probe' event parser.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275052163.24652.18205979384585484358.stgit@devbox
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/string.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/string2.h | 2 ++
2 files changed, 48 insertions(+)
diff --git a/tools/perf/util/string.c b/tools/perf/util/string.c
index aaa08ee8c717..d8bfd0c4d2cb 100644
--- a/tools/perf/util/string.c
+++ b/tools/perf/util/string.c
@@ -396,3 +396,49 @@ char *asprintf_expr_inout_ints(const char *var, bool in, size_t nints, int *ints
free(expr);
return NULL;
}
+
+/* Like strpbrk(), but not break if it is right after a backslash (escaped) */
+char *strpbrk_esc(char *str, const char *stopset)
+{
+ char *ptr;
+
+ do {
+ ptr = strpbrk(str, stopset);
+ if (ptr == str ||
+ (ptr == str + 1 && *(ptr - 1) != '\\'))
+ break;
+ str = ptr + 1;
+ } while (ptr && *(ptr - 1) == '\\' && *(ptr - 2) != '\\');
+
+ return ptr;
+}
+
+/* Like strdup, but do not copy a single backslash */
+char *strdup_esc(const char *str)
+{
+ char *s, *d, *p, *ret = strdup(str);
+
+ if (!ret)
+ return NULL;
+
+ d = strchr(ret, '\\');
+ if (!d)
+ return ret;
+
+ s = d + 1;
+ do {
+ if (*s == '\0') {
+ *d = '\0';
+ break;
+ }
+ p = strchr(s + 1, '\\');
+ if (p) {
+ memmove(d, s, p - s);
+ d += p - s;
+ s = p + 1;
+ } else
+ memmove(d, s, strlen(s) + 1);
+ } while (p);
+
+ return ret;
+}
diff --git a/tools/perf/util/string2.h b/tools/perf/util/string2.h
index ee14ca5451ab..4c68a09b97e8 100644
--- a/tools/perf/util/string2.h
+++ b/tools/perf/util/string2.h
@@ -39,5 +39,7 @@ static inline char *asprintf_expr_not_in_ints(const char *var, size_t nints, int
return asprintf_expr_inout_ints(var, false, nints, ints);
}
+char *strpbrk_esc(char *str, const char *stopset);
+char *strdup_esc(const char *str);
#endif /* PERF_STRING_H */
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 26/35] perf probe: Support escaped character in parser
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (24 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 25/35] perf string: Add {strdup,strpbrk}_esc() Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 27/35] perf evsel: Fix swap for samples with raw data Arnaldo Carvalho de Melo
` (9 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Masami Hiramatsu, Paul Clarke,
bhargavb, linux-rt-users, Arnaldo Carvalho de Melo
From: Masami Hiramatsu <mhiramat@kernel.org>
Support the special characters escaped by '\' in parser. This allows
user to specify versions directly like below.
=====
# ./perf probe -x /lib64/libc-2.25.so malloc_get_state\\@GLIBC_2.2.5
Added new event:
probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc_get_state -aR sleep 1
=====
Or, you can use separators in source filename, e.g.
=====
# ./perf probe -x /opt/test/a.out foo+bar.c:3
Semantic error :There is non-digit character in offset.
Error: Command Parse Error.
=====
Usually "+" in source file cause parser error, but
=====
# ./perf probe -x /opt/test/a.out foo\\+bar.c:4
Added new event:
probe_a:main (on @foo+bar.c:4 in /opt/test/a.out)
You can now use it in all perf tools, such as:
perf record -e probe_a:main -aR sleep 1
=====
escaped "\+" allows you to specify that.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151309111236.18107.5634753157435343410.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Documentation/perf-probe.txt | 16 +++++++++
tools/perf/util/probe-event.c | 58 ++++++++++++++++++++-------------
2 files changed, 51 insertions(+), 23 deletions(-)
diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt
index f96382692f42..b6866a05edd2 100644
--- a/tools/perf/Documentation/perf-probe.txt
+++ b/tools/perf/Documentation/perf-probe.txt
@@ -182,6 +182,14 @@ Note that before using the SDT event, the target binary (on which SDT events are
For details of the SDT, see below.
https://sourceware.org/gdb/onlinedocs/gdb/Static-Probe-Points.html
+ESCAPED CHARACTER
+-----------------
+
+In the probe syntax, '=', '@', '+', ':' and ';' are treated as a special character. You can use a backslash ('\') to escape the special characters.
+This is useful if you need to probe on a specific versioned symbols, like @GLIBC_... suffixes, or also you need to specify a source file which includes the special characters.
+Note that usually single backslash is consumed by shell, so you might need to pass double backslash (\\) or wrapping with single quotes (\'AAA\@BBB').
+See EXAMPLES how it is used.
+
PROBE ARGUMENT
--------------
Each probe argument follows below syntax.
@@ -277,6 +285,14 @@ Add a USDT probe to a target process running in a different mount namespace
./perf probe --target-ns <target pid> -x /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64/jre/lib/amd64/server/libjvm.so %sdt_hotspot:thread__sleep__end
+Add a probe on specific versioned symbol by backslash escape
+
+ ./perf probe -x /lib64/libc-2.25.so 'malloc_get_state\@GLIBC_2.2.5'
+
+Add a probe in a source file using special characters by backslash escape
+
+ ./perf probe -x /opt/test/a.out 'foo\+bar.c:4'
+
SEE ALSO
--------
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 0d6c66d51939..e1dbc9821617 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1325,27 +1325,30 @@ static int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev)
{
char *ptr;
- ptr = strchr(*arg, ':');
+ ptr = strpbrk_esc(*arg, ":");
if (ptr) {
*ptr = '\0';
if (!pev->sdt && !is_c_func_name(*arg))
goto ng_name;
- pev->group = strdup(*arg);
+ pev->group = strdup_esc(*arg);
if (!pev->group)
return -ENOMEM;
*arg = ptr + 1;
} else
pev->group = NULL;
- if (!pev->sdt && !is_c_func_name(*arg)) {
+
+ pev->event = strdup_esc(*arg);
+ if (pev->event == NULL)
+ return -ENOMEM;
+
+ if (!pev->sdt && !is_c_func_name(pev->event)) {
+ zfree(&pev->event);
ng_name:
+ zfree(&pev->group);
semantic_error("%s is bad for event name -it must "
"follow C symbol-naming rule.\n", *arg);
return -EINVAL;
}
- pev->event = strdup(*arg);
- if (pev->event == NULL)
- return -ENOMEM;
-
return 0;
}
@@ -1373,7 +1376,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
arg++;
}
- ptr = strpbrk(arg, ";=@+%");
+ ptr = strpbrk_esc(arg, ";=@+%");
if (pev->sdt) {
if (ptr) {
if (*ptr != '@') {
@@ -1387,7 +1390,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
pev->target = build_id_cache__origname(tmp);
free(tmp);
} else
- pev->target = strdup(ptr + 1);
+ pev->target = strdup_esc(ptr + 1);
if (!pev->target)
return -ENOMEM;
*ptr = '\0';
@@ -1421,13 +1424,14 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
*
* Otherwise, we consider arg to be a function specification.
*/
- if (!strpbrk(arg, "+@%") && (ptr = strpbrk(arg, ";:")) != NULL) {
+ if (!strpbrk_esc(arg, "+@%")) {
+ ptr = strpbrk_esc(arg, ";:");
/* This is a file spec if it includes a '.' before ; or : */
- if (memchr(arg, '.', ptr - arg))
+ if (ptr && memchr(arg, '.', ptr - arg))
file_spec = true;
}
- ptr = strpbrk(arg, ";:+@%");
+ ptr = strpbrk_esc(arg, ";:+@%");
if (ptr) {
nc = *ptr;
*ptr++ = '\0';
@@ -1436,7 +1440,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
if (arg[0] == '\0')
tmp = NULL;
else {
- tmp = strdup(arg);
+ tmp = strdup_esc(arg);
if (tmp == NULL)
return -ENOMEM;
}
@@ -1469,12 +1473,12 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
arg = ptr;
c = nc;
if (c == ';') { /* Lazy pattern must be the last part */
- pp->lazy_line = strdup(arg);
+ pp->lazy_line = strdup(arg); /* let leave escapes */
if (pp->lazy_line == NULL)
return -ENOMEM;
break;
}
- ptr = strpbrk(arg, ";:+@%");
+ ptr = strpbrk_esc(arg, ";:+@%");
if (ptr) {
nc = *ptr;
*ptr++ = '\0';
@@ -1501,7 +1505,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
semantic_error("SRC@SRC is not allowed.\n");
return -EINVAL;
}
- pp->file = strdup(arg);
+ pp->file = strdup_esc(arg);
if (pp->file == NULL)
return -ENOMEM;
break;
@@ -2803,23 +2807,31 @@ static int find_probe_functions(struct map *map, char *name,
struct rb_node *tmp;
const char *norm, *ver;
char *buf = NULL;
+ bool cut_version = true;
if (map__load(map) < 0)
return 0;
+ /* If user gives a version, don't cut off the version from symbols */
+ if (strchr(name, '@'))
+ cut_version = false;
+
map__for_each_symbol(map, sym, tmp) {
norm = arch__normalize_symbol_name(sym->name);
if (!norm)
continue;
- /* We don't care about default symbol or not */
- ver = strchr(norm, '@');
- if (ver) {
- buf = strndup(norm, ver - norm);
- if (!buf)
- return -ENOMEM;
- norm = buf;
+ if (cut_version) {
+ /* We don't care about default symbol or not */
+ ver = strchr(norm, '@');
+ if (ver) {
+ buf = strndup(norm, ver - norm);
+ if (!buf)
+ return -ENOMEM;
+ norm = buf;
+ }
}
+
if (strglobmatch(norm, name)) {
found++;
if (syms && found < probe_conf.max_probes)
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 27/35] perf evsel: Fix swap for samples with raw data
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (25 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 26/35] perf probe: Support escaped character in parser Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 28/35] perf test shell: Fix check open filename arg using 'perf trace' Arnaldo Carvalho de Melo
` (8 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jiri Olsa, David Ahern,
Namhyung Kim, Peter Zijlstra, Steven Rostedt,
Arnaldo Carvalho de Melo
From: Jiri Olsa <jolsa@kernel.org>
When we detect a different endianity we swap event before processing.
It's tricky for samples because we have no idea what's inside. We treat
it as an array of u64s, swap them and later on we swap back parts which
are different.
We mangle this way also the tracepoint raw data, which ends up in report
showing wrong data:
1.95% comm=Q^B pid=29285 prio=16777216 target_cpu=000
1.67% comm=l^B pid=0 prio=16777216 target_cpu=000
Luckily the traceevent library handles the endianity by itself (thank
you Steven!), so we can pass the RAW data directly in the other
endianity.
2.51% comm=beah-rhts-task pid=1175 prio=120 target_cpu=002
2.23% comm=kworker/0:0 pid=11566 prio=120 target_cpu=000
The fix is basically to swap back the raw data if different endianity is
detected.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20171129184346.3656-1-jolsa@kernel.org
[ Add util/memswap.c to python-ext-sources to link missing mem_bswap_64() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/evsel.c | 20 +++++++++++++++++---
tools/perf/util/python-ext-sources | 1 +
2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4718f0a460df..1cf044cbae36 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -36,6 +36,7 @@
#include "debug.h"
#include "trace-event.h"
#include "stat.h"
+#include "memswap.h"
#include "util/parse-branch-options.h"
#include "sane_ctype.h"
@@ -2131,14 +2132,27 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
if (type & PERF_SAMPLE_RAW) {
OVERFLOW_CHECK_u64(array);
u.val64 = *array;
- if (WARN_ONCE(swapped,
- "Endianness of raw data not corrected!\n")) {
- /* undo swap of u64, then swap on individual u32s */
+
+ /*
+ * Undo swap of u64, then swap on individual u32s,
+ * get the size of the raw area and undo all of the
+ * swap. The pevent interface handles endianity by
+ * itself.
+ */
+ if (swapped) {
u.val64 = bswap_64(u.val64);
u.val32[0] = bswap_32(u.val32[0]);
u.val32[1] = bswap_32(u.val32[1]);
}
data->raw_size = u.val32[0];
+
+ /*
+ * The raw data is aligned on 64bits including the
+ * u32 size, so it's safe to use mem_bswap_64.
+ */
+ if (swapped)
+ mem_bswap_64((void *) array, data->raw_size);
+
array = (void *)array + sizeof(u32);
OVERFLOW_CHECK(array, data->raw_size, max_size);
diff --git a/tools/perf/util/python-ext-sources b/tools/perf/util/python-ext-sources
index b4f2f06722a7..7aa0ea64544e 100644
--- a/tools/perf/util/python-ext-sources
+++ b/tools/perf/util/python-ext-sources
@@ -10,6 +10,7 @@ util/ctype.c
util/evlist.c
util/evsel.c
util/cpumap.c
+util/memswap.c
util/mmap.c
util/namespaces.c
../lib/bitmap.c
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 28/35] perf test shell: Fix check open filename arg using 'perf trace'
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (26 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 27/35] perf evsel: Fix swap for samples with raw data Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 29/35] Revert "perf s390: Always build with -fPIC" Arnaldo Carvalho de Melo
` (7 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Michael Petlan, Adrian Hunter,
David Ahern, Hendrik Brueckner, Jiri Olsa, Namhyung Kim,
Wang Nan, Arnaldo Carvalho de Melo
From: Michael Petlan <mpetlan@redhat.com>
Commit f231af789b11 ("perf test shell: Fix check open filename arg using
'perf trace' on s390x") added an exception for s390x to use openat()
instead of open() in the test that intercepts a open syscall to look for
the filename argument as obtained by the vfs_getname 'perf probe' it
puts in place at the getname_flags kernel function.
Its not just s390x that uses openat() instead of open(), so use 'perf
list' to look for the syscall:sys_enter_open(at)? present in the system
being tested instead of checking if the system is s390x.
In fact Namhyung pointed out that glibc 2.26 changed this behaviour, as
described in https://lwn.net/Articles/738694/, so systems where glibc is
>= 2.26 will need this patch for this test to work, which already took
place in some distros for architectures such as s390x, while Fedora 26
x86_64 is at glibc 2.25, i.e. still uses open().
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: https://lkml.kernel.org/r/ab23fe42-1080-a46b-503e-744e097f414f@linux.vnet.ibm.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
LPU-Reference: 1275675985.12835754.1513095723265.JavaMail.zimbra@redhat.com
Link: https://lkml.kernel.org/n/tip-j2wbz9av1rw3thr3t0g4dtuk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/tests/shell/trace+probe_vfs_getname.sh | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/tools/perf/tests/shell/trace+probe_vfs_getname.sh b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
index 2a9ef080efd0..55ad9793d544 100755
--- a/tools/perf/tests/shell/trace+probe_vfs_getname.sh
+++ b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
@@ -17,10 +17,9 @@ skip_if_no_perf_probe || exit 2
file=$(mktemp /tmp/temporary_file.XXXXX)
trace_open_vfs_getname() {
- test "$(uname -m)" = s390x && { svc="openat"; txt="dfd: +CWD, +"; }
-
- perf trace -e ${svc:-open} touch $file 2>&1 | \
- egrep " +[0-9]+\.[0-9]+ +\( +[0-9]+\.[0-9]+ ms\): +touch\/[0-9]+ ${svc:-open}\(${txt}filename: +${file}, +flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, +mode: +IRUGO\|IWUGO\) += +[0-9]+$"
+ evts=$(echo $(perf list syscalls:sys_enter_open* |& egrep 'open(at)? ' | sed -r 's/.*sys_enter_([a-z]+) +\[.*$/\1/') | sed 's/ /,/')
+ perf trace -e $evts touch $file 2>&1 | \
+ egrep " +[0-9]+\.[0-9]+ +\( +[0-9]+\.[0-9]+ ms\): +touch\/[0-9]+ open(at)?\((dfd: +CWD, +)?filename: +${file}, +flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, +mode: +IRUGO\|IWUGO\) += +[0-9]+$"
}
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 29/35] Revert "perf s390: Always build with -fPIC"
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (27 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 28/35] perf test shell: Fix check open filename arg using 'perf trace' Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 30/35] perf s390: Always build with -fPIC Arnaldo Carvalho de Melo
` (6 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo <acme@redhat.com>
This one made x86 always build with -fPIC, when the intention was for
s390 to be built that way, due to a rebase mistake.
Reported-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
This reverts commit 1dc4ddf112a408e607a073d951b962b6c6e2bd6c.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Makefile.config | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 6f73c2316740..eb6bd99be0bd 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -43,7 +43,6 @@ ifeq ($(SRCARCH),x86)
LIBUNWIND_LIBS = -lunwind-x86 -llzma -lunwind
endif
NO_PERF_REGS := 0
- CFLAGS += -fPIC
endif
ifeq ($(SRCARCH),arm)
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 30/35] perf s390: Always build with -fPIC
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (28 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 29/35] Revert "perf s390: Always build with -fPIC" Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 31/35] perf evsel: Enable ignore_missing_thread for pid option Arnaldo Carvalho de Melo
` (5 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Hendrik Brueckner,
Heiko Carstens, Martin Schwidefsky, Thomas Richter,
linux s390 list, Arnaldo Carvalho de Melo
From: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
On s390, object files must be compiled with position-indepedent code in
order to be incrementally linked or linked to shared libraries.
Therefore, add -fPIC to the CFLAGS for s390 to ensure each object file
is built properly.
Reported-by: Jonathan Hermann <jonathan.hermann@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux s390 list <linux-s390@vger.kernel.org>
Link: https://lkml.kernel.org/r/20171207080951.GC4889@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Makefile.config | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index eb6bd99be0bd..f050f38d8fa3 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -58,7 +58,7 @@ endif
ifeq ($(ARCH),s390)
NO_PERF_REGS := 0
NO_SYSCALL_TABLE := 0
- CFLAGS += -I$(OUTPUT)arch/s390/include/generated
+ CFLAGS += -fPIC -I$(OUTPUT)arch/s390/include/generated
endif
ifeq ($(NO_PERF_REGS),0)
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 31/35] perf evsel: Enable ignore_missing_thread for pid option
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (29 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 30/35] perf s390: Always build with -fPIC Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 32/35] perf probe arm64: Fix symbol fixup issues due to ELF type Arnaldo Carvalho de Melo
` (4 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Mengting Zhang, Cheng Jian,
Li Bin, Wang Nan, Arnaldo Carvalho de Melo
From: Mengting Zhang <zhangmengting@huawei.com>
While monitoring a multithread process with pid option, perf sometimes
may return sys_perf_event_open failure with 3(No such process) if any of
the process's threads die before we open the event. However, we want
perf continue monitoring the remaining threads and do not exit with
error.
Here, the patch enables perf_evsel::ignore_missing_thread for -p option
to ignore complete failure if any of threads die before we open the event.
But it may still return sys_perf_event_open failure with 22(Invalid) if we
monitors several event groups.
sys_perf_event_open: pid 28960 cpu 40 group_fd 118202 flags 0x8
sys_perf_event_open: pid 28961 cpu 40 group_fd 118203 flags 0x8
WARNING: Ignored open failure for pid 28962
sys_perf_event_open: pid 28962 cpu 40 group_fd [118203] flags 0x8
sys_perf_event_open failed, error -22
That is because when we ignore a missing thread, we change the thread_idx
without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd()
may return a wrong group_fd for the next thread and sys_perf_event_open()
return with 22.
sys_perf_event_open(){
...
if (group_fd != -1)
perf_fget_light()//to get corresponding group_leader by group_fd
...
if (group_leader)
if (group_leader->ctx->task != ctx->task)//should on the same task
goto err_context
...
}
This patch also fixes this bug by introducing perf_evsel__remove_fd() and
update_fds to allow removing fds for the missing thread.
Changes since v1:
- Change group_fd__remove() into a more genetic way without changing code logic
- Remove redundant condition
Changes since v2:
- Use a proper function name and add some comment.
- Multiline comment style fixes.
Committer testing:
Before this patch the recently added 'perf stat --per-thread' for system
wide counting would race while enumerating all threads using /proc:
[root@jouet ~]# perf stat --per-thread
failed to parse CPUs map: No such file or directory
Usage: perf stat [<options>] [<command>]
-C, --cpu <cpu> list of cpus to monitor in system-wide
-a, --all-cpus system-wide collection from all CPUs
[root@jouet ~]# perf stat --per-thread
failed to parse CPUs map: No such file or directory
Usage: perf stat [<options>] [<command>]
-C, --cpu <cpu> list of cpus to monitor in system-wide
-a, --all-cpus system-wide collection from all CPUs
[root@jouet ~]#
When, say, the kernel was being built, so lots of shortlived threads,
after this patch this doesn't happen.
Signed-off-by: Mengting Zhang <zhangmengting@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Cheng Jian <cj.chengjian@huawei.com>
Cc: Li Bin <huawei.libin@huawei.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com
[ Remove one use 'evlist' alias variable ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-record.c | 4 ++--
tools/perf/util/evsel.c | 47 +++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 47 insertions(+), 4 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 98da8cb8de93..50385d89c497 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1804,8 +1804,8 @@ int cmd_record(int argc, const char **argv)
goto out;
}
- /* Enable ignoring missing threads when -u option is defined. */
- rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX;
+ /* Enable ignoring missing threads when -u/-p option is defined. */
+ rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX || rec->opts.target.pid;
err = -ENOMEM;
if (perf_evlist__create_maps(rec->evlist, &rec->opts.target) < 0)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1cf044cbae36..a4d256ea0dc4 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1599,10 +1599,46 @@ static int __open_attr__fprintf(FILE *fp, const char *name, const char *val,
return fprintf(fp, " %-32s %s\n", name, val);
}
+static void perf_evsel__remove_fd(struct perf_evsel *pos,
+ int nr_cpus, int nr_threads,
+ int thread_idx)
+{
+ for (int cpu = 0; cpu < nr_cpus; cpu++)
+ for (int thread = thread_idx; thread < nr_threads - 1; thread++)
+ FD(pos, cpu, thread) = FD(pos, cpu, thread + 1);
+}
+
+static int update_fds(struct perf_evsel *evsel,
+ int nr_cpus, int cpu_idx,
+ int nr_threads, int thread_idx)
+{
+ struct perf_evsel *pos;
+
+ if (cpu_idx >= nr_cpus || thread_idx >= nr_threads)
+ return -EINVAL;
+
+ evlist__for_each_entry(evsel->evlist, pos) {
+ nr_cpus = pos != evsel ? nr_cpus : cpu_idx;
+
+ perf_evsel__remove_fd(pos, nr_cpus, nr_threads, thread_idx);
+
+ /*
+ * Since fds for next evsel has not been created,
+ * there is no need to iterate whole event list.
+ */
+ if (pos == evsel)
+ break;
+ }
+ return 0;
+}
+
static bool ignore_missing_thread(struct perf_evsel *evsel,
+ int nr_cpus, int cpu,
struct thread_map *threads,
int thread, int err)
{
+ pid_t ignore_pid = thread_map__pid(threads, thread);
+
if (!evsel->ignore_missing_thread)
return false;
@@ -1618,11 +1654,18 @@ static bool ignore_missing_thread(struct perf_evsel *evsel,
if (threads->nr == 1)
return false;
+ /*
+ * We should remove fd for missing_thread first
+ * because thread_map__remove() will decrease threads->nr.
+ */
+ if (update_fds(evsel, nr_cpus, cpu, threads->nr, thread))
+ return false;
+
if (thread_map__remove(threads, thread))
return false;
pr_warning("WARNING: Ignored open failure for pid %d\n",
- thread_map__pid(threads, thread));
+ ignore_pid);
return true;
}
@@ -1727,7 +1770,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
if (fd < 0) {
err = -errno;
- if (ignore_missing_thread(evsel, threads, thread, err)) {
+ if (ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
/*
* We just removed 1 thread, so take a step
* back on thread index and lower the upper
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 32/35] perf probe arm64: Fix symbol fixup issues due to ELF type
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (30 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 31/35] perf evsel: Enable ignore_missing_thread for pid option Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 33/35] perf tool: Improve bash command line auto-complete for multiple events with comma Arnaldo Carvalho de Melo
` (3 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Kim Phillips, Alexander Shishkin,
Ganapatrao Kulkarni, Jiri Olsa, Namhyung Kim, Naveen N . Rao,
Peter Zijlstra, Will Deacon, linux-arm-kernel,
Arnaldo Carvalho de Melo
From: Kim Phillips <kim.phillips@arm.com>
On an arm64 machine running a CONFIG_RANDOMIZE_BASE=y kernel, perf
kernel symbol resolution fails. Debugging saw symsrc_init calling the
default elf__needs_adjust_symbols() where checks for an ET_DYN (3)
ehdr.e_type failed when they should have succeeded.
Fix by adopting powerpc version of the weak elf__needs_adjust_symbols()
function, as done in commit d2332098331f ("perf probe ppc: Fix symbol
fixup issues due to ELF type").
Prior to this patch, perf test 1 would fail:
$ sudo oldperf test -v 1 |& head
1: vmlinux symtab matches kallsyms :
test child forked, pid 33374
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/boot/vmlinux for symbols
ERR : 0xfffe0000100f1000: do_undefinstr not on kallsyms
ERR : 0xfffe0000100f1320: do_sysinstr not on kallsyms
ERR : 0xfffe0000100f13b0: do_debug_exception not on kallsyms
ERR : 0xfffe0000100f1498: do_mem_abort not on kallsyms
ERR : 0xfffe0000100f1580: do_sp_pc_abort not on kallsyms
...
After applying this patch, perf test 1 now succeeds:
$ sudo ./newperf test -v 1 |& head
1: vmlinux symtab matches kallsyms :
test child forked, pid 33378
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/boot/vmlinux for symbols
WARN: 0xffff000008081000: diff name v: do_undefinstr k: __exception_text_start
WARN: 0xffff0000080819e8: diff name v: __irqentry_text_end k: __softirqentry_text_start
WARN: 0xffff000008081d08: diff name v: __entry_text_start k: __softirqentry_text_end
WARN: 0xffff00000809db5c: diff name v: flush_icache_range k: __flush_cache_user_range
WARN: 0xffff000008101908: diff name v: sys_ni_syscall k: sys_vm86old
...
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171214175242.e30450f17f93ad675d968fa3@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/arch/arm64/util/Build | 1 +
tools/perf/arch/arm64/util/sym-handling.c | 22 ++++++++++++++++++++++
2 files changed, 23 insertions(+)
create mode 100644 tools/perf/arch/arm64/util/sym-handling.c
diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index b1ab72d2a42e..e04f6cdd6f32 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -1,4 +1,5 @@
libperf-y += header.o
+libperf-y += sym-handling.o
libperf-$(CONFIG_DWARF) += dwarf-regs.o
libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
diff --git a/tools/perf/arch/arm64/util/sym-handling.c b/tools/perf/arch/arm64/util/sym-handling.c
new file mode 100644
index 000000000000..0051b1ee8450
--- /dev/null
+++ b/tools/perf/arch/arm64/util/sym-handling.c
@@ -0,0 +1,22 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Copyright (C) 2015 Naveen N. Rao, IBM Corporation
+ */
+
+#include "debug.h"
+#include "symbol.h"
+#include "map.h"
+#include "probe-event.h"
+#include "probe-file.h"
+
+#ifdef HAVE_LIBELF_SUPPORT
+bool elf__needs_adjust_symbols(GElf_Ehdr ehdr)
+{
+ return ehdr.e_type == ET_EXEC ||
+ ehdr.e_type == ET_REL ||
+ ehdr.e_type == ET_DYN;
+}
+#endif
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 33/35] perf tool: Improve bash command line auto-complete for multiple events with comma
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (31 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 32/35] perf probe arm64: Fix symbol fixup issues due to ELF type Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 34/35] perf tools: Return all events as auto-completions after comma Arnaldo Carvalho de Melo
` (2 subsequent siblings)
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Jiri Olsa, Kan Liang, Peter Zijlstra,
Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
perf has perf-completion.sh to define command line auto-completion in
bash/zsh.
For record/stat -e it works for single events, but isn't working when
specifying multiple events with comma.
It would be very useful if it could be fixed to make it easier by
supporting multiple events, comma separated.
With this patch, the result can be like this:
1. Support the events returned from 'perf list --raw-dump'
root@skl:/tmp# perf stat -e cpu/cache<TAB>
cpu/cache-misses/ cpu/cache-references/
root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/branch-<TAB>
cpu/branch-instructions/ cpu/branch-misses/
root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/branch-i<TAB>
root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/branch-instructions/
2. Support the events listed in /sys/bus/event_source/devices/cpu/events
root@skl:/tmp# perf stat -e cycle<TAB>
cycle_activity.cycles_l1d_miss cycle_activity.stalls_l3_miss
cycle_activity.cycles_l2_miss cycle_activity.stalls_mem_any
cycle_activity.cycles_l3_miss cycle_activity.stalls_total
cycle_activity.cycles_mem_any cycles-ct
cycle_activity.stalls_l1d_miss cycles-t
cycle_activity.stalls_l2_miss
root@skl:/tmp# perf stat -e cycles-<TAB>
cycles-ct cycles-t
root@skl:/tmp# perf stat -e cycles-t,cpu/c<TAB>
cpu/cache-misses/ cpu/cpu-cycles/ cpu/cycles-t/
cpu/cache-references/ cpu/cycles-ct/
root@skl:/tmp# perf stat -e cycles-t,cpu/cache-<TAB>
cpu/cache-misses/ cpu/cache-references/
root@skl:/tmp# perf stat -e cycles-t,cpu/cache-misses/
3. Support the uppercase event which is with prefix "cpu/"
root@skl:/tmp# perf stat -e cpu/c<TAB>
cpu/cache-misses/ cpu/cpu-cycles/ cpu/cycles-t/
cpu/cache-references/ cpu/cycles-ct/
root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/C<TAB>
cpu/CACHE-MISSES/ cpu/CPU-CYCLES/ cpu/CYCLES-T/
cpu/CACHE-REFERENCES/ cpu/CYCLES-CT/
root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/CACHE-REFERENCES/
Note that:
a) This patch only supports bash.
b) It doesn't support the cases like {},{} or {...,...}.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1513848370-8098-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/perf-completion.sh | 38 ++++++++++++++++++++++++++++++++++----
1 file changed, 34 insertions(+), 4 deletions(-)
diff --git a/tools/perf/perf-completion.sh b/tools/perf/perf-completion.sh
index 345f5d6e9ed5..d8310830a18b 100644
--- a/tools/perf/perf-completion.sh
+++ b/tools/perf/perf-completion.sh
@@ -162,8 +162,33 @@ __perf_main ()
# List possible events for -e option
elif [[ $prev == @("-e"|"--event") &&
$prev_skip_opts == @(record|stat|top) ]]; then
- evts=$($cmd list --raw-dump)
- __perfcomp_colon "$evts" "$cur"
+
+ local cur1=${COMP_WORDS[COMP_CWORD]}
+ local raw_evts=$($cmd list --raw-dump)
+ local arr s tmp result
+
+ if [[ "$cur1" == */* && ${cur1#*/} =~ ^[A-Z] ]]; then
+ OLD_IFS="$IFS"
+ IFS=" "
+ arr=($raw_evts)
+ IFS="$OLD_IFS"
+
+ for s in ${arr[@]}
+ do
+ if [[ "$s" == *cpu/* ]]; then
+ tmp=${s#*cpu/}
+ result=$result" ""cpu/"${tmp^^}
+ else
+ result=$result" "$s
+ fi
+ done
+
+ evts=${result}+$(ls /sys/bus/event_source/devices/cpu/events)
+ else
+ evts=${raw_evts}+$(ls /sys/bus/event_source/devices/cpu/events)
+ fi
+
+ __perfcomp_colon "$evts" "$cur1"
else
# List subcommands for perf commands
if [[ $prev_skip_opts == @(kvm|kmem|mem|lock|sched|
@@ -246,11 +271,16 @@ fi
type perf &>/dev/null &&
_perf()
{
+ if [[ "$COMP_WORDBREAKS" != *,* ]]; then
+ COMP_WORDBREAKS="${COMP_WORDBREAKS},"
+ export COMP_WORDBREAKS
+ fi
+
local cur words cword prev
if [ $preload_get_comp_words_by_ref = "true" ]; then
- _get_comp_words_by_ref -n =: cur words cword prev
+ _get_comp_words_by_ref -n =:, cur words cword prev
else
- __perf_get_comp_words_by_ref -n =: cur words cword prev
+ __perf_get_comp_words_by_ref -n =:, cur words cword prev
fi
__perf_main
} &&
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 34/35] perf tools: Return all events as auto-completions after comma
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (32 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 33/35] perf tool: Improve bash command line auto-complete for multiple events with comma Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 35/35] perf tools: Auto-complete for events with ':' Arnaldo Carvalho de Melo
2017-12-28 15:17 ` [GIT PULL 00/35] perf/core improvements and fixes Ingo Molnar
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Jiri Olsa, Kan Liang, Peter Zijlstra,
Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
It's a follow up for one previous patch "perf tool: Improve bash command
line auto-complete for multiple events with comma."
It fixes an issue that no events are displayed when <TAB> is directly
typed after comma.
With this patch, now the result is:
root@skl:/tmp# perf stat -e cpu-cycles,<TAB>
Display all 2389 possibilities? (y or n)
alarmtimer:alarmtimer_cancel
alarmtimer:alarmtimer_fired
alarmtimer:alarmtimer_start
alarmtimer:alarmtimer_suspend
alignment-faults
arith.divider_active
BAClear_Cost
baclears.any
block:block_bio_backmerge
block:block_bio_bounce
block:block_bio_complete
block:block_bio_frontmerge
block:block_bio_queue
block:block_bio_remap
block:block_dirty_buffer
block:block_getrq
block:block_plug
block:block_rq_complete
block:block_rq_insert
block:block_rq_issue
block:block_rq_remap
block:block_rq_requeue
block:block_sleeprq
--More--
One remaining issue is that the auto-completions doesn't work well
for the event with ':'. For example, clk:clk_enable.
Because ':' is set as WORDBREAK by default in bash. Need more work
for this case.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1513940255-16528-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/perf-completion.sh | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/tools/perf/perf-completion.sh b/tools/perf/perf-completion.sh
index d8310830a18b..90206413f4d7 100644
--- a/tools/perf/perf-completion.sh
+++ b/tools/perf/perf-completion.sh
@@ -183,12 +183,16 @@ __perf_main ()
fi
done
- evts=${result}+$(ls /sys/bus/event_source/devices/cpu/events)
+ evts=${result}" "$(ls /sys/bus/event_source/devices/cpu/events)
else
- evts=${raw_evts}+$(ls /sys/bus/event_source/devices/cpu/events)
+ evts=${raw_evts}" "$(ls /sys/bus/event_source/devices/cpu/events)
fi
- __perfcomp_colon "$evts" "$cur1"
+ if [[ "$cur1" == , ]]; then
+ __perfcomp_colon "$evts" ""
+ else
+ __perfcomp_colon "$evts" "$cur1"
+ fi
else
# List subcommands for perf commands
if [[ $prev_skip_opts == @(kvm|kmem|mem|lock|sched|
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 35/35] perf tools: Auto-complete for events with ':'
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (33 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 34/35] perf tools: Return all events as auto-completions after comma Arnaldo Carvalho de Melo
@ 2017-12-28 14:30 ` Arnaldo Carvalho de Melo
2017-12-28 15:17 ` [GIT PULL 00/35] perf/core improvements and fixes Ingo Molnar
35 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-28 14:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
Andi Kleen, Jiri Olsa, Kan Liang, Peter Zijlstra,
Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
It's a follow up patch for a previous patch "perf tool: Return all
events as auto-completions after comma".
With this patch, auto-completion can work well for events with a ':'.
For example:
root@skl:/tmp# perf stat -e block:block_<TAB>
block:block_bio_backmerge block:block_rq_complete
block:block_bio_bounce block:block_rq_insert
block:block_bio_complete block:block_rq_issue
block:block_bio_frontmerge block:block_rq_remap
block:block_bio_queue block:block_rq_requeue
block:block_bio_remap block:block_sleeprq
block:block_dirty_buffer block:block_split
block:block_getrq block:block_touch_buffer
block:block_plug block:block_unplug
root@skl:/tmp# perf stat -e block:block_rq_<TAB>
block:block_rq_complete block:block_rq_issue block:block_rq_requeue
block:block_rq_insert block:block_rq_remap
root@skl:/tmp# perf stat -e block:block_rq_complete<TAB>
block:block_rq_complete
root@skl:/tmp# perf stat -e block:block_rq_complete
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1513973758-19109-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/perf-completion.sh | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tools/perf/perf-completion.sh b/tools/perf/perf-completion.sh
index 90206413f4d7..fdf75d45efff 100644
--- a/tools/perf/perf-completion.sh
+++ b/tools/perf/perf-completion.sh
@@ -280,6 +280,11 @@ _perf()
export COMP_WORDBREAKS
fi
+ if [[ "$COMP_WORDBREAKS" == *:* ]]; then
+ COMP_WORDBREAKS="${COMP_WORDBREAKS/:/}"
+ export COMP_WORDBREAKS
+ fi
+
local cur words cword prev
if [ $preload_get_comp_words_by_ref = "true" ]; then
_get_comp_words_by_ref -n =:, cur words cword prev
--
2.13.6
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [GIT PULL 00/35] perf/core improvements and fixes
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (34 preceding siblings ...)
2017-12-28 14:30 ` [PATCH 35/35] perf tools: Auto-complete for events with ':' Arnaldo Carvalho de Melo
@ 2017-12-28 15:17 ` Ingo Molnar
35 siblings, 0 replies; 48+ messages in thread
From: Ingo Molnar @ 2017-12-28 15:17 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, linux-perf-users, Adrian Hunter,
Alexander Shishkin, Andi Kleen, bhargavb, Cheng Jian,
David Ahern, David S . Miller, Ganapatrao Kulkarni,
Greg Kroah-Hartman, Heiko Carstens, Hendrik Brueckner, Jin Yao,
Jiri Olsa, Jonathan Hermann, Kan Liang, Kim Phillips, Li Bin,
linux-arm-kernel, linux-rt-users, linux s390 list,
Martin Schwidefsky, Masami Hiramatsu, Mengting Zhang,
Michael Petlan, Namhyung Kim, Naveen N . Rao, Paul Clarke,
Peter Zijlstra, Pravin Shedge, Ravi Bangoria, Steven Rostedt,
Thomas Gleixner, Thomas Richter, Wang Nan, Will Deacon,
Arnaldo Carvalho de Melo
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
>
> Test results at the end of this message, as usual.
>
> The following changes since commit faaf95677f33dac910b6cbe917cabea43c8c1616:
>
> Merge branch 'perf/urgent' into perf/core, to pick up fixes (2017-12-18 18:13:00 +0100)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.16-20171227
>
> for you to fetch changes up to 5d4fd9c8b83b36d34521b3af361a5726899045bf:
>
> perf tools: Auto-complete for events with ':' (2017-12-27 12:16:00 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> - Allow system wide 'perf stat --per-thread', sorting the result (Jin Yao)
>
> E.g.:
>
> [root@jouet ~]# perf stat --per-thread --metrics IPC
> ^C
> Performance counter stats for 'system wide':
>
> make-22229 23,012,094,032 inst_retired.any # 0.8 IPC
> cc1-22419 692,027,497 inst_retired.any # 0.8 IPC
> gcc-22418 328,231,855 inst_retired.any # 0.9 IPC
> cc1-22509 220,853,647 inst_retired.any # 0.8 IPC
> gcc-22486 199,874,810 inst_retired.any # 1.0 IPC
> as-22466 177,896,365 inst_retired.any # 0.9 IPC
> cc1-22465 150,732,374 inst_retired.any # 0.8 IPC
> gcc-22508 112,555,593 inst_retired.any # 0.9 IPC
> cc1-22487 108,964,079 inst_retired.any # 0.7 IPC
> qemu-system-x86-2697 21,330,550 inst_retired.any # 0.3 IPC
> systemd-journal-551 20,642,951 inst_retired.any # 0.4 IPC
> docker-containe-17651 9,552,892 inst_retired.any # 0.5 IPC
> dockerd-current-9809 7,528,586 inst_retired.any # 0.5 IPC
> make-22153 12,504,194,380 inst_retired.any # 0.8 IPC
> python2-22429 12,081,290,954 inst_retired.any # 0.8 IPC
> <SNIP>
> python2-22429 15,026,328,103 cpu_clk_unhalted.thread
> cc1-22419 826,660,193 cpu_clk_unhalted.thread
> gcc-22418 365,321,295 cpu_clk_unhalted.thread
> cc1-22509 279,169,362 cpu_clk_unhalted.thread
> gcc-22486 210,156,950 cpu_clk_unhalted.thread
> <SNIP>
>
> 5.638075538 seconds time elapsed
>
> [root@jouet ~]#
>
> - Improve shell auto-completion of perf events (Jin Yao)
>
> - Fix symbol fixup issues in arm64 due to ELF type (Kim Phillips)
>
> - Ignore threads when they vanish after procfs based enumeration and
> before we try to use them with sys_perf_event_open(), i.e. just remove
> them from the thread_map and continue with the rest. This makes, among
> other cases, the previous new feature (perf stat --per-thread for system
> wide, albeit that not seeming to be the motivation for this patch) more
> robust. (Mengting Zhang)
>
> - Generate s390 syscall table from asm/unistd.h, doing like x86,
> removing the dependency on audit-libs to do this id->string translation,
> speeding up the support for newly introducted syscalls (Hendrik Brueckner)
>
> - Fix 'perf test' on filesystems where readdir() returns d_type == DT_UNKNOWN,
> such as XFS (Jiri Olsa)
>
> - Fix PERF_SAMPLE_RAW_DATA endianity handling for cross-arch tracepoint
> processing (Jiri Olsa)
>
> - Add __return suffix for return events in 'perf probe', streamlining
> entry/exit tracing (Masami Hiramatsu)
>
> - Improve support for versioned symbols in 'perf probe" (Masami Hiramatsu)
>
> - Clarify error message about invalid 'perf probe' event names (Masami Hiramatsu)
>
> - Fix check open filename arg using 'perf trace' in a 'perf test' entry for
> systems using glibc >= 2.26, such as some ARM and s390 distros (Michael Petlan)
>
> - Make method for obtaining the (normalized) architecture id for a
> perf.data file or for the running system used by the annotation routines
> generally available, next user will be for generating per arch errno
> string tables to allow for pretty printing errno codes recorded in a
> perf.data file in architecture A to be properly decoded on hardware
> archictecture B. (Arnaldo Carvalho de Melo)
>
> - Remove duplicate includes, found using scripts/checkincludes.pl (Pravin Shedge)
>
> - s390 needs -fPIC, enable it, also revert a patch that supposedly did
> that but instead enabled -fPIC for x86 (Hendrik Brueckner, Arnaldo Carvalho de Melo)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (4):
> perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate()
> perf annotate: Use perf_env when obtaining the arch name
> perf env: Adopt perf_env__arch() from the annotate code
> Revert "perf s390: Always build with -fPIC"
>
> Hendrik Brueckner (4):
> tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h
> perf s390: Generate system call table from asm/unistd.h
> perf trace: Use generated syscall table on s390 too
> perf s390: Always build with -fPIC
>
> Jin Yao (14):
> perf stat: Define a structure for per-thread shadow stats
> perf stat: Extend rbtree to support per-thread shadow stats
> perf stat: Create the runtime_stat init/exit function
> perf stat: Update per-thread shadow stats
> perf stat: Print per-thread shadow stats
> perf stat: Remove a set of shadow stats static variables
> perf stat: Allocate shadow stats buffer for threads
> perf stat: Update or print per-thread stats
> perf thread_map: Enumerate all threads from /proc
> perf stat: Remove --per-thread pid/tid limitation
> perf stat: Resort '--per-thread' result
> perf tool: Improve bash command line auto-complete for multiple events with comma
> perf tools: Return all events as auto-completions after comma
> perf tools: Auto-complete for events with ':'
>
> Jiri Olsa (3):
> perf utils: Move is_directory() to path.h
> perf test: Handle properly readdir DT_UNKNOWN
> perf evsel: Fix swap for samples with raw data
>
> Kim Phillips (1):
> perf probe arm64: Fix symbol fixup issues due to ELF type
>
> Masami Hiramatsu (6):
> perf probe: Add warning message if there is unexpected event name
> perf probe: Cut off the version suffix from event name
> perf probe: Add __return suffix for return events
> perf probe: Find versioned symbols from map
> perf string: Add {strdup,strpbrk}_esc()
> perf probe: Support escaped character in parser
>
> Mengting Zhang (1):
> perf evsel: Enable ignore_missing_thread for pid option
>
> Michael Petlan (1):
> perf test shell: Fix check open filename arg using 'perf trace'
>
> Pravin Shedge (1):
> perf perf: Remove duplicate includes
>
> tools/arch/s390/include/uapi/asm/unistd.h | 412 ++++++++++++++++++++
> tools/perf/Documentation/perf-probe.txt | 18 +-
> tools/perf/Makefile.config | 11 +-
> tools/perf/arch/arm64/util/Build | 1 +
> tools/perf/arch/arm64/util/sym-handling.c | 22 ++
> tools/perf/arch/common.c | 44 +--
> tools/perf/arch/common.h | 1 -
> tools/perf/arch/powerpc/util/sym-handling.c | 8 +
> tools/perf/arch/s390/Makefile | 21 ++
> tools/perf/arch/s390/entry/syscalls/mksyscalltbl | 36 ++
> tools/perf/bench/futex-hash.c | 1 -
> tools/perf/builtin-c2c.c | 3 -
> tools/perf/builtin-record.c | 5 +-
> tools/perf/builtin-script.c | 20 +-
> tools/perf/builtin-stat.c | 168 +++++++--
> tools/perf/builtin-top.c | 2 +-
> tools/perf/check-headers.sh | 1 +
> tools/perf/perf-completion.sh | 47 ++-
> tools/perf/tests/builtin-test.c | 10 +-
> tools/perf/tests/parse-events.c | 1 -
> tools/perf/tests/shell/trace+probe_vfs_getname.sh | 7 +-
> tools/perf/tests/thread-map.c | 2 +-
> tools/perf/ui/browsers/annotate.c | 4 +-
> tools/perf/ui/gtk/annotate.c | 2 +-
> tools/perf/util/annotate.c | 26 +-
> tools/perf/util/annotate.h | 2 +-
> tools/perf/util/auxtrace.c | 3 -
> tools/perf/util/env.c | 47 +++
> tools/perf/util/env.h | 2 +
> tools/perf/util/evlist.c | 3 +-
> tools/perf/util/evsel.c | 80 +++-
> tools/perf/util/evsel.h | 3 +-
> tools/perf/util/header.c | 2 -
> tools/perf/util/metricgroup.c | 2 -
> tools/perf/util/path.c | 14 +
> tools/perf/util/path.h | 3 +
> tools/perf/util/probe-event.c | 85 +++--
> tools/perf/util/python-ext-sources | 1 +
> .../util/scripting-engines/trace-event-python.c | 1 -
> tools/perf/util/stat-shadow.c | 416 ++++++++++++---------
> tools/perf/util/stat.c | 15 +-
> tools/perf/util/stat.h | 63 +++-
> tools/perf/util/string.c | 46 +++
> tools/perf/util/string2.h | 2 +
> tools/perf/util/symbol.c | 5 +
> tools/perf/util/symbol.h | 1 +
> tools/perf/util/syscalltbl.c | 4 +
> tools/perf/util/target.h | 7 +
> tools/perf/util/thread_map.c | 5 +-
> tools/perf/util/thread_map.h | 2 +-
> tools/perf/util/unwind-libunwind.c | 4 +-
> 51 files changed, 1328 insertions(+), 363 deletions(-)
> create mode 100644 tools/arch/s390/include/uapi/asm/unistd.h
> create mode 100644 tools/perf/arch/arm64/util/sym-handling.c
> create mode 100755 tools/perf/arch/s390/entry/syscalls/mksyscalltbl
Pulled, thanks a lot Arnaldo!
Ingo
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [GIT PULL 00/35] perf/core improvements and fixes
2019-03-07 17:43 Arnaldo Carvalho de Melo
@ 2019-03-09 16:02 ` Ingo Molnar
0 siblings, 0 replies; 48+ messages in thread
From: Ingo Molnar @ 2019-03-09 16:02 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Adrian Hunter, Alexander Shishkin, Andi Kleen,
Gustavo A . R . Silva, Jin Yao, Mathias Krause, Michael Sartain,
Nageswara R Sastry, Ravi Bangoria, Seeteena Thoufeek, Song Liu,
Tony Jones, Travis Downs, Yang Wei, Arnaldo Carvalho de Melo
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit c978b9460fe1d4a1e1effa0abd6bd69b18a098a8:
>
> Merge tag 'perf-core-for-mingo-5.1-20190225' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2019-02-28 08:29:50 +0100)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-5.1-20190307
>
> for you to fetch changes up to b8f7d86b5849ea7bb84bddc0345a3799049764d4:
>
> perf data: Force perf_data__open|close zero data->file.path (2019-03-06 18:21:00 -0300)
>
> ----------------------------------------------------------------
> perf bpf:
>
> Arnaldo Carvalho de Melo:
>
> - Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
> tools such as 'bpftool map dump' can pretty print map keys and values.
>
> perf c2c:
>
> Jiri Olsa:
>
> - Fix report for empty NUMA node.
>
> perf diff:
>
> Jin Yao:
>
> - Support --time, --cpu, --pid and --tid filter options.
>
> perf probe:
>
> Arnaldo Carvalho de Melo:
>
> - Clarify error message about not finding kernel modules debuginfo.
>
> perf record:
>
> Jiri Olsa:
>
> - Fixup probing for max attr.precise_ip.
>
> perf trace:
>
> Arnaldo Carvalho de Melo:
>
> - Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.
>
> perf annotate:
>
> Arnaldo Carvalho de Melo:
>
> - Calculate the max instruction name, align column to that, removing the
> hardcoded max 6 chars and cope with instructions with names longer than that,
> such as vpmovmskb, vpcmpeqb, etc.
>
> kernel:
>
> Song Liu:
>
> - Consider events with attr.bpf_event set as side-band.
>
> Gustavo A. R. Silva:
>
> - Mark expected switch fall-through in perf_event_parse_addr_filter().
>
> Libraries:
>
> Jiri Olsa:
>
> - Fix leaks and double frees on error paths.
>
> libtraceevent:
>
> Tony Jones:
>
> - Fix buffer overflow in arg_eval().
>
> python scripting:
>
> Tony Jones:
>
> - More python3 fixes.
>
> Trivial:
>
> Yang Wei:
>
> - Remove needless extra semicolon in clang C++ glue code.
>
> Intel PT/BTS:
>
> Adrian Hunter:
>
> - Improve auxtrace address filter error message when there is no DSO.
>
> - Fix divide by zero when TSC is not available.
>
> - Further improvements to the export to sqlite/posgresql python scripts
> and to the GUI sqlviewer, exporting 'parent_id' so that we have enable
> the creation of call trees.
>
> Andi Kleen:
>
> - Generalize function to copy from thread addr space from intel-bts code.
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Adrian Hunter (10):
> perf auxtrace: Improve address filter error message when there is no DSO
> perf intel-pt: Fix divide by zero when TSC is not available
> perf db-export: Add calls parent_id to enable creation of call trees
> perf scripts python: export-to-sqlite.py: Export calls parent_id
> perf scripts python: export-to-postgresql.py: Fix invalid input syntax for integer error
> perf scripts python: export-to-postgresql.py: Export calls parent_id
> perf scripts python: exported-sql-viewer.py: Factor out TreeWindowBase
> perf scripts python: exported-sql-viewer.py: Improve TreeModel abstraction
> perf scripts python: exported-sql-viewer.py: Factor out CallGraphModelBase
> perf scripts python: exported-sql-viewer.py: Add call tree
>
> Andi Kleen (1):
> perf thread: Generalize function to copy from thread addr space from intel-bts code
>
> Arnaldo Carvalho de Melo (4):
> perf probe: Clarify error message about not finding kernel modules debuginfo
> perf beauty msg_flags: Add missing %s lost when adding prefix suppression logic
> perf bpf: Automatically add BTF ELF markers
> perf annotate: Calculate the max instruction name, align column to that
>
> Gustavo A. R. Silva (1):
> perf: Mark expected switch fall-through
>
> Jin Yao (4):
> perf time-utils: Refactor time range parsing code
> perf diff: Support --time filter option
> perf diff: Support --cpu filter option
> perf diff: Support --pid/--tid filter options
>
> Jiri Olsa (7):
> perf c2c: Fix c2c report for empty numa node
> perf hist: Add error path into hist_entry__init
> perf hist: Fix memory leak of srcline
> perf tools: Read and store caps/max_precise in perf_pmu
> perf evsel: Probe for precise_ip with simple attr
> perf session: Fix double free in perf_data__close
> perf data: Force perf_data__open|close zero data->file.path
>
> Song Liu (1):
> perf, bpf: Consider events with attr.bpf_event as side-band events
>
> Tony Jones (6):
> tools lib traceevent: Fix buffer overflow in arg_eval
> perf script python: Remove mixed indentation
> perf script python: Add Python3 support to futex-contention.py
> perf script python: add Python3 support to check-perf-trace.py
> perf script python: Add Python3 support to event_analyzing_sample.py
> perf script python: Add Python3 support to intel-pt-events.py
>
> Yang Wei (1):
> perf clang: Remove needless extra semicolon
>
> kernel/events/core.c | 4 +-
> tools/lib/traceevent/event-parse.c | 2 +-
> tools/perf/Documentation/perf-diff.txt | 56 ++++
> tools/perf/arch/arm64/annotate/instructions.c | 2 +-
> tools/perf/arch/s390/annotate/instructions.c | 2 +-
> tools/perf/builtin-c2c.c | 8 +-
> tools/perf/builtin-diff.c | 168 +++++++++-
> tools/perf/builtin-report.c | 38 +--
> tools/perf/builtin-script.c | 39 +--
> tools/perf/include/bpf/bpf.h | 8 +-
> tools/perf/scripts/python/check-perf-trace.py | 76 ++---
> tools/perf/scripts/python/compaction-times.py | 8 +-
> .../perf/scripts/python/event_analyzing_sample.py | 48 +--
> tools/perf/scripts/python/export-to-postgresql.py | 16 +-
> tools/perf/scripts/python/export-to-sqlite.py | 12 +-
> tools/perf/scripts/python/exported-sql-viewer.py | 354 ++++++++++++++++-----
> .../perf/scripts/python/failed-syscalls-by-pid.py | 38 +--
> tools/perf/scripts/python/futex-contention.py | 10 +-
> tools/perf/scripts/python/intel-pt-events.py | 60 ++--
> tools/perf/scripts/python/mem-phys-addr.py | 7 +-
> tools/perf/scripts/python/net_dropmonitor.py | 2 +-
> tools/perf/scripts/python/netdev-times.py | 12 +-
> tools/perf/scripts/python/sched-migration.py | 6 +-
> tools/perf/scripts/python/sctop.py | 13 +-
> tools/perf/scripts/python/stackcollapse.py | 2 +-
> tools/perf/scripts/python/syscall-counts-by-pid.py | 47 ++-
> tools/perf/scripts/python/syscall-counts.py | 31 +-
> tools/perf/trace/beauty/msg_flags.c | 2 +-
> tools/perf/util/annotate.c | 74 +++--
> tools/perf/util/annotate.h | 7 +-
> tools/perf/util/auxtrace.c | 3 +-
> tools/perf/util/c++/clang.cpp | 2 +-
> tools/perf/util/data.c | 4 +-
> tools/perf/util/db-export.c | 15 +-
> tools/perf/util/db-export.h | 3 +-
> tools/perf/util/evlist.c | 25 +-
> tools/perf/util/evsel.c | 8 -
> tools/perf/util/hist.c | 51 +--
> tools/perf/util/intel-bts.c | 20 +-
> tools/perf/util/intel-pt.c | 2 +
> tools/perf/util/pmu.c | 14 +
> tools/perf/util/pmu.h | 1 +
> tools/perf/util/probe-event.c | 9 +-
> .../util/scripting-engines/trace-event-python.c | 8 +-
> tools/perf/util/session.c | 4 +-
> tools/perf/util/thread-stack.c | 16 +-
> tools/perf/util/thread-stack.h | 6 +-
> tools/perf/util/thread.c | 23 ++
> tools/perf/util/thread.h | 3 +
> tools/perf/util/time-utils.c | 51 ++-
> tools/perf/util/time-utils.h | 6 +
> 51 files changed, 978 insertions(+), 448 deletions(-)
Pulled, thanks a lot Arnaldo!
Ingo
^ permalink raw reply [flat|nested] 48+ messages in thread
* [GIT PULL 00/35] perf/core improvements and fixes
@ 2019-03-07 17:43 Arnaldo Carvalho de Melo
2019-03-09 16:02 ` Ingo Molnar
0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-07 17:43 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Andi Kleen, Gustavo A . R . Silva, Jin Yao,
Mathias Krause, Michael Sartain, Nageswara R Sastry,
Ravi Bangoria, Seeteena Thoufeek, Song Liu, Tony Jones,
Travis Downs, Yang Wei, Arnaldo Carvalho de Melo
Hi Ingo,
Please consider pulling,
- Arnaldo
Test results at the end of this message, as usual.
The following changes since commit c978b9460fe1d4a1e1effa0abd6bd69b18a098a8:
Merge tag 'perf-core-for-mingo-5.1-20190225' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2019-02-28 08:29:50 +0100)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-5.1-20190307
for you to fetch changes up to b8f7d86b5849ea7bb84bddc0345a3799049764d4:
perf data: Force perf_data__open|close zero data->file.path (2019-03-06 18:21:00 -0300)
----------------------------------------------------------------
perf bpf:
Arnaldo Carvalho de Melo:
- Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
tools such as 'bpftool map dump' can pretty print map keys and values.
perf c2c:
Jiri Olsa:
- Fix report for empty NUMA node.
perf diff:
Jin Yao:
- Support --time, --cpu, --pid and --tid filter options.
perf probe:
Arnaldo Carvalho de Melo:
- Clarify error message about not finding kernel modules debuginfo.
perf record:
Jiri Olsa:
- Fixup probing for max attr.precise_ip.
perf trace:
Arnaldo Carvalho de Melo:
- Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.
perf annotate:
Arnaldo Carvalho de Melo:
- Calculate the max instruction name, align column to that, removing the
hardcoded max 6 chars and cope with instructions with names longer than that,
such as vpmovmskb, vpcmpeqb, etc.
kernel:
Song Liu:
- Consider events with attr.bpf_event set as side-band.
Gustavo A. R. Silva:
- Mark expected switch fall-through in perf_event_parse_addr_filter().
Libraries:
Jiri Olsa:
- Fix leaks and double frees on error paths.
libtraceevent:
Tony Jones:
- Fix buffer overflow in arg_eval().
python scripting:
Tony Jones:
- More python3 fixes.
Trivial:
Yang Wei:
- Remove needless extra semicolon in clang C++ glue code.
Intel PT/BTS:
Adrian Hunter:
- Improve auxtrace address filter error message when there is no DSO.
- Fix divide by zero when TSC is not available.
- Further improvements to the export to sqlite/posgresql python scripts
and to the GUI sqlviewer, exporting 'parent_id' so that we have enable
the creation of call trees.
Andi Kleen:
- Generalize function to copy from thread addr space from intel-bts code.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Adrian Hunter (10):
perf auxtrace: Improve address filter error message when there is no DSO
perf intel-pt: Fix divide by zero when TSC is not available
perf db-export: Add calls parent_id to enable creation of call trees
perf scripts python: export-to-sqlite.py: Export calls parent_id
perf scripts python: export-to-postgresql.py: Fix invalid input syntax for integer error
perf scripts python: export-to-postgresql.py: Export calls parent_id
perf scripts python: exported-sql-viewer.py: Factor out TreeWindowBase
perf scripts python: exported-sql-viewer.py: Improve TreeModel abstraction
perf scripts python: exported-sql-viewer.py: Factor out CallGraphModelBase
perf scripts python: exported-sql-viewer.py: Add call tree
Andi Kleen (1):
perf thread: Generalize function to copy from thread addr space from intel-bts code
Arnaldo Carvalho de Melo (4):
perf probe: Clarify error message about not finding kernel modules debuginfo
perf beauty msg_flags: Add missing %s lost when adding prefix suppression logic
perf bpf: Automatically add BTF ELF markers
perf annotate: Calculate the max instruction name, align column to that
Gustavo A. R. Silva (1):
perf: Mark expected switch fall-through
Jin Yao (4):
perf time-utils: Refactor time range parsing code
perf diff: Support --time filter option
perf diff: Support --cpu filter option
perf diff: Support --pid/--tid filter options
Jiri Olsa (7):
perf c2c: Fix c2c report for empty numa node
perf hist: Add error path into hist_entry__init
perf hist: Fix memory leak of srcline
perf tools: Read and store caps/max_precise in perf_pmu
perf evsel: Probe for precise_ip with simple attr
perf session: Fix double free in perf_data__close
perf data: Force perf_data__open|close zero data->file.path
Song Liu (1):
perf, bpf: Consider events with attr.bpf_event as side-band events
Tony Jones (6):
tools lib traceevent: Fix buffer overflow in arg_eval
perf script python: Remove mixed indentation
perf script python: Add Python3 support to futex-contention.py
perf script python: add Python3 support to check-perf-trace.py
perf script python: Add Python3 support to event_analyzing_sample.py
perf script python: Add Python3 support to intel-pt-events.py
Yang Wei (1):
perf clang: Remove needless extra semicolon
kernel/events/core.c | 4 +-
tools/lib/traceevent/event-parse.c | 2 +-
tools/perf/Documentation/perf-diff.txt | 56 ++++
tools/perf/arch/arm64/annotate/instructions.c | 2 +-
tools/perf/arch/s390/annotate/instructions.c | 2 +-
tools/perf/builtin-c2c.c | 8 +-
tools/perf/builtin-diff.c | 168 +++++++++-
tools/perf/builtin-report.c | 38 +--
tools/perf/builtin-script.c | 39 +--
tools/perf/include/bpf/bpf.h | 8 +-
tools/perf/scripts/python/check-perf-trace.py | 76 ++---
tools/perf/scripts/python/compaction-times.py | 8 +-
.../perf/scripts/python/event_analyzing_sample.py | 48 +--
tools/perf/scripts/python/export-to-postgresql.py | 16 +-
tools/perf/scripts/python/export-to-sqlite.py | 12 +-
tools/perf/scripts/python/exported-sql-viewer.py | 354 ++++++++++++++++-----
.../perf/scripts/python/failed-syscalls-by-pid.py | 38 +--
tools/perf/scripts/python/futex-contention.py | 10 +-
tools/perf/scripts/python/intel-pt-events.py | 60 ++--
tools/perf/scripts/python/mem-phys-addr.py | 7 +-
tools/perf/scripts/python/net_dropmonitor.py | 2 +-
tools/perf/scripts/python/netdev-times.py | 12 +-
tools/perf/scripts/python/sched-migration.py | 6 +-
tools/perf/scripts/python/sctop.py | 13 +-
tools/perf/scripts/python/stackcollapse.py | 2 +-
tools/perf/scripts/python/syscall-counts-by-pid.py | 47 ++-
tools/perf/scripts/python/syscall-counts.py | 31 +-
tools/perf/trace/beauty/msg_flags.c | 2 +-
tools/perf/util/annotate.c | 74 +++--
tools/perf/util/annotate.h | 7 +-
tools/perf/util/auxtrace.c | 3 +-
tools/perf/util/c++/clang.cpp | 2 +-
tools/perf/util/data.c | 4 +-
tools/perf/util/db-export.c | 15 +-
tools/perf/util/db-export.h | 3 +-
tools/perf/util/evlist.c | 25 +-
tools/perf/util/evsel.c | 8 -
tools/perf/util/hist.c | 51 +--
tools/perf/util/intel-bts.c | 20 +-
tools/perf/util/intel-pt.c | 2 +
tools/perf/util/pmu.c | 14 +
tools/perf/util/pmu.h | 1 +
tools/perf/util/probe-event.c | 9 +-
.../util/scripting-engines/trace-event-python.c | 8 +-
tools/perf/util/session.c | 4 +-
tools/perf/util/thread-stack.c | 16 +-
tools/perf/util/thread-stack.h | 6 +-
tools/perf/util/thread.c | 23 ++
tools/perf/util/thread.h | 3 +
tools/perf/util/time-utils.c | 51 ++-
tools/perf/util/time-utils.h | 6 +
51 files changed, 978 insertions(+), 448 deletions(-)
Test results:
The first ones are container based builds of tools/perf with and without libelf
support. Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.
The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.
Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.
$ export PERF_TARBALL=http://192.168.124.1/perf/perf-5.0.0-rc8.tar.xz
# dm
1 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0
2 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822
3 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0
4 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0
5 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0
6 alpine:3.9 : Ok gcc (Alpine 8.2.0) 8.2.0
7 alpine:edge : Ok gcc (Alpine 8.2.0) 8.2.0
8 amazonlinux:1 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
9 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
10 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
11 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
12 centos:5 : Ok gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
13 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
14 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
15 clearlinux:latest : Ok gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502
16 debian:7 : Ok gcc (Debian 4.7.2-5) 4.7.2
17 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2
18 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
19 debian:experimental : Ok gcc (Debian 8.2.0-17) 8.2.1 20190204
20 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
21 debian:experimental-x-mips : Ok mips-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
22 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 8.2.0-16) 8.2.0
23 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
24 fedora:21 : Ok gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
25 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
26 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
27 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
28 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
29 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
30 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
31 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)
32 fedora:28 : Ok gcc (GCC) 8.2.1 20181215 (Red Hat 8.2.1-6)
33 fedora:29 : Ok gcc (GCC) 8.2.1 20181215 (Red Hat 8.2.1-6)
34 fedora:30 : Ok gcc (GCC) 9.0.1 20190203 (Red Hat 9.0.1-0.3)
35 fedora:rawhide : Ok gcc (GCC) 9.0.1 20190209 (Red Hat 9.0.1-0.4)
36 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 7.3.0-r3 p1.4) 7.3.0
37 mageia:5 : Ok gcc (GCC) 4.9.2
38 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0
39 opensuse:13.2 : Ok gcc (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
40 opensuse:15.0 : Ok gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
41 opensuse:15.1 : Ok gcc (SUSE Linux) 7.4.0
42 opensuse:42.1 : Ok gcc (SUSE Linux) 4.8.5
43 opensuse:42.2 : Ok gcc (SUSE Linux) 4.8.5
44 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5
45 opensuse:tumbleweed : Ok gcc (SUSE Linux) 8.2.1 20190103 [gcc-8-branch revision 267549]
46 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
47 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36.0.1)
48 ubuntu:12.04.5 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
49 ubuntu:14.04.4 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
50 ubuntu:14.04.4-x-linaro-arm64 : Ok aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
51 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
52 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
53 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
54 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
55 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
56 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
57 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
58 ubuntu:17.10 : Ok gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
59 ubuntu:18.04 : Ok gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
60 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
61 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
62 ubuntu:18.04-x-m68k : Ok m68k-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
63 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
64 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
65 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
66 ubuntu:18.04-x-riscv64 : Ok riscv64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
67 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
68 ubuntu:18.04-x-sh4 : Ok sh4-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
69 ubuntu:18.04-x-sparc64 : Ok sparc64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
70 ubuntu:18.10 : Ok gcc (Ubuntu 8.2.0-7ubuntu1) 8.2.0
71 ubuntu:19.04 : Ok gcc (Ubuntu 8.2.0-20ubuntu1) 8.2.0
72 ubuntu:19.04-x-alpha : Ok alpha-linux-gnu-gcc (Ubuntu 8.2.0-20ubuntu1) 8.2.0
73 ubuntu:19.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 8.2.0-20ubuntu1) 8.2.0
74 ubuntu:19.04-x-hppa : Ok hppa-linux-gnu-gcc (Ubuntu 8.2.0-20ubuntu1) 8.2.0
#
# uname -a
Linux quaco 5.0.0+ #1 SMP Thu Mar 7 10:32:55 -03 2019 x86_64 x86_64 x86_64 GNU/Linux
# git log --oneline -1
b8f7d86b5849 perf data: Force perf_data__open|close zero data->file.path
# perf version --build-options
perf version 5.0.rc8.gb8f7d8
dwarf: [ on ] # HAVE_DWARF_SUPPORT
dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT
glibc: [ on ] # HAVE_GLIBC_SUPPORT
gtk2: [ on ] # HAVE_GTK2_SUPPORT
syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT
libbfd: [ on ] # HAVE_LIBBFD_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
libperl: [ on ] # HAVE_LIBPERL_SUPPORT
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT
libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
lzma: [ on ] # HAVE_LZMA_SUPPORT
get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : Ok
9: Parse perf pmu format : Ok
10: DSO data read : Ok
11: DSO data cache : Ok
12: DSO data reopen : Ok
13: Roundtrip evsel->name : Ok
14: Parse sched tracepoints fields : Ok
15: syscalls:sys_enter_openat event fields : Ok
16: Setup struct perf_event_attr : Ok
17: Match and link multiple hists : Ok
18: 'import perf' in python : Ok
19: Breakpoint overflow signal handler : Ok
20: Breakpoint overflow sampling : Ok
21: Breakpoint accounting : Ok
22: Watchpoint :
22.1: Read Only Watchpoint : Skip
22.2: Write Only Watchpoint : Ok
22.3: Read / Write Watchpoint : Ok
22.4: Modify Watchpoint : Ok
23: Number of exit events of a simple workload : Ok
24: Software clock events period values : Ok
25: Object code reading : Ok
26: Sample parsing : Ok
27: Use a dummy software event to keep tracking : Ok
28: Parse with no sample_id_all bit set : Ok
29: Filter hist entries : Ok
30: Lookup mmap thread : Ok
31: Share thread mg : Ok
32: Sort output of hist entries : Ok
33: Cumulate child hist entries : Ok
34: Track with sched_switch : Ok
35: Filter fds with revents mask in a fdarray : Ok
36: Add fd to a fdarray, making it autogrow : Ok
37: kmod_path__parse : Ok
38: Thread map : Ok
39: LLVM search and compile :
39.1: Basic BPF llvm compile : Ok
39.2: kbuild searching : Ok
39.3: Compile source for BPF prologue generation : Ok
39.4: Compile source for BPF relocation : Ok
40: Session topology : Ok
41: BPF filter :
41.1: Basic BPF filtering : Ok
41.2: BPF pinning : Ok
41.3: BPF prologue generation : Ok
41.4: BPF relocation checker : Ok
42: Synthesize thread map : Ok
43: Remove thread map : Ok
44: Synthesize cpu map : Ok
45: Synthesize stat config : Ok
46: Synthesize stat : Ok
47: Synthesize stat round : Ok
48: Synthesize attr update : Ok
49: Event times : Ok
50: Read backward ring buffer : Ok
51: Print cpu map : Ok
52: Probe SDT events : Ok
53: is_printable_array : Ok
54: Print bitmap : Ok
55: perf hooks : Ok
56: builtin clang support : Skip (not compiled in)
57: unit_number__scnprintf : Ok
58: mem2node : Ok
59: x86 rdpmc : Ok
60: Convert perf time to TSC : Ok
61: DWARF unwind : Ok
62: x86 instruction decoder - new instructions : Ok
63: x86 bp modify : Ok
64: probe libc's inet_pton & backtrace it with ping : Ok
65: Use vfs_getname probe to get syscall args filenames : Ok
66: Add vfs_getname probe to get syscall args filenames : Ok
67: Check open filename arg using perf trace + vfs_getname: Ok
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_with_babeltrace_O: make LIBBABELTRACE=1
make_debug_O: make DEBUG=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_demangle_O: make NO_DEMANGLE=1
make_pure_O: make
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_perf_o_O: make perf.o
make_install_bin_O: make install-bin
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_newt_O: make NO_NEWT=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_gtk2_O: make NO_GTK2=1
make_doc_O: make doc
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_cscope_O: make cscope
make_static_O: make LDFLAGS=-static
make_no_libelf_O: make NO_LIBELF=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_clean_all_O: make clean all
make_install_O: make install
make_tags_O: make tags
make_install_prefix_O: make install prefix=/tmp/krava
make_util_map_o_O: make util/map.o
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_help_O: make help
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_slang_O: make NO_SLANG=1
OK
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [GIT PULL 00/35] perf/core improvements and fixes
2018-08-15 15:05 Arnaldo Carvalho de Melo
2018-08-15 15:21 ` Andy Lutomirski
@ 2018-08-18 11:17 ` Ingo Molnar
1 sibling, 0 replies; 48+ messages in thread
From: Ingo Molnar @ 2018-08-18 11:17 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Clark Williams, linux-kernel, linux-perf-users,
Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Kapshuk,
Alexander Shishkin, Andi Kleen, Andrew Morton, Andy Lutomirski,
Aneesh Kumar, Benno Evers, Dave Hansen, David Ahern,
Dongjiu Geng, H . Peter Anvin, Jiri Olsa, Joerg Roedel,
Kim Phillips, Krister Johansen, linux-trace-devel,
Michael Ellerman, Namhyung Kim, Naveen N . Rao, Peter Zijlstra,
Ravi Bangoria, Sandipan Das, stable, Steven Rostedt,
Thomas Gleixner, Tzvetomir Stoyanov, Wang Nan, x86,
Yordan Karadzhov
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> Hi Ingo,
>
> Please consider pulling, this is on top of
> perf-core-for-mingo-4.19-20180809, that is not yet in tip.
>
> Thanks,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit 6a9405b56c274024564f9014bba97b92c91b34d6:
>
> perf map: Optimize maps__fixup_overlappings() (2018-08-08 15:56:00 -0300)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.19-20180815
>
> for you to fetch changes up to 6855dc41b24619c3d1de3dbd27dd0546b0e45272:
>
> x86: Add entry trampolines to kcore (2018-08-14 19:13:26 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements ad fixes:
>
> kernel:
>
> . kallsyms, x86: Export addresses of PTI entry trampolines (Alexander Shishkin)
>
> . kallsyms: Simplify update_iter_mod() (Adrian Hunter)
>
> . x86: Add entry trampolines to kcore (Adrian Hunter)
>
> Hardware tracing:
>
> . Fix auxtrace queue resize (Adrian Hunter)
>
> Arch specific:
>
> . Fix uninitialized ARM SPE record error variable (Kim Phillips)
>
> . Fix trace event post-processing in powerpc (Sandipan Das)
>
> Build:
>
> . Fix check-headers.sh AND list path of execution (Alexander Kapshuk)
>
> . Remove -mcet and -fcf-protection when building the python binding
> with older clang versions (Arnaldo Carvalho de Melo)
>
> . Make check-headers.sh check based on kernel dir (Jiri Olsa)
>
> . Move syscall_64.tbl check into check-headers.sh (Jiri Olsa)
>
> Infrastructure:
>
> . Check for null when copying nsinfo. (Benno Evers)
>
> Libraries:
>
> . Rename libtraceevent prefixes, prep work for making it a shared
> library generaly available (Tzvetomir Stoyanov (VMware))
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Adrian Hunter (3):
> perf auxtrace: Fix queue resize
> kallsyms: Simplify update_iter_mod()
> x86: Add entry trampolines to kcore
>
> Alexander Kapshuk (1):
> perf tools: Fix check-headers.sh AND list path of execution
>
> Alexander Shishkin (1):
> kallsyms, x86: Export addresses of PTI entry trampolines
>
> Arnaldo Carvalho de Melo (1):
> perf python: Remove -mcet and -fcf-protection when building with clang
>
> Benno Evers (1):
> perf tools: Check for null when copying nsinfo.
>
> Jiri Olsa (2):
> perf tools: Make check-headers.sh check based on kernel dir
> perf tools: Move syscall_64.tbl check into check-headers.sh
>
> Kim Phillips (1):
> perf arm spe: Fix uninitialized record error variable
>
> Sandipan Das (1):
> perf probe powerpc: Fix trace event post-processing
>
> Tzvetomir Stoyanov (VMware) (24):
> tools lib traceevent, perf tools: Rename struct pevent to struct tep_handle
> tools lib traceevent, perf tools: Rename 'struct pevent_record' to 'struct tep_record'
> tools lib traceevent, perf tools: Rename pevent plugin related APIs
> tools lib traceevent, perf tools: Rename pevent alloc / free APIs
> tools lib traceevent, perf tools: Rename pevent find APIs
> tools lib traceevent, perf tools: Rename pevent parse APIs
> tools lib traceevent, perf tools: Rename pevent print APIs
> tools lib traceevent, perf tools: Rename pevent_read_number_* APIs
> tools lib traceevent, perf tools: Rename pevent_register_* APIs
> tools lib traceevent, perf tools: Rename pevent_set_* APIs
> tools lib traceevent, perf tools: Rename traceevent_* APIs
> tools lib traceevent, perf tools: Rename 'enum pevent_flag' to 'enum tep_flag'
> tools lib traceevent, tools lib lockdep: Rename 'enum pevent_errno' to 'enum tep_errno'
> tools lib traceevent: Rename pevent_function* APIs
> tools lib traceevent, perf tools: Rename traceevent_plugin_* APIs
> tools lib traceevent: Rename pevent_filter* APIs
> tools lib traceevent: Rename pevent_register / unregister APIs
> tools lib traceevent: Rename pevent_data_ APIs
> tools lib traceevent: Rename pevent field APIs
> tools lib traceevent: Rename pevent_find_* APIs
> tools lib traceevent: Rename various pevent get/set/is APIs
> tools lib traceevent: Rename internal parser related APIs
> tools lib traceevent: Rename various pevent APIs
> tools lib traceevent: Rename static variables and functions in event-parse.c
>
> arch/x86/mm/cpu_entry_area.c | 33 +
> fs/proc/kcore.c | 7 +-
> include/linux/kcore.h | 13 +
> kernel/kallsyms.c | 51 +-
> tools/lib/lockdep/Makefile | 4 +-
> tools/lib/traceevent/Makefile | 4 +-
> tools/lib/traceevent/event-parse.c | 696 ++++++++++-----------
> tools/lib/traceevent/event-parse.h | 458 +++++++-------
> tools/lib/traceevent/event-plugin.c | 70 +--
> tools/lib/traceevent/parse-filter.c | 288 ++++-----
> tools/lib/traceevent/plugin_cfg80211.c | 20 +-
> tools/lib/traceevent/plugin_function.c | 34 +-
> tools/lib/traceevent/plugin_hrtimer.c | 56 +-
> tools/lib/traceevent/plugin_jbd2.c | 36 +-
> tools/lib/traceevent/plugin_kmem.c | 66 +-
> tools/lib/traceevent/plugin_kvm.c | 154 ++---
> tools/lib/traceevent/plugin_mac80211.c | 28 +-
> tools/lib/traceevent/plugin_sched_switch.c | 60 +-
> tools/lib/traceevent/plugin_scsi.c | 24 +-
> tools/lib/traceevent/plugin_xen.c | 20 +-
> tools/perf/arch/arm64/util/arm-spe.c | 1 +
> tools/perf/arch/powerpc/util/sym-handling.c | 4 +-
> tools/perf/arch/x86/Makefile | 3 -
> tools/perf/builtin-kmem.c | 6 +-
> tools/perf/builtin-report.c | 6 +-
> tools/perf/builtin-script.c | 6 +-
> tools/perf/check-headers.sh | 17 +-
> tools/perf/util/auxtrace.c | 3 +
> tools/perf/util/data-convert-bt.c | 6 +-
> tools/perf/util/evsel.c | 2 +-
> tools/perf/util/header.c | 6 +-
> tools/perf/util/machine.h | 2 +-
> tools/perf/util/namespaces.c | 3 +
> tools/perf/util/python.c | 10 +-
> .../perf/util/scripting-engines/trace-event-perl.c | 2 +-
> .../util/scripting-engines/trace-event-python.c | 6 +-
> tools/perf/util/setup.py | 10 +-
> tools/perf/util/sort.c | 16 +-
> tools/perf/util/sort.h | 2 +-
> tools/perf/util/trace-event-parse.c | 34 +-
> tools/perf/util/trace-event-read.c | 44 +-
> tools/perf/util/trace-event-scripting.c | 4 +-
> tools/perf/util/trace-event.c | 28 +-
> tools/perf/util/trace-event.h | 20 +-
> 44 files changed, 1230 insertions(+), 1133 deletions(-)
Pulled into perf/urgent, thanks a lot Arnaldo!
Ingo
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [GIT PULL 00/35] perf/core improvements and fixes
2018-08-15 15:05 Arnaldo Carvalho de Melo
@ 2018-08-15 15:21 ` Andy Lutomirski
2018-08-18 11:17 ` Ingo Molnar
1 sibling, 0 replies; 48+ messages in thread
From: Andy Lutomirski @ 2018-08-15 15:21 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Ingo Molnar, Clark Williams, linux-kernel, linux-perf-users,
Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Kapshuk,
Alexander Shishkin, Andi Kleen, Andrew Morton, Andy Lutomirski,
Aneesh Kumar, Benno Evers, Dave Hansen, David Ahern,
Dongjiu Geng, H . Peter Anvin, Jiri Olsa, Joerg Roedel,
Kim Phillips, Krister Johansen, linux-trace-devel,
Michael Ellerman, Namhyung Kim, Naveen N . Rao, Peter Zijlstra,
Ravi Bangoria, Sandipan Das, stable, Steven Rostedt,
Thomas Gleixner, Tzvetomir Stoyanov, Wang Nan, x86,
Yordan Karadzhov
> On Aug 15, 2018, at 8:05 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> Hi Ingo,
>
> Please consider pulling, this is on top of
> perf-core-for-mingo-4.19-20180809, that is not yet in tip.
>
> Thanks,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit 6a9405b56c274024564f9014bba97b92c91b34d6:
>
> perf map: Optimize maps__fixup_overlappings() (2018-08-08 15:56:00 -0300)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.19-20180815
>
> for you to fetch changes up to 6855dc41b24619c3d1de3dbd27dd0546b0e45272:
>
> x86: Add entry trampolines to kcore
I don’t object per se, but once I finish rebasing and testing my entry trampoline removal code, this will probably have to go.
^ permalink raw reply [flat|nested] 48+ messages in thread
* [GIT PULL 00/35] perf/core improvements and fixes
@ 2018-08-15 15:05 Arnaldo Carvalho de Melo
2018-08-15 15:21 ` Andy Lutomirski
2018-08-18 11:17 ` Ingo Molnar
0 siblings, 2 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-15 15:05 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users,
Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Kapshuk,
Alexander Shishkin, Andi Kleen, Andrew Morton, Andy Lutomirski,
Aneesh Kumar, Benno Evers, Dave Hansen, David Ahern,
Dongjiu Geng, H . Peter Anvin, Jiri Olsa, Joerg Roedel,
Kim Phillips, Krister Johansen, linux-trace-devel,
Michael Ellerman, Namhyung Kim, Naveen N . Rao, Peter Zijlstra,
Ravi Bangoria, Sandipan Das, stable, Steven Rostedt,
Thomas Gleixner, Tzvetomir Stoyanov, Wang Nan, x86,
Yordan Karadzhov
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Hi Ingo,
Please consider pulling, this is on top of
perf-core-for-mingo-4.19-20180809, that is not yet in tip.
Thanks,
- Arnaldo
Test results at the end of this message, as usual.
The following changes since commit 6a9405b56c274024564f9014bba97b92c91b34d6:
perf map: Optimize maps__fixup_overlappings() (2018-08-08 15:56:00 -0300)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.19-20180815
for you to fetch changes up to 6855dc41b24619c3d1de3dbd27dd0546b0e45272:
x86: Add entry trampolines to kcore (2018-08-14 19:13:26 -0300)
----------------------------------------------------------------
perf/core improvements ad fixes:
kernel:
. kallsyms, x86: Export addresses of PTI entry trampolines (Alexander Shishkin)
. kallsyms: Simplify update_iter_mod() (Adrian Hunter)
. x86: Add entry trampolines to kcore (Adrian Hunter)
Hardware tracing:
. Fix auxtrace queue resize (Adrian Hunter)
Arch specific:
. Fix uninitialized ARM SPE record error variable (Kim Phillips)
. Fix trace event post-processing in powerpc (Sandipan Das)
Build:
. Fix check-headers.sh AND list path of execution (Alexander Kapshuk)
. Remove -mcet and -fcf-protection when building the python binding
with older clang versions (Arnaldo Carvalho de Melo)
. Make check-headers.sh check based on kernel dir (Jiri Olsa)
. Move syscall_64.tbl check into check-headers.sh (Jiri Olsa)
Infrastructure:
. Check for null when copying nsinfo. (Benno Evers)
Libraries:
. Rename libtraceevent prefixes, prep work for making it a shared
library generaly available (Tzvetomir Stoyanov (VMware))
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Adrian Hunter (3):
perf auxtrace: Fix queue resize
kallsyms: Simplify update_iter_mod()
x86: Add entry trampolines to kcore
Alexander Kapshuk (1):
perf tools: Fix check-headers.sh AND list path of execution
Alexander Shishkin (1):
kallsyms, x86: Export addresses of PTI entry trampolines
Arnaldo Carvalho de Melo (1):
perf python: Remove -mcet and -fcf-protection when building with clang
Benno Evers (1):
perf tools: Check for null when copying nsinfo.
Jiri Olsa (2):
perf tools: Make check-headers.sh check based on kernel dir
perf tools: Move syscall_64.tbl check into check-headers.sh
Kim Phillips (1):
perf arm spe: Fix uninitialized record error variable
Sandipan Das (1):
perf probe powerpc: Fix trace event post-processing
Tzvetomir Stoyanov (VMware) (24):
tools lib traceevent, perf tools: Rename struct pevent to struct tep_handle
tools lib traceevent, perf tools: Rename 'struct pevent_record' to 'struct tep_record'
tools lib traceevent, perf tools: Rename pevent plugin related APIs
tools lib traceevent, perf tools: Rename pevent alloc / free APIs
tools lib traceevent, perf tools: Rename pevent find APIs
tools lib traceevent, perf tools: Rename pevent parse APIs
tools lib traceevent, perf tools: Rename pevent print APIs
tools lib traceevent, perf tools: Rename pevent_read_number_* APIs
tools lib traceevent, perf tools: Rename pevent_register_* APIs
tools lib traceevent, perf tools: Rename pevent_set_* APIs
tools lib traceevent, perf tools: Rename traceevent_* APIs
tools lib traceevent, perf tools: Rename 'enum pevent_flag' to 'enum tep_flag'
tools lib traceevent, tools lib lockdep: Rename 'enum pevent_errno' to 'enum tep_errno'
tools lib traceevent: Rename pevent_function* APIs
tools lib traceevent, perf tools: Rename traceevent_plugin_* APIs
tools lib traceevent: Rename pevent_filter* APIs
tools lib traceevent: Rename pevent_register / unregister APIs
tools lib traceevent: Rename pevent_data_ APIs
tools lib traceevent: Rename pevent field APIs
tools lib traceevent: Rename pevent_find_* APIs
tools lib traceevent: Rename various pevent get/set/is APIs
tools lib traceevent: Rename internal parser related APIs
tools lib traceevent: Rename various pevent APIs
tools lib traceevent: Rename static variables and functions in event-parse.c
arch/x86/mm/cpu_entry_area.c | 33 +
fs/proc/kcore.c | 7 +-
include/linux/kcore.h | 13 +
kernel/kallsyms.c | 51 +-
tools/lib/lockdep/Makefile | 4 +-
tools/lib/traceevent/Makefile | 4 +-
tools/lib/traceevent/event-parse.c | 696 ++++++++++-----------
tools/lib/traceevent/event-parse.h | 458 +++++++-------
tools/lib/traceevent/event-plugin.c | 70 +--
tools/lib/traceevent/parse-filter.c | 288 ++++-----
tools/lib/traceevent/plugin_cfg80211.c | 20 +-
tools/lib/traceevent/plugin_function.c | 34 +-
tools/lib/traceevent/plugin_hrtimer.c | 56 +-
tools/lib/traceevent/plugin_jbd2.c | 36 +-
tools/lib/traceevent/plugin_kmem.c | 66 +-
tools/lib/traceevent/plugin_kvm.c | 154 ++---
tools/lib/traceevent/plugin_mac80211.c | 28 +-
tools/lib/traceevent/plugin_sched_switch.c | 60 +-
tools/lib/traceevent/plugin_scsi.c | 24 +-
tools/lib/traceevent/plugin_xen.c | 20 +-
tools/perf/arch/arm64/util/arm-spe.c | 1 +
tools/perf/arch/powerpc/util/sym-handling.c | 4 +-
tools/perf/arch/x86/Makefile | 3 -
tools/perf/builtin-kmem.c | 6 +-
tools/perf/builtin-report.c | 6 +-
tools/perf/builtin-script.c | 6 +-
tools/perf/check-headers.sh | 17 +-
tools/perf/util/auxtrace.c | 3 +
tools/perf/util/data-convert-bt.c | 6 +-
tools/perf/util/evsel.c | 2 +-
tools/perf/util/header.c | 6 +-
tools/perf/util/machine.h | 2 +-
tools/perf/util/namespaces.c | 3 +
tools/perf/util/python.c | 10 +-
.../perf/util/scripting-engines/trace-event-perl.c | 2 +-
.../util/scripting-engines/trace-event-python.c | 6 +-
tools/perf/util/setup.py | 10 +-
tools/perf/util/sort.c | 16 +-
tools/perf/util/sort.h | 2 +-
tools/perf/util/trace-event-parse.c | 34 +-
tools/perf/util/trace-event-read.c | 44 +-
tools/perf/util/trace-event-scripting.c | 4 +-
tools/perf/util/trace-event.c | 28 +-
tools/perf/util/trace-event.h | 20 +-
44 files changed, 1230 insertions(+), 1133 deletions(-)
Test results:
The first ones are container (docker) based builds of tools/perf with
and without libelf support. Where clang is available, it is also used
to build perf with/without libelf, and building with LIBCLANGLLVM=1
(built-in clang) with gcc and clang when clang and its devel libraries
are installed.
The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.
Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.
# dm
1 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0
2 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822
3 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0
4 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0
5 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0
6 alpine:edge : Ok gcc (Alpine 6.4.0) 6.4.0
7 amazonlinux:1 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
8 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
9 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
10 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
11 centos:5 : Ok gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
12 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
13 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
14 debian:7 : Ok gcc (Debian 4.7.2-5) 4.7.2
15 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u1) 4.9.2
16 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
17 debian:experimental : Ok gcc (Debian 8.2.0-1) 8.2.0
18 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 8.1.0-12) 8.1.0
19 debian:experimental-x-mips : Ok mips-linux-gnu-gcc (Debian 8.1.0-12) 8.1.0
20 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 8.1.0-12) 8.1.0
21 debian:experimental-x-mipsel : Ok mipsel-linux-gnu-gcc (Debian 8.1.0-12) 8.1.0
22 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
23 fedora:21 : Ok gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
24 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
25 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
26 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
27 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
28 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
29 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
30 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)
31 fedora:28 : Ok gcc (GCC) 8.1.1 20180712 (Red Hat 8.1.1-5)
32 fedora:rawhide : Ok gcc (GCC) 8.0.1 20180324 (Red Hat 8.0.1-0.20)
33 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 7.3.0-r3 p1.4) 7.3.0
34 mageia:5 : Ok gcc (GCC) 4.9.2
35 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0
36 opensuse:13.2 : Ok gcc (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
37 opensuse:42.1 : Ok gcc (SUSE Linux) 4.8.5
38 opensuse:42.2 : Ok gcc (SUSE Linux) 4.8.5
39 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5
40 opensuse:tumbleweed : Ok gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
41 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
42 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28.0.1)
43 ubuntu:12.04.5 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
44 ubuntu:14.04.4 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
45 ubuntu:14.04.4-x-linaro-arm64 : Ok aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
46 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
47 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
48 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
49 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
50 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
51 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
52 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
53 ubuntu:16.10 : Ok gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
54 ubuntu:17.10 : Ok gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
55 ubuntu:18.04 : Ok gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
56 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-16ubuntu3) 7.3.0
57 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-16ubuntu3) 7.3.0
58 ubuntu:18.04-x-m68k : Ok m68k-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
59 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
60 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
61 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
62 ubuntu:18.04-x-riscv64 : Ok riscv64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
63 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
64 ubuntu:18.04-x-sh4 : Ok sh4-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
65 ubuntu:18.04-x-sparc64 : Ok sparc64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
66 ubuntu:18.10 : Ok gcc (Ubuntu 8.2.0-1ubuntu2) 8.2.0
#
# uname -a
Linux seventh 4.18.0-rc8-00002-g1236568ee3cb #1 SMP Wed Aug 8 09:39:17 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
# git log --oneline -1
6855dc41b246 (HEAD -> perf/core) x86: Add entry trampolines to kcore
# perf version --build-options
perf version 4.18.rc7.g6855dc
dwarf: [ on ] # HAVE_DWARF_SUPPORT
dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT
glibc: [ on ] # HAVE_GLIBC_SUPPORT
gtk2: [ on ] # HAVE_GTK2_SUPPORT
syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT
libbfd: [ on ] # HAVE_LIBBFD_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
libperl: [ on ] # HAVE_LIBPERL_SUPPORT
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT
libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
lzma: [ on ] # HAVE_LZMA_SUPPORT
get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : Ok
9: Parse perf pmu format : Ok
10: DSO data read : Ok
11: DSO data cache : Ok
12: DSO data reopen : Ok
13: Roundtrip evsel->name : Ok
14: Parse sched tracepoints fields : Ok
15: syscalls:sys_enter_openat event fields : Ok
16: Setup struct perf_event_attr : Ok
17: Match and link multiple hists : Ok
18: 'import perf' in python : Ok
19: Breakpoint overflow signal handler : Ok
20: Breakpoint overflow sampling : Ok
21: Breakpoint accounting : Ok
22: Number of exit events of a simple workload : Ok
23: Software clock events period values : Ok
24: Object code reading : Ok
25: Sample parsing : Ok
26: Use a dummy software event to keep tracking : Ok
27: Parse with no sample_id_all bit set : Ok
28: Filter hist entries : Ok
29: Lookup mmap thread : Ok
30: Share thread mg : Ok
31: Sort output of hist entries : Ok
32: Cumulate child hist entries : Ok
33: Track with sched_switch : Ok
34: Filter fds with revents mask in a fdarray : Ok
35: Add fd to a fdarray, making it autogrow : Ok
36: kmod_path__parse : Ok
37: Thread map : Ok
38: LLVM search and compile :
38.1: Basic BPF llvm compile : Ok
38.2: kbuild searching : Ok
38.3: Compile source for BPF prologue generation : Ok
38.4: Compile source for BPF relocation : Ok
39: Session topology : Ok
40: BPF filter :
40.1: Basic BPF filtering : Ok
40.2: BPF pinning : Ok
40.3: BPF prologue generation : Ok
40.4: BPF relocation checker : Ok
41: Synthesize thread map : Ok
42: Remove thread map : Ok
43: Synthesize cpu map : Ok
44: Synthesize stat config : Ok
45: Synthesize stat : Ok
46: Synthesize stat round : Ok
47: Synthesize attr update : Ok
48: Event times : Ok
49: Read backward ring buffer : Ok
50: Print cpu map : Ok
51: Probe SDT events : Ok
52: is_printable_array : Ok
53: Print bitmap : Ok
54: perf hooks : Ok
55: builtin clang support : Skip (not compiled in)
56: unit_number__scnprintf : Ok
57: mem2node : Ok
58: x86 rdpmc : Ok
59: Convert perf time to TSC : Ok
60: DWARF unwind : Ok
61: x86 instruction decoder - new instructions : Ok
62: probe libc's inet_pton & backtrace it with ping : Ok
63: Check open filename arg using perf trace + vfs_getname: Ok
64: Use vfs_getname probe to get syscall args filenames : Ok
65: Add vfs_getname probe to get syscall args filenames : Ok
#
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_libaudit_O: make NO_LIBAUDIT=1
make_tags_O: make tags
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_libelf_O: make NO_LIBELF=1
make_no_libbpf_O: make NO_LIBBPF=1
make_util_map_o_O: make util/map.o
make_doc_O: make doc
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_demangle_O: make NO_DEMANGLE=1
make_static_O: make LDFLAGS=-static
make_install_O: make install
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_install_bin_O: make install-bin
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_slang_O: make NO_SLANG=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_install_prefix_O: make install prefix=/tmp/krava
make_debug_O: make DEBUG=1
make_no_libperl_O: make NO_LIBPERL=1
make_pure_O: make
make_no_libnuma_O: make NO_LIBNUMA=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_help_O: make help
make_no_newt_O: make NO_NEWT=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_clean_all_O: make clean all
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_perf_o_O: make perf.o
make_with_babeltrace_O: make LIBBABELTRACE=1
OK
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [GIT PULL 00/35] perf/core improvements and fixes
2017-03-06 19:37 Arnaldo Carvalho de Melo
@ 2017-03-07 7:17 ` Ingo Molnar
0 siblings, 0 replies; 48+ messages in thread
From: Ingo Molnar @ 2017-03-07 7:17 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Ananth N Mavinakayanahalli, Andi Kleen,
Andrew Morton, Borislav Petkov, Charles Baylis, Dave Hansen,
David Ahern, Davidlohr Bueso, David Windsor, Elena Reshetova,
Frederic Weisbecker, Greg Kroah-Hartman, Hans Liljestrand,
Jiri Hladky, Jiri Olsa, Kan Liang, Karol Wachowski, Kees Kook,
kernel-team, linuxppc-dev, Mark Rutland, Masami Hiramatsu,
Matija Glavinic Pecotic, Maxim Kuvyrkov, Michael Ellerman,
Namhyung Kim, Naveen N . Rao, Peter Zijlstra, Piotr Luc,
Robert Richter, Srinivas Pandruvada, Steven Rostedt,
Vince Weaver, Wang Nan
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:
>
> Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306
>
> for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:
>
> perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> New features:
>
> - Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)
>
> E.g.:
>
> # perf report -s symbol_size,symbol
>
> Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
> Overhead Symbol size Symbol
> 14.55% 326 [k] flush_tlb_mm_range
> 7.20% 1045 [k] filemap_map_pages
> 5.82% 124 [k] vma_interval_tree_insert
> 5.18% 2430 [k] unmap_page_range
> 2.57% 571 [k] vma_interval_tree_remove
> 1.94% 494 [k] page_add_file_rmap
> 1.82% 740 [k] page_remove_rmap
> 1.66% 1017 [k] release_pages
> 1.57% 1636 [k] update_blocked_averages
> 1.57% 76 [k] unlock_page
>
> - Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
>
> Change in behaviour:
>
> - Make system wide (-a) the default option if no target was specified and one
> of following conditions is met:
>
> - No workload specified (current behaviour)
>
> - A workload is specified but all requested events are system wide ones,
> like uncore ones. (Jiri Olsa)
>
> Fixes:
>
> - Add missing initialization to the instruction decoder used in the
> intel PT/BTS code, which was causing lots of failures in 'perf test',
> looking for a value when there was none (Adrian Hunter)
>
> Infrastructure:
>
> - Add arch code needed to adopt the kernel's refcount_t to aid in
> catching bugs when using atomic_t as a reference counter, basically
> cmpxchg related functions (Arnaldo Carvalho de Melo)
>
> - Convert the code using atomic_t as reference counts to refcount_t
> (Elena Rashetova)
>
> - Add feature test for sched_getcpu() to more easily check for its
> presence in the many libc implementations and accross different
> versions of such C libraries (Arnaldo Carvalho de Melo)
>
> - Issue a HW watchdog disable hint in 'perf stat' for when some of the
> requested events can't get counted because a PMU counter is taken by that
> watchdog (Borislav Petkov).
>
> - Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
>
> Documentation:
>
> - Clarify the term 'convergence' in:
>
> perf bench numa numa-mem -h --show_convergence (Jiri Olsa)
>
> Kernel code:
>
> - Ensure probe location is at function entry in kretprobes (Naveen N. Rao)
>
> - Allow return probes with offsets and absolute addresses (Naveen N. Rao)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Adrian Hunter (1):
> perf intel-PT/BTS: Add missing initialization
>
> Arnaldo Carvalho de Melo (12):
> tools include: Adopt __compiletime_error
> tools arch x86: Include asm/cmpxchg.h
> tools arch x86: Introduce atomic_cmpxchg()
> tools include: Introduce atomic_cmpxchg_{relaxed,release}()
> tools include: Provide gcc based cmpxchg fallback for !x86
> tools include: Add UINT_MAX def to kernel.h
> tools include: Adopt kernel's refcount.h
> perf evlist: Clarify a bit the use of perf_mmap->refcnt
> tools build: Add test for sched_getcpu()
> perf bench futex: Use __maybe_unused
> perf bench futex: Fix build on musl + clang
> tools build: Use the same CC for feature detection and actual build
>
> Borislav Petkov (1):
> perf stat: Issue a HW watchdog disable hint
>
> Charles Baylis (1):
> perf tools: Allow sorting by symbol size
>
> Elena Reshetova (9):
> perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
> perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
> perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
> perf dso: Convert dso.refcnt from atomic_t to refcount_t
> perf map: Convert map.refcnt from atomic_t to refcount_t
> perf map: Convert map_groups.refcnt from atomic_t to refcount_t
> perf evlist: Convert perf_map.refcnt from atomic_t to refcount_t
> perf thread: convert thread.refcnt from atomic_t to refcount_t
> perf thread_map: Convert thread_map.refcnt from atomic_t to refcount_t
>
> Jiri Olsa (2):
> perf tools: Force uncore events to system wide monitoring
> perf bench numa: Add more comment for -c option
>
> Karol Wachowski (1):
> perf vendor events: Add mapping for KnightsMill PMU events
>
> Namhyung Kim (4):
> perf ftrace: Add support for --pid option
> perf cpumap: Introduce cpu_map__snprint_mask()
> perf ftrace: Add support for -a and -C option
> perf ftrace: Use pager for displaying result
>
> Naveen N. Rao (3):
> kretprobes: Ensure probe location is at function entry
> trace/kprobes: Allow return probes with offsets and absolute addresses
> perf probe: Generalize probe event file open routine
>
> Steven Rostedt (VMware) (1):
> trace/kprobes: Add back warning about offset in return probes
>
> include/linux/kprobes.h | 1 +
> kernel/kprobes.c | 13 ++
> kernel/trace/trace.c | 1 +
> kernel/trace/trace_kprobe.c | 9 +-
> tools/arch/x86/include/asm/atomic.h | 7 +
> tools/arch/x86/include/asm/cmpxchg.h | 89 ++++++++++++
> tools/build/Makefile.feature | 1 +
> tools/build/feature/Makefile | 10 +-
> tools/build/feature/test-all.c | 5 +
> tools/build/feature/test-sched_getcpu.c | 7 +
> tools/include/asm-generic/atomic-gcc.h | 8 ++
> tools/include/linux/atomic.h | 6 +
> tools/include/linux/compiler-gcc.h | 4 +
> tools/include/linux/compiler.h | 4 +
> tools/include/linux/kernel.h | 4 +
> tools/include/linux/refcount.h | 151 ++++++++++++++++++++
> tools/perf/Documentation/perf-ftrace.txt | 18 +++
> tools/perf/Documentation/perf-report.txt | 1 +
> tools/perf/MANIFEST | 2 +
> tools/perf/Makefile.config | 4 +
> tools/perf/bench/futex-hash.c | 1 +
> tools/perf/bench/futex-lock-pi.c | 1 +
> tools/perf/bench/futex-requeue.c | 1 +
> tools/perf/bench/futex-wake-parallel.c | 1 +
> tools/perf/bench/futex-wake.c | 1 +
> tools/perf/bench/futex.h | 10 +-
> tools/perf/bench/numa.c | 3 +-
> tools/perf/builtin-ftrace.c | 152 +++++++++++++++++----
> tools/perf/builtin-stat.c | 44 +++++-
> tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
> tools/perf/tests/cpumap.c | 2 +-
> tools/perf/tests/thread-map.c | 6 +-
> tools/perf/tests/thread-mg-share.c | 12 +-
> tools/perf/util/cgroup.c | 6 +-
> tools/perf/util/cgroup.h | 4 +-
> tools/perf/util/cloexec.h | 6 -
> tools/perf/util/comm.c | 15 +-
> tools/perf/util/cpumap.c | 62 +++++++--
> tools/perf/util/cpumap.h | 5 +-
> tools/perf/util/dso.c | 6 +-
> tools/perf/util/dso.h | 4 +-
> tools/perf/util/evlist.c | 31 +++--
> tools/perf/util/evlist.h | 4 +-
> tools/perf/util/hist.h | 1 +
> .../util/intel-pt-decoder/intel-pt-insn-decoder.c | 2 +
> tools/perf/util/machine.c | 2 +-
> tools/perf/util/map.c | 10 +-
> tools/perf/util/map.h | 10 +-
> tools/perf/util/parse-events.c | 5 +-
> tools/perf/util/probe-file.c | 20 +--
> tools/perf/util/probe-file.h | 1 +
> tools/perf/util/sort.c | 41 ++++++
> tools/perf/util/sort.h | 1 +
> tools/perf/util/thread.c | 6 +-
> tools/perf/util/thread.h | 4 +-
> tools/perf/util/thread_map.c | 20 +--
> tools/perf/util/thread_map.h | 4 +-
> tools/perf/util/util.h | 4 +-
> tools/scripts/Makefile.include | 9 ++
> 59 files changed, 720 insertions(+), 143 deletions(-)
> create mode 100644 tools/arch/x86/include/asm/cmpxchg.h
> create mode 100644 tools/build/feature/test-sched_getcpu.c
> create mode 100644 tools/include/linux/refcount.h
Pulled, thanks a lot Arnaldo!
Ingo
^ permalink raw reply [flat|nested] 48+ messages in thread
* [GIT PULL 00/35] perf/core improvements and fixes
@ 2017-03-06 19:37 Arnaldo Carvalho de Melo
2017-03-07 7:17 ` Ingo Molnar
0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-03-06 19:37 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Ananth N Mavinakayanahalli, Andi Kleen,
Andrew Morton, Borislav Petkov, Charles Baylis, Dave Hansen,
David Ahern, Davidlohr Bueso, David Windsor, Elena Reshetova,
Frederic Weisbecker, Greg Kroah-Hartman, Hans Liljestrand,
Jiri Hladky, Jiri Olsa, Kan Liang, Karol Wachowski, Kees Kook,
kernel-team, linuxppc-dev, Mark Rutland, Masami Hiramatsu,
Matija Glavinic Pecotic, Maxim Kuvyrkov, Michael Ellerman,
Namhyung Kim, Naveen N . Rao, Peter Zijlstra, Piotr Luc,
Robert Richter, Srinivas Pandruvada, Steven Rostedt,
Vince Weaver, Wang Nan
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Hi Ingo,
Please consider pulling,
- Arnaldo
Test results at the end of this message, as usual.
The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:
Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306
for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:
perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
New features:
- Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)
E.g.:
# perf report -s symbol_size,symbol
Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
Overhead Symbol size Symbol
14.55% 326 [k] flush_tlb_mm_range
7.20% 1045 [k] filemap_map_pages
5.82% 124 [k] vma_interval_tree_insert
5.18% 2430 [k] unmap_page_range
2.57% 571 [k] vma_interval_tree_remove
1.94% 494 [k] page_add_file_rmap
1.82% 740 [k] page_remove_rmap
1.66% 1017 [k] release_pages
1.57% 1636 [k] update_blocked_averages
1.57% 76 [k] unlock_page
- Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
Change in behaviour:
- Make system wide (-a) the default option if no target was specified and one
of following conditions is met:
- No workload specified (current behaviour)
- A workload is specified but all requested events are system wide ones,
like uncore ones. (Jiri Olsa)
Fixes:
- Add missing initialization to the instruction decoder used in the
intel PT/BTS code, which was causing lots of failures in 'perf test',
looking for a value when there was none (Adrian Hunter)
Infrastructure:
- Add arch code needed to adopt the kernel's refcount_t to aid in
catching bugs when using atomic_t as a reference counter, basically
cmpxchg related functions (Arnaldo Carvalho de Melo)
- Convert the code using atomic_t as reference counts to refcount_t
(Elena Rashetova)
- Add feature test for sched_getcpu() to more easily check for its
presence in the many libc implementations and accross different
versions of such C libraries (Arnaldo Carvalho de Melo)
- Issue a HW watchdog disable hint in 'perf stat' for when some of the
requested events can't get counted because a PMU counter is taken by that
watchdog (Borislav Petkov).
- Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
Documentation:
- Clarify the term 'convergence' in:
perf bench numa numa-mem -h --show_convergence (Jiri Olsa)
Kernel code:
- Ensure probe location is at function entry in kretprobes (Naveen N. Rao)
- Allow return probes with offsets and absolute addresses (Naveen N. Rao)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Adrian Hunter (1):
perf intel-PT/BTS: Add missing initialization
Arnaldo Carvalho de Melo (12):
tools include: Adopt __compiletime_error
tools arch x86: Include asm/cmpxchg.h
tools arch x86: Introduce atomic_cmpxchg()
tools include: Introduce atomic_cmpxchg_{relaxed,release}()
tools include: Provide gcc based cmpxchg fallback for !x86
tools include: Add UINT_MAX def to kernel.h
tools include: Adopt kernel's refcount.h
perf evlist: Clarify a bit the use of perf_mmap->refcnt
tools build: Add test for sched_getcpu()
perf bench futex: Use __maybe_unused
perf bench futex: Fix build on musl + clang
tools build: Use the same CC for feature detection and actual build
Borislav Petkov (1):
perf stat: Issue a HW watchdog disable hint
Charles Baylis (1):
perf tools: Allow sorting by symbol size
Elena Reshetova (9):
perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
perf dso: Convert dso.refcnt from atomic_t to refcount_t
perf map: Convert map.refcnt from atomic_t to refcount_t
perf map: Convert map_groups.refcnt from atomic_t to refcount_t
perf evlist: Convert perf_map.refcnt from atomic_t to refcount_t
perf thread: convert thread.refcnt from atomic_t to refcount_t
perf thread_map: Convert thread_map.refcnt from atomic_t to refcount_t
Jiri Olsa (2):
perf tools: Force uncore events to system wide monitoring
perf bench numa: Add more comment for -c option
Karol Wachowski (1):
perf vendor events: Add mapping for KnightsMill PMU events
Namhyung Kim (4):
perf ftrace: Add support for --pid option
perf cpumap: Introduce cpu_map__snprint_mask()
perf ftrace: Add support for -a and -C option
perf ftrace: Use pager for displaying result
Naveen N. Rao (3):
kretprobes: Ensure probe location is at function entry
trace/kprobes: Allow return probes with offsets and absolute addresses
perf probe: Generalize probe event file open routine
Steven Rostedt (VMware) (1):
trace/kprobes: Add back warning about offset in return probes
include/linux/kprobes.h | 1 +
kernel/kprobes.c | 13 ++
kernel/trace/trace.c | 1 +
kernel/trace/trace_kprobe.c | 9 +-
tools/arch/x86/include/asm/atomic.h | 7 +
tools/arch/x86/include/asm/cmpxchg.h | 89 ++++++++++++
tools/build/Makefile.feature | 1 +
tools/build/feature/Makefile | 10 +-
tools/build/feature/test-all.c | 5 +
tools/build/feature/test-sched_getcpu.c | 7 +
tools/include/asm-generic/atomic-gcc.h | 8 ++
tools/include/linux/atomic.h | 6 +
tools/include/linux/compiler-gcc.h | 4 +
tools/include/linux/compiler.h | 4 +
tools/include/linux/kernel.h | 4 +
tools/include/linux/refcount.h | 151 ++++++++++++++++++++
tools/perf/Documentation/perf-ftrace.txt | 18 +++
tools/perf/Documentation/perf-report.txt | 1 +
tools/perf/MANIFEST | 2 +
tools/perf/Makefile.config | 4 +
tools/perf/bench/futex-hash.c | 1 +
tools/perf/bench/futex-lock-pi.c | 1 +
tools/perf/bench/futex-requeue.c | 1 +
tools/perf/bench/futex-wake-parallel.c | 1 +
tools/perf/bench/futex-wake.c | 1 +
tools/perf/bench/futex.h | 10 +-
tools/perf/bench/numa.c | 3 +-
tools/perf/builtin-ftrace.c | 152 +++++++++++++++++----
tools/perf/builtin-stat.c | 44 +++++-
tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
tools/perf/tests/cpumap.c | 2 +-
tools/perf/tests/thread-map.c | 6 +-
tools/perf/tests/thread-mg-share.c | 12 +-
tools/perf/util/cgroup.c | 6 +-
tools/perf/util/cgroup.h | 4 +-
tools/perf/util/cloexec.h | 6 -
tools/perf/util/comm.c | 15 +-
tools/perf/util/cpumap.c | 62 +++++++--
tools/perf/util/cpumap.h | 5 +-
tools/perf/util/dso.c | 6 +-
tools/perf/util/dso.h | 4 +-
tools/perf/util/evlist.c | 31 +++--
tools/perf/util/evlist.h | 4 +-
tools/perf/util/hist.h | 1 +
.../util/intel-pt-decoder/intel-pt-insn-decoder.c | 2 +
tools/perf/util/machine.c | 2 +-
tools/perf/util/map.c | 10 +-
tools/perf/util/map.h | 10 +-
tools/perf/util/parse-events.c | 5 +-
tools/perf/util/probe-file.c | 20 +--
tools/perf/util/probe-file.h | 1 +
tools/perf/util/sort.c | 41 ++++++
tools/perf/util/sort.h | 1 +
tools/perf/util/thread.c | 6 +-
tools/perf/util/thread.h | 4 +-
tools/perf/util/thread_map.c | 20 +--
tools/perf/util/thread_map.h | 4 +-
tools/perf/util/util.h | 4 +-
tools/scripts/Makefile.include | 9 ++
59 files changed, 720 insertions(+), 143 deletions(-)
create mode 100644 tools/arch/x86/include/asm/cmpxchg.h
create mode 100644 tools/build/feature/test-sched_getcpu.c
create mode 100644 tools/include/linux/refcount.h
Test results:
The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.
Where clang is available, it is also used to build perf with/without libelf.
Several are cross builds, the ones with -x-ARCH, and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.
[root@jouet ~]# waitp `pidof perf` ; time dm
1 alpine:3.4: Ok
2 alpine:3.5: Ok
3 alpine:edge: Ok
4 android-ndk:r12b-arm: Ok
5 archlinux:latest: Ok
6 centos:5: Ok
7 centos:6: Ok
8 centos:7: Ok
9 debian:7: Ok
10 debian:8: Ok
11 debian:experimental: Ok
12 debian:experimental-x-arm64: Ok
13 debian:experimental-x-mips: Ok
14 debian:experimental-x-mips64: Ok
15 debian:experimental-x-mipsel: Ok
16 fedora:20: Ok
17 fedora:21: Ok
18 fedora:22: Ok
19 fedora:23: Ok
20 fedora:24: Ok
21 fedora:24-x-ARC-uClibc: Ok
22 fedora:25: Ok
23 fedora:rawhide: Ok
24 mageia:5: Ok
25 opensuse:13.2: Ok
26 opensuse:42.1: Ok
27 opensuse:tumbleweed: Ok
28 ubuntu:12.04.5: Ok
29 ubuntu:14.04.4: Ok
30 ubuntu:14.04.4-x-linaro-arm64: Ok
31 ubuntu:15.10: Ok
32 ubuntu:16.04: Ok
33 ubuntu:16.04-x-arm: Ok
34 ubuntu:16.04-x-arm64: Ok
35 ubuntu:16.04-x-powerpc: Ok
36 ubuntu:16.04-x-powerpc64: Ok
37 ubuntu:16.04-x-s390: Ok
38 ubuntu:16.10: Ok
39 ubuntu:17.04: Ok
[root@jouet ~]#
[root@zoo ~]# uname -a
Linux zoo 4.9.13-100.fc24.x86_64 #1 SMP Mon Feb 27 16:57:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@zoo ~]# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: PERF_RECORD_* events & perf_sample fields : Ok
7: Parse perf pmu format : Ok
8: DSO data read : Ok
9: DSO data cache : Ok
10: DSO data reopen : Ok
11: Roundtrip evsel->name : Ok
12: Parse sched tracepoints fields : Ok
13: syscalls:sys_enter_openat event fields : Ok
14: Setup struct perf_event_attr : Ok
15: Match and link multiple hists : Ok
16: 'import perf' in python : Ok
17: Breakpoint overflow signal handler : Ok
18: Breakpoint overflow sampling : Ok
19: Number of exit events of a simple workload : Ok
20: Software clock events period values : Ok
21: Object code reading : Ok
22: Sample parsing : Ok
23: Use a dummy software event to keep tracking: Ok
24: Parse with no sample_id_all bit set : Ok
25: Filter hist entries : Ok
26: Lookup mmap thread : Ok
27: Share thread mg : Ok
28: Sort output of hist entries : Ok
29: Cumulate child hist entries : Ok
30: Track with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: kmod_path__parse : Ok
34: Thread map : Ok
35: LLVM search and compile :
35.1: Basic BPF llvm compile : Ok
35.2: kbuild searching : Ok
35.3: Compile source for BPF prologue generation: Ok
35.4: Compile source for BPF relocation : Ok
36: Session topology : Ok
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF pinning : Ok
37.3: BPF prologue generation : Ok
37.4: BPF relocation checker : Ok
38: Synthesize thread map : Ok
39: Remove thread map : Ok
40: Synthesize cpu map : Ok
41: Synthesize stat config : Ok
42: Synthesize stat : Ok
43: Synthesize stat round : Ok
44: Synthesize attr update : Ok
45: Event times : Ok
46: Read backward ring buffer : Ok
47: Print cpu map : Ok
48: Probe SDT events : Ok
49: is_printable_array : Ok
50: Print bitmap : Ok
51: perf hooks : Ok
52: builtin clang support : Skip (not compiled in)
53: unit_number__scnprintf : Ok
54: x86 rdpmc : Ok
55: Convert perf time to TSC : Ok
56: DWARF unwind : Ok
57: x86 instruction decoder - new instructions : Ok
58: Intel cqm nmi context read : Skip
[root@zoo ~]#
[acme@jouet linux]$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_pure_O: make
make_doc_O: make doc
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_static_O: make LDFLAGS=-static
make_help_O: make help
make_no_libnuma_O: make NO_LIBNUMA=1
make_clean_all_O: make clean all
make_no_libelf_O: make NO_LIBELF=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_tags_O: make tags
make_debug_O: make DEBUG=1
make_no_newt_O: make NO_NEWT=1
make_install_prefix_O: make install prefix=/tmp/krava
make_install_bin_O: make install-bin
make_perf_o_O: make perf.o
make_no_slang_O: make NO_SLANG=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_util_map_o_O: make util/map.o
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libbpf_O: make NO_LIBBPF=1
make_install_O: make install
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
OK
[acme@jouet linux]$
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [GIT PULL 00/35] perf/core improvements and fixes
2016-08-23 21:03 Arnaldo Carvalho de Melo
@ 2016-08-24 9:09 ` Ingo Molnar
0 siblings, 0 replies; 48+ messages in thread
From: Ingo Molnar @ 2016-08-24 9:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Alexander Shishkin,
Alexander Yarygin, Alexey Brodkin, Alexei Starovoitov,
Arjan van de Ven, Colin King, David Ahern, He Kuang,
Hemant Kumar, Jiri Olsa, Masami Hiramatsu, Mathieu Poirier,
Namhyung Kim, Naohiro Aota, Pekka Enberg, Peter Zijlstra,
Rui Teng, Stanislav Fomichev, Steven Rostedt, Vince Weaver,
Vineet Gupta, Wang Nan, Arnaldo Carvalho de Melo
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Hi Ingo,
>
> Please consider pulling, I first merged tip/perf/urgent into a
> tip/perf/core and rebased the patches I had in acme/perf/core.
>
> - Arnaldo
>
> Build stats at the end of this message.
>
> The following changes since commit ce90c12d2453aa6be743719bb0a5d4040b92700f:
>
> Merge branch 'perf/urgent' into perf/core, to pick up fixes before merging new changes (2016-08-23 15:35:47 -0300)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160823
>
> for you to fetch changes up to 5e30d55c71de058e4156080fe32d426c22d094cb:
>
> perf record: Fix spelling mistake "Finshed" -> "Finished" (2016-08-23 17:06:40 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> User visible:
>
> . Allow configuring the default 'perf report -s' sort order in ~/.perfconfig,
> for instance, "sym,dso" may be more fitting for kernel developers. (Arnaldo Carvalho de Melo)
>
> - Support x8/x16/x32/x64 hexadecimal "types" in ftrace and 'perf probe' (Masami Hiramatsu)
>
> Infrastructure:
>
> - Skip running the feature tests for 'make install-doc' (Rui Teng)
>
> - Introduce tools/include/linux/time64.h with *SEC_PER_*SEC macros
> to use in all of tools/ (Arnaldo Carvalho de Melo)
>
> - Break down symbol__disassemble() into multiple functions, to ease
> future work on better reporting the errors that may take place in
> the various steps it performs (possibly decompressing kernel module
> files, getting build-id keyed files, calling objdump, parsing its
> output, etc) (Arnaldo Carvalho de Melo)
>
> - Typo fixes in various places (Colin Ian King)
>
> - Remove superfluous NULL check in the TUI code (Colin Ian King)
>
> - Allow displaying multiple header lines in the TUI browser, prep
> work for the 'perf c2c' browser (Jiri Olsa)
>
> - Copy coresight-pmu.h header file needed by perf tools (Mathieu Poirier)
>
> - Use __weak definition from linux/compiler.h (Rui Teng)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
>
> Arnaldo Carvalho de Melo (16):
> tools: Introduce tools/include/linux/time64.h for *SEC_PER_*SEC macros
> perf bench numa: Use NSEC_PER_U?SEC
> perf sched: Use linux/time64.h
> perf timechart: Use NSEC_PER_U?SEC
> perf bench sched-pipe: Use linux/time64.h, USEC_PER_SEC
> perf stat: Use *SEC_PER_*SEC macros
> perf bench mem: Use USEC_PER_SEC
> perf bench sched-messaging: Use USEC_PER_MSEC
> perf record: Use USEC_PER_MSEC
> perf kvm: Use NSEC_PER_USEC
> perf bench futex: Use NSEC_PER_USEC
> perf top: Use MSEC_PER_SEC
> perf disassemble: Move check for kallsyms + !kcore
> perf disassemble: Simplify logic for picking the filename to disassemble
> perf disassemble: Extract logic to find file to pass to objdump to a separate function
> perf report: Allow configuring the default sort order in ~/.perfconfig
>
> Colin Ian King (5):
> perf hists browser: Remove superfluous null check on map
> perf tools: Fix typo: "ehough" -> "enough"
> perf test bpf: Fix typo: "ehough" -> "enough"
> perf bpf: Fix typo: "ehough" -> "enough"
> perf record: Fix spelling mistake "Finshed" -> "Finished"
>
> Jiri Olsa (5):
> perf hists: Introduce nr_header_lines into struct perf_hpp_list
> perf hists: Add line argument into perf_hpp_fmt's header callback
> perf tools tui: Display multiple header lines
> perf tools stdio: Display multiple header lines
> perf hists: Add support for header span
>
> Masami Hiramatsu (6):
> ftrace: kprobe: uprobe: Add x8/x16/x32/x64 for hexadecimal types
> ftrace: probe: Add README entries for k/uprobe-events
> perf probe: Add supported for type casting by the running kernel
> perf probe: Support hexadecimal casting
> perf probe: Use hexadecimal type by default if possible
> ftrace: kprobe: uprobe: Show u8/u16/u32/u64 types in decimal
>
> Mathieu Poirier (1):
> tools: Copy coresight-pmu.h header file needed by perf tools
>
> Rui Teng (2):
> perf tools: Use __weak definition from linux/compiler.h
> perf tools: Skip running the feature tests for 'make install-doc'
>
> Documentation/trace/kprobetrace.txt | 9 ++-
> Documentation/trace/uprobetracer.txt | 9 ++-
> kernel/trace/trace.c | 24 ++++++
> kernel/trace/trace_kprobe.c | 4 +
> kernel/trace/trace_probe.c | 30 ++++----
> kernel/trace/trace_probe.h | 11 ++-
> kernel/trace/trace_uprobe.c | 4 +
> tools/include/linux/coresight-pmu.h | 39 ++++++++++
> tools/include/linux/time64.h | 12 +++
> tools/perf/Documentation/perf-config.txt | 4 +
> tools/perf/Documentation/perf-probe.txt | 5 +-
> tools/perf/MANIFEST | 2 +
> tools/perf/Makefile.perf | 5 +-
> tools/perf/bench/futex-requeue.c | 5 +-
> tools/perf/bench/futex-wake-parallel.c | 5 +-
> tools/perf/bench/futex-wake.c | 5 +-
> tools/perf/bench/mem-functions.c | 3 +-
> tools/perf/bench/numa.c | 53 ++++++-------
> tools/perf/bench/sched-messaging.c | 5 +-
> tools/perf/bench/sched-pipe.c | 9 ++-
> tools/perf/builtin-diff.c | 4 +-
> tools/perf/builtin-kvm.c | 11 +--
> tools/perf/builtin-record.c | 8 +-
> tools/perf/builtin-report.c | 4 +
> tools/perf/builtin-sched.c | 37 ++++-----
> tools/perf/builtin-script.c | 7 +-
> tools/perf/builtin-stat.c | 19 ++---
> tools/perf/builtin-timechart.c | 13 ++--
> tools/perf/builtin-top.c | 3 +-
> tools/perf/builtin-trace.c | 1 +
> tools/perf/perf.h | 7 --
> tools/perf/tests/backward-ring-buffer.c | 2 +-
> tools/perf/tests/bpf.c | 2 +-
> tools/perf/ui/browsers/hists.c | 50 +++++++++----
> tools/perf/ui/gtk/hists.c | 2 +-
> tools/perf/ui/hist.c | 4 +-
> tools/perf/ui/stdio/hist.c | 45 +++++++----
> tools/perf/util/annotate.c | 87 +++++++++++-----------
> tools/perf/util/bpf-loader.c | 2 +-
> tools/perf/util/debug.c | 10 +--
> tools/perf/util/header.c | 3 +-
> tools/perf/util/hist.h | 3 +-
> tools/perf/util/probe-file.c | 57 ++++++++++++++
> tools/perf/util/probe-file.h | 10 +++
> tools/perf/util/probe-finder.c | 19 +++--
> .../perf/util/scripting-engines/trace-event-perl.c | 5 +-
> .../util/scripting-engines/trace-event-python.c | 5 +-
> tools/perf/util/sort.c | 9 ++-
> tools/perf/util/sort.h | 2 +-
> tools/perf/util/svghelper.c | 11 +--
> tools/perf/util/util.c | 1 +
> tools/perf/util/util.h | 4 -
> 52 files changed, 464 insertions(+), 226 deletions(-)
> create mode 100644 tools/include/linux/coresight-pmu.h
> create mode 100644 tools/include/linux/time64.h
>
> Build Stats:
>
> News:
>
> The fedora:24-x-ARC-uClibc adds the ARC arch and the uClibc libc to the mix of
> targets tested, still with a minimal build that doesn't include even libelf,
> that will be added soon, using a more full featured pre-built toolchain provided
> by the Synopsys folks.
>
> [root@jouet ~]# time dm
> 1 70.304638757 alpine:3.4: Ok
> 2 24.806303766 android-ndk:r12b-arm: Ok
> 3 71.829633643 archlinux:latest: Ok
> 4 39.316551941 centos:5: Ok
> 5 59.282978228 centos:6: Ok
> 6 69.836088394 centos:7: Ok
> 7 63.476952272 debian:7: Ok
> 8 69.450110099 debian:8: Ok
> 9 72.484714796 debian:experimental: Ok
> 10 69.730035221 fedora:20: Ok
> 11 73.629813614 fedora:21: Ok
> 12 71.955425760 fedora:22: Ok
> 13 73.015579053 fedora:23: Ok
> 14 81.186943795 fedora:24: Ok
> 15 31.401503154 fedora:24-x-ARC-uClibc: Ok
> 16 78.366606942 fedora:rawhide: Ok
> 17 77.434661064 mageia:5: Ok
> 18 74.491020416 opensuse:13.2: Ok
> 19 71.673570141 opensuse:42.1: Ok
> 20 79.587415167 opensuse:tumbleweed: Ok
> 21 57.802976895 ubuntu:12.04.5: Ok
> 22 68.505810699 ubuntu:14.04.4: Ok
> 23 69.478057262 ubuntu:15.10: Ok
> 24 65.596669700 ubuntu:16.04: Ok
> 25 52.759886216 ubuntu:16.04-x-arm: Ok
> 26 52.389293077 ubuntu:16.04-x-arm64: Ok
> 27 52.319141734 ubuntu:16.04-x-powerpc64: Ok
> 28 54.378287607 ubuntu:16.04-x-powerpc64el: Ok
> 29 72.794919209 ubuntu:16.10: Ok
> 30 53.351899868 ubuntu:16.10-x-s390: Ok
> 1922.64s
>
> real 32m3.246s
> user 0m1.858s
> sys 0m2.363s
> [root@jouet ~]#
Pulled, thanks a lot Arnaldo!
Ingo
^ permalink raw reply [flat|nested] 48+ messages in thread
* [GIT PULL 00/35] perf/core improvements and fixes
@ 2016-08-23 21:03 Arnaldo Carvalho de Melo
2016-08-24 9:09 ` Ingo Molnar
0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-08-23 21:03 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Alexander Yarygin, Alexey Brodkin,
Alexei Starovoitov, Arjan van de Ven, Colin King, David Ahern,
He Kuang, Hemant Kumar, Jiri Olsa, Masami Hiramatsu,
Mathieu Poirier, Namhyung Kim, Naohiro Aota, Pekka Enberg,
Peter Zijlstra, Rui Teng, Stanislav Fomichev, Steven Rostedt,
Vince Weaver, Vineet Gupta, Wang Nan, Arnaldo Carvalho de Melo
Hi Ingo,
Please consider pulling, I first merged tip/perf/urgent into a
tip/perf/core and rebased the patches I had in acme/perf/core.
- Arnaldo
Build stats at the end of this message.
The following changes since commit ce90c12d2453aa6be743719bb0a5d4040b92700f:
Merge branch 'perf/urgent' into perf/core, to pick up fixes before merging new changes (2016-08-23 15:35:47 -0300)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160823
for you to fetch changes up to 5e30d55c71de058e4156080fe32d426c22d094cb:
perf record: Fix spelling mistake "Finshed" -> "Finished" (2016-08-23 17:06:40 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
User visible:
. Allow configuring the default 'perf report -s' sort order in ~/.perfconfig,
for instance, "sym,dso" may be more fitting for kernel developers. (Arnaldo Carvalho de Melo)
- Support x8/x16/x32/x64 hexadecimal "types" in ftrace and 'perf probe' (Masami Hiramatsu)
Infrastructure:
- Skip running the feature tests for 'make install-doc' (Rui Teng)
- Introduce tools/include/linux/time64.h with *SEC_PER_*SEC macros
to use in all of tools/ (Arnaldo Carvalho de Melo)
- Break down symbol__disassemble() into multiple functions, to ease
future work on better reporting the errors that may take place in
the various steps it performs (possibly decompressing kernel module
files, getting build-id keyed files, calling objdump, parsing its
output, etc) (Arnaldo Carvalho de Melo)
- Typo fixes in various places (Colin Ian King)
- Remove superfluous NULL check in the TUI code (Colin Ian King)
- Allow displaying multiple header lines in the TUI browser, prep
work for the 'perf c2c' browser (Jiri Olsa)
- Copy coresight-pmu.h header file needed by perf tools (Mathieu Poirier)
- Use __weak definition from linux/compiler.h (Rui Teng)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Arnaldo Carvalho de Melo (16):
tools: Introduce tools/include/linux/time64.h for *SEC_PER_*SEC macros
perf bench numa: Use NSEC_PER_U?SEC
perf sched: Use linux/time64.h
perf timechart: Use NSEC_PER_U?SEC
perf bench sched-pipe: Use linux/time64.h, USEC_PER_SEC
perf stat: Use *SEC_PER_*SEC macros
perf bench mem: Use USEC_PER_SEC
perf bench sched-messaging: Use USEC_PER_MSEC
perf record: Use USEC_PER_MSEC
perf kvm: Use NSEC_PER_USEC
perf bench futex: Use NSEC_PER_USEC
perf top: Use MSEC_PER_SEC
perf disassemble: Move check for kallsyms + !kcore
perf disassemble: Simplify logic for picking the filename to disassemble
perf disassemble: Extract logic to find file to pass to objdump to a separate function
perf report: Allow configuring the default sort order in ~/.perfconfig
Colin Ian King (5):
perf hists browser: Remove superfluous null check on map
perf tools: Fix typo: "ehough" -> "enough"
perf test bpf: Fix typo: "ehough" -> "enough"
perf bpf: Fix typo: "ehough" -> "enough"
perf record: Fix spelling mistake "Finshed" -> "Finished"
Jiri Olsa (5):
perf hists: Introduce nr_header_lines into struct perf_hpp_list
perf hists: Add line argument into perf_hpp_fmt's header callback
perf tools tui: Display multiple header lines
perf tools stdio: Display multiple header lines
perf hists: Add support for header span
Masami Hiramatsu (6):
ftrace: kprobe: uprobe: Add x8/x16/x32/x64 for hexadecimal types
ftrace: probe: Add README entries for k/uprobe-events
perf probe: Add supported for type casting by the running kernel
perf probe: Support hexadecimal casting
perf probe: Use hexadecimal type by default if possible
ftrace: kprobe: uprobe: Show u8/u16/u32/u64 types in decimal
Mathieu Poirier (1):
tools: Copy coresight-pmu.h header file needed by perf tools
Rui Teng (2):
perf tools: Use __weak definition from linux/compiler.h
perf tools: Skip running the feature tests for 'make install-doc'
Documentation/trace/kprobetrace.txt | 9 ++-
Documentation/trace/uprobetracer.txt | 9 ++-
kernel/trace/trace.c | 24 ++++++
kernel/trace/trace_kprobe.c | 4 +
kernel/trace/trace_probe.c | 30 ++++----
kernel/trace/trace_probe.h | 11 ++-
kernel/trace/trace_uprobe.c | 4 +
tools/include/linux/coresight-pmu.h | 39 ++++++++++
tools/include/linux/time64.h | 12 +++
tools/perf/Documentation/perf-config.txt | 4 +
tools/perf/Documentation/perf-probe.txt | 5 +-
tools/perf/MANIFEST | 2 +
tools/perf/Makefile.perf | 5 +-
tools/perf/bench/futex-requeue.c | 5 +-
tools/perf/bench/futex-wake-parallel.c | 5 +-
tools/perf/bench/futex-wake.c | 5 +-
tools/perf/bench/mem-functions.c | 3 +-
tools/perf/bench/numa.c | 53 ++++++-------
tools/perf/bench/sched-messaging.c | 5 +-
tools/perf/bench/sched-pipe.c | 9 ++-
tools/perf/builtin-diff.c | 4 +-
tools/perf/builtin-kvm.c | 11 +--
tools/perf/builtin-record.c | 8 +-
tools/perf/builtin-report.c | 4 +
tools/perf/builtin-sched.c | 37 ++++-----
tools/perf/builtin-script.c | 7 +-
tools/perf/builtin-stat.c | 19 ++---
tools/perf/builtin-timechart.c | 13 ++--
tools/perf/builtin-top.c | 3 +-
tools/perf/builtin-trace.c | 1 +
tools/perf/perf.h | 7 --
tools/perf/tests/backward-ring-buffer.c | 2 +-
tools/perf/tests/bpf.c | 2 +-
tools/perf/ui/browsers/hists.c | 50 +++++++++----
tools/perf/ui/gtk/hists.c | 2 +-
tools/perf/ui/hist.c | 4 +-
tools/perf/ui/stdio/hist.c | 45 +++++++----
tools/perf/util/annotate.c | 87 +++++++++++-----------
tools/perf/util/bpf-loader.c | 2 +-
tools/perf/util/debug.c | 10 +--
tools/perf/util/header.c | 3 +-
tools/perf/util/hist.h | 3 +-
tools/perf/util/probe-file.c | 57 ++++++++++++++
tools/perf/util/probe-file.h | 10 +++
tools/perf/util/probe-finder.c | 19 +++--
.../perf/util/scripting-engines/trace-event-perl.c | 5 +-
.../util/scripting-engines/trace-event-python.c | 5 +-
tools/perf/util/sort.c | 9 ++-
tools/perf/util/sort.h | 2 +-
tools/perf/util/svghelper.c | 11 +--
tools/perf/util/util.c | 1 +
tools/perf/util/util.h | 4 -
52 files changed, 464 insertions(+), 226 deletions(-)
create mode 100644 tools/include/linux/coresight-pmu.h
create mode 100644 tools/include/linux/time64.h
Build Stats:
News:
The fedora:24-x-ARC-uClibc adds the ARC arch and the uClibc libc to the mix of
targets tested, still with a minimal build that doesn't include even libelf,
that will be added soon, using a more full featured pre-built toolchain provided
by the Synopsys folks.
[root@jouet ~]# time dm
1 70.304638757 alpine:3.4: Ok
2 24.806303766 android-ndk:r12b-arm: Ok
3 71.829633643 archlinux:latest: Ok
4 39.316551941 centos:5: Ok
5 59.282978228 centos:6: Ok
6 69.836088394 centos:7: Ok
7 63.476952272 debian:7: Ok
8 69.450110099 debian:8: Ok
9 72.484714796 debian:experimental: Ok
10 69.730035221 fedora:20: Ok
11 73.629813614 fedora:21: Ok
12 71.955425760 fedora:22: Ok
13 73.015579053 fedora:23: Ok
14 81.186943795 fedora:24: Ok
15 31.401503154 fedora:24-x-ARC-uClibc: Ok
16 78.366606942 fedora:rawhide: Ok
17 77.434661064 mageia:5: Ok
18 74.491020416 opensuse:13.2: Ok
19 71.673570141 opensuse:42.1: Ok
20 79.587415167 opensuse:tumbleweed: Ok
21 57.802976895 ubuntu:12.04.5: Ok
22 68.505810699 ubuntu:14.04.4: Ok
23 69.478057262 ubuntu:15.10: Ok
24 65.596669700 ubuntu:16.04: Ok
25 52.759886216 ubuntu:16.04-x-arm: Ok
26 52.389293077 ubuntu:16.04-x-arm64: Ok
27 52.319141734 ubuntu:16.04-x-powerpc64: Ok
28 54.378287607 ubuntu:16.04-x-powerpc64el: Ok
29 72.794919209 ubuntu:16.10: Ok
30 53.351899868 ubuntu:16.10-x-s390: Ok
1922.64s
real 32m3.246s
user 0m1.858s
sys 0m2.363s
[root@jouet ~]#
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [GIT PULL 00/35] perf/core improvements and fixes
2013-12-20 19:08 Arnaldo Carvalho de Melo
@ 2013-12-27 20:05 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-12-27 20:05 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Adrian Hunter, Arun Sharma, Corey Ashford,
David Ahern, Frederic Weisbecker, Jiri Olsa, Mike Galbraith,
Namhyung Kim, Paul Mackerras, Peter Zijlstra, Rodrigo Campos,
Stephane Eranian, Steven Rostedt
Em Fri, Dec 20, 2013 at 04:08:35PM -0300, Arnaldo Carvalho de Melo escreveu:
> User visible changes:
>
> Improvements:
>
> . Do not show stats if workload fails in 'stat' (David Ahern)
Hi Ingo,
Please hold on, as reported elsewhere, the above change broke
'perf stat valid-workload', so I removed it from my tree.
I'll resubmit soon with a bunch more,
Thanks,
- Arnaldo
^ permalink raw reply [flat|nested] 48+ messages in thread
* [GIT PULL 00/35] perf/core improvements and fixes
@ 2013-12-20 19:08 Arnaldo Carvalho de Melo
2013-12-27 20:05 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 48+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-12-20 19:08 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Arun Sharma, Corey Ashford, David Ahern, Frederic Weisbecker,
Jiri Olsa, Mike Galbraith, Namhyung Kim, Paul Mackerras,
Peter Zijlstra, Rodrigo Campos, Stephane Eranian, Steven Rostedt,
Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Hi Ingo,
Please consider pulling,
Best regards,
- Arnaldo
The following changes since commit fa6e8e5f7cbf85f364ebd5a90525dbbe9de2083b:
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2013-12-18 14:07:26 +0100)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo
for you to fetch changes up to a419e512731892c3e7e661cd96e7f7d6a7acdc91:
perf stat: Do not show stats if workload fails (2013-12-20 13:37:50 -0300)
----------------------------------------------------------------
User visible changes:
Improvements:
. Do not show stats if workload fails in 'stat' (David Ahern)
. Print session information only if --stdio is given (Namhyung Kim)
Developer stuff:
Fixes:
. Get rid of a duplicate va_end() in error reporting (Namhyung Kim)
. If a hist entry doesn't have symbol information, compare it with its
address. Affects upcoming new feature (--cumulate) (Namhyung Kim)
Improvements:
. Make libtraceevent install target quieter (Jiri Olsa)
. Make tests/make output more compact (Jiri Olsa)
New APIs:
. Introduce pevent_filter_strerror() in libtraceevent, similar in
purpose to libc's strerror() function. (Namhyung Kim)
Refactorings:
. Use perf_data_file methods to write output file in 'record' and
'inject' (Jiri Olsa)
. Use pr_*() functions where applicable in 'report' (Namhyumg Kim)
. Add 'machine' 'addr_location' struct to have full picture (machine,
thread, map, symbol, addr) for a (partially) resolved address, reducing
function signatures (Arnaldo Carvalho de Melo)
. Reduce code duplication in the histogram entry creation/insertion. (Arnaldo Carvalho de Melo)
. Auto allocate annotation histogram data structures, (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Arnaldo Carvalho de Melo (16):
perf annotate: Auto allocate symbol per addr hist buckets
perf hists: Leave symbol addr hist bucket auto alloc to symbol layer
perf annotate: Add inc_samples method to addr_map_symbol
perf top: Use hist_entry__inc_addr_sample
perf annotate: Adopt methods from hists
perf annotate: Make symbol__inc_addr_samples private
perf report: Introduce helpers for processing callchains
perf record: Simplify perf_record__write
perf record: Rename 'perf_record' to plain 'record'
perf tools: Rename 'perf_record_opts' to 'record_opts
perf report: Rename 'perf_report' to 'report'
perf ui browser: Remove misplaced __maybe_unused
perf scripting python: Shorten function signatures
perf scripting perl: Shorten function signatures
perf mem: Remove unused parameter from dump_raw_samples()
perf symbols: Add 'machine' member to struct addr_location
David Ahern (1):
perf stat: Do not show stats if workload fails
Jiri Olsa (11):
perf inject: Handle output file via perf_data_file object
perf record: Use perf_data_file__write for output file
perf tests: Factor make install tests
perf tools: Making QUIET_(CLEAN|INSTAL) variables global
tools lib traceevent: Remove print_app_build variable
tools lib traceevent: Use global QUIET_CC build output
tools lib traceevent: Add global QUIET_CC_FPIC build output
tools lib traceevent: Use global QUIET_LINK build output
tools lib traceevent: Use global QUIET_INSTALL build output
tools lib traceevent: Use global QUIET_CLEAN build output
tools lib traceevent: Use global 'O' processing code
Namhyung Kim (7):
perf sort: Compare addresses if no symbol info
perf sort: Do not compare dso again
perf hists: Do not pass period and weight to add_hist_entry()
tools lib traceevent: Introduce pevent_filter_strerror()
perf tools: Get rid of a duplicate va_end() in error reporting routine
perf report: Use pr_*() functions where applicable
perf report: Print session information only if --stdio is given
tools/lib/traceevent/Makefile | 85 +++----
tools/lib/traceevent/event-parse.c | 17 +-
tools/lib/traceevent/event-parse.h | 7 +-
tools/lib/traceevent/parse-filter.c | 98 ++++----
tools/perf/builtin-annotate.c | 10 +-
tools/perf/builtin-inject.c | 65 +++---
tools/perf/builtin-kvm.c | 2 +-
tools/perf/builtin-mem.c | 5 +-
tools/perf/builtin-record.c | 94 ++++----
tools/perf/builtin-report.c | 259 +++++++--------------
tools/perf/builtin-script.c | 16 +-
tools/perf/builtin-stat.c | 11 +-
tools/perf/builtin-top.c | 23 +-
tools/perf/builtin-trace.c | 2 +-
tools/perf/config/utilities.mak | 7 -
tools/perf/perf.h | 2 +-
tools/perf/tests/code-reading.c | 2 +-
tools/perf/tests/keep-tracking.c | 2 +-
tools/perf/tests/make | 38 ++-
tools/perf/tests/open-syscall-tp-fields.c | 2 +-
tools/perf/tests/perf-record.c | 2 +-
tools/perf/tests/perf-time-to-tsc.c | 2 +-
tools/perf/ui/browser.c | 2 +-
tools/perf/util/annotate.c | 41 +++-
tools/perf/util/annotate.h | 9 +-
tools/perf/util/callchain.h | 2 +-
tools/perf/util/debug.c | 1 -
tools/perf/util/event.c | 1 +
tools/perf/util/evlist.h | 7 +-
tools/perf/util/evsel.c | 3 +-
tools/perf/util/evsel.h | 4 +-
tools/perf/util/hist.c | 21 +-
tools/perf/util/hist.h | 3 -
tools/perf/util/record.c | 9 +-
.../perf/util/scripting-engines/trace-event-perl.c | 19 +-
.../util/scripting-engines/trace-event-python.c | 25 +-
tools/perf/util/session.c | 4 +-
tools/perf/util/session.h | 2 +-
tools/perf/util/sort.c | 22 +-
tools/perf/util/symbol.h | 1 +
tools/perf/util/top.c | 2 +-
tools/perf/util/top.h | 2 +-
tools/perf/util/trace-event-scripting.c | 3 +-
tools/perf/util/trace-event.h | 1 -
tools/scripts/Makefile.include | 4 +
45 files changed, 415 insertions(+), 524 deletions(-)
^ permalink raw reply [flat|nested] 48+ messages in thread
end of thread, other threads:[~2019-03-09 16:02 UTC | newest]
Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-28 14:29 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 01/35] perf stat: Define a structure for per-thread shadow stats Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 02/35] perf stat: Extend rbtree to support " Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 03/35] perf stat: Create the runtime_stat init/exit function Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 04/35] perf stat: Update per-thread shadow stats Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 05/35] perf stat: Print " Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 06/35] perf stat: Remove a set of shadow stats static variables Arnaldo Carvalho de Melo
2017-12-28 14:29 ` [PATCH 07/35] perf stat: Allocate shadow stats buffer for threads Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 08/35] perf stat: Update or print per-thread stats Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 09/35] perf thread_map: Enumerate all threads from /proc Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 10/35] perf stat: Remove --per-thread pid/tid limitation Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 11/35] perf stat: Resort '--per-thread' result Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 12/35] perf utils: Move is_directory() to path.h Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 13/35] perf test: Handle properly readdir DT_UNKNOWN Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 14/35] perf perf: Remove duplicate includes Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 15/35] tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 16/35] perf s390: Generate system call table from asm/unistd.h Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 17/35] perf trace: Use generated syscall table on s390 too Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 18/35] perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate() Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 19/35] perf annotate: Use perf_env when obtaining the arch name Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 20/35] perf env: Adopt perf_env__arch() from the annotate code Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 21/35] perf probe: Add warning message if there is unexpected event name Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 22/35] perf probe: Cut off the version suffix from " Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 23/35] perf probe: Add __return suffix for return events Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 24/35] perf probe: Find versioned symbols from map Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 25/35] perf string: Add {strdup,strpbrk}_esc() Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 26/35] perf probe: Support escaped character in parser Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 27/35] perf evsel: Fix swap for samples with raw data Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 28/35] perf test shell: Fix check open filename arg using 'perf trace' Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 29/35] Revert "perf s390: Always build with -fPIC" Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 30/35] perf s390: Always build with -fPIC Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 31/35] perf evsel: Enable ignore_missing_thread for pid option Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 32/35] perf probe arm64: Fix symbol fixup issues due to ELF type Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 33/35] perf tool: Improve bash command line auto-complete for multiple events with comma Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 34/35] perf tools: Return all events as auto-completions after comma Arnaldo Carvalho de Melo
2017-12-28 14:30 ` [PATCH 35/35] perf tools: Auto-complete for events with ':' Arnaldo Carvalho de Melo
2017-12-28 15:17 ` [GIT PULL 00/35] perf/core improvements and fixes Ingo Molnar
-- strict thread matches above, loose matches on Subject: below --
2019-03-07 17:43 Arnaldo Carvalho de Melo
2019-03-09 16:02 ` Ingo Molnar
2018-08-15 15:05 Arnaldo Carvalho de Melo
2018-08-15 15:21 ` Andy Lutomirski
2018-08-18 11:17 ` Ingo Molnar
2017-03-06 19:37 Arnaldo Carvalho de Melo
2017-03-07 7:17 ` Ingo Molnar
2016-08-23 21:03 Arnaldo Carvalho de Melo
2016-08-24 9:09 ` Ingo Molnar
2013-12-20 19:08 Arnaldo Carvalho de Melo
2013-12-27 20:05 ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).