* [GIT PULL 00/15] perf/core improvements and fixes
From: Arnaldo Carvalho de Melo @ 2017-08-23 19:35 UTC
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo, Alexander Shishkin, Andi Kleen, Borislav Petkov, Jiri Olsa, Konstantin Khlebnikov, Peter Zijlstra, Wang Nan, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit ba63f76e22ee723819c8cec86b31f7ea3182b2ed:

  Merge tag 'perf-core-for-mingo-4.14-20170821' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-08-22 12:16:39 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.14-20170823

for you to fetch changes up to 60913e005c8d19ec5187a638eafdd088509dfb9e:

  perf tools: Fix static linking with libunwind (2017-08-22 13:24:55 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

- Expression parser enhancements for metrics (Andi Kleen)

- Fix buffer overflow while freeing events in 'perf stat' (Andi Kleen)

- Fix static linking with elfutils's libdw and with libunwind in Debian/Ubuntu (Konstantin Khlebnikov)

- Tighten detection of BPF events, avoiding matching some other PMU events such as 'cpu/uops_executed.core,cmask=1/' as a .c source file that ended up being considered a BPF event (Andi Kleen)

- Add Skylake server uncore JSON vendor events (Andi Kleen)

- Add support for printing new mem_info encodings, including 'perf test' checks (Andi Kleen)

- Really install manpages via 'make install-man' (Konstantin Khlebnikov)

- Fix documentation for perf_event_paranoid and perf_event_mlock_kb sysctls (Konstantin Khlebnikov)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> ---------------------------------------------------------------- Andi Kleen (11): perf xyarray: Save max_x, max_y perf evsel: Fix buffer overflow while freeing events perf bpf: Tighten detection of BPF events perf tools: Add utility function to detect SMT status perf tools: Expression parser enhancements for metrics perf tools: Increase maximum number of events in expressions perf tools: Dedup events in expression parsing perf vendor events: Add core event list for Skylake Server perf vendor events: Add Skylake server uncore event list perf tools: Add support for printing new mem_info encodings perf test: Add test cases for new data source encoding Konstantin Khlebnikov (4): perf tools: Really install manpages via 'make install-man' perf: Fix documentation for sysctls perf_event_paranoid and perf_event_mlock_kb perf tools: Fix static linking with libdw from elfutils perf tools: Fix static linking with libunwind Documentation/sysctl/kernel.txt | 13 +- tools/include/uapi/linux/perf_event.h | 30 +- tools/perf/Documentation/Makefile | 2 +- tools/perf/Makefile.config | 16 +- tools/perf/pmu-events/arch/x86/mapfile.csv | 1 + tools/perf/pmu-events/arch/x86/skylakex/cache.json | 1672 ++++++++++++++++++++ .../arch/x86/skylakex/floating-point.json | 88 ++ .../pmu-events/arch/x86/skylakex/frontend.json | 482 ++++++ .../perf/pmu-events/arch/x86/skylakex/memory.json | 1396 ++++++++++++++++ tools/perf/pmu-events/arch/x86/skylakex/other.json | 72 + .../pmu-events/arch/x86/skylakex/pipeline.json | 950 +++++++++++ .../arch/x86/skylakex/uncore-memory.json | 172 ++ .../pmu-events/arch/x86/skylakex/uncore-other.json | 1156 ++++++++++++++ .../arch/x86/skylakex/virtual-memory.json | 284 ++++ tools/perf/tests/Build | 1 + tools/perf/tests/builtin-test.c | 4 + tools/perf/tests/expr.c | 5 + tools/perf/tests/mem.c | 56 + tools/perf/tests/openat-syscall-all-cpus.c | 2 +- tools/perf/tests/openat-syscall.c | 2 +- tools/perf/tests/tests.h 
| 1 + tools/perf/util/Build | 1 + tools/perf/util/evlist.c | 12 +- tools/perf/util/evsel.c | 41 +- tools/perf/util/evsel.h | 7 +- tools/perf/util/expr.h | 2 +- tools/perf/util/expr.y | 74 +- tools/perf/util/mem-events.c | 43 +- tools/perf/util/parse-events.l | 23 +- tools/perf/util/smt.c | 44 + tools/perf/util/smt.h | 6 + tools/perf/util/xyarray.c | 2 + tools/perf/util/xyarray.h | 12 + 33 files changed, 6607 insertions(+), 65 deletions(-) create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/cache.json create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/floating-point.json create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/frontend.json create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/memory.json create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/other.json create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/pipeline.json create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/uncore-other.json create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json create mode 100644 tools/perf/tests/mem.c create mode 100644 tools/perf/util/smt.c create mode 100644 tools/perf/util/smt.h Test results: The first ones are container (docker) based builds of tools/perf with and without libelf support, objtool where it is supported and samples/bpf/, ditto. Where clang is available, it is also used to build perf with/without libelf. Several are cross builds, the ones with -x-ARCH, and the android one, and those may not have all the features built, due to lack of multi-arch devel packages, available and being used so far on just a few, like debian:experimental-x-{arm64,mipsel}. 
The 'perf test' one will perform a variety of tests exercising tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands with a variety of command line event specifications to then intercept the sys_perf_event_open syscall to check that the perf_event_attr fields are set up as expected, among a variety of other unit tests.

The 'perf test' run also executes shell scripts that exercise the tools and check that they affect the system as intended: setting up kprobes and uprobes, requesting callchains for well known programs and checking that they are the expected ones, seeing if 'perf trace' beautifies system call arguments correctly, etc.

Additionally, a new set of tests, script based, runs the tools in a live system, setting probes in place that then get used by 'perf trace', with its output compared against expected results.

Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/ with a variety of feature sets, exercising the build with an incomplete set of features as well as with a complete one. It is planned to have it run on each of the containers mentioned above, using some container orchestration infrastructure. Get in contact if you are interested in helping to put this in place.

# dm 1 alpine:3.4: Ok 2 alpine:3.5: Ok 3 alpine:3.6: Ok 4 alpine:edge: Ok 5 android-ndk:r12b-arm: Ok 6 archlinux:latest: Ok 7 centos:5: Ok 8 centos:6: Ok 9 centos:7: Ok 10 debian:7: Ok 11 debian:8: Ok 12 debian:9: Ok 13 debian:experimental: Ok 14 debian:experimental-x-arm64: Ok 15 debian:experimental-x-mips: Ok 16 debian:experimental-x-mips64: Ok 17 debian:experimental-x-mipsel: Ok 18 fedora:20: Ok 19 fedora:21: Ok 20 fedora:22: Ok 21 fedora:23: Ok 22 fedora:24: Ok 23 fedora:24-x-ARC-uClibc: Ok 24 fedora:25: Ok 25 fedora:26: Ok 26 fedora:rawhide: Ok 27 mageia:5: Ok 28 opensuse:13.2: Ok 29 opensuse:42.1: Ok 30 opensuse:42.2: Ok 31 opensuse:tumbleweed: Ok 32 oraclelinux:6: Ok 33 oraclelinux:7: Ok 34 ubuntu:12.04.5: Ok 35 ubuntu:14.04.4: Ok 36 ubuntu:14.04.4-x-linaro-arm64: Ok 37 ubuntu:15.10: Ok 38 ubuntu:16.04: Ok 39 ubuntu:16.04-x-arm: Ok 40 ubuntu:16.04-x-arm64: Ok 41 ubuntu:16.04-x-powerpc: Ok 42 ubuntu:16.04-x-powerpc64: Ok 43 ubuntu:16.04-x-powerpc64el: Ok 44 ubuntu:16.04-x-s390: Ok 45 ubuntu:16.10: Ok 46 ubuntu:17.04: Ok 47 ubuntu:17.10: Ok # # uname -a Linux jouet 4.13.0-rc4+ #2 SMP Fri Aug 11 12:39:09 -03 2017 x86_64 x86_64 x86_64 GNU/Linux # perf test 1: vmlinux symtab matches kallsyms : Ok 2: Detect openat syscall event : Ok 3: Detect openat syscall event on all cpus : Ok 4: Read samples using the mmap interface : Ok 5: Test data source output : Ok 6: Parse event definition strings : Ok 7: Simple expression parser : Ok 8: PERF_RECORD_* events & perf_sample fields : Ok 9: Parse perf pmu format : Ok 10: DSO data read : Ok 11: DSO data cache : Ok 12: DSO data reopen : Ok 13: Roundtrip evsel->name : Ok 14: Parse sched tracepoints fields : Ok 15: syscalls:sys_enter_openat event fields : Ok 16: Setup struct perf_event_attr : Ok 17: Match and link multiple hists : Ok 18: 'import perf' in python : Ok 19: Breakpoint overflow signal handler : Ok 20: Breakpoint overflow sampling : Ok 21: Number of exit events of a simple workload : Ok 22: Software clock events period 
values : Ok 23: Object code reading : Ok 24: Sample parsing : Ok 25: Use a dummy software event to keep tracking : Ok 26: Parse with no sample_id_all bit set : Ok 27: Filter hist entries : Ok 28: Lookup mmap thread : Ok 29: Share thread mg : Ok 30: Sort output of hist entries : Ok 31: Cumulate child hist entries : Ok 32: Track with sched_switch : Ok 33: Filter fds with revents mask in a fdarray : Ok 34: Add fd to a fdarray, making it autogrow : Ok 35: kmod_path__parse : Ok 36: Thread map : Ok 37: LLVM search and compile : 37.1: Basic BPF llvm compile : Ok 37.2: kbuild searching : Ok 37.3: Compile source for BPF prologue generation : Ok 37.4: Compile source for BPF relocation : Ok 38: Session topology : Ok 39: BPF filter : 39.1: Basic BPF filtering : Ok 39.2: BPF pinning : Ok 39.3: BPF prologue generation : Ok 39.4: BPF relocation checker : Ok 40: Synthesize thread map : Ok 41: Remove thread map : Ok 42: Synthesize cpu map : Ok 43: Synthesize stat config : Ok 44: Synthesize stat : Ok 45: Synthesize stat round : Ok 46: Synthesize attr update : Ok 47: Event times : Ok 48: Read backward ring buffer : Ok 49: Print cpu map : Ok 50: Probe SDT events : Ok 51: is_printable_array : Ok 52: Print bitmap : Ok 53: perf hooks : Ok 54: builtin clang support : Skip (not compiled in) 55: unit_number__scnprintf : Ok 56: x86 rdpmc : Ok 57: Convert perf time to TSC : Ok 58: DWARF unwind : Ok 59: x86 instruction decoder - new instructions : Ok 60: Intel cqm nmi context read : Skip 61: Use vfs_getname probe to get syscall args filenames : Ok 62: probe libc's inet_pton & backtrace it with ping : Ok 63: Check open filename arg using perf trace + vfs_getname: Ok 64: Add vfs_getname probe to get syscall args filenames : Ok # $ make -C tools/perf build-test make: Entering directory '/home/acme/git/linux/tools/perf' - tarpkg: ./tests/perf-targz-src-pkg . 
make_pure_O: make
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_tags_O: make tags
make_util_map_o_O: make util/map.o
make_no_libunwind_O: make NO_LIBUNWIND=1
make_clean_all_O: make clean all
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_perf_o_O: make perf.o
make_help_O: make help
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_libperl_O: make NO_LIBPERL=1
make_doc_O: make doc
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_install_O: make install
make_no_newt_O: make NO_NEWT=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_install_bin_O: make install-bin
make_install_prefix_O: make install prefix=/tmp/krava
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_gtk2_O: make NO_GTK2=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_libelf_O: make NO_LIBELF=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_debug_O: make DEBUG=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_no_slang_O: make NO_SLANG=1
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
$

* [PATCH 01/15] perf xyarray: Save max_x, max_y
From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Save the original array dimensions in xyarrays, so that users can retrieve them later. Add some inline functions to access these fields.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-1-andi@firstfloor.org
[ As noticed by Jiri, fix up namespacing: xy__method() -> xyarray__method() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/xyarray.c | 2 ++
 tools/perf/util/xyarray.h | 12 ++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c
index 7251fdbabced..c8f415d9877b 100644
--- a/tools/perf/util/xyarray.c
+++ b/tools/perf/util/xyarray.c
@@ -12,6 +12,8 @@ struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
 		xy->entry_size = entry_size;
 		xy->row_size = row_size;
 		xy->entries = xlen * ylen;
+		xy->max_x = xlen;
+		xy->max_y = ylen;
 	}
 	return xy;
diff --git a/tools/perf/util/xyarray.h b/tools/perf/util/xyarray.h
index 7f30af371b7e..4ba726c90870 100644
--- a/tools/perf/util/xyarray.h
+++ b/tools/perf/util/xyarray.h
@@ -7,6 +7,8 @@ struct xyarray {
 	size_t row_size;
 	size_t entry_size;
 	size_t entries;
+	size_t max_x;
+	size_t max_y;
 	char contents[];
 };
@@ -19,4 +21,14 @@ static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
 	return &xy->contents[x * xy->row_size + y * xy->entry_size];
 }
+static inline int xyarray__max_y(struct xyarray *xy)
+{
+	return xy->max_y;
+}
+
+static inline int xyarray__max_x(struct xyarray *xy)
+{
+	return xy->max_x;
+}
+
 #endif /* _PERF_XYARRAY_H_ */
-- 
2.13.5

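For readers following along outside the kernel tree, the idea of this patch can be sketched in standalone form. This is a simplified illustration, not the perf code itself: the demo_ prefix and the helper bodies are assumptions made for the example, and only the max_x/max_y fields and their accessors mirror what the patch adds. Each accessor here returns the dimension its name suggests.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for perf's struct xyarray (demo_ names are not in perf). */
struct demo_xyarray {
	size_t row_size;
	size_t entry_size;
	size_t entries;
	size_t max_x;	/* x dimension recorded at allocation time */
	size_t max_y;	/* y dimension recorded at allocation time */
	char contents[];
};

/* Allocate a zeroed xlen * ylen array and remember both dimensions. */
static struct demo_xyarray *demo_xyarray__new(int xlen, int ylen, size_t entry_size)
{
	size_t row_size = ylen * entry_size;
	struct demo_xyarray *xy = calloc(1, sizeof(*xy) + xlen * row_size);

	if (xy != NULL) {
		xy->entry_size = entry_size;
		xy->row_size = row_size;
		xy->entries = (size_t)xlen * ylen;
		xy->max_x = xlen;
		xy->max_y = ylen;
	}
	return xy;
}

/* Address of the entry at (x, y); the caller must stay within the dimensions. */
static void *demo_xyarray__entry(struct demo_xyarray *xy, int x, int y)
{
	return &xy->contents[x * xy->row_size + y * xy->entry_size];
}

static int demo_xyarray__max_x(struct demo_xyarray *xy)
{
	return xy->max_x;
}

static int demo_xyarray__max_y(struct demo_xyarray *xy)
{
	return xy->max_y;
}
```

A caller can then bound its loops with demo_xyarray__max_x() and demo_xyarray__max_y() instead of carrying the dimensions around separately.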
* [PATCH 02/15] perf evsel: Fix buffer overflow while freeing events
From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Fix buffer overflow for:

  % perf stat -e msr/tsc/,cstate_core/c7-residency/ true

that causes glibc free list corruption. For some reason it doesn't trigger in valgrind, but it is visible in ASan:

  =================================================================
  ==32681==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000003f5c at pc 0x0000005671ef bp 0x7ffdaaac9ac0 sp 0x7ffdaaac9ab0
  READ of size 4 at 0x603000003f5c thread T0
    #0 0x5671ee in perf_evsel__close_fd util/evsel.c:1196
    #1 0x56c57a in perf_evsel__close util/evsel.c:1717
    #2 0x55ed5f in perf_evlist__close util/evlist.c:1631
    #3 0x4647e1 in __run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:749
    #4 0x4648e3 in run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:767
    #5 0x46e1bc in cmd_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:2785
    #6 0x52f83d in run_builtin /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:296
    #7 0x52fd49 in handle_internal_command /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:348
    #8 0x5300de in run_argv /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:392
    #9 0x5308f3 in main /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:530
    #10 0x7f0672d13400 in __libc_start_main (/lib64/libc.so.6+0x20400)
    #11 0x428419 in _start (/home/ak/hle/obj-perf/perf+0x428419)

  0x603000003f5c is located 0 bytes to the right of 28-byte region [0x603000003f40,0x603000003f5c)
  allocated by thread T0 here:
    #0 0x7f0675139020 in calloc (/lib64/libasan.so.3+0xc7020)
    #1 0x648a2d in zalloc util/util.h:23
    #2 0x648a88 in xyarray__new util/xyarray.c:9
    #3 0x566419 in perf_evsel__alloc_fd util/evsel.c:1039
    #4 0x56b427 in perf_evsel__open util/evsel.c:1529
    #5 0x56c620 in perf_evsel__open_per_thread util/evsel.c:1730
    #6 0x461dea in create_perf_stat_counter /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:263
    #7 0x4637d7 in __run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:600
    #8 0x4648e3 in run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:767
    #9 0x46e1bc in cmd_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:2785
    #10 0x52f83d in run_builtin /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:296
    #11 0x52fd49 in handle_internal_command /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:348
    #12 0x5300de in run_argv /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:392
    #13 0x5308f3 in main /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:530
    #14 0x7f0672d13400 in __libc_start_main (/lib64/libc.so.6+0x20400)

The event is allocated with cpus == 1, but freed with cpus == the real CPU count. When the evsel close function walks the file descriptors it exceeds the fd xyarray boundaries and reads random memory.

v2:

Now that xyarrays save their original dimensions we can use these to iterate the two dimensional fd arrays. Fix some users (close, ioctl) in evsel.c to use these fields directly. This allows simplifying the code and dropping quite a few function arguments. Adjust all callers by removing the unneeded arguments.

The actual perf event reading still uses the original values from the evsel list.

Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170811232634.30465-2-andi@firstfloor.org [ Fix up xy_max_[xy]() -> xyarray__max_[xy]() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/tests/openat-syscall-all-cpus.c | 2 +- tools/perf/tests/openat-syscall.c | 2 +- tools/perf/util/evlist.c | 12 +++------- tools/perf/util/evsel.c | 37 ++++++++++-------------------- tools/perf/util/evsel.h | 7 +++--- 5 files changed, 20 insertions(+), 40 deletions(-) diff --git a/tools/perf/tests/openat-syscall-all-cpus.c b/tools/perf/tests/openat-syscall-all-cpus.c index 87265117fd7f..9cf1c35f2ad0 100644 --- a/tools/perf/tests/openat-syscall-all-cpus.c +++ b/tools/perf/tests/openat-syscall-all-cpus.c @@ -115,7 +115,7 @@ int test__openat_syscall_event_on_all_cpus(struct test *test __maybe_unused, int perf_evsel__free_counts(evsel); out_close_fd: - perf_evsel__close_fd(evsel, 1, threads->nr); + perf_evsel__close_fd(evsel); out_evsel_delete: perf_evsel__delete(evsel); out_thread_map_delete: diff --git a/tools/perf/tests/openat-syscall.c b/tools/perf/tests/openat-syscall.c index 85bb6729d303..9dc5c5d37553 100644 --- a/tools/perf/tests/openat-syscall.c +++ b/tools/perf/tests/openat-syscall.c @@ -56,7 +56,7 @@ int test__openat_syscall_event(struct test *test __maybe_unused, int subtest __m err = 0; out_close_fd: - perf_evsel__close_fd(evsel, 1, threads->nr); + perf_evsel__close_fd(evsel); out_evsel_delete: perf_evsel__delete(evsel); out_thread_map_delete: diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 078b58511595..6a0d7ffbeba0 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -1419,8 +1419,6 @@ int perf_evlist__apply_filters(struct perf_evlist *evlist, struct perf_evsel **e { struct perf_evsel *evsel; int err = 0; - const int ncpus = cpu_map__nr(evlist->cpus), - nthreads = thread_map__nr(evlist->threads); evlist__for_each_entry(evlist, evsel) { if 
(evsel->filter == NULL) @@ -1430,7 +1428,7 @@ int perf_evlist__apply_filters(struct perf_evlist *evlist, struct perf_evsel **e * filters only work for tracepoint event, which doesn't have cpu limit. * So evlist and evsel should always be same. */ - err = perf_evsel__apply_filter(evsel, ncpus, nthreads, evsel->filter); + err = perf_evsel__apply_filter(evsel, evsel->filter); if (err) { *err_evsel = evsel; break; @@ -1623,13 +1621,9 @@ void perf_evlist__set_selected(struct perf_evlist *evlist, void perf_evlist__close(struct perf_evlist *evlist) { struct perf_evsel *evsel; - int ncpus = cpu_map__nr(evlist->cpus); - int nthreads = thread_map__nr(evlist->threads); - evlist__for_each_entry_reverse(evlist, evsel) { - int n = evsel->cpus ? evsel->cpus->nr : ncpus; - perf_evsel__close(evsel, n, nthreads); - } + evlist__for_each_entry_reverse(evlist, evsel) + perf_evsel__close(evsel); } static int perf_evlist__create_syswide_maps(struct perf_evlist *evlist) diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 3735c9e0080d..5dfb8bc4db89 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -1051,16 +1051,13 @@ static int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthread return evsel->fd != NULL ? 
0 : -ENOMEM; } -static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ncpus, int nthreads, +static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ioc, void *arg) { int cpu, thread; - if (evsel->system_wide) - nthreads = 1; - - for (cpu = 0; cpu < ncpus; cpu++) { - for (thread = 0; thread < nthreads; thread++) { + for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++) { + for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) { int fd = FD(evsel, cpu, thread), err = ioctl(fd, ioc, arg); @@ -1072,10 +1069,9 @@ static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ncpus, int nthrea return 0; } -int perf_evsel__apply_filter(struct perf_evsel *evsel, int ncpus, int nthreads, - const char *filter) +int perf_evsel__apply_filter(struct perf_evsel *evsel, const char *filter) { - return perf_evsel__run_ioctl(evsel, ncpus, nthreads, + return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_SET_FILTER, (void *)filter); } @@ -1122,20 +1118,14 @@ int perf_evsel__append_addr_filter(struct perf_evsel *evsel, const char *filter) int perf_evsel__enable(struct perf_evsel *evsel) { - int nthreads = thread_map__nr(evsel->threads); - int ncpus = cpu_map__nr(evsel->cpus); - - return perf_evsel__run_ioctl(evsel, ncpus, nthreads, + return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, 0); } int perf_evsel__disable(struct perf_evsel *evsel) { - int nthreads = thread_map__nr(evsel->threads); - int ncpus = cpu_map__nr(evsel->cpus); - - return perf_evsel__run_ioctl(evsel, ncpus, nthreads, + return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_DISABLE, 0); } @@ -1185,15 +1175,12 @@ static void perf_evsel__free_config_terms(struct perf_evsel *evsel) } } -void perf_evsel__close_fd(struct perf_evsel *evsel, int ncpus, int nthreads) +void perf_evsel__close_fd(struct perf_evsel *evsel) { int cpu, thread; - if (evsel->system_wide) - nthreads = 1; - - for (cpu = 0; cpu < ncpus; cpu++) - for (thread = 0; thread < nthreads; ++thread) { + for (cpu = 0; cpu < 
xyarray__max_x(evsel->fd); cpu++) + for (thread = 0; thread < xyarray__max_y(evsel->fd); ++thread) { close(FD(evsel, cpu, thread)); FD(evsel, cpu, thread) = -1; } @@ -1854,12 +1841,12 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus, return err; } -void perf_evsel__close(struct perf_evsel *evsel, int ncpus, int nthreads) +void perf_evsel__close(struct perf_evsel *evsel) { if (evsel->fd == NULL) return; - perf_evsel__close_fd(evsel, ncpus, nthreads); + perf_evsel__close_fd(evsel); perf_evsel__free_fd(evsel); } diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h index de03c18daaf0..351d3b2d8887 100644 --- a/tools/perf/util/evsel.h +++ b/tools/perf/util/evsel.h @@ -226,7 +226,7 @@ const char *perf_evsel__group_name(struct perf_evsel *evsel); int perf_evsel__group_desc(struct perf_evsel *evsel, char *buf, size_t size); int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads); -void perf_evsel__close_fd(struct perf_evsel *evsel, int ncpus, int nthreads); +void perf_evsel__close_fd(struct perf_evsel *evsel); void __perf_evsel__set_sample_bit(struct perf_evsel *evsel, enum perf_event_sample_format bit); @@ -246,8 +246,7 @@ int perf_evsel__set_filter(struct perf_evsel *evsel, const char *filter); int perf_evsel__append_tp_filter(struct perf_evsel *evsel, const char *filter); int perf_evsel__append_addr_filter(struct perf_evsel *evsel, const char *filter); -int perf_evsel__apply_filter(struct perf_evsel *evsel, int ncpus, int nthreads, - const char *filter); +int perf_evsel__apply_filter(struct perf_evsel *evsel, const char *filter); int perf_evsel__enable(struct perf_evsel *evsel); int perf_evsel__disable(struct perf_evsel *evsel); @@ -257,7 +256,7 @@ int perf_evsel__open_per_thread(struct perf_evsel *evsel, struct thread_map *threads); int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus, struct thread_map *threads); -void perf_evsel__close(struct perf_evsel *evsel, int ncpus, int nthreads); +void 
perf_evsel__close(struct perf_evsel *evsel); struct perf_sample; -- 2.13.5
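The approach taken by this fix can be sketched in isolation: when the table remembers the dimensions it was allocated with, the close path has nothing to get wrong. The following is a minimal sketch under stated assumptions, not the perf implementation: demo_fd_table and its helpers are hypothetical names standing in for evsel->fd and perf_evsel__close_fd().

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical fd table that records its own dimensions (not a perf type). */
struct demo_fd_table {
	int ncpus;	/* fixed when the table is allocated */
	int nthreads;	/* fixed when the table is allocated */
	int fd[];
};

static struct demo_fd_table *demo_fd_table__new(int ncpus, int nthreads)
{
	size_t n = (size_t)ncpus * nthreads;
	struct demo_fd_table *t = malloc(sizeof(*t) + n * sizeof(int));
	size_t i;

	if (t == NULL)
		return NULL;
	t->ncpus = ncpus;
	t->nthreads = nthreads;
	for (i = 0; i < n; i++)
		t->fd[i] = -1;	/* -1 marks a slot with no open descriptor */
	return t;
}

/*
 * Close every descriptor, bounded by the stored dimensions rather than by
 * counts passed in by the caller. Returns the number of slots visited.
 */
static int demo_fd_table__close_all(struct demo_fd_table *t)
{
	int cpu, thread, visited = 0;

	for (cpu = 0; cpu < t->ncpus; cpu++) {
		for (thread = 0; thread < t->nthreads; thread++) {
			int *slot = &t->fd[cpu * t->nthreads + thread];

			if (*slot >= 0)
				close(*slot);
			*slot = -1;
			visited++;
		}
	}
	return visited;
}
```

Because demo_fd_table__close_all() takes no ncpus/nthreads arguments, a caller cannot hand it a CPU count that differs from the one used at allocation time, which is the mismatch described in the commit message.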
* [PATCH 03/15] perf bpf: Tighten detection of BPF events
From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Andi Kleen, Wang Nan, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

  perf stat -e cpu/uops_executed.core,cmask=1/

would be detected as a BPF source event because the .c in .core matches the .c source BPF pattern.

v2:

Originally I tried to use lex lookahead, but it doesn't seem to work.

This now extends the BPF pattern to match longer events, but then does an extra check in the C code to reject BPF matches that do not end with .c/.o/.obj.

This uses REJECT, which makes the flex scanner slower, but that shouldn't be a big problem for the perf events.

Committer testing:

  # perf trace -e write -e /home/acme/bpf/tracepoint.c cat /etc/passwd > /dev/null
  0.000 ( 0.006 ms): cat/18485 write(fd: 1, buf: 0x7f59eebe1000, count: 3494 ) ...
  0.006 ( ): raw_syscalls:sys_enter:NR 1 (1, 7f59eebe1000, da6, 22, 7f59eebe0010, 0))
  0.008 ( ): perf_bpf_probe:_write:(ffffffff9626b2c0))
  0.000 ( 0.010 ms): cat/18485 ... [continued]: write()) = 3494
  #

It continues doing what was expected, i.e. identifying /home/acme/bpf/tracepoint.c as a BPF event, activating the clang machinery to build an eBPF object and then using sys_bpf() to hook it up to the raw_syscalls:sys_enter tracepoint, etc.

Andi forgot to add Wang to the CC list, fix it.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170811232634.30465-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.l | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 660fca05bc93..c42edeac451f 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -53,6 +53,21 @@ static int str(yyscan_t scanner, int token)
 	return token;
 }
+static bool isbpf(yyscan_t scanner)
+{
+	char *text = parse_events_get_text(scanner);
+	int len = strlen(text);
+
+	if (len < 2)
+		return false;
+	if ((text[len - 1] == 'c' || text[len - 1] == 'o') &&
+	    text[len - 2] == '.')
+		return true;
+	if (len > 4 && !strcmp(text + len - 4, ".obj"))
+		return true;
+	return false;
+}
+
 /*
  * This function is called when the parser gets two kind of input:
  *
@@ -136,8 +151,8 @@ do { \
 group		[^,{}/]*[{][^}]*[}][^,{}/]*
 event_pmu	[^,{}/]+[/][^/]*[/][^,{}/]*
 event		[^,{}/]+
-bpf_object	[^,{}]+\.(o|bpf)
-bpf_source	[^,{}]+\.c
+bpf_object	[^,{}]+\.(o|bpf)[a-zA-Z0-9._]*
+bpf_source	[^,{}]+\.c[a-zA-Z0-9._]*
 num_dec		[0-9]+
 num_hex		0x[a-fA-F0-9]+
@@ -307,8 +322,8 @@ r{num_raw_hex} { return raw(yyscanner); }
 {num_hex}		{ return value(yyscanner, 16); }
 {modifier_event}	{ return str(yyscanner, PE_MODIFIER_EVENT); }
-{bpf_object}		{ return str(yyscanner, PE_BPF_OBJECT); }
-{bpf_source}		{ return str(yyscanner, PE_BPF_SOURCE); }
+{bpf_object}		{ if (!isbpf(yyscanner)) REJECT; return str(yyscanner, PE_BPF_OBJECT); }
+{bpf_source}		{ if (!isbpf(yyscanner)) REJECT; return str(yyscanner, PE_BPF_SOURCE); }
 {name}			{ return pmu_str_check(yyscanner); }
 "/"			{ BEGIN(config); return '/'; }
 -			{ return '-'; }
-- 
2.13.5

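The suffix check that this patch adds can be tried outside the flex scanner. The sketch below is a standalone approximation: demo_has_bpf_suffix() is a hypothetical name, and it takes a plain string where isbpf() in the patch reads the matched text from the scanner. The comparison logic follows the patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true only when the text ends in ".c", ".o" or ".obj".
 * Hypothetical standalone variant of the isbpf() helper from the patch.
 */
static bool demo_has_bpf_suffix(const char *text)
{
	size_t len = strlen(text);

	if (len < 2)
		return false;
	/* ".c" or ".o" as the last two characters */
	if ((text[len - 1] == 'c' || text[len - 1] == 'o') &&
	    text[len - 2] == '.')
		return true;
	/* ".obj" needs at least one character in front of it */
	if (len > 4 && strcmp(text + len - 4, ".obj") == 0)
		return true;
	return false;
}
```

With this check, a token such as cpu/uops_executed.core is rejected because it ends in "e", while /home/acme/bpf/tracepoint.c is still accepted.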
* [PATCH 04/15] perf tools: Add utility function to detect SMT status
From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Add an smt_on() function that reports whether SMT is enabled. Used in the next patch.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/Build | 1 +
 tools/perf/util/smt.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/smt.h | 6 ++++++
 3 files changed, 51 insertions(+)
 create mode 100644 tools/perf/util/smt.c
 create mode 100644 tools/perf/util/smt.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 8d49a989f193..94518c1bf8b6 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -22,6 +22,7 @@ libperf-y += rbtree.o
 libperf-y += libstring.o
 libperf-y += bitmap.o
 libperf-y += hweight.o
+libperf-y += smt.o
 libperf-y += quote.o
 libperf-y += strbuf.o
 libperf-y += string.o
diff --git a/tools/perf/util/smt.c b/tools/perf/util/smt.c
new file mode 100644
index 000000000000..453f6f6f29f3
--- /dev/null
+++ b/tools/perf/util/smt.c
@@ -0,0 +1,44 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <linux/bitops.h>
+#include "api/fs/fs.h"
+#include "smt.h"
+
+int smt_on(void)
+{
+	static bool cached;
+	static int cached_result;
+	int cpu;
+	int ncpu;
+
+	if (cached)
+		return cached_result;
+
+	ncpu = sysconf(_SC_NPROCESSORS_CONF);
+	for (cpu = 0; cpu < ncpu; cpu++) {
+		unsigned long long siblings;
+		char *str;
+		size_t strlen;
+		char fn[256];
+
+		snprintf(fn, sizeof fn,
+			"devices/system/cpu/cpu%d/topology/thread_siblings",
+			cpu);
+		if (sysfs__read_str(fn, &str, &strlen) < 0)
+			continue;
+		/* Entry is hex, but does not have 0x, so need custom parser */
+		siblings = strtoull(str, NULL, 16);
+		free(str);
+		if (hweight64(siblings) > 1) {
+			cached_result = 1;
+			cached = true;
+			break;
+		}
+	}
+	if (!cached) {
+		cached_result = 0;
+		cached = true;
+	}
+	return cached_result;
+}
diff --git a/tools/perf/util/smt.h b/tools/perf/util/smt.h
new file mode 100644
index 000000000000..b8414b7bcbc8
--- /dev/null
+++ b/tools/perf/util/smt.h
@@ -0,0 +1,6 @@
+#ifndef SMT_H
+#define SMT_H 1
+
+int smt_on(void);
+
+#endif
-- 
2.13.5

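The core of the detection is small enough to show without sysfs access: parse the sibling mask as hex and check whether more than one bit is set. This is a sketch under assumptions, not the perf code: demo_popcount64() stands in for the kernel's hweight64(), and demo_mask_has_smt() is a hypothetical helper that takes the mask text directly instead of reading thread_siblings from sysfs.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Count set bits; a portable stand-in for hweight64() (hypothetical name). */
static unsigned int demo_popcount64(unsigned long long w)
{
	unsigned int n = 0;

	while (w) {
		w &= w - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Interpret a sibling mask written as hex without a 0x prefix.
 * More than one set bit means the CPU shares its core with another thread.
 * Note that strtoull() stops at the first character that is not a hex digit.
 */
static bool demo_mask_has_smt(const char *hexmask)
{
	unsigned long long siblings = strtoull(hexmask, NULL, 16);

	return demo_popcount64(siblings) > 1;
}
```

Passing base 16 explicitly is what lets strtoull() accept the mask without a 0x prefix, which is the point of the comment in the patch.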
* [PATCH 05/15] perf tools: Expression parser enhancements for metrics
From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Enhance the expression parser for more complex metric formulas.

- Support python style IF ELSE operators
- Add an #SMT_On magic variable for formulas that depend on the SMT
  status.
  Example: 4 *( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles
- Support MIN/MAX operations
  Example: min(1 , IDQ.MITE_UOPS / ( UPI * 16 * ( ICACHE.HIT + ICACHE.MISSES ) / 4.0 ) )
  This is useful to fix up problems caused by multiplexing.
- Support | & ^ operators
- Minor cleanups and fixes
- Support an \ escape for operators. This allows to specify event names
  like c2-residency
- Support @ as an alternative for / to be able to specify pmus without
  conflicts with operators (like msr/tsc/ as msr@tsc@)

Example: (cstate_core@c3\\-residency@ / msr@tsc@) * 100

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-8-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/expr.c |  5 ++++
 tools/perf/util/expr.y  | 61 ++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 60 insertions(+), 6 deletions(-)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index e93903295532..cb251bf523e7 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -31,6 +31,11 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
 	ret |= test(&ctx, "(BAR/2)%2", 1);
 	ret |= test(&ctx, "1 - -4", 5);
 	ret |= test(&ctx, "(FOO-1)*2 + (BAR/2)%2 - -4", 5);
+	ret |= test(&ctx, "1-1 | 1", 1);
+	ret |= test(&ctx, "1-1 & 1", 0);
+	ret |= test(&ctx, "min(1,2) + 1", 2);
+	ret |= test(&ctx, "max(1,2) + 1", 3);
+	ret |= test(&ctx, "1+1 if 3*4 else 0", 2);

 	if (ret)
 		return ret;
diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
index 953e65ba2cc7..5753c4f21534 100644
--- a/tools/perf/util/expr.y
+++ b/tools/perf/util/expr.y
@@ -4,6 +4,7 @@
 #include "util/debug.h"
 #define IN_EXPR_Y 1
 #include "expr.h"
+#include "smt.h"
 #include <string.h>

 #define MAXIDLEN 256
@@ -22,13 +23,15 @@

 %token <num> NUMBER
 %token <id> ID
+%token MIN MAX IF ELSE SMT_ON
+%left MIN MAX IF
 %left '|'
 %left '^'
 %left '&'
 %left '-' '+'
 %left '*' '/' '%'
 %left NEG NOT
-%type <num> expr
+%type <num> expr if_expr

 %{
 static int expr__lex(YYSTYPE *res, const char **pp);
@@ -57,7 +60,12 @@ static int lookup_id(struct parse_ctx *ctx, char *id, double *val)
 %}
 %%

-all_expr: expr			{ *final_val = $1; }
+all_expr: if_expr			{ *final_val = $1; }
+	;
+
+if_expr:
+	expr IF expr ELSE expr { $$ = $3 ? $1 : $5; }
+	| expr
 	;

 expr:	  NUMBER
@@ -66,13 +74,19 @@ expr:	  NUMBER
 					YYABORT;
 				  }
 			       }
+	| expr '|' expr		{ $$ = (long)$1 | (long)$3; }
+	| expr '&' expr		{ $$ = (long)$1 & (long)$3; }
+	| expr '^' expr		{ $$ = (long)$1 ^ (long)$3; }
 	| expr '+' expr		{ $$ = $1 + $3; }
 	| expr '-' expr		{ $$ = $1 - $3; }
 	| expr '*' expr		{ $$ = $1 * $3; }
 	| expr '/' expr		{ if ($3 == 0) YYABORT; $$ = $1 / $3; }
 	| expr '%' expr		{ if ((long)$3 == 0) YYABORT; $$ = (long)$1 % (long)$3; }
 	| '-' expr %prec NEG	{ $$ = -$2; }
-	| '(' expr ')'		{ $$ = $2; }
+	| '(' if_expr ')'	{ $$ = $2; }
+	| MIN '(' expr ',' expr ')' { $$ = $3 < $5 ? $3 : $5; }
+	| MAX '(' expr ',' expr ')' { $$ = $3 > $5 ? $3 : $5; }
+	| SMT_ON		 { $$ = smt_on() > 0; }
 	;

 %%
@@ -82,13 +96,47 @@ static int expr__symbol(YYSTYPE *res, const char *p, const char **pp)
 	char *dst = res->id;
 	const char *s = p;

-	while (isalnum(*p) || *p == '_' || *p == '.') {
+	if (*p == '#')
+		*dst++ = *p++;
+
+	while (isalnum(*p) || *p == '_' || *p == '.' || *p == ':' || *p == '@' || *p == '\\') {
 		if (p - s >= MAXIDLEN)
 			return -1;
-		*dst++ = *p++;
+		/*
+		 * Allow @ instead of / to be able to specify pmu/event/ without
+		 * conflicts with normal division.
+		 */
+		if (*p == '@')
+			*dst++ = '/';
+		else if (*p == '\\')
+			*dst++ = *++p;
+		else
+			*dst++ = *p;
+		p++;
 	}
 	*dst = 0;
 	*pp = p;
+	dst = res->id;
+	switch (dst[0]) {
+	case 'm':
+		if (!strcmp(dst, "min"))
+			return MIN;
+		if (!strcmp(dst, "max"))
+			return MAX;
+		break;
+	case 'i':
+		if (!strcmp(dst, "if"))
+			return IF;
+		break;
+	case 'e':
+		if (!strcmp(dst, "else"))
+			return ELSE;
+		break;
+	case '#':
+		if (!strcasecmp(dst, "#smt_on"))
+			return SMT_ON;
+		break;
+	}
 	return ID;
 }

@@ -102,6 +150,7 @@ static int expr__lex(YYSTYPE *res, const char **pp)
 		p++;
 	s = p;
 	switch (*p++) {
+	case '#':
 	case 'a' ... 'z':
 	case 'A' ... 'Z':
 		return expr__symbol(res, p - 1, pp);
@@ -151,7 +200,7 @@ int expr__find_other(const char *p, const char *one, const char ***other,
 			err = 0;
 			break;
 		}
-		if (tok == ID && strcasecmp(one, val.id)) {
+		if (tok == ID && (!one || strcasecmp(one, val.id))) {
 			if (num_other >= EXPR_MAX_OTHER - 1) {
 				pr_debug("Too many extra events in %s\n", orig);
 				break;
--
2.13.5
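Editorial note on the symbol scanning added to expr__symbol(): the loop
maps '@' to '/' and lets a backslash carry the next character through
literally, which is how an event name containing '-' survives the
lexer. The standalone sketch below mirrors that loop so its behaviour
can be tried outside perf. scan_symbol() and its signature are
hypothetical, and unlike the patch it rejects a backslash that is the
last character of the input, which is a choice made for this sketch
only.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Illustrative stand-in for the loop in expr__symbol(); the name and
 * signature are made up for this sketch. Copies one identifier from
 * src into dst (capacity dstlen), mapping '@' to '/' and dropping the
 * backslash of an escape while keeping the escaped character.
 * Returns the number of source characters consumed, or -1 if dst is
 * too small or a backslash is the last character of the input.
 */
int scan_symbol(const char *src, char *dst, size_t dstlen)
{
	const char *p = src;
	size_t n = 0;

	if (dstlen == 0)
		return -1;

	/* A leading '#' introduces magic variables such as #SMT_on. */
	if (*p == '#') {
		if (n + 1 >= dstlen)
			return -1;
		dst[n++] = *p++;
	}

	while (isalnum((unsigned char)*p) || *p == '_' || *p == '.' ||
	       *p == ':' || *p == '@' || *p == '\\') {
		char c = *p;

		if (c == '@') {
			c = '/';	/* pmu@event@ stands for pmu/event/ */
		} else if (c == '\\') {
			c = *++p;	/* take the escaped character literally */
			if (c == '\0')
				return -1;
		}
		if (n + 1 >= dstlen)
			return -1;
		dst[n++] = c;
		p++;
	}
	dst[n] = '\0';
	return (int)(p - src);
}
```

Given the input cstate_core@c3\-residency@ (with a single backslash),
the sketch yields the identifier cstate_core/c3-residency/ and stops at
the following space, so the '/' that comes next in the formula is still
lexed as division.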
* [PATCH 06/15] perf tools: Increase maximum number of events in expressions
From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Some of the upcoming metrics need more than 8 events. Increase the
maximum number the parser supports.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-9-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/expr.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index 9c2760a1a96e..400ef9eab00a 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -1,7 +1,7 @@
 #ifndef PARSE_CTX_H
 #define PARSE_CTX_H 1

-#define EXPR_MAX_OTHER 8
+#define EXPR_MAX_OTHER 15
 #define MAX_PARSE_ID EXPR_MAX_OTHER

 struct parse_id {
--
2.13.5
* [PATCH 07/15] perf tools: Dedup events in expression parsing
From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Avoid adding redundant events while parsing an expression. When we add
an "other" event check first if it already exists.

v2: Fix perf test failure.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-10-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/expr.y | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
index 5753c4f21534..432b8560cf51 100644
--- a/tools/perf/util/expr.y
+++ b/tools/perf/util/expr.y
@@ -181,6 +181,19 @@ void expr__ctx_init(struct parse_ctx *ctx)
 	ctx->num_ids = 0;
 }

+static bool already_seen(const char *val, const char *one, const char **other,
+			 int num_other)
+{
+	int i;
+
+	if (one && !strcasecmp(one, val))
+		return true;
+	for (i = 0; i < num_other; i++)
+		if (!strcasecmp(other[i], val))
+			return true;
+	return false;
+}
+
 int expr__find_other(const char *p, const char *one, const char ***other,
 		     int *num_otherp)
 {
@@ -200,7 +213,7 @@ int expr__find_other(const char *p, const char *one, const char ***other,
 			err = 0;
 			break;
 		}
-		if (tok == ID && !already_seen(val.id, one, *other, num_other)) {
+		if (tok == ID && !already_seen(val.id, one, *other, num_other)) {
 			if (num_other >= EXPR_MAX_OTHER - 1) {
 				pr_debug("Too many extra events in %s\n", orig);
 				break;
--
2.13.5
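Editorial note on the dedup check: already_seen() compares each new
identifier case-insensitively against the primary event ("one", which
may be NULL) and against every entry collected so far. The sketch below
wraps the same comparison in a small driver so the effect on a list of
identifiers can be seen in isolation. seen() repeats the patch's logic;
collect_unique(), MAX_IDS and the plain capacity check are hypothetical
simplifications (the patch itself tests against EXPR_MAX_OTHER - 1).

```c
#include <stdbool.h>
#include <strings.h>

#define MAX_IDS 15	/* same value as EXPR_MAX_OTHER after patch 06/15 */

/* Same check as already_seen() in the patch, repeated so this compiles alone. */
static bool seen(const char *val, const char *one, const char **other,
		 int num_other)
{
	int i;

	if (one && !strcasecmp(one, val))
		return true;
	for (i = 0; i < num_other; i++)
		if (!strcasecmp(other[i], val))
			return true;
	return false;
}

/*
 * Hypothetical driver: append each id from ids[] to out[] unless it
 * matches `one` or an earlier entry. Returns the number of entries
 * stored, or -1 if more than max entries would be needed.
 */
int collect_unique(const char **ids, int num_ids, const char *one,
		   const char **out, int max)
{
	int num_out = 0;
	int i;

	for (i = 0; i < num_ids; i++) {
		if (seen(ids[i], one, out, num_out))
			continue;
		if (num_out >= max)
			return -1;
		out[num_out++] = ids[i];
	}
	return num_out;
}
```

For example, with "cycles" as the primary event, the identifiers
cycles, INSTRUCTIONS, instructions, Cycles and branches collapse to two
extra events: INSTRUCTIONS and branches.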
* [PATCH 08/15] perf vendor events: Add core event list for Skylake Server
From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC
To: Ingo Molnar
Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Based on JSON list version v1.01

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/3269ae458a883139110ec82bc895423bd8843d65
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/pmu-events/arch/x86/mapfile.csv         |    1 +
 tools/perf/pmu-events/arch/x86/skylakex/cache.json | 1672 ++++++++++++++++++++
 .../arch/x86/skylakex/floating-point.json          |   88 ++
 .../pmu-events/arch/x86/skylakex/frontend.json     |  482 ++++++
 .../perf/pmu-events/arch/x86/skylakex/memory.json  | 1396 ++++++++++++++++
 tools/perf/pmu-events/arch/x86/skylakex/other.json |   72 +
 .../pmu-events/arch/x86/skylakex/pipeline.json     |  950 +++++++++++
 .../arch/x86/skylakex/virtual-memory.json          |  284 ++++
 8 files changed, 4945 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/cache.json
 create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/floating-point.json
 create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/frontend.json
 create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
 create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json

diff --git
a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv index d1a12e584c1b..4ea068366c3e 100644 --- a/tools/perf/pmu-events/arch/x86/mapfile.csv +++ b/tools/perf/pmu-events/arch/x86/mapfile.csv @@ -34,3 +34,4 @@ GenuineIntel-6-2C,v2,westmereep-dp,core GenuineIntel-6-2C,v2,westmereep-dp,core GenuineIntel-6-25,v2,westmereep-sp,core GenuineIntel-6-2F,v2,westmereex,core +GenuineIntel-6-55,v1,skylakex,core diff --git a/tools/perf/pmu-events/arch/x86/skylakex/cache.json b/tools/perf/pmu-events/arch/x86/skylakex/cache.json new file mode 100644 index 000000000000..b5bc742b6fbc --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/skylakex/cache.json @@ -0,0 +1,1672 @@ +[ + { + "EventCode": "0x24", + "UMask": "0x21", + "BriefDescription": "Demand Data Read miss L2, no rejects", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.DEMAND_DATA_RD_MISS", + "PublicDescription": "Counts the number of demand Data Read requests that miss L2 cache. Only not rejected loads are counted.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0x22", + "BriefDescription": "RFO requests that miss L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.RFO_MISS", + "PublicDescription": "Counts the RFO (Read-for-Ownership) requests that miss L2 cache.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0x24", + "BriefDescription": "L2 cache misses when fetching instructions", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.CODE_RD_MISS", + "PublicDescription": "Counts L2 cache misses when fetching instructions.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0x27", + "BriefDescription": "Demand requests that miss L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.ALL_DEMAND_MISS", + "PublicDescription": "Demand requests that miss L2 cache.", + "SampleAfterValue": 
"200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0x38", + "BriefDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.PF_MISS", + "PublicDescription": "Counts requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0x3f", + "BriefDescription": "All requests that miss L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.MISS", + "PublicDescription": "All requests that miss L2 cache.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0x41", + "BriefDescription": "Demand Data Read requests that hit L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.DEMAND_DATA_RD_HIT", + "PublicDescription": "Counts the number of demand Data Read requests that hit L2 cache. 
Only non rejected loads are counted.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0x42", + "BriefDescription": "RFO requests that hit L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.RFO_HIT", + "PublicDescription": "Counts the RFO (Read-for-Ownership) requests that hit L2 cache.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0x44", + "BriefDescription": "L2 cache hits when fetching instructions, code reads.", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.CODE_RD_HIT", + "PublicDescription": "Counts L2 cache hits when fetching instructions, code reads.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0xd8", + "BriefDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that hit L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.PF_HIT", + "PublicDescription": "Counts requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that hit L2 cache.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0xe1", + "BriefDescription": "Demand Data Read requests", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.ALL_DEMAND_DATA_RD", + "PublicDescription": "Counts the number of demand Data Read requests (including requests from L1D hardware prefetchers). These loads may hit or miss L2 cache. Only non rejected loads are counted.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0xe2", + "BriefDescription": "RFO requests to L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.ALL_RFO", + "PublicDescription": "Counts the total number of RFO (read for ownership) requests to L2 cache. 
L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0xe4", + "BriefDescription": "L2 code requests", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.ALL_CODE_RD", + "PublicDescription": "Counts the total number of L2 code requests.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0xe7", + "BriefDescription": "Demand requests to L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES", + "PublicDescription": "Demand requests to L2 cache.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0xf8", + "BriefDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.ALL_PF", + "PublicDescription": "Counts the total number of requests from the L2 hardware prefetchers.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x24", + "UMask": "0xff", + "BriefDescription": "All L2 requests", + "Counter": "0,1,2,3", + "EventName": "L2_RQSTS.REFERENCES", + "PublicDescription": "All L2 requests.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x2E", + "UMask": "0x41", + "BriefDescription": "Core-originated cacheable demand requests missed L3", + "Counter": "0,1,2,3", + "EventName": "LONGEST_LAT_CACHE.MISS", + "PublicDescription": "Counts core-originated cacheable requests that miss the L3 cache (Longest Latency cache). Requests include data and code reads, Reads-for-Ownership (RFOs), speculative accesses and hardware prefetches from L1 and L2. 
It does not include all misses to the L3.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x2E", + "UMask": "0x4f", + "BriefDescription": "Core-originated cacheable demand requests that refer to L3", + "Counter": "0,1,2,3", + "EventName": "LONGEST_LAT_CACHE.REFERENCE", + "PublicDescription": "Counts core-originated cacheable requests to the L3 cache (Longest Latency cache). Requests include data and code reads, Reads-for-Ownership (RFOs), speculative accesses and hardware prefetches from L1 and L2. It does not include all accesses to the L3.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x48", + "UMask": "0x1", + "BriefDescription": "L1D miss outstandings duration in cycles", + "Counter": "0,1,2,3", + "EventName": "L1D_PEND_MISS.PENDING", + "PublicDescription": "Counts duration of L1D miss outstanding, that is each cycle number of Fill Buffers (FB) outstanding required by Demand Reads. FB either is held by demand loads, or it is held by non-demand loads and gets hit at least once by demand. 
The valid outstanding interval is defined until the FB deallocation by one of the following ways: from FB allocation, if FB is allocated by demand from the demand Hit FB, if it is allocated by hardware or software prefetch.Note: In the L1D, a Demand Read contains cacheable or noncacheable demand loads, including ones causing cache-line splits and reads due to page walks resulted from any request type.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x48", + "UMask": "0x1", + "BriefDescription": "Cycles with L1D load Misses outstanding.", + "Counter": "0,1,2,3", + "EventName": "L1D_PEND_MISS.PENDING_CYCLES", + "CounterMask": "1", + "PublicDescription": "Counts duration of L1D miss outstanding in cycles.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x48", + "UMask": "0x1", + "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.", + "Counter": "0,1,2,3", + "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY", + "AnyThread": "1", + "CounterMask": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x48", + "UMask": "0x2", + "BriefDescription": "Number of times a request needed a FB entry but there was no entry available for it. That is the FB unavailability was dominant reason for blocking the request. A request includes cacheable/uncacheable demands that is load, store or SW prefetch.", + "Counter": "0,1,2,3", + "EventName": "L1D_PEND_MISS.FB_FULL", + "PublicDescription": "Number of times a request needed a FB (Fill Buffer) entry but there was no entry available for it. 
A request includes cacheable/uncacheable demands that are load, store or SW prefetch instructions.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x51", + "UMask": "0x1", + "BriefDescription": "L1D data line replacements", + "Counter": "0,1,2,3", + "EventName": "L1D.REPLACEMENT", + "PublicDescription": "Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x1", + "BriefDescription": "Offcore outstanding Demand Data Read transactions in uncore queue.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD", + "PublicDescription": "Counts the number of offcore outstanding Demand Data Read transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor. See the corresponding Umask under OFFCORE_REQUESTS.Note: A prefetch promoted to Demand is counted from the promotion point.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x1", + "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD", + "CounterMask": "1", + "PublicDescription": "Counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). 
A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x1", + "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6", + "CounterMask": "6", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x2", + "BriefDescription": "Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle. ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD", + "PublicDescription": "Counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x2", + "BriefDescription": "Cycles with offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD", + "CounterMask": "1", + "PublicDescription": "Counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). 
See the corresponding Umask under OFFCORE_REQUESTS.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x4", + "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO", + "PublicDescription": "Counts the number of offcore outstanding RFO (store) transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x4", + "BriefDescription": "Cycles with offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO", + "CounterMask": "1", + "PublicDescription": "Counts the number of offcore outstanding demand rfo Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x8", + "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD", + "PublicDescription": "Counts the number of offcore outstanding cacheable Core Data Read transactions in the super queue every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). 
See corresponding Umask under OFFCORE_REQUESTS.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x8", + "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD", + "CounterMask": "1", + "PublicDescription": "Counts cycles when offcore outstanding cacheable Core Data Read transactions are present in the super queue. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB0", + "UMask": "0x1", + "BriefDescription": "Demand Data Read requests sent to uncore", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS.DEMAND_DATA_RD", + "PublicDescription": "Counts the Demand Data Read requests sent to uncore. 
Use it in conjunction with OFFCORE_REQUESTS_OUTSTANDING to determine average latency in the uncore.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB0", + "UMask": "0x2", + "BriefDescription": "Cacheable and noncachaeble code read requests", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS.DEMAND_CODE_RD", + "PublicDescription": "Counts both cacheable and non-cacheable code read requests.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB0", + "UMask": "0x4", + "BriefDescription": "Demand RFO requests including regular RFOs, locks, ItoM", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS.DEMAND_RFO", + "PublicDescription": "Counts the demand RFO (read for ownership) requests including regular RFOs, locks, ItoM.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB0", + "UMask": "0x8", + "BriefDescription": "Demand and prefetch data reads", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS.ALL_DATA_RD", + "PublicDescription": "Counts the demand and prefetch data reads. All Core Data Reads include cacheable 'Demands' and L2 prefetchers (not L3 prefetchers). 
Counting also covers reads due to page walks resulted from any request type.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB0", + "UMask": "0x80", + "BriefDescription": "Any memory transaction that reached the SQ.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS.ALL_REQUESTS", + "PublicDescription": "Counts memory transactions reached the super queue including requests initiated by the core, all L3 prefetches, page walks, etc..", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB2", + "UMask": "0x1", + "BriefDescription": "Offcore requests buffer cannot take more entries for this thread core.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_BUFFER.SQ_FULL", + "PublicDescription": "Counts the number of cases when the offcore requests buffer cannot take more entries for the core. This can happen when the superqueue does not contain eligible entries, or when L1D writeback pending FIFO requests is full.Note: Writeback pending FIFO has six entries.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE", + "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD0", + "UMask": "0x11", + "BriefDescription": "Retired load instructions that miss the STLB.", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + 
"EventName": "MEM_INST_RETIRED.STLB_MISS_LOADS", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD0", + "UMask": "0x12", + "BriefDescription": "Retired store instructions that miss the STLB.", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_INST_RETIRED.STLB_MISS_STORES", + "SampleAfterValue": "100003", + "L1_Hit_Indication": "1", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD0", + "UMask": "0x21", + "BriefDescription": "Retired load instructions with locked access.", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_INST_RETIRED.LOCK_LOADS", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD0", + "UMask": "0x41", + "BriefDescription": "Retired load instructions that split across a cacheline boundary.", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_INST_RETIRED.SPLIT_LOADS", + "PublicDescription": "Counts retired load instructions that split across a cacheline boundary.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD0", + "UMask": "0x42", + "BriefDescription": "Retired store instructions that split across a cacheline boundary.", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_INST_RETIRED.SPLIT_STORES", + "PublicDescription": "Counts retired store instructions that split across a cacheline boundary.", + "SampleAfterValue": "100003", + "L1_Hit_Indication": "1", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD0", + "UMask": "0x81", + "BriefDescription": "All retired load instructions.", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_INST_RETIRED.ALL_LOADS", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD0", + "UMask": "0x82", + "BriefDescription": "All retired store instructions.", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + 
"EventName": "MEM_INST_RETIRED.ALL_STORES", + "SampleAfterValue": "2000003", + "L1_Hit_Indication": "1", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD1", + "UMask": "0x1", + "BriefDescription": "Retired load instructions with L1 cache hits as data sources", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_RETIRED.L1_HIT", + "PublicDescription": "Counts retired load instructions with at least one uop that hit in the L1 data cache. This event includes all SW prefetches and lock instructions regardless of the data source.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD1", + "UMask": "0x2", + "BriefDescription": "Retired load instructions with L2 cache hits as data sources", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_RETIRED.L2_HIT", + "PublicDescription": "Retired load instructions with L2 cache hits as data sources.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD1", + "UMask": "0x4", + "BriefDescription": "Retired load instructions with L3 cache hits as data sources", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_RETIRED.L3_HIT", + "PublicDescription": "Counts retired load instructions with at least one uop that hit in the L3 cache. 
", + "SampleAfterValue": "50021", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD1", + "UMask": "0x8", + "BriefDescription": "Retired load instructions missed L1 cache as data sources", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_RETIRED.L1_MISS", + "PublicDescription": "Counts retired load instructions with at least one uop that missed in the L1 cache.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD1", + "UMask": "0x10", + "BriefDescription": "Retired load instructions missed L2 cache as data sources", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_RETIRED.L2_MISS", + "PublicDescription": "Retired load instructions missed L2 cache as data sources.", + "SampleAfterValue": "50021", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD1", + "UMask": "0x20", + "BriefDescription": "Retired load instructions missed L3 cache as data sources", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_RETIRED.L3_MISS", + "PublicDescription": "Counts retired load instructions with at least one uop that missed in the L3 cache. ", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD1", + "UMask": "0x40", + "BriefDescription": "Retired load instructions which data sources were load missed L1 but hit FB due to preceding miss to the same cache line with data not ready", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_RETIRED.FB_HIT", + "PublicDescription": "Counts retired load instructions with at least one uop was load missed in L1 but hit FB (Fill Buffers) due to preceding miss to the same cache line with data not ready. 
", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD2", + "UMask": "0x1", + "BriefDescription": "Retired load instructions which data sources were L3 hit and cross-core snoop missed in on-pkg core cache.", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS", + "SampleAfterValue": "20011", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD2", + "UMask": "0x2", + "BriefDescription": "Retired load instructions which data sources were L3 and cross-core snoop hits in on-pkg core cache", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_L3_HIT_RETIRED.XSNP_HIT", + "PublicDescription": "Retired load instructions which data sources were L3 and cross-core snoop hits in on-pkg core cache.", + "SampleAfterValue": "20011", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD2", + "UMask": "0x4", + "BriefDescription": "Retired load instructions which data sources were HitM responses from shared L3", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_L3_HIT_RETIRED.XSNP_HITM", + "PublicDescription": "Retired load instructions which data sources were HitM responses from shared L3.", + "SampleAfterValue": "20011", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD2", + "UMask": "0x8", + "BriefDescription": "Retired load instructions which data sources were hits in L3 without snoops required", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_L3_HIT_RETIRED.XSNP_NONE", + "PublicDescription": "Retired load instructions which data sources were hits in L3 without snoops required.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD3", + "UMask": "0x1", + "BriefDescription": "Retired load instructions which data sources missed L3 but serviced from local dram", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": 
"MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM", + "PublicDescription": "Retired load instructions which data sources missed L3 but serviced from local DRAM.", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD3", + "UMask": "0x2", + "BriefDescription": "Retired load instructions which data sources missed L3 but serviced from remote dram", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD3", + "UMask": "0x4", + "BriefDescription": "Retired load instructions whose data sources was remote HITM", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM", + "PublicDescription": "Retired load instructions whose data sources was remote HITM.", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD3", + "UMask": "0x8", + "BriefDescription": "Retired load instructions whose data sources was forwarded from a remote cache", + "Data_LA": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD", + "PublicDescription": "Retired load instructions whose data sources was forwarded from a remote cache.", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xD4", + "UMask": "0x4", + "BriefDescription": "Retired instructions with at least 1 uncacheable load or lock.", + "Data_LA": "1", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "MEM_LOAD_MISC_RETIRED.UC", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xF0", + "UMask": "0x40", + "BriefDescription": "L2 writebacks that access L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_TRANS.L2_WB", + "PublicDescription": "Counts L2 writebacks that access L2 cache.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xF1", + "UMask": 
"0x1f", + "BriefDescription": "L2 cache lines filling L2", + "Counter": "0,1,2,3", + "EventName": "L2_LINES_IN.ALL", + "PublicDescription": "Counts the number of L2 cache lines filling the L2. Counting does not cover rejects.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xF2", + "UMask": "0x1", + "BriefDescription": "Counts the number of lines that are silently dropped by L2 cache when triggered by an L2 cache fill. These lines are typically in Shared state. A non-threaded event.", + "Counter": "0,1,2,3", + "EventName": "L2_LINES_OUT.SILENT", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xF2", + "UMask": "0x2", + "BriefDescription": "Counts the number of lines that are evicted by L2 cache when triggered by an L2 cache fill. Those lines can be either in modified state or clean state. Modified lines may either be written back to L3 or directly written to memory and not allocated in L3. Clean lines may either be allocated in L3 or dropped ", + "Counter": "0,1,2,3", + "EventName": "L2_LINES_OUT.NON_SILENT", + "PublicDescription": "Counts the number of lines that are evicted by L2 cache when triggered by an L2 cache fill. Those lines can be either in modified state or clean state. Modified lines may either be written back to L3 or directly written to memory and not allocated in L3. 
Clean lines may either be allocated in L3 or dropped.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xF2", + "UMask": "0x4", + "BriefDescription": "Counts the number of lines that have been hardware prefetched but not used and now evicted by L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_LINES_OUT.USELESS_PREF", + "PublicDescription": "Counts the number of lines that have been hardware prefetched but not used and now evicted by L2 cache.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xF2", + "UMask": "0x4", + "BriefDescription": "Counts the number of lines that have been hardware prefetched but not used and now evicted by L2 cache", + "Counter": "0,1,2,3", + "EventName": "L2_LINES_OUT.USELESS_HWPF", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xF4", + "UMask": "0x10", + "BriefDescription": "Number of cache line split locks sent to uncore.", + "Counter": "0,1,2,3", + "EventName": "SQ_MISC.SPLIT_LOCK", + "PublicDescription": "Counts the number of cache line split locks sent to the uncore.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that have any response type.", + "MSRValue": "0x0000010001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand 
data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "DEMAND_DATA_RD & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can 
be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that hit in the L3.", + "MSRValue": "0x3f803c0001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that have any response type.", + "MSRValue": "0x0000010002 ", + "Counter": "0,1,2,3", + "EventName": 
"OFFCORE_RESPONSE.DEMAND_RFO.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only 
with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "DEMAND_RFO & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that hit in the L3.", + "MSRValue": "0x3f803c0002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.ANY_SNOOP", + 
"MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that have any response type.", + "MSRValue": "0x0000010004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + 
"Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "DEMAND_CODE_RD & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state 
and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that hit in the L3.", + "MSRValue": "0x3f803c0004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that have any response type.", + "MSRValue": "0x0000010010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": 
"0x01003c0010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "PF_L2_DATA_RD & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event 
codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3.", + "MSRValue": "0x3f803c0010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that have any response type.", + "MSRValue": "0x0000010020 ", + "Counter": "0,1,2,3", + 
"EventName": "OFFCORE_RESPONSE.PF_L2_RFO.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M 
state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "PF_L2_RFO & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3.", + "MSRValue": 
"0x3f803c0020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that have any response type.", + "MSRValue": "0x0000010080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific 
pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "PF_L3_DATA_RD & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M 
state and the line is forwarded.", + "MSRValue": "0x10003c0080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3.", + "MSRValue": "0x3f803c0080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that have any response type.", + "MSRValue": "0x0000010100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR 
to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": 
"1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "PF_L3_RFO & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3.", + "MSRValue": "0x3f803c0100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and 
with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that have any response type.", + "MSRValue": "0x0000010400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", 
+ "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "PF_L1D_AND_SW & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + 
"PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3.", + "MSRValue": "0x3f803c0400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that have any response type.", + "MSRValue": "0x0000010490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": 
"0x1", + "BriefDescription": "Counts all prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "ALL_PF_DATA_RD & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD", + 
"MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that hit in the L3.", + "MSRValue": "0x3f803c0490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that have any response 
type.", + "MSRValue": "0x0000010120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only 
with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "ALL_PF_RFO & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that hit in the L3.", + "MSRValue": "0x3f803c0120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts 
prefetch RFOs that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that have any response type.", + "MSRValue": "0x0000010491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 
0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "ALL_DATA_RD & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the 
line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that hit in the L3.", + "MSRValue": "0x3f803c0491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that have any response type.", + "MSRValue": "0x0000010122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.", + "MSRValue": "0x01003c0122 ", + "Counter": "0,1,2,3", + 
"EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x04003c0122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "ALL_RFO & L3_HIT & SNOOP_HIT_WITH_FWD", + "MSRValue": "0x08003c0122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.SNOOP_HIT_WITH_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + 
"SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.", + "MSRValue": "0x10003c0122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that hit in the L3.", + "MSRValue": "0x3f803c0122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + } +] \ No newline at end of file diff --git a/tools/perf/pmu-events/arch/x86/skylakex/floating-point.json b/tools/perf/pmu-events/arch/x86/skylakex/floating-point.json new file mode 100644 index 000000000000..1c09a328df36 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/skylakex/floating-point.json @@ -0,0 +1,88 @@ +[ + { + "EventCode": "0xC7", + "UMask": "0x1", + "BriefDescription": "Number of SSE/AVX computational 
scalar double precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.", + "Counter": "0,1,2,3", + "EventName": "FP_ARITH_INST_RETIRED.SCALAR_DOUBLE", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC7", + "UMask": "0x2", + "BriefDescription": "Number of SSE/AVX computational scalar single precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.", + "Counter": "0,1,2,3", + "EventName": "FP_ARITH_INST_RETIRED.SCALAR_SINGLE", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC7", + "UMask": "0x4", + "BriefDescription": "Number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired. Each count represents 2 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.", + "Counter": "0,1,2,3", + "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC7", + "UMask": "0x8", + "BriefDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. 
DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element. ", + "Counter": "0,1,2,3", + "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE", + "PublicDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC7", + "UMask": "0x10", + "BriefDescription": "Number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.", + "Counter": "0,1,2,3", + "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC7", + "UMask": "0x20", + "BriefDescription": "Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired. Each count represents 8 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. 
DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.", + "Counter": "0,1,2,3", + "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC7", + "UMask": "0x40", + "BriefDescription": "Number of Packed Double-Precision FP arithmetic instructions (Use operation multiplier of 8)", + "Counter": "0,1,2,3", + "EventName": "FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE", + "PublicDescription": "Number of Packed Double-Precision FP arithmetic instructions (Use operation multiplier of 8).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC7", + "UMask": "0x80", + "BriefDescription": "Number of Packed Single-Precision FP arithmetic instructions (Use operation multiplier of 16)", + "Counter": "0,1,2,3", + "EventName": "FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE", + "PublicDescription": "Number of Packed Single-Precision FP arithmetic instructions (Use operation multiplier of 16).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xCA", + "UMask": "0x1e", + "BriefDescription": "Cycles with any input/output SSE or FP assist", + "Counter": "0,1,2,3", + "EventName": "FP_ASSIST.ANY", + "CounterMask": "1", + "PublicDescription": "Counts cycles with any input and output SSE or x87 FP assist. 
If an input and output assist are detected on the same cycle the event increments by 1.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + } +] \ No newline at end of file diff --git a/tools/perf/pmu-events/arch/x86/skylakex/frontend.json b/tools/perf/pmu-events/arch/x86/skylakex/frontend.json new file mode 100644 index 000000000000..40abc0852cd6 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/skylakex/frontend.json @@ -0,0 +1,482 @@ +[ + { + "EventCode": "0x79", + "UMask": "0x4", + "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path", + "Counter": "0,1,2,3", + "EventName": "IDQ.MITE_UOPS", + "PublicDescription": "Counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may 'bypass' the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x4", + "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path", + "Counter": "0,1,2,3", + "EventName": "IDQ.MITE_CYCLES", + "CounterMask": "1", + "PublicDescription": "Counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may 'bypass' the IDQ.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x8", + "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path", + "Counter": "0,1,2,3", + "EventName": "IDQ.DSB_UOPS", + "PublicDescription": "Counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. 
Counting includes uops that may 'bypass' the IDQ.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x8", + "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path", + "Counter": "0,1,2,3", + "EventName": "IDQ.DSB_CYCLES", + "CounterMask": "1", + "PublicDescription": "Counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may 'bypass' the IDQ.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x10", + "BriefDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy", + "Counter": "0,1,2,3", + "EventName": "IDQ.MS_DSB_CYCLES", + "CounterMask": "1", + "PublicDescription": "Counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x18", + "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops", + "Counter": "0,1,2,3", + "EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS", + "CounterMask": "4", + "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. 
Count includes uops that may 'bypass' the IDQ.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x18", + "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop", + "Counter": "0,1,2,3", + "EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS", + "CounterMask": "1", + "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x20", + "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy", + "Counter": "0,1,2,3", + "EventName": "IDQ.MS_MITE_UOPS", + "PublicDescription": "Counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x24", + "BriefDescription": "Cycles MITE is delivering 4 Uops", + "Counter": "0,1,2,3", + "EventName": "IDQ.ALL_MITE_CYCLES_4_UOPS", + "CounterMask": "4", + "PublicDescription": "Counts the number of cycles 4 uops were delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. Counting includes uops that may 'bypass' the IDQ. 
During these cycles uops are not being delivered from the Decode Stream Buffer (DSB).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x24", + "BriefDescription": "Cycles MITE is delivering any Uop", + "Counter": "0,1,2,3", + "EventName": "IDQ.ALL_MITE_CYCLES_ANY_UOPS", + "CounterMask": "1", + "PublicDescription": "Counts the number of cycles uops were delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. Counting includes uops that may 'bypass' the IDQ. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x30", + "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy", + "Counter": "0,1,2,3", + "EventName": "IDQ.MS_CYCLES", + "CounterMask": "1", + "PublicDescription": "Counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ. 
Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EdgeDetect": "1", + "EventCode": "0x79", + "UMask": "0x30", + "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer", + "Counter": "0,1,2,3", + "EventName": "IDQ.MS_SWITCHES", + "CounterMask": "1", + "PublicDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x79", + "UMask": "0x30", + "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy", + "Counter": "0,1,2,3", + "EventName": "IDQ.MS_UOPS", + "PublicDescription": "Counts the total number of uops delivered by the Microcode Sequencer (MS). Any instruction over 4 uops will be delivered by the MS. Some instructions such as transcendentals may additionally generate uops from the MS.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x80", + "UMask": "0x4", + "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss.", + "Counter": "0,1,2,3", + "EventName": "ICACHE_16B.IFDATA_STALL", + "PublicDescription": "Cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x83", + "UMask": "0x1", + "BriefDescription": "Instruction fetch tag lookups that hit in the instruction cache (L1I). 
Counts at 64-byte cache-line granularity.", + "Counter": "0,1,2,3", + "EventName": "ICACHE_64B.IFTAG_HIT", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x83", + "UMask": "0x2", + "BriefDescription": "Instruction fetch tag lookups that miss in the instruction cache (L1I). Counts at 64-byte cache-line granularity.", + "Counter": "0,1,2,3", + "EventName": "ICACHE_64B.IFTAG_MISS", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x83", + "UMask": "0x4", + "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.", + "Counter": "0,1,2,3", + "EventName": "ICACHE_64B.IFTAG_STALL", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x9C", + "UMask": "0x1", + "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled", + "Counter": "0,1,2,3", + "EventName": "IDQ_UOPS_NOT_DELIVERED.CORE", + "PublicDescription": "Counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding \u201c4 \u2013 x\u201d when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when: a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread. b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions). c. 
Instruction Decode Queue (IDQ) delivers four uops.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x9C", + "UMask": "0x1", + "BriefDescription": "Cycles per thread when 4 or more uops are not delivered to Resource Allocation Table (RAT) when backend of the machine is not stalled", + "Counter": "0,1,2,3", + "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE", + "CounterMask": "4", + "PublicDescription": "Counts, on the per-thread basis, cycles when no uops are delivered to Resource Allocation Table (RAT). IDQ_Uops_Not_Delivered.core =4.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x9C", + "UMask": "0x1", + "BriefDescription": "Cycles per thread when 3 or more uops are not delivered to Resource Allocation Table (RAT) when backend of the machine is not stalled", + "Counter": "0,1,2,3", + "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_LE_1_UOP_DELIV.CORE", + "CounterMask": "3", + "PublicDescription": "Counts, on the per-thread basis, cycles when less than 1 uop is delivered to Resource Allocation Table (RAT). 
IDQ_Uops_Not_Delivered.core >= 3.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x9C", + "UMask": "0x1", + "BriefDescription": "Cycles with less than 2 uops delivered by the front end.", + "Counter": "0,1,2,3", + "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_LE_2_UOP_DELIV.CORE", + "CounterMask": "2", + "PublicDescription": "Cycles with less than 2 uops delivered by the front-end.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x9C", + "UMask": "0x1", + "BriefDescription": "Cycles with less than 3 uops delivered by the front end.", + "Counter": "0,1,2,3", + "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_LE_3_UOP_DELIV.CORE", + "CounterMask": "1", + "PublicDescription": "Cycles with less than 3 uops delivered by the front-end.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "Invert": "1", + "EventCode": "0x9C", + "UMask": "0x1", + "BriefDescription": "Counts cycles FE delivered 4 uops or Resource Allocation Table (RAT) was stalling FE.", + "Counter": "0,1,2,3", + "EventName": "IDQ_UOPS_NOT_DELIVERED.CYCLES_FE_WAS_OK", + "CounterMask": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xAB", + "UMask": "0x2", + "BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.", + "Counter": "0,1,2,3", + "EventName": "DSB2MITE_SWITCHES.PENALTY_CYCLES", + "PublicDescription": "Counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. 
MM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.Penalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 0\u20132 cycles.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired Instructions who experienced decode stream buffer (DSB - the decoded instruction-cache) miss.", + "PEBS": "1", + "MSRValue": "0x11", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.DSB_MISS", + "MSRIndex": "0x3F7", + "PublicDescription": "Counts retired Instructions that experienced DSB (Decode stream buffer i.e. the decoded instruction-cache) miss. 
", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired Instructions who experienced Instruction L1 Cache true miss.", + "PEBS": "1", + "MSRValue": "0x12", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.L1I_MISS", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired Instructions who experienced Instruction L2 Cache true miss.", + "PEBS": "1", + "MSRValue": "0x13", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.L2_MISS", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired Instructions who experienced iTLB true miss.", + "PEBS": "1", + "MSRValue": "0x14", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.ITLB_MISS", + "MSRIndex": "0x3F7", + "PublicDescription": "Counts retired Instructions that experienced iTLB (Instruction TLB) true miss.", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired Instructions who experienced STLB (2nd level TLB) true miss.", + "PEBS": "1", + "MSRValue": "0x15", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.STLB_MISS", + "MSRIndex": "0x3F7", + "PublicDescription": "Counts retired Instructions that experienced STLB (2nd level TLB) true miss. 
", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 2 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x400206", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_2", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 2 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x200206", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_2_BUBBLES_GE_2", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x400406", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_4", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 8 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x400806", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_8", + "MSRIndex": "0x3F7", + "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 8 cycles. 
During this period the front-end delivered no uops.", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 16 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x401006", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_16", + "MSRIndex": "0x3F7", + "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 16 cycles. During this period the front-end delivered no uops.", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 32 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x402006", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_32", + "MSRIndex": "0x3F7", + "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 32 cycles. 
During this period the front-end delivered no uops.", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x404006", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_64", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x408006", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_128", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x410006", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_256", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x420006", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_512", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + 
"EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 1 bubble-slot for a period of 2 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x100206", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_2_BUBBLES_GE_1", + "MSRIndex": "0x3F7", + "PublicDescription": "Counts retired instructions that are delivered to the back-end after the front-end had at least 1 bubble-slot for a period of 2 cycles. A bubble-slot is an empty issue-pipeline slot while there was no RAT stall.", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC6", + "UMask": "0x1", + "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 3 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall.", + "PEBS": "1", + "MSRValue": "0x300206", + "Counter": "0,1,2,3", + "EventName": "FRONTEND_RETIRED.LATENCY_GE_2_BUBBLES_GE_3", + "MSRIndex": "0x3F7", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + } +] \ No newline at end of file diff --git a/tools/perf/pmu-events/arch/x86/skylakex/memory.json b/tools/perf/pmu-events/arch/x86/skylakex/memory.json new file mode 100644 index 000000000000..ca22a22c1abd --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/skylakex/memory.json @@ -0,0 +1,1396 @@ +[ + { + "EventCode": "0x54", + "UMask": "0x1", + "BriefDescription": "Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address", + "Counter": "0,1,2,3", + "EventName": "TX_MEM.ABORT_CONFLICT", + "PublicDescription": "Number of times a TSX line had a cache conflict.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x54", + "UMask": "0x2", + "BriefDescription": "Number of times a transactional abort was 
signaled due to a data capacity limitation for transactional reads or writes.", + "Counter": "0,1,2,3", + "EventName": "TX_MEM.ABORT_CAPACITY", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x54", + "UMask": "0x4", + "BriefDescription": "Number of times a HLE transactional region aborted due to a non XRELEASE prefixed instruction writing to an elided lock in the elision buffer", + "Counter": "0,1,2,3", + "EventName": "TX_MEM.ABORT_HLE_STORE_TO_ELIDED_LOCK", + "PublicDescription": "Number of times a TSX Abort was triggered due to a non-release/commit store to lock.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x54", + "UMask": "0x8", + "BriefDescription": "Number of times an HLE transactional execution aborted due to NoAllocatedElisionBuffer being non-zero.", + "Counter": "0,1,2,3", + "EventName": "TX_MEM.ABORT_HLE_ELISION_BUFFER_NOT_EMPTY", + "PublicDescription": "Number of times a TSX Abort was triggered due to commit but Lock Buffer not empty.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x54", + "UMask": "0x10", + "BriefDescription": "Number of times an HLE transactional execution aborted due to XRELEASE lock not satisfying the address and value requirements in the elision buffer", + "Counter": "0,1,2,3", + "EventName": "TX_MEM.ABORT_HLE_ELISION_BUFFER_MISMATCH", + "PublicDescription": "Number of times a TSX Abort was triggered due to release/commit but data and address mismatch.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x54", + "UMask": "0x20", + "BriefDescription": "Number of times an HLE transactional execution aborted due to an unsupported read alignment from the elision buffer.", + "Counter": "0,1,2,3", + "EventName": "TX_MEM.ABORT_HLE_ELISION_BUFFER_UNSUPPORTED_ALIGNMENT", + "PublicDescription": "Number of times a TSX Abort was triggered due to attempting 
an unsupported alignment from Lock Buffer.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x54", + "UMask": "0x40", + "BriefDescription": "Number of times HLE lock could not be elided due to ElisionBufferAvailable being zero.", + "Counter": "0,1,2,3", + "EventName": "TX_MEM.HLE_ELISION_BUFFER_FULL", + "PublicDescription": "Number of times we could not allocate Lock Buffer.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x5d", + "UMask": "0x1", + "BriefDescription": "Counts the number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort.", + "Counter": "0,1,2,3", + "EventName": "TX_EXEC.MISC1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x5d", + "UMask": "0x2", + "BriefDescription": "Counts the number of times a class of instructions (e.g., vzeroupper) that may cause a transactional abort was executed inside a transactional region", + "Counter": "0,1,2,3", + "EventName": "TX_EXEC.MISC2", + "PublicDescription": "Unfriendly TSX abort triggered by a vzeroupper instruction.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x5d", + "UMask": "0x4", + "BriefDescription": "Counts the number of times an instruction execution caused the transactional nest count supported to be exceeded", + "Counter": "0,1,2,3", + "EventName": "TX_EXEC.MISC3", + "PublicDescription": "Unfriendly TSX abort triggered by a nest count that is too deep.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x5d", + "UMask": "0x8", + "BriefDescription": "Counts the number of times a XBEGIN instruction was executed inside an HLE transactional region.", + "Counter": "0,1,2,3", + "EventName": "TX_EXEC.MISC4", + "PublicDescription": "RTM region detected 
inside HLE.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x5d", + "UMask": "0x10", + "BriefDescription": "Counts the number of times an HLE XACQUIRE instruction was executed inside an RTM transactional region", + "Counter": "0,1,2,3", + "EventName": "TX_EXEC.MISC5", + "PublicDescription": "Counts the number of times an HLE XACQUIRE instruction was executed inside an RTM transactional region.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x10", + "BriefDescription": "Counts number of Offcore outstanding Demand Data Read requests that miss L3 cache in the superQ every cycle.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.L3_MISS_DEMAND_DATA_RD", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x10", + "BriefDescription": "Cycles with at least 1 Demand Data Read requests who miss L3 cache in the superQ.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_L3_MISS_DEMAND_DATA_RD", + "CounterMask": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x60", + "UMask": "0x10", + "BriefDescription": "Cycles with at least 6 Demand Data Read requests that miss L3 cache in the superQ.", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS_OUTSTANDING.L3_MISS_DEMAND_DATA_RD_GE_6", + "CounterMask": "6", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA3", + "UMask": "0x2", + "BriefDescription": "Cycles while L3 cache miss demand load is outstanding.", + "Counter": "0,1,2,3", + "EventName": "CYCLE_ACTIVITY.CYCLES_L3_MISS", + "CounterMask": "2", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA3", + "UMask": "0x6", + "BriefDescription": "Execution stalls while L3 cache miss demand load is 
outstanding.", + "Counter": "0,1,2,3", + "EventName": "CYCLE_ACTIVITY.STALLS_L3_MISS", + "CounterMask": "6", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB0", + "UMask": "0x10", + "BriefDescription": "Demand Data Read requests who miss L3 cache", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_REQUESTS.L3_MISS_DEMAND_DATA_RD", + "PublicDescription": "Demand Data Read requests who miss L3 cache.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC3", + "UMask": "0x2", + "BriefDescription": "Counts the number of machine clears due to memory order conflicts.", + "Counter": "0,1,2,3", + "EventName": "MACHINE_CLEARS.MEMORY_ORDERING", + "Errata": "SKL089", + "PublicDescription": "Counts the number of memory ordering Machine Clears detected. Memory Ordering Machine Clears can result from one of the following:a. memory disambiguation,b. external snoop, orc. cross SMT-HW-thread snoop (stores) hitting load buffer.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC8", + "UMask": "0x1", + "BriefDescription": "Number of times an HLE execution started.", + "Counter": "0,1,2,3", + "EventName": "HLE_RETIRED.START", + "PublicDescription": "Number of times we entered an HLE region. Does not count nested transactions.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC8", + "UMask": "0x2", + "BriefDescription": "Number of times an HLE execution successfully committed", + "Counter": "0,1,2,3", + "EventName": "HLE_RETIRED.COMMIT", + "PublicDescription": "Number of times HLE commit succeeded.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC8", + "UMask": "0x4", + "BriefDescription": "Number of times an HLE execution aborted due to any reasons (multiple categories may count as one). 
", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "HLE_RETIRED.ABORTED", + "PublicDescription": "Number of times HLE abort was triggered.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC8", + "UMask": "0x8", + "BriefDescription": "Number of times an HLE execution aborted due to various memory events (e.g., read/write capacity and conflicts).", + "Counter": "0,1,2,3", + "EventName": "HLE_RETIRED.ABORTED_MEM", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC8", + "UMask": "0x10", + "BriefDescription": "Number of times an HLE execution aborted due to hardware timer expiration.", + "Counter": "0,1,2,3", + "EventName": "HLE_RETIRED.ABORTED_TIMER", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC8", + "UMask": "0x20", + "BriefDescription": "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.). 
", + "Counter": "0,1,2,3", + "EventName": "HLE_RETIRED.ABORTED_UNFRIENDLY", + "PublicDescription": "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC8", + "UMask": "0x40", + "BriefDescription": "Number of times an HLE execution aborted due to incompatible memory type", + "Counter": "0,1,2,3", + "EventName": "HLE_RETIRED.ABORTED_MEMTYPE", + "PublicDescription": "Number of times an HLE execution aborted due to incompatible memory type.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC8", + "UMask": "0x80", + "BriefDescription": "Number of times an HLE execution aborted due to unfriendly events (such as interrupts).", + "Counter": "0,1,2,3", + "EventName": "HLE_RETIRED.ABORTED_EVENTS", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC9", + "UMask": "0x1", + "BriefDescription": "Number of times an RTM execution started.", + "Counter": "0,1,2,3", + "EventName": "RTM_RETIRED.START", + "PublicDescription": "Number of times we entered an RTM region. Does not count nested transactions.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC9", + "UMask": "0x2", + "BriefDescription": "Number of times an RTM execution successfully committed", + "Counter": "0,1,2,3", + "EventName": "RTM_RETIRED.COMMIT", + "PublicDescription": "Number of times RTM commit succeeded.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC9", + "UMask": "0x4", + "BriefDescription": "Number of times an RTM execution aborted due to any reasons (multiple categories may count as one). 
", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "RTM_RETIRED.ABORTED", + "PublicDescription": "Number of times RTM abort was triggered.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC9", + "UMask": "0x8", + "BriefDescription": "Number of times an RTM execution aborted due to various memory events (e.g. read/write capacity and conflicts)", + "Counter": "0,1,2,3", + "EventName": "RTM_RETIRED.ABORTED_MEM", + "PublicDescription": "Number of times an RTM execution aborted due to various memory events (e.g. read/write capacity and conflicts).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC9", + "UMask": "0x10", + "BriefDescription": "Number of times an RTM execution aborted due to uncommon conditions.", + "Counter": "0,1,2,3", + "EventName": "RTM_RETIRED.ABORTED_TIMER", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC9", + "UMask": "0x20", + "BriefDescription": "Number of times an RTM execution aborted due to HLE-unfriendly instructions", + "Counter": "0,1,2,3", + "EventName": "RTM_RETIRED.ABORTED_UNFRIENDLY", + "PublicDescription": "Number of times an RTM execution aborted due to HLE-unfriendly instructions.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC9", + "UMask": "0x40", + "BriefDescription": "Number of times an RTM execution aborted due to incompatible memory type", + "Counter": "0,1,2,3", + "EventName": "RTM_RETIRED.ABORTED_MEMTYPE", + "PublicDescription": "Number of times an RTM execution aborted due to incompatible memory type.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC9", + "UMask": "0x80", + "BriefDescription": "Number of times an RTM execution aborted due to none of the previous 4 categories (e.g. 
interrupt)", + "Counter": "0,1,2,3", + "EventName": "RTM_RETIRED.ABORTED_EVENTS", + "PublicDescription": "Number of times an RTM execution aborted due to none of the previous 4 categories (e.g. interrupt).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xCD", + "UMask": "0x1", + "BriefDescription": "Counts loads when the latency from first dispatch to completion is greater than 4 cycles.", + "PEBS": "2", + "MSRValue": "0x4", + "Counter": "0,1,2,3", + "EventName": "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_4", + "MSRIndex": "0x3F6", + "PublicDescription": "Counts loads when the latency from first dispatch to completion is greater than 4 cycles. Reported latency may be longer than just the memory latency.", + "TakenAlone": "1", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xCD", + "UMask": "0x1", + "BriefDescription": "Counts loads when the latency from first dispatch to completion is greater than 8 cycles.", + "PEBS": "2", + "MSRValue": "0x8", + "Counter": "0,1,2,3", + "EventName": "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_8", + "MSRIndex": "0x3F6", + "PublicDescription": "Counts loads when the latency from first dispatch to completion is greater than 8 cycles. Reported latency may be longer than just the memory latency.", + "TakenAlone": "1", + "SampleAfterValue": "50021", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xCD", + "UMask": "0x1", + "BriefDescription": "Counts loads when the latency from first dispatch to completion is greater than 16 cycles.", + "PEBS": "2", + "MSRValue": "0x10", + "Counter": "0,1,2,3", + "EventName": "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_16", + "MSRIndex": "0x3F6", + "PublicDescription": "Counts loads when the latency from first dispatch to completion is greater than 16 cycles. 
Reported latency may be longer than just the memory latency.", + "TakenAlone": "1", + "SampleAfterValue": "20011", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xCD", + "UMask": "0x1", + "BriefDescription": "Counts loads when the latency from first dispatch to completion is greater than 32 cycles.", + "PEBS": "2", + "MSRValue": "0x20", + "Counter": "0,1,2,3", + "EventName": "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_32", + "MSRIndex": "0x3F6", + "PublicDescription": "Counts loads when the latency from first dispatch to completion is greater than 32 cycles. Reported latency may be longer than just the memory latency.", + "TakenAlone": "1", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xCD", + "UMask": "0x1", + "BriefDescription": "Counts loads when the latency from first dispatch to completion is greater than 64 cycles.", + "PEBS": "2", + "MSRValue": "0x40", + "Counter": "0,1,2,3", + "EventName": "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_64", + "MSRIndex": "0x3F6", + "PublicDescription": "Counts loads when the latency from first dispatch to completion is greater than 64 cycles. Reported latency may be longer than just the memory latency.", + "TakenAlone": "1", + "SampleAfterValue": "2003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xCD", + "UMask": "0x1", + "BriefDescription": "Counts loads when the latency from first dispatch to completion is greater than 128 cycles.", + "PEBS": "2", + "MSRValue": "0x80", + "Counter": "0,1,2,3", + "EventName": "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_128", + "MSRIndex": "0x3F6", + "PublicDescription": "Counts loads when the latency from first dispatch to completion is greater than 128 cycles. 
Reported latency may be longer than just the memory latency.", + "TakenAlone": "1", + "SampleAfterValue": "1009", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xCD", + "UMask": "0x1", + "BriefDescription": "Counts loads when the latency from first dispatch to completion is greater than 256 cycles.", + "PEBS": "2", + "MSRValue": "0x100", + "Counter": "0,1,2,3", + "EventName": "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_256", + "MSRIndex": "0x3F6", + "PublicDescription": "Counts loads when the latency from first dispatch to completion is greater than 256 cycles. Reported latency may be longer than just the memory latency.", + "TakenAlone": "1", + "SampleAfterValue": "503", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xCD", + "UMask": "0x1", + "BriefDescription": "Counts loads when the latency from first dispatch to completion is greater than 512 cycles.", + "PEBS": "2", + "MSRValue": "0x200", + "Counter": "0,1,2,3", + "EventName": "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_512", + "MSRIndex": "0x3F6", + "PublicDescription": "Counts loads when the latency from first dispatch to completion is greater than 512 cycles. Reported latency may be longer than just the memory latency.", + "TakenAlone": "1", + "SampleAfterValue": "101", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that miss in the L3.", + "MSRValue": "0x3fbc000001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts demand data reads that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000001 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that miss in the L3.", + "MSRValue": "0x3fbc000002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000002 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that miss in the L3.", + "MSRValue": "0x3fbc000004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand code reads that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000004 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that miss in the L3.", + "MSRValue": "0x3fbc000010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000010 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that miss in the L3.", + "MSRValue": "0x3fbc000020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000020 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss in the L3.", + "MSRValue": "0x3fbc000080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000080 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss in the L3.", + "MSRValue": "0x3fbc000100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000100 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss in the L3.", + "MSRValue": "0x3fbc000400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000400 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that miss in the L3.", + "MSRValue": "0x3fbc000490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000490 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that miss in the L3.", + "MSRValue": "0x3fbc000120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000120 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that miss in the L3.", + "MSRValue": "0x3fbc000491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000491 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that miss in the L3.", + "MSRValue": "0x3fbc000122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.ANY_SNOOP", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that miss in the L3. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that miss the L3 and clean or shared data is transferred from remote cache.", + "MSRValue": "0x083fc00122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.REMOTE_HIT_FORWARD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and clean or shared data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that miss the L3 and the modified data is transferred from remote cache.", + "MSRValue": "0x103fc00122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.REMOTE_HITM", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the modified data is transferred from remote cache. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local or remote dram.", + "MSRValue": "0x063fc00122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local or remote dram. 
", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from remote dram.", + "MSRValue": "0x063b800122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from remote dram. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + }, + { + "Offcore": "1", + "EventCode": "0xB7, 0xBB", + "UMask": "0x1", + "BriefDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local dram.", + "MSRValue": "0x0604000122 ", + "Counter": "0,1,2,3", + "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local dram.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3" + } +] diff --git a/tools/perf/pmu-events/arch/x86/skylakex/other.json b/tools/perf/pmu-events/arch/x86/skylakex/other.json new file mode 100644 index 000000000000..70243b0b0586 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/skylakex/other.json @@ -0,0 +1,72 @@ +[ + { + "EventCode": "0x28", + "UMask": "0x7", + "BriefDescription": "Core cycles where the core was running in a manner where Turbo may be clipped to the Non-AVX turbo schedule.", + "Counter": "0,1,2,3", + "EventName": "CORE_POWER.LVL0_TURBO_LICENSE", + "PublicDescription": "Core cycles where the core was running with power-delivery for baseline license level 0. 
This includes non-AVX codes, SSE, AVX 128-bit, and low-current AVX 256-bit codes.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x28", + "UMask": "0x18", + "BriefDescription": "Core cycles where the core was running in a manner where Turbo may be clipped to the AVX2 turbo schedule.", + "Counter": "0,1,2,3", + "EventName": "CORE_POWER.LVL1_TURBO_LICENSE", + "PublicDescription": "Core cycles where the core was running with power-delivery for license level 1. This includes high current AVX 256-bit instructions as well as low current AVX 512-bit instructions.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x28", + "UMask": "0x20", + "BriefDescription": "Core cycles where the core was running in a manner where Turbo may be clipped to the AVX512 turbo schedule.", + "Counter": "0,1,2,3", + "EventName": "CORE_POWER.LVL2_TURBO_LICENSE", + "PublicDescription": "Core cycles where the core was running with power-delivery for license level 2 (introduced in Skylake Server michroarchtecture). 
This includes high current AVX 512-bit instructions.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x28", + "UMask": "0x40", + "BriefDescription": "Core cycles the core was throttled due to a pending power level request.", + "Counter": "0,1,2,3", + "EventName": "CORE_POWER.THROTTLE", + "PublicDescription": "Core cycles the out-of-order engine was throttled due to a pending power level request.", + "SampleAfterValue": "200003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xCB", + "UMask": "0x1", + "BriefDescription": "Number of hardware interrupts received by the processor.", + "Counter": "0,1,2,3", + "EventName": "HW_INTERRUPTS.RECEIVED", + "PublicDescription": "Counts the number of hardware interruptions received by the processor.", + "SampleAfterValue": "203", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xFE", + "UMask": "0x2", + "BriefDescription": "Counts number of cache lines that are allocated and written back to L3 with the intention that they are more likely to be reused shortly", + "Counter": "0,1,2,3", + "EventName": "IDI_MISC.WB_UPGRADE", + "PublicDescription": "Counts number of cache lines that are allocated and written back to L3 with the intention that they are more likely to be reused shortly.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xFE", + "UMask": "0x4", + "BriefDescription": "Counts number of cache lines that are dropped and not written back to L3 as they are deemed to be less likely to be reused shortly", + "Counter": "0,1,2,3", + "EventName": "IDI_MISC.WB_DOWNGRADE", + "PublicDescription": "Counts number of cache lines that are dropped and not written back to L3 as they are deemed to be less likely to be reused shortly.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + } +] diff --git a/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json 
b/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json new file mode 100644 index 000000000000..0895d1e52a4a --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json @@ -0,0 +1,950 @@ +[ + { + "EventCode": "0x00", + "UMask": "0x1", + "BriefDescription": "Instructions retired from execution.", + "Counter": "Fixed counter 1", + "EventName": "INST_RETIRED.ANY", + "PublicDescription": "Counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, Counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. Notes: INST_RETIRED.ANY is counted by a designated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. INST_RETIRED.ANY_P is counted by a programmable counter and it is an architectural performance event. Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions.", + "SampleAfterValue": "2000003", + "CounterHTOff": "Fixed counter 1" + }, + { + "EventCode": "0x00", + "UMask": "0x2", + "BriefDescription": "Core cycles when the thread is not in halt state", + "Counter": "Fixed counter 2", + "EventName": "CPU_CLK_UNHALTED.THREAD", + "PublicDescription": "Counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. 
It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.", + "SampleAfterValue": "2000003", + "CounterHTOff": "Fixed counter 2" + }, + { + "EventCode": "0x00", + "UMask": "0x2", + "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.", + "Counter": "Fixed counter 2", + "EventName": "CPU_CLK_UNHALTED.THREAD_ANY", + "AnyThread": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "Fixed counter 2" + }, + { + "EventCode": "0x00", + "UMask": "0x3", + "BriefDescription": "Reference cycles when the core is not in halt state.", + "Counter": "Fixed counter 3", + "EventName": "CPU_CLK_UNHALTED.REF_TSC", + "PublicDescription": "Counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. This event has a constant ratio with the CPU_CLK_UNHALTED.REF_XCLK event. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. Note: On all current platforms this event stops counting during 'throttling (TM)' states duty off periods the processor is 'halted'. The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear 'sticky'. After the counter has overflowed and software clears the overfl ow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip 'high (1)' and generate another PMI (if enabled) after which the reset value gets clocked into the counter. 
Therefore, software will get the interrupt, read the overflow status bit '1 for bit 34 while the counter value is less than MAX. Software should ignore this case.", + "SampleAfterValue": "2000003", + "CounterHTOff": "Fixed counter 3" + }, + { + "EventCode": "0x03", + "UMask": "0x2", + "BriefDescription": "Loads blocked by overlapping with store buffer that cannot be forwarded .", + "Counter": "0,1,2,3", + "EventName": "LD_BLOCKS.STORE_FORWARD", + "PublicDescription": "Counts how many times the load operation got the true Block-on-Store blocking code preventing store forwarding. This includes cases when:a. preceding store conflicts with the load (incomplete overlap),b. store forwarding is impossible due to u-arch limitations,c. preceding lock RMW operations are not forwarded,d. store has the no-forward bit set (uncacheable/page-split/masked stores),e. all-blocking stores are used (mostly, fences and port I/O), and others.The most common case is a load blocked due to its address range overlapping with a preceding smaller uncompleted store. Note: This event does not take into account cases of out-of-SW-control (for example, SbTailHit), unknown physical STA, and cases of blocking loads on store due to being non-WB memory type or a lock. These cases are covered by other events. 
See the table of not supported store forwards in the Optimization Guide.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x03", + "UMask": "0x8", + "BriefDescription": "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use", + "Counter": "0,1,2,3", + "EventName": "LD_BLOCKS.NO_SR", + "PublicDescription": "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x07", + "UMask": "0x1", + "BriefDescription": "False dependencies in MOB due to partial compare on address.", + "Counter": "0,1,2,3", + "EventName": "LD_BLOCKS_PARTIAL.ADDRESS_ALIAS", + "PublicDescription": "Counts false dependencies in MOB when the partial comparison upon loose net check and dependency was resolved by the Enhanced Loose net mechanism. This may not result in high performance penalties. Loose net checks can fail when loads and stores are 4k aliased.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x0D", + "UMask": "0x1", + "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread (e.g. misprediction or memory nuke)", + "Counter": "0,1,2,3", + "EventName": "INT_MISC.RECOVERY_CYCLES", + "PublicDescription": "Core cycles the Resource allocator was stalled due to recovery from an earlier branch misprediction or machine clear event.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x0D", + "UMask": "0x1", + "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. 
misprediction or memory nuke).", + "Counter": "0,1,2,3", + "EventName": "INT_MISC.RECOVERY_CYCLES_ANY", + "AnyThread": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x0D", + "UMask": "0x80", + "BriefDescription": "Cycles the issue-stage is waiting for front-end to fetch from resteered path following branch misprediction or machine clear events.", + "Counter": "0,1,2,3", + "EventName": "INT_MISC.CLEAR_RESTEER_CYCLES", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x0E", + "UMask": "0x1", + "BriefDescription": "Uops that Resource Allocation Table (RAT) issues to Reservation Station (RS)", + "Counter": "0,1,2,3", + "EventName": "UOPS_ISSUED.ANY", + "PublicDescription": "Counts the number of uops that the Resource Allocation Table (RAT) issues to the Reservation Station (RS).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "Invert": "1", + "EventCode": "0x0E", + "UMask": "0x1", + "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread", + "Counter": "0,1,2,3", + "EventName": "UOPS_ISSUED.STALL_CYCLES", + "CounterMask": "1", + "PublicDescription": "Counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x0E", + "UMask": "0x2", + "BriefDescription": "Uops inserted at issue-stage in order to preserve upper bits of vector registers.", + "Counter": "0,1,2,3", + "EventName": "UOPS_ISSUED.VECTOR_WIDTH_MISMATCH", + "PublicDescription": "Counts the number of Blend Uops issued by the Resource Allocation Table (RAT) to the reservation station (RS) in order to preserve upper bits of vector registers. 
Starting with the Skylake microarchitecture, these Blend uops are needed since every Intel SSE instruction executed in Dirty Upper State needs to preserve bits 128-255 of the destination register. For more information, refer to \u201cMixing Intel AVX and Intel SSE Code\u201d section of the Optimization Guide.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x0E", + "UMask": "0x20", + "BriefDescription": "Number of slow LEA uops being allocated. A uop is generally considered SlowLea if it has 3 sources (e.g. 2 sources + immediate) regardless if as a result of LEA instruction or not.", + "Counter": "0,1,2,3", + "EventName": "UOPS_ISSUED.SLOW_LEA", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x14", + "UMask": "0x1", + "BriefDescription": "Cycles when divide unit is busy executing divide or square root operations. Accounts for integer and floating-point operations.", + "Counter": "0,1,2,3", + "EventName": "ARITH.DIVIDER_ACTIVE", + "CounterMask": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x3C", + "UMask": "0x0", + "BriefDescription": "Thread cycles when thread is not in halt state", + "Counter": "0,1,2,3", + "EventName": "CPU_CLK_UNHALTED.THREAD_P", + "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. 
For this reason, this event may have a changing ratio with regards to wall clock time.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x3C", + "UMask": "0x0", + "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.", + "Counter": "0,1,2,3", + "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY", + "AnyThread": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EdgeDetect": "1", + "EventCode": "0x3C", + "UMask": "0x0", + "BriefDescription": "Counts when there is a transition from ring 1, 2 or 3 to ring 0.", + "Counter": "0,1,2,3", + "EventName": "CPU_CLK_UNHALTED.RING0_TRANS", + "CounterMask": "1", + "PublicDescription": "Counts when the Current Privilege Level (CPL) transitions from ring 1, 2 or 3 to ring 0 (Kernel).", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x3C", + "UMask": "0x1", + "BriefDescription": "Core crystal clock cycles when the thread is unhalted.", + "Counter": "0,1,2,3", + "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK", + "SampleAfterValue": "2503", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x3C", + "UMask": "0x1", + "BriefDescription": "Core crystal clock cycles when at least one thread on the physical core is unhalted.", + "Counter": "0,1,2,3", + "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY", + "AnyThread": "1", + "SampleAfterValue": "2503", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x3C", + "UMask": "0x1", + "BriefDescription": "Core crystal clock cycles when the thread is unhalted.", + "Counter": "0,1,2,3", + "EventName": "CPU_CLK_UNHALTED.REF_XCLK", + "SampleAfterValue": "2503", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x3C", + "UMask": "0x1", + "BriefDescription": "Core crystal clock cycles when at least one thread on the physical core is unhalted.", + "Counter": "0,1,2,3", + "EventName": 
"CPU_CLK_UNHALTED.REF_XCLK_ANY", + "AnyThread": "1", + "SampleAfterValue": "2503", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x3C", + "UMask": "0x2", + "BriefDescription": "Core crystal clock cycles when this thread is unhalted and the other thread is halted.", + "Counter": "0,1,2,3", + "EventName": "CPU_CLK_THREAD_UNHALTED.ONE_THREAD_ACTIVE", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x3C", + "UMask": "0x2", + "BriefDescription": "Core crystal clock cycles when this thread is unhalted and the other thread is halted.", + "Counter": "0,1,2,3", + "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE", + "SampleAfterValue": "2503", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x4C", + "UMask": "0x1", + "BriefDescription": "Demand load dispatches that hit L1D fill buffer (FB) allocated for software prefetch.", + "Counter": "0,1,2,3", + "EventName": "LOAD_HIT_PRE.SW_PF", + "PublicDescription": "Counts all not software-prefetch load dispatches that hit the fill buffer (FB) allocated for the software prefetch. It can also be incremented by some lock instructions. So it should only be used with profiling so that the locks can be excluded by ASM (Assembly File) inspection of the nearby instructions.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x5E", + "UMask": "0x1", + "BriefDescription": "Cycles when Reservation Station (RS) is empty for the thread", + "Counter": "0,1,2,3", + "EventName": "RS_EVENTS.EMPTY_CYCLES", + "PublicDescription": "Counts cycles during which the reservation station (RS) is empty for the thread.; Note: In ST-mode, not active thread should drive 0. 
This is usually caused by severely costly branch mispredictions, or allocator/FE issues.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EdgeDetect": "1", + "Invert": "1", + "EventCode": "0x5E", + "UMask": "0x1", + "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.", + "Counter": "0,1,2,3", + "EventName": "RS_EVENTS.EMPTY_END", + "CounterMask": "1", + "PublicDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate front-end Latency Bound issues.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x87", + "UMask": "0x1", + "BriefDescription": "Stalls caused by changing prefix length of the instruction.", + "Counter": "0,1,2,3", + "EventName": "ILD_STALL.LCP", + "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. 
This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA1", + "UMask": "0x1", + "BriefDescription": "Cycles per thread when uops are executed in port 0", + "Counter": "0,1,2,3", + "EventName": "UOPS_DISPATCHED_PORT.PORT_0", + "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 0.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA1", + "UMask": "0x2", + "BriefDescription": "Cycles per thread when uops are executed in port 1", + "Counter": "0,1,2,3", + "EventName": "UOPS_DISPATCHED_PORT.PORT_1", + "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 1.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA1", + "UMask": "0x4", + "BriefDescription": "Cycles per thread when uops are executed in port 2", + "Counter": "0,1,2,3", + "EventName": "UOPS_DISPATCHED_PORT.PORT_2", + "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 2.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA1", + "UMask": "0x8", + "BriefDescription": "Cycles per thread when uops are executed in port 3", + "Counter": "0,1,2,3", + "EventName": "UOPS_DISPATCHED_PORT.PORT_3", + "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 3.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA1", + "UMask": "0x10", + "BriefDescription": "Cycles per thread when uops are executed in port 4", + "Counter": 
"0,1,2,3", + "EventName": "UOPS_DISPATCHED_PORT.PORT_4", + "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 4.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA1", + "UMask": "0x20", + "BriefDescription": "Cycles per thread when uops are executed in port 5", + "Counter": "0,1,2,3", + "EventName": "UOPS_DISPATCHED_PORT.PORT_5", + "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 5.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA1", + "UMask": "0x40", + "BriefDescription": "Cycles per thread when uops are executed in port 6", + "Counter": "0,1,2,3", + "EventName": "UOPS_DISPATCHED_PORT.PORT_6", + "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 6.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA1", + "UMask": "0x80", + "BriefDescription": "Cycles per thread when uops are executed in port 7", + "Counter": "0,1,2,3", + "EventName": "UOPS_DISPATCHED_PORT.PORT_7", + "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 7.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA2", + "UMask": "0x1", + "BriefDescription": "Resource-related stall cycles", + "Counter": "0,1,2,3", + "EventName": "RESOURCE_STALLS.ANY", + "PublicDescription": "Counts resource-related stall cycles. Reasons for stalls can be as follows:a. *any* u-arch structure got full (LB, SB, RS, ROB, BOB, LM, Physical Register Reclaim Table (PRRT), or Physical History Table (PHT) slots).b. 
*any* u-arch structure got empty (like INT/SIMD FreeLists).c. FPU control word (FPCW), MXCSR.and others. This counts cycles that the pipeline back-end blocked uop delivery from the front-end.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA2", + "UMask": "0x8", + "BriefDescription": "Cycles stalled due to no store buffers available. (not including draining form sync).", + "Counter": "0,1,2,3", + "EventName": "RESOURCE_STALLS.SB", + "PublicDescription": "Counts allocation stall cycles caused by the store buffer (SB) being full. This counts cycles that the pipeline back-end blocked uop delivery from the front-end.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA3", + "UMask": "0x1", + "BriefDescription": "Cycles while L2 cache miss demand load is outstanding.", + "Counter": "0,1,2,3", + "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS", + "CounterMask": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA3", + "UMask": "0x4", + "BriefDescription": "Total execution stalls.", + "Counter": "0,1,2,3", + "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL", + "CounterMask": "4", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA3", + "UMask": "0x5", + "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.", + "Counter": "0,1,2,3", + "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS", + "CounterMask": "5", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA3", + "UMask": "0x8", + "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.", + "Counter": "0,1,2,3", + "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS", + "CounterMask": "8", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA3", + "UMask": "0xc", + "BriefDescription": "Execution stalls while L1 cache 
miss demand load is outstanding.", + "Counter": "0,1,2,3", + "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS", + "CounterMask": "12", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA3", + "UMask": "0x10", + "BriefDescription": "Cycles while memory subsystem has an outstanding load.", + "Counter": "0,1,2,3", + "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY", + "CounterMask": "16", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA3", + "UMask": "0x14", + "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.", + "Counter": "0,1,2,3", + "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY", + "CounterMask": "20", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xA6", + "UMask": "0x1", + "BriefDescription": "Cycles where no uops were executed, the Reservation Station was not empty, the Store Buffer was full and there was no outstanding load.", + "Counter": "0,1,2,3", + "EventName": "EXE_ACTIVITY.EXE_BOUND_0_PORTS", + "PublicDescription": "Counts cycles during which no uops were executed on all ports and Reservation Station (RS) was not empty.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA6", + "UMask": "0x2", + "BriefDescription": "Cycles total of 1 uop is executed on all ports and Reservation Station was not empty.", + "Counter": "0,1,2,3", + "EventName": "EXE_ACTIVITY.1_PORTS_UTIL", + "PublicDescription": "Counts cycles during which a total of 1 uop was executed on all ports and Reservation Station (RS) was not empty.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA6", + "UMask": "0x4", + "BriefDescription": "Cycles total of 2 uops are executed on all ports and Reservation Station was not empty.", + "Counter": "0,1,2,3", + "EventName": "EXE_ACTIVITY.2_PORTS_UTIL", + "PublicDescription": "Counts cycles during which a 
total of 2 uops were executed on all ports and Reservation Station (RS) was not empty.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA6", + "UMask": "0x8", + "BriefDescription": "Cycles total of 3 uops are executed on all ports and Reservation Station was not empty.", + "Counter": "0,1,2,3", + "EventName": "EXE_ACTIVITY.3_PORTS_UTIL", + "PublicDescription": "Cycles total of 3 uops are executed on all ports and Reservation Station (RS) was not empty.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA6", + "UMask": "0x10", + "BriefDescription": "Cycles total of 4 uops are executed on all ports and Reservation Station was not empty.", + "Counter": "0,1,2,3", + "EventName": "EXE_ACTIVITY.4_PORTS_UTIL", + "PublicDescription": "Cycles total of 4 uops are executed on all ports and Reservation Station (RS) was not empty.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA6", + "UMask": "0x40", + "BriefDescription": "Cycles where the Store Buffer was full and no outstanding load.", + "Counter": "0,1,2,3", + "EventName": "EXE_ACTIVITY.BOUND_ON_STORES", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA8", + "UMask": "0x1", + "BriefDescription": "Number of Uops delivered by the LSD.", + "Counter": "0,1,2,3", + "EventName": "LSD.UOPS", + "PublicDescription": "Number of uops delivered to the back-end by the LSD(Loop Stream Detector).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA8", + "UMask": "0x1", + "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.", + "Counter": "0,1,2,3", + "EventName": "LSD.CYCLES_ACTIVE", + "CounterMask": "1", + "PublicDescription": "Counts the cycles when at least one uop is delivered by the LSD (Loop-stream detector).", + "SampleAfterValue": "2000003", + "CounterHTOff": 
"0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xA8", + "UMask": "0x1", + "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.", + "Counter": "0,1,2,3", + "EventName": "LSD.CYCLES_4_UOPS", + "CounterMask": "4", + "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB1", + "UMask": "0x1", + "BriefDescription": "Counts the number of uops to be executed per-thread each cycle.", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.THREAD", + "PublicDescription": "Number of uops to be executed per-thread each cycle.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "Invert": "1", + "EventCode": "0xB1", + "UMask": "0x1", + "BriefDescription": "Counts number of cycles no uops were dispatched to be executed on this thread.", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.STALL_CYCLES", + "CounterMask": "1", + "PublicDescription": "Counts cycles during which no uops were dispatched from the Reservation Station (RS) per thread.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB1", + "UMask": "0x1", + "BriefDescription": "Cycles where at least 1 uop was executed per-thread", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CYCLES_GE_1_UOP_EXEC", + "CounterMask": "1", + "PublicDescription": "Cycles where at least 1 uop was executed per-thread.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB1", + "UMask": "0x1", + "BriefDescription": "Cycles where at least 2 uops were executed per-thread", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CYCLES_GE_2_UOPS_EXEC", + "CounterMask": "2", + "PublicDescription": "Cycles where at least 2 uops were executed per-thread.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": 
"0xB1", + "UMask": "0x1", + "BriefDescription": "Cycles where at least 3 uops were executed per-thread", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CYCLES_GE_3_UOPS_EXEC", + "CounterMask": "3", + "PublicDescription": "Cycles where at least 3 uops were executed per-thread.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB1", + "UMask": "0x1", + "BriefDescription": "Cycles where at least 4 uops were executed per-thread", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CYCLES_GE_4_UOPS_EXEC", + "CounterMask": "4", + "PublicDescription": "Cycles where at least 4 uops were executed per-thread.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB1", + "UMask": "0x2", + "BriefDescription": "Number of uops executed on the core.", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CORE", + "PublicDescription": "Number of uops executed from any thread.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB1", + "UMask": "0x2", + "BriefDescription": "Cycles at least 1 micro-op is executed from any thread on physical core.", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CORE_CYCLES_GE_1", + "CounterMask": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB1", + "UMask": "0x2", + "BriefDescription": "Cycles at least 2 micro-op is executed from any thread on physical core.", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CORE_CYCLES_GE_2", + "CounterMask": "2", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB1", + "UMask": "0x2", + "BriefDescription": "Cycles at least 3 micro-op is executed from any thread on physical core.", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CORE_CYCLES_GE_3", + "CounterMask": "3", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + 
"EventCode": "0xB1", + "UMask": "0x2", + "BriefDescription": "Cycles at least 4 micro-op is executed from any thread on physical core.", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CORE_CYCLES_GE_4", + "CounterMask": "4", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "Invert": "1", + "EventCode": "0xB1", + "UMask": "0x2", + "BriefDescription": "Cycles with no micro-ops executed from any thread on physical core.", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.CORE_CYCLES_NONE", + "CounterMask": "1", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xB1", + "UMask": "0x10", + "BriefDescription": "Counts the number of x87 uops dispatched.", + "Counter": "0,1,2,3", + "EventName": "UOPS_EXECUTED.X87", + "PublicDescription": "Counts the number of x87 uops executed.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC0", + "UMask": "0x0", + "BriefDescription": "Number of instructions retired. General Counter - architectural event", + "Counter": "0,1,2,3", + "EventName": "INST_RETIRED.ANY_P", + "Errata": "SKL091, SKL044", + "PublicDescription": "Counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC0", + "UMask": "0x1", + "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution", + "PEBS": "2", + "Counter": "1", + "EventName": "INST_RETIRED.PREC_DIST", + "Errata": "SKL091, SKL044", + "PublicDescription": "A version of INST_RETIRED that allows for a more unbiased distribution of samples across instructions retired. 
It utilizes the Precise Distribution of Instructions Retired (PDIR) feature to mitigate some bias in how retired instructions get sampled.", + "SampleAfterValue": "2000003", + "CounterHTOff": "1" + }, + { + "Invert": "1", + "EventCode": "0xC0", + "UMask": "0x1", + "BriefDescription": "Number of cycles using always true condition applied to PEBS instructions retired event.", + "PEBS": "2", + "Counter": "0,2,3", + "EventName": "INST_RETIRED.TOTAL_CYCLES_PS", + "CounterMask": "10", + "Errata": "SKL091, SKL044", + "PublicDescription": "Number of cycles using an always true condition applied to PEBS instructions retired event. (inst_ret< 16)", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,2,3" + }, + { + "EventCode": "0xC1", + "UMask": "0x3f", + "BriefDescription": "Number of times a microcode assist is invoked by HW other than FP-assist. Examples include AD (page Access Dirty) and AVX* related assists.", + "Counter": "0,1,2,3", + "EventName": "OTHER_ASSISTS.ANY", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC2", + "UMask": "0x2", + "BriefDescription": "Retirement slots used.", + "Counter": "0,1,2,3", + "EventName": "UOPS_RETIRED.RETIRE_SLOTS", + "PublicDescription": "Counts the retirement slots used.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "Invert": "1", + "EventCode": "0xC2", + "UMask": "0x2", + "BriefDescription": "Cycles without actually retired uops.", + "Counter": "0,1,2,3", + "EventName": "UOPS_RETIRED.STALL_CYCLES", + "CounterMask": "1", + "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts cycles without actually retired uops.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "Invert": "1", + "EventCode": "0xC2", + "UMask": "0x2", + "BriefDescription": "Cycles with less than 10 actually retired uops.", + "Counter": "0,1,2,3", + "EventName": "UOPS_RETIRED.TOTAL_CYCLES", + 
"CounterMask": "10", + "PublicDescription": "Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EdgeDetect": "1", + "EventCode": "0xC3", + "UMask": "0x1", + "BriefDescription": "Number of machine clears (nukes) of any type. ", + "Counter": "0,1,2,3", + "EventName": "MACHINE_CLEARS.COUNT", + "CounterMask": "1", + "PublicDescription": "Number of machine clears (nukes) of any type.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC3", + "UMask": "0x4", + "BriefDescription": "Self-modifying code (SMC) detected.", + "Counter": "0,1,2,3", + "EventName": "MACHINE_CLEARS.SMC", + "PublicDescription": "Counts self-modifying code (SMC) detected, which causes a machine clear.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC4", + "UMask": "0x0", + "BriefDescription": "All (macro) branch instructions retired.", + "Counter": "0,1,2,3", + "EventName": "BR_INST_RETIRED.ALL_BRANCHES", + "Errata": "SKL091", + "PublicDescription": "Counts all (macro) branch instructions retired.", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC4", + "UMask": "0x1", + "BriefDescription": "Conditional branch instructions retired.", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "BR_INST_RETIRED.CONDITIONAL", + "Errata": "SKL091", + "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts conditional branch instructions retired.", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC4", + "UMask": "0x2", + "BriefDescription": "Direct and indirect near call instructions retired.", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "BR_INST_RETIRED.NEAR_CALL", + "Errata": "SKL091", + "PublicDescription": "This is a non-precise 
version (that is, does not use PEBS) of the event that counts both direct and indirect near call instructions retired.", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC4", + "UMask": "0x4", + "BriefDescription": "All (macro) branch instructions retired. ", + "PEBS": "2", + "Counter": "0,1,2,3", + "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS", + "Errata": "SKL091", + "PublicDescription": "This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC4", + "UMask": "0x8", + "BriefDescription": "Return instructions retired.", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "BR_INST_RETIRED.NEAR_RETURN", + "Errata": "SKL091", + "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts return instructions retired.", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC4", + "UMask": "0x10", + "BriefDescription": "Not taken branch instructions retired.", + "Counter": "0,1,2,3", + "EventName": "BR_INST_RETIRED.NOT_TAKEN", + "Errata": "SKL091", + "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts not taken branch instructions retired.", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC4", + "UMask": "0x20", + "BriefDescription": "Taken branch instructions retired.", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "BR_INST_RETIRED.NEAR_TAKEN", + "Errata": "SKL091", + "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts taken branch instructions retired.", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC4", + "UMask": "0x40", + "BriefDescription": "Far branch instructions retired.", + 
"PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "BR_INST_RETIRED.FAR_BRANCH", + "Errata": "SKL091", + "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts far branch instructions retired.", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC5", + "UMask": "0x0", + "BriefDescription": "All mispredicted macro branch instructions retired.", + "Counter": "0,1,2,3", + "EventName": "BR_MISP_RETIRED.ALL_BRANCHES", + "PublicDescription": "Counts all the retired branch instructions that were mispredicted by the processor. A branch misprediction occurs when the processor incorrectly predicts the destination of the branch. When the misprediction is discovered at execution, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC5", + "UMask": "0x1", + "BriefDescription": "Mispredicted conditional branch instructions retired.", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "BR_MISP_RETIRED.CONDITIONAL", + "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted conditional branch instructions retired.", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC5", + "UMask": "0x2", + "BriefDescription": "Mispredicted direct and indirect near call instructions retired.", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "BR_MISP_RETIRED.NEAR_CALL", + "PublicDescription": "Counts both taken and not taken retired mispredicted direct and indirect near calls, including both register and memory indirect.", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xC5", + "UMask": "0x4", + "BriefDescription": "Mispredicted macro branch instructions retired. 
", + "PEBS": "2", + "Counter": "0,1,2,3", + "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS", + "PublicDescription": "This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3" + }, + { + "EventCode": "0xC5", + "UMask": "0x20", + "BriefDescription": "Number of near branch instructions retired that were mispredicted and taken.", + "PEBS": "1", + "Counter": "0,1,2,3", + "EventName": "BR_MISP_RETIRED.NEAR_TAKEN", + "SampleAfterValue": "400009", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xCC", + "UMask": "0x20", + "BriefDescription": "Increments whenever there is an update to the LBR array.", + "Counter": "0,1,2,3", + "EventName": "ROB_MISC_EVENTS.LBR_INSERTS", + "PublicDescription": "Increments when an entry is added to the Last Branch Record (LBR) array (or removed from the array in case of RETURNs in call stack mode). The event requires LBR enable via IA32_DEBUGCTL MSR and branch type selection via MSR_LBR_SELECT.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xE6", + "UMask": "0x1", + "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.", + "Counter": "0,1,2,3", + "EventName": "BACLEARS.ANY", + "PublicDescription": "Counts the number of times the front-end is resteered when it finds a branch instruction in a fetch line. 
This occurs for the first time a branch instruction is fetched or when the branch is not tracked by the BPU (Branch Prediction Unit) anymore.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + } +] \ No newline at end of file diff --git a/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json b/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json new file mode 100644 index 000000000000..70750dab7ead --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json @@ -0,0 +1,284 @@ +[ + { + "EventCode": "0x08", + "UMask": "0x1", + "BriefDescription": "Load misses in all DTLB levels that cause page walks", + "Counter": "0,1,2,3", + "EventName": "DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK", + "PublicDescription": "Counts demand data loads that caused a page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels, but the walk need not have completed.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x08", + "UMask": "0x2", + "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes (4K).", + "Counter": "0,1,2,3", + "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_4K", + "PublicDescription": "Counts demand data loads that caused a completed page walk (4K page size). This implies it missed in all TLB levels. The page walk can end with or without a fault.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x08", + "UMask": "0x4", + "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes (2M/4M).", + "Counter": "0,1,2,3", + "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M", + "PublicDescription": "Counts demand data loads that caused a completed page walk (2M and 4M page sizes). This implies it missed in all TLB levels. 
The page walk can end with or without a fault.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x08", + "UMask": "0x8", + "BriefDescription": "Load miss in all TLB levels causes a page walk that completes. (1G)", + "Counter": "0,1,2,3", + "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_1G", + "PublicDescription": "Counts load misses in all DTLB levels that cause a completed page walk (1G page size). The page walk can end with or without a fault.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x08", + "UMask": "0xe", + "BriefDescription": "Load miss in all TLB levels causes a page walk that completes. (All page sizes)", + "Counter": "0,1,2,3", + "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED", + "PublicDescription": "Counts demand data loads that caused a completed page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels. The page walk can end with or without a fault.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x08", + "UMask": "0x10", + "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a load. EPT page walk duration are excluded in Skylake. ", + "Counter": "0,1,2,3", + "EventName": "DTLB_LOAD_MISSES.WALK_PENDING", + "PublicDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a load. EPT page walk duration are excluded in Skylake microarchitecture. ", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x08", + "UMask": "0x10", + "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a load. EPT page walk duration are excluded in Skylake. 
", + "Counter": "0,1,2,3", + "EventName": "DTLB_LOAD_MISSES.WALK_ACTIVE", + "CounterMask": "1", + "PublicDescription": "Counts cycles when at least one PMH (Page Miss Handler) is busy with a page walk for a load.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x08", + "UMask": "0x20", + "BriefDescription": "Loads that miss the DTLB and hit the STLB.", + "Counter": "0,1,2,3", + "EventName": "DTLB_LOAD_MISSES.STLB_HIT", + "PublicDescription": "Counts loads that miss the DTLB (Data TLB) and hit the STLB (Second level TLB).", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x49", + "UMask": "0x1", + "BriefDescription": "Store misses in all DTLB levels that cause page walks", + "Counter": "0,1,2,3", + "EventName": "DTLB_STORE_MISSES.MISS_CAUSES_A_WALK", + "PublicDescription": "Counts demand data stores that caused a page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels, but the walk need not have completed.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x49", + "UMask": "0x2", + "BriefDescription": "Store miss in all TLB levels causes a page walk that completes. (4K)", + "Counter": "0,1,2,3", + "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_4K", + "PublicDescription": "Counts demand data stores that caused a completed page walk (4K page size). This implies it missed in all TLB levels. The page walk can end with or without a fault.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x49", + "UMask": "0x4", + "BriefDescription": "Store misses in all DTLB levels that cause completed page walks (2M/4M)", + "Counter": "0,1,2,3", + "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M", + "PublicDescription": "Counts demand data stores that caused a completed page walk (2M and 4M page sizes). This implies it missed in all TLB levels. 
The page walk can end with or without a fault.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x49", + "UMask": "0x8", + "BriefDescription": "Store misses in all DTLB levels that cause completed page walks (1G)", + "Counter": "0,1,2,3", + "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_1G", + "PublicDescription": "Counts store misses in all DTLB levels that cause a completed page walk (1G page size). The page walk can end with or without a fault.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x49", + "UMask": "0xe", + "BriefDescription": "Store misses in all TLB levels causes a page walk that completes. (All page sizes)", + "Counter": "0,1,2,3", + "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED", + "PublicDescription": "Counts demand data stores that caused a completed page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels. The page walk can end with or without a fault.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x49", + "UMask": "0x10", + "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a store. EPT page walk duration are excluded in Skylake. ", + "Counter": "0,1,2,3", + "EventName": "DTLB_STORE_MISSES.WALK_PENDING", + "PublicDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a store. EPT page walk duration are excluded in Skylake microarchitecture. ", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x49", + "UMask": "0x10", + "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a store. EPT page walk duration are excluded in Skylake. 
", + "Counter": "0,1,2,3", + "EventName": "DTLB_STORE_MISSES.WALK_ACTIVE", + "CounterMask": "1", + "PublicDescription": "Counts cycles when at least one PMH (Page Miss Handler) is busy with a page walk for a store.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x49", + "UMask": "0x20", + "BriefDescription": "Stores that miss the DTLB and hit the STLB.", + "Counter": "0,1,2,3", + "EventName": "DTLB_STORE_MISSES.STLB_HIT", + "PublicDescription": "Stores that miss the DTLB (Data TLB) and hit the STLB (2nd Level TLB).", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x4F", + "UMask": "0x10", + "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a EPT (Extended Page Table) walk for any request type.", + "Counter": "0,1,2,3", + "EventName": "EPT.WALK_PENDING", + "PublicDescription": "Counts cycles for each PMH (Page Miss Handler) that is busy with an EPT (Extended Page Table) walk for any request type.", + "SampleAfterValue": "2000003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x85", + "UMask": "0x1", + "BriefDescription": "Misses at all ITLB levels that cause page walks", + "Counter": "0,1,2,3", + "EventName": "ITLB_MISSES.MISS_CAUSES_A_WALK", + "PublicDescription": "Counts page walks of any page size (4K/2M/4M/1G) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB, but the walk need not have completed.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x85", + "UMask": "0x2", + "BriefDescription": "Code miss in all TLB levels causes a page walk that completes. (4K)", + "Counter": "0,1,2,3", + "EventName": "ITLB_MISSES.WALK_COMPLETED_4K", + "PublicDescription": "Counts completed page walks (4K page size) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. 
The page walk can end with or without a fault.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x85", + "UMask": "0x4", + "BriefDescription": "Code miss in all TLB levels causes a page walk that completes. (2M/4M)", + "Counter": "0,1,2,3", + "EventName": "ITLB_MISSES.WALK_COMPLETED_2M_4M", + "PublicDescription": "Counts completed page walks of any page size (4K/2M/4M/1G) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. The page walk can end with or without a fault.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x85", + "UMask": "0x8", + "BriefDescription": "Code miss in all TLB levels causes a page walk that completes. (1G)", + "Counter": "0,1,2,3", + "EventName": "ITLB_MISSES.WALK_COMPLETED_1G", + "PublicDescription": "Counts store misses in all DTLB levels that cause a completed page walk (1G page size). The page walk can end with or without a fault.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x85", + "UMask": "0xe", + "BriefDescription": "Code miss in all TLB levels causes a page walk that completes. (All page sizes)", + "Counter": "0,1,2,3", + "EventName": "ITLB_MISSES.WALK_COMPLETED", + "PublicDescription": "Counts completed page walks (2M and 4M page sizes) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. The page walk can end with or without a fault.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x85", + "UMask": "0x10", + "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake. ", + "Counter": "0,1,2,3", + "EventName": "ITLB_MISSES.WALK_PENDING", + "PublicDescription": "Counts 1 per cycle for each PMH (Page Miss Handler) that is busy with a page walk for an instruction fetch request. 
EPT page walk duration are excluded in Skylake michroarchitecture. ", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x85", + "UMask": "0x10", + "BriefDescription": "Cycles when at least one PMH is busy with a page walk for code (instruction fetch) request. EPT page walk duration are excluded in Skylake.", + "Counter": "0,1,2,3", + "EventName": "ITLB_MISSES.WALK_ACTIVE", + "CounterMask": "1", + "PublicDescription": "Cycles when at least one PMH is busy with a page walk for code (instruction fetch) request. EPT page walk duration are excluded in Skylake microarchitecture.", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0x85", + "UMask": "0x20", + "BriefDescription": "Instruction fetch requests that miss the ITLB and hit the STLB.", + "Counter": "0,1,2,3", + "EventName": "ITLB_MISSES.STLB_HIT", + "SampleAfterValue": "100003", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xAE", + "UMask": "0x1", + "BriefDescription": "Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages.", + "Counter": "0,1,2,3", + "EventName": "ITLB.ITLB_FLUSH", + "PublicDescription": "Counts the number of flushes of the big or small ITLB pages. 
Counting include both TLB Flush (covering all sets) and TLB Set Clear (set-specific).", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xBD", + "UMask": "0x1", + "BriefDescription": "DTLB flush attempts of the thread-specific entries", + "Counter": "0,1,2,3", + "EventName": "TLB_FLUSH.DTLB_THREAD", + "PublicDescription": "Counts the number of DTLB flush attempts of the thread-specific entries.", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3,4,5,6,7" + }, + { + "EventCode": "0xBD", + "UMask": "0x20", + "BriefDescription": "STLB flush attempts", + "Counter": "0,1,2,3", + "EventName": "TLB_FLUSH.STLB_ANY", + "PublicDescription": "Counts the number of any STLB flush attempts (such as entire, VPID, PCID, InvPage, CR3 write, etc.).", + "SampleAfterValue": "100007", + "CounterHTOff": "0,1,2,3,4,5,6,7" + } +] \ No newline at end of file -- 2.13.5 ^ permalink raw reply related [flat|nested] 51+ messages in thread
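For readers following the core event list in the patch above: each entry carries an "EventCode", a "UMask" and, for the *_ACTIVE variants, a "CounterMask". The sketch below shows how those three fields are commonly packed into a raw x86 core PMU config value. It assumes the usual Intel layout (event select in bits 0-7, unit mask in bits 8-15, counter mask in bits 24-31) and a single hex EventCode per entry. The helper names raw_config and perf_event_string are hypothetical and are not taken from perf's own JSON handling, which covers more fields than this.

```python
import json

# Two entries copied from the patch above, trimmed to the fields used here.
EVENTS_JSON = """
[
  {"EventCode": "0x08", "UMask": "0x10", "CounterMask": "1",
   "EventName": "DTLB_LOAD_MISSES.WALK_ACTIVE"},
  {"EventCode": "0x08", "UMask": "0x20",
   "EventName": "DTLB_LOAD_MISSES.STLB_HIT"}
]
"""


def raw_config(event: dict) -> int:
    """Pack EventCode, UMask and CounterMask into one raw config value.

    Assumed layout: event select in bits 0-7, unit mask in bits 8-15,
    counter mask in bits 24-31. A missing CounterMask is treated as 0.
    """
    code = int(event["EventCode"], 16)
    umask = int(event["UMask"], 16)
    cmask = int(event.get("CounterMask", "0"), 0)
    return code | (umask << 8) | (cmask << 24)


def perf_event_string(event: dict) -> str:
    """Build a cpu/event=...,umask=...[,cmask=...]/ term for a perf command line."""
    code = int(event["EventCode"], 16)
    umask = int(event["UMask"], 16)
    cmask = int(event.get("CounterMask", "0"), 0)
    term = f"cpu/event={code:#x},umask={umask:#x}"
    if cmask:
        term += f",cmask={cmask}"
    return term + "/"


if __name__ == "__main__":
    for entry in json.loads(EVENTS_JSON):
        print(entry["EventName"], hex(raw_config(entry)), perf_event_string(entry))
```

Under these assumptions, WALK_PENDING and WALK_ACTIVE share the same event code and unit mask and differ only in the counter mask, which is what turns a per-cycle occupancy count into a count of cycles with at least one busy PMH.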
* [PATCH 09/15] perf vendor events: Add Skylake server uncore event list 2017-08-23 19:35 [GIT PULL 00/15] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (7 preceding siblings ...) 2017-08-23 19:36 ` [PATCH 08/15] perf vendor events: Add core event list for Skylake Server Arnaldo Carvalho de Melo @ 2017-08-23 19:36 ` Arnaldo Carvalho de Melo 2017-08-23 19:36 ` [PATCH 10/15] perf tools: Add support for printing new mem_info encodings Arnaldo Carvalho de Melo ` (5 subsequent siblings) 14 siblings, 0 replies; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo From: Andi Kleen <ak@linux.intel.com> Add JSON uncore events for Skylake Server to perf. Based on JSON list V1.01. This is a much fuller list than with earlier uncores, including more low level (but also harder to understand) events. It does not include the "experimental" events. The previous high level metrics (LLC_* etc.) are still available when applicable. C state power events are not included at this point. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20170816220553.GA19463@tassilo.jf.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- .../arch/x86/skylakex/uncore-memory.json | 172 +++ .../pmu-events/arch/x86/skylakex/uncore-other.json | 1156 ++++++++++++++++++++ 2 files changed, 1328 insertions(+) create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/uncore-other.json diff --git a/tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json b/tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json new file mode 100644 index 000000000000..9c7e5f8beee2 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json @@ -0,0 +1,172 @@ +[ + { + "BriefDescription": "read requests to memory controller.
Derived from unc_m_cas_count.rd", + "Counter": "0,1,2,3", + "EventCode": "0x4", + "EventName": "LLC_MISSES.MEM_READ", + "PerPkg": "1", + "ScaleUnit": "64Bytes", + "UMask": "0x3", + "Unit": "iMC" + }, + { + "BriefDescription": "write requests to memory controller. Derived from unc_m_cas_count.wr", + "Counter": "0,1,2,3", + "EventCode": "0x4", + "EventName": "LLC_MISSES.MEM_WRITE", + "PerPkg": "1", + "ScaleUnit": "64Bytes", + "UMask": "0xC", + "Unit": "iMC" + }, + { + "BriefDescription": "Memory controller clock ticks", + "Counter": "0,1,2,3", + "EventName": "UNC_M_CLOCKTICKS", + "PerPkg": "1", + "Unit": "iMC" + }, + { + "BriefDescription": "Cycles where DRAM ranks are in power down (CKE) mode", + "Counter": "0,1,2,3", + "EventCode": "0x85", + "EventName": "UNC_M_POWER_CHANNEL_PPD", + "MetricExpr": "(UNC_M_POWER_CHANNEL_PPD / UNC_M_CLOCKTICKS) * 100.", + "MetricName": "power_channel_ppd %", + "PerPkg": "1", + "Unit": "iMC" + }, + { + "BriefDescription": "Cycles Memory is in self refresh power mode", + "Counter": "0,1,2,3", + "EventCode": "0x43", + "EventName": "UNC_M_POWER_SELF_REFRESH", + "MetricExpr": "(UNC_M_POWER_SELF_REFRESH / UNC_M_CLOCKTICKS) * 100.", + "MetricName": "power_self_refresh %", + "PerPkg": "1", + "Unit": "iMC" + }, + { + "BriefDescription": "Pre-charges due to page misses", + "Counter": "0,1,2,3", + "EventCode": "0x2", + "EventName": "UNC_M_PRE_COUNT.PAGE_MISS", + "PerPkg": "1", + "UMask": "0x1", + "Unit": "iMC" + }, + { + "BriefDescription": "Pre-charge for reads", + "Counter": "0,1,2,3", + "EventCode": "0x2", + "EventName": "UNC_M_PRE_COUNT.RD", + "PerPkg": "1", + "UMask": "0x4", + "Unit": "iMC" + }, + { + "BriefDescription": "Pre-charge for writes", + "Counter": "0,1,2,3", + "EventCode": "0x2", + "EventName": "UNC_M_PRE_COUNT.WR", + "PerPkg": "1", + "UMask": "0x8", + "Unit": "iMC" + }, + { + "BriefDescription": "DRAM Page Activate commands sent due to a write request", + "Counter": "0,1,2,3", + "EventCode": "0x1", + "EventName": 
"UNC_M_ACT_COUNT.WR", + "PerPkg": "1", + "PublicDescription": "Counts DRAM Page Activate commands sent on this channel due to a write request to the iMC (Memory Controller). Activate commands are issued to open up a page on the DRAM devices so that it can be read or written to with a CAS (Column Access Select) command.", + "UMask": "0x2", + "Unit": "iMC" + }, + { + "BriefDescription": "All DRAM CAS Commands issued", + "Counter": "0,1,2,3", + "EventCode": "0x4", + "EventName": "UNC_M_CAS_COUNT.ALL", + "PerPkg": "1", + "PublicDescription": "Counts all CAS (Column Address Select) commands issued to DRAM per memory channel. CAS commands are issued to specify the address to read or write on DRAM, so this event increments for every read and write. This event counts whether AutoPrecharge (which closes the DRAM Page automatically after a read/write) is enabled or not.", + "UMask": "0xF", + "Unit": "iMC" + }, + { + "BriefDescription": "read requests to memory controller. Derived from unc_m_cas_count.rd", + "Counter": "0,1,2,3", + "EventCode": "0x4", + "EventName": "LLC_MISSES.MEM_READ", + "PerPkg": "1", + "ScaleUnit": "64Bytes", + "UMask": "0x3", + "Unit": "iMC" + }, + { + "BriefDescription": "All DRAM Read CAS Commands issued (does not include underfills) ", + "Counter": "0,1,2,3", + "EventCode": "0x4", + "EventName": "UNC_M_CAS_COUNT.RD_REG", + "PerPkg": "1", + "PublicDescription": "Counts CAS (Column Access Select) regular read commands issued to DRAM on a per channel basis. CAS commands are issued to specify the address to read or write on DRAM, and this event increments for every regular read. This event only counts regular reads and does not includes underfill reads due to partial write requests. 
This event counts whether AutoPrecharge (which closes the DRAM Page automatically after a read/write) is enabled or not.", + "UMask": "0x1", + "Unit": "iMC" + }, + { + "BriefDescription": "DRAM Underfill Read CAS Commands issued", + "Counter": "0,1,2,3", + "EventCode": "0x4", + "EventName": "UNC_M_CAS_COUNT.RD_UNDERFILL", + "PerPkg": "1", + "PublicDescription": "Counts CAS (Column Access Select) underfill read commands issued to DRAM due to a partial write, on a per channel basis. CAS commands are issued to specify the address to read or write on DRAM, and this command counts underfill reads. Partial writes must be completed by first reading in the underfill from DRAM and then merging in the partial write data before writing the full line back to DRAM. This event will generally count about the same as the number of partial writes, but may be slightly less because of partials hitting in the WPQ (due to a previous write request). ", + "UMask": "0x2", + "Unit": "iMC" + }, + { + "BriefDescription": "write requests to memory controller. Derived from unc_m_cas_count.wr", + "Counter": "0,1,2,3", + "EventCode": "0x4", + "EventName": "LLC_MISSES.MEM_WRITE", + "PerPkg": "1", + "ScaleUnit": "64Bytes", + "UMask": "0xC", + "Unit": "iMC" + }, + { + "BriefDescription": "Read Pending Queue Allocations", + "Counter": "0,1,2,3", + "EventCode": "0x10", + "EventName": "UNC_M_RPQ_INSERTS", + "PerPkg": "1", + "PublicDescription": "Counts the number of read requests allocated into the Read Pending Queue (RPQ). This queue is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC. The requests deallocate after the read CAS command has been issued to DRAM. This event counts both Isochronous and non-Isochronous requests which were issued to the RPQ. 
", + "Unit": "iMC" + }, + { + "BriefDescription": "Read Pending Queue Occupancy", + "Counter": "0,1,2,3", + "EventCode": "0x80", + "EventName": "UNC_M_RPQ_OCCUPANCY", + "PerPkg": "1", + "PublicDescription": "Counts the number of entries in the Read Pending Queue (RPQ) at each cycle. This can then be used to calculate both the average occupancy of the queue (in conjunction with the number of cycles not empty) and the average latency in the queue (in conjunction with the number of allocations). The RPQ is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC. They deallocate from the RPQ after the CAS command has been issued to memory.", + "Unit": "iMC" + }, + { + "BriefDescription": "Write Pending Queue Allocations", + "Counter": "0,1,2,3", + "EventCode": "0x20", + "EventName": "UNC_M_WPQ_INSERTS", + "PerPkg": "1", + "PublicDescription": "Counts the number of writes requests allocated into the Write Pending Queue (WPQ). The WPQ is used to schedule writes out to the memory controller and to track the requests. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC (Memory Controller). The write requests deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon they have 'posted' to the iMC.", + "Unit": "iMC" + }, + { + "BriefDescription": "Write Pending Queue Occupancy", + "Counter": "0,1,2,3", + "EventCode": "0x81", + "EventName": "UNC_M_WPQ_OCCUPANCY", + "PerPkg": "1", + "PublicDescription": "Counts the number of entries in the Write Pending Queue (WPQ) at each cycle. 
This can then be used to calculate both the average queue occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations). The WPQ is used to schedule writes out to the memory controller and to track the requests.", + "Unit": "iMC" + } +] diff --git a/tools/perf/pmu-events/arch/x86/skylakex/uncore-other.json b/tools/perf/pmu-events/arch/x86/skylakex/uncore-other.json new file mode 100644 index 000000000000..de6e70e552e2 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/skylakex/uncore-other.json @@ -0,0 +1,1156 @@ +[ + { + "BriefDescription": "Uncore cache clock ticks", + "Counter": "0,1,2,3", + "EventName": "UNC_CHA_CLOCKTICKS", + "PerPkg": "1", + "Unit": "CHA" + }, + { + "BriefDescription": "LLC misses - Uncacheable reads (from cpu) . Derived from unc_cha_tor_inserts.ia_miss", + "Counter": "0,1,2,3", + "EventCode": "0x35", + "EventName": "LLC_MISSES.UNCACHEABLE", + "Filter": "config1=0x40e33", + "PerPkg": "1", + "UMask": "0x21", + "Unit": "CHA" + }, + { + "BriefDescription": "MMIO reads. Derived from unc_cha_tor_inserts.ia_miss", + "Counter": "0,1,2,3", + "EventCode": "0x35", + "EventName": "LLC_MISSES.MMIO_READ", + "Filter": "config1=0x40040e33", + "PerPkg": "1", + "UMask": "0x21", + "Unit": "CHA" + }, + { + "BriefDescription": "MMIO writes. Derived from unc_cha_tor_inserts.ia_miss", + "Counter": "0,1,2,3", + "EventCode": "0x35", + "EventName": "LLC_MISSES.MMIO_WRITE", + "Filter": "config1=0x40041e33", + "PerPkg": "1", + "UMask": "0x21", + "Unit": "CHA" + }, + { + "BriefDescription": "Streaming stores (full cache line). Derived from unc_cha_tor_inserts.ia_miss", + "Counter": "0,1,2,3", + "EventCode": "0x35", + "EventName": "LLC_REFERENCES.STREAMING_FULL", + "Filter": "config1=0x41833", + "PerPkg": "1", + "ScaleUnit": "64Bytes", + "UMask": "0x21", + "Unit": "CHA" + }, + { + "BriefDescription": "Streaming stores (partial cache line). 
Derived from unc_cha_tor_inserts.ia_miss", + "Counter": "0,1,2,3", + "EventCode": "0x35", + "EventName": "LLC_REFERENCES.STREAMING_PARTIAL", + "Filter": "config1=0x41a33", + "PerPkg": "1", + "ScaleUnit": "64Bytes", + "UMask": "0x21", + "Unit": "CHA" + }, + { + "BriefDescription": "read requests from home agent", + "Counter": "0,1,2,3", + "EventCode": "0x50", + "EventName": "UNC_CHA_REQUESTS.READS", + "PerPkg": "1", + "UMask": "0x03", + "Unit": "CHA" + }, + { + "BriefDescription": "read requests from local home agent", + "Counter": "0,1,2,3", + "EventCode": "0x50", + "EventName": "UNC_CHA_REQUESTS.READS_LOCAL", + "PerPkg": "1", + "UMask": "0x01", + "Unit": "CHA" + }, + { + "BriefDescription": "read requests from remote home agent", + "Counter": "0,1,2,3", + "EventCode": "0x50", + "EventName": "UNC_CHA_REQUESTS.READS_REMOTE", + "PerPkg": "1", + "UMask": "0x02", + "Unit": "CHA" + }, + { + "BriefDescription": "write requests from home agent", + "Counter": "0,1,2,3", + "EventCode": "0x50", + "EventName": "UNC_CHA_REQUESTS.WRITES", + "PerPkg": "1", + "UMask": "0x0C", + "Unit": "CHA" + }, + { + "BriefDescription": "write requests from local home agent", + "Counter": "0,1,2,3", + "EventCode": "0x50", + "EventName": "UNC_CHA_REQUESTS.WRITES_LOCAL", + "PerPkg": "1", + "UMask": "0x04", + "Unit": "CHA" + }, + { + "BriefDescription": "write requests from remote home agent", + "Counter": "0,1,2,3", + "EventCode": "0x50", + "EventName": "UNC_CHA_REQUESTS.WRITES_REMOTE", + "PerPkg": "1", + "UMask": "0x08", + "Unit": "CHA" + }, + { + "BriefDescription": "UPI interconnect send bandwidth for payload. Derived from unc_upi_txl_flits.all_data", + "Counter": "0,1,2,3", + "EventCode": "0x2", + "EventName": "UPI_DATA_BANDWIDTH_TX", + "PerPkg": "1", + "ScaleUnit": "7.11E-06Bytes", + "UMask": "0x0F", + "Unit": "UPI LL" + }, + { + "BriefDescription": "PCI Express bandwidth reading at IIO. 
Derived from unc_iio_data_req_of_cpu.mem_read.part0", + "Counter": "0,1", + "EventCode": "0x83", + "EventName": "LLC_MISSES.PCIE_READ", + "FCMask": "0x07", + "Filter": "ch_mask=0x1f", + "MetricExpr": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART0 + UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART1 + UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART2 + UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART3", + "MetricName": "LLC_MISSES.PCIE_READ", + "PerPkg": "1", + "PortMask": "0x01", + "ScaleUnit": "4Bytes", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "PCI Express bandwidth writing at IIO. Derived from unc_iio_data_req_of_cpu.mem_write.part0", + "Counter": "0,1", + "EventCode": "0x83", + "EventName": "LLC_MISSES.PCIE_WRITE", + "FCMask": "0x07", + "Filter": "ch_mask=0x1f", + "MetricExpr": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 +UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 +UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 +UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3", + "MetricName": "LLC_MISSES.PCIE_WRITE", + "PerPkg": "1", + "PortMask": "0x01", + "ScaleUnit": "4Bytes", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "PCI Express bandwidth writing at IIO, part 0", + "Counter": "0,1", + "EventCode": "0x83", + "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0", + "FCMask": "0x07", + "MetricExpr": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 +UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 +UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 +UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3", + "MetricName": "LLC_MISSES.PCIE_WRITE", + "PerPkg": "1", + "PortMask": "0x01", + "ScaleUnit": "4Bytes", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "PCI Express bandwidth writing at IIO, part 1", + "Counter": "0,1", + "EventCode": "0x83", + "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x02", + "ScaleUnit": "4Bytes", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "PCI Express bandwidth writing at IIO, part 2", + "Counter": 
"0,1", + "EventCode": "0x83", + "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x04", + "ScaleUnit": "4Bytes", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "PCI Express bandwidth writing at IIO, part 3", + "Counter": "0,1", + "EventCode": "0x83", + "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x08", + "ScaleUnit": "4Bytes", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "PCI Express bandwidth reading at IIO, part 0", + "Counter": "0,1", + "EventCode": "0x83", + "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART0", + "FCMask": "0x07", + "MetricExpr": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART0 + UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART1 + UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART2 + UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART3", + "MetricName": "LLC_MISSES.PCIE_READ", + "PerPkg": "1", + "PortMask": "0x01", + "ScaleUnit": "4Bytes", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "PCI Express bandwidth reading at IIO, part 1", + "Counter": "0,1", + "EventCode": "0x83", + "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART1", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x02", + "ScaleUnit": "4Bytes", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "PCI Express bandwidth reading at IIO, part 2", + "Counter": "0,1", + "EventCode": "0x83", + "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART2", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x04", + "ScaleUnit": "4Bytes", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "PCI Express bandwidth reading at IIO, part 3", + "Counter": "0,1", + "EventCode": "0x83", + "EventName": "UNC_IIO_DATA_REQ_OF_CPU.MEM_READ.PART3", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x08", + "ScaleUnit": "4Bytes", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Core Cross Snoops Issued; Multiple Core 
Requests", + "Counter": "0,1,2,3", + "EventCode": "0x33", + "EventName": "UNC_CHA_CORE_SNP.CORE_GTONE", + "PerPkg": "1", + "PublicDescription": "Counts the number of transactions that trigger a configurable number of cross snoops. Cores are snooped if the transaction looks up the cache and determines that it is necessary based on the operation type and what CoreValid bits are set. For example, if 2 CV bits are set on a data read, the cores must have the data in S state so it is not necessary to snoop them. However, if only 1 CV bit is set the core my have modified the data. If the transaction was an RFO, it would need to invalidate the lines. This event can be filtered based on who triggered the initial snoop(s).", + "UMask": "0x42", + "Unit": "CHA" + }, + { + "BriefDescription": "Core Cross Snoops Issued; Multiple Eviction", + "Counter": "0,1,2,3", + "EventCode": "0x33", + "EventName": "UNC_CHA_CORE_SNP.EVICT_GTONE", + "PerPkg": "1", + "PublicDescription": "Counts the number of transactions that trigger a configurable number of cross snoops. Cores are snooped if the transaction looks up the cache and determines that it is necessary based on the operation type and what CoreValid bits are set. For example, if 2 CV bits are set on a data read, the cores must have the data in S state so it is not necessary to snoop them. However, if only 1 CV bit is set the core my have modified the data. If the transaction was an RFO, it would need to invalidate the lines. 
This event can be filtered based on who triggered the initial snoop(s).", + "UMask": "0x82", + "Unit": "CHA" + }, + { + "BriefDescription": "Multi-socket cacheline Directory state lookups; Snoop Not Needed", + "Counter": "0,1,2,3", + "EventCode": "0x53", + "EventName": "UNC_CHA_DIR_LOOKUP.NO_SNP", + "PerPkg": "1", + "PublicDescription": "Counts transactions that looked into the multi-socket cacheline Directory state, and therefore did not send a snoop because the Directory indicated it was not needed", + "UMask": "0x02", + "Unit": "CHA" + }, + { + "BriefDescription": "Multi-socket cacheline Directory state lookups; Snoop Needed", + "Counter": "0,1,2,3", + "EventCode": "0x53", + "EventName": "UNC_CHA_DIR_LOOKUP.SNP", + "PerPkg": "1", + "PublicDescription": "Counts transactions that looked into the multi-socket cacheline Directory state, and sent one or more snoops, because the Directory indicated it was needed", + "UMask": "0x01", + "Unit": "CHA" + }, + { + "BriefDescription": "Multi-socket cacheline Directory state updates; Directory Updated memory write from the HA pipe", + "Counter": "0,1,2,3", + "EventCode": "0x54", + "EventName": "UNC_CHA_DIR_UPDATE.HA", + "PerPkg": "1", + "PublicDescription": "Counts only multi-socket cacheline Directory state updates memory writes issued from the HA pipe. This does not include memory write requests which are for I (Invalid) or E (Exclusive) cachelines.", + "UMask": "0x01", + "Unit": "CHA" + }, + { + "BriefDescription": "Multi-socket cacheline Directory state updates; Directory Updated memory write from TOR pipe", + "Counter": "0,1,2,3", + "EventCode": "0x54", + "EventName": "UNC_CHA_DIR_UPDATE.TOR", + "PerPkg": "1", + "PublicDescription": "Counts only multi-socket cacheline Directory state updates due to memory writes issued from the TOR pipe which are the result of remote transaction hitting the SF/LLC and returning data Core2Core. 
This does not include memory write requests which are for I (Invalid) or E (Exclusive) cachelines.", + "UMask": "0x02", + "Unit": "CHA" + }, + { + "BriefDescription": "Read request from a remote socket which hit in the HitMe Cache to a line In the E state", + "Counter": "0,1,2,3", + "EventCode": "0x5F", + "EventName": "UNC_CHA_HITME_HIT.EX_RDS", + "PerPkg": "1", + "PublicDescription": "Counts read requests from a remote socket which hit in the HitME cache (used to cache the multi-socket Directory state) to a line in the E(Exclusive) state. This includes the following read opcodes (RdCode, RdData, RdDataMigratory, RdCur, RdInv*, Inv*)", + "UMask": "0x01", + "Unit": "CHA" + }, + { + "BriefDescription": "Normal priority reads issued to the memory controller from the CHA", + "Counter": "0,1,2,3", + "EventCode": "0x59", + "EventName": "UNC_CHA_IMC_READS_COUNT.NORMAL", + "PerPkg": "1", + "PublicDescription": "Counts when a normal (Non-Isochronous) read is issued to any of the memory controller channels from the CHA.", + "UMask": "0x01", + "Unit": "CHA" + }, + { + "BriefDescription": "CHA to iMC Full Line Writes Issued; Full Line Non-ISOCH", + "Counter": "0,1,2,3", + "EventCode": "0x5B", + "EventName": "UNC_CHA_IMC_WRITES_COUNT.FULL", + "PerPkg": "1", + "PublicDescription": "Counts when a normal (Non-Isochronous) full line write is issued from the CHA to the any of the memory controller channels.", + "UMask": "0x01", + "Unit": "CHA" + }, + { + "BriefDescription": "Number of times that an RFO hit in S state.", + "Counter": "0,1,2,3", + "EventCode": "0x39", + "EventName": "UNC_CHA_MISC.RFO_HIT_S", + "PerPkg": "1", + "PublicDescription": "Counts when a RFO (the Read for Ownership issued before a write) request hit a cacheline in the S (Shared) state.", + "UMask": "0x08", + "Unit": "CHA" + }, + { + "BriefDescription": "Local requests for exclusive ownership of a cache line without receiving data", + "Counter": "0,1,2,3", + "EventCode": "0x50", + "EventName": 
"UNC_CHA_REQUESTS.INVITOE_LOCAL", + "PerPkg": "1", + "PublicDescription": "Counts the total number of requests coming from a unit on this socket for exclusive ownership of a cache line without receiving data (INVITOE) to the CHA.", + "UMask": "0x10", + "Unit": "CHA" + }, + { + "BriefDescription": "Local requests for exclusive ownership of a cache line without receiving data", + "Counter": "0,1,2,3", + "EventCode": "0x50", + "EventName": "UNC_CHA_REQUESTS.INVITOE_REMOTE", + "PerPkg": "1", + "PublicDescription": "Counts the total number of requests coming from a remote socket for exclusive ownership of a cache line without receiving data (INVITOE) to the CHA.", + "UMask": "0x20", + "Unit": "CHA" + }, + { + "BriefDescription": "RspCnflct* Snoop Responses Received", + "Counter": "0,1,2,3", + "EventCode": "0x5C", + "EventName": "UNC_CHA_SNOOP_RESP.RSPCNFLCTS", + "PerPkg": "1", + "PublicDescription": "Counts when a a transaction with the opcode type RspCnflct* Snoop Response was received. This is returned when a snoop finds an existing outstanding transaction in a remote caching agent. This triggers conflict resolution hardware. 
This covers both the opcode RspCnflct and RspCnflctWbI.", + "UMask": "0x40", + "Unit": "CHA" + }, + { + "BriefDescription": "RspI Snoop Responses Received", + "Counter": "0,1,2,3", + "EventCode": "0x5C", + "EventName": "UNC_CHA_SNOOP_RESP.RSPI", + "PerPkg": "1", + "PublicDescription": "Counts when a transaction with the opcode type RspI Snoop Response was received which indicates the remote cache does not have the data, or when the remote cache silently evicts data (such as when an RFO: the Read for Ownership issued before a write hits non-modified data).", + "UMask": "0x01", + "Unit": "CHA" + }, + { + "BriefDescription": "RspIFwd Snoop Responses Received", + "Counter": "0,1,2,3", + "EventCode": "0x5C", + "EventName": "UNC_CHA_SNOOP_RESP.RSPIFWD", + "PerPkg": "1", + "PublicDescription": "Counts when a a transaction with the opcode type RspIFwd Snoop Response was received which indicates a remote caching agent forwarded the data and the requesting agent is able to acquire the data in E (Exclusive) or M (modified) states. This is commonly returned with RFO (the Read for Ownership issued before a write) transactions. The snoop could have either been to a cacheline in the M,E,F (Modified, Exclusive or Forward) states.", + "UMask": "0x04", + "Unit": "CHA" + }, + { + "BriefDescription": "RspSFwd Snoop Responses Received", + "Counter": "0,1,2,3", + "EventCode": "0x5C", + "EventName": "UNC_CHA_SNOOP_RESP.RSPSFWD", + "PerPkg": "1", + "PublicDescription": "Counts when a a transaction with the opcode type RspSFwd Snoop Response was received which indicates a remote caching agent forwarded the data but held on to its current copy. 
This is common for data and code reads that hit in a remote socket in E (Exclusive) or F (Forward) state.", + "UMask": "0x08", + "Unit": "CHA" + }, + { + "BriefDescription": "Rsp*Fwd*WB Snoop Responses Received", + "Counter": "0,1,2,3", + "EventCode": "0x5C", + "EventName": "UNC_CHA_SNOOP_RESP.RSP_FWD_WB", + "PerPkg": "1", + "PublicDescription": "Counts when a transaction with the opcode type Rsp*Fwd*WB Snoop Response was received which indicates the data was written back to it's home socket, and the cacheline was forwarded to the requestor socket. This snoop response is only used in >= 4 socket systems. It is used when a snoop HITM's in a remote caching agent and it directly forwards data to a requestor, and simultaneously returns data to it's home socket to be written back to memory.", + "UMask": "0x20", + "Unit": "CHA" + }, + { + "BriefDescription": "Rsp*WB Snoop Responses Received", + "Counter": "0,1,2,3", + "EventCode": "0x5C", + "EventName": "UNC_CHA_SNOOP_RESP.RSP_WBWB", + "PerPkg": "1", + "PublicDescription": "Counts when a transaction with the opcode type Rsp*WB Snoop Response was received which indicates which indicates the data was written back to it's home. This is returned when a non-RFO request hits a cacheline in the Modified state. The Cache can either downgrade the cacheline to a S (Shared) or I (Invalid) state depending on how the system has been configured. 
This reponse will also be sent when a cache requests E (Exclusive) ownership of a cache line without receiving data, because the cache must acquire ownership.", + "UMask": "0x10", + "Unit": "CHA" + }, + { + "BriefDescription": "Clockticks of the IIO Traffic Controller", + "Counter": "0,1,2,3", + "EventCode": "0x1", + "EventName": "UNC_IIO_CLOCKTICKS", + "PerPkg": "1", + "PublicDescription": "Counts clockticks of the 1GHz trafiic controller clock in the IIO unit.", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for 4 bytes made by the CPU to IIO Part0", + "Counter": "2,3", + "EventCode": "0xC0", + "EventName": "UNC_IIO_DATA_REQ_BY_CPU.MEM_READ.PART0", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x01", + "PublicDescription": "Counts every read request for 4 bytes of data made by a unit on the main die (generally a core) to the MMIO space of a card on IIO Part0. In the general case, Part0 refers to a standard PCIe card of any size (x16,x8,x4) that is plugged directly into one of the PCIe slots. Part0 could also refer to any device plugged into the first slot of a PCIe riser card or to a device attached to the IIO unit which starts its use of the bus using lane 0 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for 4 bytes made by the CPU to IIO Part1", + "Counter": "2,3", + "EventCode": "0xC0", + "EventName": "UNC_IIO_DATA_REQ_BY_CPU.MEM_READ.PART1", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x02", + "PublicDescription": "Counts every read request for 4 bytes of data made by a unit on the main die (generally a core) to the MMIO space of a card on IIO Part1. 
In the general case, Part1 refers to a x4 PCIe card plugged into the second slot of a PCIe riser card, but it could refer to any x4 device attached to the IIO unit using lanes starting at lane 4 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for 4 bytes made by the CPU to IIO Part2", + "Counter": "2,3", + "EventCode": "0xC0", + "EventName": "UNC_IIO_DATA_REQ_BY_CPU.MEM_READ.PART2", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x04", + "PublicDescription": "Counts every read request for 4 bytes of data made by a unit on the main die (generally a core) to the MMIO space of a card on IIO Part2. In the general case, Part2 refers to a x4 or x8 PCIe card plugged into the third slot of a PCIe riser card, but it could refer to any x4 or x8 device attached to the IIO unit and using lanes starting at lane 8 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for 4 bytes made by the CPU to IIO Part3", + "Counter": "2,3", + "EventCode": "0xC0", + "EventName": "UNC_IIO_DATA_REQ_BY_CPU.MEM_READ.PART3", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x08", + "PublicDescription": "Counts every read request for 4 bytes of data made by a unit on the main die (generally a core) to the MMIO space of a card on IIO Part3. 
In the general case, Part3 refers to a x4 PCIe card plugged into the fourth slot of a PCIe riser card, but it could brefer to any device attached to the IIO unit using the lanes starting at lane 12 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of 4 bytes made to IIO Part0 by the CPU", + "Counter": "2,3", + "EventCode": "0xC0", + "EventName": "UNC_IIO_DATA_REQ_BY_CPU.MEM_WRITE.PART0", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x01", + "PublicDescription": "Counts every write request of 4 bytes of data made to the MMIO space of a card on IIO Part0 by a unit on the main die (generally a core). In the general case, Part0 refers to a standard PCIe card of any size (x16,x8,x4) that is plugged directly into one of the PCIe slots. Part0 could also refer to any device plugged into the first slot of a PCIe riser card or to a device attached to the IIO unit which starts its use of the bus using lane 0 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of 4 bytes made to IIO Part1 by the CPU", + "Counter": "2,3", + "EventCode": "0xC0", + "EventName": "UNC_IIO_DATA_REQ_BY_CPU.MEM_WRITE.PART1", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x02", + "PublicDescription": "Counts every write request of 4 bytes of data made to the MMIO space of a card on IIO Part1 by a unit on the main die (generally a core). 
In the general case, Part1 refers to a x4 PCIe card plugged into the second slot of a PCIe riser card, but it could refer to any x4 device attached to the IIO unit using lanes starting at lane 4 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of 4 bytes made to IIO Part2 by the CPU ", + "Counter": "2,3", + "EventCode": "0xC0", + "EventName": "UNC_IIO_DATA_REQ_BY_CPU.MEM_WRITE.PART2", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x04", + "PublicDescription": "Counts every write request of 4 bytes of data made to the MMIO space of a card on IIO Part2 by a unit on the main die (generally a core). In the general case, Part2 refers to a x4 or x8 PCIe card plugged into the third slot of a PCIe riser card, but it could refer to any x4 or x8 device attached to the IIO unit and using lanes starting at lane 8 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of 4 bytes made to IIO Part3 by the CPU ", + "Counter": "2,3", + "EventCode": "0xC0", + "EventName": "UNC_IIO_DATA_REQ_BY_CPU.MEM_WRITE.PART3", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x08", + "PublicDescription": "Counts every write request of 4 bytes of data made to the MMIO space of a card on IIO Part3 by a unit on the main die (generally a core). 
In the general case, Part3 refers to a x4 PCIe card plugged into the fourth slot of a PCIe riser card, but it could brefer to any device attached to the IIO unit using the lanes starting at lane 12 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for up to a 64 byte transaction is made by the CPU to IIO Part0", + "Counter": "0,1,2,3", + "EventCode": "0xC1", + "EventName": "UNC_IIO_TXN_REQ_BY_CPU.MEM_READ.PART0", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x01", + "PublicDescription": "Counts every read request for up to a 64 byte transaction of data made by a unit on the main die (generally a core) to the MMIO space of a card on IIO Part0. In the general case, part0 refers to a standard PCIe card of any size (x16,x8,x4) that is plugged directly into one of the PCIe slots. Part0 could also refer to any device plugged into the first slot of a PCIe riser card or to a device attached to the IIO unit which starts its use of the bus using lane 0 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for up to a 64 byte transaction is made by the CPU to IIO Part1", + "Counter": "0,1,2,3", + "EventCode": "0xC1", + "EventName": "UNC_IIO_TXN_REQ_BY_CPU.MEM_READ.PART1", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x02", + "PublicDescription": "Counts every read request for up to a 64 byte transaction of data made by a unit on the main die (generally a core) to the MMIO space of a card on IIO Part1. 
In the general case, Part1 refers to a x4 PCIe card plugged into the second slot of a PCIe riser card, but it could refer to any x4 device attached to the IIO unit using lanes starting at lane 4 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for up to a 64 byte transaction is made by the CPU to IIO Part2", + "Counter": "0,1,2,3", + "EventCode": "0xC1", + "EventName": "UNC_IIO_TXN_REQ_BY_CPU.MEM_READ.PART2", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x04", + "PublicDescription": "Counts every read request for up to a 64 byte transaction of data made by a unit on the main die (generally a core) to the MMIO space of a card on IIO Part2. In the general case, Part2 refers to a x4 or x8 PCIe card plugged into the third slot of a PCIe riser card, but it could refer to any x4 or x8 device attached to the IIO unit and using lanes starting at lane 8 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for up to a 64 byte transaction is made by the CPU to IIO Part3", + "Counter": "0,1,2,3", + "EventCode": "0xC1", + "EventName": "UNC_IIO_TXN_REQ_BY_CPU.MEM_READ.PART3", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x08", + "PublicDescription": "Counts every read request for up to a 64 byte transaction of data made by a unit on the main die (generally a core) to the MMIO space of a card on IIO Part3. 
In the general case, Part3 refers to a x4 PCIe card plugged into the fourth slot of a PCIe riser card, but it could brefer to any device attached to the IIO unit using the lanes starting at lane 12 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of up to a 64 byte transaction is made to IIO Part0 by the CPU", + "Counter": "0,1,2,3", + "EventCode": "0xC1", + "EventName": "UNC_IIO_TXN_REQ_BY_CPU.MEM_WRITE.PART0", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x01", + "PublicDescription": "Counts every write request of up to a 64 byte transaction of data made to the MMIO space of a card on IIO Part0 by a unit on the main die (generally a core). In the general case, Part0 refers to a standard PCIe card of any size (x16,x8,x4) that is plugged directly into one of the PCIe slots. Part0 could also refer to any device plugged into the first slot of a PCIe riser card or to a device attached to the IIO unit which starts its use of the bus using lane 0 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of up to a 64 byte transaction is made to IIO Part1 by the CPU", + "Counter": "0,1,2,3", + "EventCode": "0xC1", + "EventName": "UNC_IIO_TXN_REQ_BY_CPU.MEM_WRITE.PART1", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x02", + "PublicDescription": "Counts every write request of up to a 64 byte transaction of data made to the MMIO space of a card on IIO Part1 by a unit on the main die (generally a core). 
In the general case, Part1 refers to a x4 PCIe card plugged into the second slot of a PCIe riser card, but it could refer to any x4 device attached to the IIO unit using lanes starting at lane 4 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of up to a 64 byte transaction is made to IIO Part2 by the CPU ", + "Counter": "0,1,2,3", + "EventCode": "0xC1", + "EventName": "UNC_IIO_TXN_REQ_BY_CPU.MEM_WRITE.PART2", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x04", + "PublicDescription": "Counts every write request of up to a 64 byte transaction of data made to the MMIO space of a card on IIO Part2 by a unit on the main die (generally a core). In the general case, Part2 refers to a x4 or x8 PCIe card plugged into the third slot of a PCIe riser card, but it could refer to any x4 or x8 device attached to the IIO unit and using lanes starting at lane 8 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of up to a 64 byte transaction is made to IIO Part3 by the CPU ", + "Counter": "0,1,2,3", + "EventCode": "0xC1", + "EventName": "UNC_IIO_TXN_REQ_BY_CPU.MEM_WRITE.PART3", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x08", + "PublicDescription": "Counts every write request of up to a 64 byte transaction of data made to the MMIO space of a card on IIO Part3 by a unit on the main die (generally a core). 
In the general case, Part3 refers to a x4 PCIe card plugged into the fourth slot of a PCIe riser card, but it could brefer to any device attached to the IIO unit using the lanes starting at lane 12 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for up to a 64 byte transaction is made by IIO Part0 to Memory", + "Counter": "0,1,2,3", + "EventCode": "0x84", + "EventName": "UNC_IIO_TXN_REQ_OF_CPU.MEM_READ.PART0", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x01", + "PublicDescription": "Counts every read request for up to a 64 byte transaction of data made by IIO Part0 to a unit on the main die (generally memory). In the general case, Part0 refers to a standard PCIe card of any size (x16,x8,x4) that is plugged directly into one of the PCIe slots. Part0 could also refer to any device plugged into the first slot of a PCIe riser card or to a device attached to the IIO unit which starts its use of the bus using lane 0 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for up to a 64 byte transaction is made by IIO Part1 to Memory", + "Counter": "0,1,2,3", + "EventCode": "0x84", + "EventName": "UNC_IIO_TXN_REQ_OF_CPU.MEM_READ.PART1", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x02", + "PublicDescription": "Counts every read request for up to a 64 byte transaction of data made by IIO Part1 to a unit on the main die (generally memory). 
In the general case, Part1 refers to a x4 PCIe card plugged into the second slot of a PCIe riser card, but it could refer to any x4 device attached to the IIO unit using lanes starting at lane 4 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for up to a 64 byte transaction is made by IIO Part2 to Memory", + "Counter": "0,1,2,3", + "EventCode": "0x84", + "EventName": "UNC_IIO_TXN_REQ_OF_CPU.MEM_READ.PART2", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x04", + "PublicDescription": "Counts every read request for up to a 64 byte transaction of data made by IIO Part2 to a unit on the main die (generally memory). In the general case, Part2 refers to a x4 or x8 PCIe card plugged into the third slot of a PCIe riser card, but it could refer to any x4 or x8 device attached to the IIO unit and using lanes starting at lane 8 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Read request for up to a 64 byte transaction is made by IIO Part3 to Memory", + "Counter": "0,1,2,3", + "EventCode": "0x84", + "EventName": "UNC_IIO_TXN_REQ_OF_CPU.MEM_READ.PART3", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x08", + "PublicDescription": "Counts every read request for up to a 64 byte transaction of data made by IIO Part3 to a unit on the main die (generally memory). 
In the general case, Part3 refers to a x4 PCIe card plugged into the fourth slot of a PCIe riser card, but it could brefer to any device attached to the IIO unit using the lanes starting at lane 12 of the 16 lanes supported by the bus.", + "UMask": "0x04", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of up to a 64 byte transaction is made by IIO Part0 to Memory", + "Counter": "0,1,2,3", + "EventCode": "0x84", + "EventName": "UNC_IIO_TXN_REQ_OF_CPU.MEM_WRITE.PART0", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x01", + "PublicDescription": "Counts every write request of up to a 64 byte transaction of data made by IIO Part0 to a unit on the main die (generally memory). In the general case, Part0 refers to a standard PCIe card of any size (x16,x8,x4) that is plugged directly into one of the PCIe slots. Part0 could also refer to any device plugged into the first slot of a PCIe riser card or to a device attached to the IIO unit which starts its use of the bus using lane 0 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of up to a 64 byte transaction is made by IIO Part1 to Memory", + "Counter": "0,1,2,3", + "EventCode": "0x84", + "EventName": "UNC_IIO_TXN_REQ_OF_CPU.MEM_WRITE.PART1", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x02", + "PublicDescription": "Counts every write request of up to a 64 byte transaction of data made by IIO Part1 to a unit on the main die (generally memory). 
In the general case, Part1 refers to a x4 PCIe card plugged into the second slot of a PCIe riser card, but it could refer to any x4 device attached to the IIO unit using lanes starting at lane 4 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of up to a 64 byte transaction is made by IIO Part2 to Memory", + "Counter": "0,1,2,3", + "EventCode": "0x84", + "EventName": "UNC_IIO_TXN_REQ_OF_CPU.MEM_WRITE.PART2", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x04", + "PublicDescription": "Counts every write request of up to a 64 byte transaction of data made by IIO Part2 to a unit on the main die (generally memory). In the general case, Part2 refers to a x4 or x8 PCIe card plugged into the third slot of a PCIe riser card, but it could refer to any x4 or x8 device attached to the IIO unit and using lanes starting at lane 8 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Write request of up to a 64 byte transaction is made by IIO Part3 to Memory", + "Counter": "0,1,2,3", + "EventCode": "0x84", + "EventName": "UNC_IIO_TXN_REQ_OF_CPU.MEM_WRITE.PART3", + "FCMask": "0x07", + "PerPkg": "1", + "PortMask": "0x08", + "PublicDescription": "Counts every write request of up to a 64 byte transaction of data made by IIO Part3 to a unit on the main die (generally memory). 
In the general case, Part3 refers to a x4 PCIe card plugged into the fourth slot of a PCIe riser card, but it could brefer to any device attached to the IIO unit using the lanes starting at lane 12 of the 16 lanes supported by the bus.", + "UMask": "0x01", + "Unit": "IIO" + }, + { + "BriefDescription": "Traffic in which the M2M to iMC Bypass was not taken", + "Counter": "0,1,2,3", + "EventCode": "0x22", + "EventName": "UNC_M2M_BYPASS_M2M_Egress.NOT_TAKEN", + "PerPkg": "1", + "PublicDescription": "Counts traffic in which the M2M (Mesh to Memory) to iMC (Memory Controller) bypass was not taken", + "UMask": "0x2", + "Unit": "M2M" + }, + { + "BriefDescription": "Cycles when direct to core mode (which bypasses the CHA) was disabled", + "Counter": "0,1,2,3", + "EventCode": "0x24", + "EventName": "UNC_M2M_DIRECT2CORE_NOT_TAKEN_DIRSTATE", + "PerPkg": "1", + "PublicDescription": "Counts cycles when direct to core mode (which bypasses the CHA) was disabled", + "Unit": "M2M" + }, + { + "BriefDescription": "Messages sent direct to core (bypassing the CHA)", + "Counter": "0,1,2,3", + "EventCode": "0x23", + "EventName": "UNC_M2M_DIRECT2CORE_TAKEN", + "PerPkg": "1", + "PublicDescription": "Counts when messages were sent direct to core (bypassing the CHA)", + "Unit": "M2M" + }, + { + "BriefDescription": "Number of reads in which direct to core transaction were overridden", + "Counter": "0,1,2,3", + "EventCode": "0x25", + "EventName": "UNC_M2M_DIRECT2CORE_TXN_OVERRIDE", + "PerPkg": "1", + "PublicDescription": "Counts reads in which direct to core transactions (which would have bypassed the CHA) were overridden", + "Unit": "M2M" + }, + { + "BriefDescription": "Number of reads in which direct to Intel UPI transactions were overridden", + "Counter": "0,1,2,3", + "EventCode": "0x28", + "EventName": "UNC_M2M_DIRECT2UPI_NOT_TAKEN_CREDITS", + "PerPkg": "1", + "PublicDescription": "Counts reads in which direct to Intel Ultra Path Interconnect (UPI) transactions (which would have bypassed 
the CHA) were overridden", + "Unit": "M2M" + }, + { + "BriefDescription": "Cycles when direct to Intel UPI was disabled", + "Counter": "0,1,2,3", + "EventCode": "0x27", + "EventName": "UNC_M2M_DIRECT2UPI_NOT_TAKEN_DIRSTATE", + "PerPkg": "1", + "PublicDescription": "Counts cycles when the ability to send messages direct to the Intel Ultra Path Interconnect (bypassing the CHA) was disabled", + "Unit": "M2M" + }, + { + "BriefDescription": "Messages sent direct to the Intel UPI", + "Counter": "0,1,2,3", + "EventCode": "0x26", + "EventName": "UNC_M2M_DIRECT2UPI_TAKEN", + "PerPkg": "1", + "PublicDescription": "Counts when messages were sent direct to the Intel Ultra Path Interconnect (bypassing the CHA)", + "Unit": "M2M" + }, + { + "BriefDescription": "Number of reads that a message sent direct2 Intel UPI was overridden", + "Counter": "0,1,2,3", + "EventCode": "0x29", + "EventName": "UNC_M2M_DIRECT2UPI_TXN_OVERRIDE", + "PerPkg": "1", + "PublicDescription": "Counts when a read message that was sent direct to the Intel Ultra Path Interconnect (bypassing the CHA) was overridden", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory lookups (any state found)", + "Counter": "0,1,2,3", + "EventCode": "0x2D", + "EventName": "UNC_M2M_DIRECTORY_LOOKUP.ANY", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) looks into the multi-socket cacheline Directory state, and found the cacheline marked in Any State (A, I, S or unused)", + "UMask": "0x1", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory lookups (cacheline found in A state) ", + "Counter": "0,1,2,3", + "EventCode": "0x2D", + "EventName": "UNC_M2M_DIRECTORY_LOOKUP.STATE_A", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) looks into the multi-socket cacheline Directory state, and found the cacheline marked in the A (SnoopAll) state, indicating the cacheline is stored in another socket in any state, and we must 
snoop the other sockets to make sure we get the latest data. The data may be stored in any state in the local socket.", + "UMask": "0x8", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory lookup (cacheline found in I state) ", + "Counter": "0,1,2,3", + "EventCode": "0x2D", + "EventName": "UNC_M2M_DIRECTORY_LOOKUP.STATE_I", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) looks into the multi-socket cacheline Directory state , and found the cacheline marked in the I (Invalid) state indicating the cacheline is not stored in another socket, and so there is no need to snoop the other sockets for the latest data. The data may be stored in any state in the local socket.", + "UMask": "0x2", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory lookup (cacheline found in S state) ", + "Counter": "0,1,2,3", + "EventCode": "0x2D", + "EventName": "UNC_M2M_DIRECTORY_LOOKUP.STATE_S", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) looks into the multi-socket cacheline Directory state , and found the cacheline marked in the S (Shared) state indicating the cacheline is either stored in another socket in the S(hared) state , and so there is no need to snoop the other sockets for the latest data. 
The data may be stored in any state in the local socket.", + "UMask": "0x4", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory update from A to I", + "Counter": "0,1,2,3", + "EventCode": "0x2E", + "EventName": "UNC_M2M_DIRECTORY_UPDATE.A2I", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) updates the multi-socket cacheline Directory state from from A (SnoopAll) to I (Invalid)", + "UMask": "0x20", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory update from A to S", + "Counter": "0,1,2,3", + "EventCode": "0x2E", + "EventName": "UNC_M2M_DIRECTORY_UPDATE.A2S", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) updates the multi-socket cacheline Directory state from from A (SnoopAll) to S (Shared)", + "UMask": "0x40", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory update from/to Any state ", + "Counter": "0,1,2,3", + "EventCode": "0x2E", + "EventName": "UNC_M2M_DIRECTORY_UPDATE.ANY", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) updates the multi-socket cacheline Directory to a new state", + "UMask": "0x1", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory update from I to A", + "Counter": "0,1,2,3", + "EventCode": "0x2E", + "EventName": "UNC_M2M_DIRECTORY_UPDATE.I2A", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) updates the multi-socket cacheline Directory state from from I (Invalid) to A (SnoopAll)", + "UMask": "0x4", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory update from I to S", + "Counter": "0,1,2,3", + "EventCode": "0x2E", + "EventName": "UNC_M2M_DIRECTORY_UPDATE.I2S", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) updates the multi-socket cacheline Directory state from from I (Invalid) to S (Shared)", + "UMask": "0x2", + "Unit": "M2M" + }, + { + 
"BriefDescription": "Multi-socket cacheline Directory update from S to A", + "Counter": "0,1,2,3", + "EventCode": "0x2E", + "EventName": "UNC_M2M_DIRECTORY_UPDATE.S2A", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) updates the multi-socket cacheline Directory state from from S (Shared) to A (SnoopAll)", + "UMask": "0x10", + "Unit": "M2M" + }, + { + "BriefDescription": "Multi-socket cacheline Directory update from S to I", + "Counter": "0,1,2,3", + "EventCode": "0x2E", + "EventName": "UNC_M2M_DIRECTORY_UPDATE.S2I", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) updates the multi-socket cacheline Directory state from from S (Shared) to I (Invalid)", + "UMask": "0x8", + "Unit": "M2M" + }, + { + "BriefDescription": "Reads to iMC issued", + "Counter": "0,1,2,3", + "EventCode": "0x37", + "EventName": "UNC_M2M_IMC_READS.ALL", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) issues reads to the iMC (Memory Controller). ", + "UMask": "0x4", + "Unit": "M2M" + }, + { + "BriefDescription": "Reads to iMC issued at Normal Priority (Non-Isochronous)", + "Counter": "0,1,2,3", + "EventCode": "0x37", + "EventName": "UNC_M2M_IMC_READS.NORMAL", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) issues reads to the iMC (Memory Controller). 
It only counts normal priority non-isochronous reads.", + "UMask": "0x1", + "Unit": "M2M" + }, + { + "BriefDescription": "Writes to iMC issued", + "Counter": "0,1,2,3", + "EventCode": "0x38", + "EventName": "UNC_M2M_IMC_WRITES.ALL", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) issues writes to the iMC (Memory Controller).", + "UMask": "0x10", + "Unit": "M2M" + }, + { + "BriefDescription": "Partial Non-Isochronous writes to the iMC", + "Counter": "0,1,2,3", + "EventCode": "0x38", + "EventName": "UNC_M2M_IMC_WRITES.PARTIAL", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) issues partial writes to the iMC (Memory Controller). It only counts normal priority non-isochronous writes.", + "UMask": "0x2", + "Unit": "M2M" + }, + { + "BriefDescription": "Prefecth requests that got turn into a demand request", + "Counter": "0,1,2,3", + "EventCode": "0x56", + "EventName": "UNC_M2M_PREFCAM_DEMAND_PROMOTIONS", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) promotes a outstanding request in the prefetch queue due to a subsequent demand read request that entered the M2M with the same address. Explanatory Side Note: The Prefecth queue is made of CAM (Content Addressable Memory)", + "Unit": "M2M" + }, + { + "BriefDescription": "Inserts into the Memory Controller Prefetch Queue", + "Counter": "0,1,2,3", + "EventCode": "0x57", + "EventName": "UNC_M2M_PREFCAM_INSERTS", + "PerPkg": "1", + "PublicDescription": "Counts when the M2M (Mesh to Memory) recieves a prefetch request and inserts it into its outstanding prefetch queue. 
Explanatory Side Note: the prefect queue is made from CAM: Content Addressable Memory", + "Unit": "M2M" + }, + { + "BriefDescription": "AD Ingress (from CMS) Queue Inserts", + "Counter": "0,1,2,3", + "EventCode": "0x1", + "EventName": "UNC_M2M_RxC_AD_INSERTS", + "PerPkg": "1", + "PublicDescription": "Counts when the a new entry is Received(RxC) and then added to the AD (Address Ring) Ingress Queue from the CMS (Common Mesh Stop). This is generally used for reads, and ", + "Unit": "M2M" + }, + { + "BriefDescription": "Prefetches generated by the flow control queue of the M3UPI unit.", + "Counter": "0,1,2,3", + "EventCode": "0x29", + "EventName": "UNC_M3UPI_UPI_PREFETCH_SPAWN", + "PerPkg": "1", + "PublicDescription": "Count cases where flow control queue that sits between the Intel Ultra Path Interconnect (UPI) and the mesh spawns a prefetch to the iMC (Memory Controller)", + "Unit": "M3UPI" + }, + { + "BriefDescription": "Clocks of the Intel Ultra Path Interconnect (UPI)", + "Counter": "0,1,2,3", + "EventCode": "0x1", + "EventName": "UNC_UPI_CLOCKTICKS", + "PerPkg": "1", + "PublicDescription": "Counts clockticks of the fixed frequency clock controlling the Intel Ultra Path Interconnect (UPI). This clock runs at1/8th the 'GT/s' speed of the UPI link. 
For example, a 9.6GT/s link will have a fixed Frequency of 1.2 Ghz.", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Data Response packets that go direct to core", + "Counter": "0,1,2,3", + "EventCode": "0x12", + "EventName": "UNC_UPI_DIRECT_ATTEMPTS.D2C", + "PerPkg": "1", + "PublicDescription": "Counts Data Response (DRS) packets that attempted to go direct to core bypassing the CHA.", + "UMask": "0x1", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Data Response packets that go direct to Intel UPI", + "Counter": "0,1,2,3", + "EventCode": "0x12", + "EventName": "UNC_UPI_DIRECT_ATTEMPTS.D2U", + "PerPkg": "1", + "PublicDescription": "Counts Data Response (DRS) packets that attempted to go direct to Intel Ultra Path Interconnect (UPI) bypassing the CHA .", + "UMask": "0x2", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Cycles Intel UPI is in L1 power mode (shutdown)", + "Counter": "0,1,2,3", + "EventCode": "0x21", + "EventName": "UNC_UPI_L1_POWER_CYCLES", + "PerPkg": "1", + "PublicDescription": "Counts cycles when the Intel Ultra Path Interconnect (UPI) is in L1 power mode. L1 is a mode that totally shuts down the UPI link. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another, this event only coutns when both links are shutdown.", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Cycles the Rx of the Intel UPI is in L0p power mode", + "Counter": "0,1,2,3", + "EventCode": "0x25", + "EventName": "UNC_UPI_RxL0P_POWER_CYCLES", + "PerPkg": "1", + "PublicDescription": "Counts cycles when the the receive side (Rx) of the Intel Ultra Path Interconnect(UPI) is in L0p power mode. 
L0p is a mode where we disable 60% of the UPI lanes, decreasing our bandwidth in order to save power.", + "Unit": "UPI LL" + }, + { + "BriefDescription": "FLITs received which bypassed the Slot0 Receive Buffer", + "Counter": "0,1,2,3", + "EventCode": "0x31", + "EventName": "UNC_UPI_RxL_BYPASSED.SLOT0", + "PerPkg": "1", + "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the slot0 RxQ buffer (Receive Queue) and passed directly to the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of FLITs transfered, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", + "UMask": "0x1", + "Unit": "UPI LL" + }, + { + "BriefDescription": "FLITs received which bypassed the Slot0 Receive Buffer", + "Counter": "0,1,2,3", + "EventCode": "0x31", + "EventName": "UNC_UPI_RxL_BYPASSED.SLOT1", + "PerPkg": "1", + "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the slot1 RxQ buffer (Receive Queue) and passed directly across the BGF and into the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of FLITs transfered, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", + "UMask": "0x2", + "Unit": "UPI LL" + }, + { + "BriefDescription": "FLITs received which bypassed the Slot0 Recieve Buffer", + "Counter": "0,1,2,3", + "EventCode": "0x31", + "EventName": "UNC_UPI_RxL_BYPASSED.SLOT2", + "PerPkg": "1", + "PublicDescription": "Counts incoming FLITs (FLow control unITs) whcih bypassed the slot2 RxQ buffer (Receive Queue) and passed directly to the Egress. This is a latency optimization, and should generally be the common case. 
If this value is less than the number of FLITs transfered, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", + "UMask": "0x4", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Valid data FLITs received from any slot", + "Counter": "0,1,2,3", + "EventCode": "0x3", + "EventName": "UNC_UPI_RxL_FLITS.ALL_DATA", + "PerPkg": "1", + "PublicDescription": "Counts valid data FLITs (80 bit FLow control unITs: 64bits of data) received from any of the 3 Intel Ultra Path Interconnect (UPI) Receive Queue slots on this UPI unit.", + "UMask": "0x0F", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Null FLITs received from any slot", + "Counter": "0,1,2,3", + "EventCode": "0x3", + "EventName": "UNC_UPI_RxL_FLITS.ALL_NULL", + "PerPkg": "1", + "PublicDescription": "Counts null FLITs (80 bit FLow control unITs) received from any of the 3 Intel Ultra Path Interconnect (UPI) Receive Queue slots on this UPI unit.", + "UMask": "0x27", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Protocol header and credit FLITs received from any slot", + "Counter": "0,1,2,3", + "EventCode": "0x3", + "EventName": "UNC_UPI_RxL_FLITS.NON_DATA", + "PerPkg": "1", + "PublicDescription": "Counts protocol header and credit FLITs (80 bit FLow control unITs) received from any of the 3 UPI slots on this UPI unit.", + "UMask": "0x97", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Cycles in which the Tx of the Intel Ultra Path Interconnect (UPI) is in L0p power mode", + "Counter": "0,1,2,3", + "EventCode": "0x27", + "EventName": "UNC_UPI_TxL0P_POWER_CYCLES", + "PerPkg": "1", + "PublicDescription": "Counts cycles when the transmit side (Tx) of the Intel Ultra Path Interconnect(UPI) is in L0p power mode. 
L0p is a mode where we disable 60% of the UPI lanes, decreasing our bandwidth in order to save power.", + "Unit": "UPI LL" + }, + { + "BriefDescription": "FLITs that bypassed the TxL Buffer", + "Counter": "0,1,2,3", + "EventCode": "0x41", + "EventName": "UNC_UPI_TxL_BYPASSED", + "PerPkg": "1", + "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the TxL(transmit) FLIT buffer and pass directly out the UPI Link. Generally, when data is transmitted across the Intel Ultra Path Interconnect (UPI), it will bypass the TxQ and pass directly to the link. However, the TxQ will be used in L0p (Low Power) mode and (Link Layer Retry) LLR mode, increasing latency to transfer out to the link.", + "Unit": "UPI LL" + }, + { + "BriefDescription": "UPI interconnect send bandwidth for payload. Derived from unc_upi_txl_flits.all_data", + "Counter": "0,1,2,3", + "EventCode": "0x2", + "EventName": "UPI_DATA_BANDWIDTH_TX", + "PerPkg": "1", + "ScaleUnit": "7.11E-06Bytes", + "UMask": "0x0F", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Null FLITs transmitted from any slot", + "Counter": "0,1,2,3", + "EventCode": "0x2", + "EventName": "UNC_UPI_TxL_FLITS.ALL_NULL", + "PerPkg": "1", + "PublicDescription": "Counts null FLITs (80 bit FLow control unITs) transmitted via any of the 3 Intel Ulra Path Interconnect (UPI) slots on this UPI unit.", + "UMask": "0x27", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Idle FLITs transmitted", + "Counter": "0,1,2,3", + "EventCode": "0x2", + "EventName": "UNC_UPI_TxL_FLITS.IDLE", + "PerPkg": "1", + "PublicDescription": "Counts when the Intel Ultra Path Interconnect(UPI) transmits an idle FLIT(80 bit FLow control unITs). 
Every UPI cycle must be sending either data FLITs, protocol/credit FLITs or idle FLITs.", + "UMask": "0x47", + "Unit": "UPI LL" + }, + { + "BriefDescription": "Protocol header and credit FLITs transmitted across any slot", + "Counter": "0,1,2,3", + "EventCode": "0x2", + "EventName": "UNC_UPI_TxL_FLITS.NON_DATA", + "PerPkg": "1", + "PublicDescription": "Counts protocol header and credit FLITs (80 bit FLow control unITs) transmitted across any of the 3 UPI (Ultra Path Interconnect) slots on this UPI unit.", + "UMask": "0x97", + "Unit": "UPI LL" + } +] -- 2.13.5 ^ permalink raw reply related [flat|nested] 51+ messages in thread
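A small worked example of the arithmetic quoted in the event descriptions above, given as a sketch only. The UNC_UPI_CLOCKTICKS text says the fixed clock runs at 1/8th of the link's GT/s rate (9.6 GT/s gives 1.2 GHz), and UPI_DATA_BANDWIDTH_TX carries a ScaleUnit of "7.11E-06Bytes". The function names below are hypothetical, and reading ScaleUnit as "multiply the raw count by the numeric prefix" is an assumption about how the tool applies that field.

```c
#include <stdint.h>

/*
 * Fixed UPI clock in GHz for a link speed given in GT/s. The event
 * description states that the clock runs at 1/8th of the GT/s rate.
 */
static inline double upi_clock_ghz(double link_gts)
{
	return link_gts / 8.0;
}

/*
 * Apply the numeric part of a ScaleUnit string such as "7.11E-06Bytes"
 * to a raw counter value. Assumption: the raw count is multiplied by
 * the numeric prefix and the rest of the string is printed as the unit.
 */
static inline double upi_scaled_count(uint64_t raw_count, double scale)
{
	return (double)raw_count * scale;
}
```

For instance, upi_clock_ghz(9.6) reproduces the 1.2 GHz figure from the description, and upi_scaled_count(n, 7.11E-06) gives the scaled value for n counted FLITs.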
* [PATCH 10/15] perf tools: Add support for printing new mem_info encodings 2017-08-23 19:35 [GIT PULL 00/15] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (8 preceding siblings ...) 2017-08-23 19:36 ` [PATCH 09/15] perf vendor events: Add Skylake server uncore event list Arnaldo Carvalho de Melo @ 2017-08-23 19:36 ` Arnaldo Carvalho de Melo 2017-08-23 19:36 ` [PATCH 11/15] perf test: Add test cases for new data source encoding Arnaldo Carvalho de Melo ` (4 subsequent siblings) 14 siblings, 0 replies; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, linux-perf-users, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo From: Andi Kleen <ak@linux.intel.com> Add decoding for the new "lvlx" and "snoopx" meminfo fields added earlier to the kernel, so that "perf mem report" and other tools can print them properly. v2: Merge with persistent memory patch. Switch to new bit encoding for each combination. v3: Switch to generic lvlnum field. 
Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170816222156.19953-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/include/uapi/linux/perf_event.h | 30 ++++++++++++++++++++++-- tools/perf/util/mem-events.c | 43 ++++++++++++++++++++++++++++++++--- 2 files changed, 68 insertions(+), 5 deletions(-) diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 642db5fa3286..2a37ae925d85 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -954,14 +954,20 @@ union perf_mem_data_src { mem_snoop:5, /* snoop mode */ mem_lock:2, /* lock instr */ mem_dtlb:7, /* tlb access */ - mem_rsvd:31; + mem_lvl_num:4, /* memory hierarchy level number */ + mem_remote:1, /* remote */ + mem_snoopx:2, /* snoop mode, ext */ + mem_rsvd:24; }; }; #elif defined(__BIG_ENDIAN_BITFIELD) union perf_mem_data_src { __u64 val; struct { - __u64 mem_rsvd:31, + __u64 mem_rsvd:24, + mem_snoopx:2, /* snoop mode, ext */ + mem_remote:1, /* remote */ + mem_lvl_num:4, /* memory hierarchy level number */ mem_dtlb:7, /* tlb access */ mem_lock:2, /* lock instr */ mem_snoop:5, /* snoop mode */ @@ -998,6 +1004,22 @@ union perf_mem_data_src { #define PERF_MEM_LVL_UNC 0x2000 /* Uncached memory */ #define PERF_MEM_LVL_SHIFT 5 +#define PERF_MEM_REMOTE_REMOTE 0x01 /* Remote */ +#define PERF_MEM_REMOTE_SHIFT 37 + +#define PERF_MEM_LVLNUM_L1 0x01 /* L1 */ +#define PERF_MEM_LVLNUM_L2 0x02 /* L2 */ +#define PERF_MEM_LVLNUM_L3 0x03 /* L3 */ +#define PERF_MEM_LVLNUM_L4 0x04 /* L4 */ +/* 5-0xa available */ +#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */ +#define PERF_MEM_LVLNUM_LFB 0x0c /* LFB */ +#define PERF_MEM_LVLNUM_RAM 0x0d /* RAM */ +#define PERF_MEM_LVLNUM_PMEM 0x0e /* PMEM */ +#define PERF_MEM_LVLNUM_NA 0x0f /* N/A */ + +#define PERF_MEM_LVLNUM_SHIFT 33 + /* snoop mode */ #define 
PERF_MEM_SNOOP_NA 0x01 /* not available */ #define PERF_MEM_SNOOP_NONE 0x02 /* no snoop */ @@ -1006,6 +1028,10 @@ union perf_mem_data_src { #define PERF_MEM_SNOOP_HITM 0x10 /* snoop hit modified */ #define PERF_MEM_SNOOP_SHIFT 19 +#define PERF_MEM_SNOOPX_FWD 0x01 /* forward */ +/* 1 free */ +#define PERF_MEM_SNOOPX_SHIFT 37 + /* locked instruction */ #define PERF_MEM_LOCK_NA 0x01 /* not available */ #define PERF_MEM_LOCK_LOCKED 0x02 /* locked transaction */ diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c index 06f5a3a4295c..ced4f3fff035 100644 --- a/tools/perf/util/mem-events.c +++ b/tools/perf/util/mem-events.c @@ -166,11 +166,20 @@ static const char * const mem_lvl[] = { "Uncached", }; +static const char * const mem_lvlnum[] = { + [PERF_MEM_LVLNUM_ANY_CACHE] = "Any cache", + [PERF_MEM_LVLNUM_LFB] = "LFB", + [PERF_MEM_LVLNUM_RAM] = "RAM", + [PERF_MEM_LVLNUM_PMEM] = "PMEM", + [PERF_MEM_LVLNUM_NA] = "N/A", +}; + int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info) { size_t i, l = 0; u64 m = PERF_MEM_LVL_NA; u64 hit, miss; + int printed; if (mem_info) m = mem_info->data_src.mem_lvl; @@ -184,17 +193,37 @@ int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info) /* already taken care of */ m &= ~(PERF_MEM_LVL_HIT|PERF_MEM_LVL_MISS); + + if (mem_info && mem_info->data_src.mem_remote) { + strcat(out, "Remote "); + l += 7; + } + + printed = 0; for (i = 0; m && i < ARRAY_SIZE(mem_lvl); i++, m >>= 1) { if (!(m & 0x1)) continue; - if (l) { + if (printed++) { strcat(out, " or "); l += 4; } l += scnprintf(out + l, sz - l, mem_lvl[i]); } - if (*out == '\0') - l += scnprintf(out, sz - l, "N/A"); + + if (mem_info && mem_info->data_src.mem_lvl_num) { + int lvl = mem_info->data_src.mem_lvl_num; + if (printed++) { + strcat(out, " or "); + l += 4; + } + if (mem_lvlnum[lvl]) + l += scnprintf(out + l, sz - l, mem_lvlnum[lvl]); + else + l += scnprintf(out + l, sz - l, "L%d", lvl); + } + + if (l == 0) + l += 
scnprintf(out + l, sz - l, "N/A"); if (hit) l += scnprintf(out + l, sz - l, " hit"); if (miss) @@ -231,6 +260,14 @@ int perf_mem__snp_scnprintf(char *out, size_t sz, struct mem_info *mem_info) } l += scnprintf(out + l, sz - l, snoop_access[i]); } + if (mem_info && + (mem_info->data_src.mem_snoopx & PERF_MEM_SNOOPX_FWD)) { + if (l) { + strcat(out, " or "); + l += 4; + } + l += scnprintf(out + l, sz - l, "Fwd"); + } if (*out == '\0') l += scnprintf(out, sz - l, "N/A"); -- 2.13.5 ^ permalink raw reply related [flat|nested] 51+ messages in thread
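To illustrate the new fields, here is a minimal standalone sketch that pulls mem_lvl_num, mem_remote and mem_snoopx out of a raw 64-bit data source value and names the level the way the mem_lvlnum[] table above does. It is not the perf implementation: bits(), scn() and lvlnum_str() are hypothetical helpers, the legacy mem_lvl bits and hit/miss suffixes are ignored, and the offsets assume the little-endian layout, computed from PERF_MEM_SNOOP_SHIFT (19) plus the field widths in the struct. Summed that way, mem_snoopx starts at bit 38, while the diff defines both PERF_MEM_REMOTE_SHIFT and PERF_MEM_SNOOPX_SHIFT as 37; the sketch follows the bitfield.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Offsets: PERF_MEM_SNOOP_SHIFT (19) plus the widths of the fields before. */
enum {
	LVL_NUM_OFF = 19 + 5 + 2 + 7,	/* 33: after mem_snoop, mem_lock, mem_dtlb */
	REMOTE_OFF  = LVL_NUM_OFF + 4,	/* 37: after the 4-bit mem_lvl_num */
	SNOOPX_OFF  = REMOTE_OFF + 1,	/* 38: after the 1-bit mem_remote */
};

/* Extract 'width' bits starting at bit 'off'. */
static inline unsigned bits(uint64_t val, unsigned off, unsigned width)
{
	return (unsigned)((val >> off) & ((UINT64_C(1) << width) - 1));
}

/* Bounded print that returns the number of characters actually stored. */
static size_t scn(char *buf, size_t sz, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (sz == 0)
		return 0;
	va_start(ap, fmt);
	n = vsnprintf(buf, sz, fmt, ap);
	va_end(ap);
	if (n < 0)
		return 0;
	return (size_t)n < sz ? (size_t)n : sz - 1;
}

/* Simplified stand-in for the mem_remote/mem_lvl_num part of the printer. */
static inline size_t lvlnum_str(char *out, size_t sz, uint64_t val)
{
	/* Same names as mem_lvlnum[] in the patch; other values print as "L<n>". */
	static const char *const names[16] = {
		[0x0b] = "Any cache", [0x0c] = "LFB", [0x0d] = "RAM",
		[0x0e] = "PMEM", [0x0f] = "N/A",
	};
	unsigned lvl = bits(val, LVL_NUM_OFF, 4);
	size_t l = 0;

	if (bits(val, REMOTE_OFF, 1))
		l += scn(out + l, sz - l, "Remote ");
	if (lvl && names[lvl])
		l += scn(out + l, sz - l, "%s", names[lvl]);
	else if (lvl)
		l += scn(out + l, sz - l, "L%u", lvl);
	if (l == 0)
		l += scn(out + l, sz - l, "N/A");
	return l;
}
```

Tracking the written length explicitly, instead of mixing strcat() with a separately maintained counter, keeps every append bounded by the remaining buffer size.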
* [PATCH 11/15] perf test: Add test cases for new data source encoding 2017-08-23 19:35 [GIT PULL 00/15] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (9 preceding siblings ...) 2017-08-23 19:36 ` [PATCH 10/15] perf tools: Add support for printing new mem_info encodings Arnaldo Carvalho de Melo @ 2017-08-23 19:36 ` Arnaldo Carvalho de Melo 2017-08-23 19:36 ` [PATCH 12/15] perf tools: Really install manpages via 'make install-man' Arnaldo Carvalho de Melo ` (3 subsequent siblings) 14 siblings, 0 replies; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, linux-perf-users, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo From: Andi Kleen <ak@linux.intel.com> Add some simple tests to 'perf test' to exercise data source printing. v2: Make the tests actually check for the correct name of Forward v3: Adjust to new encoding Committer notes: Avoid the in-place declaration to make this build with older compilers; for instance, in Debian 7 we get: tests/mem.c: In function 'test__mem': tests/mem.c:30:5: error: missing initializer [-Werror=missing-field-initializers] tests/mem.c:30:5: error: (near initialization for '(anonymous).<anonymous>.mem_snoop') [-Werror=missing-field-initializers] So just zero a struct, then go on building the unions as needed, reusing settings from the previous test, i.e. local -> remote, etc. 
Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170816222156.19953-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/tests/Build | 1 + tools/perf/tests/builtin-test.c | 4 +++ tools/perf/tests/mem.c | 56 +++++++++++++++++++++++++++++++++++++++++ tools/perf/tests/tests.h | 1 + 4 files changed, 62 insertions(+) create mode 100644 tools/perf/tests/mem.c diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build index 84222bdb8689..87bf3edb037c 100644 --- a/tools/perf/tests/Build +++ b/tools/perf/tests/Build @@ -34,6 +34,7 @@ perf-y += thread-map.o perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o llvm-src-relocation.o perf-y += bpf.o perf-y += topology.o +perf-y += mem.o perf-y += cpumap.o perf-y += stat.o perf-y += event_update.o diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c index 9ecc44e68990..377bea009163 100644 --- a/tools/perf/tests/builtin-test.c +++ b/tools/perf/tests/builtin-test.c @@ -48,6 +48,10 @@ static struct test generic_tests[] = { .func = test__basic_mmap, }, { + .desc = "Test data source output", + .func = test__mem, + }, + { .desc = "Parse event definition strings", .func = test__parse_events, }, diff --git a/tools/perf/tests/mem.c b/tools/perf/tests/mem.c new file mode 100644 index 000000000000..21952e1e6e6d --- /dev/null +++ b/tools/perf/tests/mem.c @@ -0,0 +1,56 @@ +#include "util/mem-events.h" +#include "util/symbol.h" +#include "linux/perf_event.h" +#include "util/debug.h" +#include "tests.h" +#include <string.h> + +static int check(union perf_mem_data_src data_src, + const char *string) +{ + char out[100]; + char failure[100]; + struct mem_info mi = { .data_src = data_src }; + + int n; + + n = perf_mem__snp_scnprintf(out, sizeof out, &mi); + n += perf_mem__lvl_scnprintf(out + n, sizeof out - n, &mi); + snprintf(failure, sizeof failure, 
"unexpected %s", out); + TEST_ASSERT_VAL(failure, !strcmp(string, out)); + return 0; +} + +int test__mem(struct test *text __maybe_unused, int subtest __maybe_unused) +{ + int ret = 0; + union perf_mem_data_src src; + + memset(&src, 0, sizeof(src)); + + src.mem_lvl = PERF_MEM_LVL_HIT; + src.mem_lvl_num = 4; + + ret |= check(src, "N/AL4 hit"); + + src.mem_remote = 1; + + ret |= check(src, "N/ARemote L4 hit"); + + src.mem_lvl = PERF_MEM_LVL_MISS; + src.mem_lvl_num = PERF_MEM_LVLNUM_PMEM; + src.mem_remote = 0; + + ret |= check(src, "N/APMEM miss"); + + src.mem_remote = 1; + + ret |= check(src, "N/ARemote PMEM miss"); + + src.mem_snoopx = PERF_MEM_SNOOPX_FWD; + src.mem_lvl_num = PERF_MEM_LVLNUM_RAM; + + ret |= check(src , "FwdRemote RAM miss"); + + return ret; +} diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h index c46ae818aac8..921412a6a880 100644 --- a/tools/perf/tests/tests.h +++ b/tools/perf/tests/tests.h @@ -58,6 +58,7 @@ int test__python_use(struct test *test, int subtest); int test__bp_signal(struct test *test, int subtest); int test__bp_signal_overflow(struct test *test, int subtest); int test__task_exit(struct test *test, int subtest); +int test__mem(struct test *test, int subtest); int test__sw_clock_freq(struct test *test, int subtest); int test__code_reading(struct test *test, int subtest); int test__sample_parsing(struct test *test, int subtest); -- 2.13.5 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 12/15] perf tools: Really install manpages via 'make install-man' 2017-08-23 19:35 [GIT PULL 00/15] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (10 preceding siblings ...) 2017-08-23 19:36 ` [PATCH 11/15] perf test: Add test cases for new data source encoding Arnaldo Carvalho de Melo @ 2017-08-23 19:36 ` Arnaldo Carvalho de Melo 2017-08-23 19:36 ` [PATCH 13/15] perf: Fix documentation for sysctls perf_event_paranoid and perf_event_mlock_kb Arnaldo Carvalho de Melo ` (2 subsequent siblings) 14 siblings, 0 replies; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, linux-perf-users, Konstantin Khlebnikov, Alexander Shishkin, Borislav Petkov, Peter Zijlstra, Arnaldo Carvalho de Melo From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> The install-man target builds the man pages but forgets to install them. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: af3df2cf17f5 ("perf tools: Try to build Documentation when installing") Link: http://lkml.kernel.org/r/150322915300.129715.13645857235229756834.stgit@buzz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/Documentation/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/Documentation/Makefile b/tools/perf/Documentation/Makefile index 098cfb9ca8f0..db11478e30b4 100644 --- a/tools/perf/Documentation/Makefile +++ b/tools/perf/Documentation/Makefile @@ -192,7 +192,7 @@ do-install-man: man # $(INSTALL) -m 644 $(DOC_MAN5) $(DESTDIR)$(man5dir); \ # $(INSTALL) -m 644 $(DOC_MAN7) $(DESTDIR)$(man7dir) -install-man: check-man-tools man +install-man: check-man-tools man do-install-man ifdef missing_tools DO_INSTALL_MAN = $(warning Please install $(missing_tools) to have the man pages installed) -- 2.13.5
* [PATCH 13/15] perf: Fix documentation for sysctls perf_event_paranoid and perf_event_mlock_kb 2017-08-23 19:35 [GIT PULL 00/15] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (11 preceding siblings ...) 2017-08-23 19:36 ` [PATCH 12/15] perf tools: Really install manpages via 'make install-man' Arnaldo Carvalho de Melo @ 2017-08-23 19:36 ` Arnaldo Carvalho de Melo 2017-08-23 19:36 ` [PATCH 14/15] perf tools: Fix static linking with libdw from elfutils Arnaldo Carvalho de Melo 2017-08-23 19:36 ` [PATCH 15/15] perf tools: Fix static linking with libunwind Arnaldo Carvalho de Melo 14 siblings, 0 replies; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, linux-perf-users, Konstantin Khlebnikov, Alexander Shishkin, Peter Zijlstra, Arnaldo Carvalho de Melo From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fix the misprint CAP_IOC_LOCK -> CAP_IPC_LOCK. This capability has nothing to do with raw tracepoints; that part of the text is about bypassing mlock limits. Sysctl kernel.perf_event_paranoid = -1 allows raw and ftrace function tracepoints without CAP_SYS_ADMIN. 
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/150322916080.129746.11285255474738558340.stgit@buzz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- Documentation/sysctl/kernel.txt | 13 ++++++++++++- tools/perf/util/evsel.c | 4 +++- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index bac23c198360..ce61d1fe08ca 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -61,6 +61,7 @@ show up in /proc/sys/kernel: - perf_cpu_time_max_percent - perf_event_paranoid - perf_event_max_stack +- perf_event_mlock_kb - perf_event_max_contexts_per_stack - pid_max - powersave-nap [ PPC only ] @@ -654,7 +655,9 @@ Controls use of the performance events system by unprivileged users (without CAP_SYS_ADMIN). The default value is 2. -1: Allow use of (almost) all events by all users ->=0: Disallow raw tracepoint access by users without CAP_IOC_LOCK + Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK +>=0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN + Disallow raw tracepoint access by users without CAP_SYS_ADMIN >=1: Disallow CPU event access by users without CAP_SYS_ADMIN >=2: Disallow kernel profiling by users without CAP_SYS_ADMIN @@ -673,6 +676,14 @@ The default value is 127. ============================================================== +perf_event_mlock_kb: + +Control size of per-cpu ring buffer not counted agains mlock limit. 
+ +The default value is 512 + 1 page + +============================================================== + perf_event_max_contexts_per_stack: Controls maximum number of stack frame context entries for diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 5dfb8bc4db89..a5888c704e01 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -2674,7 +2674,9 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target, "unprivileged users (without CAP_SYS_ADMIN).\n\n" "The current value is %d:\n\n" " -1: Allow use of (almost) all events by all users\n" - ">= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK\n" + " Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n" + ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN\n" + " Disallow raw tracepoint access by users without CAP_SYS_ADMIN\n" ">= 1: Disallow CPU event access by users without CAP_SYS_ADMIN\n" ">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN\n\n" "To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n" -- 2.13.5 ^ permalink raw reply related [flat|nested] 51+ messages in thread
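The corrected level table can be summarised in a short sketch. The code below is illustrative only and not part of perf: paranoid_strictest_rule() and read_paranoid() are hypothetical names, and the strings paraphrase the documentation text in the diff. The levels are cumulative, so the function reports only the strictest rule that applies at a given setting. The path under /proc/sys/kernel follows the listing in kernel.txt.

```c
#include <stdio.h>

/*
 * Return a description of the strictest restriction in force for a
 * given perf_event_paranoid value. Lower-level restrictions still
 * apply at higher values.
 */
static inline const char *paranoid_strictest_rule(int level)
{
	if (level >= 2)
		return "kernel profiling requires CAP_SYS_ADMIN";
	if (level >= 1)
		return "CPU event access requires CAP_SYS_ADMIN";
	if (level >= 0)
		return "raw and ftrace function tracepoints require CAP_SYS_ADMIN";
	return "(almost) all events allowed for all users";
}

/*
 * Read the current setting from a sysctl file such as
 * /proc/sys/kernel/perf_event_paranoid.
 * Returns 0 on success and -1 if the file cannot be read or parsed.
 */
static inline int read_paranoid(const char *path, int *level)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%d", level) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would typically pass "/proc/sys/kernel/perf_event_paranoid" to read_paranoid() and feed the result to paranoid_strictest_rule() when explaining a failed perf_event_open() to the user.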
* [PATCH 14/15] perf tools: Fix static linking with libdw from elfutils 2017-08-23 19:35 [GIT PULL 00/15] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (12 preceding siblings ...) 2017-08-23 19:36 ` [PATCH 13/15] perf: Fix documentation for sysctls perf_event_paranoid and perf_event_mlock_kb Arnaldo Carvalho de Melo @ 2017-08-23 19:36 ` Arnaldo Carvalho de Melo 2017-08-23 19:36 ` [PATCH 15/15] perf tools: Fix static linking with libunwind Arnaldo Carvalho de Melo 14 siblings, 0 replies; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, linux-perf-users, Konstantin Khlebnikov, Alexander Shishkin, Peter Zijlstra, Arnaldo Carvalho de Melo From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fix feature test for static libdw: link required dependencies. Backends of libebl are not statically linked thus libdl is required. In Debian/Ubuntu libdw-dev includes libebl.a starting from 0.166-1. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/150322916720.129772.7959925864494283854.stgit@buzz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/Makefile.config | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index 37d203c4cd1f..bb4735b92ada 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -103,8 +103,12 @@ ifdef LIBDW_DIR LIBDW_CFLAGS := -I$(LIBDW_DIR)/include LIBDW_LDFLAGS := -L$(LIBDW_DIR)/lib endif +DWARFLIBS := -ldw +ifeq ($(findstring -static,${LDFLAGS}),-static) + DWARFLIBS += -lelf -lebl -ldl -lz -llzma -lbz2 +endif FEATURE_CHECK_CFLAGS-libdw-dwarf-unwind := $(LIBDW_CFLAGS) -FEATURE_CHECK_LDFLAGS-libdw-dwarf-unwind := $(LIBDW_LDFLAGS) -ldw +FEATURE_CHECK_LDFLAGS-libdw-dwarf-unwind := $(LIBDW_LDFLAGS) 
$(DWARFLIBS) # for linking with debug library, run like: # make DEBUG=1 LIBBABELTRACE_DIR=/opt/libbabeltrace/ @@ -365,10 +369,6 @@ ifndef NO_LIBELF else CFLAGS += -DHAVE_DWARF_SUPPORT $(LIBDW_CFLAGS) LDFLAGS += $(LIBDW_LDFLAGS) - DWARFLIBS := -ldw - ifeq ($(findstring -static,${LDFLAGS}),-static) - DWARFLIBS += -lelf -lebl -lz -llzma -lbz2 - endif EXTLIBS += ${DWARFLIBS} $(call detected,CONFIG_DWARF) endif # PERF_HAVE_DWARF_REGS -- 2.13.5 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 15/15] perf tools: Fix static linking with libunwind 2017-08-23 19:35 [GIT PULL 00/15] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (13 preceding siblings ...) 2017-08-23 19:36 ` [PATCH 14/15] perf tools: Fix static linking with libdw from elfutils Arnaldo Carvalho de Melo @ 2017-08-23 19:36 ` Arnaldo Carvalho de Melo 14 siblings, 0 replies; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-08-23 19:36 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, linux-perf-users, Konstantin Khlebnikov, Alexander Shishkin, Peter Zijlstra, Arnaldo Carvalho de Melo From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> * libunwind-x86_64 must be linked before libunwind * libunwind requires liblzma * static libunwind conflicts with static libgcc_eh Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/150322917247.129799.14247751517961953155.stgit@buzz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/Makefile.config | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index bb4735b92ada..6a64c6bbd9a5 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -35,7 +35,7 @@ ifeq ($(SRCARCH),x86) ifeq (${IS_64_BIT}, 1) CFLAGS += -DHAVE_ARCH_X86_64_SUPPORT -DHAVE_SYSCALL_TABLE -I$(OUTPUT)arch/x86/include/generated ARCH_INCLUDE = ../../arch/x86/lib/memcpy_64.S ../../arch/x86/lib/memset_64.S - LIBUNWIND_LIBS = -lunwind -lunwind-x86_64 + LIBUNWIND_LIBS = -lunwind-x86_64 -lunwind -llzma $(call detected,CONFIG_X86_64) else LIBUNWIND_LIBS = -lunwind-x86 -llzma -lunwind @@ -505,6 +505,10 @@ ifndef NO_LOCAL_LIBUNWIND EXTLIBS += $(LIBUNWIND_LIBS) LDFLAGS += $(LIBUNWIND_LIBS) endif +ifeq ($(findstring -static,${LDFLAGS}),-static) + # gcc -static links libgcc_eh which contans piece of libunwind + 
LIBUNWIND_LDFLAGS += -Wl,--allow-multiple-definition +endif ifndef NO_LIBUNWIND CFLAGS += -DHAVE_LIBUNWIND_SUPPORT -- 2.13.5 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* [GIT PULL 00/15] perf/core improvements and fixes @ 2017-07-28 20:00 Arnaldo Carvalho de Melo 2017-07-30 9:31 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-07-28 20:00 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Shishkin, Andi Kleen, David Ahern, David Carrillo-Cisneros, Francis Deslauriers, Geneviève Bastien, Jiri Olsa, Julien Desfossez, Martin Liška, Mathieu Desnoyers, Milian Wolff, Namhyung Kim, Paul Turner, Peter Zijlstra, Simon Que, Stephane Eranian, Taeung Song, Wang Nan, Arnaldo Carvalho de Melo Hi Ingo, Please consider pulling, - Arnaldo Test results at the end of this message, as usual. The following changes since commit ee438ec8f33c5af0d4a4ffb935c5b9272e8c2680: Merge tag 'perf-core-for-mingo-4.14-20170725' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-07-26 19:07:30 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.14-20170728 for you to fetch changes up to 6b7007af728df7258bb60ed73099be3b59b3030e: perf data: Add doc when no conversion support compiled (2017-07-28 16:30:45 -0300) ---------------------------------------------------------------- perf/core improvements and fixes for 4.14: New features: - Add PERF_SAMPLE_CALLCHAIN and PERF_RECORD_MMAP[2] to 'perf data' CTF conversion, allowing CTF trace visualization tools to show callchains and to resolve symbols (Geneviève Bastien) Improvements: - Use group read for event groups in 'perf stat', reducing overhead when groups are defined in the event specification, i.e. 
when using {} to enclose a list of events, asking them to be read at the same time, e.g.: "perf stat -e '{cycles,instructions}'" (Jiri Olsa) Fixes: - Do not overwrite perf_sample->weight in 'perf annotate' when processing samples, use whatever came from the kernel when perf_event_attr.sample_type has PERF_SAMPLE_WEIGHT set or just handle its default value, 0, when that is not set and "weight" is one of the sort orders chosen (Arnaldo Carvalho de Melo) - 'perf annotate --show-total-period' fixes: - TUI should show period, not nr_samples - Set appropriate column width for period/percent - Fix the column header to show "Period" when that is what is being asked for (Taeung Song, Arnaldo Carvalho de Melo) - Use default sort if evlist is empty, fixing pipe mode (David Carrillo-Cisneros) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> ---------------------------------------------------------------- Arnaldo Carvalho de Melo (6): perf annotate: Do not overwrite perf_sample->weight perf annotate stdio: Set enough columns for --show-total-period perf annotate: Fix storing per line sym_hist_entry perf annotate TUI: Use sym_hist_entry in disasm_line_samples perf annotate TUI: Clarify calculation of column header widths perf annotate TUI: Set appropriate column width for period/percent David Carrillo-Cisneros (1): perf sort: Use default sort if evlist is empty Geneviève Bastien (3): perf data: Add callchain to CTF conversion perf data: Add mmap[2] events to CTF conversion perf data: Add doc when no conversion support compiled Jiri Olsa (3): perf tools: Add perf_evsel__read_size function perf evsel: Add read_counter() perf stat: Use group read for event groups Taeung Song (2): perf annotate TUI: Fix --show-total-period perf annotate TUI: Fix column header when toggling period/percent tools/perf/builtin-annotate.c | 2 - tools/perf/builtin-data.c | 2 +- tools/perf/builtin-stat.c | 30 +++++++- tools/perf/ui/browsers/annotate.c | 36 +++++----- 
tools/perf/util/annotate.c | 11 +-- tools/perf/util/counts.h | 1 + tools/perf/util/data-convert-bt.c | 127 +++++++++++++++++++++++++++++++++- tools/perf/util/evlist.h | 5 ++ tools/perf/util/evsel.c | 139 +++++++++++++++++++++++++++++++++++++- tools/perf/util/evsel.h | 2 + tools/perf/util/sort.c | 2 +- tools/perf/util/stat.c | 4 ++ tools/perf/util/stat.h | 5 +- 13 files changed, 334 insertions(+), 32 deletions(-) Test results: The first ones are container (docker) based builds of tools/perf with and without libelf support, objtool where it is supported and samples/bpf/, ditto. Where clang is available, it is also used to build perf with/without libelf. Several are cross builds, the ones with -x-ARCH and the android one, and those may not have all the features built, due to lack of multi-arch devel packages, which are available and in use so far on just a few, like debian:experimental-x-{arm64,mipsel}. The 'perf test' one performs a variety of tests exercising tools/perf/util/, tools/lib/{bpf,traceevent,etc}, and also runs perf commands with a variety of command line event specifications, intercepting the sys_perf_event syscall to check that the perf_event_attr fields are set up as expected, among a variety of other unit tests. Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/ with a variety of feature sets, exercising the build with an incomplete set of features as well as with a complete one. It is planned to have it run on each of the containers mentioned above, using some container orchestration infrastructure. Get in contact if you are interested in helping to put this in place. 
# uname -a Linux jouet 4.12.0-rc6+ #3 SMP Tue Jun 27 15:12:38 -03 2017 x86_64 x86_64 x86_64 GNU/Linux # perf test 1: vmlinux symtab matches kallsyms : Ok 2: Detect openat syscall event : Ok 3: Detect openat syscall event on all cpus : Ok 4: Read samples using the mmap interface : Ok 5: Parse event definition strings : Ok 6: Simple expression parser : Ok 7: PERF_RECORD_* events & perf_sample fields : Ok 8: Parse perf pmu format : Ok 9: DSO data read : Ok 10: DSO data cache : Ok 11: DSO data reopen : Ok 12: Roundtrip evsel->name : Ok 13: Parse sched tracepoints fields : Ok 14: syscalls:sys_enter_openat event fields : Ok 15: Setup struct perf_event_attr : Ok 16: Match and link multiple hists : Ok 17: 'import perf' in python : Ok 18: Breakpoint overflow signal handler : Ok 19: Breakpoint overflow sampling : Ok 20: Number of exit events of a simple workload : Ok 21: Software clock events period values : Ok 22: Object code reading : Ok 23: Sample parsing : Ok 24: Use a dummy software event to keep tracking: Ok 25: Parse with no sample_id_all bit set : Ok 26: Filter hist entries : Ok 27: Lookup mmap thread : Ok 28: Share thread mg : Ok 29: Sort output of hist entries : Ok 30: Cumulate child hist entries : Ok 31: Track with sched_switch : Ok 32: Filter fds with revents mask in a fdarray : Ok 33: Add fd to a fdarray, making it autogrow : Ok 34: kmod_path__parse : Ok 35: Thread map : Ok 36: LLVM search and compile : 36.1: Basic BPF llvm compile : Ok 36.2: kbuild searching : Ok 36.3: Compile source for BPF prologue generation: Ok 36.4: Compile source for BPF relocation : Ok 37: Session topology : Ok 38: BPF filter : 38.1: Basic BPF filtering : Ok 38.2: BPF pinning : Ok 38.3: BPF prologue generation : Ok 38.4: BPF relocation checker : Ok 39: Synthesize thread map : Ok 40: Remove thread map : Ok 41: Synthesize cpu map : Ok 42: Synthesize stat config : Ok 43: Synthesize stat : Ok 44: Synthesize stat round : Ok 45: Synthesize attr update : Ok 46: Event times : Ok 47: Read 
backward ring buffer : Ok 48: Print cpu map : Ok 49: Probe SDT events : Ok 50: is_printable_array : Ok 51: Print bitmap : Ok 52: perf hooks : Ok 53: builtin clang support : Skip (not compiled in) 54: unit_number__scnprintf : Ok 55: x86 rdpmc : Ok 56: Convert perf time to TSC : Ok 57: DWARF unwind : Ok 58: x86 instruction decoder - new instructions : Ok 59: Intel cqm nmi context read : Skip # # dm 1 alpine:3.4: Ok 2 alpine:3.5: Ok 3 alpine:3.6: Ok 4 alpine:edge: Ok 5 android-ndk:r12b-arm: Ok 6 archlinux:latest: Ok 7 centos:5: Ok 8 centos:6: Ok 9 centos:7: Ok 10 debian:7: Ok 11 debian:8: Ok 12 debian:9: Ok 13 debian:experimental: Ok 14 debian:experimental-x-arm64: Ok 15 debian:experimental-x-mips: Ok 16 debian:experimental-x-mips64: Ok 17 debian:experimental-x-mipsel: Ok 18 fedora:20: Ok 19 fedora:21: Ok 20 fedora:22: Ok 21 fedora:23: Ok 22 fedora:24: Ok 23 fedora:24-x-ARC-uClibc: Ok 24 fedora:25: Ok 25 fedora:26: Ok 26 fedora:rawhide: FAIL 27 mageia:5: Ok 28 opensuse:13.2: Ok 29 opensuse:42.1: Ok 30 opensuse:42.2: Ok 31 opensuse:tumbleweed: Ok 32 oraclelinux:6: Ok 33 oraclelinux:7: Ok 34 ubuntu:12.04.5: Ok 35 ubuntu:14.04.4: Ok 36 ubuntu:14.04.4-x-linaro-arm64: Ok 37 ubuntu:15.10: Ok 38 ubuntu:16.04: Ok 39 ubuntu:16.04-x-arm: Ok 40 ubuntu:16.04-x-arm64: Ok 41 ubuntu:16.04-x-powerpc: Ok 42 ubuntu:16.04-x-powerpc64: Ok 43 ubuntu:16.04-x-powerpc64el: Ok 44 ubuntu:16.04-x-s390: Ok 45 ubuntu:16.10: Ok 46 ubuntu:17.04: Ok 47 ubuntu:17.10: Ok # $ make -C tools/perf build-test make: Entering directory '/home/acme/git/linux/tools/perf' - tarpkg: ./tests/perf-targz-src-pkg . 
make_no_libunwind_O: make NO_LIBUNWIND=1 make_install_prefix_O: make install prefix=/tmp/krava make_static_O: make LDFLAGS=-static make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1 make_no_libelf_O: make NO_LIBELF=1 make_no_libperl_O: make NO_LIBPERL=1 make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1 make_install_bin_O: make install-bin make_no_backtrace_O: make NO_BACKTRACE=1 make_with_clangllvm_O: make LIBCLANGLLVM=1 make_no_gtk2_O: make NO_GTK2=1 make_install_prefix_slash_O: make install prefix=/tmp/krava/ make_util_map_o_O: make util/map.o make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1 make_no_slang_O: make NO_SLANG=1 make_util_pmu_bison_o_O: make util/pmu-bison.o make_perf_o_O: make perf.o make_help_O: make help make_no_libpython_O: make NO_LIBPYTHON=1 make_with_babeltrace_O: make LIBBABELTRACE=1 make_debug_O: make DEBUG=1 make_no_libnuma_O: make NO_LIBNUMA=1 make_no_auxtrace_O: make NO_AUXTRACE=1 make_pure_O: make make_no_demangle_O: make NO_DEMANGLE=1 make_no_newt_O: make NO_NEWT=1 make_no_libbionic_O: make NO_LIBBIONIC=1 make_clean_all_O: make clean all make_doc_O: make doc make_no_libaudit_O: make NO_LIBAUDIT=1 make_tags_O: make tags make_install_O: make install make_no_libbpf_O: make NO_LIBBPF=1 OK make: Leaving directory '/home/acme/git/linux/tools/perf' $ ^ permalink raw reply [flat|nested] 51+ messages in thread
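Editorial note on the "Use group read for event groups" item in the pull request above: with PERF_FORMAT_GROUP set in perf_event_attr.read_format, one read() on the group leader returns the counts for every member of the group, which is where the reduced overhead comes from. The following is only a minimal sketch of decoding such a buffer, not the code from the series; struct group_reading, parse_group_read() and group_value() are hypothetical names made up for this illustration. It assumes the layout documented in perf_event_open(2) (nr, optional time_enabled, optional time_running, then one value and optional id per event) and deliberately rejects any read_format bit it does not know about.

```c
#include <stddef.h>
#include <stdint.h>
#include <linux/perf_event.h>	/* PERF_FORMAT_* flags */

/* Hypothetical helper type: a decoded view of one group read() buffer. */
struct group_reading {
	uint64_t nr;			/* number of events in the group */
	uint64_t time_enabled;		/* 0 if not requested */
	uint64_t time_running;		/* 0 if not requested */
	const uint64_t *entries;	/* nr entries of 'stride' u64 words */
	size_t stride;			/* 1 = value only, 2 = value + id */
};

/*
 * Decode a buffer read from a group leader opened with PERF_FORMAT_GROUP.
 * 'nwords' is the number of u64 words available in 'buf'.
 * Returns 0 on success, -1 if the format is unsupported or the buffer
 * is too short for the advertised number of events.
 */
static int parse_group_read(const uint64_t *buf, size_t nwords,
			    uint64_t read_format, struct group_reading *out)
{
	const uint64_t known = PERF_FORMAT_GROUP | PERF_FORMAT_ID |
			       PERF_FORMAT_TOTAL_TIME_ENABLED |
			       PERF_FORMAT_TOTAL_TIME_RUNNING;
	size_t pos = 0;

	/* Only the four flags above are handled in this sketch. */
	if (!(read_format & PERF_FORMAT_GROUP) || (read_format & ~known))
		return -1;
	if (nwords < 1)
		return -1;

	out->nr = buf[pos++];
	out->time_enabled = 0;
	out->time_running = 0;

	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) {
		if (pos >= nwords)
			return -1;
		out->time_enabled = buf[pos++];
	}
	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING) {
		if (pos >= nwords)
			return -1;
		out->time_running = buf[pos++];
	}

	out->stride = (read_format & PERF_FORMAT_ID) ? 2 : 1;
	if (out->nr > (nwords - pos) / out->stride)
		return -1;

	out->entries = buf + pos;
	return 0;
}

/* Counter value of the i-th event in the group (leader is index 0). */
static uint64_t group_value(const struct group_reading *r, uint64_t i)
{
	return r->entries[i * r->stride];
}
```

A caller would read() from the leader's file descriptor into a uint64_t array, pass the number of words read together with the same read_format it used at open time, and then walk group_value() from 0 to nr - 1.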
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2017-07-28 20:00 [GIT PULL 00/15] perf/core improvements and fixes Arnaldo Carvalho de Melo @ 2017-07-30 9:31 ` Ingo Molnar 0 siblings, 0 replies; 51+ messages in thread From: Ingo Molnar @ 2017-07-30 9:31 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, linux-perf-users, Adrian Hunter, Alexander Shishkin, Andi Kleen, David Ahern, David Carrillo-Cisneros, Francis Deslauriers, Geneviève Bastien, Jiri Olsa, Julien Desfossez, Martin Liška, Mathieu Desnoyers, Milian Wolff, Namhyung Kim, Paul Turner, Peter Zijlstra, Simon Que, Stephane Eranian, Taeung Song, Wang Nan, Arnaldo Carvalho de Melo * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > Hi Ingo, > > Please consider pulling, > > - Arnaldo > > Test results at the end of this message, as usual. > > The following changes since commit ee438ec8f33c5af0d4a4ffb935c5b9272e8c2680: > > Merge tag 'perf-core-for-mingo-4.14-20170725' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-07-26 19:07:30 +0200) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.14-20170728 > > for you to fetch changes up to 6b7007af728df7258bb60ed73099be3b59b3030e: > > perf data: Add doc when no conversion support compiled (2017-07-28 16:30:45 -0300) > > ---------------------------------------------------------------- > perf/core improvements and fixes for 4.14: > > New features: > > - Add PERF_SAMPLE_CALLCHAIN and PERF_RECORD_MMAP[2] to 'perf data' CTF > conversion, allowing CTF trace visualization tools to show callchains > and to resolve symbols (Geneviève Bastien) > > Improvements: > > - Use group read for event groups in 'perf stat', reducing overhead when > groups are defined in the event specification, i.e. 
when using {} to
>   enclose a list of events, asking them to be read at the same time,
>   e.g.: "perf stat -e '{cycles,instructions}'" (Jiri Olsa)
>
> Fixes:
>
> - Do not overwrite perf_sample->weight in 'perf annotate' when
>   processing samples, use whatever came from the kernel when
>   perf_event_attr.sample_type has PERF_SAMPLE_WEIGHT set or just handle
>   its default value, 0, when that is not set and "weight" is one of the
>   sort orders chosen (Arnaldo Carvalho de Melo)
>
> - 'perf annotate --show-total-period' fixes:
>   - TUI should show period, not nr_samples
>   - Set appropriate column width for period/percent
>   - Fix the column header to show "Period" when that is what
>     is being asked for
>   (Taeung Song, Arnaldo Carvalho de Melo)
>
> - Use default sort if evlist is empty, fixing pipe mode (David Carrillo-Cisneros)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (6):
>       perf annotate: Do not overwrite perf_sample->weight
>       perf annotate stdio: Set enough columns for --show-total-period
>       perf annotate: Fix storing per line sym_hist_entry
>       perf annotate TUI: Use sym_hist_entry in disasm_line_samples
>       perf annotate TUI: Clarify calculation of column header widths
>       perf annotate TUI: Set appropriate column width for period/percent
>
> David Carrillo-Cisneros (1):
>       perf sort: Use default sort if evlist is empty
>
> Geneviève Bastien (3):
>       perf data: Add callchain to CTF conversion
>       perf data: Add mmap[2] events to CTF conversion
>       perf data: Add doc when no conversion support compiled
>
> Jiri Olsa (3):
>       perf tools: Add perf_evsel__read_size function
>       perf evsel: Add read_counter()
>       perf stat: Use group read for event groups
>
> Taeung Song (2):
>       perf annotate TUI: Fix --show-total-period
>       perf annotate TUI: Fix column header when toggling period/percent
>
>  tools/perf/builtin-annotate.c | 2 -
>  tools/perf/builtin-data.c | 2 +-
>
tools/perf/builtin-stat.c | 30 +++++++- > tools/perf/ui/browsers/annotate.c | 36 +++++----- > tools/perf/util/annotate.c | 11 +-- > tools/perf/util/counts.h | 1 + > tools/perf/util/data-convert-bt.c | 127 +++++++++++++++++++++++++++++++++- > tools/perf/util/evlist.h | 5 ++ > tools/perf/util/evsel.c | 139 +++++++++++++++++++++++++++++++++++++- > tools/perf/util/evsel.h | 2 + > tools/perf/util/sort.c | 2 +- > tools/perf/util/stat.c | 4 ++ > tools/perf/util/stat.h | 5 +- > 13 files changed, 334 insertions(+), 32 deletions(-) Pulled, thanks a lot Arnaldo! Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* [GIT PULL 00/15] perf/core improvements and fixes @ 2017-02-14 1:13 Arnaldo Carvalho de Melo 2017-02-14 6:31 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2017-02-14 1:13 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Alexei Starovoitov, Clark Williams, Daniel Borkmann, David Ahern, David S . Miller, Jiri Olsa, Joe Perches, Joe Stringer, Mickaël Salaün, Namhyung Kim, netdev, Peter Zijlstra, Steven Rostedt, Taeung Song, Wang Nan, Wang YanQing, linux-perf-users, Arnaldo Carvalho de Melo Hi Ingo, Please consider pulling, - Arnaldo Test results at the end of this message, as usual. The following changes since commit f2029b1e47b607619d1dd2cb0bbb77f64ec6b7c2: perf/x86/intel: Add Kaby Lake support (2017-02-11 21:28:23 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170213 for you to fetch changes up to a734fb5d60067a73dd7099a58756847c07f9cd68: samples/bpf: Reset global variables (2017-02-13 17:22:53 -0300) ---------------------------------------------------------------- perf/core improvements and fixes: New feature: - Introduce the 'delta-abs' 'perf diff' compute method, that orders the histogram entries by the absolute value of the percentage delta for a function in two perf.data files, i.e. 
the functions that changed the most (increase or decrease in samples)
  come first (Namhyung Kim)

User visible:

- Improve message about tweaking the kernel.perf_event_paranoid setting,
  telling how to make the change permanent by editing /etc/sysctl.conf
  (Ingo Molnar)

Infrastructure:

- Introduce linux/compiler-gcc.h as a counterpart to the kernel's,
  initially containing the definition of __fallthrough, more to
  come (__maybe_unused, etc) (Arnaldo Carvalho de Melo)

- Fixes for problems uncovered by building tools/perf with clang, such
  as always true tests of arrays against NULL and variables that sometimes
  were used without being initialized (Arnaldo Carvalho de Melo, Steven Rostedt)

- Before loading a new ELF, clear global variables set by the
  samples/bpf loader (Mickaël Salaün)

- Ignore already processed ELF sections in the samples/bpf
  loader (Mickaël Salaün)

- Fix compile error in the scripting code with some perl5
  versions (Wang YanQing)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (6):
      tools include: Introduce linux/compiler-gcc.h
      tools lib traceevent plugin function: Initialize 'index' variable
      perf evsel: Inform how to make a sysctl setting permanent
      perf symbols: No need to check if sym->name is NULL
      perf tests record: No need to test an array against NULL
      perf symbols: dso->name is an array, no need to check it against NULL

Mickaël Salaün (3):
      samples/bpf: Add missing header
      samples/bpf: Ignore already processed ELF sections
      samples/bpf: Reset global variables

Namhyung Kim (4):
      perf diff: Add 'delta-abs' compute method
      perf diff: Add diff.order config option
      perf diff: Add diff.compute config option
      perf diff: Change default setting to "delta-abs"

Steven Rostedt (VMware) (1):
      tools lib traceevent: Initialize lenght on OLD_RING_BUFFER_TYPE_TIME_STAMP

Wang YanQing (1):
      perf scripting perl: Fix compile error with some perl5 versions

 samples/bpf/bpf_load.c | 7 ++
samples/bpf/tracex5_kern.c | 1 + tools/include/linux/compiler-gcc.h | 14 ++++ tools/include/linux/compiler.h | 10 +-- tools/lib/traceevent/kbuffer-parse.c | 1 + tools/lib/traceevent/plugin_function.c | 2 +- tools/perf/Documentation/perf-config.txt | 12 ++++ tools/perf/Documentation/perf-diff.txt | 15 ++++- tools/perf/MANIFEST | 1 + tools/perf/builtin-diff.c | 78 ++++++++++++++++++++-- tools/perf/builtin-kmem.c | 4 +- tools/perf/builtin-record.c | 2 +- tools/perf/builtin-sched.c | 2 +- tools/perf/builtin-stat.c | 2 +- tools/perf/builtin-top.c | 2 +- tools/perf/tests/perf-record.c | 2 +- tools/perf/util/evsel.c | 4 +- tools/perf/util/evsel_fprintf.c | 1 - tools/perf/util/machine.c | 2 +- tools/perf/util/map.c | 4 +- tools/perf/util/scripting-engines/Build | 2 +- .../perf/util/scripting-engines/trace-event-perl.c | 4 +- tools/perf/util/symbol_fprintf.c | 2 +- 23 files changed, 145 insertions(+), 29 deletions(-) create mode 100644 tools/include/linux/compiler-gcc.h Test results: The first ones are container (docker) based builds of tools/perf with and without libelf support, objtool where it is supported and samples/bpf/, ditto. Several are cross builds, the ones with -x-ARCH, and the android one, and those may not have all the features built, due to lack of multi-arch devel packages, available and being used so far on just a few, like debian:experimental-x-{arm64,mipsel}. The 'perf test' one will perform a variety of tests exercising tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands with a variety of command line event specifications to then intercept the sys_perf_event syscall to check that the perf_event_attr fields are set up as expected, among a variety of other unit tests. Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/ with a variety of feature sets, exercising the build with an incomplete set of features as well as with a complete one. 
It is planned to have it run on each of the containers mentioned above, using some container orchestration infrastructure. Get in contact if interested in helping having this in place. # time dm 1 alpine:3.4: Ok 2 android-ndk:r12b-arm: Ok 3 archlinux:latest: Ok 4 centos:5: Ok 5 centos:6: Ok 6 centos:7: Ok 7 debian:7: Ok 8 debian:8: Ok 9 debian:experimental: Ok 10 debian:experimental-x-arm64: Ok 11 debian:experimental-x-mips: Ok 12 debian:experimental-x-mips64: Ok 13 debian:experimental-x-mipsel: Ok 14 fedora:20: Ok 15 fedora:21: Ok 16 fedora:22: Ok 17 fedora:23: Ok 18 fedora:24: Ok 19 fedora:24-x-ARC-uClibc: Ok 20 fedora:25: Ok 21 fedora:rawhide: Ok 22 mageia:5: Ok 23 opensuse:13.2: Ok 24 opensuse:42.1: Ok 25 opensuse:tumbleweed: Ok 26 ubuntu:12.04.5: Ok 27 ubuntu:14.04.4-x-linaro-arm64: Ok 28 ubuntu:15.10: Ok 29 ubuntu:16.04: Ok 30 ubuntu:16.04-x-arm: Ok 31 ubuntu:16.04-x-arm64: Ok 32 ubuntu:16.04-x-powerpc: Ok 33 ubuntu:16.04-x-powerpc64: Ok 34 ubuntu:16.04-x-powerpc64el: Ok 35 ubuntu:16.04-x-s390: Ok 36 ubuntu:16.10: Ok # # uname -a Linux jouet 4.9.8-201.fc25.x86_64 #1 SMP Tue Feb 7 11:28:07 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux # perf test 1: vmlinux symtab matches kallsyms : Ok 2: Detect openat syscall event : Ok 3: Detect openat syscall event on all cpus : Ok 4: Read samples using the mmap interface : Ok 5: Parse event definition strings : Ok 6: PERF_RECORD_* events & perf_sample fields : Ok 7: Parse perf pmu format : Ok 8: DSO data read : Ok 9: DSO data cache : Ok 10: DSO data reopen : Ok 11: Roundtrip evsel->name : Ok 12: Parse sched tracepoints fields : Ok 13: syscalls:sys_enter_openat event fields : Ok 14: Setup struct perf_event_attr : Ok 15: Match and link multiple hists : Ok 16: 'import perf' in python : Ok 17: Breakpoint overflow signal handler : Ok 18: Breakpoint overflow sampling : Ok 19: Number of exit events of a simple workload : Ok 20: Software clock events period values : Ok 21: Object code reading : Ok 22: Sample parsing : Ok 23: Use a dummy 
software event to keep tracking: Ok 24: Parse with no sample_id_all bit set : Ok 25: Filter hist entries : Ok 26: Lookup mmap thread : Ok 27: Share thread mg : Ok 28: Sort output of hist entries : Ok 29: Cumulate child hist entries : Ok 30: Track with sched_switch : Ok 31: Filter fds with revents mask in a fdarray : Ok 32: Add fd to a fdarray, making it autogrow : Ok 33: kmod_path__parse : Ok 34: Thread map : Ok 35: LLVM search and compile : 35.1: Basic BPF llvm compile : Ok 35.2: kbuild searching : Ok 35.3: Compile source for BPF prologue generation: Ok 35.4: Compile source for BPF relocation : Ok 36: Session topology : Ok 37: BPF filter : 37.1: Basic BPF filtering : Ok 37.2: BPF pinning : Ok 37.3: BPF prologue generation : Ok 37.4: BPF relocation checker : Ok 38: Synthesize thread map : Ok 39: Remove thread map : Ok 40: Synthesize cpu map : Ok 41: Synthesize stat config : Ok 42: Synthesize stat : Ok 43: Synthesize stat round : Ok 44: Synthesize attr update : Ok 45: Event times : Ok 46: Read backward ring buffer : Ok 47: Print cpu map : Ok 48: Probe SDT events : Ok 49: is_printable_array : Ok 50: Print bitmap : Ok 51: perf hooks : Ok 52: builtin clang support : Skip (not compiled in) 53: unit_number__scnprintf : Ok 54: x86 rdpmc : Ok 55: Convert perf time to TSC : Ok 56: DWARF unwind : Ok 57: x86 instruction decoder - new instructions : Ok 58: Intel cqm nmi context read : Skip # $ make -C tools/perf build-test make: Entering directory '/home/acme/git/linux/tools/perf' - tarpkg: ./tests/perf-targz-src-pkg . 
make_install_O: make install make_no_libunwind_O: make NO_LIBUNWIND=1 make_no_newt_O: make NO_NEWT=1 make_no_slang_O: make NO_SLANG=1 make_static_O: make LDFLAGS=-static make_no_backtrace_O: make NO_BACKTRACE=1 make_no_libbionic_O: make NO_LIBBIONIC=1 make_clean_all_O: make clean all make_util_pmu_bison_o_O: make util/pmu-bison.o make_no_libnuma_O: make NO_LIBNUMA=1 make_tags_O: make tags make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1 make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1 make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 make_perf_o_O: make perf.o make_install_prefix_slash_O: make install prefix=/tmp/krava/ make_no_libpython_O: make NO_LIBPYTHON=1 make_no_gtk2_O: make NO_GTK2=1 make_no_libaudit_O: make NO_LIBAUDIT=1 make_help_O: make help make_with_babeltrace_O: make LIBBABELTRACE=1 make_install_prefix_O: make install prefix=/tmp/krava make_debug_O: make DEBUG=1 make_no_libbpf_O: make NO_LIBBPF=1 make_util_map_o_O: make util/map.o make_with_clangllvm_O: make LIBCLANGLLVM=1 make_no_libperl_O: make NO_LIBPERL=1 make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1 make_doc_O: make doc make_no_libelf_O: make NO_LIBELF=1 make_no_auxtrace_O: make NO_AUXTRACE=1 make_install_bin_O: make install-bin make_no_demangle_O: make NO_DEMANGLE=1 make_pure_O: make OK make: Leaving directory '/home/acme/git/linux/tools/perf' $ ^ permalink raw reply [flat|nested] 51+ messages in thread
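Editorial note on the 'delta-abs' item in the pull request above: the description says entries are ordered by the absolute value of the percentage delta between two perf.data files, so the largest changes come first regardless of sign. The following is only a minimal sketch of that ordering rule, not the code in builtin-diff.c; struct diff_entry, its fields and the helper names are hypothetical and exist just for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-function record: overhead in the two perf.data files. */
struct diff_entry {
	const char *sym;	/* function name */
	double base_pct;	/* percentage of samples in the baseline file */
	double new_pct;		/* percentage of samples in the new file */
};

/* Signed percentage delta, as a plain 'delta' ordering would use it. */
static double entry_delta(const struct diff_entry *e)
{
	return e->new_pct - e->base_pct;
}

/* Absolute value without pulling in libm. */
static double abs_double(double d)
{
	return d < 0 ? -d : d;
}

/* qsort() comparator: larger |delta| sorts first. */
static int cmp_delta_abs(const void *a, const void *b)
{
	double da = abs_double(entry_delta(a));
	double db = abs_double(entry_delta(b));

	if (da > db)
		return -1;
	if (da < db)
		return 1;
	return 0;
}

/* Order entries so that the functions that changed the most come first. */
static void sort_delta_abs(struct diff_entry *entries, size_t n)
{
	qsort(entries, n, sizeof(*entries), cmp_delta_abs);
}
```

With entries whose deltas are +1.0, -8.0 and +4.5, this puts the -8.0 entry first, then +4.5, then +1.0, whereas a signed ordering would push the largest decrease to the end.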
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2017-02-14 1:13 Arnaldo Carvalho de Melo @ 2017-02-14 6:31 ` Ingo Molnar 0 siblings, 0 replies; 51+ messages in thread From: Ingo Molnar @ 2017-02-14 6:31 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Adrian Hunter, Alexei Starovoitov, Clark Williams, Daniel Borkmann, David Ahern, David S . Miller, Jiri Olsa, Joe Perches, Joe Stringer, Mickaël Salaün, Namhyung Kim, netdev, Peter Zijlstra, Steven Rostedt, Taeung Song, Wang Nan, Wang YanQing, linux-perf-users, Arnaldo Carvalho de Melo * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > Hi Ingo, > > Please consider pulling, > > - Arnaldo > > Test results at the end of this message, as usual. > > The following changes since commit f2029b1e47b607619d1dd2cb0bbb77f64ec6b7c2: > > perf/x86/intel: Add Kaby Lake support (2017-02-11 21:28:23 +0100) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170213 > > for you to fetch changes up to a734fb5d60067a73dd7099a58756847c07f9cd68: > > samples/bpf: Reset global variables (2017-02-13 17:22:53 -0300) > > ---------------------------------------------------------------- > perf/core improvements and fixes: > > New feature: > > - Introduce the 'delta-abs' 'perf diff' compute method, that orders the > histogram entries by the absolute value of the percentage delta for a > function in two perf.data files, i.e. 
the functions that changed the
>   most (increase or decrease in samples) come first (Namhyung Kim)
>
> User visible:
>
> - Improve message about tweaking the kernel.perf_event_paranoid setting,
>   telling how to make the change permanent by editing /etc/sysctl.conf
>   (Ingo Molnar)
>
> Infrastructure:
>
> - Introduce linux/compiler-gcc.h as a counterpart to the kernel's,
>   initially containing the definition of __fallthrough, more to
>   come (__maybe_unused, etc) (Arnaldo Carvalho de Melo)
>
> - Fixes for problems uncovered by building tools/perf with clang, such
>   as always true tests of arrays against NULL and variables that sometimes
>   were used without being initialized (Arnaldo Carvalho de Melo, Steven Rostedt)
>
> - Before loading a new ELF, clear global variables set by the
>   samples/bpf loader (Mickaël Salaün)
>
> - Ignore already processed ELF sections in the samples/bpf
>   loader (Mickaël Salaün)
>
> - Fix compile error in the scripting code with some perl5
>   versions (Wang YanQing)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (6):
>       tools include: Introduce linux/compiler-gcc.h
>       tools lib traceevent plugin function: Initialize 'index' variable
>       perf evsel: Inform how to make a sysctl setting permanent
>       perf symbols: No need to check if sym->name is NULL
>       perf tests record: No need to test an array against NULL
>       perf symbols: dso->name is an array, no need to check it against NULL
>
> Mickaël Salaün (3):
>       samples/bpf: Add missing header
>       samples/bpf: Ignore already processed ELF sections
>       samples/bpf: Reset global variables
>
> Namhyung Kim (4):
>       perf diff: Add 'delta-abs' compute method
>       perf diff: Add diff.order config option
>       perf diff: Add diff.compute config option
>       perf diff: Change default setting to "delta-abs"
>
> Steven Rostedt (VMware) (1):
>       tools lib traceevent: Initialize lenght on OLD_RING_BUFFER_TYPE_TIME_STAMP
>
> Wang
YanQing (1): > perf scripting perl: Fix compile error with some perl5 versions > > samples/bpf/bpf_load.c | 7 ++ > samples/bpf/tracex5_kern.c | 1 + > tools/include/linux/compiler-gcc.h | 14 ++++ > tools/include/linux/compiler.h | 10 +-- > tools/lib/traceevent/kbuffer-parse.c | 1 + > tools/lib/traceevent/plugin_function.c | 2 +- > tools/perf/Documentation/perf-config.txt | 12 ++++ > tools/perf/Documentation/perf-diff.txt | 15 ++++- > tools/perf/MANIFEST | 1 + > tools/perf/builtin-diff.c | 78 ++++++++++++++++++++-- > tools/perf/builtin-kmem.c | 4 +- > tools/perf/builtin-record.c | 2 +- > tools/perf/builtin-sched.c | 2 +- > tools/perf/builtin-stat.c | 2 +- > tools/perf/builtin-top.c | 2 +- > tools/perf/tests/perf-record.c | 2 +- > tools/perf/util/evsel.c | 4 +- > tools/perf/util/evsel_fprintf.c | 1 - > tools/perf/util/machine.c | 2 +- > tools/perf/util/map.c | 4 +- > tools/perf/util/scripting-engines/Build | 2 +- > .../perf/util/scripting-engines/trace-event-perl.c | 4 +- > tools/perf/util/symbol_fprintf.c | 2 +- > 23 files changed, 145 insertions(+), 29 deletions(-) > create mode 100644 tools/include/linux/compiler-gcc.h Pulled, thanks a lot Arnaldo! Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* [GIT PULL 00/15] perf/core improvements and fixes
@ 2016-11-15 1:38 Arnaldo Carvalho de Melo
  2016-11-15 8:47 ` Ingo Molnar
  0 siblings, 1 reply; 51+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-11-15 1:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Arnaldo Carvalho de Melo,
	Adrian Hunter, Andi Kleen, David Ahern, He Kuang, Jiri Olsa,
	Kan Liang, Linux-kernel, Nambong Ha, Namhyung Kim, Peter Zijlstra,
	Rabin Vincent, Stephane Eranian, Taeung Song, Wang Nan,
	William Cohen, Wookje Kwon, Yao Jin

Hi Ingo,

	Please consider pulling,

- Arnaldo

Test results at the end.

The following changes since commit 91a79e5fa696fa626bfbd47f827eaf3eb7d76dc5:

  Merge tag 'perf-core-for-mingo-20161028' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-10-28 19:37:34 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161114

for you to fetch changes up to fef51ecd1056b5e090c9fb73e0833bd751389572:

  perf report: Show branch info in callchain entry for browser mode (2016-11-14 13:34:08 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Allow querying and setting .perfconfig variables (Taeung Song)

- Show branch information in callchains (predicted, TSX aborts, loop
  iterations, etc) (Jin Yao)

Infrastructure:

- Support kbuild's CFLAGS_REMOVE_ in tools/build (Jiri Olsa)

- Plug building jvmti to the main perf Makefile (Jiri Olsa)

Documentation:

- Update Intel PT documentation about context switch events (Arnaldo Carvalho de Melo)

- Fix 'perf record --call-graph dwarf' help/config in builds not linking
  with an unwind library, mentioning that is a possible record option (Rabin Vincent)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (1):
      perf intel-pt: Update documentation
about context switch events Jin Yao (5): perf report: Add branch flag to callchain cursor node perf report: Create a symbol_conf flag for showing branch flag counting perf report: Calculate and return the branch flag counting perf report: Show branch info in callchain entry for stdio mode perf report: Show branch info in callchain entry for browser mode Jiri Olsa (4): tools build: Add CFLAGS_REMOVE_* support tools build: Add jvmti feature detection support perf jvmti: Plug compilation into perf build perf kvmti: Remove unused Makefile file Rabin Vincent (1): perf callchain: Fixup help/config for no-unwinding Taeung Song (4): perf config: Add support for getting config key-value pairs perf config: Validate config variable arguments before trying use them perf config: Add support setting variables in a config file perf config: Mark where are config items from (user or system) tools/build/Build.include | 4 +- tools/build/Documentation/Build.txt | 6 +- tools/build/feature/Makefile | 6 +- tools/build/feature/test-jvmti.c | 13 ++ tools/perf/Documentation/intel-pt.txt | 19 ++- tools/perf/Documentation/perf-config.txt | 35 ++++++ tools/perf/Makefile.config | 26 ++++ tools/perf/Makefile.perf | 24 +++- tools/perf/builtin-config.c | 137 ++++++++++++++++++++- tools/perf/builtin-report.c | 3 + tools/perf/jvmti/Build | 8 ++ tools/perf/jvmti/Makefile | 89 -------------- tools/perf/tests/make | 2 +- tools/perf/ui/browsers/hists.c | 20 ++- tools/perf/ui/stdio/hist.c | 35 +++++- tools/perf/util/callchain.c | 205 ++++++++++++++++++++++++++++++- tools/perf/util/callchain.h | 26 +++- tools/perf/util/config.c | 20 +++ tools/perf/util/config.h | 4 + tools/perf/util/machine.c | 82 ++++++++++--- tools/perf/util/symbol.h | 1 + 21 files changed, 634 insertions(+), 131 deletions(-) create mode 100644 tools/build/feature/test-jvmti.c create mode 100644 tools/perf/jvmti/Build delete mode 100644 tools/perf/jvmti/Makefile [root@jouet ~]# perf test 1: vmlinux symtab matches kallsyms : Ok 2: detect 
openat syscall event : Ok
3: detect openat syscall event on all cpus : Ok
4: read samples using the mmap interface : Ok
5: parse events tests : Ok
6: Validate PERF_RECORD_* events & perf_sample fields : Ok
7: Test perf pmu format parsing : Ok
8: Test dso data read : Ok
9: Test dso data cache : Ok
10: Test dso data reopen : Ok
11: roundtrip evsel->name check : Ok
12: Check parsing of sched tracepoints fields : Ok
13: Generate and check syscalls:sys_enter_openat event fields: Ok
14: struct perf_event_attr setup : Ok
15: Test matching and linking multiple hists : Ok
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : Ok
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : Ok
29: Test cumulation of child hist entries : Ok
30: Test tracking with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: Test kmod_path__parse function : Ok
34: Test thread map : Ok
35: Test LLVM searching and compiling :
35.1: Basic BPF llvm compiling test : Ok
35.2: Test kbuild searching : Ok
35.3: Compile source for BPF prologue generation test : Ok
35.4: Compile source for BPF relocation test : Ok
36: Test topology in session : Ok
37: Test BPF filter :
37.1: Test basic BPF filtering : Ok
37.2: Test BPF prologue generation : Ok
37.3: Test BPF relocation checker : Ok
38: Test thread map synthesize : Ok
39: Test cpu map synthesize : Ok
40: Test stat config synthesize : Ok
41: Test stat synthesize : Ok
42: Test stat round synthesize : Ok
43: Test attr update synthesize : Ok
44: Test events times : Ok
45: Test backward reading from ring buffer : Ok
46: Test cpu map print : Ok
47: Test SDT event probing : Ok
48: Test is_printable_array function : Ok
49: Test bitmap print : Ok
50: x86 rdpmc test : Ok
51: Test converting perf time to TSC : Ok
52: Test dwarf unwind : Ok
53: Test x86 instruction decoder - new instructions : Ok
54: Test intel cqm nmi context read : Skip
[root@jouet ~]#

[root@zoo ~]# time dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 fedora:20: Ok
11 fedora:21: Ok
12 fedora:22: Ok
13 fedora:23: Ok
14 fedora:24: Ok
15 fedora:24-x-ARC-uClibc: Ok
16 fedora:rawhide: Ok
17 mageia:5: Ok
18 opensuse:13.2: Ok
19 opensuse:42.1: Ok
20 opensuse:tumbleweed: Ok
21 ubuntu:12.04.5: Ok
22 ubuntu:14.04: Ok
23 ubuntu:14.04.4: Ok
24 ubuntu:15.10: Ok
25 ubuntu:16.04: Ok
26 ubuntu:16.04-x-arm: Ok
27 ubuntu:16.04-x-arm64: Ok
28 ubuntu:16.04-x-powerpc: Ok
29 ubuntu:16.04-x-powerpc64: Ok
30 ubuntu:16.04-x-powerpc64el: Ok
31 ubuntu:16.04-x-s390: Ok
32 ubuntu:16.10: Ok

real 61m29.498s
user 0m3.969s
sys 0m3.525s
[root@zoo ~]#

[acme@jouet linux]$ perf stat make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_install_O: make install
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libperl_O: make NO_LIBPERL=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_clean_all_O: make clean all
make_debug_O: make DEBUG=1
make_no_newt_O: make NO_NEWT=1
make_perf_o_O: make perf.o
make_no_demangle_O: make NO_DEMANGLE=1
make_doc_O: make doc
make_install_bin_O: make install-bin
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_slang_O: make NO_SLANG=1
make_no_libelf_O: make NO_LIBELF=1
make_static_O: make LDFLAGS=-static
make_util_map_o_O: make util/map.o
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_pure_O: make
make_help_O: make help
make_no_gtk2_O: make NO_GTK2=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_tags_O: make tags
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'

* Re: [GIT PULL 00/15] perf/core improvements and fixes
  2016-11-15 1:38 Arnaldo Carvalho de Melo
@ 2016-11-15 8:47 ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2016-11-15 8:47 UTC
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen, David Ahern, He Kuang, Jiri Olsa, Kan Liang, Nambong Ha, Namhyung Kim, Peter Zijlstra, Rabin Vincent, Stephane Eranian, Taeung Song, Wang Nan, William Cohen, Wookje Kwon, Yao Jin

* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> Test results at the end.
>
> The following changes since commit 91a79e5fa696fa626bfbd47f827eaf3eb7d76dc5:
>
>   Merge tag 'perf-core-for-mingo-20161028' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-10-28 19:37:34 +0200)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161114
>
> for you to fetch changes up to fef51ecd1056b5e090c9fb73e0833bd751389572:
>
>   perf report: Show branch info in callchain entry for browser mode (2016-11-14 13:34:08 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> New features:
>
> - Allow querying and setting .perfconfig variables (Taeung Song)
>
> - Show branch information in callchains (predicted, TSX aborts, loop
>   iterations, etc) (Jin Yao)
>
> Infrastructure:
>
> - Support kbuild's CFLAGS_REMOVE_ in tools/build (Jiri Olsa)
>
> - Plug building jvmti to the main perf Makefile (Jiri Olsa)
>
> Documentation:
>
> - Update Intel PT documentation about context switch events (Arnaldo Carvalho de Melo)
>
> - Fix 'perf record --call-graph dwarf' help/config in builds not linking
>   with an unwind library, mentioning that is a possible record option (Rabin Vincent)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (1):
>       perf intel-pt: Update documentation about context switch events
>
> Jin Yao (5):
>       perf report: Add branch flag to callchain cursor node
>       perf report: Create a symbol_conf flag for showing branch flag counting
>       perf report: Calculate and return the branch flag counting
>       perf report: Show branch info in callchain entry for stdio mode
>       perf report: Show branch info in callchain entry for browser mode
>
> Jiri Olsa (4):
>       tools build: Add CFLAGS_REMOVE_* support
>       tools build: Add jvmti feature detection support
>       perf jvmti: Plug compilation into perf build
>       perf kvmti: Remove unused Makefile file
>
> Rabin Vincent (1):
>       perf callchain: Fixup help/config for no-unwinding
>
> Taeung Song (4):
>       perf config: Add support for getting config key-value pairs
>       perf config: Validate config variable arguments before trying use them
>       perf config: Add support setting variables in a config file
>       perf config: Mark where are config items from (user or system)
>
>  tools/build/Build.include | 4 +-
>  tools/build/Documentation/Build.txt | 6 +-
>  tools/build/feature/Makefile | 6 +-
>  tools/build/feature/test-jvmti.c | 13 ++
>  tools/perf/Documentation/intel-pt.txt | 19 ++-
>  tools/perf/Documentation/perf-config.txt | 35 ++++++
>  tools/perf/Makefile.config | 26 ++++
>  tools/perf/Makefile.perf | 24 +++-
>  tools/perf/builtin-config.c | 137 ++++++++++++++++++++-
>  tools/perf/builtin-report.c | 3 +
>  tools/perf/jvmti/Build | 8 ++
>  tools/perf/jvmti/Makefile | 89 --------------
>  tools/perf/tests/make | 2 +-
>  tools/perf/ui/browsers/hists.c | 20 ++-
>  tools/perf/ui/stdio/hist.c | 35 +++++-
>  tools/perf/util/callchain.c | 205 ++++++++++++++++++++++++++++++-
>  tools/perf/util/callchain.h | 26 +++-
>  tools/perf/util/config.c | 20 +++
>  tools/perf/util/config.h | 4 +
>  tools/perf/util/machine.c | 82 ++++++++++---
>  tools/perf/util/symbol.h | 1 +
>  21 files changed, 634 insertions(+), 131 deletions(-)
>  create mode 100644 tools/build/feature/test-jvmti.c
>  create mode 100644 tools/perf/jvmti/Build
>  delete mode 100644 tools/perf/jvmti/Makefile

Pulled, thanks a lot Arnaldo!

Ingo

* [GIT PULL 00/15] perf/core improvements and fixes
@ 2016-10-27 20:40 Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 51+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-27 20:40 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen, Dave Hansen, David Ahern, Davidlohr Bueso, Frederic Weisbecker, Jiri Olsa, Josh Poimboeuf, Namhyung Kim, Peter Zijlstra, Sebastian Andrzej Siewior, Thomas Gleixner, Tom Zanussi, Wang Nan, Arnaldo Carvalho de Melo

Hi Ingo,

Please consider pulling,

- Arnaldo

Build and test stats at the end of the message.

The following changes since commit 76e2d2617d767c445498c4c4b1162eb2201cdd77:

  Merge tag 'perf-core-for-mingo-20161024' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-10-24 20:42:42 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161027

for you to fetch changes up to 97321c8437977490432d470799faa3e5f1227806:

  perf tools: Add missing object file to the python binding linkage list (2016-10-26 19:08:43 -0200)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Support matching by topic in 'perf list' (Andi Kleen)

User visible:

- Apply cpu color only when there was activity in 'perf sched map' (Namhyung Kim)

- Always show the task's COMM in 'perf sched map -v' (Namhyung Kim)

- Fix hierarchy column counts in the perf hist browser (top, report), avoiding
  showing nothing after pressing the RIGHT key a number of times (Namhyung Kim)

Infrastructure:

- Support cascading options in libsubcmd and use it to share common options in
  'perf sched' subcommands (Namhyung Kim)

- Avoid worker cacheline bouncing in 'perf bench futex' (Davidlohr Bueso)

- Sanitize numeric parameters in 'perf bench futex' (Davidlohr Bueso)

- Update copies of kernel files (Arnaldo Carvalho de Melo)

- Fix scripting (perl, python) setup to avoid leaks (Arnaldo Carvalho de Melo)

- Add missing object file to the python binding linkage list (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Andi Kleen (1):
      perf list: Support matching by topic

Arnaldo Carvalho de Melo (6):
      perf bench mem: Ignore export.h related changes to mem{cpy,set}.S
      tools: Update asm-generic/mman-common.h copy from the kernel
      perf tools: Update x86's syscall_64.tbl, adding pkey_(alloc,free,mprotect)
      perf scripting: Avoid leaking the scripting_context variable
      perf scripting: Don't die if scripting can't be setup, disable it
      perf tools: Add missing object file to the python binding linkage list

Davidlohr Bueso (2):
      perf bench futex: Avoid worker cacheline bouncing
      perf bench futex: Sanitize numeric parameters

Namhyung Kim (6):
      perf hist browser: Fix hierarchy column counts
      tools lib subcmd: Suppport cascading options
      perf sched: Make common options cascading
      perf sched map: Apply cpu color when there's an activity
      perf sched map: Always show task comm with -v
      perf tools: Introduce timestamp_in_usec()

 tools/include/uapi/asm-generic/mman-common.h | 5 +++
 tools/lib/subcmd/parse-options.c | 14 ++++++++
 tools/lib/subcmd/parse-options.h | 2 ++
 tools/perf/Makefile.perf | 4 +--
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl | 3 ++
 tools/perf/bench/futex-hash.c | 15 +++++----
 tools/perf/bench/futex-lock-pi.c | 7 +++-
 tools/perf/bench/futex-requeue.c | 2 ++
 tools/perf/bench/futex-wake-parallel.c | 4 +++
 tools/perf/bench/futex-wake.c | 3 ++
 tools/perf/bench/futex.h | 4 +++
 tools/perf/builtin-sched.c | 37 +++++++++++----------
 tools/perf/builtin-script.c | 9 ++++--
 tools/perf/ui/browsers/hists.c | 15 ++++++++-
 tools/perf/util/parse-branch-options.c | 2 +-
 tools/perf/util/pmu.c | 4 ++-
 tools/perf/util/python-ext-sources | 1 +
 tools/perf/util/trace-event-scripting.c | 39 +++++++++++------------
 tools/perf/util/util.c | 9 ++++++
 tools/perf/util/util.h | 3 ++
 20 files changed, 130 insertions(+), 52 deletions(-)

# perf test
1: vmlinux symtab matches kallsyms : Ok
2: detect openat syscall event : Ok
3: detect openat syscall event on all cpus : Ok
4: read samples using the mmap interface : Ok
5: parse events tests : Ok
6: Validate PERF_RECORD_* events & perf_sample fields : Ok
7: Test perf pmu format parsing : Ok
8: Test dso data read : Ok
9: Test dso data cache : Ok
10: Test dso data reopen : Ok
11: roundtrip evsel->name check : Ok
12: Check parsing of sched tracepoints fields : Ok
13: Generate and check syscalls:sys_enter_openat event fields: Ok
14: struct perf_event_attr setup : Ok
15: Test matching and linking multiple hists : Ok
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : Ok
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : Ok
29: Test cumulation of child hist entries : Ok
30: Test tracking with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: Test kmod_path__parse function : Ok
34: Test thread map : Ok
35: Test LLVM searching and compiling :
35.1: Basic BPF llvm compiling test : Ok
35.2: Test kbuild searching : Ok
35.3: Compile source for BPF prologue generation test : Ok
35.4: Compile source for BPF relocation test : Ok
36: Test topology in session : Ok
37: Test BPF filter :
37.1: Test basic BPF filtering : Ok
37.2: Test BPF prologue generation : Ok
37.3: Test BPF relocation checker : Ok
38: Test thread map synthesize : Ok
39: Test cpu map synthesize : Ok
40: Test stat config synthesize : Ok
41: Test stat synthesize : Ok
42: Test stat round synthesize : Ok
43: Test attr update synthesize : Ok
44: Test events times : Ok
45: Test backward reading from ring buffer : Ok
46: Test cpu map print : Ok
47: Test SDT event probing : Ok
48: Test is_printable_array function : Ok
49: Test bitmap print : Ok
50: x86 rdpmc test : Ok
51: Test converting perf time to TSC : Ok
52: Test dwarf unwind : Ok
53: Test x86 instruction decoder - new instructions : Ok
54: Test intel cqm nmi context read : Skip
#

# dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 fedora:20: Ok
11 fedora:21: Ok
12 fedora:22: Ok
13 fedora:23: Ok
14 fedora:24: Ok
15 fedora:24-x-ARC-uClibc: Ok
16 fedora:rawhide: Ok
17 mageia:5: Ok
18 opensuse:13.2: Ok
19 opensuse:42.1: Ok
20 opensuse:tumbleweed: Ok
21 ubuntu:12.04.5: Ok
22 ubuntu:14.04: Ok
23 ubuntu:14.04.4: Ok
24 ubuntu:15.10: Ok
25 ubuntu:16.04: Ok
26 ubuntu:16.04-x-arm: Ok
27 ubuntu:16.04-x-arm64: Ok
28 ubuntu:16.04-x-powerpc: Ok
29 ubuntu:16.04-x-powerpc64: Ok
30 ubuntu:16.04-x-powerpc64el: Ok
31 ubuntu:16.04-x-s390: Ok
32 ubuntu:16.10: Ok
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_debug_O: make DEBUG=1
make_install_prefix_O: make install prefix=/tmp/krava
make_with_babeltrace_O: make LIBBABELTRACE=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1
make_tags_O: make tags
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_help_O: make help
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_newt_O: make NO_NEWT=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_util_map_o_O: make util/map.o
make_install_bin_O: make install-bin
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_demangle_O: make NO_DEMANGLE=1
make_doc_O: make doc
make_perf_o_O: make perf.o
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_slang_O: make NO_SLANG=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_clean_all_O: make clean all
make_no_libpython_O: make NO_LIBPYTHON=1
make_pure_O: make
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_install_O: make install
make_no_libelf_O: make NO_LIBELF=1
make_static_O: make LDFLAGS=-static
make_no_libbpf_O: make NO_LIBBPF=1
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
$

* [GIT PULL 00/15] perf/core improvements and fixes
@ 2016-09-22 21:12 Arnaldo Carvalho de Melo
  2016-09-23 5:22 ` Ingo Molnar
  0 siblings, 1 reply; 51+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-09-22 21:12 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Shishkin, Andi Kleen, David Ahern, Don Zickus, Jiri Olsa, Joe Mario, linux-arm-kernel, Mathieu Poirier, Namhyung Kim, Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Hi Ingo,

Please consider pulling,

- Arnaldo

The following changes since commit 89f1c2c59c4aef8e26edbc7db5175e6ffb0e9ec7:

  Merge tag 'perf-core-for-mingo-20160920' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-09-20 23:32:02 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160922

for you to fetch changes up to 2d831454140f28fa643b78deede4511b9e2c9e5f:

  perf hists: Make hists__fprintf_headers function global (2016-09-22 13:08:59 -0300)

----------------------------------------------------------------
perf/core improvements:

New features:

- Add support for interacting with Coresight PMU ETMs/PTMs, that are IP blocks
  to perform hardware assisted tracing on a ARM CPU core (Mathieu Poirier)

Infrastructure:

- Histogram prep work for the upcoming c2c tool (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Jiri Olsa (9):
      perf evsel: Remove superfluous initialization of weight
      perf hists: Use bigger buffer for stdio headers
      perf hists: Add __hist_entry__snprintf function
      perf tools: Make reset_dimensions global
      perf tools: Make output_field_add and sort_dimension__add global
      perf tools: Make several sorting functions global
      perf tools: Make several display functions global
      perf hists: Make __hist_entry__snprintf function global
      perf hists: Make hists__fprintf_headers function global

Mathieu Poirier (6):
      perf tools: Confine __get_cpuid() to x86 architecture
      perf tools: Make coresight PMU listable
      perf tools: Add coresight etm PMU record capabilities
      perf pmu: Push configuration down to PMU driver
      perf tools: Add PMU configuration to tools
      perf tools: Add sink configuration for cs_etm PMU

 MAINTAINERS | 5 +
 tools/perf/Makefile.config | 11 +-
 tools/perf/arch/arm/util/Build | 2 +
 tools/perf/arch/arm/util/auxtrace.c | 54 ++++
 tools/perf/arch/arm/util/cs-etm.c | 617 ++++++++++++++++++++++++++++++++++++
 tools/perf/arch/arm/util/cs-etm.h | 26 ++
 tools/perf/arch/arm/util/pmu.c | 36 +++
 tools/perf/arch/arm64/util/Build | 4 +
 tools/perf/builtin-record.c | 10 +
 tools/perf/builtin-stat.c | 9 +
 tools/perf/builtin-top.c | 13 +
 tools/perf/ui/browsers/hists.c | 2 +-
 tools/perf/ui/hist.c | 2 +-
 tools/perf/ui/stdio/hist.c | 14 +-
 tools/perf/util/Build | 1 +
 tools/perf/util/auxtrace.c | 1 +
 tools/perf/util/auxtrace.h | 1 +
 tools/perf/util/cs-etm.h | 74 +++++
 tools/perf/util/drv_configs.c | 77 +++++
 tools/perf/util/drv_configs.h | 26 ++
 tools/perf/util/evsel.c | 2 -
 tools/perf/util/hist.h | 5 +
 tools/perf/util/pmu.h | 2 +
 tools/perf/util/sort.c | 16 +-
 tools/perf/util/sort.h | 11 +
 25 files changed, 1001 insertions(+), 20 deletions(-)
 create mode 100644 tools/perf/arch/arm/util/auxtrace.c
 create mode 100644 tools/perf/arch/arm/util/cs-etm.c
 create mode 100644 tools/perf/arch/arm/util/cs-etm.h
 create mode 100644 tools/perf/arch/arm/util/pmu.c
 create mode 100644 tools/perf/util/cs-etm.h
 create mode 100644 tools/perf/util/drv_configs.c
 create mode 100644 tools/perf/util/drv_configs.h

[root@zoo ~]# time dm
1 73.911 alpine:3.4: Ok
2 26.890 android-ndk:r12b-arm: Ok
3 77.833 archlinux:latest: Ok
4 40.814 centos:5: Ok
5 64.151 centos:6: Ok
6 75.720 centos:7: Ok
7 68.960 debian:7: Ok
8 75.606 debian:8: Ok
9 75.127 fedora:20: Ok
10 80.186 fedora:21: Ok
11 80.157 fedora:22: Ok
12 83.273 fedora:23: Ok
13 91.566 fedora:24: Ok
14 37.720 fedora:24-x-ARC-uClibc: Ok
15 98.492 fedora:rawhide: Ok
16 100.555 mageia:5: Ok
17 94.140 opensuse:13.2: Ok
18 95.476 opensuse:42.1: Ok
19 106.037 opensuse:tumbleweed: Ok
20 75.951 ubuntu:12.04.5: Ok
21 52.138 ubuntu:14.04: Ok
22 94.814 ubuntu:14.04.4: Ok
23 100.525 ubuntu:15.10: Ok
24 93.813 ubuntu:16.04: Ok
25 85.214 ubuntu:16.04-x-arm: Ok
26 83.487 ubuntu:16.04-x-arm64: Ok
27 82.918 ubuntu:16.04-x-powerpc64: Ok
28 84.189 ubuntu:16.04-x-powerpc64el: Ok
29 93.162 ubuntu:16.10: Ok

real 38m13.568s
user 0m2.379s
sys 0m2.402s
[root@zoo ~]#

[root@jouet ~]# perf test
1: vmlinux symtab matches kallsyms : Ok
2: detect openat syscall event : Ok
3: detect openat syscall event on all cpus : Ok
4: read samples using the mmap interface : Ok
5: parse events tests : Ok
6: Validate PERF_RECORD_* events & perf_sample fields : Ok
7: Test perf pmu format parsing : Ok
8: Test dso data read : Ok
9: Test dso data cache : Ok
10: Test dso data reopen : Ok
11: roundtrip evsel->name check : Ok
12: Check parsing of sched tracepoints fields : Ok
13: Generate and check syscalls:sys_enter_openat event fields: Ok
14: struct perf_event_attr setup : Ok
15: Test matching and linking multiple hists : Ok
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : Ok
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : Ok
29: Test cumulation of child hist entries : Ok
30: Test tracking with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: Test kmod_path__parse function : Ok
34: Test thread map : Ok
35: Test LLVM searching and compiling :
35.1: Basic BPF llvm compiling test : Ok
35.2: Test kbuild searching : Ok
35.3: Compile source for BPF prologue generation test : Ok
35.4: Compile source for BPF relocation test : Ok
36: Test topology in session : Ok
37: Test BPF filter :
37.1: Test basic BPF filtering : Ok
37.2: Test BPF prologue generation : Ok
37.3: Test BPF relocation checker : Ok
38: Test thread map synthesize : Ok
39: Test cpu map synthesize : Ok
40: Test stat config synthesize : Ok
41: Test stat synthesize : Ok
42: Test stat round synthesize : Ok
43: Test attr update synthesize : Ok
44: Test events times : Ok
45: Test backward reading from ring buffer : Ok
46: Test cpu map print : Ok
47: Test SDT event probing : Ok
48: Test is_printable_array function : Ok
49: Test bitmap print : Ok
50: x86 rdpmc test : Ok
51: Test converting perf time to TSC : Ok
52: Test dwarf unwind : Ok
53: Test x86 instruction decoder - new instructions : Ok
54: Test intel cqm nmi context read : Skip
[root@jouet ~]#

* Re: [GIT PULL 00/15] perf/core improvements and fixes
  2016-09-22 21:12 Arnaldo Carvalho de Melo
@ 2016-09-23 5:22 ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2016-09-23 5:22 UTC
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Shishkin, Andi Kleen, David Ahern, Don Zickus, Jiri Olsa, Joe Mario, linux-arm-kernel, Mathieu Poirier, Namhyung Kim, Peter Zijlstra

* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> From: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo

Pulled, thanks a lot Arnaldo!

Ingo

* [GIT PULL 00/15] perf/core improvements and fixes
@ 2016-07-18 23:33 Arnaldo Carvalho de Melo
  2016-07-19  6:46 ` Ingo Molnar
  0 siblings, 1 reply; 51+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-07-18 23:33 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Shishkin, Dan Carpenter, David Ahern, He Kuang,
	Jiri Olsa, Jiri Pirko, Josh Poimboeuf, Kan Liang, Mark Rutland,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, Stephane Eranian,
	Steven Rostedt, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Hi Ingo,

	Please consider pulling,

- Arnaldo

Build stats:

  [root@jouet 5]# perf stat dm
  alpine:3.4: Ok
  android-ndk:r12b: Ok
  centos:5: Ok
  centos:6: Ok
  centos:7: Ok
  debian:7: Ok
  debian:8: Ok
  debian:experimental: Ok
  fedora:21: Ok
  fedora:22: Ok
  fedora:23: Ok
  fedora:24: Ok
  fedora:rawhide: Ok
  mageia:5: Ok
  opensuse:13.2: Ok
  opensuse:42.1: Ok
  ubuntu:14.04.4: Ok
  ubuntu:15.10: Ok
  ubuntu:16.04: Ok
  ubuntu:16.04-x-armhf: Ok

   Performance counter stats for 'dm':

         1896.227285      task-clock (msec)         #    0.002 CPUs utilized
              76,145      context-switches          #    0.040 M/sec
               9,323      cpu-migrations            #    0.005 M/sec
              53,894      page-faults               #    0.028 M/sec
       5,497,625,679      cycles                    #    2.899 GHz
       5,110,226,458      instructions              #    0.93  insn per cycle
         950,036,839      branches                  #  501.014 M/sec
          16,978,253      branch-misses             #    1.79% of all branches

       767.910393301 seconds time elapsed

  [root@jouet 5]#

The following changes since commit 09211e2530ab4905ec16edecc27022d6b247419d:

  Merge tag 'perf-core-for-mingo-20160715' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-07-16 22:36:42 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160718

for you to fetch changes up to 988dd774dcbd9151c2a643fc7284c5c3c4d0adb7:

  perf tests: Add is_printable_array test (2016-07-18 19:50:35 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Properly report when a function wildcard produces no matches in 'perf probe'
  (Masami Hiramatsu)

- Balance opening and reading events in 'perf stat', which could cause
  it to get stuck trying to close invalid file descriptors (Mark Rutland)

Infrastructure:

- Copy more headers from the kernel, this time for headers that
  were just including the contents of its kernel counterparts, should
  help resolving the problems with linux-next, where some uapi related
  patches seem to be breaking tools/object/ build.

  Some more combing will be done, but at least it is possible to build
  perf out of tree, via a detached tarball (make help | grep perf)
  without including kernel files in its MANIFEST (Arnaldo Carvalho de Melo)

- Fix smatch found errors that were not causing problems, but are
  mistakes nonetheless (Dan Carpenter)

- Fix string vs byte array resolving in the python script code (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (7):
      perf tools: Add missing linux/compiler.h include to perf-sys.h
      perf tools: Remove tools/perf/util/include/asm/byteorder.h
      perf tools: Remove tools/perf/util/include/linux/const.h
      Remove: kernel unistd*h files from perf's MANIFEST, not used
      tools: Copy the bitops files accessed from the kernel and check for drift
      perf tools: Remove include/linux/list.h from perf's MANIFEST
      tools: Copy linux/{hash,poison}.h and check for drift

Dan Carpenter (2):
      perf jit: Add missing curly braces
      perf jit: Remove some no-op error handling

Jiri Olsa (3):
      perf script python: Fix string vs byte array resolving
      perf tools: Make is_printable_array global
      perf tests: Add is_printable_array test

Mark Rutland (2):
      perf stat: Balance opening and reading events
      perf cpu_map: Add more helpers

Masami Hiramatsu (1):
      perf probe: Warn unmatched function filter correctly

 tools/include/asm-generic/bitops/__fls.h | 44 ++++++++-
 tools/include/asm-generic/bitops/arch_hweight.h | 26 ++++-
 tools/include/asm-generic/bitops/const_hweight.h | 44 ++++++++-
 tools/include/asm-generic/bitops/fls.h | 42 ++++++++-
 tools/include/asm-generic/bitops/fls64.h | 37 +++++++-
 tools/include/linux/hash.h | 105 ++++++++++++++++++++-
 tools/include/linux/poison.h | 91 +++++++++++++++++-
 tools/perf/MANIFEST | 13 ---
 tools/perf/Makefile.perf | 18 ++++
 tools/perf/builtin-stat.c | 8 +-
 tools/perf/jvmti/jvmti_agent.c | 10 +-
 tools/perf/perf-sys.h | 1 +
 tools/perf/tests/Build | 1 +
 tools/perf/tests/builtin-test.c | 4 +
 tools/perf/tests/is_printable_array.c | 36 +++++++
 tools/perf/tests/tests.h | 1 +
 tools/perf/util/cpumap.c | 14 ++-
 tools/perf/util/cpumap.h | 2 +
 tools/perf/util/include/asm/byteorder.h | 2 -
 tools/perf/util/include/linux/const.h | 1 -
 tools/perf/util/map.c | 3 +
 tools/perf/util/probe-event.c | 12 ++-
 tools/perf/util/python.c | 12 ---
 .../util/scripting-engines/trace-event-python.c | 25 +++--
 tools/perf/util/util.c | 16 ++++
 tools/perf/util/util.h | 1 +
 26 files changed, 512 insertions(+), 57 deletions(-)
 create mode 100644 tools/perf/tests/is_printable_array.c
 delete mode 100644 tools/perf/util/include/asm/byteorder.h
 delete mode 100644 tools/perf/util/include/linux/const.h

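A note on the string vs byte array item above: the series makes is_printable_array global and adds a 'perf test' entry for it. The following is only a minimal sketch of what such a check could look like, assuming the rule is "non-empty, NUL-terminated, and every byte before the terminator is printable or whitespace". The exact rule and signature used in tools/perf/util/util.c are not shown in this thread and may differ.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Sketch only: the rule implemented here is an assumption made for
 * illustration, not a copy of the perf implementation.
 *
 * Returns true when the len bytes at p look like a C string: the last
 * byte is the NUL terminator and every byte before it is printable or
 * whitespace. Anything else is treated as a raw byte array.
 */
static bool is_printable_array(const char *p, unsigned int len)
{
	unsigned int i;

	if (p == NULL || len == 0 || p[len - 1] != '\0')
		return false;

	for (i = 0; i < len - 1; i++) {
		unsigned char c = (unsigned char)p[i];

		if (!isprint(c) && !isspace(c))
			return false;
	}
	return true;
}
```

A caller such as a scripting engine could use a check of this kind to decide whether a tracepoint field is handed to the script as a string or as a byte array.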
* Re: [GIT PULL 00/15] perf/core improvements and fixes
  2016-07-18 23:33 Arnaldo Carvalho de Melo
@ 2016-07-19  6:46 ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2016-07-19 6:46 UTC
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Shishkin, Dan Carpenter, David Ahern, He Kuang,
	Jiri Olsa, Jiri Pirko, Josh Poimboeuf, Kan Liang, Mark Rutland,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, Stephane Eranian,
	Steven Rostedt, Wang Nan

* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> From: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> Hi Ingo,
>
> 	Please consider pulling,

[...]

Pulled, thanks a lot Arnaldo!

	Ingo

* [GIT PULL 00/15] perf/core improvements and fixes
@ 2016-05-10 15:15 Arnaldo Carvalho de Melo
  2016-05-10 20:28 ` Ingo Molnar
  0 siblings, 1 reply; 51+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-10 15:15 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Shishkin, Andi Kleen, Chris Phlipot, David Ahern,
	Ekaterina Tumanova, He Kuang, Jiri Olsa, Josh Poimboeuf,
	Kan Liang, Masami Hiramatsu, Milian Wolff, Namhyung Kim,
	Pekka Enberg, Peter Zijlstra, pi3orama, Stephane Eranian,
	Sukadev Bhattiprolu, Wang Nan, Zefan Li, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

The following changes since commit ea7c28518943b26a85d73cd76acd03b71962cb18:

  Merge tag 'perf-core-for-mingo-20160506' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-05-07 06:49:28 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160510

for you to fetch changes up to 452e84012595d681f254a3a0d733fb0b18ffaf42:

  perf tools: Remove xrealloc and ALLOC_GROW (2016-05-10 11:58:27 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Recording 'dwarf' callchains do not need DWARF unwinding support (He Kuang)

- Print recently added perf_event_attr.write_backward bit flag in -vv
  verbose mode (Arnaldo Carvalho de Melo)

- Fix incorrect python db-export error message in 'perf script' (Chris Phlipot)

- Fix handling of zero-length symbols (Chris Phlipot)

Andi Kleen (1):
      perf stat: Scale values by unit before metrics

Infrastructure:

- Rewrite strbuf not to die(), making tools using it to check its
  return value instead (Masami Hiramatsu)

- Support reading from backward ring buffer, add a 'perf test' entry
  for it (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Andi Kleen (1):
      perf stat: Scale values by unit before metrics

Arnaldo Carvalho de Melo (1):
      perf evsel: Print state of perf_event_attr.write_backward

Chris Phlipot (2):
      perf script: Fix incorrect python db-export error message
      perf symbols: Fix handling of zero-length symbols.

He Kuang (1):
      perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding support

Masami Hiramatsu (8):
      perf tools: Rewrite strbuf not to die()
      perf probe: Check the return value of strbuf APIs
      perf help: Make check_emacsclient_version to check strbuf APIs
      perf tools: Make alias handler to check return value of strbuf
      perf header: Make topology checkers to check return value of strbuf
      perf pmu: Make pmu_formats_string to check return value of strbuf
      perf help: Do not use ALLOC_GROW in add_cmd_list
      perf tools: Remove xrealloc and ALLOC_GROW

Wang Nan (2):
      perf tools: Support reading from backward ring buffer
      perf tests: Add test to check backward ring buffer

 tools/perf/builtin-help.c | 18 +--
 tools/perf/perf.c | 8 +-
 tools/perf/tests/Build | 1 +
 tools/perf/tests/backward-ring-buffer.c | 151 +++++++++++++++++++++
 tools/perf/tests/builtin-test.c | 4 +
 tools/perf/tests/tests.h | 1 +
 tools/perf/util/Build | 1 -
 tools/perf/util/cache.h | 19 ---
 tools/perf/util/dwarf-aux.c | 52 ++++---
 tools/perf/util/evlist.c | 50 +++++++
 tools/perf/util/evlist.h | 4 +
 tools/perf/util/evsel.c | 1 +
 tools/perf/util/header.c | 31 +++--
 tools/perf/util/help-unknown-cmd.c | 30 ++--
 tools/perf/util/pmu.c | 10 +-
 tools/perf/util/probe-event.c | 143 +++++++++++--------
 tools/perf/util/probe-finder.c | 30 ++--
 tools/perf/util/quote.c | 36 ++---
 tools/perf/util/quote.h | 2 +-
 .../util/scripting-engines/trace-event-python.c | 2 +-
 tools/perf/util/stat.c | 4 +-
 tools/perf/util/strbuf.c | 93 +++++++++----
 tools/perf/util/strbuf.h | 25 ++--
 tools/perf/util/symbol.c | 2 +-
 tools/perf/util/util.c | 2 -
 tools/perf/util/util.h | 6 -
 tools/perf/util/wrapper.c | 29 ----
 27 files changed, 510 insertions(+), 245 deletions(-)
 create mode 100644 tools/perf/tests/backward-ring-buffer.c
 delete mode 100644 tools/perf/util/wrapper.c

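A note on the "Rewrite strbuf not to die()" item above: the idea is that a growable string buffer reports allocation failure to its caller instead of terminating the process, so every caller has to check a return value. The following is a minimal sketch of that pattern. The names sbuf, sbuf_grow, sbuf_addstr and sbuf_release are hypothetical and chosen for illustration; they are not the perf strbuf API, and the growth policy is an assumption.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not the perf strbuf type. */
struct sbuf {
	char *buf;	/* NUL-terminated contents, or NULL when empty */
	size_t len;	/* bytes used, not counting the NUL */
	size_t alloc;	/* bytes allocated */
};

/* Make room for extra more bytes plus the NUL: 0 on success, -ENOMEM on failure. */
static int sbuf_grow(struct sbuf *sb, size_t extra)
{
	size_t need = sb->len + extra + 1;
	size_t nalloc;
	char *nbuf;

	if (need <= sb->alloc)
		return 0;

	nalloc = sb->alloc ? sb->alloc * 2 : 32;
	if (nalloc < need)
		nalloc = need;

	nbuf = realloc(sb->buf, nalloc);
	if (nbuf == NULL)
		return -ENOMEM;	/* report the failure instead of exiting */

	sb->buf = nbuf;
	sb->alloc = nalloc;
	return 0;
}

/* Append a C string: 0 on success, negative errno on failure. */
static int sbuf_addstr(struct sbuf *sb, const char *s)
{
	size_t n = strlen(s);
	int ret = sbuf_grow(sb, n);

	if (ret)
		return ret;

	memcpy(sb->buf + sb->len, s, n + 1);	/* copies the trailing NUL too */
	sb->len += n;
	return 0;
}

/* Free the storage and reset the buffer to its empty state. */
static void sbuf_release(struct sbuf *sb)
{
	free(sb->buf);
	sb->buf = NULL;
	sb->len = 0;
	sb->alloc = 0;
}
```

On failure the buffer is left unchanged, so a caller can release it and propagate the error upwards, which is the kind of return-value checking the follow-up patches in this series add to the strbuf users.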
* Re: [GIT PULL 00/15] perf/core improvements and fixes
  2016-05-10 15:15 Arnaldo Carvalho de Melo
@ 2016-05-10 20:28 ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2016-05-10 20:28 UTC
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Alexander Shishkin, Andi Kleen,
	Chris Phlipot, David Ahern, Ekaterina Tumanova, He Kuang,
	Jiri Olsa, Josh Poimboeuf, Kan Liang, Masami Hiramatsu,
	Milian Wolff, Namhyung Kim, Pekka Enberg, Peter Zijlstra,
	pi3orama, Stephane Eranian, Sukadev Bhattiprolu, Wang Nan,
	Zefan Li, Arnaldo Carvalho de Melo

* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
>
> 	Please consider pulling,

[...]

Pulled, thanks a lot Arnaldo!

	Ingo

* [GIT PULL 00/15] perf/core improvements and fixes
@ 2016-03-07 19:44 Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 51+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-03-07 19:44 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Shishkin, Andi Kleen, Borislav Petkov, Colin Ian King,
	David Ahern, Davidlohr Bueso, He Kuang, Jiri Olsa, Mel Gorman,
	Namhyung Kim, Peter Zijlstra, Stephane Eranian, Steven Rostedt,
	Wang Nan, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

The following changes since commit 009668520ae00d52026ccdb3884864e3473c6b65:

  Merge tag 'perf-core-for-mingo-20160303' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-03-04 12:19:21 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160307

for you to fetch changes up to b03ae342d9bec460a6c9c327c3f5f758263b0932:

  perf report: Use hierarchy hpp list on gtk (2016-03-07 15:10:41 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Allow grouping multiple sort keys per 'perf report/top --hierarchy'
  level (Namhyung Kim)

- Document 'perf stat --detailed' option (Borislav Petkov)

Infrastructure:

- jitdump prep work for supporting it with Intel PT (Adrian Hunter)

- Use 64-bit shifts with (TSC) time conversion (Adrian Hunter)

Trivial:

- Explicitly declare inc_group_count as a void function (Colin Ian King)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Adrian Hunter (5):
      perf inject: Hit all DSOs for AUX data in JIT and other cases
      perf session: Simplify tool stubs
      perf jit: Let jit_process() return errors
      perf jit: Move clockid validation
      perf tools: Use 64-bit shifts with (TSC) time conversion

Borislav Petkov (1):
      perf stat: Document --detailed option

Colin Ian King (1):
      perf tools: Explicitly declare inc_group_count as a void function

Namhyung Kim (8):
      perf hists: Add level field to struct perf_hpp_fmt
      perf hists: Introduce perf_hpp__setup_hists_formats()
      perf hists: Use own hpp_list for hierarchy mode
      perf hists: Support multiple sort keys in a hierarchy level
      perf hists: Fix indent for multiple hierarchy sort key
      perf report: Use hierarchy hpp list on stdio
      perf hists browser: Use hierarchy hpp list
      perf report: Use hierarchy hpp list on gtk

 tools/perf/Documentation/perf-stat.txt | 8 ++
 tools/perf/arch/x86/tests/rdpmc.c | 2 +-
 tools/perf/builtin-inject.c | 52 ++++------
 tools/perf/ui/browsers/hists.c | 147 +++++++++++++++-------------
 tools/perf/ui/gtk/hists.c | 73 ++++++++------
 tools/perf/ui/hist.c | 69 +++++++++++++
 tools/perf/ui/stdio/hist.c | 171 +++++++++++++++++----------------
 tools/perf/util/hist.c | 72 +++++++++-----
 tools/perf/util/hist.h | 14 +++
 tools/perf/util/jitdump.c | 29 +++++-
 tools/perf/util/parse-events.y | 2 +-
 tools/perf/util/session.c | 40 ++------
 tools/perf/util/sort.c | 146 ++++++++++++++++++++--------
 tools/perf/util/sort.h | 1 +
 tools/perf/util/tsc.c | 2 +-
 15 files changed, 514 insertions(+), 314 deletions(-)

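A note on the "Use 64-bit shifts with (TSC) time conversion" item above: the thread does not show the patch itself, so the following is only a sketch of the general technique, assuming a conversion of the form (cycles * mult) >> shift that is split into quotient and remainder. The helper name cycles_to_ns is hypothetical; the point it illustrates is that the mask must be built from a 64-bit one, because shifting a plain int by 32 or more bits is undefined in C.

```c
#include <stdint.h>

/*
 * Illustrative helper, not the perf code: convert a cycle count to
 * nanoseconds as (cyc * mult) >> shift. Splitting cyc into the part
 * above and below the shift keeps the intermediate products small.
 * Assumes shift < 64.
 */
static uint64_t cycles_to_ns(uint64_t cyc, uint32_t mult, uint16_t shift)
{
	uint64_t quot = cyc >> shift;
	/* (uint64_t)1 keeps the mask 64 bits wide; an int 1 breaks for shift >= 32 */
	uint64_t rem = cyc & (((uint64_t)1 << shift) - 1);

	return quot * mult + ((rem * mult) >> shift);
}
```

With shift equal to 32, for example, the mask covers the low 32 bits of the cycle count, which could not be expressed with a 32-bit shift operand.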
* [GIT PULL 00/15] perf/core improvements and fixes
@ 2016-02-22 18:02 Arnaldo Carvalho de Melo
  2016-02-24  7:21 ` Ingo Molnar
  0 siblings, 1 reply; 51+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-22 18:02 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexei Starovoitov, Andi Kleen, Brendan Gregg, Cody P Schafer,
	David Ahern, He Kuang, Jeremie Galarneau, Jiri Olsa,
	Kirill Smelkov, Li Zefan, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, pi3orama, Wang Nan, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

The following changes since commit 91e48b7df15196b8ce01f40455219d3ed7889988:

  Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-02-20 11:52:16 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo

for you to fetch changes up to 03e0a7df3efd959e40cd7ff40b1fabddc234ec5a:

  perf tools: Introduce bpf-output event (2016-02-22 14:37:21 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Add API to set values of map entries in a BPF object, be it
  individual map slots or ranges (Wang Nan)

- Introduce support for the 'bpf-output' event (Wang Nan)

- Add glue to read perf events in a BPF program (Wang Nan)

Fixes:

- Sort key fixes: Alignment for srcline, file, trace; fix
  segfault for dynamic, trace events related sort keys (Namhyung Kim)

Build fixes:

- Remove duplicate typedef config_term_func_t definition,
  fixing the build on older systems (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (2):
      perf tools: Fix build on older systems
      perf tools: Remove duplicate typedef config_term_func_t definition

Namhyung Kim (5):
      perf tools: Fix segfault on dynamic entries
      perf tools: Update srcline/file if needed
      perf tools: Fix alignment on some sort keys
      perf tools: Fix column width setting on 'trace' sort key
      perf tools: Fix assertion failure on dynamic entry

Wang Nan (8):
      perf bpf: Add API to set values to map entries in a bpf object
      perf tools: Enable BPF object configure syntax
      perf record: Apply config to BPF objects before recording
      perf tools: Enable passing event to BPF object
      perf tools: Support setting different slots in a BPF map separately
      perf tools: Enable indices setting syntax for BPF map
      perf tools: Apply tracepoint event definition options to BPF script
      perf tools: Introduce bpf-output event

 tools/perf/builtin-record.c | 11 +
 tools/perf/tests/bpf.c | 2 +-
 tools/perf/ui/hist.c | 3 +
 tools/perf/util/bpf-loader.c | 718 +++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-loader.h | 59 ++++
 tools/perf/util/evlist.c | 16 +
 tools/perf/util/evlist.h | 3 +
 tools/perf/util/evsel.c | 5 +
 tools/perf/util/evsel.h | 8 +
 tools/perf/util/hist.c | 3 +
 tools/perf/util/parse-events.c | 130 +++++++-
 tools/perf/util/parse-events.h | 17 +-
 tools/perf/util/parse-events.l | 16 +-
 tools/perf/util/parse-events.y | 95 +++++-
 tools/perf/util/sort.c | 90 +++---
 15 files changed, 1112 insertions(+), 64 deletions(-)

* Re: [GIT PULL 00/15] perf/core improvements and fixes
  2016-02-22 18:02 Arnaldo Carvalho de Melo
@ 2016-02-24  7:21 ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2016-02-24 7:21 UTC
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Alexei Starovoitov, Andi Kleen,
	Brendan Gregg, Cody P Schafer, David Ahern, He Kuang,
	Jeremie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
	Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
	Wang Nan, Arnaldo Carvalho de Melo

* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
>
> 	Please consider pulling,

[...]

Pulled, thanks a lot Arnaldo!

	Ingo

* [GIT PULL 00/15] perf/core improvements and fixes
@ 2015-09-05  1:06 Arnaldo Carvalho de Melo
  2015-09-08 14:09 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 51+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-05 1:06 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen,
	Corey Ashford, David Ahern, Frederic Weisbecker, Jan Stancek,
	Jiri Olsa, Kan Liang, Matt Fleming, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Raphael Beamonte, Stephane Eranian,
	Steven Rostedt, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling, this is on top of the previous pull request,
perf-core-for-mingo.

- Arnaldo

The following changes since commit cf2f33a4e54096f90652cca3511fd6a456ea5abe:

  perf trace: Add read/write to the file group (2015-09-04 13:22:06 -0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-2

for you to fetch changes up to 0959e527b1593e662cb99639a587eac39ea1232d:

  perf stat: Move sw clock metrics printout to stat-shadow (2015-09-04 20:30:01 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Add 'socket' sort entry, to sort by the processor socket in 'perf top' and
  'perf report' (Kan Liang)

- Introduce --socket-filter to 'perf report', for filtering by processor
  socket (Kan Liang)

- Add new "Zoom into Processor Socket" operation in the perf hists browser,
  used in 'perf top' and 'perf report' (Kan Liang)

Infrastructure:

- 'perf test' fixes for the object code reading entry (Jan Stancek)

- Add processor socket and cpu topology 'perf test' entries (Kan Liang)

- Move sw clock metrics printout to stat-shadow (Andi Kleen)

- Switch to tracing_path interface (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Andi Kleen (1):
      perf stat: Move sw clock metrics printout to stat-shadow

Jan Stancek (4):
      perf tests: Take into account address of each objdump line
      perf tests: Make objdump disassemble zero blocks
      perf tests: Stop reading if objdump output crossed sections
      perf tests: Print objdump/dso buffers if they don't match

Jiri Olsa (4):
      tools lib api fs: Make tracing_path_strerror_open message generic
      tools lib api fs: Replace debugfs/tracefs objects interface with fs.c
      tools lib api fs: Remove debugfs, tracefs and findfs objects
      perf tools: Switch to tracing_path interface on appropriate places

Kan Liang (6):
      perf test: Add entry to test cpu topology
      perf tools: Add processor socket info to hist_entry and addr_location
      perf tools: Introduce new sort type "socket" for the processor socket
      perf report: Introduce --socket-filter option
      perf hists browser: Zoom in/out for processor socket
      perf test: Add entry for hists socket filter

 tools/lib/api/fs/Build | 3 -
 tools/lib/api/fs/debugfs.c | 77 -------------------
 tools/lib/api/fs/debugfs.h | 23 ------
 tools/lib/api/fs/findfs.c | 63 ----------------
 tools/lib/api/fs/findfs.h | 23 ------
 tools/lib/api/fs/fs.c | 1 -
 tools/lib/api/fs/tracefs.c | 78 -------------------
 tools/lib/api/fs/tracefs.h | 21 ------
 tools/lib/api/fs/tracing_path.c | 35 +++++----
 tools/perf/Documentation/perf-report.txt | 6 +-
 tools/perf/builtin-kvm.c | 1 -
 tools/perf/builtin-probe.c | 1 -
 tools/perf/builtin-report.c | 15 ++++
 tools/perf/builtin-stat.c | 9 ---
 tools/perf/tests/Build | 1 +
 tools/perf/tests/builtin-test.c | 4 +
 tools/perf/tests/code-reading.c | 74 +++++++++++++++----
 tools/perf/tests/hists_filter.c | 55 +++++++++++---
 tools/perf/tests/openat-syscall-all-cpus.c | 10 +--
 tools/perf/tests/openat-syscall.c | 10 +--
 tools/perf/tests/parse-events.c | 19 +----
 tools/perf/tests/tests.h | 1 +
 tools/perf/tests/topology.c | 115 +++++++++++++++++++++++++++++
 tools/perf/ui/browsers/hists.c | 59 ++++++++++++++-
 tools/perf/util/event.c | 1 +
 tools/perf/util/evsel.c | 2 +-
 tools/perf/util/hist.c | 37 ++++++++++
 tools/perf/util/hist.h | 6 +-
 tools/perf/util/probe-event.c | 5 +-
 tools/perf/util/probe-file.c | 15 +---
 tools/perf/util/sort.c | 22 ++++++
 tools/perf/util/sort.h | 2 +
 tools/perf/util/stat-shadow.c | 3 +
 tools/perf/util/symbol.h | 1 +
 tools/perf/util/util.h | 3 +-
 35 files changed, 409 insertions(+), 392 deletions(-)
 delete mode 100644 tools/lib/api/fs/debugfs.c
 delete mode 100644 tools/lib/api/fs/debugfs.h
 delete mode 100644 tools/lib/api/fs/findfs.c
 delete mode 100644 tools/lib/api/fs/findfs.h
 delete mode 100644 tools/lib/api/fs/tracefs.c
 delete mode 100644 tools/lib/api/fs/tracefs.h
 create mode 100644 tools/perf/tests/topology.c
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2015-09-05 1:06 Arnaldo Carvalho de Melo @ 2015-09-08 14:09 ` Arnaldo Carvalho de Melo 2015-09-08 14:21 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-09-08 14:09 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Ingo Molnar, linux-kernel, Adrian Hunter, Andi Kleen, Corey Ashford, David Ahern, Frederic Weisbecker, Jan Stancek, Jiri Olsa, Kan Liang, Matt Fleming, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Raphael Beamonte, Stephane Eranian, Steven Rostedt On Fri, Sep 04, 2015 at 10:06:28PM -0300, Arnaldo Carvalho de Melo wrote: > Hi Ingo, > > Please consider pulling, this is on top of the previous pull request, > perf-core-for-mingo. Ingo, please do not pull this 'perf-core-for-mingo-2' tag, there were some misunderstandings about the acks for "Move sw clock metrics printout to stat-shadow" and Jiri and Andi are working that out. I'll remove those patches and get a new perf-core-for-mingo-2 tag in place, before continuing today's batch, which possibly will be available as 'perf-core-for-mingo-3' What is in 'perf-core-for-mingo' should be Ok. 
- Arnaldo > - Arnaldo > > The following changes since commit cf2f33a4e54096f90652cca3511fd6a456ea5abe: > > perf trace: Add read/write to the file group (2015-09-04 13:22:06 -0300) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-2 > > for you to fetch changes up to 0959e527b1593e662cb99639a587eac39ea1232d: > > perf stat: Move sw clock metrics printout to stat-shadow (2015-09-04 20:30:01 -0300) > > ---------------------------------------------------------------- > perf/core improvements and fixes: > > User visible: > > - Add 'socket' sort entry, to sort by the processor socket in > 'perf top' and 'perf report' (Kan Liang) > > - Introduce --socket-filter to 'perf report', for filtering by > processor socket (Kan Liang) > > - Add new "Zoom into Processor Socket" operation in the perf hists browser, > used in 'perf top' and 'perf report' (Kan Liang) > > Infrastructure: > > - 'perf test' fixes for the object code reading entry (Jan Stancek) > > - Add processor socket and cpu topology 'perf test' entries (Kan Liang) > > - Move sw clock metrics printout to stat-shadow (Andi Kleen) > > - Switch to tracing_path interface (Jiri Olsa) > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > > ---------------------------------------------------------------- > Andi Kleen (1): > perf stat: Move sw clock metrics printout to stat-shadow > > Jan Stancek (4): > perf tests: Take into account address of each objdump line > perf tests: Make objdump disassemble zero blocks > perf tests: Stop reading if objdump output crossed sections > perf tests: Print objdump/dso buffers if they don't match > > Jiri Olsa (4): > tools lib api fs: Make tracing_path_strerror_open message generic > tools lib api fs: Replace debugfs/tracefs objects interface with fs.c > tools lib api fs: Remove debugfs, tracefs and findfs objects > perf tools: Switch to tracing_path interface on appropriate places > > Kan Liang 
(6): > perf test: Add entry to test cpu topology > perf tools: Add processor socket info to hist_entry and addr_location > perf tools: Introduce new sort type "socket" for the processor socket > perf report: Introduce --socket-filter option > perf hists browser: Zoom in/out for processor socket > perf test: Add entry for hists socket filter > > tools/lib/api/fs/Build | 3 - > tools/lib/api/fs/debugfs.c | 77 ------------------- > tools/lib/api/fs/debugfs.h | 23 ------ > tools/lib/api/fs/findfs.c | 63 ---------------- > tools/lib/api/fs/findfs.h | 23 ------ > tools/lib/api/fs/fs.c | 1 - > tools/lib/api/fs/tracefs.c | 78 ------------------- > tools/lib/api/fs/tracefs.h | 21 ------ > tools/lib/api/fs/tracing_path.c | 35 +++++---- > tools/perf/Documentation/perf-report.txt | 6 +- > tools/perf/builtin-kvm.c | 1 - > tools/perf/builtin-probe.c | 1 - > tools/perf/builtin-report.c | 15 ++++ > tools/perf/builtin-stat.c | 9 --- > tools/perf/tests/Build | 1 + > tools/perf/tests/builtin-test.c | 4 + > tools/perf/tests/code-reading.c | 74 +++++++++++++++---- > tools/perf/tests/hists_filter.c | 55 +++++++++++--- > tools/perf/tests/openat-syscall-all-cpus.c | 10 +-- > tools/perf/tests/openat-syscall.c | 10 +-- > tools/perf/tests/parse-events.c | 19 +---- > tools/perf/tests/tests.h | 1 + > tools/perf/tests/topology.c | 115 +++++++++++++++++++++++++++++ > tools/perf/ui/browsers/hists.c | 59 ++++++++++++++- > tools/perf/util/event.c | 1 + > tools/perf/util/evsel.c | 2 +- > tools/perf/util/hist.c | 37 ++++++++++ > tools/perf/util/hist.h | 6 +- > tools/perf/util/probe-event.c | 5 +- > tools/perf/util/probe-file.c | 15 +--- > tools/perf/util/sort.c | 22 ++++++ > tools/perf/util/sort.h | 2 + > tools/perf/util/stat-shadow.c | 3 + > tools/perf/util/symbol.h | 1 + > tools/perf/util/util.h | 3 +- > 35 files changed, 409 insertions(+), 392 deletions(-) > delete mode 100644 tools/lib/api/fs/debugfs.c > delete mode 100644 tools/lib/api/fs/debugfs.h > delete mode 100644 tools/lib/api/fs/findfs.c > 
delete mode 100644 tools/lib/api/fs/findfs.h > delete mode 100644 tools/lib/api/fs/tracefs.c > delete mode 100644 tools/lib/api/fs/tracefs.h > create mode 100644 tools/perf/tests/topology.c ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2015-09-08 14:09 ` Arnaldo Carvalho de Melo @ 2015-09-08 14:21 ` Ingo Molnar 2015-09-08 14:30 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 51+ messages in thread From: Ingo Molnar @ 2015-09-08 14:21 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo, linux-kernel, Adrian Hunter, Andi Kleen, Corey Ashford, David Ahern, Frederic Weisbecker, Jan Stancek, Jiri Olsa, Kan Liang, Matt Fleming, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Raphael Beamonte, Stephane Eranian, Steven Rostedt * Arnaldo Carvalho de Melo <acme@redhat.com> wrote: > On Fri, Sep 04, 2015 at 10:06:28PM -0300, Arnaldo Carvalho de Melo wrote: > > Hi Ingo, > > > > Please consider pulling, this is on top of the previous pull request, > > perf-core-for-mingo. > > Ingo, please do not pull this 'perf-core-for-mingo-2' tag, there were > some misunderstandings about the acks for "Move sw clock metrics > printout to stat-shadow" and Jiri and Andi are working that out. > > I'll remove those patches and get a new perf-core-for-mingo-2 > tag in place, before continuing today's batch, which possibly will be > available as 'perf-core-for-mingo-3' > > What is in 'perf-core-for-mingo' should be Ok. Ok! Thanks, Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2015-09-08 14:21 ` Ingo Molnar @ 2015-09-08 14:30 ` Arnaldo Carvalho de Melo 2015-09-14 8:41 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-09-08 14:30 UTC (permalink / raw) To: Ingo Molnar Cc: Arnaldo Carvalho de Melo, linux-kernel, Adrian Hunter, Andi Kleen, Corey Ashford, David Ahern, Frederic Weisbecker, Jan Stancek, Jiri Olsa, Kan Liang, Matt Fleming, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Raphael Beamonte, Stephane Eranian, Steven Rostedt On Tue, Sep 08, 2015 at 04:21:47PM +0200, Ingo Molnar wrote: > > * Arnaldo Carvalho de Melo <acme@redhat.com> wrote: > > > On Fri, Sep 04, 2015 at 10:06:28PM -0300, Arnaldo Carvalho de Melo wrote: > > > Hi Ingo, > > > > > > Please consider pulling, this is on top of the previous pull request, > > > perf-core-for-mingo. > > > > Ingo, please do not pull this 'perf-core-for-mingo-2' tag, there were > > some misunderstandings about the acks for "Move sw clock metrics > > printout to stat-shadow" and Jiri and Andi are working that out. > > > > I'll remove those patches and get a new perf-core-for-mingo-2 > > tag in place, before continuing today's batch, which possibly will be > > available as 'perf-core-for-mingo-3' > > > > What is in 'perf-core-for-mingo' should be Ok. > > Ok! Thanks! I have already removed that problematic changeset and re-signed the 'perf-core-for-mingo-2' tag, same contents modulo that cset. - Arnaldo ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2015-09-08 14:30 ` Arnaldo Carvalho de Melo @ 2015-09-14 8:41 ` Ingo Molnar 2015-09-14 9:07 ` Wangnan (F) 0 siblings, 1 reply; 51+ messages in thread From: Ingo Molnar @ 2015-09-14 8:41 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo, linux-kernel, Adrian Hunter, Andi Kleen, Corey Ashford, David Ahern, Frederic Weisbecker, Jan Stancek, Jiri Olsa, Kan Liang, Matt Fleming, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Raphael Beamonte, Stephane Eranian, Steven Rostedt * Arnaldo Carvalho de Melo <acme@redhat.com> wrote: > On Tue, Sep 08, 2015 at 04:21:47PM +0200, Ingo Molnar wrote: > > > > * Arnaldo Carvalho de Melo <acme@redhat.com> wrote: > > > > > On Fri, Sep 04, 2015 at 10:06:28PM -0300, Arnaldo Carvalho de Melo wrote: > > > > Hi Ingo, > > > > > > > > Please consider pulling, this is on top of the previous pull request, > > > > perf-core-for-mingo. > > > > > > Ingo, please do not pull this 'perf-core-for-mingo-2' tag, there were > > > some misunderstandings about the acks for "Move sw clock metrics > > > printout to stat-shadow" and Jiri and Andi are working that out. > > > > > > I'll remove those patches and get a new perf-core-for-mingo-2 > > > tag in place, before continuing today's batch, which possibly will be > > > available as 'perf-core-for-mingo-3' > > > > > > What is in 'perf-core-for-mingo' should be Ok. > > > > Ok! > > Thanks! I have already removed that problematic changeset and re-signed > the 'perf-core-for-mingo-2' tag, same contents modulo that cset. 
Hm, so I pulled it (commit 1765d9b26f84), but with an old perf.data I'm getting this crash: triton:~/tip/tools/perf> perf report perf: Segmentation fault -------- backtrace -------- perf[0x52bc0b] /lib/x86_64-linux-gnu/libc.so.6(+0x352f0)[0x7f51a583c2f0] perf[0x42ce95] perf[0x4bc6c3] perf[0x4bcfa1] perf[0x4bf939] perf(perf_session__process_events+0x390)[0x4be430] perf(cmd_report+0x1070)[0x42e2e0] perf[0x478e03] perf(main+0x60a)[0x41f1ba] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f51a5827a40] perf(_start+0x29)[0x41f2d9] [0x0] I also re-tested 1765d9b26f84 and it still crashes. Bisected it to: e1e499aba570 perf tools: Add processor socket info to hist_entry and addr_location Running on Ubuntu, 1 socket box, 12 CPUs. I went back to perf/core 8f3e5684d3fb and it doesn't crash anymore - so I unpulled your tree for now. (Will send you the perf.data privately.) Thanks, Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2015-09-14 8:41 ` Ingo Molnar @ 2015-09-14 9:07 ` Wangnan (F) 0 siblings, 0 replies; 51+ messages in thread From: Wangnan (F) @ 2015-09-14 9:07 UTC (permalink / raw) To: Ingo Molnar, Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo, linux-kernel, Adrian Hunter, Andi Kleen, Corey Ashford, David Ahern, Frederic Weisbecker, Jan Stancek, Jiri Olsa, Kan Liang, Matt Fleming, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Raphael Beamonte, Stephane Eranian, Steven Rostedt On 2015/9/14 16:41, Ingo Molnar wrote: > * Arnaldo Carvalho de Melo <acme@redhat.com> wrote: > >> On Tue, Sep 08, 2015 at 04:21:47PM +0200, Ingo Molnar wrote: >>> * Arnaldo Carvalho de Melo <acme@redhat.com> wrote: >>> >>>> On Fri, Sep 04, 2015 at 10:06:28PM -0300, Arnaldo Carvalho de Melo wrote: >>>>> Hi Ingo, >>>>> >>>>> Please consider pulling, this is on top of the previous pull request, >>>>> perf-core-for-mingo. >>>> Ingo, please do not pull this 'perf-core-for-mingo-2' tag, there were >>>> some misunderstandings about the acks for "Move sw clock metrics >>>> printout to stat-shadow" and Jiri and Andi are working that out. >>>> >>>> I'll remove those patches and get a new perf-core-for-mingo-2 >>>> tag in place, before continuing today's batch, which possibly will be >>>> available as 'perf-core-for-mingo-3' >>>> >>>> What is in 'perf-core-for-mingo' should be Ok. >>> Ok! >> Thanks! I have already removed that problematic changeset and re-signed >> the 'perf-core-for-mingo-2' tag, same contents modulo that cset. 
> Hm, so I pulled it (commit 1765d9b26f84), but with an old perf.data I'm getting > this crash: > > triton:~/tip/tools/perf> perf report > perf: Segmentation fault > -------- backtrace -------- > perf[0x52bc0b] > /lib/x86_64-linux-gnu/libc.so.6(+0x352f0)[0x7f51a583c2f0] > perf[0x42ce95] > perf[0x4bc6c3] > perf[0x4bcfa1] > perf[0x4bf939] > perf(perf_session__process_events+0x390)[0x4be430] > perf(cmd_report+0x1070)[0x42e2e0] > perf[0x478e03] > perf(main+0x60a)[0x41f1ba] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f51a5827a40] > perf(_start+0x29)[0x41f2d9] > [0x0] > > I also re-tested 1765d9b26f84 and it still crashes. > > Bisected it to: > > e1e499aba570 perf tools: Add processor socket info to hist_entry and addr_location > > Running on Ubuntu, 1 socket box, 12 CPUs. Hi Ingo, It seems you hit a bug we have been discussing over the past few days. Please have a look at the following discussions: http://lkml.kernel.org/r/1441630315-189525-1-git-send-email-wangnan0@huawei.com http://lkml.kernel.org/r/1441828225-667-1-git-send-email-acme@kernel.org Thank you. > I went back to perf/core 8f3e5684d3fb and it doesn't crash anymore - so I unpulled > your tree for now. (Will send you the perf.data privately.) > > Thanks, > > Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* [GIT PULL 00/15] perf/core improvements and fixes @ 2015-06-08 14:17 Arnaldo Carvalho de Melo 2015-06-09 9:47 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-06-08 14:17 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen, David Ahern, He Kuang, Jiri Olsa, Namhyung Kim, Peter Zijlstra, Stephane Eranian, Wang Nan, Arnaldo Carvalho de Melo Hi Ingo, Please consider pulling, more to come, - Arnaldo The following changes since commit a3d86542de8850be52e8589da22b24002941dfb7: perf/x86/intel/pebs: Add PEBSv3 decoding (2015-06-07 16:09:16 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo for you to fetch changes up to d3a7c489c7fd2463e3b2c3a2179c7be879dd9cb4: perf tools: Reference count struct dso (2015-06-08 10:31:40 -0300) ---------------------------------------------------------------- perf/core improvements and fixes: User visible: - Fix perf.data size reporting in 'perf record' in no-buildid mode (He Kuang) Infrastructure: - Protect accesses the dso rbtrees/lists with a rw lock and reference count struct dso instances (Arnaldo Carvalho de Melo) - Export dynamic symbols used by traceevent plugins (He Kuang) - Add libtrace-dynamic-list file to libtraceevent's .gitignore (He Kuang) - Refactor shadow stats code in 'perf stat', prep work for further patchkits (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> ---------------------------------------------------------------- Arnaldo Carvalho de Melo (3): perf machine: Fix up some more method names perf tools: Protect accesses the dso rbtrees/lists with a rw lock perf tools: Reference count struct dso He Kuang (3): tools lib traceevent: Export dynamic symbols used by traceevent plugins tools lib traceevent: Ignore libtrace-dynamic-list file perf record: Fix perf.data size in no-buildid mode Jiri Olsa (9): perf stat: Add 
id into perf_stat struct perf stat: Replace transaction event possition check with id check perf stat: Remove setup_events function perf stat: Remove transaction_run from shadow update/print code perf stat: Introduce reset_shadow_stats function perf stat: Introduce print_shadow_stats function perf stat: Add output file argument to print_shadow_stats function perf stat: Add aggr_mode argument to print_shadow_stats function perf stat: Move shadow stat counters into separate object tools/lib/traceevent/.gitignore | 1 + tools/lib/traceevent/Makefile | 14 +- tools/perf/Makefile.perf | 14 +- tools/perf/builtin-record.c | 6 +- tools/perf/builtin-stat.c | 506 ++-------------------------------------- tools/perf/tests/dso-data.c | 4 +- tools/perf/tests/hists_common.c | 6 +- tools/perf/util/Build | 1 + tools/perf/util/dso.c | 87 +++++-- tools/perf/util/dso.h | 24 +- tools/perf/util/header.c | 1 + tools/perf/util/machine.c | 58 +++-- tools/perf/util/machine.h | 4 +- tools/perf/util/map.c | 11 +- tools/perf/util/probe-event.c | 2 +- tools/perf/util/probe-finder.c | 2 +- tools/perf/util/stat-shadow.c | 434 ++++++++++++++++++++++++++++++++++ tools/perf/util/stat.c | 35 ++- tools/perf/util/stat.h | 40 ++++ tools/perf/util/symbol-elf.c | 2 +- tools/perf/util/symbol.c | 4 +- tools/perf/util/vdso.c | 54 +++-- 22 files changed, 737 insertions(+), 573 deletions(-) create mode 100644 tools/perf/util/stat-shadow.c ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2015-06-08 14:17 Arnaldo Carvalho de Melo @ 2015-06-09 9:47 ` Ingo Molnar 0 siblings, 0 replies; 51+ messages in thread From: Ingo Molnar @ 2015-06-09 9:47 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Adrian Hunter, Andi Kleen, David Ahern, He Kuang, Jiri Olsa, Namhyung Kim, Peter Zijlstra, Stephane Eranian, Wang Nan, Arnaldo Carvalho de Melo * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > Hi Ingo, > > Please consider pulling, more to come, > > - Arnaldo > > The following changes since commit a3d86542de8850be52e8589da22b24002941dfb7: > > perf/x86/intel/pebs: Add PEBSv3 decoding (2015-06-07 16:09:16 +0200) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo > > for you to fetch changes up to d3a7c489c7fd2463e3b2c3a2179c7be879dd9cb4: > > perf tools: Reference count struct dso (2015-06-08 10:31:40 -0300) > > ---------------------------------------------------------------- > perf/core improvements and fixes: > > User visible: > > - Fix perf.data size reporting in 'perf record' in no-buildid mode (He Kuang) > > Infrastructure: > > - Protect accesses the dso rbtrees/lists with a rw lock and reference > count struct dso instances (Arnaldo Carvalho de Melo) > > - Export dynamic symbols used by traceevent plugins (He Kuang) > > - Add libtrace-dynamic-list file to libtraceevent's .gitignore (He Kuang) > > - Refactor shadow stats code in 'perf stat', prep work for further > patchkits (Jiri Olsa) > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > > ---------------------------------------------------------------- > Arnaldo Carvalho de Melo (3): > perf machine: Fix up some more method names > perf tools: Protect accesses the dso rbtrees/lists with a rw lock > perf tools: Reference count struct dso > > He Kuang (3): > tools lib traceevent: Export dynamic symbols used by traceevent plugins > tools lib 
traceevent: Ignore libtrace-dynamic-list file > perf record: Fix perf.data size in no-buildid mode > > Jiri Olsa (9): > perf stat: Add id into perf_stat struct > perf stat: Replace transaction event possition check with id check > perf stat: Remove setup_events function > perf stat: Remove transaction_run from shadow update/print code > perf stat: Introduce reset_shadow_stats function > perf stat: Introduce print_shadow_stats function > perf stat: Add output file argument to print_shadow_stats function > perf stat: Add aggr_mode argument to print_shadow_stats function > perf stat: Move shadow stat counters into separate object > > tools/lib/traceevent/.gitignore | 1 + > tools/lib/traceevent/Makefile | 14 +- > tools/perf/Makefile.perf | 14 +- > tools/perf/builtin-record.c | 6 +- > tools/perf/builtin-stat.c | 506 ++-------------------------------------- > tools/perf/tests/dso-data.c | 4 +- > tools/perf/tests/hists_common.c | 6 +- > tools/perf/util/Build | 1 + > tools/perf/util/dso.c | 87 +++++-- > tools/perf/util/dso.h | 24 +- > tools/perf/util/header.c | 1 + > tools/perf/util/machine.c | 58 +++-- > tools/perf/util/machine.h | 4 +- > tools/perf/util/map.c | 11 +- > tools/perf/util/probe-event.c | 2 +- > tools/perf/util/probe-finder.c | 2 +- > tools/perf/util/stat-shadow.c | 434 ++++++++++++++++++++++++++++++++++ > tools/perf/util/stat.c | 35 ++- > tools/perf/util/stat.h | 40 ++++ > tools/perf/util/symbol-elf.c | 2 +- > tools/perf/util/symbol.c | 4 +- > tools/perf/util/vdso.c | 54 +++-- > 22 files changed, 737 insertions(+), 573 deletions(-) > create mode 100644 tools/perf/util/stat-shadow.c Pulled, thanks a lot Arnaldo! Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* [GIT PULL 00/15] perf/core improvements and fixes @ 2015-04-02 22:28 Arnaldo Carvalho de Melo 2015-04-03 5:02 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-04-02 22:28 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Borislav Petkov, David Ahern, Don Zickus, Frederic Weisbecker, Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian, Wang Nan, Yunlong Song, Arnaldo Carvalho de Melo Hi Ingo, Please consider pulling, - Arnaldo The following changes since commit e1abf2cc8d5d80b41c4419368ec743ccadbb131e: bpf: Fix the build on BPF_SYSCALL=y && !CONFIG_TRACING kernels, make it more configurable (2015-04-02 16:28:06 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo for you to fetch changes up to bd05954bfa17f03a7bd4454178ba09786b35e383: perf data: Support using -f to override perf.data file ownership for 'convert' (2015-04-02 13:18:52 -0300) ---------------------------------------------------------------- perf/core improvements and fixes: User visible: - Support unnamed union/structure members data collection in 'perf probe' (Masami Hiramatsu) - Support missing -f to override perf.data file ownership (Yunlong Song) Infrastructure: - No need to lookup thread twice when processing samples in 'perf script' (Arnaldo Carvalho de Melo) - No need to pass thread twice to the scripting callbacks (Arnaldo Carvalho de Melo) - No need to pass thread twice to the db-export facility (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> ---------------------------------------------------------------- Arnaldo Carvalho de Melo (4): perf script: No need to lookup thread twice perf scripting: No need to pass thread twice to the scripting callbacks perf db-export: No need to pass thread twice to db_export__sample perf db-export: No need to have 
->thread twice in struct export_sample Masami Hiramatsu (1): perf probe: Fix to track down unnamed union/structure members Yunlong Song (10): perf evlist: Support using -f to override perf.data file ownership perf inject: Support using -f to override perf.data file ownership perf kmem: Support using -f to override perf.data file ownership perf kvm: Support using -f to override perf.data.guest file ownership perf lock: Support using -f to override perf.data file ownership perf mem: Support using -f to override perf.data file ownership perf script: Support using -f to override perf.data file ownership perf timechart: Support using -f to override perf.data file ownership perf trace: Support using -f to override perf.data file ownership perf data: Support using -f to override perf.data file ownership for 'convert' tools/perf/builtin-data.c | 4 +++- tools/perf/builtin-evlist.c | 2 ++ tools/perf/builtin-inject.c | 1 + tools/perf/builtin-kmem.c | 9 +++++---- tools/perf/builtin-kvm.c | 2 ++ tools/perf/builtin-lock.c | 5 +++++ tools/perf/builtin-mem.c | 3 +++ tools/perf/builtin-script.c | 23 ++++++++-------------- tools/perf/builtin-timechart.c | 3 +++ tools/perf/builtin-trace.c | 3 +++ tools/perf/util/data-convert-bt.c | 3 ++- tools/perf/util/data-convert-bt.h | 2 +- tools/perf/util/db-export.c | 4 ++-- tools/perf/util/db-export.h | 3 +-- tools/perf/util/dwarf-aux.c | 14 +++++++++---- tools/perf/util/evsel.h | 1 + tools/perf/util/kvm-stat.h | 1 + tools/perf/util/probe-finder.c | 8 +++++++- .../perf/util/scripting-engines/trace-event-perl.c | 5 ++--- .../util/scripting-engines/trace-event-python.c | 16 ++++++--------- tools/perf/util/trace-event-scripting.c | 1 - tools/perf/util/trace-event.h | 3 +-- 22 files changed, 69 insertions(+), 47 deletions(-) ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2015-04-02 22:28 Arnaldo Carvalho de Melo @ 2015-04-03 5:02 ` Ingo Molnar 0 siblings, 0 replies; 51+ messages in thread From: Ingo Molnar @ 2015-04-03 5:02 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Adrian Hunter, Borislav Petkov, David Ahern, Don Zickus, Frederic Weisbecker, Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian, Wang Nan, Yunlong Song, Arnaldo Carvalho de Melo * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > Hi Ingo, > > Please consider pulling, > > - Arnaldo > > The following changes since commit e1abf2cc8d5d80b41c4419368ec743ccadbb131e: > > bpf: Fix the build on BPF_SYSCALL=y && !CONFIG_TRACING kernels, make it more configurable (2015-04-02 16:28:06 +0200) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo > > for you to fetch changes up to bd05954bfa17f03a7bd4454178ba09786b35e383: > > perf data: Support using -f to override perf.data file ownership for 'convert' (2015-04-02 13:18:52 -0300) > > ---------------------------------------------------------------- > perf/core improvements and fixes: > > User visible: > > - Support unnamed union/structure members data collection in 'perf probe' (Masami Hiramatsu) > > - Support missing -f to override perf.data file ownership (Yunlong Song) > > Infrastructure: > > - No need to lookup thread twice when processing samples in 'perf script' (Arnaldo Carvalho de Melo) > > - No need to pass thread twice to the scripting callbacks (Arnaldo Carvalho de Melo) > > - No need to pass thread twice to the db-export facility (Arnaldo Carvalho de Melo) > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > > ---------------------------------------------------------------- > Arnaldo Carvalho de Melo (4): > perf script: No need to lookup thread twice > perf scripting: No need to pass thread twice to the 
scripting callbacks > perf db-export: No need to pass thread twice to db_export__sample > perf db-export: No need to have ->thread twice in struct export_sample > > Masami Hiramatsu (1): > perf probe: Fix to track down unnamed union/structure members > > Yunlong Song (10): > perf evlist: Support using -f to override perf.data file ownership > perf inject: Support using -f to override perf.data file ownership > perf kmem: Support using -f to override perf.data file ownership > perf kvm: Support using -f to override perf.data.guest file ownership > perf lock: Support using -f to override perf.data file ownership > perf mem: Support using -f to override perf.data file ownership > perf script: Support using -f to override perf.data file ownership > perf timechart: Support using -f to override perf.data file ownership > perf trace: Support using -f to override perf.data file ownership > perf data: Support using -f to override perf.data file ownership for 'convert' > > tools/perf/builtin-data.c | 4 +++- > tools/perf/builtin-evlist.c | 2 ++ > tools/perf/builtin-inject.c | 1 + > tools/perf/builtin-kmem.c | 9 +++++---- > tools/perf/builtin-kvm.c | 2 ++ > tools/perf/builtin-lock.c | 5 +++++ > tools/perf/builtin-mem.c | 3 +++ > tools/perf/builtin-script.c | 23 ++++++++-------------- > tools/perf/builtin-timechart.c | 3 +++ > tools/perf/builtin-trace.c | 3 +++ > tools/perf/util/data-convert-bt.c | 3 ++- > tools/perf/util/data-convert-bt.h | 2 +- > tools/perf/util/db-export.c | 4 ++-- > tools/perf/util/db-export.h | 3 +-- > tools/perf/util/dwarf-aux.c | 14 +++++++++---- > tools/perf/util/evsel.h | 1 + > tools/perf/util/kvm-stat.h | 1 + > tools/perf/util/probe-finder.c | 8 +++++++- > .../perf/util/scripting-engines/trace-event-perl.c | 5 ++--- > .../util/scripting-engines/trace-event-python.c | 16 ++++++--------- > tools/perf/util/trace-event-scripting.c | 1 - > tools/perf/util/trace-event.h | 3 +-- > 22 files changed, 69 insertions(+), 47 deletions(-) Pulled, thanks a lot 
Arnaldo! Ingo ^ permalink raw reply [flat|nested] 51+ messages in thread
* [GIT PULL 00/15] perf/core improvements and fixes @ 2014-10-15 20:52 Arnaldo Carvalho de Melo 2014-10-16 5:18 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2014-10-15 20:52 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, Alexander Yarygin, Andi Kleen, Anshuman Khandual, Arun Sharma, Christian Borntraeger, Cody P Schafer, David Ahern, Frederic Weisbecker, Haren Myneni, Jean Pihet, Jiri Olsa, Kan Liang, linuxppc-dev, Masanari Iida, Michael Ellerman, Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Randy Dunlap, Stephane Eranian, Sukadev Bhattiprolu, Taeung Song, Yasser Shalabi, Arnaldo Carvalho de Melo Hi Ingo, Please consider pulling, I guess the changes are minor or affect just some non-core feature, so it is your call if you prefer to pull it into perf/urgent instead. Best Regards, - Arnaldo The following changes since commit ec4212d88a77eb6caec10777ddd629b702a5ebbd: Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2014-10-15 11:54:14 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo for you to fetch changes up to 673d659f5c5918b7ddbafebf1f129c9eb82973b4: perf kvm stat live: Enable events copying (2014-10-15 17:39:03 -0300) ---------------------------------------------------------------- perf/core improvements and fixes: User visible: * Add a visual cue for toggle zeroing of samples in 'perf top' (Taeung Song) * Fix for double free in 'perf stat' when using some specific invalid command line combo (Yasser Shalabi) Infrastructure: * Add option to copy events when queuing for sorting across cpu buffers and enable it for 'perf kvm stat live', so that events left in the queue do not point into ring buffer memory that gets rewritten in high volume sessions. 
(Alexander Yarygin, improving work done by David Ahern): * Document sysfs events/ interfaces (Cody P Schafer) * Add support to new style format of kernel PMU event. (Kan Liang) * Fix typos in perf/Documentation (Masanari Iida) * Improve callchains when using libunwind (Namhyung Kim) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Lines starting with '#' will be ignored. ---------------------------------------------------------------- Alexander Yarygin (2): perf session: Add option to copy events when queueing perf kvm stat live: Enable events copying Cody P Schafer (2): perf Documentation: sysfs events/ interfaces perf Documentation: Remove Ruplicated docs for powerpc cpu specific events Kan Liang (4): Revert "perf tools: Default to cpu// for events v5" perf tools: Parse the pmu event prefix and suffix perf tools: Add support to new style format of kernel PMU event perf test: Add test case for pmu event new style format Masanari Iida (1): perf Documentation: Fix typos in perf/Documentation Namhyung Kim (4): perf report: Set callchain_param.record_mode for future use perf callchain: Create an address space per thread perf kvm: Use thread_{,_set}_priv helpers perf trace: Use thread_{,_set}_priv helpers Taeung Song (1): perf top: Add a visual cue for toggle zeroing of samples Yasser Shalabi (1): perf evlist: Fix for double free in tools/perf stat .../testing/sysfs-bus-event_source-devices-events | 611 ++------------------- tools/perf/Documentation/perf-diff.txt | 6 +- tools/perf/Documentation/perf-kvm.txt | 4 +- tools/perf/Documentation/perf-list.txt | 2 +- tools/perf/Documentation/perf-record.txt | 2 +- tools/perf/Documentation/perf-script-perl.txt | 4 +- tools/perf/Documentation/perf-script-python.txt | 6 +- tools/perf/Documentation/perf-script.txt | 2 +- tools/perf/Documentation/perf-test.txt | 2 +- tools/perf/Documentation/perf-trace.txt | 2 +- tools/perf/builtin-kvm.c | 7 +- tools/perf/builtin-report.c | 7 + tools/perf/builtin-trace.c | 16 +- 
tools/perf/tests/dwarf-unwind.c | 3 + tools/perf/tests/parse-events.c | 36 ++ tools/perf/ui/browsers/hists.c | 32 +- tools/perf/util/evlist.c | 1 + tools/perf/util/include/linux/string.h | 1 - tools/perf/util/ordered-events.c | 49 +- tools/perf/util/ordered-events.h | 10 +- tools/perf/util/parse-events.c | 133 ++++- tools/perf/util/parse-events.h | 14 + tools/perf/util/parse-events.l | 30 +- tools/perf/util/parse-events.y | 40 ++ tools/perf/util/pmu.c | 10 - tools/perf/util/pmu.h | 10 + tools/perf/util/session.c | 5 +- tools/perf/util/string.c | 24 - tools/perf/util/thread.c | 6 + tools/perf/util/unwind-libunwind.c | 37 +- tools/perf/util/unwind.h | 17 + 31 files changed, 460 insertions(+), 669 deletions(-)
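The "copy events when queuing" item in the pull request above can be pictured with a small sketch. This is not the code from tools/perf/util/ordered-events.c; the names `toy_event`, `queued_event`, `queue_event` and `release_event` are made up for illustration, and the record layout is an assumption. It only shows why a queued pointer into a ring buffer goes stale when the slot is rewritten, while a private copy does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a record living in the mmap ring buffer. */
struct toy_event {
	uint16_t size;       /* total size of the record in bytes */
	uint64_t timestamp;
	uint64_t payload;
};

/* Hypothetical queue entry; 'owned' says whether 'event' must be freed. */
struct queued_event {
	struct toy_event *event;
	bool owned;
};

/*
 * Queue an event for later sorting.
 * copy == false: keep only the pointer into the ring buffer, so the queued
 *                data changes if that slot is overwritten.
 * copy == true:  take a private copy that survives the overwrite.
 * Returns 0 on success, -1 if the copy could not be allocated.
 */
static int queue_event(struct queued_event *qe, struct toy_event *ev, bool copy)
{
	assert(qe && ev && ev->size >= sizeof(*ev));

	if (!copy) {
		qe->event = ev;
		qe->owned = false;
		return 0;
	}

	qe->event = malloc(ev->size);
	if (!qe->event)
		return -1;
	memcpy(qe->event, ev, ev->size);
	qe->owned = true;
	return 0;
}

/* Drop a queue entry, freeing the copy if one was made. */
static void release_event(struct queued_event *qe)
{
	if (qe->owned)
		free(qe->event);
	qe->event = NULL;
	qe->owned = false;
}
```

The trade-off is the usual one: copying costs an allocation per queued event, which is presumably why the pull request describes it as an option enabled only for 'perf kvm stat live'.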
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2014-10-15 20:52 Arnaldo Carvalho de Melo @ 2014-10-16 5:18 ` Ingo Molnar 0 siblings, 0 replies; 51+ messages in thread From: Ingo Molnar @ 2014-10-16 5:18 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Alexander Yarygin, Andi Kleen, Anshuman Khandual, Arun Sharma, Christian Borntraeger, Cody P Schafer, David Ahern, Frederic Weisbecker, Haren Myneni, Jean Pihet, Jiri Olsa, Kan Liang, linuxppc-dev, Masanari Iida, Michael Ellerman, Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Randy Dunlap, Stephane Eranian, Sukadev Bhattiprolu, Taeung Song, Yasser Shalabi, Arnaldo Carvalho de Melo * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > Hi Ingo, > > [...] Pulled, thanks a lot Arnaldo! Ingo
* [GIT PULL 00/15] perf/core improvements and fixes @ 2014-06-09 20:02 Jiri Olsa 2014-06-12 11:54 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Jiri Olsa @ 2014-06-09 20:02 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford, David Ahern, Don Zickus, Frederic Weisbecker, Javi Merino, Jean Pihet, Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian, Steven Rostedt hi Ingo, please consider pulling thanks, jirka The following changes since commit 82b897782d10fcc4930c9d4a15b175348fdd2871: perf: Differentiate exec() and non-exec() comm events (2014-06-06 07:56:22 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git tags/perf-core-for-mingo for you to fetch changes up to a2609f3b0c582d6aaa8f69a61a0eea6c7a98d291: perf tools: Support spark lines in perf stat (2014-06-09 13:34:50 +0200) ---------------------------------------------------------------- perf/core improvements and fixes: . Bitmask handling and plugin updates (Steven Rostedt) . Fix pipe check regression in attr event callback (Jiri Olsa) . Prettify the tags/TAGS/cscope targets output (Jiri Olsa) . Print array argument as string (Namhyung Kim) . Pass protection and flags bits through mmap2 interface (Peter Zijlstra) . Update perf tool mmap2 interface with protection and flag bits (Don Zickus) . Re-enable mmap interface (Don Zickus) . Add mem-mode documentation to report command (Don Zickus) . Add sort on dcacheline (Don Zickus) . 
Support spark lines in perf stat (Andi Kleen) Signed-off-by: Jiri Olsa <jolsa@kernel.org> ---------------------------------------------------------------- Andi Kleen (1): perf tools: Support spark lines in perf stat Don Zickus (6): perf tools: Update mmap2 interface with protection and flag bits Revert "perf: Disable PERF_RECORD_MMAP2 support" perf report: Add mem-mode documentation to report command perf tools: Add cpumode to struct hist_entry perf tools: Add support to dynamically get cacheline size perf tools: Add dcacheline sort Jiri Olsa (2): perf tools: Fix pipe check regression in attr event callback perf tools: Prettify the tags/TAGS/cscope targets output Namhyung Kim (1): perf script/python: Print array argument as string Peter Zijlstra (1): perf: Pass protection and flags bits through mmap2 interface Steven Rostedt (1): tools lib traceevent: Add options to plugins Steven Rostedt (Red Hat) (3): tools lib traceevent: Add flag to not load event plugins tools lib traceevent: Add options to function plugin tools lib traceevent: Added support for __get_bitmask() macro include/uapi/linux/perf_event.h | 1 + kernel/events/core.c | 37 +++- tools/lib/traceevent/event-parse.c | 113 ++++++++++++ tools/lib/traceevent/event-parse.h | 25 ++- tools/lib/traceevent/event-plugin.c | 203 ++++++++++++++++++++- tools/lib/traceevent/plugin_function.c | 43 ++++- tools/perf/Documentation/perf-report.txt | 23 +++ tools/perf/Documentation/perf-stat.txt | 4 + tools/perf/Makefile.perf | 7 +- tools/perf/builtin-inject.c | 2 +- tools/perf/builtin-stat.c | 12 ++ tools/perf/perf.c | 1 + tools/perf/tests/dwarf-unwind.c | 2 +- tools/perf/util/event.c | 57 ++++-- tools/perf/util/event.h | 2 + tools/perf/util/evsel.c | 1 + tools/perf/util/hist.c | 9 +- tools/perf/util/hist.h | 1 + tools/perf/util/machine.c | 4 +- tools/perf/util/map.c | 4 +- tools/perf/util/map.h | 4 +- .../perf/util/scripting-engines/trace-event-perl.c | 1 + .../util/scripting-engines/trace-event-python.c | 2 + 
tools/perf/util/sort.c | 107 +++++++++++ tools/perf/util/sort.h | 2 + tools/perf/util/spark.c | 31 ++++ tools/perf/util/spark.h | 4 + tools/perf/util/stat.c | 34 ++++ tools/perf/util/stat.h | 10 + tools/perf/util/util.c | 1 + tools/perf/util/util.h | 1 + 31 files changed, 707 insertions(+), 41 deletions(-) create mode 100644 tools/perf/util/spark.c create mode 100644 tools/perf/util/spark.h
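For the "sort on dcacheline" and "dynamically get cacheline size" items above, the underlying arithmetic is simple enough to sketch. The helper names `cacheline_base` and `same_cacheline` are hypothetical and this is not taken from tools/perf/util/sort.c; it assumes the cacheline size is a power of two and has already been obtained at run time.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a data address down to the start of its cacheline.
 * 'cacheline_size' must be a non-zero power of two (64 on many x86 CPUs).
 */
static uint64_t cacheline_base(uint64_t addr, uint64_t cacheline_size)
{
	assert(cacheline_size != 0 &&
	       (cacheline_size & (cacheline_size - 1)) == 0);
	return addr & ~(cacheline_size - 1);
}

/* Two samples land in the same sort bucket when their cachelines match. */
static int same_cacheline(uint64_t a, uint64_t b, uint64_t cacheline_size)
{
	return cacheline_base(a, cacheline_size) ==
	       cacheline_base(b, cacheline_size);
}
```

With a 64-byte line, addresses 0x1200 and 0x123f share a bucket while 0x1240 starts the next one.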
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2014-06-09 20:02 Jiri Olsa @ 2014-06-12 11:54 ` Ingo Molnar 0 siblings, 0 replies; 51+ messages in thread From: Ingo Molnar @ 2014-06-12 11:54 UTC (permalink / raw) To: Jiri Olsa Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford, David Ahern, Don Zickus, Frederic Weisbecker, Javi Merino, Jean Pihet, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian, Steven Rostedt * Jiri Olsa <jolsa@kernel.org> wrote: > > hi Ingo, > please consider pulling > > [...] Pulled, thanks a lot Jiri! Ingo
* [GIT PULL 00/15] perf/core improvements and fixes @ 2013-08-30 18:58 Arnaldo Carvalho de Melo 2013-08-31 8:08 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2013-08-30 18:58 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Frederic Weisbecker, Jiri Olsa, Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo From: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Hi Ingo, Please consider pulling, - Arnaldo The following changes since commit 00e4cb1ced1b17c35465defafe86d156cbd7544e: Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2013-08-29 12:02:34 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo for you to fetch changes up to f2935f3e585226b8203ec3861907e1cb16ad3d6a: perf trace: Handle missing HUGEPAGE defines (2013-08-30 15:43:28 -0300) ---------------------------------------------------------------- perf/core improvements and fixes: . Tidy up sample parsing validation, from Adrian Hunter. . Make events stream always parsable by adding a new sample_type bit: PERF_SAMPLE_IDENTIFIER, that when requested will always be at a fixed position in all PERF_RECORD_ records, from Adrian Hunter. . Add a sample parsing test, from Adrian Hunter. .
Add option to 'perf trace' to analyze events in a file versus live, so that one can do: [root@zoo ~]# perf record -a -e raw_syscalls:* sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 25.150 MB perf.data (~1098836 samples) ] [root@zoo ~]# perf trace -i perf.data -e futex --duration 1 17.799 ( 1.020 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, ua 113.344 (95.429 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 4294967 133.778 ( 1.042 ms): 18004 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 429496 [root@zoo ~]# From David Ahern. . Honor target pid / tid options in 'perf trace' when analyzing a file, from David Ahern. . Handle missing HUGEPAGE defines in the mmap beautifier in 'perf trace', from David Ahern. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> ---------------------------------------------------------------- Adrian Hunter (11): perf tools: change machine__findnew_thread() to set thread pid perf evsel: Tidy up sample parsing overflow checking perf callchain: Remove unnecessary validation perf tools: Remove references to struct ip_event perf: make events stream always parsable perf evlist: Move perf_evlist__config() to a new source file perf tools: Add support for PERF_SAMPLE_IDENTIFIER perf tools: Add missing 'abi' member to 'struct regs_dump' perf tools: Expand perf_event__synthesize_sample() perf tools: Add a function to calculate sample event size perf tests: Add a sample parsing test David Ahern (4): perf evlist: Add tracepoint lookup by name perf trace: Add option to analyze events in a file versus live perf trace: Honor target pid / tid options when analyzing a file perf trace: Handle missing HUGEPAGE defines include/uapi/linux/perf_event.h | 27 ++- kernel/events/core.c | 11 +- tools/perf/Documentation/perf-trace.txt | 4 + tools/perf/Makefile | 2 + 
tools/perf/builtin-inject.c | 8 +- tools/perf/builtin-kmem.c | 3 +- tools/perf/builtin-kvm.c | 2 +- tools/perf/builtin-lock.c | 3 +- tools/perf/builtin-mem.c | 2 +- tools/perf/builtin-report.c | 2 +- tools/perf/builtin-sched.c | 20 +- tools/perf/builtin-script.c | 3 +- tools/perf/builtin-top.c | 11 +- tools/perf/builtin-trace.c | 157 ++++++++++++- tools/perf/tests/builtin-test.c | 4 + tools/perf/tests/code-reading.c | 4 +- tools/perf/tests/hists_link.c | 23 +- tools/perf/tests/mmap-basic.c | 2 +- tools/perf/tests/sample-parsing.c | 316 +++++++++++++++++++++++++ tools/perf/tests/tests.h | 1 + tools/perf/util/build-id.c | 11 +- tools/perf/util/callchain.c | 8 - tools/perf/util/callchain.h | 5 - tools/perf/util/event.c | 5 +- tools/perf/util/event.h | 18 +- tools/perf/util/evlist.c | 140 +++++++++-- tools/perf/util/evlist.h | 12 +- tools/perf/util/evsel.c | 405 ++++++++++++++++++++++++++++---- tools/perf/util/evsel.h | 14 +- tools/perf/util/machine.c | 22 +- tools/perf/util/machine.h | 3 +- tools/perf/util/record.c | 108 +++++++++ tools/perf/util/session.c | 32 +-- 33 files changed, 1193 insertions(+), 195 deletions(-) create mode 100644 tools/perf/tests/sample-parsing.c create mode 100644 tools/perf/util/record.c
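The PERF_SAMPLE_IDENTIFIER item in the pull request above says the ID ends up "at a fixed position in all PERF_RECORD_ records". A minimal sketch of what that buys a parser follows. It rests on assumptions not spelled out in this thread: that the ID is the first 64-bit word after the header in sample records and the last 64-bit word in every other record, and that the header is the 8-byte type/misc/size triple. `rec_header`, `TOY_RECORD_SAMPLE` and `record_id` are hypothetical names, not perf tool functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified mirror of the 8-byte record header; the layout is assumed. */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;   /* size of the whole record, header included */
};

/* Assumed to match the value of PERF_RECORD_SAMPLE in the uapi header. */
#define TOY_RECORD_SAMPLE 9

/*
 * Fetch the event ID without knowing the event's full sample_type:
 * first u64 of the body for samples, last u64 of the record otherwise.
 * memcpy() is used so the sketch does not depend on buffer alignment.
 */
static uint64_t record_id(const unsigned char *rec)
{
	struct rec_header hdr;
	uint64_t id;

	memcpy(&hdr, rec, sizeof(hdr));
	assert(hdr.size >= sizeof(hdr) + sizeof(id) && hdr.size % 8 == 0);

	if (hdr.type == TOY_RECORD_SAMPLE)
		memcpy(&id, rec + sizeof(hdr), sizeof(id));
	else
		memcpy(&id, rec + hdr.size - sizeof(id), sizeof(id));
	return id;
}
```

Once the ID is known, the parser can look up which event the record belongs to and only then decode the remaining, event-specific fields.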
* Re: [GIT PULL 00/15] perf/core improvements and fixes 2013-08-30 18:58 Arnaldo Carvalho de Melo @ 2013-08-31 8:08 ` Ingo Molnar 0 siblings, 0 replies; 51+ messages in thread From: Ingo Molnar @ 2013-08-31 8:08 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Frederic Weisbecker, Jiri Olsa, Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian, Arnaldo Carvalho de Melo * Arnaldo Carvalho de Melo <acme@infradead.org> wrote: > From: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> > > Hi Ingo, > > Please consider pulling, > > - Arnaldo > > [...] Pulled, thanks Arnaldo! Ingo
* [GIT PULL 00/15] perf/core improvements and fixes @ 2013-02-28 21:05 Arnaldo Carvalho de Melo 0 siblings, 0 replies; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2013-02-28 21:05 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, Borislav Petkov, Corey Ashford, David Ahern, Feng Tang, Frederic Weisbecker, Ingo Molnar, Ingo Molnar, Jiri Olsa, liguang, Marcin Slusarz, Michael Ellerman, Namhyung Kim, Namhyung Kim, Oleg Nesterov, Paul Mackerras, Pekka Enberg, Peter Zijlstra, Steven Rostedt, Wu Fengguang, Arnaldo Carvalho de Melo From: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Hi Ingo, Please consider pulling, - Arnaldo The following changes since commit e259514eef764a5286873618e34c560ecb6cff13: perf/x86/amd: Enable northbridge performance counters on AMD family 15h (2013-02-16 09:37:27 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo for you to fetch changes up to 0e0c6670a333aa884d11799f38a435bdf4c408ed: perf report: Fix build with NO_NEWT=1 (2013-02-28 16:51:01 -0300) ---------------------------------------------------------------- perf/core improvements and fixes: . Honor parallel jobs, fix from Borislav Petkov . Introduce tools/lib/lk library, initially with just debugfs handling routines shared with tools/vm, more to come, from Borislav Petkov . Fix handling of -C (cpus) in perf record, from Jiri Olsa . Add perf_event_attr entries in 'perf test' to check -C handling in 'record' and 'stat', from Jiri Olsa. . Check if -DFORTIFY_SOURCE=2 is allowed, fix from Marcin Slusarz. . Fix build with NO_NEWT=1, from Michael Ellerman. 
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> ---------------------------------------------------------------- Borislav Petkov (6): perf tools: Remove a write-only variable in the debugfs code perf tools: Honor parallel jobs perf tools: Correct Makefile.include perf tools: Introduce tools/lib/lk library perf tools: Extract perf-specific stuff from debugfs.c tools/vm: Switch to liblk library Jiri Olsa (5): perf tests: Make attr script verbose friendly perf tests: Make attr script test event cpu perf tests: Add attr record -C cpu test perf tests: Add attr stat -C cpu test perf record: Fix -C option Marcin Slusarz (1): perf tools: check if -DFORTIFY_SOURCE=2 is allowed Michael Ellerman (2): perf annotate: Fix build with NO_NEWT=1 perf report: Fix build with NO_NEWT=1 liguang (1): perf tools: Sort command-list.txt alphabetically Makefile | 4 +- tools/Makefile | 16 ++++++- tools/lib/lk/Makefile | 35 +++++++++++++++ tools/{perf/util => lib/lk}/debugfs.c | 49 ++++++++------------ tools/lib/lk/debugfs.h | 29 ++++++++++++ tools/perf/MANIFEST | 1 + tools/perf/Makefile | 42 +++++++++++++---- tools/perf/builtin-kvm.c | 2 +- tools/perf/builtin-probe.c | 2 +- tools/perf/builtin-record.c | 6 ++- tools/perf/command-list.txt | 14 +++--- tools/perf/perf.c | 8 ++-- tools/perf/tests/attr.c | 9 +++- tools/perf/tests/attr.py | 5 ++- tools/perf/tests/attr/base-record | 1 + tools/perf/tests/attr/base-stat | 1 + tools/perf/tests/attr/test-record-C0 | 13 ++++++ tools/perf/tests/attr/test-stat-C0 | 9 ++++ tools/perf/tests/parse-events.c | 2 +- tools/perf/util/debugfs.h | 12 ----- tools/perf/util/evlist.c | 2 +- tools/perf/util/evsel.c | 2 +- tools/perf/util/hist.h | 5 ++- tools/perf/util/parse-events.c | 2 +- tools/perf/util/probe-event.c | 2 +- tools/perf/util/python-ext-sources | 1 - tools/perf/util/setup.py | 3 +- tools/perf/util/trace-event-info.c | 4 +- tools/perf/util/util.c | 27 +++++++++++ tools/perf/util/util.h | 7 ++- tools/scripts/Makefile.include | 6 ++- 
tools/vm/Makefile | 17 +++++-- tools/vm/page-types.c | 85 +++-------------------------------- 33 files changed, 253 insertions(+), 170 deletions(-) create mode 100644 tools/lib/lk/Makefile rename tools/{perf/util => lib/lk}/debugfs.c (68%) create mode 100644 tools/lib/lk/debugfs.h create mode 100644 tools/perf/tests/attr/test-record-C0 create mode 100644 tools/perf/tests/attr/test-stat-C0 delete mode 100644 tools/perf/util/debugfs.h
* [GIT PULL 00/15] perf/core improvements and fixes @ 2011-12-23 21:53 Arnaldo Carvalho de Melo 2011-12-29 20:28 ` Ingo Molnar 0 siblings, 1 reply; 51+ messages in thread From: Arnaldo Carvalho de Melo @ 2011-12-23 21:53 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern, Frederic Weisbecker, Namhyung Kim, Nelson Elhage, Paul Mackerras, Peter Zijlstra, Robert Richter, Stephane Eranian, arnaldo.melo Hi Ingo, Please consider pulling from: git://github.com/acmel/linux.git perf/core Regards, - Arnaldo David Ahern (3): perf tools: Fix comm for processes with named threads perf tools: Look up thread names for system wide profiling perf script: look up thread using tid instead of pid Ingo Molnar (1): perf tools: Fix truncated annotation Namhyung Kim (1): perf report: Fix usage string Nelson Elhage (2): perf: builtin-record: Provide advice if mmap'ing fails with EPERM. perf: builtin-record: Document and check that mmap_pages must be a power of two. Robert Richter (8): perf tools: Improve macros for struct feature_ops perf tools: Continue processing header on unknown features perf tools: Fix out-of-bound access to struct perf_session perf tools: Moving code in some files perf report: Accept fifos as input file perf tools: Unify handling of features when writing feature section perf tools: Use for_each_set_bit() to iterate over feature flags perf script: Add generic perl handler to process events tools/perf/Documentation/perf-annotate.txt | 2 +- tools/perf/Documentation/perf-buildid-list.txt | 2 +- tools/perf/Documentation/perf-evlist.txt | 2 +- tools/perf/Documentation/perf-kmem.txt | 2 +- tools/perf/Documentation/perf-lock.txt | 2 +- tools/perf/Documentation/perf-record.txt | 2 +- tools/perf/Documentation/perf-report.txt | 2 +- tools/perf/Documentation/perf-sched.txt | 2 +- tools/perf/Documentation/perf-script.txt | 2 +- tools/perf/Documentation/perf-timechart.txt | 2 +- tools/perf/builtin-annotate.c | 3 +- 
tools/perf/builtin-buildid-list.c | 53 +- tools/perf/builtin-evlist.c | 2 +- tools/perf/builtin-kmem.c | 2 +- tools/perf/builtin-lock.c | 2 +- tools/perf/builtin-record.c | 19 +- tools/perf/builtin-report.c | 15 +- tools/perf/builtin-sched.c | 2 +- tools/perf/builtin-script.c | 6 +- tools/perf/builtin-timechart.c | 4 +- tools/perf/util/annotate.c | 2 +- tools/perf/util/event.c | 112 +++- tools/perf/util/evlist.c | 2 + tools/perf/util/header.c | 663 +++++++++----------- tools/perf/util/header.h | 6 +- tools/perf/util/include/linux/bitops.h | 118 ++++ .../perf/util/scripting-engines/trace-event-perl.c | 73 ++- tools/perf/util/session.c | 15 +- tools/perf/util/session.h | 2 +- tools/perf/util/util.h | 11 + 30 files changed, 676 insertions(+), 456 deletions(-) -- 1.7.8.rc0.35.gee6df
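Two commits in the series above lend themselves to a short illustration: checking that mmap_pages is a power of two, and walking the set bits of a feature mask. The sketch below is not the perf code. `is_power_of_two` and `collect_set_bits` are hypothetical helpers, and the bit walk handles a single word only, whereas the for_each_set_bit() macro named in the commit title iterates over a multi-word bitmap.

```c
#include <assert.h>
#include <stdbool.h>

/* True when exactly one bit is set, i.e. n is 1, 2, 4, 8, ... */
static bool is_power_of_two(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Record the index of every set bit in 'mask', lowest bit first.
 * Writes at most 'max' indices into 'out' and returns how many were written.
 */
static int collect_set_bits(unsigned long mask, int *out, int max)
{
	int n = 0;

	assert(out && max >= 0);
	for (int bit = 0; mask != 0 && n < max; bit++, mask >>= 1) {
		if (mask & 1UL)
			out[n++] = bit;
	}
	return n;
}
```

A caller validating a page count would reject 96 and accept 128. A caller with feature mask 0x15 would visit bits 0, 2 and 4, skipping the unset ones instead of testing every possible feature by hand.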
* Re: [GIT PULL 00/15] perf/core improvements and fixes
  2011-12-23 21:53 Arnaldo Carvalho de Melo
@ 2011-12-29 20:28 ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2011-12-29 20:28 UTC
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, David Ahern, Frederic Weisbecker, Namhyung Kim,
	Nelson Elhage, Paul Mackerras, Peter Zijlstra, Robert Richter,
	Stephane Eranian, arnaldo.melo

* Arnaldo Carvalho de Melo <acme@infradead.org> wrote:

> Hi Ingo,
>
> 	Please consider pulling from:
>
> git://github.com/acmel/linux.git perf/core
>
> Regards,
>
> - Arnaldo
>
> David Ahern (3):
>       perf tools: Fix comm for processes with named threads
>       perf tools: Look up thread names for system wide profiling
>       perf script: look up thread using tid instead of pid
>
> Ingo Molnar (1):
>       perf tools: Fix truncated annotation
>
> Namhyung Kim (1):
>       perf report: Fix usage string
>
> Nelson Elhage (2):
>       perf: builtin-record: Provide advice if mmap'ing fails with EPERM.
>       perf: builtin-record: Document and check that mmap_pages must be a power of two.
>
> Robert Richter (8):
>       perf tools: Improve macros for struct feature_ops
>       perf tools: Continue processing header on unknown features
>       perf tools: Fix out-of-bound access to struct perf_session
>       perf tools: Moving code in some files
>       perf report: Accept fifos as input file
>       perf tools: Unify handling of features when writing feature section
>       perf tools: Use for_each_set_bit() to iterate over feature flags
>       perf script: Add generic perl handler to process events
>
>  tools/perf/Documentation/perf-annotate.txt         |   2 +-
>  tools/perf/Documentation/perf-buildid-list.txt     |   2 +-
>  tools/perf/Documentation/perf-evlist.txt           |   2 +-
>  tools/perf/Documentation/perf-kmem.txt             |   2 +-
>  tools/perf/Documentation/perf-lock.txt             |   2 +-
>  tools/perf/Documentation/perf-record.txt           |   2 +-
>  tools/perf/Documentation/perf-report.txt           |   2 +-
>  tools/perf/Documentation/perf-sched.txt            |   2 +-
>  tools/perf/Documentation/perf-script.txt           |   2 +-
>  tools/perf/Documentation/perf-timechart.txt        |   2 +-
>  tools/perf/builtin-annotate.c                      |   3 +-
>  tools/perf/builtin-buildid-list.c                  |  53 +-
>  tools/perf/builtin-evlist.c                        |   2 +-
>  tools/perf/builtin-kmem.c                          |   2 +-
>  tools/perf/builtin-lock.c                          |   2 +-
>  tools/perf/builtin-record.c                        |  19 +-
>  tools/perf/builtin-report.c                        |  15 +-
>  tools/perf/builtin-sched.c                         |   2 +-
>  tools/perf/builtin-script.c                        |   6 +-
>  tools/perf/builtin-timechart.c                     |   4 +-
>  tools/perf/util/annotate.c                         |   2 +-
>  tools/perf/util/event.c                            | 112 +++-
>  tools/perf/util/evlist.c                           |   2 +
>  tools/perf/util/header.c                           | 663 +++++++++----------
>  tools/perf/util/header.h                           |   6 +-
>  tools/perf/util/include/linux/bitops.h             | 118 ++++
>  .../perf/util/scripting-engines/trace-event-perl.c |  73 ++-
>  tools/perf/util/session.c                          |  15 +-
>  tools/perf/util/session.h                          |   2 +-
>  tools/perf/util/util.h                             |  11 +
>  30 files changed, 676 insertions(+), 456 deletions(-)

Pulled, thanks a lot Arnaldo!

FYI, i fixed a trivial build failure, in:

  f2328062726d: perf tools: Fix feature-bits rework fallout, remove unused variable

Thanks,

	Ingo