linux-kernel.vger.kernel.org archive mirror
* [RESEND PATCH V3 0/8] TopDown metrics support for Icelake
@ 2019-08-26 14:47 kan.liang
  2019-08-26 14:47 ` [RESEND PATCH V3 1/8] perf/x86/intel: Set correct mask for TOPDOWN.SLOTS kan.liang
                   ` (7 more replies)
  0 siblings, 8 replies; 39+ messages in thread
From: kan.liang @ 2019-08-26 14:47 UTC (permalink / raw)
  To: peterz, acme, mingo, linux-kernel
  Cc: tglx, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Icelake has support for measuring the level 1 TopDown metrics
directly in hardware. This is implemented by an additional METRICS
register, and a new Fixed Counter 3 that measures pipeline SLOTS.

Four TopDown metric events are exposed as separate perf events, which
map to the internal METRICS register: topdown-retiring,
topdown-bad-spec, topdown-fe-bound and topdown-be-bound.
These events do not exist in hardware, but can be allocated by the
scheduler. The value of a TopDown metric event is calculated by
multiplying the METRICS (percentage) register with the SLOTS fixed
counter.
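As a rough sketch of that calculation (illustrative user-level C, not
part of the patches; the helper name is made up, and the 8-bit-per-metric
layout follows the SDM description):

```c
#include <stdint.h>

/*
 * Illustrative sketch (not kernel code): derive a per-metric event
 * value from the raw METRICS register and the SLOTS count. METRICS
 * holds one 8-bit fraction per metric; the four fractions sum to 0xff.
 */
static uint64_t metric_slots(uint64_t metrics, uint64_t slots, int metric_idx)
{
	/* metric_idx 0..3: retiring, bad-spec, fe-bound, be-bound */
	uint64_t frac = (metrics >> (metric_idx * 8)) & 0xff;

	/* slots-in-metric = (fraction / 0xff) * slots */
	return slots * frac / 0xff;
}
```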

New in Icelake
- Do not require generic counters. This allows TopDown to always be
  collected in addition to other events.
- TopDown can be measured per thread/process instead of only per core.

Limitations
- To get accurate results and avoid reading the METRICS register
  multiple times, the TopDown metric events and the SLOTS event have to
  be in the same group.
- The METRICS and SLOTS registers have to be cleared by SW after each
  read. This prevents loss of precision and works around a known side
  effect of the METRICS register.
- Sampling reads of the SLOTS and TopDown metric events are not
  supported.

Please refer to SDM Vol 3, Section 18.3.9.3 "Performance Metrics" for
the details of the TopDown metrics.

Changes since V2:
- Rebase on top of v5.3-rc1

Key changes since V1:
- Remove the variables for reg_idx and the enabled_events[] array.
  The reg_idx can be calculated from idx at runtime.
  Use the existing active_mask to replace enabled_events.
- Choose value 47 for the fixed index of BTS.
- Support the OVF_PERF_METRICS overflow bit in the PMI handler.
- Drop the caching mechanism and related variables.
  The new mechanism updates all active slots/metrics events for the
  first slots/metrics event in a group. For each group read, it
  still reads the slots/perf_metrics MSR only once.
- Disable the PMU for reads of topdown events to avoid the NMI issue.
- Move RDPMC support to a separate patch.
- Use event=0x00,umask=0x1X for the topdown metrics events.
- Drop the patch which adds the REMOVE transaction.
  x86_pmu_stop() can be detected by checking
  (event && !test_bit(event->hw.idx, cpuc->active_mask)),
  which is a good place to save the slots/metrics MSR value.

Andi Kleen (2):
  perf, tools, stat: Support new per thread TopDown metrics
  perf, tools: Add documentation for topdown metrics

Kan Liang (6):
  perf/x86/intel: Set correct mask for TOPDOWN.SLOTS
  perf/x86/intel: Basic support for metrics counters
  perf/x86/intel: Support hardware TopDown metrics
  perf/x86/intel: Support per thread RDPMC TopDown metrics
  perf/x86/intel: Export TopDown events for Icelake
  perf/x86/intel: Disable sampling read slots and topdown

 arch/x86/events/core.c                 |  35 ++-
 arch/x86/events/intel/core.c           | 362 ++++++++++++++++++++++++-
 arch/x86/events/perf_event.h           |  33 +++
 arch/x86/include/asm/msr-index.h       |   3 +
 arch/x86/include/asm/perf_event.h      |  54 +++-
 include/linux/perf_event.h             |   3 +
 tools/perf/Documentation/perf-stat.txt |   9 +-
 tools/perf/Documentation/topdown.txt   | 223 +++++++++++++++
 tools/perf/builtin-stat.c              |  24 ++
 tools/perf/util/stat-shadow.c          |  89 ++++++
 tools/perf/util/stat.c                 |   4 +
 tools/perf/util/stat.h                 |   8 +
 12 files changed, 827 insertions(+), 20 deletions(-)
 create mode 100644 tools/perf/Documentation/topdown.txt

-- 
2.17.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [RESEND PATCH V3 1/8] perf/x86/intel: Set correct mask for TOPDOWN.SLOTS
  2019-08-26 14:47 [RESEND PATCH V3 0/8] TopDown metrics support for Icelake kan.liang
@ 2019-08-26 14:47 ` kan.liang
  2019-08-28  7:48   ` Peter Zijlstra
  2019-08-26 14:47 ` [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters kan.liang
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 39+ messages in thread
From: kan.liang @ 2019-08-26 14:47 UTC (permalink / raw)
  To: peterz, acme, mingo, linux-kernel
  Cc: tglx, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

TOPDOWN.SLOTS (0x0400) is not a generic event. It is only available on
fixed counter 3.

Don't extend its mask to generic counters.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/core.c      | 6 ++++--
 arch/x86/include/asm/perf_event.h | 5 +++++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 9e911a96972b..1b2c37ed49db 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5081,12 +5081,14 @@ __init int intel_pmu_init(void)
 
 	if (x86_pmu.event_constraints) {
 		/*
-		 * event on fixed counter2 (REF_CYCLES) only works on this
+		 * event on fixed counter2 (REF_CYCLES) and
+		 * fixed counter3 (TOPDOWN.SLOTS) only work on this
 		 * counter, so do not extend mask to generic counters
 		 */
 		for_each_event_constraint(c, x86_pmu.event_constraints) {
 			if (c->cmask == FIXED_EVENT_FLAGS
-			    && c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES) {
+			    && c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES
+			    && c->idxmsk64 != INTEL_PMC_MSK_FIXED_SLOTS) {
 				c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
 			}
 			c->idxmsk64 &=
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 1392d5e6e8d6..457d35a75ad3 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -167,6 +167,11 @@ struct x86_pmu_capability {
 #define INTEL_PMC_IDX_FIXED_REF_CYCLES	(INTEL_PMC_IDX_FIXED + 2)
 #define INTEL_PMC_MSK_FIXED_REF_CYCLES	(1ULL << INTEL_PMC_IDX_FIXED_REF_CYCLES)
 
+/* TOPDOWN.SLOTS: */
+#define MSR_ARCH_PERFMON_FIXED_CTR3	0x30c
+#define INTEL_PMC_IDX_FIXED_SLOTS	(INTEL_PMC_IDX_FIXED + 3)
+#define INTEL_PMC_MSK_FIXED_SLOTS	(1ULL << INTEL_PMC_IDX_FIXED_SLOTS)
+
 /*
  * We model BTS tracing as another fixed-mode PMC.
  *
-- 
2.17.1



* [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters
  2019-08-26 14:47 [RESEND PATCH V3 0/8] TopDown metrics support for Icelake kan.liang
  2019-08-26 14:47 ` [RESEND PATCH V3 1/8] perf/x86/intel: Set correct mask for TOPDOWN.SLOTS kan.liang
@ 2019-08-26 14:47 ` kan.liang
  2019-08-28  7:48   ` Peter Zijlstra
                     ` (3 more replies)
  2019-08-26 14:47 ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics kan.liang
                   ` (5 subsequent siblings)
  7 siblings, 4 replies; 39+ messages in thread
From: kan.liang @ 2019-08-26 14:47 UTC (permalink / raw)
  To: peterz, acme, mingo, linux-kernel
  Cc: tglx, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Metrics counters (hardware counters containing multiple metrics)
are modeled as separate registers for each TopDown metric event,
with an extra reg being used to coordinate access to the
underlying register in the scheduler.

Add the basic infrastructure to separate the scheduler register indexes
from the actual hardware register indexes. In most cases the MSR
address is already used correctly, but for code using indexes we need
to calculate the correct underlying register.

The TopDown metric events share fixed counter 3, which only needs to be
enabled/disabled once for all of them.

Move the BTS index to 47, because bit 48 in PERF_GLOBAL_STATUS is now
used to indicate the overflow status of the PERF_METRICS counters.

Naming:
The events which use the Metrics counters are called TopDown metric
events, or metric events, in the code.
The fixed counter 3 event is called the TopDown slots event, or slots
event.
"Topdown events" stands for the metric events plus the slots event in
the code.

Originally-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/core.c            | 16 ++++++++--
 arch/x86/events/intel/core.c      | 28 ++++++++++++------
 arch/x86/events/perf_event.h      | 14 +++++++++
 arch/x86/include/asm/msr-index.h  |  1 +
 arch/x86/include/asm/perf_event.h | 49 +++++++++++++++++++++++++++++--
 5 files changed, 95 insertions(+), 13 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 81b005e4c7d9..54534ff00940 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1033,18 +1033,30 @@ static inline void x86_assign_hw_event(struct perf_event *event,
 				struct cpu_hw_events *cpuc, int i)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	int reg_idx;
 
 	hwc->idx = cpuc->assign[i];
 	hwc->last_cpu = smp_processor_id();
 	hwc->last_tag = ++cpuc->tags[i];
 
+	/*
+	 * Metrics counters use different indexes in the scheduler
+	 * versus the hardware.
+	 *
+	 * Map metrics to fixed counter 3 (which is the base count),
+	 * but the update event callback reads the extra metric register
+	 * and converts to the right metric.
+	 */
+	reg_idx = get_reg_idx(hwc->idx);
+
 	if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) {
 		hwc->config_base = 0;
 		hwc->event_base	= 0;
 	} else if (hwc->idx >= INTEL_PMC_IDX_FIXED) {
 		hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
-		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED);
-		hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30;
+		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
+				  (reg_idx - INTEL_PMC_IDX_FIXED);
+		hwc->event_base_rdpmc = (reg_idx - INTEL_PMC_IDX_FIXED) | 1<<30;
 	} else {
 		hwc->config_base = x86_pmu_config_addr(hwc->idx);
 		hwc->event_base  = x86_pmu_event_addr(hwc->idx);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 1b2c37ed49db..f4d6335a18e2 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2131,7 +2131,7 @@ static inline void intel_pmu_ack_status(u64 ack)
 
 static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
 {
-	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
+	int idx = get_reg_idx(hwc->idx) - INTEL_PMC_IDX_FIXED;
 	u64 ctrl_val, mask;
 
 	mask = 0xfULL << (idx * 4);
@@ -2150,6 +2150,7 @@ static void intel_pmu_disable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	int reg_idx = get_reg_idx(hwc->idx);
 
 	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
 		intel_pmu_disable_bts();
@@ -2157,9 +2158,16 @@ static void intel_pmu_disable_event(struct perf_event *event)
 		return;
 	}
 
-	cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
-	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
-	cpuc->intel_cp_status &= ~(1ull << hwc->idx);
+	/*
+	 * When any other topdown events are still enabled,
+	 * cancel the disabling.
+	 */
+	if (has_other_topdown_event(cpuc->active_mask, hwc->idx))
+		return;
+
+	cpuc->intel_ctrl_guest_mask &= ~(1ull << reg_idx);
+	cpuc->intel_ctrl_host_mask &= ~(1ull << reg_idx);
+	cpuc->intel_cp_status &= ~(1ull << reg_idx);
 
 	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL))
 		intel_pmu_disable_fixed(hwc);
@@ -2193,7 +2201,7 @@ static void intel_pmu_read_event(struct perf_event *event)
 static void intel_pmu_enable_fixed(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
+	int idx = get_reg_idx(hwc->idx) - INTEL_PMC_IDX_FIXED;
 	u64 ctrl_val, mask, bits = 0;
 
 	/*
@@ -2232,6 +2240,7 @@ static void intel_pmu_enable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	int reg_idx = get_reg_idx(hwc->idx);
 
 	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
 		if (!__this_cpu_read(cpu_hw_events.enabled))
@@ -2242,18 +2251,19 @@ static void intel_pmu_enable_event(struct perf_event *event)
 	}
 
 	if (event->attr.exclude_host)
-		cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
+		cpuc->intel_ctrl_guest_mask |= (1ull << reg_idx);
 	if (event->attr.exclude_guest)
-		cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx);
+		cpuc->intel_ctrl_host_mask |= (1ull << reg_idx);
 
 	if (unlikely(event_is_checkpointed(event)))
-		cpuc->intel_cp_status |= (1ull << hwc->idx);
+		cpuc->intel_cp_status |= (1ull << reg_idx);
 
 	if (unlikely(event->attr.precise_ip))
 		intel_pmu_pebs_enable(event);
 
 	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
-		intel_pmu_enable_fixed(event);
+		if (!has_other_topdown_event(cpuc->active_mask, hwc->idx))
+			intel_pmu_enable_fixed(event);
 		return;
 	}
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 8751008fc170..37f17f55ef2d 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -348,6 +348,20 @@ struct cpu_hw_events {
 #define FIXED_EVENT_CONSTRAINT(c, n)	\
 	EVENT_CONSTRAINT(c, (1ULL << (32+n)), FIXED_EVENT_FLAGS)
 
+/*
+ * Special metric counters do not actually exist, but get remapped
+ * to a combination of FxCtr3 + MSR_PERF_METRICS
+ *
+ * This allocates them to a dummy offset for the scheduler.
+ * This does not allow sharing of multiple users of the same
+ * metric without multiplexing, even though the hardware supports that
+ * in principle.
+ */
+
+#define METRIC_EVENT_CONSTRAINT(c, n)					\
+	EVENT_CONSTRAINT(c, (1ULL << (INTEL_PMC_IDX_FIXED_METRIC_BASE+n)), \
+			 FIXED_EVENT_FLAGS)
+
 /*
  * Constraint on the Event code + UMask
  */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 6b4fc2788078..78f3a5ebc1e2 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -793,6 +793,7 @@
 #define MSR_CORE_PERF_FIXED_CTR0	0x00000309
 #define MSR_CORE_PERF_FIXED_CTR1	0x0000030a
 #define MSR_CORE_PERF_FIXED_CTR2	0x0000030b
+#define MSR_CORE_PERF_FIXED_CTR3	0x0000030c
 #define MSR_CORE_PERF_FIXED_CTR_CTRL	0x0000038d
 #define MSR_CORE_PERF_GLOBAL_STATUS	0x0000038e
 #define MSR_CORE_PERF_GLOBAL_CTRL	0x0000038f
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 457d35a75ad3..88a506312a10 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -175,11 +175,56 @@ struct x86_pmu_capability {
 /*
  * We model BTS tracing as another fixed-mode PMC.
  *
- * We choose a value in the middle of the fixed event range, since lower
+ * We choose value 47 for the fixed index of BTS, since lower
  * values are used by actual fixed events and higher values are used
  * to indicate other overflow conditions in the PERF_GLOBAL_STATUS msr.
  */
-#define INTEL_PMC_IDX_FIXED_BTS				(INTEL_PMC_IDX_FIXED + 16)
+#define INTEL_PMC_IDX_FIXED_BTS				(INTEL_PMC_IDX_FIXED + 15)
+
+/*
+ * We model PERF_METRICS as more magic fixed-mode PMCs, one for each metric
+ *
+ * Internally they all map to Fixed Ctr 3 (SLOTS), and allocate PERF_METRICS
+ * as an extra_reg. PERF_METRICS has no own configuration, but we fill in
+ * the configuration of FxCtr3 to enforce that all the shared users of SLOTS
+ * have the same configuration.
+ */
+#define INTEL_PMC_IDX_FIXED_METRIC_BASE		(INTEL_PMC_IDX_FIXED + 16)
+#define INTEL_PMC_IDX_TD_RETIRING		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 0)
+#define INTEL_PMC_IDX_TD_BAD_SPEC		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 1)
+#define INTEL_PMC_IDX_TD_FE_BOUND		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 2)
+#define INTEL_PMC_IDX_TD_BE_BOUND		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 3)
+#define INTEL_PMC_MSK_TOPDOWN			((0xfull << INTEL_PMC_IDX_FIXED_METRIC_BASE) | \
+						 INTEL_PMC_MSK_FIXED_SLOTS)
+
+
+static inline bool is_metric_idx(int idx)
+{
+	return (unsigned)(idx - INTEL_PMC_IDX_FIXED_METRIC_BASE) < 4;
+}
+
+static inline bool is_topdown_idx(int idx)
+{
+	return is_metric_idx(idx) || idx == INTEL_PMC_IDX_FIXED_SLOTS;
+}
+
+static inline int get_reg_idx(int idx)
+{
+	return is_metric_idx(idx) ? INTEL_PMC_IDX_FIXED_SLOTS : idx;
+}
+
+#define INTEL_PMC_CLEAR_TOPDOWN_BIT(bit)	(~(0x1ull << bit) & INTEL_PMC_MSK_TOPDOWN)
+/*
+ * Check if any topdown events are still enabled.
+ *
+ * For non topdown events, always return false.
+ */
+static inline bool has_other_topdown_event(unsigned long *active_mask,
+						 int idx)
+{
+	return is_topdown_idx(idx) &&
+	       (*(u64 *)active_mask & INTEL_PMC_CLEAR_TOPDOWN_BIT(idx));
+}
 
 #define GLOBAL_STATUS_COND_CHG				BIT_ULL(63)
 #define GLOBAL_STATUS_BUFFER_OVF			BIT_ULL(62)
-- 
2.17.1



* [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-26 14:47 [RESEND PATCH V3 0/8] TopDown metrics support for Icelake kan.liang
  2019-08-26 14:47 ` [RESEND PATCH V3 1/8] perf/x86/intel: Set correct mask for TOPDOWN.SLOTS kan.liang
  2019-08-26 14:47 ` [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters kan.liang
@ 2019-08-26 14:47 ` kan.liang
  2019-08-28 15:02   ` Peter Zijlstra
                     ` (2 more replies)
  2019-08-26 14:47 ` [RESEND PATCH V3 4/8] perf/x86/intel: Support per thread RDPMC " kan.liang
                   ` (4 subsequent siblings)
  7 siblings, 3 replies; 39+ messages in thread
From: kan.liang @ 2019-08-26 14:47 UTC (permalink / raw)
  To: peterz, acme, mingo, linux-kernel
  Cc: tglx, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Intro
=====

Icelake has support for measuring the four top level TopDown metrics
directly in hardware. This is implemented by an additional "metrics"
register, and a new Fixed Counter 3 that measures pipeline "slots".

Events
======

We export four metric events as separate perf events, which map to the
internal "metrics" counter register. These events do not exist in
hardware, but can be allocated by the scheduler.

For the event mapping we use a special 0x00 event code, which is
reserved for fake events. The metric events start from umask 0x10.

When setting up such events, they point to the slots counter, and a
special callback, update_topdown_event(), reads the additional metrics
MSR to generate the metrics. The metric is then reported by multiplying
the metric (percentage) with slots.

This multiplication makes it easy to keep a running count, for example
when the slots counter overflows, and makes all the standard tools,
such as perf stat, work. They can compute deltas of the values without
needing to know about percentages. This also simplifies accumulating
the counts of child events, which otherwise would need to know how to
average percent values.

None of the four metric events supports sampling. Since they are
handled specially on event update, a flag, PERF_X86_EVENT_TOPDOWN, is
introduced to indicate this case.

The slots event supports both sampling and counting.
For counting, the flag is also applied.
For sampling, it is handled normally like other events.

Groups
======

To avoid reading the METRICS register multiple times, the metrics and
slots values can only be updated by the first slots/metrics event in a
group. All active slots and metrics events are updated in one pass.
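A minimal user-space sketch of the grouping and raw encodings described
above (illustrative only; attribute fields beyond type/config are
assumptions, and the final perf tool syntax may differ):

```c
#include <linux/perf_event.h>
#include <string.h>

/* Illustrative sketch: build raw attrs for a slots + metrics group.
 * Encodings follow the text above: 0x0400 = SLOTS, and
 * event=0x00,umask=0x10..0x13 -> configs 0x1000..0x1300 for the
 * four metric events. */
static struct perf_event_attr topdown_attr(unsigned long long config)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_RAW;
	attr.config = config;
	return attr;
}

/* The SLOTS event (0x0400) would be opened first as the group leader;
 * each metric event (e.g. 0x1000 for topdown-retiring) would then be
 * opened with group_fd set to the leader, per the grouping rule. */
```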

Reset
======

The PERF_METRICS register and fixed counter 3 have to be reset for each
read, because:
- The 8-bit metrics ratio values lose precision when the measurement
  period gets longer.
- PERF_METRICS may report a wrong value if its delta was less than
  1/255 of SLOTS (fixed counter 3).

Also, for counting, -max_period would be the initial value of SLOTS.
Such a huge initial value would definitely trigger the issue mentioned
above, so force the initial value to 0 for topdown and slots event
counting.
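To make the precision concern concrete (my own arithmetic, not from the
patch): one LSB of an 8-bit metric fraction corresponds to slots/255
slots, so the quantization error grows with the measurement period:

```c
#include <stdint.h>

/* Slots represented by one 1/255 step of the 8-bit metric fraction.
 * Resetting PERF_METRICS and fixed counter 3 after each read keeps
 * the accumulated SLOTS, and hence this error, small. */
static uint64_t metric_lsb_in_slots(uint64_t slots)
{
	return slots / 255;
}
```

At 10^9 accumulated slots, one LSB already spans roughly 3.9 million
slots.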

NMI
======

The METRICS register may overflow, in which case bit 48 of the STATUS
register is set. If so, update all active slots and metrics events.

update_topdown_event() has to read two registers separately. The values
may be modified by an NMI, so the PMU has to be disabled before calling
the function.

RDPMC
======

RDPMC is temporarily disabled. The following patch will enable it.

Originally-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/core.c           |  10 ++
 arch/x86/events/intel/core.c     | 230 ++++++++++++++++++++++++++++++-
 arch/x86/events/perf_event.h     |  17 +++
 arch/x86/include/asm/msr-index.h |   2 +
 4 files changed, 255 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 54534ff00940..1ae23db5c2d7 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -76,6 +76,8 @@ u64 x86_perf_event_update(struct perf_event *event)
 	if (idx == INTEL_PMC_IDX_FIXED_BTS)
 		return 0;
 
+	if (is_topdown_count(event) && x86_pmu.update_topdown_event)
+		return x86_pmu.update_topdown_event(event);
 	/*
 	 * Careful: an NMI might modify the previous event value.
 	 *
@@ -1003,6 +1005,10 @@ static int collect_events(struct cpu_hw_events *cpuc, struct perf_event *leader,
 
 	max_count = x86_pmu.num_counters + x86_pmu.num_counters_fixed;
 
+	/* There are 4 TopDown metrics events. */
+	if (x86_pmu.intel_cap.perf_metrics)
+		max_count += 4;
+
 	/* current number of events already accepted */
 	n = cpuc->n_events;
 
@@ -1184,6 +1190,10 @@ int x86_perf_event_set_period(struct perf_event *event)
 	if (idx == INTEL_PMC_IDX_FIXED_BTS)
 		return 0;
 
+	if (unlikely(is_topdown_count(event)) &&
+	    x86_pmu.set_topdown_event_period)
+		return x86_pmu.set_topdown_event_period(event);
+
 	/*
 	 * If we are way outside a reasonable range then just skip forward:
 	 */
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index f4d6335a18e2..616313d7f3d7 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -247,6 +247,10 @@ static struct event_constraint intel_icl_event_constraints[] = {
 	FIXED_EVENT_CONSTRAINT(0x003c, 1),	/* CPU_CLK_UNHALTED.CORE */
 	FIXED_EVENT_CONSTRAINT(0x0300, 2),	/* CPU_CLK_UNHALTED.REF */
 	FIXED_EVENT_CONSTRAINT(0x0400, 3),	/* SLOTS */
+	METRIC_EVENT_CONSTRAINT(0x1000, 0),	/* Retiring metric */
+	METRIC_EVENT_CONSTRAINT(0x1100, 1),	/* Bad speculation metric */
+	METRIC_EVENT_CONSTRAINT(0x1200, 2),	/* FE bound metric */
+	METRIC_EVENT_CONSTRAINT(0x1300, 3),	/* BE bound metric */
 	INTEL_EVENT_CONSTRAINT_RANGE(0x03, 0x0a, 0xf),
 	INTEL_EVENT_CONSTRAINT_RANGE(0x1f, 0x28, 0xf),
 	INTEL_EVENT_CONSTRAINT(0x32, 0xf),	/* SW_PREFETCH_ACCESS.* */
@@ -267,6 +271,14 @@ static struct extra_reg intel_icl_extra_regs[] __read_mostly = {
 	INTEL_UEVENT_EXTRA_REG(0x01bb, MSR_OFFCORE_RSP_1, 0x3fffff9fffull, RSP_1),
 	INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),
 	INTEL_UEVENT_EXTRA_REG(0x01c6, MSR_PEBS_FRONTEND, 0x7fff17, FE),
+	/*
+	 * The original Fixed Ctr 3 are shared from different metrics
+	 * events. So use the extra reg to enforce the same
+	 * configuration on the original register, but do not actually
+	 * write to it.
+	 */
+	INTEL_UEVENT_EXTRA_REG(0x0400, 0, -1L, TOPDOWN),
+	INTEL_UEVENT_TOPDOWN_EXTRA_REG(0x1000),
 	EVENT_EXTRA_END
 };
 
@@ -2190,10 +2202,163 @@ static void intel_pmu_del_event(struct perf_event *event)
 		intel_pmu_pebs_del(event);
 }
 
+static inline bool is_metric_event(struct perf_event *event)
+{
+	return ((event->attr.config & ARCH_PERFMON_EVENTSEL_EVENT) == 0) &&
+		((event->attr.config & INTEL_ARCH_EVENT_MASK) >= 0x1000)  &&
+		((event->attr.config & INTEL_ARCH_EVENT_MASK) <= 0x1300);
+}
+
+static inline bool is_slots_event(struct perf_event *event)
+{
+	return (event->attr.config & INTEL_ARCH_EVENT_MASK) == 0x0400;
+}
+
+static inline bool is_topdown_event(struct perf_event *event)
+{
+	return is_metric_event(event) || is_slots_event(event);
+}
+
+static bool is_first_topdown_event_in_group(struct perf_event *event)
+{
+	struct perf_event *first = NULL;
+
+	if (is_topdown_event(event->group_leader))
+		first = event->group_leader;
+	else {
+		for_each_sibling_event(first, event->group_leader)
+			if (is_topdown_event(first))
+				break;
+	}
+
+	if (event == first)
+		return true;
+
+	return false;
+}
+
+static int icl_set_topdown_event_period(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	s64 left = local64_read(&hwc->period_left);
+
+	/*
+	 * Clear PERF_METRICS and Fixed counter 3 in initialization.
+	 * After that, both MSRs will be cleared for each read.
+	 * Don't need to clear them again.
+	 */
+	if (left == x86_pmu.max_period) {
+		wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
+		wrmsrl(MSR_PERF_METRICS, 0);
+		local64_set(&hwc->period_left, 0);
+	}
+
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static u64 icl_get_metrics_event_value(u64 metric, u64 slots, int idx)
+{
+	u32 val;
+
+	/*
+	 * The metric is reported as an 8bit integer percentage
+	 * suming up to 0xff.
+	 * slots-in-metric = (Metric / 0xff) * slots
+	 */
+	val = (metric >> ((idx - INTEL_PMC_IDX_FIXED_METRIC_BASE) * 8)) & 0xff;
+	return  mul_u64_u32_div(slots, val, 0xff);
+}
+
+static void __icl_update_topdown_event(struct perf_event *event,
+				       u64 slots, u64 metrics)
+{
+	int idx = event->hw.idx;
+	u64 delta;
+
+	if (is_metric_idx(idx))
+		delta = icl_get_metrics_event_value(metrics, slots, idx);
+	else
+		delta = slots;
+
+	local64_add(delta, &event->count);
+}
+
+/*
+ * Update all active Topdown events.
+ * PMU has to be disabled before calling this function.
+ */
+static u64 icl_update_topdown_event(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct perf_event *other;
+	u64 slots, metrics;
+	int idx;
+
+	/*
+	 * Only need to update all events for the first
+	 * slots/metrics event in a group
+	 */
+	if (event && !is_first_topdown_event_in_group(event))
+		return 0;
+
+	/* read Fixed counter 3 */
+	rdpmcl((3 | 1<<30), slots);
+	if (!slots)
+		return 0;
+
+	/* read PERF_METRICS */
+	rdpmcl((1<<29), metrics);
+
+	for_each_set_bit(idx, cpuc->active_mask, INTEL_PMC_IDX_TD_BE_BOUND + 1) {
+		if (!is_topdown_idx(idx))
+			continue;
+		other = cpuc->events[idx];
+		__icl_update_topdown_event(other, slots, metrics);
+	}
+
+	/*
+	 * Check and update this event, which may have been cleared
+	 * in active_mask e.g. x86_pmu_stop()
+	 */
+	if (event && !test_bit(event->hw.idx, cpuc->active_mask))
+		__icl_update_topdown_event(event, slots, metrics);
+
+	/*
+	 * To avoid the known issues as below, the PERF_METRICS and
+	 * Fixed counter 3 are reset for each read.
+	 * - The 8bit metrics ratio values lose precision when the
+	 *   measurement period gets longer.
+	 * - The PERF_METRICS may report wrong value if its delta was
+	 *   less than 1/255 of Fixed counter 3.
+	 */
+	wrmsrl(MSR_PERF_METRICS, 0);
+	wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
+
+	return slots;
+}
+
+static void intel_pmu_read_topdown_event(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	/* Only need to call update_topdown_event() once for group read. */
+	if ((cpuc->txn_flags & PERF_PMU_TXN_READ) &&
+	    !is_first_topdown_event_in_group(event))
+		return;
+
+	perf_pmu_disable(event->pmu);
+	x86_pmu.update_topdown_event(event);
+	perf_pmu_enable(event->pmu);
+}
+
 static void intel_pmu_read_event(struct perf_event *event)
 {
 	if (event->hw.flags & PERF_X86_EVENT_AUTO_RELOAD)
 		intel_pmu_auto_reload_read(event);
+	else if (is_topdown_count(event) && x86_pmu.update_topdown_event)
+		intel_pmu_read_topdown_event(event);
 	else
 		x86_perf_event_update(event);
 }
@@ -2401,6 +2566,15 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 			intel_pt_interrupt();
 	}
 
+	/*
+	 * Intel Perf metrics
+	 */
+	if (__test_and_clear_bit(48, (unsigned long *)&status)) {
+		handled++;
+		if (x86_pmu.update_topdown_event)
+			x86_pmu.update_topdown_event(NULL);
+	}
+
 	/*
 	 * Checkpointed counters can lead to 'spurious' PMIs because the
 	 * rollback caused by the PMI will have cleared the overflow status
@@ -3312,6 +3486,42 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	if (event->attr.type != PERF_TYPE_RAW)
 		return 0;
 
+	/*
+	 * Config Topdown slots and metric events
+	 *
+	 * The slots event on Fixed Counter 3 can support sampling,
+	 * which will be handled normally in x86_perf_event_update().
+	 *
+	 * The metric events don't support sampling.
+	 *
+	 * For counting, topdown slots and metric events will be
+	 * handled specially for event update.
+	 * A flag PERF_X86_EVENT_TOPDOWN is applied for the case.
+	 */
+	if (x86_pmu.intel_cap.perf_metrics && is_topdown_event(event)) {
+		if (is_metric_event(event) && is_sampling_event(event))
+			return -EINVAL;
+
+		if (!is_sampling_event(event)) {
+			if (event->attr.config1 != 0)
+				return -EINVAL;
+			if (event->attr.config & ARCH_PERFMON_EVENTSEL_ANY)
+				return -EINVAL;
+			/*
+			 * Put configuration (minus event) into config1 so that
+			 * the scheduler enforces through an extra_reg that
+			 * all instances of the metrics events have the same
+			 * configuration.
+			 */
+			event->attr.config1 = event->hw.config &
+					      X86_ALL_EVENT_FLAGS;
+			event->hw.flags |= PERF_X86_EVENT_TOPDOWN;
+
+			if (is_metric_event(event))
+				event->hw.flags &= ~PERF_X86_EVENT_RDPMC_ALLOWED;
+		}
+	}
+
 	if (!(event->attr.config & ARCH_PERFMON_EVENTSEL_ANY))
 		return 0;
 
@@ -5040,6 +5250,8 @@ __init int intel_pmu_init(void)
 		x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xca, .umask=0x02);
 		x86_pmu.lbr_pt_coexist = true;
 		intel_pmu_pebs_data_source_skl(pmem);
+		x86_pmu.update_topdown_event = icl_update_topdown_event;
+		x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
 		pr_cont("Icelake events, ");
 		name = "icelake";
 		break;
@@ -5096,10 +5308,17 @@ __init int intel_pmu_init(void)
 		 * counter, so do not extend mask to generic counters
 		 */
 		for_each_event_constraint(c, x86_pmu.event_constraints) {
-			if (c->cmask == FIXED_EVENT_FLAGS
-			    && c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES
-			    && c->idxmsk64 != INTEL_PMC_MSK_FIXED_SLOTS) {
-				c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
+			if (c->cmask == FIXED_EVENT_FLAGS) {
+				/*
+				 * Don't extend topdown slots and metrics
+				 * events to generic counters.
+				 */
+				if (c->idxmsk64 & INTEL_PMC_MSK_TOPDOWN) {
+					c->weight = hweight64(c->idxmsk64);
+					continue;
+				}
+				if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
+					c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
 			}
 			c->idxmsk64 &=
 				~(~0ULL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
@@ -5152,6 +5371,9 @@ __init int intel_pmu_init(void)
 	if (x86_pmu.counter_freezing)
 		x86_pmu.handle_irq = intel_pmu_handle_irq_v4;
 
+	if (x86_pmu.intel_cap.perf_metrics)
+		x86_pmu.intel_ctrl |= 1ULL << 48;
+
 	return 0;
 }
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 37f17f55ef2d..7c59f08fadc0 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -40,6 +40,7 @@ enum extra_reg_type {
 	EXTRA_REG_LBR   = 2,	/* lbr_select */
 	EXTRA_REG_LDLAT = 3,	/* ld_lat_threshold */
 	EXTRA_REG_FE    = 4,    /* fe_* */
+	EXTRA_REG_TOPDOWN = 5,	/* Topdown slots/metrics */
 
 	EXTRA_REG_MAX		/* number of entries needed */
 };
@@ -76,6 +77,12 @@ static inline bool constraint_match(struct event_constraint *c, u64 ecode)
 #define PERF_X86_EVENT_EXCL_ACCT	0x0100 /* accounted EXCL event */
 #define PERF_X86_EVENT_AUTO_RELOAD	0x0200 /* use PEBS auto-reload */
 #define PERF_X86_EVENT_LARGE_PEBS	0x0400 /* use large PEBS */
+#define PERF_X86_EVENT_TOPDOWN		0x0800 /* Count Topdown slots/metrics events */
+
+static inline bool is_topdown_count(struct perf_event *event)
+{
+	return event->hw.flags & PERF_X86_EVENT_TOPDOWN;
+}
 
 struct amd_nb {
 	int nb_id;  /* NorthBridge id */
@@ -509,6 +516,9 @@ struct extra_reg {
 			       0xffff, \
 			       LDLAT)
 
+#define INTEL_UEVENT_TOPDOWN_EXTRA_REG(event)	\
+	EVENT_EXTRA_REG(event, 0, 0xfcff, -1L, TOPDOWN)
+
 #define EVENT_EXTRA_END EVENT_EXTRA_REG(0, 0, 0, 0, RSP_0)
 
 union perf_capabilities {
@@ -524,6 +534,7 @@ union perf_capabilities {
 		 */
 		u64	full_width_write:1;
 		u64     pebs_baseline:1;
+		u64	perf_metrics:1;
 	};
 	u64	capabilities;
 };
@@ -686,6 +697,12 @@ struct x86_pmu {
 	 */
 	atomic_t	lbr_exclusive[x86_lbr_exclusive_max];
 
+	/*
+	 * Intel perf metrics
+	 */
+	u64		(*update_topdown_event)(struct perf_event *event);
+	int		(*set_topdown_event_period)(struct perf_event *event);
+
 	/*
 	 * AMD bits
 	 */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 78f3a5ebc1e2..460a419a7214 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -118,6 +118,8 @@
 #define MSR_TURBO_RATIO_LIMIT1		0x000001ae
 #define MSR_TURBO_RATIO_LIMIT2		0x000001af
 
+#define MSR_PERF_METRICS		0x00000329
+
 #define MSR_LBR_SELECT			0x000001c8
 #define MSR_LBR_TOS			0x000001c9
 #define MSR_LBR_NHM_FROM		0x00000680
-- 
2.17.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [RESEND PATCH V3 4/8] perf/x86/intel: Support per thread RDPMC TopDown metrics
  2019-08-26 14:47 [RESEND PATCH V3 0/8] TopDown metrics support for Icelake kan.liang
                   ` (2 preceding siblings ...)
  2019-08-26 14:47 ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics kan.liang
@ 2019-08-26 14:47 ` kan.liang
  2019-08-26 14:47 ` [RESEND PATCH V3 5/8] perf/x86/intel: Export TopDown events for Icelake kan.liang
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 39+ messages in thread
From: kan.liang @ 2019-08-26 14:47 UTC (permalink / raw)
  To: peterz, acme, mingo, linux-kernel
  Cc: tglx, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

With Icelake CPUs, the TopDown metrics are directly available as fixed
counters and do not require generic counters, which makes it possible to
measure TopDown per thread/process instead of only per core.

The metrics and slots values have to be saved/restored during context
switching.
The saved values are also used as previous values to calculate the
delta.

The PERF_METRICS MSR value will be returned when RDPMC reads a metrics
event.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/core.c       |   5 +-
 arch/x86/events/intel/core.c | 102 ++++++++++++++++++++++++++++-------
 include/linux/perf_event.h   |   3 ++
 3 files changed, 91 insertions(+), 19 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 1ae23db5c2d7..77573fa379dc 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2150,7 +2150,10 @@ static int x86_pmu_event_idx(struct perf_event *event)
 	if (!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED))
 		return 0;
 
-	if (x86_pmu.num_counters_fixed && idx >= INTEL_PMC_IDX_FIXED) {
+	/* Return PERF_METRICS MSR value for metrics event */
+	if (is_metric_idx(idx))
+		idx = 1 << 29;
+	else if (x86_pmu.num_counters_fixed && idx >= INTEL_PMC_IDX_FIXED) {
 		idx -= INTEL_PMC_IDX_FIXED;
 		idx |= 1 << 30;
 	}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 616313d7f3d7..9dc24cebb68c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2253,6 +2253,11 @@ static int icl_set_topdown_event_period(struct perf_event *event)
 		local64_set(&hwc->period_left, 0);
 	}
 
+	if ((hwc->saved_slots) && is_first_topdown_event_in_group(event)) {
+		wrmsrl(MSR_CORE_PERF_FIXED_CTR3, hwc->saved_slots);
+		wrmsrl(MSR_PERF_METRICS, hwc->saved_metric);
+	}
+
 	perf_event_update_userpage(event);
 
 	return 0;
@@ -2271,7 +2276,7 @@ static u64 icl_get_metrics_event_value(u64 metric, u64 slots, int idx)
 	return  mul_u64_u32_div(slots, val, 0xff);
 }
 
-static void __icl_update_topdown_event(struct perf_event *event,
+static u64 icl_get_topdown_value(struct perf_event *event,
 				       u64 slots, u64 metrics)
 {
 	int idx = event->hw.idx;
@@ -2282,7 +2287,50 @@ static void __icl_update_topdown_event(struct perf_event *event,
 	else
 		delta = slots;
 
-	local64_add(delta, &event->count);
+	return delta;
+}
+
+static void __icl_update_topdown_event(struct perf_event *event,
+				       u64 slots, u64 metrics,
+				       u64 last_slots, u64 last_metrics)
+{
+	u64 delta, last = 0;
+
+	delta = icl_get_topdown_value(event, slots, metrics);
+	if (last_slots)
+		last = icl_get_topdown_value(event, last_slots, last_metrics);
+
+	/*
+	 * The 8-bit integer percentage of a metric may not be accurate,
+	 * especially when the change is very small.
+	 * For example, if only a few bad_spec events happen, the percentage
+	 * may be reduced from 1% to 0%. If so, the bad_spec event value
+	 * will be 0, which is definitely less than the last value.
+	 * Avoid updating event->count in this case.
+	 */
+	if (delta > last) {
+		delta -= last;
+		local64_add(delta, &event->count);
+	}
+}
+
+static void update_saved_topdown_regs(struct perf_event *event,
+				      u64 slots, u64 metrics)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct perf_event *other;
+	int idx;
+
+	event->hw.saved_slots = slots;
+	event->hw.saved_metric = metrics;
+
+	for_each_set_bit(idx, cpuc->active_mask, INTEL_PMC_IDX_TD_BE_BOUND + 1) {
+		if (!is_topdown_idx(idx))
+			continue;
+		other = cpuc->events[idx];
+		other->hw.saved_slots = slots;
+		other->hw.saved_metric = metrics;
+	}
 }
 
 /*
@@ -2294,6 +2342,7 @@ static u64 icl_update_topdown_event(struct perf_event *event)
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_event *other;
 	u64 slots, metrics;
+	bool reset = true;
 	int idx;
 
 	/*
@@ -2315,26 +2364,46 @@ static u64 icl_update_topdown_event(struct perf_event *event)
 		if (!is_topdown_idx(idx))
 			continue;
 		other = cpuc->events[idx];
-		__icl_update_topdown_event(other, slots, metrics);
+		__icl_update_topdown_event(other, slots, metrics,
+					   event ? event->hw.saved_slots : 0,
+					   event ? event->hw.saved_metric : 0);
 	}
 
 	/*
 	 * Check and update this event, which may have been cleared
 	 * in active_mask e.g. x86_pmu_stop()
 	 */
-	if (event && !test_bit(event->hw.idx, cpuc->active_mask))
-		__icl_update_topdown_event(event, slots, metrics);
+	if (event && !test_bit(event->hw.idx, cpuc->active_mask)) {
+		__icl_update_topdown_event(event, slots, metrics,
+					   event->hw.saved_slots,
+					   event->hw.saved_metric);
 
-	/*
-	 * To avoid the known issues as below, the PERF_METRICS and
-	 * Fixed counter 3 are reset for each read.
-	 * - The 8bit metrics ratio values lose precision when the
-	 *   measurement period gets longer.
-	 * - The PERF_METRICS may report wrong value if its delta was
-	 *   less than 1/255 of Fixed counter 3.
-	 */
-	wrmsrl(MSR_PERF_METRICS, 0);
-	wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
+		/*
+		 * In x86_pmu_stop(), the event is cleared in active_mask
+		 * first, then the delta is drained, which indicates a
+		 * context switch for counting.
+		 * Save the metric and slots values for the context switch.
+		 * There is no need to reset PERF_METRICS and Fixed counter 3,
+		 * because the values will be restored on the next schedule in.
+		 */
+		update_saved_topdown_regs(event, slots, metrics);
+		reset = false;
+	}
+
+	if (reset) {
+		/*
+		 * To avoid the known issues as below, the PERF_METRICS and
+		 * Fixed counter 3 are reset for each read.
+		 * - The 8bit metrics ratio values lose precision when the
+		 *   measurement period gets longer.
+		 * - The PERF_METRICS may report wrong value if its delta was
+		 *   less than 1/255 of Fixed counter 3.
+		 */
+		wrmsrl(MSR_PERF_METRICS, 0);
+		wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
+		if (event)
+			update_saved_topdown_regs(event, 0, 0);
+	}
 
 	return slots;
 }
@@ -3516,9 +3585,6 @@ static int intel_pmu_hw_config(struct perf_event *event)
 			event->attr.config1 = event->hw.config &
 					      X86_ALL_EVENT_FLAGS;
 			event->hw.flags |= PERF_X86_EVENT_TOPDOWN;
-
-			if (is_metric_event(event))
-				event->hw.flags &= ~PERF_X86_EVENT_RDPMC_ALLOWED;
 		}
 	}
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e8ad3c590a23..f1cd57a5fa55 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -133,6 +133,9 @@ struct hw_perf_event {
 
 			struct hw_perf_event_extra extra_reg;
 			struct hw_perf_event_extra branch_reg;
+
+			u64		saved_slots;
+			u64		saved_metric;
 		};
 		struct { /* software */
 			struct hrtimer	hrtimer;
-- 
2.17.1



* [RESEND PATCH V3 5/8] perf/x86/intel: Export TopDown events for Icelake
  2019-08-26 14:47 [RESEND PATCH V3 0/8] TopDown metrics support for Icelake kan.liang
                   ` (3 preceding siblings ...)
  2019-08-26 14:47 ` [RESEND PATCH V3 4/8] perf/x86/intel: Support per thread RDPMC " kan.liang
@ 2019-08-26 14:47 ` kan.liang
  2019-08-26 14:47 ` [RESEND PATCH V3 6/8] perf/x86/intel: Disable sampling read slots and topdown kan.liang
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 39+ messages in thread
From: kan.liang @ 2019-08-26 14:47 UTC (permalink / raw)
  To: peterz, acme, mingo, linux-kernel
  Cc: tglx, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Export the new TopDown metrics events, which map to the sub-metrics in
the METRICS register, plus an event for the new slots fixed counter.
This makes the new fixed counters in Icelake visible to the perf
user tools.

Originally-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/core.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 9dc24cebb68c..7925cf8830eb 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -321,6 +321,12 @@ EVENT_ATTR_STR_HT(topdown-recovery-bubbles, td_recovery_bubbles,
 EVENT_ATTR_STR_HT(topdown-recovery-bubbles.scale, td_recovery_bubbles_scale,
 	"4", "2");
 
+EVENT_ATTR_STR(slots,			slots,		"event=0x00,umask=0x4");
+EVENT_ATTR_STR(topdown-retiring,	td_retiring,	"event=0x00,umask=0x10");
+EVENT_ATTR_STR(topdown-bad-spec,	td_bad_spec,	"event=0x00,umask=0x11");
+EVENT_ATTR_STR(topdown-fe-bound,	td_fe_bound,	"event=0x00,umask=0x12");
+EVENT_ATTR_STR(topdown-be-bound,	td_be_bound,	"event=0x00,umask=0x13");
+
 static struct attribute *snb_events_attrs[] = {
 	EVENT_PTR(td_slots_issued),
 	EVENT_PTR(td_slots_retired),
@@ -4544,6 +4550,15 @@ static struct attribute *icl_events_attrs[] = {
 	NULL,
 };
 
+static struct attribute *icl_td_events_attrs[] = {
+	EVENT_PTR(slots),
+	EVENT_PTR(td_retiring),
+	EVENT_PTR(td_bad_spec),
+	EVENT_PTR(td_fe_bound),
+	EVENT_PTR(td_be_bound),
+	NULL,
+};
+
 static struct attribute *icl_tsx_events_attrs[] = {
 	EVENT_PTR(tx_start),
 	EVENT_PTR(tx_abort),
@@ -5312,6 +5327,7 @@ __init int intel_pmu_init(void)
 			hsw_format_attr : nhm_format_attr;
 		extra_skl_attr = skl_format_attr;
 		mem_attr = icl_events_attrs;
+		td_attr = icl_td_events_attrs;
 		tsx_attr = icl_tsx_events_attrs;
 		x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xca, .umask=0x02);
 		x86_pmu.lbr_pt_coexist = true;
-- 
2.17.1



* [RESEND PATCH V3 6/8] perf/x86/intel: Disable sampling read slots and topdown
  2019-08-26 14:47 [RESEND PATCH V3 0/8] TopDown metrics support for Icelake kan.liang
                   ` (4 preceding siblings ...)
  2019-08-26 14:47 ` [RESEND PATCH V3 5/8] perf/x86/intel: Export TopDown events for Icelake kan.liang
@ 2019-08-26 14:47 ` kan.liang
  2019-08-26 14:47 ` [RESEND PATCH V3 7/8] perf, tools, stat: Support new per thread TopDown metrics kan.liang
  2019-08-26 14:47 ` [RESEND PATCH V3 8/8] perf, tools: Add documentation for topdown metrics kan.liang
  7 siblings, 0 replies; 39+ messages in thread
From: kan.liang @ 2019-08-26 14:47 UTC (permalink / raw)
  To: peterz, acme, mingo, linux-kernel
  Cc: tglx, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The slots event supports sampling. Users may do a sampling read of the
slots and metrics events together, e.g. perf record -e
'{slots, topdown-retiring}:S'. But reading a metrics event resets fixed
counter 3, which corrupts the sampling of the slots event.

Add specific validate_group() support to reject this case and error out
for Icelake.

An alternative fix would be to unconditionally disable SLOTS sampling,
but that is too coarse: users may want to sample only the slots event,
without any topdown metrics events.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/core.c       |  4 ++++
 arch/x86/events/intel/core.c | 20 ++++++++++++++++++++
 arch/x86/events/perf_event.h |  2 ++
 3 files changed, 26 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 77573fa379dc..7fb98890d885 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2057,7 +2057,11 @@ static int validate_group(struct perf_event *event)
 
 	fake_cpuc->n_events = 0;
 	ret = x86_pmu.schedule_events(fake_cpuc, n, NULL);
+	if (ret)
+		goto out;
 
+	if (x86_pmu.validate_group)
+		ret = x86_pmu.validate_group(fake_cpuc, n);
 out:
 	free_fake_cpuc(fake_cpuc);
 	return ret;
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 7925cf8830eb..4e0d60fcd1eb 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4489,6 +4489,25 @@ static __init void intel_ht_bug(void)
 	x86_pmu.stop_scheduling = intel_stop_scheduling;
 }
 
+static int icl_validate_group(struct cpu_hw_events *cpuc, int n)
+{
+	bool has_sampling_slots = false, has_metrics = false;
+	struct perf_event *e;
+	int i;
+
+	for (i = 0; i < n; i++) {
+		e = cpuc->event_list[i];
+		if (is_slots_event(e) && is_sampling_event(e))
+			has_sampling_slots = true;
+
+		if (is_metric_event(e))
+			has_metrics = true;
+	}
+	if (unlikely(has_sampling_slots && has_metrics))
+		return -EINVAL;
+	return 0;
+}
+
 EVENT_ATTR_STR(mem-loads,	mem_ld_hsw,	"event=0xcd,umask=0x1,ldlat=3");
 EVENT_ATTR_STR(mem-stores,	mem_st_hsw,	"event=0xd0,umask=0x82")
 
@@ -5334,6 +5353,7 @@ __init int intel_pmu_init(void)
 		intel_pmu_pebs_data_source_skl(pmem);
 		x86_pmu.update_topdown_event = icl_update_topdown_event;
 		x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
+		x86_pmu.validate_group = icl_validate_group;
 		pr_cont("Icelake events, ");
 		name = "icelake";
 		break;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 7c59f08fadc0..34c6ab5cafe8 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -629,6 +629,8 @@ struct x86_pmu {
 	int		perfctr_second_write;
 	u64		(*limit_period)(struct perf_event *event, u64 l);
 
+	int		(*validate_group)(struct cpu_hw_events *cpuc, int n);
+
 	/* PMI handler bits */
 	unsigned int	late_ack		:1,
 			counter_freezing	:1;
-- 
2.17.1



* [RESEND PATCH V3 7/8] perf, tools, stat: Support new per thread TopDown metrics
  2019-08-26 14:47 [RESEND PATCH V3 0/8] TopDown metrics support for Icelake kan.liang
                   ` (5 preceding siblings ...)
  2019-08-26 14:47 ` [RESEND PATCH V3 6/8] perf/x86/intel: Disable sampling read slots and topdown kan.liang
@ 2019-08-26 14:47 ` kan.liang
  2019-08-26 14:47 ` [RESEND PATCH V3 8/8] perf, tools: Add documentation for topdown metrics kan.liang
  7 siblings, 0 replies; 39+ messages in thread
From: kan.liang @ 2019-08-26 14:47 UTC (permalink / raw)
  To: peterz, acme, mingo, linux-kernel
  Cc: tglx, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Andi Kleen <ak@linux.intel.com>

Icelake has support for reporting per thread TopDown metrics.
These are reported differently from the previous TopDown support:
each metric is standalone, but scaled to pipeline "slots".
We don't need to do anything special for HyperThreading anymore.
Teach perf stat --topdown to handle these new metrics and
print them in the same way as the previous TopDown metrics.
The restriction of only being able to report information per core
is gone.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt |  9 ++-
 tools/perf/builtin-stat.c              | 24 +++++++
 tools/perf/util/stat-shadow.c          | 89 ++++++++++++++++++++++++++
 tools/perf/util/stat.c                 |  4 ++
 tools/perf/util/stat.h                 |  8 +++
 5 files changed, 132 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 930c51c01201..f90101637c76 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -279,8 +279,13 @@ if the workload is actually bound by the CPU and not by something else.
 For best results it is usually a good idea to use it with interval
 mode like -I 1000, as the bottleneck of workloads can change often.
 
-The top down metrics are collected per core instead of per
-CPU thread. Per core mode is automatically enabled
+This enables --metric-only, unless overridden with --no-metric-only.
+
+The following restrictions only apply to older Intel CPUs and Atom,
+on newer CPUs (IceLake and later) TopDown can be collected for any thread:
+
+The top down metrics are collected per core instead of per CPU thread.
+Per core mode is automatically enabled
 and -a (global monitoring) is needed, requiring root rights or
 perf.perf_event_paranoid=-1.
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index b55a534b4de0..69371eea51fc 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -122,6 +122,14 @@ static const char * topdown_attrs[] = {
 	NULL,
 };
 
+static const char *topdown_metric_attrs[] = {
+	"topdown-retiring",
+	"topdown-bad-spec",
+	"topdown-fe-bound",
+	"topdown-be-bound",
+	NULL,
+};
+
 static const char *smi_cost_attrs = {
 	"{"
 	"msr/aperf/,"
@@ -1323,6 +1331,21 @@ static int add_default_attributes(void)
 		char *str = NULL;
 		bool warn = false;
 
+		if (topdown_filter_events(topdown_metric_attrs, &str, 1) < 0) {
+			pr_err("Out of memory\n");
+			return -1;
+		}
+		if (topdown_metric_attrs[0] && str) {
+			if (!stat_config.interval) {
+				fprintf(stat_config.output,
+					"Topdown accuracy may decrease when measuring long periods.\n"
+					"Please print the result regularly, e.g. -I1000\n");
+			}
+			goto setup_metrics;
+		}
+
+		str = NULL;
+
 		if (stat_config.aggr_mode != AGGR_GLOBAL &&
 		    stat_config.aggr_mode != AGGR_CORE) {
 			pr_err("top down event configuration requires --per-core mode\n");
@@ -1344,6 +1367,7 @@ static int add_default_attributes(void)
 		if (topdown_attrs[0] && str) {
 			if (warn)
 				arch_topdown_group_warn();
+setup_metrics:
 			err = parse_events(evsel_list, str, &errinfo);
 			if (err) {
 				fprintf(stderr,
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 656065af4971..f3fe0686e8d5 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -240,6 +240,18 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
 	else if (perf_stat_evsel__is(counter, TOPDOWN_RECOVERY_BUBBLES))
 		update_runtime_stat(st, STAT_TOPDOWN_RECOVERY_BUBBLES,
 				    ctx, cpu, count);
+	else if (perf_stat_evsel__is(counter, TOPDOWN_RETIRING))
+		update_runtime_stat(st, STAT_TOPDOWN_RETIRING,
+				    ctx, cpu, count);
+	else if (perf_stat_evsel__is(counter, TOPDOWN_BAD_SPEC))
+		update_runtime_stat(st, STAT_TOPDOWN_BAD_SPEC,
+				    ctx, cpu, count);
+	else if (perf_stat_evsel__is(counter, TOPDOWN_FE_BOUND))
+		update_runtime_stat(st, STAT_TOPDOWN_FE_BOUND,
+				    ctx, cpu, count);
+	else if (perf_stat_evsel__is(counter, TOPDOWN_BE_BOUND))
+		update_runtime_stat(st, STAT_TOPDOWN_BE_BOUND,
+				    ctx, cpu, count);
 	else if (perf_evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
 		update_runtime_stat(st, STAT_STALLED_CYCLES_FRONT,
 				    ctx, cpu, count);
@@ -685,6 +697,47 @@ static double td_be_bound(int ctx, int cpu, struct runtime_stat *st)
 	return sanitize_val(1.0 - sum);
 }
 
+/*
+ * Kernel reports metrics multiplied with slots. To get back
+ * the ratios we need to recreate the sum.
+ */
+
+static double td_metric_ratio(int ctx, int cpu,
+			      enum stat_type type,
+			      struct runtime_stat *stat)
+{
+	double sum = runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, ctx, cpu) +
+		runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, ctx, cpu) +
+		runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, ctx, cpu) +
+		runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, ctx, cpu);
+	double d = runtime_stat_avg(stat, type, ctx, cpu);
+
+	if (sum)
+		return d / sum;
+	return 0;
+}
+
+/*
+ * ... but only if most of the values are actually available.
+ * We allow two missing.
+ */
+
+static bool full_td(int ctx, int cpu,
+		    struct runtime_stat *stat)
+{
+	int c = 0;
+
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_RETIRING, ctx, cpu) > 0)
+		c++;
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_BE_BOUND, ctx, cpu) > 0)
+		c++;
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_FE_BOUND, ctx, cpu) > 0)
+		c++;
+	if (runtime_stat_avg(stat, STAT_TOPDOWN_BAD_SPEC, ctx, cpu) > 0)
+		c++;
+	return c >= 2;
+}
+
 static void print_smi_cost(struct perf_stat_config *config,
 			   int cpu, struct perf_evsel *evsel,
 			   struct perf_stat_output_ctx *out,
@@ -989,6 +1042,42 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					be_bound * 100.);
 		else
 			print_metric(config, ctxp, NULL, NULL, name, 0);
+	} else if (perf_stat_evsel__is(evsel, TOPDOWN_RETIRING) &&
+			full_td(ctx, cpu, st)) {
+		double retiring = td_metric_ratio(ctx, cpu,
+						  STAT_TOPDOWN_RETIRING, st);
+
+		if (retiring > 0.7)
+			color = PERF_COLOR_GREEN;
+		print_metric(config, ctxp, color, "%8.1f%%", "retiring",
+				retiring * 100.);
+	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FE_BOUND) &&
+			full_td(ctx, cpu, st)) {
+		double fe_bound = td_metric_ratio(ctx, cpu,
+						  STAT_TOPDOWN_FE_BOUND, st);
+
+		if (fe_bound > 0.2)
+			color = PERF_COLOR_RED;
+		print_metric(config, ctxp, color, "%8.1f%%", "frontend bound",
+				fe_bound * 100.);
+	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BE_BOUND) &&
+			full_td(ctx, cpu, st)) {
+		double be_bound = td_metric_ratio(ctx, cpu,
+						  STAT_TOPDOWN_BE_BOUND, st);
+
+		if (be_bound > 0.2)
+			color = PERF_COLOR_RED;
+		print_metric(config, ctxp, color, "%8.1f%%", "backend bound",
+				be_bound * 100.);
+	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BAD_SPEC) &&
+			full_td(ctx, cpu, st)) {
+		double bad_spec = td_metric_ratio(ctx, cpu,
+						  STAT_TOPDOWN_BAD_SPEC, st);
+
+		if (bad_spec > 0.1)
+			color = PERF_COLOR_RED;
+		print_metric(config, ctxp, color, "%8.1f%%", "bad speculation",
+				bad_spec * 100.);
 	} else if (evsel->metric_expr) {
 		generic_metric(config, evsel->metric_expr, evsel->metric_events, evsel->name,
 				evsel->metric_name, avg, cpu, out, st);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index db8a6cf336be..f376e129616d 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -88,6 +88,10 @@ static const char *id_str[PERF_STAT_EVSEL_ID__MAX] = {
 	ID(TOPDOWN_SLOTS_RETIRED, topdown-slots-retired),
 	ID(TOPDOWN_FETCH_BUBBLES, topdown-fetch-bubbles),
 	ID(TOPDOWN_RECOVERY_BUBBLES, topdown-recovery-bubbles),
+	ID(TOPDOWN_RETIRING, topdown-retiring),
+	ID(TOPDOWN_BAD_SPEC, topdown-bad-spec),
+	ID(TOPDOWN_FE_BOUND, topdown-fe-bound),
+	ID(TOPDOWN_BE_BOUND, topdown-be-bound),
 	ID(SMI_NUM, msr/smi/),
 	ID(APERF, msr/aperf/),
 };
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 7032dd1eeac2..80b1d35a89d1 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -29,6 +29,10 @@ enum perf_stat_evsel_id {
 	PERF_STAT_EVSEL_ID__TOPDOWN_SLOTS_RETIRED,
 	PERF_STAT_EVSEL_ID__TOPDOWN_FETCH_BUBBLES,
 	PERF_STAT_EVSEL_ID__TOPDOWN_RECOVERY_BUBBLES,
+	PERF_STAT_EVSEL_ID__TOPDOWN_RETIRING,
+	PERF_STAT_EVSEL_ID__TOPDOWN_BAD_SPEC,
+	PERF_STAT_EVSEL_ID__TOPDOWN_FE_BOUND,
+	PERF_STAT_EVSEL_ID__TOPDOWN_BE_BOUND,
 	PERF_STAT_EVSEL_ID__SMI_NUM,
 	PERF_STAT_EVSEL_ID__APERF,
 	PERF_STAT_EVSEL_ID__MAX,
@@ -82,6 +86,10 @@ enum stat_type {
 	STAT_TOPDOWN_SLOTS_RETIRED,
 	STAT_TOPDOWN_FETCH_BUBBLES,
 	STAT_TOPDOWN_RECOVERY_BUBBLES,
+	STAT_TOPDOWN_RETIRING,
+	STAT_TOPDOWN_BAD_SPEC,
+	STAT_TOPDOWN_FE_BOUND,
+	STAT_TOPDOWN_BE_BOUND,
 	STAT_SMI_NUM,
 	STAT_APERF,
 	STAT_MAX
-- 
2.17.1



* [RESEND PATCH V3 8/8] perf, tools: Add documentation for topdown metrics
  2019-08-26 14:47 [RESEND PATCH V3 0/8] TopDown metrics support for Icelake kan.liang
                   ` (6 preceding siblings ...)
  2019-08-26 14:47 ` [RESEND PATCH V3 7/8] perf, tools, stat: Support new per thread TopDown metrics kan.liang
@ 2019-08-26 14:47 ` kan.liang
  7 siblings, 0 replies; 39+ messages in thread
From: kan.liang @ 2019-08-26 14:47 UTC (permalink / raw)
  To: peterz, acme, mingo, linux-kernel
  Cc: tglx, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Andi Kleen <ak@linux.intel.com>

Add some documentation on how to use the topdown metrics in ring 3.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 tools/perf/Documentation/topdown.txt | 223 +++++++++++++++++++++++++++
 1 file changed, 223 insertions(+)
 create mode 100644 tools/perf/Documentation/topdown.txt

diff --git a/tools/perf/Documentation/topdown.txt b/tools/perf/Documentation/topdown.txt
new file mode 100644
index 000000000000..e82a74fa9243
--- /dev/null
+++ b/tools/perf/Documentation/topdown.txt
@@ -0,0 +1,223 @@
+Using TopDown metrics in user space
+-----------------------------------
+
+Intel CPUs (since Sandy Bridge and Silvermont) support a TopDown
+methodology to break down CPU pipeline execution into 4 bottlenecks:
+frontend bound, backend bound, bad speculation, retiring.
+
+For more details on Topdown see [1][5]
+
+Traditionally this was implemented by events in generic counters
+and specific formulas to compute the bottlenecks.
+
+perf stat --topdown implements this.
+
+% perf stat -a --topdown -I1000
+#           time             counts unit events
+     1.000373951      8,460,978,609      topdown-retiring          #     22.9% retiring
+     1.000373951      3,445,383,303      topdown-bad-spec          #      9.3% bad speculation
+     1.000373951     15,886,483,355      topdown-fe-bound          #     43.0% frontend bound
+     1.000373951      9,163,488,720      topdown-be-bound          #     24.8% backend bound
+     2.000782154      8,477,925,431      topdown-retiring          #     22.9% retiring
+     2.000782154      3,459,151,256      topdown-bad-spec          #      9.3% bad speculation
+     2.000782154     15,947,224,725      topdown-fe-bound          #     43.0% frontend bound
+     2.000782154      9,145,551,695      topdown-be-bound          #     24.7% backend bound
+     3.001155967      8,487,323,125      topdown-retiring          #     22.9% retiring
+     3.001155967      3,451,808,066      topdown-bad-spec          #      9.3% bad speculation
+     3.001155967     15,959,068,902      topdown-fe-bound          #     43.0% frontend bound
+     3.001155967      9,172,479,784      topdown-be-bound          #     24.7% backend bound
+...
+
+Full Top Down includes more levels that can break down the
+bottlenecks further. This is not directly implemented in perf,
+but available in other tools that can run on top of perf,
+such as toplev[2] or vtune[3].
+
+New Topdown features in Icelake
+===============================
+
+With Icelake CPUs the TopDown metrics are directly available as
+fixed counters and do not require generic counters. This allows
+to collect TopDown always in addition to other events.
+
+This also enables measuring TopDown per thread/process instead
+of only per core.
+
+Using TopDown through RDPMC in applications on Icelake
+======================================================
+
+For more fine grained measurements it can be useful to
+access the new counters directly from user space. This is more complicated,
+but drastically lowers overhead.
+
+On Icelake, there is a new fixed counter 3: SLOTS, which reports
+"pipeline SLOTS" (cycles multiplied by core issue width) and a
+metric register that reports slots ratios for the different bottleneck
+categories.
+
+The metrics counter is CPU model specific and will not be available
+on older CPUs.
+
+Example code
+============
+
+Library functions providing the functionality described below
+are also available in libjevents [4]
+
+The application opens a perf_event file descriptor
+and sets up fixed counter 3 (SLOTS) to start and
+allow user programs to read the performance counters.
+
+Fixed counter 3 is mapped to the pseudo event event=0x00, umask=0x04,
+so the perf_event_attr structure should be initialized with
+{ .config = 0x0400, .type = PERF_TYPE_RAW }
+
+#include <linux/perf_event.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+/* Provide own perf_event_open stub because glibc doesn't */
+__attribute__((weak))
+int perf_event_open(struct perf_event_attr *attr, pid_t pid,
+		    int cpu, int group_fd, unsigned long flags)
+{
+	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
+}
+
+/* open slots counter file descriptor for current task */
+struct perf_event_attr slots = {
+	.type = PERF_TYPE_RAW,
+	.size = sizeof(struct perf_event_attr),
+	.config = 0x400,
+	.exclude_kernel = 1,
+};
+
+int fd = perf_event_open(&slots, 0, -1, -1, 0);
+if (fd < 0)
+	... error ...
+
+The RDPMC instruction (or _rdpmc compiler intrinsic) can now be used
+to read slots and the topdown metrics at different points of the program:
+
+#include <stdint.h>
+#include <x86intrin.h>
+
+#define RDPMC_FIXED	(1 << 30)	/* return fixed counters */
+#define RDPMC_METRIC	(1 << 29)	/* return metric counters */
+
+#define FIXED_COUNTER_SLOTS		3
+#define METRIC_COUNTER_TOPDOWN_L1	0
+
+static inline uint64_t read_slots(void)
+{
+	return _rdpmc(RDPMC_FIXED | FIXED_COUNTER_SLOTS);
+}
+
+static inline uint64_t read_metrics(void)
+{
+	return _rdpmc(RDPMC_METRIC | METRIC_COUNTER_TOPDOWN_L1);
+}
+
+Then the program can be instrumented to read these metrics at different
+points.
+
+It's not a good idea to do this with too short code regions,
+as the parallelism and overlap in the CPU program execution will
+cause too much measurement inaccuracy. For example instrumenting
+individual basic blocks is definitely too fine grained.
+
+Decoding metrics values
+=======================
+
+The value reported by read_metrics() contains four 8-bit fields,
+each representing a scaled ratio for one Level 1 bottleneck category.
+All four fields add up to 0xff (= 100%)
+
+The binary ratios in the metric value can be converted to float ratios:
+
+#define GET_METRIC(m, i) (((m) >> (i*8)) & 0xff)
+
+#define TOPDOWN_RETIRING(val)	((float)GET_METRIC(val, 0) / 0xff)
+#define TOPDOWN_BAD_SPEC(val)	((float)GET_METRIC(val, 1) / 0xff)
+#define TOPDOWN_FE_BOUND(val)	((float)GET_METRIC(val, 2) / 0xff)
+#define TOPDOWN_BE_BOUND(val)	((float)GET_METRIC(val, 3) / 0xff)
+
+and then converted to percent for printing.
+
+The ratios in the metric accumulate for the time when the counter
+is enabled. For measuring programs it is often useful to measure
+specific sections. For this it is necessary to compute deltas of the metrics.
+
+This can be done by scaling the metrics with the slots counter
+read at the same time.
+
+Then it's possible to take deltas of these slots counts
+measured at different points, and determine the metrics
+for that time period.
+
+	slots_a = read_slots();
+	metric_a = read_metrics();
+
+	... larger code region ...
+
+	slots_b = read_slots()
+	metric_b = read_metrics()
+
+	# compute scaled metrics for measurement a
+	retiring_slots_a = GET_METRIC(metric_a, 0) * slots_a
+	bad_spec_slots_a = GET_METRIC(metric_a, 1) * slots_a
+	fe_bound_slots_a = GET_METRIC(metric_a, 2) * slots_a
+	be_bound_slots_a = GET_METRIC(metric_a, 3) * slots_a
+
+	# compute delta scaled metrics between b and a
+	retiring_slots = GET_METRIC(metric_b, 0) * slots_b - retiring_slots_a
+	bad_spec_slots = GET_METRIC(metric_b, 1) * slots_b - bad_spec_slots_a
+	fe_bound_slots = GET_METRIC(metric_b, 2) * slots_b - fe_bound_slots_a
+	be_bound_slots = GET_METRIC(metric_b, 3) * slots_b - be_bound_slots_a
+
+Later the individual ratios for the measurement period can be recreated
+from these counts (dividing by 0xff undoes the 8-bit scaling applied by
+GET_METRIC above).
+
+	slots_delta = slots_b - slots_a
+	retiring_ratio = (float)retiring_slots / 0xff / slots_delta
+	bad_spec_ratio = (float)bad_spec_slots / 0xff / slots_delta
+	fe_bound_ratio = (float)fe_bound_slots / 0xff / slots_delta
+	be_bound_ratio = (float)be_bound_slots / 0xff / slots_delta
+
+	printf("Retiring %.2f%% Bad Speculation %.2f%% FE Bound %.2f%% BE Bound %.2f%%\n",
+		retiring_ratio * 100.,
+		bad_spec_ratio * 100.,
+		fe_bound_ratio * 100.,
+		be_bound_ratio * 100.);
+
+Resetting metrics counters
+==========================
+
+Since the individual metrics are only 8 bits wide, they lose precision
+over time: as the slots count grows, each fraction bit covers more and
+more cycles, so short regions can no longer be resolved. The counters
+therefore need to be reset regularly.
+
+When using the kernel perf API, the kernel resets the counters on every
+read. So as long as reads happen at reasonable intervals (every few
+seconds) the precision is good.
+
+When using perf stat it is recommended to always use the -I option,
+with an interval no longer than a few seconds:
+
+	perf stat -I 1000 --topdown ...
+
+For user programs using RDPMC directly the counter can
+be reset explicitly using ioctl:
+
+	ioctl(perf_fd, PERF_EVENT_IOC_RESET, 0);
+
+This "opens" a new measurement period.
+
+A program using RDPMC for TopDown should schedule such a reset
+regularly, as in every few seconds.
+
+[1] https://software.intel.com/en-us/top-down-microarchitecture-analysis-method-win
+[2] https://github.com/andikleen/pmu-tools/wiki/toplev-manual
+[3] https://software.intel.com/en-us/intel-vtune-amplifier-xe
+[4] https://github.com/andikleen/pmu-tools/tree/master/jevents
+[5] https://sites.google.com/site/analysismethods/yasin-pubs
-- 
2.17.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 1/8] perf/x86/intel: Set correct mask for TOPDOWN.SLOTS
  2019-08-26 14:47 ` [RESEND PATCH V3 1/8] perf/x86/intel: Set correct mask for TOPDOWN.SLOTS kan.liang
@ 2019-08-28  7:48   ` Peter Zijlstra
  0 siblings, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28  7:48 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Mon, Aug 26, 2019 at 07:47:33AM -0700, kan.liang@linux.intel.com wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
> 
> TOPDOWN.SLOTS(0x0400) is not a generic event. It is only available on
> fixed counter3.
> 
> Don't extend its mask to generic counters.
> 
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>


> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index 1392d5e6e8d6..457d35a75ad3 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -167,6 +167,11 @@ struct x86_pmu_capability {
>  #define INTEL_PMC_IDX_FIXED_REF_CYCLES	(INTEL_PMC_IDX_FIXED + 2)
>  #define INTEL_PMC_MSK_FIXED_REF_CYCLES	(1ULL << INTEL_PMC_IDX_FIXED_REF_CYCLES)
>  
> +/* TOPDOWN.SLOTS: */
> +#define MSR_ARCH_PERFMON_FIXED_CTR3	0x30c
> +#define INTEL_PMC_IDX_FIXED_SLOTS	(INTEL_PMC_IDX_FIXED + 3)
> +#define INTEL_PMC_MSK_FIXED_SLOTS	(1ULL << INTEL_PMC_IDX_FIXED_SLOTS)
> +
>  /*
>   * We model BTS tracing as another fixed-mode PMC.
>   *


This whole second hunk does not belong in this patch, probably the next
one.


* Re: [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters
  2019-08-26 14:47 ` [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters kan.liang
@ 2019-08-28  7:48   ` Peter Zijlstra
  2019-08-28  7:52   ` Peter Zijlstra
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28  7:48 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Mon, Aug 26, 2019 at 07:47:34AM -0700, kan.liang@linux.intel.com wrote:
> Move BTS index to 47. Because bit 48 in the PERF_GLOBAL_STATUS is used
> to indicate the overflow status of PERF_METRICS counters now.

> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index 457d35a75ad3..88a506312a10 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -175,11 +175,56 @@ struct x86_pmu_capability {
>  /*
>   * We model BTS tracing as another fixed-mode PMC.
>   *
> - * We choose a value in the middle of the fixed event range, since lower
> + * We choose value 47 for the fixed index of BTS, since lower
>   * values are used by actual fixed events and higher values are used
>   * to indicate other overflow conditions in the PERF_GLOBAL_STATUS msr.
>   */
> -#define INTEL_PMC_IDX_FIXED_BTS				(INTEL_PMC_IDX_FIXED + 16)
> +#define INTEL_PMC_IDX_FIXED_BTS				(INTEL_PMC_IDX_FIXED + 15)

Can be its own patch.


* Re: [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters
  2019-08-26 14:47 ` [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters kan.liang
  2019-08-28  7:48   ` Peter Zijlstra
@ 2019-08-28  7:52   ` Peter Zijlstra
  2019-08-28 13:59     ` Liang, Kan
  2019-08-28  8:44   ` Peter Zijlstra
  2019-08-28  8:52   ` Peter Zijlstra
  3 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28  7:52 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Mon, Aug 26, 2019 at 07:47:34AM -0700, kan.liang@linux.intel.com wrote:

> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 81b005e4c7d9..54534ff00940 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -1033,18 +1033,30 @@ static inline void x86_assign_hw_event(struct perf_event *event,
>  				struct cpu_hw_events *cpuc, int i)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
> +	int reg_idx;
>  
>  	hwc->idx = cpuc->assign[i];
>  	hwc->last_cpu = smp_processor_id();
>  	hwc->last_tag = ++cpuc->tags[i];
>  
> +	/*
> +	 * Metrics counters use different indexes in the scheduler
> +	 * versus the hardware.
> +	 *
> +	 * Map metrics to fixed counter 3 (which is the base count),
> +	 * but the update event callback reads the extra metric register
> +	 * and converts to the right metric.
> +	 */
> +	reg_idx = get_reg_idx(hwc->idx);
> +
>  	if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) {
>  		hwc->config_base = 0;
>  		hwc->event_base	= 0;
>  	} else if (hwc->idx >= INTEL_PMC_IDX_FIXED) {
>  		hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
> -		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED);
> -		hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30;
> +		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
> +				  (reg_idx - INTEL_PMC_IDX_FIXED);
> +		hwc->event_base_rdpmc = (reg_idx - INTEL_PMC_IDX_FIXED) | 1<<30;
>  	} else {
>  		hwc->config_base = x86_pmu_config_addr(hwc->idx);
>  		hwc->event_base  = x86_pmu_event_addr(hwc->idx);

That reg_idx is a pointless unconditional branch; better to write it
like:

static inline void x86_assign_hw_event(struct perf_event *event,
				struct cpu_hw_events *cpuc, int i)
{
	struct hw_perf_event *hwc = &event->hw;
	int idx;

	idx = hwc->idx = cpuc->assign[i];
	hwc->last_cpu = smp_processor_id();
	hwc->last_tag = ++cpuc->tags[i];

	switch (hwc->idx) {
	case INTEL_PMC_IDX_FIXED_BTS:
		hwc->config_base = 0;
		hwc->event_base	= 0;
		break;

	case INTEL_PMC_IDX_FIXED_METRIC_BASE ... INTEL_PMC_IDX_FIXED_METRIC_BASE+3:
		/* All METRIC events are mapped onto the fixed SLOTS counter */
		idx = INTEL_PMC_IDX_FIXED_SLOTS;

	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_METRIC_BASE-1:
		hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
				  (idx - INTEL_PMC_IDX_FIXED);
		hwc->event_base_rdpmc = (idx - INTEL_PMC_IDX_FIXED) | 1<<30;
		break;

	default:
		hwc->config_base = x86_pmu_config_addr(hwc->idx);
		hwc->event_base = x86_pmu_event_addr(hwc->idx);
		hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx);
		break;
	}
}

On that; wth does this do to the RDPMC userspace support!? Does that even
work with these counters?


* Re: [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters
  2019-08-26 14:47 ` [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters kan.liang
  2019-08-28  7:48   ` Peter Zijlstra
  2019-08-28  7:52   ` Peter Zijlstra
@ 2019-08-28  8:44   ` Peter Zijlstra
  2019-08-28  9:02     ` Peter Zijlstra
  2019-08-28  8:52   ` Peter Zijlstra
  3 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28  8:44 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Mon, Aug 26, 2019 at 07:47:34AM -0700, kan.liang@linux.intel.com wrote:

> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 1b2c37ed49db..f4d6335a18e2 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2131,7 +2131,7 @@ static inline void intel_pmu_ack_status(u64 ack)
>  
>  static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
>  {
> -	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
> +	int idx = get_reg_idx(hwc->idx) - INTEL_PMC_IDX_FIXED;
>  	u64 ctrl_val, mask;
>  
>  	mask = 0xfULL << (idx * 4);
> @@ -2150,6 +2150,7 @@ static void intel_pmu_disable_event(struct perf_event *event)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
>  	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +	int reg_idx = get_reg_idx(hwc->idx);
>  
>  	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
>  		intel_pmu_disable_bts();

It is unfortunate we need that in both cases; and note the
inconsistent naming.

> @@ -2157,9 +2158,16 @@ static void intel_pmu_disable_event(struct perf_event *event)
>  		return;
>  	}
>  
> -	cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
> -	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
> -	cpuc->intel_cp_status &= ~(1ull << hwc->idx);
> +	/*
> +	 * When any other topdown events are still enabled,
> +	 * cancel the disabling.
> +	 */
> +	if (has_other_topdown_event(cpuc->active_mask, hwc->idx))
> +		return;

And this includes a 3rd instance of that check :/ Also, this really
wants to be in disable_fixed.

> +
> +	cpuc->intel_ctrl_guest_mask &= ~(1ull << reg_idx);
> +	cpuc->intel_ctrl_host_mask &= ~(1ull << reg_idx);
> +	cpuc->intel_cp_status &= ~(1ull << reg_idx);
>  
>  	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL))
>  		intel_pmu_disable_fixed(hwc);

Same for the enable thing.

Let me clean up this mess for you.


* Re: [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters
  2019-08-26 14:47 ` [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters kan.liang
                     ` (2 preceding siblings ...)
  2019-08-28  8:44   ` Peter Zijlstra
@ 2019-08-28  8:52   ` Peter Zijlstra
  3 siblings, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28  8:52 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Mon, Aug 26, 2019 at 07:47:34AM -0700, kan.liang@linux.intel.com wrote:
> +#define INTEL_PMC_IDX_FIXED_METRIC_BASE		(INTEL_PMC_IDX_FIXED + 16)
> +#define INTEL_PMC_IDX_TD_RETIRING		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 0)
> +#define INTEL_PMC_IDX_TD_BAD_SPEC		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 1)
> +#define INTEL_PMC_IDX_TD_FE_BOUND		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 2)
> +#define INTEL_PMC_IDX_TD_BE_BOUND		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 3)
> +#define INTEL_PMC_MSK_TOPDOWN			((0xfull << INTEL_PMC_IDX_FIXED_METRIC_BASE) | \
> +						 INTEL_PMC_MSK_FIXED_SLOTS)

> +
> +#define INTEL_PMC_CLEAR_TOPDOWN_BIT(bit)	(~(0x1ull << bit) & INTEL_PMC_MSK_TOPDOWN)

That thing is a clear misnomer; it is OTHER_TOPDOWN_BITS(). Fixed that
too.


* Re: [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters
  2019-08-28  8:44   ` Peter Zijlstra
@ 2019-08-28  9:02     ` Peter Zijlstra
  2019-08-28  9:37       ` Peter Zijlstra
  2019-08-28 13:51       ` Liang, Kan
  0 siblings, 2 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28  9:02 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Wed, Aug 28, 2019 at 10:44:16AM +0200, Peter Zijlstra wrote:

> Let me clean up this mess for you.

Here, how's that. Now we don't check is_metric_idx() _3_ times on the
enable/disable path and all the topdown crud is properly placed in the
fixed counter functions.

Please think; don't tinker.

---

--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1033,22 +1033,34 @@ static inline void x86_assign_hw_event(s
 				struct cpu_hw_events *cpuc, int i)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	int idx;
 
-	hwc->idx = cpuc->assign[i];
+	idx = hwc->idx = cpuc->assign[i];
 	hwc->last_cpu = smp_processor_id();
 	hwc->last_tag = ++cpuc->tags[i];
 
-	if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) {
+	switch (hwc->idx) {
+	case INTEL_PMC_IDX_FIXED_BTS:
 		hwc->config_base = 0;
 		hwc->event_base	= 0;
-	} else if (hwc->idx >= INTEL_PMC_IDX_FIXED) {
+		break;
+
+	case INTEL_PMC_IDX_FIXED_METRIC_BASE ... INTEL_PMC_IDX_FIXED_METRIC_BASE+3:
+		/* All METRIC events are mapped onto the fixed SLOTS event */
+		idx = INTEL_PMC_IDX_FIXED_SLOTS;
+
+	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS-1:
 		hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
-		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED);
-		hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30;
-	} else {
+		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
+				  (idx - INTEL_PMC_IDX_FIXED);
+		hwc->event_base_rdpmc = (idx - INTEL_PMC_IDX_FIXED) | 1<<30;
+		break;
+
+	default:
 		hwc->config_base = x86_pmu_config_addr(hwc->idx);
 		hwc->event_base  = x86_pmu_event_addr(hwc->idx);
 		hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx);
+		break;
 	}
 }
 
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2128,27 +2128,61 @@ static inline void intel_pmu_ack_status(
 	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, ack);
 }
 
-static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
+static inline bool event_is_checkpointed(struct perf_event *event)
+{
+	return unlikely(event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
+}
+
+static inline void intel_set_masks(struct perf_event *event, int idx)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	if (event->attr.exclude_host)
+		__set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+	if (event->attr.exclude_guest)
+		__set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+	if (event_is_checkpointed(event))
+		__set_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static inline void intel_clear_masks(struct perf_event *event, int idx)
 {
-	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	__clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+	__clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+	__clear_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static void intel_pmu_disable_fixed(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
 	u64 ctrl_val, mask;
+	int idx = hwc->idx;
 
-	mask = 0xfULL << (idx * 4);
+	if (is_topdown_idx(idx)) {
+		struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+		/*
+		 * When there are other Top-Down events still active; don't
+		 * disable the SLOTS counter.
+		 */
+		if (*(u64 *)cpuc->active_mask & INTEL_PMC_OTHER_TOPDOWN_BITS(idx))
+			return;
+
+		idx = INTEL_PMC_IDX_FIXED_SLOTS;
+	}
 
+	intel_clear_masks(event, idx);
+
+	mask = 0xfULL << ((idx - INTEL_PMC_IDX_FIXED) * 4);
 	rdmsrl(hwc->config_base, ctrl_val);
 	ctrl_val &= ~mask;
 	wrmsrl(hwc->config_base, ctrl_val);
 }
 
-static inline bool event_is_checkpointed(struct perf_event *event)
-{
-	return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
-}
-
 static void intel_pmu_disable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
 		intel_pmu_disable_bts();
@@ -2156,18 +2190,19 @@ static void intel_pmu_disable_event(stru
 		return;
 	}
 
-	cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
-	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
-	cpuc->intel_cp_status &= ~(1ull << hwc->idx);
-
-	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL))
-		intel_pmu_disable_fixed(hwc);
-	else
+	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
+		intel_pmu_disable_fixed(event);
+	} else {
+		intel_clear_masks(event, hwc->idx);
 		x86_pmu_disable_event(event);
+	}
 
 	/*
 	 * Needs to be called after x86_pmu_disable_event,
 	 * so we don't trigger the event without PEBS bit set.
+	 *
+	 * Metric stuff doesn't do PEBS (I hope?); and so the early exit from
+	 * intel_pmu_disable_fixed() is OK.
 	 */
 	if (unlikely(event->attr.precise_ip))
 		intel_pmu_pebs_disable(event);
@@ -2192,8 +2227,22 @@ static void intel_pmu_read_event(struct
 static void intel_pmu_enable_fixed(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
 	u64 ctrl_val, mask, bits = 0;
+	int idx = hwc->idx;
+
+	if (is_topdown_idx(idx)) {
+		struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+		/*
+		 * When there are other Top-Down events already active; don't
+		 * enable the SLOTS counter.
+		 */
+		if (*(u64 *)cpuc->active_mask & INTEL_PMC_OTHER_TOPDOWN_BITS(idx))
+			return;
+
+		idx = INTEL_PMC_IDX_FIXED_SLOTS;
+	}
+
+	intel_set_masks(event, hwc->idx);
 
 	/*
 	 * Enable IRQ generation (0x8), if not PEBS,
@@ -2213,6 +2262,7 @@ static void intel_pmu_enable_fixed(struc
 	if (x86_pmu.version > 2 && hwc->config & ARCH_PERFMON_EVENTSEL_ANY)
 		bits |= 0x4;
 
+	idx -= INTEL_PMC_IDX_FIXED;
 	bits <<= (idx * 4);
 	mask = 0xfULL << (idx * 4);
 
@@ -2230,7 +2280,6 @@ static void intel_pmu_enable_fixed(struc
 static void intel_pmu_enable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
 	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
 		if (!__this_cpu_read(cpu_hw_events.enabled))
@@ -2240,23 +2289,16 @@ static void intel_pmu_enable_event(struc
 		return;
 	}
 
-	if (event->attr.exclude_host)
-		cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
-	if (event->attr.exclude_guest)
-		cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx);
-
-	if (unlikely(event_is_checkpointed(event)))
-		cpuc->intel_cp_status |= (1ull << hwc->idx);
 
 	if (unlikely(event->attr.precise_ip))
 		intel_pmu_pebs_enable(event);
 
 	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
 		intel_pmu_enable_fixed(event);
-		return;
+	} else {
+		intel_set_masks(event, hwc->idx);
+		__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
 	}
-
-	__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
 }
 
 static void intel_pmu_add_event(struct perf_event *event)
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -349,6 +349,20 @@ struct cpu_hw_events {
 	EVENT_CONSTRAINT(c, (1ULL << (32+n)), FIXED_EVENT_FLAGS)
 
 /*
+ * Special metric counters do not actually exist, but get remapped
+ * to a combination of FxCtr3 + MSR_PERF_METRICS
+ *
+ * This allocates them to a dummy offset for the scheduler.
+ * This does not allow sharing of multiple users of the same
+ * metric without multiplexing, even though the hardware supports that
+ * in principle.
+ */
+
+#define METRIC_EVENT_CONSTRAINT(c, n)					\
+	EVENT_CONSTRAINT(c, (1ULL << (INTEL_PMC_IDX_FIXED_METRIC_BASE+n)), \
+			 FIXED_EVENT_FLAGS)
+
+/*
  * Constraint on the Event code + UMask
  */
 #define INTEL_UEVENT_CONSTRAINT(c, n)	\
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -795,6 +795,7 @@
 #define MSR_CORE_PERF_FIXED_CTR0	0x00000309
 #define MSR_CORE_PERF_FIXED_CTR1	0x0000030a
 #define MSR_CORE_PERF_FIXED_CTR2	0x0000030b
+#define MSR_CORE_PERF_FIXED_CTR3	0x0000030c
 #define MSR_CORE_PERF_FIXED_CTR_CTRL	0x0000038d
 #define MSR_CORE_PERF_GLOBAL_STATUS	0x0000038e
 #define MSR_CORE_PERF_GLOBAL_CTRL	0x0000038f
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -175,11 +175,39 @@ struct x86_pmu_capability {
 /*
  * We model BTS tracing as another fixed-mode PMC.
  *
- * We choose a value in the middle of the fixed event range, since lower
+ * We choose value 47 for the fixed index of BTS, since lower
  * values are used by actual fixed events and higher values are used
  * to indicate other overflow conditions in the PERF_GLOBAL_STATUS msr.
  */
-#define INTEL_PMC_IDX_FIXED_BTS				(INTEL_PMC_IDX_FIXED + 16)
+#define INTEL_PMC_IDX_FIXED_BTS				(INTEL_PMC_IDX_FIXED + 15)
+
+/*
+ * We model PERF_METRICS as more magic fixed-mode PMCs, one for each metric
+ *
+ * Internally they all map to Fixed Ctr 3 (SLOTS), and allocate PERF_METRICS
+ * as an extra_reg. PERF_METRICS has no own configuration, but we fill in
+ * the configuration of FxCtr3 to enforce that all the shared users of SLOTS
+ * have the same configuration.
+ */
+#define INTEL_PMC_IDX_FIXED_METRIC_BASE		(INTEL_PMC_IDX_FIXED + 16)
+#define INTEL_PMC_IDX_TD_RETIRING		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 0)
+#define INTEL_PMC_IDX_TD_BAD_SPEC		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 1)
+#define INTEL_PMC_IDX_TD_FE_BOUND		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 2)
+#define INTEL_PMC_IDX_TD_BE_BOUND		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 3)
+#define INTEL_PMC_MSK_TOPDOWN			((0xfull << INTEL_PMC_IDX_FIXED_METRIC_BASE) | \
+						 INTEL_PMC_MSK_FIXED_SLOTS)
+
+static inline bool is_metric_idx(int idx)
+{
+	return (unsigned)(idx - INTEL_PMC_IDX_FIXED_METRIC_BASE) < 4;
+}
+
+static inline bool is_topdown_idx(int idx)
+{
+	return is_metric_idx(idx) || idx == INTEL_PMC_IDX_FIXED_SLOTS;
+}
+
+#define INTEL_PMC_OTHER_TOPDOWN_BITS(bit)	(~(0x1ull << bit) & INTEL_PMC_MSK_TOPDOWN)
 
 #define GLOBAL_STATUS_COND_CHG				BIT_ULL(63)
 #define GLOBAL_STATUS_BUFFER_OVF			BIT_ULL(62)


* Re: [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters
  2019-08-28  9:02     ` Peter Zijlstra
@ 2019-08-28  9:37       ` Peter Zijlstra
  2019-08-28 13:51       ` Liang, Kan
  1 sibling, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28  9:37 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Wed, Aug 28, 2019 at 11:02:17AM +0200, Peter Zijlstra wrote:
> @@ -2192,8 +2227,22 @@ static void intel_pmu_read_event(struct
>  static void intel_pmu_enable_fixed(struct perf_event *event)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
> -	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
>  	u64 ctrl_val, mask, bits = 0;
> +	int idx = hwc->idx;
> +
> +	if (is_topdown_idx(idx)) {
> +		struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +		/*
> +		 * When there are other Top-Down events already active; don't
> +		 * enable the SLOTS counter.
> +		 */
> +		if (*(u64 *)cpuc->active_mask & INTEL_PMC_OTHER_TOPDOWN_BITS(idx))
> +			return;
> +
> +		idx = INTEL_PMC_IDX_FIXED_SLOTS;
> +	}
> +
> +	intel_set_masks(event, hwc->idx);

That wants to be idx, not hwc->idx.


* Re: [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters
  2019-08-28  9:02     ` Peter Zijlstra
  2019-08-28  9:37       ` Peter Zijlstra
@ 2019-08-28 13:51       ` Liang, Kan
  1 sibling, 0 replies; 39+ messages in thread
From: Liang, Kan @ 2019-08-28 13:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak



On 8/28/2019 5:02 AM, Peter Zijlstra wrote:
> On Wed, Aug 28, 2019 at 10:44:16AM +0200, Peter Zijlstra wrote:
> 
>> Let me clean up this mess for you.
> 
> Here, how's that. Now we don't check is_metric_idx() _3_ times on the
> enable/disable path and all the topdown crud is properly placed in the
> fixed counter functions.

Thank you very much for the cleanup. The new code is more efficient.
I will prepare V4 with this code and to address other comments.

Thanks,
Kan
> 
> Please think; don't tinker.
> 
> ---
> 
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -1033,22 +1033,34 @@ static inline void x86_assign_hw_event(s
>   				struct cpu_hw_events *cpuc, int i)
>   {
>   	struct hw_perf_event *hwc = &event->hw;
> +	int idx;
>   
> -	hwc->idx = cpuc->assign[i];
> +	idx = hwc->idx = cpuc->assign[i];
>   	hwc->last_cpu = smp_processor_id();
>   	hwc->last_tag = ++cpuc->tags[i];
>   
> -	if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) {
> +	switch (hwc->idx) {
> +	case INTEL_PMC_IDX_FIXED_BTS:
>   		hwc->config_base = 0;
>   		hwc->event_base	= 0;
> -	} else if (hwc->idx >= INTEL_PMC_IDX_FIXED) {
> +		break;
> +
> +	case INTEL_PMC_IDX_FIXED_METRIC_BASE ... INTEL_PMC_IDX_FIXED_METRIC_BASE+3:
> +		/* All METRIC events are mapped onto the fixed SLOTS event */
> +		idx = INTEL_PMC_IDX_FIXED_SLOTS;
> +
> +	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS-1:
>   		hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
> -		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED);
> -		hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30;
> -	} else {
> +		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
> +				  (idx - INTEL_PMC_IDX_FIXED);
> +		hwc->event_base_rdpmc = (idx - INTEL_PMC_IDX_FIXED) | 1<<30;
> +		break;
> +
> +	default:
>   		hwc->config_base = x86_pmu_config_addr(hwc->idx);
>   		hwc->event_base  = x86_pmu_event_addr(hwc->idx);
>   		hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx);
> +		break;
>   	}
>   }
>   
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2128,27 +2128,61 @@ static inline void intel_pmu_ack_status(
>   	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, ack);
>   }
>   
> -static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
> +static inline bool event_is_checkpointed(struct perf_event *event)
> +{
> +	return unlikely(event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
> +}
> +
> +static inline void intel_set_masks(struct perf_event *event, int idx)
> +{
> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +
> +	if (event->attr.exclude_host)
> +		__set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
> +	if (event->attr.exclude_guest)
> +		__set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
> +	if (event_is_checkpointed(event))
> +		__set_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
> +}
> +
> +static inline void intel_clear_masks(struct perf_event *event, int idx)
>   {
> -	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +
> +	__clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
> +	__clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
> +	__clear_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
> +}
> +
> +static void intel_pmu_disable_fixed(struct perf_event *event)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
>   	u64 ctrl_val, mask;
> +	int idx = hwc->idx;
>   
> -	mask = 0xfULL << (idx * 4);
> +	if (is_topdown_idx(idx)) {
> +		struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +		/*
> +		 * When there are other Top-Down events still active; don't
> +		 * disable the SLOTS counter.
> +		 */
> +		if (*(u64 *)cpuc->active_mask & INTEL_PMC_OTHER_TOPDOWN_BITS(idx))
> +			return;
> +
> +		idx = INTEL_PMC_IDX_FIXED_SLOTS;
> +	}
>   
> +	intel_clear_masks(event, idx);
> +
> +	mask = 0xfULL << ((idx - INTEL_PMC_IDX_FIXED) * 4);
>   	rdmsrl(hwc->config_base, ctrl_val);
>   	ctrl_val &= ~mask;
>   	wrmsrl(hwc->config_base, ctrl_val);
>   }
>   
> -static inline bool event_is_checkpointed(struct perf_event *event)
> -{
> -	return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
> -}
> -
>   static void intel_pmu_disable_event(struct perf_event *event)
>   {
>   	struct hw_perf_event *hwc = &event->hw;
> -	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>   
>   	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
>   		intel_pmu_disable_bts();
> @@ -2156,18 +2190,19 @@ static void intel_pmu_disable_event(stru
>   		return;
>   	}
>   
> -	cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
> -	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
> -	cpuc->intel_cp_status &= ~(1ull << hwc->idx);
> -
> -	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL))
> -		intel_pmu_disable_fixed(hwc);
> -	else
> +	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
> +		intel_pmu_disable_fixed(event);
> +	} else {
> +		intel_clear_masks(event, hwc->idx);
>   		x86_pmu_disable_event(event);
> +	}
>   
>   	/*
>   	 * Needs to be called after x86_pmu_disable_event,
>   	 * so we don't trigger the event without PEBS bit set.
> +	 *
> +	 * Metric stuff doesn't do PEBS (I hope?); and so the early exit from
> +	 * intel_pmu_disable_fixed() is OK.
>   	 */
>   	if (unlikely(event->attr.precise_ip))
>   		intel_pmu_pebs_disable(event);
> @@ -2192,8 +2227,22 @@ static void intel_pmu_read_event(struct
>   static void intel_pmu_enable_fixed(struct perf_event *event)
>   {
>   	struct hw_perf_event *hwc = &event->hw;
> -	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
>   	u64 ctrl_val, mask, bits = 0;
> +	int idx = hwc->idx;
> +
> +	if (is_topdown_idx(idx)) {
> +		struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +		/*
> +		 * When there are other Top-Down events already active; don't
> +		 * enable the SLOTS counter.
> +		 */
> +		if (*(u64 *)cpuc->active_mask & INTEL_PMC_OTHER_TOPDOWN_BITS(idx))
> +			return;
> +
> +		idx = INTEL_PMC_IDX_FIXED_SLOTS;
> +	}
> +
> +	intel_set_masks(event, hwc->idx);
>   
>   	/*
>   	 * Enable IRQ generation (0x8), if not PEBS,
> @@ -2213,6 +2262,7 @@ static void intel_pmu_enable_fixed(struc
>   	if (x86_pmu.version > 2 && hwc->config & ARCH_PERFMON_EVENTSEL_ANY)
>   		bits |= 0x4;
>   
> +	idx -= INTEL_PMC_IDX_FIXED;
>   	bits <<= (idx * 4);
>   	mask = 0xfULL << (idx * 4);
>   
> @@ -2230,7 +2280,6 @@ static void intel_pmu_enable_fixed(struc
>   static void intel_pmu_enable_event(struct perf_event *event)
>   {
>   	struct hw_perf_event *hwc = &event->hw;
> -	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>   
>   	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
>   		if (!__this_cpu_read(cpu_hw_events.enabled))
> @@ -2240,23 +2289,16 @@ static void intel_pmu_enable_event(struc
>   		return;
>   	}
>   
> -	if (event->attr.exclude_host)
> -		cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
> -	if (event->attr.exclude_guest)
> -		cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx);
> -
> -	if (unlikely(event_is_checkpointed(event)))
> -		cpuc->intel_cp_status |= (1ull << hwc->idx);
>   
>   	if (unlikely(event->attr.precise_ip))
>   		intel_pmu_pebs_enable(event);
>   
>   	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
>   		intel_pmu_enable_fixed(event);
> -		return;
> +	} else {
> +		intel_set_masks(event, hwc->idx);
> +		__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
>   	}
> -
> -	__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
>   }
>   
>   static void intel_pmu_add_event(struct perf_event *event)
> --- a/arch/x86/events/perf_event.h
> +++ b/arch/x86/events/perf_event.h
> @@ -349,6 +349,20 @@ struct cpu_hw_events {
>   	EVENT_CONSTRAINT(c, (1ULL << (32+n)), FIXED_EVENT_FLAGS)
>   
>   /*
> + * Special metric counters do not actually exist, but get remapped
> + * to a combination of FxCtr3 + MSR_PERF_METRICS
> + *
> + * This allocates them to a dummy offset for the scheduler.
> + * This does not allow sharing of multiple users of the same
> + * metric without multiplexing, even though the hardware supports that
> + * in principle.
> + */
> +
> +#define METRIC_EVENT_CONSTRAINT(c, n)					\
> +	EVENT_CONSTRAINT(c, (1ULL << (INTEL_PMC_IDX_FIXED_METRIC_BASE+n)), \
> +			 FIXED_EVENT_FLAGS)
> +
> +/*
>    * Constraint on the Event code + UMask
>    */
>   #define INTEL_UEVENT_CONSTRAINT(c, n)	\
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -795,6 +795,7 @@
>   #define MSR_CORE_PERF_FIXED_CTR0	0x00000309
>   #define MSR_CORE_PERF_FIXED_CTR1	0x0000030a
>   #define MSR_CORE_PERF_FIXED_CTR2	0x0000030b
> +#define MSR_CORE_PERF_FIXED_CTR3	0x0000030c
>   #define MSR_CORE_PERF_FIXED_CTR_CTRL	0x0000038d
>   #define MSR_CORE_PERF_GLOBAL_STATUS	0x0000038e
>   #define MSR_CORE_PERF_GLOBAL_CTRL	0x0000038f
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -175,11 +175,39 @@ struct x86_pmu_capability {
>   /*
>    * We model BTS tracing as another fixed-mode PMC.
>    *
> - * We choose a value in the middle of the fixed event range, since lower
> + * We choose value 47 for the fixed index of BTS, since lower
>    * values are used by actual fixed events and higher values are used
>    * to indicate other overflow conditions in the PERF_GLOBAL_STATUS msr.
>    */
> -#define INTEL_PMC_IDX_FIXED_BTS				(INTEL_PMC_IDX_FIXED + 16)
> +#define INTEL_PMC_IDX_FIXED_BTS				(INTEL_PMC_IDX_FIXED + 15)
> +
> +/*
> + * We model PERF_METRICS as more magic fixed-mode PMCs, one for each metric
> + *
> + * Internally they all map to Fixed Ctr 3 (SLOTS), and allocate PERF_METRICS
> + * as an extra_reg. PERF_METRICS has no configuration of its own, but we fill in
> + * the configuration of FxCtr3 to enforce that all the shared users of SLOTS
> + * have the same configuration.
> + */
> +#define INTEL_PMC_IDX_FIXED_METRIC_BASE		(INTEL_PMC_IDX_FIXED + 16)
> +#define INTEL_PMC_IDX_TD_RETIRING		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 0)
> +#define INTEL_PMC_IDX_TD_BAD_SPEC		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 1)
> +#define INTEL_PMC_IDX_TD_FE_BOUND		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 2)
> +#define INTEL_PMC_IDX_TD_BE_BOUND		(INTEL_PMC_IDX_FIXED_METRIC_BASE + 3)
> +#define INTEL_PMC_MSK_TOPDOWN			((0xfull << INTEL_PMC_IDX_FIXED_METRIC_BASE) | \
> +						 INTEL_PMC_MSK_FIXED_SLOTS)
> +
> +static inline bool is_metric_idx(int idx)
> +{
> +	return (unsigned)(idx - INTEL_PMC_IDX_FIXED_METRIC_BASE) < 4;
> +}
> +
> +static inline bool is_topdown_idx(int idx)
> +{
> +	return is_metric_idx(idx) || idx == INTEL_PMC_IDX_FIXED_SLOTS;
> +}
> +
> +#define INTEL_PMC_OTHER_TOPDOWN_BITS(bit)	(~(0x1ull << bit) & INTEL_PMC_MSK_TOPDOWN)
>   
>   #define GLOBAL_STATUS_COND_CHG				BIT_ULL(63)
>   #define GLOBAL_STATUS_BUFFER_OVF			BIT_ULL(62)
> 

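The index layout in that last hunk can be sanity-checked outside the kernel. The sketch below lifts the helpers from the quoted hunk and fills in the two base constants it does not show (INTEL_PMC_IDX_FIXED is 32 and SLOTS is fixed counter 3 in the kernel headers; treat those values as assumptions here):

```c
#include <assert.h>
#include <stdbool.h>

/* Base values assumed from the kernel headers (not shown in the hunk above). */
#define INTEL_PMC_IDX_FIXED			32
#define INTEL_PMC_IDX_FIXED_SLOTS		(INTEL_PMC_IDX_FIXED + 3)
#define INTEL_PMC_MSK_FIXED_SLOTS		(1ULL << INTEL_PMC_IDX_FIXED_SLOTS)

/* Definitions copied from the quoted hunk. */
#define INTEL_PMC_IDX_FIXED_METRIC_BASE		(INTEL_PMC_IDX_FIXED + 16)
#define INTEL_PMC_MSK_TOPDOWN			((0xfULL << INTEL_PMC_IDX_FIXED_METRIC_BASE) | \
						 INTEL_PMC_MSK_FIXED_SLOTS)
#define INTEL_PMC_OTHER_TOPDOWN_BITS(bit)	(~(0x1ULL << (bit)) & INTEL_PMC_MSK_TOPDOWN)

static bool is_metric_idx(int idx)
{
	/* True only for the four metric pseudo-indexes, 48..51 with the bases above. */
	return (unsigned)(idx - INTEL_PMC_IDX_FIXED_METRIC_BASE) < 4;
}

static bool is_topdown_idx(int idx)
{
	return is_metric_idx(idx) || idx == INTEL_PMC_IDX_FIXED_SLOTS;
}
```

With these bases, INTEL_PMC_OTHER_TOPDOWN_BITS(SLOTS) is exactly the four metric bits, which is what the enable/disable paths above test against active_mask.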
^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters
  2019-08-28  7:52   ` Peter Zijlstra
@ 2019-08-28 13:59     ` Liang, Kan
  0 siblings, 0 replies; 39+ messages in thread
From: Liang, Kan @ 2019-08-28 13:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak



On 8/28/2019 3:52 AM, Peter Zijlstra wrote:
> On Mon, Aug 26, 2019 at 07:47:34AM -0700, kan.liang@linux.intel.com wrote:
> 
>> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
>> index 81b005e4c7d9..54534ff00940 100644
>> --- a/arch/x86/events/core.c
>> +++ b/arch/x86/events/core.c
>> @@ -1033,18 +1033,30 @@ static inline void x86_assign_hw_event(struct perf_event *event,
>>   				struct cpu_hw_events *cpuc, int i)
>>   {
>>   	struct hw_perf_event *hwc = &event->hw;
>> +	int reg_idx;
>>   
>>   	hwc->idx = cpuc->assign[i];
>>   	hwc->last_cpu = smp_processor_id();
>>   	hwc->last_tag = ++cpuc->tags[i];
>>   
>> +	/*
>> +	 * Metrics counters use different indexes in the scheduler
>> +	 * versus the hardware.
>> +	 *
>> +	 * Map metrics to fixed counter 3 (which is the base count),
>> +	 * but the update event callback reads the extra metric register
>> +	 * and converts to the right metric.
>> +	 */
>> +	reg_idx = get_reg_idx(hwc->idx);
>> +
>>   	if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) {
>>   		hwc->config_base = 0;
>>   		hwc->event_base	= 0;
>>   	} else if (hwc->idx >= INTEL_PMC_IDX_FIXED) {
>>   		hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
>> -		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED);
>> -		hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30;
>> +		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
>> +				  (reg_idx - INTEL_PMC_IDX_FIXED);
>> +		hwc->event_base_rdpmc = (reg_idx - INTEL_PMC_IDX_FIXED) | 1<<30;
>>   	} else {
>>   		hwc->config_base = x86_pmu_config_addr(hwc->idx);
>>   		hwc->event_base  = x86_pmu_event_addr(hwc->idx);
> 
> That reg_idx is a pointless unconditional branch; better to write it
> like:
> 
> static inline void x86_assign_hw_event(struct perf_event *event,
> 				struct cpu_hw_events *cpuc, int i)
> {
> 	struct hw_perf_event *hwc = &event->hw;
> 	int idx;
> 
> 	idx = hwc->idx = cpuc->assign[i];
> 	hwc->last_cpu = smp_processor_id();
> 	hwc->last_tag = ++cpuc->tags[i];
> 
> 	switch (hwc->idx) {
> 	case INTEL_PMC_IDX_FIXED_BTS:
> 		hwc->config_base = 0;
> 		hwc->event_base	= 0;
> 		break;
> 
> 	case INTEL_PMC_IDX_FIXED_METRIC_BASE ... INTEL_PMC_IDX_FIXED_METRIC_BASE+3:
> 		/* All METRIC events are mapped onto the fixed SLOTS counter */
> 		idx = INTEL_PMC_IDX_FIXED_SLOTS;
> 
> 	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_METRIC_BASE-1:
> 		hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
> 		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
> 				  (idx - INTEL_PMC_IDX_FIXED);
> 		hwc->event_base_rdpmc = (idx - INTEL_PMC_IDX_FIXED) | 1<<30;
> 		break;
> 
> 	default:
> 		hwc->config_base = x86_pmu_config_addr(hwc->idx);
> 		hwc->event_base = x86_pmu_event_addr(hwc->idx);
> 		hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx);
> 		break;
> 	}
> }
> 
> On that; wth does this to the RDPMC userspace support!? Does that even
> work with these counters?
> 

The event_base_rdpmc is only for kernel usage now.
But it seems we can update x86_pmu_event_idx() to use it as well.

Thanks,
Kan




* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-26 14:47 ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics kan.liang
@ 2019-08-28 15:02   ` Peter Zijlstra
  2019-08-28 19:04     ` Andi Kleen
  2019-08-28 19:35     ` Liang, Kan
  2019-08-28 15:19   ` Peter Zijlstra
  2019-08-30 23:18   ` Stephane Eranian
  2 siblings, 2 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28 15:02 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Mon, Aug 26, 2019 at 07:47:35AM -0700, kan.liang@linux.intel.com wrote:

> Groups
> ======
> 
> To avoid reading the METRICS register multiple times, the metrics and
> slots value can only be updated by the first slots/metrics event in a
> group. All active slots and metrics events will be updated one time.

Can't we require SLOTS to be the group leader for any Metric group?

Is there ever a case where we want to add other events to a metric
group?

> Reset
> ======
> 
> The PERF_METRICS and Fixed counter 3 have to be reset for each read,
> because:
> - The 8bit metrics ratio values lose precision when the measurement
>   period gets longer.

So it mustn't be too hot,

> - The PERF_METRICS may report wrong value if its delta was less than
>   1/255 of SLOTS (Fixed counter 3).

it also mustn't be too cold. But that leaves it unspecified what exactly
is the right range.

IOW, you want a Goldilocks number of SLOTS.

> Also, for counting, the -max_period is the initial value of the SLOTS.
> The huge initial value will definitely trigger the issue mentioned
> above. Force initial value as 0 for topdown and slots event counting.

But you just told us that 0 is wrong too (too cold).

I'm still confused by all this; when exactly does:

> NMI
> ======
> 
> The METRICS register may be overflow. The bit 48 of STATUS register
> will be set. If so, update all active slots and metrics events.

that happen? It would be useful to get that METRIC_OVF (can we please
start naming them; 62,55,48 is past silly) at the exact point
where PERF_METRICS is saturated.

If this is so; then we can use this to update/reset PERF_METRICS and
nothing else.

That is; leave the SLOTS programming alone; use -max_period as usual.

Then on METRIC_OVF, read PERF_METRICS and clear it, and update all the
metric events by adding slots_delta * frac / 256 -- where slots_delta is
the SLOTS count since the last METRIC_OVF.

On read; read PERF_METRICS -- BUT DO NOT RESET -- and compute an
intermediate delta and add that to whatever stable count we had.

Maybe something like:

	do {
		count1 = local64_read(&event->count);
		barrier();
		metrics = read_perf_metrics();
		barrier();
		count2 = local64_read(event->count);
	} while (count1 != count2);

	/* no METRIC_OVF happened and {count,metrics} is consistent */

	return count1 + (slots_delta * frac / 256);

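Peter's retry loop can be modelled in userspace. The sketch below only illustrates the consistency scheme: the struct fields stand in for local64_read() and the PERF_METRICS read, the names are illustrative, and it omits the barrier()s and real concurrency a kernel implementation needs:

```c
#include <assert.h>
#include <stdint.h>

/* Toy state; in the kernel this is event->count plus the METRICS MSR. */
struct topdown_state {
	uint64_t count;		/* stable count at the last METRIC_OVF */
	uint64_t slots_delta;	/* SLOTS counted since the last METRIC_OVF */
	uint64_t frac;		/* 8-bit metric fraction, 0..256 */
};

static uint64_t topdown_read(const struct topdown_state *s)
{
	uint64_t count1, count2, frac;

	do {
		count1 = s->count;	/* local64_read(&event->count) */
		frac = s->frac;		/* read_perf_metrics() */
		count2 = s->count;	/* re-read; differs if an NMI hit */
	} while (count1 != count2);

	/* no METRIC_OVF happened and {count,frac} is consistent */
	return count1 + (s->slots_delta * frac) / 256;
}
```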
> The update_topdown_event() has to read two registers separately. The
> values may be modify by a NMI. PMU has to be disabled before calling the
> function.

Then there is no mucking about with that odd counter/metrics msr pair
reset nonsense. Because that really stinks.


* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-26 14:47 ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics kan.liang
  2019-08-28 15:02   ` Peter Zijlstra
@ 2019-08-28 15:19   ` Peter Zijlstra
  2019-08-28 16:11     ` [PATCH] x86/math64: Provide a sane mul_u64_u32_div() implementation for x86_64 Peter Zijlstra
                       ` (2 more replies)
  2019-08-30 23:18   ` Stephane Eranian
  2 siblings, 3 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28 15:19 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Mon, Aug 26, 2019 at 07:47:35AM -0700, kan.liang@linux.intel.com wrote:
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 54534ff00940..1ae23db5c2d7 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -76,6 +76,8 @@ u64 x86_perf_event_update(struct perf_event *event)
>  	if (idx == INTEL_PMC_IDX_FIXED_BTS)
>  		return 0;
>  
> +	if (is_topdown_count(event) && x86_pmu.update_topdown_event)
> +		return x86_pmu.update_topdown_event(event);

If is_topdown_count() is true; it had better bloody well have that
function. But I really hate this.

>  	/*
>  	 * Careful: an NMI might modify the previous event value.
>  	 *
> @@ -1003,6 +1005,10 @@ static int collect_events(struct cpu_hw_events *cpuc, struct perf_event *leader,
>  
>  	max_count = x86_pmu.num_counters + x86_pmu.num_counters_fixed;
>  
> +	/* There are 4 TopDown metrics events. */
> +	if (x86_pmu.intel_cap.perf_metrics)
> +		max_count += 4;

I'm thinking this is wrong.. this unconditionally allows collecting 4
extra events on every part that has this metrics crud on.

>  	/* current number of events already accepted */
>  	n = cpuc->n_events;
>  
> @@ -1184,6 +1190,10 @@ int x86_perf_event_set_period(struct perf_event *event)
>  	if (idx == INTEL_PMC_IDX_FIXED_BTS)
>  		return 0;
>  
> +	if (unlikely(is_topdown_count(event)) &&
> +	    x86_pmu.set_topdown_event_period)
> +		return x86_pmu.set_topdown_event_period(event);

Same as with the other method; and I similarly hates it.

> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index f4d6335a18e2..616313d7f3d7 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c

> +static int icl_set_topdown_event_period(struct perf_event *event)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +	s64 left = local64_read(&hwc->period_left);
> +
> +	/*
> +	 * Clear PERF_METRICS and Fixed counter 3 in initialization.
> +	 * After that, both MSRs will be cleared for each read.
> +	 * Don't need to clear them again.
> +	 */
> +	if (left == x86_pmu.max_period) {
> +		wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
> +		wrmsrl(MSR_PERF_METRICS, 0);
> +		local64_set(&hwc->period_left, 0);
> +	}

This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll
never trigger the overflow there; this then seems to suggest the actual
counter value is irrelevant. Therefore you don't actually need this.

> +
> +	perf_event_update_userpage(event);
> +
> +	return 0;
> +}
> +
> +static u64 icl_get_metrics_event_value(u64 metric, u64 slots, int idx)
> +{
> +	u32 val;
> +
> +	/*
> +	 * The metric is reported as an 8bit integer percentage

s/percentage/fraction/

percentage is 1/100, 8bit is 256.

> +	 * summing up to 0xff.
> +	 * slots-in-metric = (Metric / 0xff) * slots
> +	 */
> +	val = (metric >> ((idx - INTEL_PMC_IDX_FIXED_METRIC_BASE) * 8)) & 0xff;
> +	return  mul_u64_u32_div(slots, val, 0xff);

This really utterly blows.. I'd have wasted range to be able to do a
power-of-two fraction here. That is use 8 bits with a range [0,128].

But also; x86_64 seems to lack a sane implementation of that function,
and it currently compiles into utter crap (it can be 2 instructions).

> +}

> +/*
> + * Update all active Topdown events.
> + * PMU has to be disabled before calling this function.

Forgets to explain why...

> + */

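The decode step under review is simple enough to demonstrate standalone. This sketch mirrors the quoted icl_get_metrics_event_value() arithmetic, using a plain 64-bit division in place of mul_u64_u32_div() (so it would overflow for very large slots counts); the constant 48 for the metric base index is taken from the earlier hunk:

```c
#include <assert.h>
#include <stdint.h>

#define INTEL_PMC_IDX_FIXED_METRIC_BASE	48

/*
 * PERF_METRICS packs four 8-bit fractions (retiring, bad-spec,
 * fe-bound, be-bound), scaled so that the byte values sum to 0xff.
 * Decode one metric byte and scale it back to a slot count.
 */
static uint64_t metric_to_slots(uint64_t metric, uint64_t slots, int idx)
{
	uint32_t val;

	val = (metric >> ((idx - INTEL_PMC_IDX_FIXED_METRIC_BASE) * 8)) & 0xff;
	/* slots-in-metric = slots * (val / 0xff) */
	return (slots * val) / 0xff;
}
```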

* [PATCH] x86/math64: Provide a sane mul_u64_u32_div() implementation for x86_64
  2019-08-28 15:19   ` Peter Zijlstra
@ 2019-08-28 16:11     ` Peter Zijlstra
  2019-08-29  9:30       ` Peter Zijlstra
  2019-08-28 16:17     ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics Andi Kleen
  2019-08-29 13:31     ` Liang, Kan
  2 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28 16:11 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian,
	alexander.shishkin, ak, x86

On Wed, Aug 28, 2019 at 05:19:21PM +0200, Peter Zijlstra wrote:
> On Mon, Aug 26, 2019 at 07:47:35AM -0700, kan.liang@linux.intel.com wrote:

> > +	return  mul_u64_u32_div(slots, val, 0xff);
> 
> But also; x86_64 seems to lack a sane implementation of that function,
> and it currently compiles into utter crap (it can be 2 instructions).

---
Subject: x86/math64: Provide a sane mul_u64_u32_div() implementation for x86_64
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed Aug 28 17:39:46 CEST 2019

On x86_64 we can do a u64 * u64 -> u128 widening multiply followed by
a u128 / u64 -> u64 division to implement a sane version of
mul_u64_u32_div().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/div64.h |   13 +++++++++++++
 1 file changed, 13 insertions(+)

--- a/arch/x86/include/asm/div64.h
+++ b/arch/x86/include/asm/div64.h
@@ -73,6 +73,19 @@ static inline u64 mul_u32_u32(u32 a, u32
 
 #else
 # include <asm-generic/div64.h>
+
+static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 div)
+{
+	u64 q;
+
+	asm ("mulq %2; divq %3" : "=a" (q)
+				: "a" (a), "rm" (mul), "rm" (div)
+				: "rdx");
+
+	return q;
+}
+#define mul_u64_u32_div	mul_u64_u32_div
+
 #endif /* CONFIG_X86_32 */
 
 #endif /* _ASM_X86_DIV64_H */


* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-28 15:19   ` Peter Zijlstra
  2019-08-28 16:11     ` [PATCH] x86/math64: Provide a sane mul_u64_u32_div() implementation for x86_64 Peter Zijlstra
@ 2019-08-28 16:17     ` Andi Kleen
  2019-08-28 16:28       ` Peter Zijlstra
  2019-08-29 13:31     ` Liang, Kan
  2 siblings, 1 reply; 39+ messages in thread
From: Andi Kleen @ 2019-08-28 16:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kan.liang, acme, mingo, linux-kernel, tglx, jolsa, eranian,
	alexander.shishkin

> This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll
> never trigger the overflow there; this then seems to suggest the actual

The 48bit counter might overflow in a few hours.

-Andi


* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-28 16:17     ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics Andi Kleen
@ 2019-08-28 16:28       ` Peter Zijlstra
  2019-08-29  3:11         ` Andi Kleen
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-28 16:28 UTC (permalink / raw)
  To: Andi Kleen
  Cc: kan.liang, acme, mingo, linux-kernel, tglx, jolsa, eranian,
	alexander.shishkin

On Wed, Aug 28, 2019 at 09:17:54AM -0700, Andi Kleen wrote:
> > This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll
> > never trigger the overflow there; this then seems to suggest the actual
> 
> The 48bit counter might overflow in a few hours.

Sure; the point is? Kan said it should not be too big; a full 48bit wrap
around must be too big or nothing is.


* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-28 15:02   ` Peter Zijlstra
@ 2019-08-28 19:04     ` Andi Kleen
  2019-08-31  9:19       ` Peter Zijlstra
  2019-08-28 19:35     ` Liang, Kan
  1 sibling, 1 reply; 39+ messages in thread
From: Andi Kleen @ 2019-08-28 19:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kan.liang, acme, mingo, linux-kernel, tglx, jolsa, eranian,
	alexander.shishkin

On Wed, Aug 28, 2019 at 05:02:38PM +0200, Peter Zijlstra wrote:
> > 
> > To avoid reading the METRICS register multiple times, the metrics and
> > slots value can only be updated by the first slots/metrics event in a
> > group. All active slots and metrics events will be updated one time.
> 
> Can't we require SLOTS to be the group leader for any Metric group?

Metrics are really useful to collect with any other sampling event
to give you more context in the program behavior.

> Is there ever a case where we want to add other events to a metric
> group?

Yes. Any normal leader sampling case. You just collect metrics too.

> it also mustn't be too cold. But that leaves it unspecified what exactly
> is the right range.
> 
> IOW, you want a Goldilocks number of SLOTS.

That's too strong. You probably don't want minutes, but seconds
worth of measurement should be ok.

> > NMI
> > ======
> > 
> > The METRICS register may be overflow. The bit 48 of STATUS register
> > will be set. If so, update all active slots and metrics events.
> 
> that happen? It would be useful to get that METRIC_OVF (can we please

This happens when the internal counters that feed the metrics
overflow.

> If this is so; then we can use this to update/reset PERF_METRICS and
> nothing else.

It has to be handled in the PMI.

> Then there is no mucking about with that odd counter/metrics msr pair
> reset nonsense. Becuase that really stinks.

You have to write them to reset the internal counters.

-Andi



* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-28 15:02   ` Peter Zijlstra
  2019-08-28 19:04     ` Andi Kleen
@ 2019-08-28 19:35     ` Liang, Kan
  1 sibling, 0 replies; 39+ messages in thread
From: Liang, Kan @ 2019-08-28 19:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak



On 8/28/2019 11:02 AM, Peter Zijlstra wrote:
>> Reset
>> ======
>>
>> The PERF_METRICS and Fixed counter 3 have to be reset for each read,
>> because:
>> - The 8bit metrics ratio values lose precision when the measurement
>>    period gets longer.
> So it mustn't be too hot,
> 
>> - The PERF_METRICS may report wrong value if its delta was less than
>>    1/255 of SLOTS (Fixed counter 3).

The "delta" is actually for the internal counters. Sorry for the
confusion.

> it also mustn't be too cold. But that leaves it unspecified what exactly
> is the right range.
> 
> IOW, you want a Goldilocks number of SLOTS.
> 
>> Also, for counting, the -max_period is the initial value of the SLOTS.
>> The huge initial value will definitely trigger the issue mentioned
>> above. Force initial value as 0 for topdown and slots event counting.
> But you just told us that 0 is wrong too (too cold).

We set -max_period (0x8000 0000 0001) as the initial value of FIXCTR3 for
counting. But the internal counters probably start counting from 0.
PERF_METRICS is the fraction internal counters / FIXCTR3,
so PERF_METRICS will never work in that case.
We have to make FIXCTR3 count from 0 as well.

Thanks,
Kan


* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-28 16:28       ` Peter Zijlstra
@ 2019-08-29  3:11         ` Andi Kleen
  2019-08-29  9:17           ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Andi Kleen @ 2019-08-29  3:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kan.liang, acme, mingo, linux-kernel, tglx, jolsa, eranian,
	alexander.shishkin

On Wed, Aug 28, 2019 at 06:28:57PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 28, 2019 at 09:17:54AM -0700, Andi Kleen wrote:
> > > This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll
> > > never trigger the overflow there; this then seems to suggest the actual
> > 
> > The 48bit counter might overflow in a few hours.
> 
> Sure; the point is? Kan said it should not be too big; a full 48bit wrap
> around must be too big or nothing is.

We expect users to avoid making it too big, but we cannot rule it out.

Actually for the common case of perf stat over a long time you can make it
fairly big, because the percentages still work out over the complete run time.

The problem with letting it accumulate too much is mainly if you
want to use a continuously running metrics counter to time smaller
regions by taking deltas, for example using RDPMC. 

In this case the small regions would be too inaccurate after some time.

But to make the first case work we absolutely need to handle the overflow
case. Otherwise the metrics would just randomly stop at some
point.


-Andi





* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-29  3:11         ` Andi Kleen
@ 2019-08-29  9:17           ` Peter Zijlstra
  0 siblings, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-29  9:17 UTC (permalink / raw)
  To: Andi Kleen
  Cc: kan.liang, acme, mingo, linux-kernel, tglx, jolsa, eranian,
	alexander.shishkin

On Wed, Aug 28, 2019 at 08:11:51PM -0700, Andi Kleen wrote:
> On Wed, Aug 28, 2019 at 06:28:57PM +0200, Peter Zijlstra wrote:
> > On Wed, Aug 28, 2019 at 09:17:54AM -0700, Andi Kleen wrote:
> > > > This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll
> > > > never trigger the overflow there; this then seems to suggest the actual
> > > 
> > > The 48bit counter might overflow in a few hours.
> > 
> > Sure; the point is? Kan said it should not be too big; a full 48bit wrap
> > around must be too big or nothing is.
> 
> We expect users to avoid making it too big, but we cannot rule it out.
> 
> Actually for the common perf stat for a long time case you can make it fairly
> big because the percentages still work out over the complete run time.
> 
> The problem with letting it accumulate too much is mainly if you
> want to use a continuously running metrics counter to time smaller
> regions by taking deltas, for example using RDPMC. 
> 
> In this case the small regions would be too inaccurate after some time.
> 
> But to make the first case work absolutely need to handle the overflow
> case. Otherwise the metrics would just randomly stop at some
> point.

But then you need -max_period, not 0. By using half the period, there is
no ambiguity on overflow.


* Re: [PATCH] x86/math64: Provide a sane mul_u64_u32_div() implementation for x86_64
  2019-08-28 16:11     ` [PATCH] x86/math64: Provide a sane mul_u64_u32_div() implementation for x86_64 Peter Zijlstra
@ 2019-08-29  9:30       ` Peter Zijlstra
  0 siblings, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-29  9:30 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian,
	alexander.shishkin, ak, x86

On Wed, Aug 28, 2019 at 06:11:23PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 28, 2019 at 05:19:21PM +0200, Peter Zijlstra wrote:
> > On Mon, Aug 26, 2019 at 07:47:35AM -0700, kan.liang@linux.intel.com wrote:
> 
> > > +	return  mul_u64_u32_div(slots, val, 0xff);
> > 
> > But also; x86_64 seems to lack a sane implementation of that function,
> > and it currently compiles into utter crap (it can be 2 instructions).

This one actually builds defconfig :-)

---
Subject: x86/math64: Provide a sane mul_u64_u32_div() implementation for x86_64
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed Aug 28 17:39:46 CEST 2019

On x86_64 we can do a u64 * u64 -> u128 widening multiply followed by
a u128 / u64 -> u64 division to implement a sane version of
mul_u64_u32_div().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/div64.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/include/asm/div64.h b/arch/x86/include/asm/div64.h
index 20a46150e0a8..9b8cb50768c2 100644
--- a/arch/x86/include/asm/div64.h
+++ b/arch/x86/include/asm/div64.h
@@ -73,6 +73,19 @@ static inline u64 mul_u32_u32(u32 a, u32 b)
 
 #else
 # include <asm-generic/div64.h>
+
+static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 div)
+{
+	u64 q;
+
+	asm ("mulq %2; divq %3" : "=a" (q)
+				: "a" (a), "rm" ((u64)mul), "rm" ((u64)div)
+				: "rdx");
+
+	return q;
+}
+#define mul_u64_u32_div	mul_u64_u32_div
+
 #endif /* CONFIG_X86_32 */
 
 #endif /* _ASM_X86_DIV64_H */

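For reference, the same arithmetic can be written without inline assembly on compilers that provide __uint128_t (GCC and Clang do on x86_64). This is an equivalent sketch of the operation, not the patch itself; like divq, it traps if the quotient does not fit in 64 bits:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Widen to 128 bits for the multiply, then divide back down.
 * On x86_64 this compiles to the same mulq/divq pair the inline
 * asm above spells out by hand.
 */
static uint64_t mul_u64_u32_div(uint64_t a, uint32_t mul, uint32_t div)
{
	return (uint64_t)(((__uint128_t)a * mul) / div);
}
```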

* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-28 15:19   ` Peter Zijlstra
  2019-08-28 16:11     ` [PATCH] x86/math64: Provide a sane mul_u64_u32_div() implementation for x86_64 Peter Zijlstra
  2019-08-28 16:17     ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics Andi Kleen
@ 2019-08-29 13:31     ` Liang, Kan
  2019-08-29 13:52       ` Peter Zijlstra
  2 siblings, 1 reply; 39+ messages in thread
From: Liang, Kan @ 2019-08-29 13:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak



On 8/28/2019 11:19 AM, Peter Zijlstra wrote:
>> +static int icl_set_topdown_event_period(struct perf_event *event)
>> +{
>> +	struct hw_perf_event *hwc = &event->hw;
>> +	s64 left = local64_read(&hwc->period_left);
>> +
>> +	/*
>> +	 * Clear PERF_METRICS and Fixed counter 3 in initialization.
>> +	 * After that, both MSRs will be cleared for each read.
>> +	 * Don't need to clear them again.
>> +	 */
>> +	if (left == x86_pmu.max_period) {
>> +		wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
>> +		wrmsrl(MSR_PERF_METRICS, 0);
>> +		local64_set(&hwc->period_left, 0);
>> +	}
> This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll
> never trigger the overflow there; this then seems to suggest the actual
> counter value is irrelevant. Therefore you don't actually need this.
> 

Could you please elaborate on why initialization to 0 never triggers an 
overflow?
To my understanding, initialization to 0 only means that it will take 
more time than initialization to -max_period (0x8000 0000 0001) to 
trigger an overflow.

Maybe 0 is too tricky. We can set the initial value to 1.
I think the bottom line is that we need a small initial value for 
FIXED_CTR3 here.
PERF_METRICS reports an 8bit integer fraction which is something like 
0xff * internal counters / FIXCTR3.
The internal counters only start counting from 0. (SW cannot set an 
arbitrary initial value for internal counters.)
If the initial value of FIXED_CTR3 is too big, PERF_METRICS could always 
remain constant, e.g. 0.
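Kan's rounding argument is easy to see with numbers. The model below assumes the simplified formula from the paragraph above (0xff * internal counters / FIXCTR3); the real behaviour is internal to the PMU, so this is purely illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: metric byte = 0xff * internal_counter / FIXCTR3. */
static uint64_t model_perf_metric(uint64_t internal, uint64_t fixctr3)
{
	return (0xffULL * internal) / fixctr3;
}
```

Seeding FIXCTR3 with -max_period (~2^47) dwarfs the internal counters, which start at 0, so every metric byte rounds to 0; with FIXCTR3 also counting from 0 the fractions come out as expected.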

Thanks,
Kan


* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-29 13:31     ` Liang, Kan
@ 2019-08-29 13:52       ` Peter Zijlstra
  2019-08-29 16:56         ` Liang, Kan
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-29 13:52 UTC (permalink / raw)
  To: Liang, Kan
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Thu, Aug 29, 2019 at 09:31:37AM -0400, Liang, Kan wrote:
> On 8/28/2019 11:19 AM, Peter Zijlstra wrote:
> > > +static int icl_set_topdown_event_period(struct perf_event *event)
> > > +{
> > > +	struct hw_perf_event *hwc = &event->hw;
> > > +	s64 left = local64_read(&hwc->period_left);
> > > +
> > > +	/*
> > > +	 * Clear PERF_METRICS and Fixed counter 3 in initialization.
> > > +	 * After that, both MSRs will be cleared for each read.
> > > +	 * Don't need to clear them again.
> > > +	 */
> > > +	if (left == x86_pmu.max_period) {
> > > +		wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
> > > +		wrmsrl(MSR_PERF_METRICS, 0);
> > > +		local64_set(&hwc->period_left, 0);
> > > +	}
> > This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll
> > never trigger the overflow there; this then seems to suggest the actual
> > counter value is irrelevant. Therefore you don't actually need this.
> > 
> 
> Could you please elaborate on why initialization to 0 never triggers an
> overflow?

Well, 'never' as in a 'long' time.

> To my understanding, initialization to 0 only means that it will take
> more time than initialization to -max_period (0x8000 0000 0001) to trigger
> an overflow.

Only twice as long. And why do we care about that?

The problem with it is though that you get the overflow at the end of
the whole period, instead of halfway through, so reconstruction is
trickier.

> Maybe 0 is too tricky. We can set the initial value to 1.

That's even worse. I'm still not understanding why we can't use the
normal code.

> I think the bottom line is that we need a small initial value for FIXED_CTR3
> here.

But why?!

> PERF_METRICS reports an 8bit integer fraction which is something like 0xff *
> internal counters / FIXCTR3.
> The internal counters only start counting from 0. (SW cannot set an
> arbitrary initial value for internal counters.)
> If the initial value of FIXED_CTR3 is too big, PERF_METRICS could always
> remain constant, e.g. 0.

What what? The PERF_METRICS contents depend on the FIXCTR3 value?!
That's bloody insane. /me goes find the SDM. The SDM is bloody useless
:-(.

Please give a complete and coherent description of all of this. I can't
very well review any of this until I know how the hardware works, now
can I.

In this write-up, include the exact condition for METRICS_OVF (the SDM
states: 'it indicates that PERF_METRIC counter has overflowed', which is
grammatically incorrect and makes no sense even with the missing article
injected).

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-29 13:52       ` Peter Zijlstra
@ 2019-08-29 16:56         ` Liang, Kan
  2019-08-31  9:18           ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Liang, Kan @ 2019-08-29 16:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak



On 8/29/2019 9:52 AM, Peter Zijlstra wrote:
> On Thu, Aug 29, 2019 at 09:31:37AM -0400, Liang, Kan wrote:
>> On 8/28/2019 11:19 AM, Peter Zijlstra wrote:
>>>> +static int icl_set_topdown_event_period(struct perf_event *event)
>>>> +{
>>>> +	struct hw_perf_event *hwc = &event->hw;
>>>> +	s64 left = local64_read(&hwc->period_left);
>>>> +
>>>> +	/*
>>>> +	 * Clear PERF_METRICS and Fixed counter 3 in initialization.
>>>> +	 * After that, both MSRs will be cleared for each read.
>>>> +	 * Don't need to clear them again.
>>>> +	 */
>>>> +	if (left == x86_pmu.max_period) {
>>>> +		wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
>>>> +		wrmsrl(MSR_PERF_METRICS, 0);
>>>> +		local64_set(&hwc->period_left, 0);
>>>> +	}
>>> This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll
>>> never trigger the overflow there; this then seems to suggest the actual
>>> counter value is irrelevant. Therefore you don't actually need this.
>>>
>>
>> Could you please elaborate on why initialization to 0 never triggers an
>> overflow?
> 
> Well, 'never' as in a 'long' time.
> 
>> As I understand it, initialization to 0 only means that it will take
>> more time than initialization to -max_period (0x8000 0000 0001) to trigger
>> an overflow.
> 
> Only twice as long. And why do we care about that?
> 
> The problem with it is though that you get the overflow at the end of
> the whole period, instead of halfway through, so reconstruction is
> trickier.
> 
>> Maybe 0 is too tricky. We can set the initial value to 1.
> 
> That's even worse. I'm still not understanding why we can't use the
> normal code.
> 
>> I think the bottom line is that we need a small initial value for FIXED_CTR3
>> here.
> 
> But why?!
> 
>> PERF_METRICS reports an 8bit integer fraction which is something like 0xff *
>> internal counters / FIXCTR3.
>> The internal counters only start counting from 0. (SW cannot set an
>> arbitrary initial value for internal counters.)
>> If the initial value of FIXED_CTR3 is too big, PERF_METRICS could always
>> remain constant, e.g. 0.
> 
> What what? The PERF_METRICS contents depends on the FIXCTR3 value ?!

Yes.

In the current implementation, the PERF_METRICS MSR is composed of four 
fields: backend bound, frontend bound, bad speculation and retiring.
Each field is populated using a formula of the following form, e.g.:
PERF_METRICS[RETIRING] = (0xFF * 
PERF_METRICS_RETIRING_INTERNAL_48bit_COUNTER)
/ FIXCTR3

METRICS_OVF indicates an overflow of any of the internal counters.

The internal counters always start counting from 0 and cannot be 
programmed by SW. But resetting PERF_METRICS implicitly resets the 
internal counters as well.
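The layout described here can be illustrated with a small user-space
sketch. This is a hypothetical model with made-up helper names, not the
kernel implementation; the field order (retiring, bad speculation, FE
bound, BE bound in the low four bytes) follows the constraint table in
the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the PERF_METRICS layout: four 8-bit fields,
 * each holding 0xFF * internal_counter / FIXCTR3.
 * idx 0..3: retiring, bad speculation, FE bound, BE bound.
 */
static uint64_t metric_field(uint64_t perf_metrics, int idx)
{
	return (perf_metrics >> (idx * 8)) & 0xff;
}

/*
 * Reconstruct the slots attributed to one metric, the way the driver
 * does: slots-in-metric = (field / 0xFF) * SLOTS.
 */
static uint64_t slots_in_metric(uint64_t perf_metrics, int idx,
				uint64_t slots)
{
	return slots * metric_field(perf_metrics, idx) / 0xff;
}
```

So a field value of 0xff attributes all SLOTS to that metric, and 0x80
attributes roughly half, with truncation from the integer division.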

Thanks,
Kan


> That's bloody insane. /me goes find the SDM. The SDM is bloody useless
> :-(.
> 
> Please give a complete and coherent description of all of this. I can't
> very well review any of this until I know how the hardware works, now
> can I.
> 
> In this write-up, include the exact condition for METRICS_OVF (the SDM
> states: 'it indicates that PERF_METRIC counter has overflowed', which is
> grammatically incorrect and makes no sense even with the missing article
> injected).
>





^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-26 14:47 ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics kan.liang
  2019-08-28 15:02   ` Peter Zijlstra
  2019-08-28 15:19   ` Peter Zijlstra
@ 2019-08-30 23:18   ` Stephane Eranian
  2019-08-31  0:31     ` Andi Kleen
  2 siblings, 1 reply; 39+ messages in thread
From: Stephane Eranian @ 2019-08-30 23:18 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar, LKML,
	Thomas Gleixner, Jiri Olsa, Alexander Shishkin, Andi Kleen

Hi,

On Mon, Aug 26, 2019 at 7:48 AM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> Intro
> =====
>
> Icelake has support for measuring the four top level TopDown metrics
> directly in hardware. This is implemented by an additional "metrics"
> register, and a new Fixed Counter 3 that measures pipeline "slots".
>
> Events
> ======
>
> We export four metric events as separate perf events, which map to
> internal "metrics" counter register. Those events do not exist in
> hardware, but can be allocated by the scheduler.
>
There is another possible approach for supporting Topdown-style counters.
Instead of trying to abstract them as separate events to the user,
putting them back together in the kernel, and then using slots to scale
them as counts, we could just expose them as is, i.e., as structured
counter values. The kernel already handles structured counter configs and
exports the config fields via sysfs, and the perf tool picks them up and
can encode any event. We could take a similar approach for a counter
value: it could have fields, units and types, and perf stat would pick
them up in the same manner. It would greatly simplify the kernel
implementation. You would need to publish a pseudo-event code for each
group of metrics. Note that I am not advocating exposing the raw counter
value. That way you would maintain one event code -> one "counter" on hw.
The reset on read would also work. It would generate only one rdmsr
per read, without forcing any grouping or using slots under the hood to
scale. If PERF_METRICS gets extended, you can just add another
pseudo-event code or umask.

The PERF_METRICS events do not make real sense in isolation, and the
SLOTS scaling is hard to interpret. You can never profile on a
PERF_METRICS event, so keeping them grouped is okay.


> For the event mapping we use a special 0x00 event code, which is
> reserved for fake events. The metric events start from umask 0x10.
>
> When setting up such events they point to the slots counter, and a
> special callback, update_topdown_event(), reads the additional metrics
> msr to generate the metrics. Then the metric is reported by multiplying
> the metric (percentage) with slots.
>
> This multiplication makes it easy to keep a running count, for example
> when the slots counter overflows, and makes all the standard tools, such
> as perf stat, work. They can compute deltas of the values without needing
> to know about percentages. This also simplifies accumulating the counts
> of child events, which otherwise would need to know how to average
> percentage values.
>
> All four metric events don't support sampling. Since they will be
> handled specially for event update, a flag PERF_X86_EVENT_TOPDOWN is
> introduced to indicate this case.
>
> The slots event can support both sampling and counting.
> For counting, the flag is also applied.
> For sampling, it will be handled normally as other normal events.
>
> Groups
> ======
>
> To avoid reading the METRICS register multiple times, the metrics and
> slots value can only be updated by the first slots/metrics event in a
> group. All active slots and metrics events will be updated one time.
>
> Reset
> ======
>
> The PERF_METRICS and Fixed counter 3 have to be reset for each read,
> because:
> - The 8bit metrics ratio values lose precision when the measurement
>   period gets longer.
> - The PERF_METRICS may report a wrong value if its delta was less than
>   1/255 of SLOTS (Fixed counter 3).
>
> Also, for counting, -max_period is the initial value of SLOTS. This
> huge initial value will definitely trigger the issue mentioned above,
> so force the initial value to 0 for topdown and slots event counting.
>
> NMI
> ======
>
> The METRICS register may overflow. If it does, bit 48 of the STATUS
> register will be set, and all active slots and metrics events are updated.
>
> The update_topdown_event() has to read two registers separately. The
> values may be modified by an NMI. The PMU has to be disabled before
> calling the function.
>
> RDPMC
> ======
>
> RDPMC is temporarily disabled. The following patch will enable it.
>
> Originally-by: Andi Kleen <ak@linux.intel.com>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> ---
>  arch/x86/events/core.c           |  10 ++
>  arch/x86/events/intel/core.c     | 230 ++++++++++++++++++++++++++++++-
>  arch/x86/events/perf_event.h     |  17 +++
>  arch/x86/include/asm/msr-index.h |   2 +
>  4 files changed, 255 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 54534ff00940..1ae23db5c2d7 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -76,6 +76,8 @@ u64 x86_perf_event_update(struct perf_event *event)
>         if (idx == INTEL_PMC_IDX_FIXED_BTS)
>                 return 0;
>
> +       if (is_topdown_count(event) && x86_pmu.update_topdown_event)
> +               return x86_pmu.update_topdown_event(event);
>         /*
>          * Careful: an NMI might modify the previous event value.
>          *
> @@ -1003,6 +1005,10 @@ static int collect_events(struct cpu_hw_events *cpuc, struct perf_event *leader,
>
>         max_count = x86_pmu.num_counters + x86_pmu.num_counters_fixed;
>
> +       /* There are 4 TopDown metrics events. */
> +       if (x86_pmu.intel_cap.perf_metrics)
> +               max_count += 4;
> +
>         /* current number of events already accepted */
>         n = cpuc->n_events;
>
> @@ -1184,6 +1190,10 @@ int x86_perf_event_set_period(struct perf_event *event)
>         if (idx == INTEL_PMC_IDX_FIXED_BTS)
>                 return 0;
>
> +       if (unlikely(is_topdown_count(event)) &&
> +           x86_pmu.set_topdown_event_period)
> +               return x86_pmu.set_topdown_event_period(event);
> +
>         /*
>          * If we are way outside a reasonable range then just skip forward:
>          */
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index f4d6335a18e2..616313d7f3d7 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -247,6 +247,10 @@ static struct event_constraint intel_icl_event_constraints[] = {
>         FIXED_EVENT_CONSTRAINT(0x003c, 1),      /* CPU_CLK_UNHALTED.CORE */
>         FIXED_EVENT_CONSTRAINT(0x0300, 2),      /* CPU_CLK_UNHALTED.REF */
>         FIXED_EVENT_CONSTRAINT(0x0400, 3),      /* SLOTS */
> +       METRIC_EVENT_CONSTRAINT(0x1000, 0),     /* Retiring metric */
> +       METRIC_EVENT_CONSTRAINT(0x1100, 1),     /* Bad speculation metric */
> +       METRIC_EVENT_CONSTRAINT(0x1200, 2),     /* FE bound metric */
> +       METRIC_EVENT_CONSTRAINT(0x1300, 3),     /* BE bound metric */
>         INTEL_EVENT_CONSTRAINT_RANGE(0x03, 0x0a, 0xf),
>         INTEL_EVENT_CONSTRAINT_RANGE(0x1f, 0x28, 0xf),
>         INTEL_EVENT_CONSTRAINT(0x32, 0xf),      /* SW_PREFETCH_ACCESS.* */
> @@ -267,6 +271,14 @@ static struct extra_reg intel_icl_extra_regs[] __read_mostly = {
>         INTEL_UEVENT_EXTRA_REG(0x01bb, MSR_OFFCORE_RSP_1, 0x3fffff9fffull, RSP_1),
>         INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),
>         INTEL_UEVENT_EXTRA_REG(0x01c6, MSR_PEBS_FRONTEND, 0x7fff17, FE),
> +       /*
> +        * The original Fixed Ctr 3 is shared by the different metrics
> +        * events. Use the extra reg to enforce the same
> +        * configuration on the original register, but do not actually
> +        * write to it.
> +        */
> +       INTEL_UEVENT_EXTRA_REG(0x0400, 0, -1L, TOPDOWN),
> +       INTEL_UEVENT_TOPDOWN_EXTRA_REG(0x1000),
>         EVENT_EXTRA_END
>  };
>
> @@ -2190,10 +2202,163 @@ static void intel_pmu_del_event(struct perf_event *event)
>                 intel_pmu_pebs_del(event);
>  }
>
> +static inline bool is_metric_event(struct perf_event *event)
> +{
> +       return ((event->attr.config & ARCH_PERFMON_EVENTSEL_EVENT) == 0) &&
> +               ((event->attr.config & INTEL_ARCH_EVENT_MASK) >= 0x1000)  &&
> +               ((event->attr.config & INTEL_ARCH_EVENT_MASK) <= 0x1300);
> +}
> +
> +static inline bool is_slots_event(struct perf_event *event)
> +{
> +       return (event->attr.config & INTEL_ARCH_EVENT_MASK) == 0x0400;
> +}
> +
> +static inline bool is_topdown_event(struct perf_event *event)
> +{
> +       return is_metric_event(event) || is_slots_event(event);
> +}
> +
> +static bool is_first_topdown_event_in_group(struct perf_event *event)
> +{
> +       struct perf_event *first = NULL;
> +
> +       if (is_topdown_event(event->group_leader))
> +               first = event->group_leader;
> +       else {
> +               for_each_sibling_event(first, event->group_leader)
> +                       if (is_topdown_event(first))
> +                               break;
> +       }
> +
> +       if (event == first)
> +               return true;
> +
> +       return false;
> +}
> +
> +static int icl_set_topdown_event_period(struct perf_event *event)
> +{
> +       struct hw_perf_event *hwc = &event->hw;
> +       s64 left = local64_read(&hwc->period_left);
> +
> +       /*
> +        * Clear PERF_METRICS and Fixed counter 3 in initialization.
> +        * After that, both MSRs will be cleared for each read.
> +        * Don't need to clear them again.
> +        */
> +       if (left == x86_pmu.max_period) {
> +               wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
> +               wrmsrl(MSR_PERF_METRICS, 0);
> +               local64_set(&hwc->period_left, 0);
> +       }
> +
> +       perf_event_update_userpage(event);
> +
> +       return 0;
> +}
> +
> +static u64 icl_get_metrics_event_value(u64 metric, u64 slots, int idx)
> +{
> +       u32 val;
> +
> +       /*
> +        * The metric is reported as an 8bit integer percentage
> +        * summing up to 0xff.
> +        * slots-in-metric = (Metric / 0xff) * slots
> +        */
> +       val = (metric >> ((idx - INTEL_PMC_IDX_FIXED_METRIC_BASE) * 8)) & 0xff;
> +       return  mul_u64_u32_div(slots, val, 0xff);
> +}
> +
> +static void __icl_update_topdown_event(struct perf_event *event,
> +                                      u64 slots, u64 metrics)
> +{
> +       int idx = event->hw.idx;
> +       u64 delta;
> +
> +       if (is_metric_idx(idx))
> +               delta = icl_get_metrics_event_value(metrics, slots, idx);
> +       else
> +               delta = slots;
> +
> +       local64_add(delta, &event->count);
> +}
> +
> +/*
> + * Update all active Topdown events.
> + * PMU has to be disabled before calling this function.
> + */
> +static u64 icl_update_topdown_event(struct perf_event *event)
> +{
> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +       struct perf_event *other;
> +       u64 slots, metrics;
> +       int idx;
> +
> +       /*
> +        * Only need to update all events for the first
> +        * slots/metrics event in a group
> +        */
> +       if (event && !is_first_topdown_event_in_group(event))
> +               return 0;
> +
> +       /* read Fixed counter 3 */
> +       rdpmcl((3 | 1<<30), slots);
> +       if (!slots)
> +               return 0;
> +
> +       /* read PERF_METRICS */
> +       rdpmcl((1<<29), metrics);
> +
> +       for_each_set_bit(idx, cpuc->active_mask, INTEL_PMC_IDX_TD_BE_BOUND + 1) {
> +               if (!is_topdown_idx(idx))
> +                       continue;
> +               other = cpuc->events[idx];
> +               __icl_update_topdown_event(other, slots, metrics);
> +       }
> +
> +       /*
> +        * Check and update this event, which may have been cleared
> +        * in active_mask e.g. x86_pmu_stop()
> +        */
> +       if (event && !test_bit(event->hw.idx, cpuc->active_mask))
> +               __icl_update_topdown_event(event, slots, metrics);
> +
> +       /*
> +        * To avoid the known issues as below, the PERF_METRICS and
> +        * Fixed counter 3 are reset for each read.
> +        * - The 8bit metrics ratio values lose precision when the
> +        *   measurement period gets longer.
> +        * - The PERF_METRICS may report a wrong value if its delta was
> +        *   less than 1/255 of Fixed counter 3.
> +        */
> +       wrmsrl(MSR_PERF_METRICS, 0);
> +       wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
> +
> +       return slots;
> +}
> +
> +static void intel_pmu_read_topdown_event(struct perf_event *event)
> +{
> +       struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +
> +       /* Only need to call update_topdown_event() once for group read. */
> +       if ((cpuc->txn_flags & PERF_PMU_TXN_READ) &&
> +           !is_first_topdown_event_in_group(event))
> +               return;
> +
> +       perf_pmu_disable(event->pmu);
> +       x86_pmu.update_topdown_event(event);
> +       perf_pmu_enable(event->pmu);
> +}
> +
>  static void intel_pmu_read_event(struct perf_event *event)
>  {
>         if (event->hw.flags & PERF_X86_EVENT_AUTO_RELOAD)
>                 intel_pmu_auto_reload_read(event);
> +       else if (is_topdown_count(event) && x86_pmu.update_topdown_event)
> +               intel_pmu_read_topdown_event(event);
>         else
>                 x86_perf_event_update(event);
>  }
> @@ -2401,6 +2566,15 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
>                         intel_pt_interrupt();
>         }
>
> +       /*
> +        * Intel Perf metrics
> +        */
> +       if (__test_and_clear_bit(48, (unsigned long *)&status)) {
> +               handled++;
> +               if (x86_pmu.update_topdown_event)
> +                       x86_pmu.update_topdown_event(NULL);
> +       }
> +
>         /*
>          * Checkpointed counters can lead to 'spurious' PMIs because the
>          * rollback caused by the PMI will have cleared the overflow status
> @@ -3312,6 +3486,42 @@ static int intel_pmu_hw_config(struct perf_event *event)
>         if (event->attr.type != PERF_TYPE_RAW)
>                 return 0;
>
> +       /*
> +        * Config Topdown slots and metric events
> +        *
> +        * The slots event on Fixed Counter 3 can support sampling,
> +        * which will be handled normally in x86_perf_event_update().
> +        *
> +        * The metric events don't support sampling.
> +        *
> +        * For counting, topdown slots and metric events will be
> +        * handled specially for event update.
> +        * A flag PERF_X86_EVENT_TOPDOWN is applied for the case.
> +        */
> +       if (x86_pmu.intel_cap.perf_metrics && is_topdown_event(event)) {
> +               if (is_metric_event(event) && is_sampling_event(event))
> +                       return -EINVAL;
> +
> +               if (!is_sampling_event(event)) {
> +                       if (event->attr.config1 != 0)
> +                               return -EINVAL;
> +                       if (event->attr.config & ARCH_PERFMON_EVENTSEL_ANY)
> +                               return -EINVAL;
> +                       /*
> +                        * Put configuration (minus event) into config1 so that
> +                        * the scheduler enforces through an extra_reg that
> +                        * all instances of the metrics events have the same
> +                        * configuration.
> +                        */
> +                       event->attr.config1 = event->hw.config &
> +                                             X86_ALL_EVENT_FLAGS;
> +                       event->hw.flags |= PERF_X86_EVENT_TOPDOWN;
> +
> +                       if (is_metric_event(event))
> +                               event->hw.flags &= ~PERF_X86_EVENT_RDPMC_ALLOWED;
> +               }
> +       }
> +
>         if (!(event->attr.config & ARCH_PERFMON_EVENTSEL_ANY))
>                 return 0;
>
> @@ -5040,6 +5250,8 @@ __init int intel_pmu_init(void)
>                 x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xca, .umask=0x02);
>                 x86_pmu.lbr_pt_coexist = true;
>                 intel_pmu_pebs_data_source_skl(pmem);
> +               x86_pmu.update_topdown_event = icl_update_topdown_event;
> +               x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
>                 pr_cont("Icelake events, ");
>                 name = "icelake";
>                 break;
> @@ -5096,10 +5308,17 @@ __init int intel_pmu_init(void)
>                  * counter, so do not extend mask to generic counters
>                  */
>                 for_each_event_constraint(c, x86_pmu.event_constraints) {
> -                       if (c->cmask == FIXED_EVENT_FLAGS
> -                           && c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES
> -                           && c->idxmsk64 != INTEL_PMC_MSK_FIXED_SLOTS) {
> -                               c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
> +                       if (c->cmask == FIXED_EVENT_FLAGS) {
> +                               /*
> +                                * Don't extend topdown slots and metrics
> +                                * events to generic counters.
> +                                */
> +                               if (c->idxmsk64 & INTEL_PMC_MSK_TOPDOWN) {
> +                                       c->weight = hweight64(c->idxmsk64);
> +                                       continue;
> +                               }
> +                               if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
> +                                       c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
>                         }
>                         c->idxmsk64 &=
>                                 ~(~0ULL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
> @@ -5152,6 +5371,9 @@ __init int intel_pmu_init(void)
>         if (x86_pmu.counter_freezing)
>                 x86_pmu.handle_irq = intel_pmu_handle_irq_v4;
>
> +       if (x86_pmu.intel_cap.perf_metrics)
> +               x86_pmu.intel_ctrl |= 1ULL << 48;
> +
>         return 0;
>  }
>
> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> index 37f17f55ef2d..7c59f08fadc0 100644
> --- a/arch/x86/events/perf_event.h
> +++ b/arch/x86/events/perf_event.h
> @@ -40,6 +40,7 @@ enum extra_reg_type {
>         EXTRA_REG_LBR   = 2,    /* lbr_select */
>         EXTRA_REG_LDLAT = 3,    /* ld_lat_threshold */
>         EXTRA_REG_FE    = 4,    /* fe_* */
> +       EXTRA_REG_TOPDOWN = 5,  /* Topdown slots/metrics */
>
>         EXTRA_REG_MAX           /* number of entries needed */
>  };
> @@ -76,6 +77,12 @@ static inline bool constraint_match(struct event_constraint *c, u64 ecode)
>  #define PERF_X86_EVENT_EXCL_ACCT       0x0100 /* accounted EXCL event */
>  #define PERF_X86_EVENT_AUTO_RELOAD     0x0200 /* use PEBS auto-reload */
>  #define PERF_X86_EVENT_LARGE_PEBS      0x0400 /* use large PEBS */
> +#define PERF_X86_EVENT_TOPDOWN         0x0800 /* Count Topdown slots/metrics events */
> +
> +static inline bool is_topdown_count(struct perf_event *event)
> +{
> +       return event->hw.flags & PERF_X86_EVENT_TOPDOWN;
> +}
>
>  struct amd_nb {
>         int nb_id;  /* NorthBridge id */
> @@ -509,6 +516,9 @@ struct extra_reg {
>                                0xffff, \
>                                LDLAT)
>
> +#define INTEL_UEVENT_TOPDOWN_EXTRA_REG(event)  \
> +       EVENT_EXTRA_REG(event, 0, 0xfcff, -1L, TOPDOWN)
> +
>  #define EVENT_EXTRA_END EVENT_EXTRA_REG(0, 0, 0, 0, RSP_0)
>
>  union perf_capabilities {
> @@ -524,6 +534,7 @@ union perf_capabilities {
>                  */
>                 u64     full_width_write:1;
>                 u64     pebs_baseline:1;
> +               u64     perf_metrics:1;
>         };
>         u64     capabilities;
>  };
> @@ -686,6 +697,12 @@ struct x86_pmu {
>          */
>         atomic_t        lbr_exclusive[x86_lbr_exclusive_max];
>
> +       /*
> +        * Intel perf metrics
> +        */
> +       u64             (*update_topdown_event)(struct perf_event *event);
> +       int             (*set_topdown_event_period)(struct perf_event *event);
> +
>         /*
>          * AMD bits
>          */
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 78f3a5ebc1e2..460a419a7214 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -118,6 +118,8 @@
>  #define MSR_TURBO_RATIO_LIMIT1         0x000001ae
>  #define MSR_TURBO_RATIO_LIMIT2         0x000001af
>
> +#define MSR_PERF_METRICS               0x00000329
> +
>  #define MSR_LBR_SELECT                 0x000001c8
>  #define MSR_LBR_TOS                    0x000001c9
>  #define MSR_LBR_NHM_FROM               0x00000680
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-30 23:18   ` Stephane Eranian
@ 2019-08-31  0:31     ` Andi Kleen
  2019-08-31  9:13       ` Stephane Eranian
  0 siblings, 1 reply; 39+ messages in thread
From: Andi Kleen @ 2019-08-31  0:31 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Liang, Kan, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Ingo Molnar, LKML, Thomas Gleixner, Jiri Olsa,
	Alexander Shishkin

> the same manner. It would greatly simplify the kernel implementation.

I tried that originally. It was actually more complicated.

You can't really do deltas on raw metrics, and a lot of the perf
infrastructure is built around deltas.

To do the regular reset and not lose precision over time internally
you have to keep expanded counters anyways. And if you do that
you can just expose them to user space too, and have everything
in user space just work without any changes (except for the final
output)

-Andi


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-31  0:31     ` Andi Kleen
@ 2019-08-31  9:13       ` Stephane Eranian
  2019-08-31  9:29         ` Peter Zijlstra
  2019-08-31 17:53         ` Andi Kleen
  0 siblings, 2 replies; 39+ messages in thread
From: Stephane Eranian @ 2019-08-31  9:13 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Liang, Kan, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Ingo Molnar, LKML, Thomas Gleixner, Jiri Olsa,
	Alexander Shishkin

Andi,

On Fri, Aug 30, 2019 at 5:31 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> > the same manner. It would greatly simplify the kernel implementation.
>
> I tried that originally. It was actually more complicated.
>
> You can't really do deltas on raw metrics, and a lot of the perf
> infrastructure is built around deltas.
>
How is RAPL handled? No deltas there either. It uses the snapshot model.
At each interval, perf stat just reads the current count, and does not compute
a delta since previous read.
With PERF_METRICS, the delta is always since previous read. If you read
frequently enough you do not lose precision.

>
> To do the regular reset and not lose precision over time internally
> you have to keep expanded counters anyways. And if you do that
> you can just expose them to user space too, and have everything
> in user space just work without any changes (except for the final
> output)
>
> -Andi
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-29 16:56         ` Liang, Kan
@ 2019-08-31  9:18           ` Peter Zijlstra
  0 siblings, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-31  9:18 UTC (permalink / raw)
  To: Liang, Kan
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin, ak

On Thu, Aug 29, 2019 at 12:56:02PM -0400, Liang, Kan wrote:
> On 8/29/2019 9:52 AM, Peter Zijlstra wrote:

> > What what? The PERF_METRICS contents depends on the FIXCTR3 value ?!
> 
> Yes.
> 
> In the current implementation, the PERF_METRICS MSR is composed of four
> fields: backend bound, frontend bound, bad speculation and retiring.
> Each field is populated using a formula of the following form, e.g.:
> PERF_METRICS[RETIRING] = (0xFF *
> PERF_METRICS_RETIRING_INTERNAL_48bit_COUNTER)
> / FIXCTR3

So it really depends on the actual exposed FIXCTR3 _value_ to compute
the PERF_METRIC field? *mind boggles*, that's really unspeakable crap.
And this isn't documented anywhere afaict.

I was thinking they'd have an internal counter for the SLOTS value too,
so the PERF_METRIC fields would be independent; which would be like 'sane'.

Exposing the internal counters would've been _soooo_ much better, just
add 4 more fixed counters and call it a day.

> METRICS_OVF indicates an overflow of any of the internal counters.

OK, but I'm thinking that by that time the fraction in PERF_METRIC will
be too coarse and we're losing precision. Reconstruction will be
inaccurate.

> The internal counters always start counting from 0 and cannot be programmed
> by SW. But resetting PERF_METRICS implicitly resets the internal
> counters as well.

The only possible option given the choices.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-28 19:04     ` Andi Kleen
@ 2019-08-31  9:19       ` Peter Zijlstra
  2019-09-09 13:40         ` Liang, Kan
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-31  9:19 UTC (permalink / raw)
  To: Andi Kleen
  Cc: kan.liang, acme, mingo, linux-kernel, tglx, jolsa, eranian,
	alexander.shishkin

On Wed, Aug 28, 2019 at 12:04:45PM -0700, Andi Kleen wrote:

> > > NMI
> > > ======
> > > 
> > > The METRICS register may be overflow. The bit 48 of STATUS register
> > > will be set. If so, update all active slots and metrics events.
> > 
> > that happen? It would be useful to get that METRIC_OVF (can we please
> 
> This happens when the internal counters that feed the metrics
> overflow.
> 
> > If this is so; then we can use this to update/reset PERF_METRICS and
> > nothing else.
> 
> It has to be handled in the PMI.

That's what I wrote; Overflow is always NMI.

> > Then there is no mucking about with that odd counter/metrics msr pair
> > reset nonsense. Because that really stinks.
> 
> You have to write them to reset the internal counters.

But not for every read, only on METRIC_OVF.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-31  9:13       ` Stephane Eranian
@ 2019-08-31  9:29         ` Peter Zijlstra
  2019-08-31 17:53         ` Andi Kleen
  1 sibling, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2019-08-31  9:29 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Andi Kleen, Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar,
	LKML, Thomas Gleixner, Jiri Olsa, Alexander Shishkin

On Sat, Aug 31, 2019 at 02:13:05AM -0700, Stephane Eranian wrote:

> With PERF_METRICS, the delta is always since previous read. If you read
> frequently enough you do not lose precision.

You always lose precision; the whole fraction-of-255 scheme loses the
fractional part; consider:

255 * 1 / 8
31.87500000000000000000
255 * 7 / 8
223.12500000000000000000

31 * 8 / 255
.97254901960784313725
223 * 8 / 255
6.99607843137254901960

Now, we can make reconstruction use normal 'round-nearest' and then we'd
get 1 and 7 back, but there are always cases where you lose things.

Also, with 255 being an odd number, round-nearest is actually 'hard' :/
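The truncation computed above can be modelled with plain integer
arithmetic. This is a hypothetical encode/decode round trip under the
formula discussed in this thread, not actual hardware behaviour, and the
helper names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit fraction encode, as in the PERF_METRICS formula:
 * field = 0xFF * counted / slots (integer division, truncating). */
static uint64_t metrics_encode(uint64_t counted, uint64_t slots)
{
	return 0xff * counted / slots;
}

/* Reconstruction done by the reader:
 * slots-in-metric = field * slots / 0xFF (also truncating). */
static uint64_t metrics_decode(uint64_t field, uint64_t slots)
{
	return field * slots / 0xff;
}
```

With slots = 8, counting 1 slot encodes to 31 and decodes back to 0, and
7 slots come back as 6; round-nearest reconstruction would recover these
particular values, but some loss is unavoidable.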

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-31  9:13       ` Stephane Eranian
  2019-08-31  9:29         ` Peter Zijlstra
@ 2019-08-31 17:53         ` Andi Kleen
  1 sibling, 0 replies; 39+ messages in thread
From: Andi Kleen @ 2019-08-31 17:53 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Liang, Kan, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Ingo Molnar, LKML, Thomas Gleixner, Jiri Olsa,
	Alexander Shishkin

On Sat, Aug 31, 2019 at 02:13:05AM -0700, Stephane Eranian wrote:
> Andi,
> 
> On Fri, Aug 30, 2019 at 5:31 PM Andi Kleen <ak@linux.intel.com> wrote:
> >
> > > the same manner. It would greatly simplify the kernel implementation.
> >
> > I tried that originally. It was actually more complicated.
> >
> > You can't really do deltas on raw metrics, and a lot of the perf
> > infrastructure is built around deltas.
> >
> How is RAPL handled? No deltas there either. It uses the snapshot model.

RAPL doesn't support any context-switch or CPU context.
Also, it has no concept of "accumulate with clear on read".

-Andi


* Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics
  2019-08-31  9:19       ` Peter Zijlstra
@ 2019-09-09 13:40         ` Liang, Kan
  0 siblings, 0 replies; 39+ messages in thread
From: Liang, Kan @ 2019-09-09 13:40 UTC (permalink / raw)
  To: Peter Zijlstra, Andi Kleen
  Cc: acme, mingo, linux-kernel, tglx, jolsa, eranian, alexander.shishkin



On 8/31/2019 5:19 AM, Peter Zijlstra wrote:
>>> Then there is no mucking about with that odd counter/metrics msr pair
>>> reset nonsense. Because that really stinks.
>> You have to write them to reset the internal counters.
> But not for every read, only on METRIC_OVF.

Precision is lost if the counters run much longer than the measured 
delta, so we have to reset the counters after every read.
For example, the worst case is:
the previous reading was rounded up and the current reading was 
rounded down. The error of METRICS is 1/256 - 1/SLOTS, which depends 
on SLOTS. The error for each TopDown event is (1/256 - 1/SLOTS) * SLOTS.
As SLOTS increases, the precision gets worse and worse.
Therefore, we have to reset the counters periodically.
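That worst-case bound can be written out explicitly (a hedged sketch; 
`worst_case_error` is an illustrative helper, not kernel code): 
(1/256 - 1/SLOTS) * SLOTS simplifies to SLOTS/256 - 1, so the error 
grows linearly with SLOTS unless the counters are reset.

```python
# Worst case from the mail above: previous METRICS reading rounded up,
# current reading rounded down. The per-event error is
# (1/256 - 1/SLOTS) * SLOTS = SLOTS/256 - 1, i.e. linear in SLOTS.

def worst_case_error(slots):
    """Worst-case absolute error of one TopDown event after `slots` slots."""
    return (1 / 256 - 1 / slots) * slots

for slots in (1_000, 1_000_000, 1_000_000_000):
    print(slots, worst_case_error(slots))  # grows ~linearly with slots
```

This is why resetting only on METRIC_OVF is not enough: the bound is 
tied to how long the counters have been accumulating, not to overflow.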

Thanks,
Kan




end of thread, other threads:[~2019-09-09 13:40 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-26 14:47 [RESEND PATCH V3 0/8] TopDown metrics support for Icelake kan.liang
2019-08-26 14:47 ` [RESEND PATCH V3 1/8] perf/x86/intel: Set correct mask for TOPDOWN.SLOTS kan.liang
2019-08-28  7:48   ` Peter Zijlstra
2019-08-26 14:47 ` [RESEND PATCH V3 2/8] perf/x86/intel: Basic support for metrics counters kan.liang
2019-08-28  7:48   ` Peter Zijlstra
2019-08-28  7:52   ` Peter Zijlstra
2019-08-28 13:59     ` Liang, Kan
2019-08-28  8:44   ` Peter Zijlstra
2019-08-28  9:02     ` Peter Zijlstra
2019-08-28  9:37       ` Peter Zijlstra
2019-08-28 13:51       ` Liang, Kan
2019-08-28  8:52   ` Peter Zijlstra
2019-08-26 14:47 ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics kan.liang
2019-08-28 15:02   ` Peter Zijlstra
2019-08-28 19:04     ` Andi Kleen
2019-08-31  9:19       ` Peter Zijlstra
2019-09-09 13:40         ` Liang, Kan
2019-08-28 19:35     ` Liang, Kan
2019-08-28 15:19   ` Peter Zijlstra
2019-08-28 16:11     ` [PATCH] x86/math64: Provide a sane mul_u64_u32_div() implementation for x86_64 Peter Zijlstra
2019-08-29  9:30       ` Peter Zijlstra
2019-08-28 16:17     ` [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics Andi Kleen
2019-08-28 16:28       ` Peter Zijlstra
2019-08-29  3:11         ` Andi Kleen
2019-08-29  9:17           ` Peter Zijlstra
2019-08-29 13:31     ` Liang, Kan
2019-08-29 13:52       ` Peter Zijlstra
2019-08-29 16:56         ` Liang, Kan
2019-08-31  9:18           ` Peter Zijlstra
2019-08-30 23:18   ` Stephane Eranian
2019-08-31  0:31     ` Andi Kleen
2019-08-31  9:13       ` Stephane Eranian
2019-08-31  9:29         ` Peter Zijlstra
2019-08-31 17:53         ` Andi Kleen
2019-08-26 14:47 ` [RESEND PATCH V3 4/8] perf/x86/intel: Support per thread RDPMC " kan.liang
2019-08-26 14:47 ` [RESEND PATCH V3 5/8] perf/x86/intel: Export TopDown events for Icelake kan.liang
2019-08-26 14:47 ` [RESEND PATCH V3 6/8] perf/x86/intel: Disable sampling read slots and topdown kan.liang
2019-08-26 14:47 ` [RESEND PATCH V3 7/8] perf, tools, stat: Support new per thread TopDown metrics kan.liang
2019-08-26 14:47 ` [RESEND PATCH V3 8/8] perf, tools: Add documentation for topdown metrics kan.liang
