All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/15] perf mem/c2c: Add support for AMD
@ 2022-09-28  9:57 Ravi Bangoria
  2022-09-28  9:57 ` [PATCH v3 01/15] perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO} Ravi Bangoria
                   ` (14 more replies)
  0 siblings, 15 replies; 41+ messages in thread
From: Ravi Bangoria @ 2022-09-28  9:57 UTC (permalink / raw)
  To: peterz, acme
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

Perf mem and c2c tools are wrappers around perf record with mem load/
store events. IBS tagged load/store sample provides most of the
information needed for these tools. Enable support for these tools on
AMD Zen processors based on IBS Op pmu.

There are some limitations though: Only load/store micro-ops provide
mem/c2c information. Whereas, IBS does not have a way to choose a
particular type of micro-op to tag. This results in many non-LS
micro-ops being tagged which appear as N/A in the perf report. IBS,
being an uncore pmu from kernel point of view[1], does not support per
process monitoring. Thus, perf mem/c2c on AMD are currently supported
in per-cpu mode only.

Example:
  $ sudo ./perf mem record -- -c 10000
  ^C[ perf record: Woken up 227 times to write data ]
  [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]

  $ sudo ./perf mem report -F mem,sample,snoop
  Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
  Memory access                  Samples  Snoop
  N/A                             700620  N/A
  L1 hit                          126675  N/A
  L2 hit                             424  N/A
  L3 hit                             664  HitM
  L3 hit                              10  N/A
  Local RAM hit                        2  N/A
  Remote RAM (1 hop) hit            8558  N/A
  Remote Cache (1 hop) hit             3  N/A
  Remote Cache (1 hop) hit             2  HitM
  Remote Cache (2 hops) hit            10  HitM
  Remote Cache (2 hops) hit             6  N/A
  Uncached hit                         4  N/A

Prepared on queue/perf/core (cce6a2d7e0e49).

v2: https://lore.kernel.org/all/20220616113638.900-1-ravi.bangoria@amd.com
v2->v3:
 - Use sample_flags instead of __PERF_SAMPLE_*_EARLY varients
 - Make PERF_SAMPLE_WEIGHT independent of PERF_SAMPLE_DATA_SRC
 - Add a patch to reverse sync PERF_MEM_SNOOPX_PEER from tools
   to kernel uapi header
 - Add Acked-by: Jiri Olsa for tool side unchanged patches

Also, a recent patch[2] to test perf mem fails on AMD because of
aforementioned limitations.

[1]: https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com
[2]: https://lore.kernel.org/lkml/20220924133408.1125903-1-leo.yan%40linaro.org


Ravi Bangoria (15):
  perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO}
  perf/x86/amd: Add IBS OP_DATA2 DataSrc bit definitions
  perf/x86/amd: Support PERF_SAMPLE_DATA_SRC
  perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}
  perf/x86/amd: Support PERF_SAMPLE_ADDR
  perf/x86/amd: Support PERF_SAMPLE_PHY_ADDR
  perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file
  perf tool: Sync include/uapi/linux/perf_event.h header
  perf tool: Sync arch/x86/include/asm/amd-ibs.h header
  perf mem: Add support for printing PERF_MEM_LVLNUM_{EXTN_MEM|IO}
  perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
  perf mem/c2c: Add load store event mappings for AMD
  perf mem/c2c: Avoid printing empty lines for unsupported events
  perf mem: Use more generic term for LFB
  perf script: Add missing fields in usage hint

 arch/x86/events/amd/ibs.c                | 345 ++++++++++++++++++++++-
 arch/x86/include/asm/amd-ibs.h           |  16 ++
 include/uapi/linux/perf_event.h          |   6 +-
 kernel/events/core.c                     |   3 +-
 tools/arch/x86/include/asm/amd-ibs.h     |  16 ++
 tools/include/uapi/linux/perf_event.h    |   4 +-
 tools/perf/Documentation/perf-c2c.txt    |  14 +-
 tools/perf/Documentation/perf-mem.txt    |   3 +-
 tools/perf/Documentation/perf-record.txt |   1 +
 tools/perf/arch/x86/util/mem-events.c    |  31 +-
 tools/perf/builtin-c2c.c                 |   1 +
 tools/perf/builtin-mem.c                 |   1 +
 tools/perf/builtin-script.c              |   7 +-
 tools/perf/util/mem-events.c             |  17 +-
 14 files changed, 438 insertions(+), 27 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2022-10-28  6:42 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-28  9:57 [PATCH v3 00/15] perf mem/c2c: Add support for AMD Ravi Bangoria
2022-09-28  9:57 ` [PATCH v3 01/15] perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO} Ravi Bangoria
2022-09-30  9:31   ` [tip: perf/core] " tip-bot2 for Ravi Bangoria
2022-09-30 10:48   ` [PATCH v3 01/15] " kajoljain
2022-09-30 12:50     ` Ravi Bangoria
2022-09-30 14:17       ` Liang, Kan
2022-10-01  6:37         ` Ravi Bangoria
2022-10-03 13:15           ` Liang, Kan
2022-10-06 11:38             ` Ravi Bangoria
2022-10-14 13:53               ` Arnaldo Carvalho de Melo
2022-10-14 15:04                 ` Ravi Bangoria
2022-10-27  8:25           ` Peter Zijlstra
2022-10-28  6:41           ` [tip: perf/urgent] perf/mem: Rename PERF_MEM_LVLNUM_EXTN_MEM to PERF_MEM_LVLNUM_CXL tip-bot2 for Ravi Bangoria
2022-09-28  9:57 ` [PATCH v3 02/15] perf/x86/amd: Add IBS OP_DATA2 DataSrc bit definitions Ravi Bangoria
2022-09-30  4:41   ` Namhyung Kim
2022-09-30  4:48     ` Ravi Bangoria
2022-09-30  5:11       ` Namhyung Kim
2022-09-30  6:16         ` Ravi Bangoria
2022-09-30  9:31   ` [tip: perf/core] " tip-bot2 for Ravi Bangoria
2022-09-28  9:57 ` [PATCH v3 03/15] perf/x86/amd: Support PERF_SAMPLE_DATA_SRC Ravi Bangoria
2022-09-30  9:31   ` [tip: perf/core] " tip-bot2 for Ravi Bangoria
2022-09-28  9:57 ` [PATCH v3 04/15] perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT} Ravi Bangoria
2022-09-30  5:09   ` Namhyung Kim
2022-09-30  9:31   ` [tip: perf/core] " tip-bot2 for Ravi Bangoria
2022-09-28  9:57 ` [PATCH v3 05/15] perf/x86/amd: Support PERF_SAMPLE_ADDR Ravi Bangoria
2022-09-30  9:31   ` [tip: perf/core] " tip-bot2 for Ravi Bangoria
2022-09-28  9:57 ` [PATCH v3 06/15] perf/x86/amd: Support PERF_SAMPLE_PHY_ADDR Ravi Bangoria
2022-09-30  4:59   ` Namhyung Kim
2022-09-30  5:05     ` Ravi Bangoria
2022-09-30  9:31   ` [tip: perf/core] " tip-bot2 for Ravi Bangoria
2022-09-30 17:02   ` [PATCH v3 06/15] " Jiri Olsa
2022-09-28  9:57 ` [PATCH v3 07/15] perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file Ravi Bangoria
2022-09-30  9:31   ` [tip: perf/core] " tip-bot2 for Ravi Bangoria
2022-09-28  9:57 ` [PATCH v3 08/15] perf tool: Sync include/uapi/linux/perf_event.h header Ravi Bangoria
2022-09-28  9:57 ` [PATCH v3 09/15] perf tool: Sync arch/x86/include/asm/amd-ibs.h header Ravi Bangoria
2022-09-28  9:58 ` [PATCH v3 10/15] perf mem: Add support for printing PERF_MEM_LVLNUM_{EXTN_MEM|IO} Ravi Bangoria
2022-09-28  9:58 ` [PATCH v3 11/15] perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events Ravi Bangoria
2022-09-28  9:58 ` [PATCH v3 12/15] perf mem/c2c: Add load store event mappings for AMD Ravi Bangoria
2022-09-28  9:58 ` [PATCH v3 13/15] perf mem/c2c: Avoid printing empty lines for unsupported events Ravi Bangoria
2022-09-28  9:58 ` [PATCH v3 14/15] perf mem: Use more generic term for LFB Ravi Bangoria
2022-09-28  9:58 ` [PATCH v3 15/15] perf script: Add missing fields in usage hint Ravi Bangoria

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.