linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes)
@ 2022-10-06 15:39 Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 1/8] perf tool: Sync include/uapi/linux/perf_event.h header Ravi Bangoria
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
  To: acme, peterz
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

The kernel-side changes are already present in tip/perf/core, except for
one patch that renames PERF_MEM_LVLNUM_EXTN_MEM to PERF_MEM_LVLNUM_CXL[1].

Original description:

The perf mem and c2c tools are wrappers around perf record with mem
load/store events. An IBS tagged load/store sample provides most of the
information these tools need. Enable support for these tools on AMD Zen
processors, based on the IBS Op PMU.

There are some limitations, though. Only load/store micro-ops provide
mem/c2c information, but IBS has no way to choose a particular type of
micro-op to tag. As a result, many non-LS micro-ops get tagged and
appear as N/A in the perf report. Also, IBS is an uncore PMU from the
kernel's point of view[2] and does not support per-process monitoring.
Thus, perf mem/c2c on AMD is currently supported in per-cpu mode only.

Example:
  $ sudo ./perf mem record -- -c 10000
  ^C[ perf record: Woken up 227 times to write data ]
  [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]

  $ sudo ./perf mem report -F mem,sample,snoop
  Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
  Memory access                  Samples  Snoop
  N/A                             700620  N/A
  L1 hit                          126675  N/A
  L2 hit                             424  N/A
  L3 hit                             664  HitM
  L3 hit                              10  N/A
  Local RAM hit                        2  N/A
  Remote RAM (1 hop) hit            8558  N/A
  Remote Cache (1 hop) hit             3  N/A
  Remote Cache (1 hop) hit             2  HitM
  Remote Cache (2 hops) hit            10  HitM
  Remote Cache (2 hops) hit             6  N/A
  Uncached hit                         4  N/A
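
To put a number on the N/A limitation, here is a minimal sketch that
recomputes the share of N/A rows from the sample counts in the report
above. The array and function names are illustrative only and are not
part of perf.

```c
#include <assert.h>
#include <stddef.h>

/* Sample counts copied from the 'perf mem report' output above. */
static const unsigned long sample_counts[] = {
	700620,	/* N/A (tagged non load/store micro-ops) */
	126675,	/* L1 hit */
	424,	/* L2 hit */
	664,	/* L3 hit, HitM */
	10,	/* L3 hit */
	2,	/* Local RAM hit */
	8558,	/* Remote RAM (1 hop) hit */
	3,	/* Remote Cache (1 hop) hit */
	2,	/* Remote Cache (1 hop) hit, HitM */
	10,	/* Remote Cache (2 hops) hit, HitM */
	6,	/* Remote Cache (2 hops) hit */
	4,	/* Uncached hit */
};

/* Sum of all rows; this should match the count reported by perf record. */
static unsigned long total_samples(void)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < sizeof(sample_counts) / sizeof(sample_counts[0]); i++)
		sum += sample_counts[i];
	return sum;
}

/* Percentage of samples that landed in the first (N/A) row. */
static double na_share_percent(void)
{
	unsigned long total = total_samples();

	assert(total > 0);
	return 100.0 * (double)sample_counts[0] / (double)total;
}
```

The rows add up to the 836978 samples reported by perf record, and the
N/A row accounts for about 83.7% of them in this particular run.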

Prepared on top of acme/perf/core (3b1913adb188)

v3: https://lore.kernel.org/lkml/20220928095805.596-1-ravi.bangoria@amd.com
v3->v4:
 - Rename PERF_MEM_LVLNUM_EXTN_MEM to PERF_MEM_LVLNUM_CXL for tools part.

[1]: https://lore.kernel.org/lkml/f6268268-b4e9-9ed6-0453-65792644d953@amd.com
[2]: https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com


Ravi Bangoria (8):
  perf tool: Sync include/uapi/linux/perf_event.h header
  perf tool: Sync arch/x86/include/asm/amd-ibs.h header
  perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO}
  perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
  perf mem/c2c: Add load store event mappings for AMD
  perf mem/c2c: Avoid printing empty lines for unsupported events
  perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB
  perf script: Add missing fields in usage hint

 tools/arch/x86/include/asm/amd-ibs.h     | 16 ++++++++++++
 tools/include/uapi/linux/perf_event.h    |  4 ++-
 tools/perf/Documentation/perf-c2c.txt    | 14 ++++++++---
 tools/perf/Documentation/perf-mem.txt    |  3 ++-
 tools/perf/Documentation/perf-record.txt |  1 +
 tools/perf/arch/x86/util/mem-events.c    | 31 ++++++++++++++++++++++--
 tools/perf/builtin-c2c.c                 |  1 +
 tools/perf/builtin-mem.c                 |  1 +
 tools/perf/builtin-script.c              |  7 +++---
 tools/perf/util/mem-events.c             | 17 +++++++------
 10 files changed, 77 insertions(+), 18 deletions(-)

-- 
2.37.3



* [PATCH v4 1/8] perf tool: Sync include/uapi/linux/perf_event.h header
  2022-10-06 15:39 [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes) Ravi Bangoria
@ 2022-10-06 15:39 ` Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 2/8] perf tool: Sync arch/x86/include/asm/amd-ibs.h header Ravi Bangoria
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
  To: acme, peterz
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

Two new values for mem_lvl_num have been introduced, PERF_MEM_LVLNUM_IO
and PERF_MEM_LVLNUM_CXL, which are required to support perf mem/c2c on
AMD platforms.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/include/uapi/linux/perf_event.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 49cb2355efc0..ea6defacc1a7 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -1327,7 +1327,9 @@ union perf_mem_data_src {
 #define PERF_MEM_LVLNUM_L2	0x02 /* L2 */
 #define PERF_MEM_LVLNUM_L3	0x03 /* L3 */
 #define PERF_MEM_LVLNUM_L4	0x04 /* L4 */
-/* 5-0xa available */
+/* 5-0x8 available */
+#define PERF_MEM_LVLNUM_CXL	0x09 /* CXL */
+#define PERF_MEM_LVLNUM_IO	0x0a /* I/O */
 #define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */
 #define PERF_MEM_LVLNUM_LFB	0x0c /* LFB */
 #define PERF_MEM_LVLNUM_RAM	0x0d /* RAM */
-- 
2.37.3
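
As an illustration of where the new values end up, the sketch below
packs and extracts mem_lvl_num from a raw 64-bit data_src word. The
shift (33) and the 4-bit width are assumptions taken from
PERF_MEM_LVLNUM_SHIFT and the bitfield layout in the upstream uapi
header; they are not part of this patch. The helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values added by this patch. */
#define PERF_MEM_LVLNUM_CXL	0x09
#define PERF_MEM_LVLNUM_IO	0x0a

/* Assumption: position and width of mem_lvl_num in perf_mem_data_src. */
#define LVLNUM_SHIFT	33
#define LVLNUM_MASK	0xfULL

/* Extract the mem_lvl_num field from a raw data_src value. */
static unsigned int data_src_lvlnum(uint64_t data_src)
{
	return (unsigned int)((data_src >> LVLNUM_SHIFT) & LVLNUM_MASK);
}

/* Return data_src with its mem_lvl_num field replaced by lvlnum. */
static uint64_t data_src_set_lvlnum(uint64_t data_src, unsigned int lvlnum)
{
	assert(lvlnum <= LVLNUM_MASK);
	data_src &= ~(LVLNUM_MASK << LVLNUM_SHIFT);
	return data_src | ((uint64_t)lvlnum << LVLNUM_SHIFT);
}

/* True if the access was served from CXL memory or from I/O. */
static int data_src_is_cxl_or_io(uint64_t data_src)
{
	unsigned int lvlnum = data_src_lvlnum(data_src);

	return lvlnum == PERF_MEM_LVLNUM_CXL || lvlnum == PERF_MEM_LVLNUM_IO;
}
```

Both new values fit in the 4-bit field, which is why they could be
carved out of the previously unused 5-0xa range.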



* [PATCH v4 2/8] perf tool: Sync arch/x86/include/asm/amd-ibs.h header
  2022-10-06 15:39 [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes) Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 1/8] perf tool: Sync include/uapi/linux/perf_event.h header Ravi Bangoria
@ 2022-10-06 15:39 ` Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 3/8] perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO} Ravi Bangoria
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
  To: acme, peterz
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

Although the new details added to this header are currently used by the
kernel only, the tools copy needs to stay in sync with the kernel file.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/arch/x86/include/asm/amd-ibs.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/tools/arch/x86/include/asm/amd-ibs.h b/tools/arch/x86/include/asm/amd-ibs.h
index 9a3312e12e2e..93807b437e4d 100644
--- a/tools/arch/x86/include/asm/amd-ibs.h
+++ b/tools/arch/x86/include/asm/amd-ibs.h
@@ -6,6 +6,22 @@
 
 #include "msr-index.h"
 
+/* IBS_OP_DATA2 DataSrc */
+#define IBS_DATA_SRC_LOC_CACHE			 2
+#define IBS_DATA_SRC_DRAM			 3
+#define IBS_DATA_SRC_REM_CACHE			 4
+#define IBS_DATA_SRC_IO				 7
+
+/* IBS_OP_DATA2 DataSrc Extension */
+#define IBS_DATA_SRC_EXT_LOC_CACHE		 1
+#define IBS_DATA_SRC_EXT_NEAR_CCX_CACHE		 2
+#define IBS_DATA_SRC_EXT_DRAM			 3
+#define IBS_DATA_SRC_EXT_FAR_CCX_CACHE		 5
+#define IBS_DATA_SRC_EXT_PMEM			 6
+#define IBS_DATA_SRC_EXT_IO			 7
+#define IBS_DATA_SRC_EXT_EXT_MEM		 8
+#define IBS_DATA_SRC_EXT_PEER_AGENT_MEM		12
+
 /*
  * IBS Hardware MSRs
  */
-- 
2.37.3



* [PATCH v4 3/8] perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO}
  2022-10-06 15:39 [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes) Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 1/8] perf tool: Sync include/uapi/linux/perf_event.h header Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 2/8] perf tool: Sync arch/x86/include/asm/amd-ibs.h header Ravi Bangoria
@ 2022-10-06 15:39 ` Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 4/8] perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events Ravi Bangoria
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
  To: acme, peterz
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

Add support for printing the new PERF_MEM_LVLNUM_CXL and
PERF_MEM_LVLNUM_IO values in perf mem report.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/mem-events.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 764883183519..8909dc7b14a7 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -294,6 +294,8 @@ static const char * const mem_lvl[] = {
 };
 
 static const char * const mem_lvlnum[] = {
+	[PERF_MEM_LVLNUM_CXL] = "CXL",
+	[PERF_MEM_LVLNUM_IO] = "I/O",
 	[PERF_MEM_LVLNUM_ANY_CACHE] = "Any cache",
 	[PERF_MEM_LVLNUM_LFB] = "LFB",
 	[PERF_MEM_LVLNUM_RAM] = "RAM",
-- 
2.37.3



* [PATCH v4 4/8] perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
  2022-10-06 15:39 [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes) Ravi Bangoria
                   ` (2 preceding siblings ...)
  2022-10-06 15:39 ` [PATCH v4 3/8] perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO} Ravi Bangoria
@ 2022-10-06 15:39 ` Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 5/8] perf mem/c2c: Add load store event mappings for AMD Ravi Bangoria
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
  To: acme, peterz
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

Currently perf sets the PERF_SAMPLE_WEIGHT flag only for mem load events.
Set it for the combined load-store event as well, which enables recording
of load latency by default on architectures that do not support an
independent mem load event.

Also document the missing -W option in the perf-record man page.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Documentation/perf-record.txt | 1 +
 tools/perf/builtin-c2c.c                 | 1 +
 tools/perf/builtin-mem.c                 | 1 +
 3 files changed, 3 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 378f497f4be3..e41ae950fdc3 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -411,6 +411,7 @@ is enabled for all the sampling events. The sampled branch type is the same for
 The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
 Note that this feature may not be available on all processors.
 
+-W::
 --weight::
 Enable weightened sampling. An additional weight is recorded per sample and can be
 displayed with the weight and local_weight sort keys.  This currently works for TSX
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f35a47b2dbe4..a9190458d2d5 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -3281,6 +3281,7 @@ static int perf_c2c__record(int argc, const char **argv)
 		 */
 		if (e->tag) {
 			e->record = true;
+			rec_argv[i++] = "-W";
 		} else {
 			e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
 			e->record = true;
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 9e435fd23503..f7dd8216de72 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -122,6 +122,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
 	    (mem->operation & MEM_OPERATION_LOAD) &&
 	    (mem->operation & MEM_OPERATION_STORE)) {
 		e->record = true;
+		rec_argv[i++] = "-W";
 	} else {
 		if (mem->operation & MEM_OPERATION_LOAD) {
 			e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
-- 
2.37.3



* [PATCH v4 5/8] perf mem/c2c: Add load store event mappings for AMD
  2022-10-06 15:39 [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes) Ravi Bangoria
                   ` (3 preceding siblings ...)
  2022-10-06 15:39 ` [PATCH v4 4/8] perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events Ravi Bangoria
@ 2022-10-06 15:39 ` Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 6/8] perf mem/c2c: Avoid printing empty lines for unsupported events Ravi Bangoria
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
  To: acme, peterz
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

The perf mem and c2c tools are wrappers around perf record with mem
load/store events. An IBS tagged load/store sample provides most of the
information these tools need. Wire in the ibs_op// event as the mem-ldst
event for AMD.

There are some limitations, though. Only load/store micro-ops provide
mem/c2c information, but IBS has no way to choose a particular type of
micro-op to tag. As a result, many non-LS micro-ops get tagged and
appear as N/A in the perf report. Also, IBS is an uncore PMU from the
kernel's point of view[1] and does not support per-process monitoring.
Thus, perf mem/c2c on AMD is currently supported in per-cpu mode only.

Example:
  $ sudo ./perf mem record -- -c 10000
  ^C[ perf record: Woken up 227 times to write data ]
  [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]

  $ sudo ./perf mem report -F mem,sample,snoop
  Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
  Memory access                  Samples  Snoop
  N/A                             700620  N/A
  L1 hit                          126675  N/A
  L2 hit                             424  N/A
  L3 hit                             664  HitM
  L3 hit                              10  N/A
  Local RAM hit                        2  N/A
  Remote RAM (1 hop) hit            8558  N/A
  Remote Cache (1 hop) hit             3  N/A
  Remote Cache (1 hop) hit             2  HitM
  Remote Cache (2 hops) hit            10  HitM
  Remote Cache (2 hops) hit             6  N/A
  Uncached hit                         4  N/A

[1]: https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Documentation/perf-c2c.txt | 14 ++++++++----
 tools/perf/Documentation/perf-mem.txt |  3 ++-
 tools/perf/arch/x86/util/mem-events.c | 31 +++++++++++++++++++++++++--
 3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
index f1f7ae6b08d1..5c5eb2def83e 100644
--- a/tools/perf/Documentation/perf-c2c.txt
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -19,9 +19,10 @@ C2C stands for Cache To Cache.
 The perf c2c tool provides means for Shared Data C2C/HITM analysis. It allows
 you to track down the cacheline contentions.
 
-On x86, the tool is based on load latency and precise store facility events
+On Intel, the tool is based on load latency and precise store facility events
 provided by Intel CPUs. On PowerPC, the tool uses random instruction sampling
-with thresholding feature.
+with thresholding feature. On AMD, the tool uses IBS op pmu (due to hardware
+limitations, perf c2c is not supported on Zen3 cpus).
 
 These events provide:
   - memory address of the access
@@ -49,7 +50,8 @@ RECORD OPTIONS
 
 -l::
 --ldlat::
-	Configure mem-loads latency. (x86 only)
+	Configure mem-loads latency. Supported on Intel and Arm64 processors
+	only. Ignored on other archs.
 
 -k::
 --all-kernel::
@@ -135,11 +137,15 @@ Following perf record options are configured by default:
   -W,-d,--phys-data,--sample-cpu
 
 Unless specified otherwise with '-e' option, following events are monitored by
-default on x86:
+default on Intel:
 
   cpu/mem-loads,ldlat=30/P
   cpu/mem-stores/P
 
+following on AMD:
+
+  ibs_op//
+
 and following on PowerPC:
 
   cpu/mem-loads/
diff --git a/tools/perf/Documentation/perf-mem.txt b/tools/perf/Documentation/perf-mem.txt
index 66177511c5c4..005c95580b1e 100644
--- a/tools/perf/Documentation/perf-mem.txt
+++ b/tools/perf/Documentation/perf-mem.txt
@@ -85,7 +85,8 @@ RECORD OPTIONS
 	Be more verbose (show counter open errors, etc)
 
 --ldlat <n>::
-	Specify desired latency for loads event. (x86 only)
+	Specify desired latency for loads event. Supported on Intel and Arm64
+	processors only. Ignored on other archs.
 
 In addition, for report all perf report options are valid, and for record
 all perf record options.
diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index 5214370ca4e4..f683ac702247 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -1,7 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "util/pmu.h"
+#include "util/env.h"
 #include "map_symbol.h"
 #include "mem-events.h"
+#include "linux/string.h"
 
 static char mem_loads_name[100];
 static bool mem_loads_name__init;
@@ -12,18 +14,43 @@ static char mem_stores_name[100];
 
 #define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
 
-static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
+static struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
 	E("ldlat-loads",	"%s/mem-loads,ldlat=%u/P",	"%s/events/mem-loads"),
 	E("ldlat-stores",	"%s/mem-stores/P",		"%s/events/mem-stores"),
 	E(NULL,			NULL,				NULL),
 };
 
+static struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
+	E(NULL,		NULL,		NULL),
+	E(NULL,		NULL,		NULL),
+	E("mem-ldst",	"ibs_op//",	"ibs_op"),
+};
+
+static int perf_mem_is_amd_cpu(void)
+{
+	struct perf_env env = { .total_mem = 0, };
+
+	perf_env__cpuid(&env);
+	if (env.cpuid && strstarts(env.cpuid, "AuthenticAMD"))
+		return 1;
+	return -1;
+}
+
 struct perf_mem_event *perf_mem_events__ptr(int i)
 {
+	/* 0: Uninitialized, 1: Yes, -1: No */
+	static int is_amd;
+
 	if (i >= PERF_MEM_EVENTS__MAX)
 		return NULL;
 
-	return &perf_mem_events[i];
+	if (!is_amd)
+		is_amd = perf_mem_is_amd_cpu();
+
+	if (is_amd == 1)
+		return &perf_mem_events_amd[i];
+
+	return &perf_mem_events_intel[i];
 }
 
 bool is_mem_loads_aux_event(struct evsel *leader)
-- 
2.37.3



* [PATCH v4 6/8] perf mem/c2c: Avoid printing empty lines for unsupported events
  2022-10-06 15:39 [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes) Ravi Bangoria
                   ` (4 preceding siblings ...)
  2022-10-06 15:39 ` [PATCH v4 5/8] perf mem/c2c: Add load store event mappings for AMD Ravi Bangoria
@ 2022-10-06 15:39 ` Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 7/8] perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 8/8] perf script: Add missing fields in usage hint Ravi Bangoria
  7 siblings, 0 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
  To: acme, peterz
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

Perf mem and c2c can be used with three different events: load, store and
combined load-store. Some architectures might support only a subset of
these events, in which case perf prints an empty line for each
unsupported event. Avoid that.

For example, AMD Zen cpus support only the combined load-store event and
do not support individual load and store events.

Before patch:
  $ ./perf mem record -e list


  mem-ldst     : available

After patch:
  $ ./perf mem record -e list
  mem-ldst     : available

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/mem-events.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 8909dc7b14a7..6c7feecd2e04 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -156,11 +156,12 @@ void perf_mem_events__list(void)
 	for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
 		struct perf_mem_event *e = perf_mem_events__ptr(j);
 
-		fprintf(stderr, "%-13s%-*s%s\n",
-			e->tag ?: "",
-			verbose > 0 ? 25 : 0,
-			verbose > 0 ? perf_mem_events__name(j, NULL) : "",
-			e->supported ? ": available" : "");
+		fprintf(stderr, "%-*s%-*s%s",
+			e->tag ? 13 : 0,
+			e->tag ? : "",
+			e->tag && verbose > 0 ? 25 : 0,
+			e->tag && verbose > 0 ? perf_mem_events__name(j, NULL) : "",
+			e->supported ? ": available\n" : "");
 	}
 }
 
-- 
2.37.3



* [PATCH v4 7/8] perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB
  2022-10-06 15:39 [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes) Ravi Bangoria
                   ` (5 preceding siblings ...)
  2022-10-06 15:39 ` [PATCH v4 6/8] perf mem/c2c: Avoid printing empty lines for unsupported events Ravi Bangoria
@ 2022-10-06 15:39 ` Ravi Bangoria
  2022-10-06 15:39 ` [PATCH v4 8/8] perf script: Add missing fields in usage hint Ravi Bangoria
  7 siblings, 0 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
  To: acme, peterz
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

A hardware component that tracks outstanding L1 Data Cache misses is
called LFB (Line Fill Buffer) on Intel and Arm. However, similar
components exist on other architectures under different names; for
example, it is called MAB (Miss Address Buffer) on AMD. Use 'LFB/MAB'
instead of just 'LFB'.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
 tools/perf/util/mem-events.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 6c7feecd2e04..b3a91093069a 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -282,7 +282,7 @@ static const char * const mem_lvl[] = {
 	"HIT",
 	"MISS",
 	"L1",
-	"LFB",
+	"LFB/MAB",
 	"L2",
 	"L3",
 	"Local RAM",
@@ -298,7 +298,7 @@ static const char * const mem_lvlnum[] = {
 	[PERF_MEM_LVLNUM_CXL] = "CXL",
 	[PERF_MEM_LVLNUM_IO] = "I/O",
 	[PERF_MEM_LVLNUM_ANY_CACHE] = "Any cache",
-	[PERF_MEM_LVLNUM_LFB] = "LFB",
+	[PERF_MEM_LVLNUM_LFB] = "LFB/MAB",
 	[PERF_MEM_LVLNUM_RAM] = "RAM",
 	[PERF_MEM_LVLNUM_PMEM] = "PMEM",
 	[PERF_MEM_LVLNUM_NA] = "N/A",
-- 
2.37.3



* [PATCH v4 8/8] perf script: Add missing fields in usage hint
  2022-10-06 15:39 [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes) Ravi Bangoria
                   ` (6 preceding siblings ...)
  2022-10-06 15:39 ` [PATCH v4 7/8] perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB Ravi Bangoria
@ 2022-10-06 15:39 ` Ravi Bangoria
  7 siblings, 0 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
  To: acme, peterz
  Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
	leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
	mark.rutland, alexander.shishkin, tglx, bp, x86,
	linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
	kim.phillips, santosh.shukla

A few fields are missing from the usage message printed when a wrong
field option is passed. Add them to the list.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-script.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7fa467ed91dc..7ca238277d83 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3846,9 +3846,10 @@ int cmd_script(int argc, const char **argv)
 		     "Valid types: hw,sw,trace,raw,synth. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
 		     "addr,symoff,srcline,period,iregs,uregs,brstack,"
-		     "brstacksym,flags,bpf-output,brstackinsn,brstackinsnlen,brstackoff,"
-		     "callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc,tod,"
-		     "data_page_size,code_page_size,ins_lat",
+		     "brstacksym,flags,data_src,weight,bpf-output,brstackinsn,"
+		     "brstackinsnlen,brstackoff,callindent,insn,insnlen,synth,"
+		     "phys_addr,metric,misc,srccode,ipc,tod,data_page_size,"
+		     "code_page_size,ins_lat",
 		     parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
-- 
2.37.3


