* [PATCH v4 0/8] perf mem/c2c: Add support for AMD (tools changes)
@ 2022-10-06 15:39 Ravi Bangoria
2022-10-06 15:39 ` [PATCH v4 1/8] perf tool: Sync include/uapi/linux/perf_event.h header Ravi Bangoria
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
To: acme, peterz
Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
mark.rutland, alexander.shishkin, tglx, bp, x86,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
kim.phillips, santosh.shukla
The kernel-side changes are already present in tip/perf/core, except for
one patch that renames PERF_MEM_LVLNUM_EXTN_MEM to PERF_MEM_LVLNUM_CXL[1].
Original description:
Perf mem and c2c tools are wrappers around perf record with mem load/store
events. IBS tagged load/store samples provide most of the information
needed by these tools. Enable support for these tools on AMD Zen
processors using the IBS Op PMU.
There are some limitations, though. Only load/store micro-ops provide
mem/c2c information, but IBS has no way to select a particular type of
micro-op to tag. As a result, many non-LS micro-ops get tagged and appear
as N/A in the perf report. In addition, IBS is an uncore PMU from the
kernel's point of view[2] and does not support per-process monitoring.
Thus, perf mem/c2c on AMD is currently supported in per-cpu mode only.
Example:
  $ sudo ./perf mem record -- -c 10000
  ^C[ perf record: Woken up 227 times to write data ]
  [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]
  $ sudo ./perf mem report -F mem,sample,snoop
  Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
  Memory access                Samples  Snoop
  N/A                           700620  N/A
  L1 hit                        126675  N/A
  L2 hit                           424  N/A
  L3 hit                           664  HitM
  L3 hit                            10  N/A
  Local RAM hit                      2  N/A
  Remote RAM (1 hop) hit          8558  N/A
  Remote Cache (1 hop) hit           3  N/A
  Remote Cache (1 hop) hit           2  HitM
  Remote Cache (2 hops) hit         10  HitM
  Remote Cache (2 hops) hit          6  N/A
  Uncached hit                       4  N/A
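As a rough reading of the report above, the share of tagged micro-ops that carry no memory information can be computed directly from the sample counts. The following is only a sketch: the helper name na_share_percent is hypothetical, and the numbers are copied from the example output.

```c
#include <stddef.h>

/* Sample counts copied from the example report above, in row order. */
static const unsigned long sample_counts[] = {
	700620,	/* N/A: non load/store micro-ops tagged by IBS */
	126675, 424, 664, 10, 2, 8558, 3, 2, 10, 6, 4,
};

/* Hypothetical helper: percentage of all samples that fall in row 0 (N/A). */
static double na_share_percent(const unsigned long *counts, size_t n)
{
	unsigned long total = 0;

	for (size_t i = 0; i < n; i++)
		total += counts[i];

	return total ? 100.0 * (double)counts[0] / (double)total : 0.0;
}
```

With these counts the rows sum to 836978, matching the number of captured samples, and roughly 84% of the samples are N/A.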
Prepared on top of acme/perf/core (3b1913adb188)
v3: https://lore.kernel.org/lkml/20220928095805.596-1-ravi.bangoria@amd.com
v3->v4:
- Rename PERF_MEM_LVLNUM_EXTN_MEM to PERF_MEM_LVLNUM_CXL for tools part.
[1]: https://lore.kernel.org/lkml/f6268268-b4e9-9ed6-0453-65792644d953@amd.com
[2]: https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com
Ravi Bangoria (8):
perf tool: Sync include/uapi/linux/perf_event.h header
perf tool: Sync arch/x86/include/asm/amd-ibs.h header
perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO}
perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
perf mem/c2c: Add load store event mappings for AMD
perf mem/c2c: Avoid printing empty lines for unsupported events
perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB
perf script: Add missing fields in usage hint
tools/arch/x86/include/asm/amd-ibs.h | 16 ++++++++++++
tools/include/uapi/linux/perf_event.h | 4 ++-
tools/perf/Documentation/perf-c2c.txt | 14 ++++++++---
tools/perf/Documentation/perf-mem.txt | 3 ++-
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/arch/x86/util/mem-events.c | 31 ++++++++++++++++++++++--
tools/perf/builtin-c2c.c | 1 +
tools/perf/builtin-mem.c | 1 +
tools/perf/builtin-script.c | 7 +++---
tools/perf/util/mem-events.c | 17 +++++++------
10 files changed, 77 insertions(+), 18 deletions(-)
--
2.37.3
* [PATCH v4 1/8] perf tool: Sync include/uapi/linux/perf_event.h header
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
To: acme, peterz
Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
mark.rutland, alexander.shishkin, tglx, bp, x86,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
kim.phillips, santosh.shukla
Two new values for mem_lvl_num have been introduced, PERF_MEM_LVLNUM_IO
and PERF_MEM_LVLNUM_CXL, which are required to support perf mem/c2c on
AMD platforms.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
tools/include/uapi/linux/perf_event.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 49cb2355efc0..ea6defacc1a7 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -1327,7 +1327,9 @@ union perf_mem_data_src {
#define PERF_MEM_LVLNUM_L2 0x02 /* L2 */
#define PERF_MEM_LVLNUM_L3 0x03 /* L3 */
#define PERF_MEM_LVLNUM_L4 0x04 /* L4 */
-/* 5-0xa available */
+/* 5-0x8 available */
+#define PERF_MEM_LVLNUM_CXL 0x09 /* CXL */
+#define PERF_MEM_LVLNUM_IO 0x0a /* I/O */
#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */
#define PERF_MEM_LVLNUM_LFB 0x0c /* LFB */
#define PERF_MEM_LVLNUM_RAM 0x0d /* RAM */
--
2.37.3
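To illustrate where the new level numbers end up, here is a minimal sketch that packs and unpacks the level-number field of a data_src word. The two LVLNUM values come from the hunk above; the shift of 33 and the 4-bit width are assumptions meant to mirror PERF_MEM_LVLNUM_SHIFT and the mem_lvl_num bitfield in the UAPI header, and the demo_* names are hypothetical.

```c
#include <stdint.h>

/* Values copied from the perf_event.h hunk above. */
#define PERF_MEM_LVLNUM_CXL	0x09
#define PERF_MEM_LVLNUM_IO	0x0a

/* Assumed to match the UAPI layout: a 4-bit field starting at bit 33. */
#define DEMO_LVLNUM_SHIFT	33
#define DEMO_LVLNUM_MASK	0xfULL

/* Hypothetical helper: store a level number into a data_src word. */
static uint64_t demo_set_lvlnum(uint64_t data_src, unsigned int lvlnum)
{
	data_src &= ~(DEMO_LVLNUM_MASK << DEMO_LVLNUM_SHIFT);
	return data_src | (((uint64_t)lvlnum & DEMO_LVLNUM_MASK) << DEMO_LVLNUM_SHIFT);
}

/* Hypothetical helper: read the level number back out. */
static unsigned int demo_get_lvlnum(uint64_t data_src)
{
	return (unsigned int)((data_src >> DEMO_LVLNUM_SHIFT) & DEMO_LVLNUM_MASK);
}
```

Both new values fit in the 4-bit field, which is why they could be carved out of the previously unused 5-0xa range without changing the layout.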
* [PATCH v4 2/8] perf tool: Sync arch/x86/include/asm/amd-ibs.h header
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
To: acme, peterz
Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
mark.rutland, alexander.shishkin, tglx, bp, x86,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
kim.phillips, santosh.shukla
Although the new definitions added to this header are currently used by
the kernel only, the tools copy needs to stay in sync with the kernel file.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
tools/arch/x86/include/asm/amd-ibs.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/tools/arch/x86/include/asm/amd-ibs.h b/tools/arch/x86/include/asm/amd-ibs.h
index 9a3312e12e2e..93807b437e4d 100644
--- a/tools/arch/x86/include/asm/amd-ibs.h
+++ b/tools/arch/x86/include/asm/amd-ibs.h
@@ -6,6 +6,22 @@
#include "msr-index.h"
+/* IBS_OP_DATA2 DataSrc */
+#define IBS_DATA_SRC_LOC_CACHE 2
+#define IBS_DATA_SRC_DRAM 3
+#define IBS_DATA_SRC_REM_CACHE 4
+#define IBS_DATA_SRC_IO 7
+
+/* IBS_OP_DATA2 DataSrc Extension */
+#define IBS_DATA_SRC_EXT_LOC_CACHE 1
+#define IBS_DATA_SRC_EXT_NEAR_CCX_CACHE 2
+#define IBS_DATA_SRC_EXT_DRAM 3
+#define IBS_DATA_SRC_EXT_FAR_CCX_CACHE 5
+#define IBS_DATA_SRC_EXT_PMEM 6
+#define IBS_DATA_SRC_EXT_IO 7
+#define IBS_DATA_SRC_EXT_EXT_MEM 8
+#define IBS_DATA_SRC_EXT_PEER_AGENT_MEM 12
+
/*
* IBS Hardware MSRs
*/
--
2.37.3
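For readers unfamiliar with the DataSrc extension encoding, the sketch below turns the values from the hunk above into short labels. The numeric values are copied from the patch; the label strings are only a reading of the constant names, and demo_ibs_data_src_ext_name is a hypothetical helper that exists neither in the kernel nor in perf.

```c
#include <stddef.h>

/* Values copied from the amd-ibs.h hunk above. */
#define IBS_DATA_SRC_EXT_LOC_CACHE	 1
#define IBS_DATA_SRC_EXT_NEAR_CCX_CACHE	 2
#define IBS_DATA_SRC_EXT_DRAM		 3
#define IBS_DATA_SRC_EXT_FAR_CCX_CACHE	 5
#define IBS_DATA_SRC_EXT_PMEM		 6
#define IBS_DATA_SRC_EXT_IO		 7
#define IBS_DATA_SRC_EXT_EXT_MEM	 8
#define IBS_DATA_SRC_EXT_PEER_AGENT_MEM	12

/* Hypothetical helper: label for a DataSrc extension value, NULL if unknown. */
static const char *demo_ibs_data_src_ext_name(unsigned int src)
{
	switch (src) {
	case IBS_DATA_SRC_EXT_LOC_CACHE:	return "local cache";
	case IBS_DATA_SRC_EXT_NEAR_CCX_CACHE:	return "near CCX cache";
	case IBS_DATA_SRC_EXT_DRAM:		return "DRAM";
	case IBS_DATA_SRC_EXT_FAR_CCX_CACHE:	return "far CCX cache";
	case IBS_DATA_SRC_EXT_PMEM:		return "PMEM";
	case IBS_DATA_SRC_EXT_IO:		return "I/O";
	case IBS_DATA_SRC_EXT_EXT_MEM:		return "extension memory";
	case IBS_DATA_SRC_EXT_PEER_AGENT_MEM:	return "peer agent memory";
	default:				return NULL;
	}
}
```

The encoding is sparse (4 and 9 through 11 are not defined in the hunk), so a switch with a default case is a safer fit than a dense array.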
* [PATCH v4 3/8] perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO}
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
To: acme, peterz
Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
mark.rutland, alexander.shishkin, tglx, bp, x86,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
kim.phillips, santosh.shukla
Add support for printing the new PERF_MEM_LVLNUM_CXL and PERF_MEM_LVLNUM_IO
levels in perf mem report.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/mem-events.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 764883183519..8909dc7b14a7 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -294,6 +294,8 @@ static const char * const mem_lvl[] = {
};
static const char * const mem_lvlnum[] = {
+ [PERF_MEM_LVLNUM_CXL] = "CXL",
+ [PERF_MEM_LVLNUM_IO] = "I/O",
[PERF_MEM_LVLNUM_ANY_CACHE] = "Any cache",
[PERF_MEM_LVLNUM_LFB] = "LFB",
[PERF_MEM_LVLNUM_RAM] = "RAM",
--
2.37.3
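The table in the hunk above uses designated initializers, which leave every unlisted index as NULL. A minimal sketch of a lookup that tolerates those gaps follows; the demo_* names and the "Unknown" fallback are hypothetical and do not reflect how perf itself handles missing entries.

```c
#include <stddef.h>

/* Values as defined in the perf_event.h hunk of patch 1/8. */
#define PERF_MEM_LVLNUM_CXL		0x09
#define PERF_MEM_LVLNUM_IO		0x0a
#define PERF_MEM_LVLNUM_ANY_CACHE	0x0b
#define PERF_MEM_LVLNUM_LFB		0x0c
#define PERF_MEM_LVLNUM_RAM		0x0d

/* Sparse table: indexes that are not listed are implicitly NULL. */
static const char * const demo_lvlnum[] = {
	[PERF_MEM_LVLNUM_CXL]		= "CXL",
	[PERF_MEM_LVLNUM_IO]		= "I/O",
	[PERF_MEM_LVLNUM_ANY_CACHE]	= "Any cache",
	[PERF_MEM_LVLNUM_LFB]		= "LFB",
	[PERF_MEM_LVLNUM_RAM]		= "RAM",
};

/* Hypothetical helper: bounds-checked lookup with a fallback string. */
static const char *demo_lvlnum_name(unsigned int lvl)
{
	size_t n = sizeof(demo_lvlnum) / sizeof(demo_lvlnum[0]);

	if (lvl >= n || !demo_lvlnum[lvl])
		return "Unknown";
	return demo_lvlnum[lvl];
}
```

Because the initializers are keyed by the PERF_MEM_LVLNUM_* constants, the order of the entries in the source does not matter.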
* [PATCH v4 4/8] perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
To: acme, peterz
Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
mark.rutland, alexander.shishkin, tglx, bp, x86,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
kim.phillips, santosh.shukla
Currently, perf sets the PERF_SAMPLE_WEIGHT flag only for mem load events.
Set it for the combined load-store event as well, which enables recording
of load latency by default on architectures that do not support an
independent mem load event.
Also document the missing -W option in the perf-record man page.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/builtin-c2c.c | 1 +
tools/perf/builtin-mem.c | 1 +
3 files changed, 3 insertions(+)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 378f497f4be3..e41ae950fdc3 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -411,6 +411,7 @@ is enabled for all the sampling events. The sampled branch type is the same for
The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
Note that this feature may not be available on all processors.
+-W::
--weight::
Enable weightened sampling. An additional weight is recorded per sample and can be
displayed with the weight and local_weight sort keys. This currently works for TSX
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f35a47b2dbe4..a9190458d2d5 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -3281,6 +3281,7 @@ static int perf_c2c__record(int argc, const char **argv)
*/
if (e->tag) {
e->record = true;
+ rec_argv[i++] = "-W";
} else {
e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
e->record = true;
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 9e435fd23503..f7dd8216de72 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -122,6 +122,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
(mem->operation & MEM_OPERATION_LOAD) &&
(mem->operation & MEM_OPERATION_STORE)) {
e->record = true;
+ rec_argv[i++] = "-W";
} else {
if (mem->operation & MEM_OPERATION_LOAD) {
e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
--
2.37.3
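The two hunks above append "-W" to the argument vector that is later handed to perf record. The following is a simplified sketch of that pattern only, not the actual perf code: demo_build_record_argv is hypothetical, and the fixed "record" and "-d" entries are stand-ins for whatever else the real functions add.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: fill argv for a record invocation and return the
 * number of arguments written (excluding the NULL terminator), or 0 if the
 * caller's array is too small.
 */
static size_t demo_build_record_argv(const char **argv, size_t cap,
				     bool combined_ldst)
{
	size_t i = 0;

	if (cap < 4)	/* up to three arguments plus the NULL terminator */
		return 0;

	argv[i++] = "record";
	if (combined_ldst)
		argv[i++] = "-W";	/* request PERF_SAMPLE_WEIGHT */
	argv[i++] = "-d";
	argv[i] = NULL;

	return i;
}
```

Calling it with combined_ldst set to true yields "record -W -d"; with false the "-W" entry is omitted.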
* [PATCH v4 5/8] perf mem/c2c: Add load store event mappings for AMD
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
To: acme, peterz
Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
mark.rutland, alexander.shishkin, tglx, bp, x86,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
kim.phillips, santosh.shukla
Perf mem and c2c tools are wrappers around perf record with mem load/store
events. IBS tagged load/store samples provide most of the information
needed by these tools. Wire in the ibs_op// event as the mem-ldst event
for AMD.
There are some limitations, though. Only load/store micro-ops provide
mem/c2c information, but IBS has no way to select a particular type of
micro-op to tag. As a result, many non-LS micro-ops get tagged and appear
as N/A in the perf report. In addition, IBS is an uncore PMU from the
kernel's point of view[1] and does not support per-process monitoring.
Thus, perf mem/c2c on AMD is currently supported in per-cpu mode only.
Example:
  $ sudo ./perf mem record -- -c 10000
  ^C[ perf record: Woken up 227 times to write data ]
  [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]
  $ sudo ./perf mem report -F mem,sample,snoop
  Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
  Memory access                Samples  Snoop
  N/A                           700620  N/A
  L1 hit                        126675  N/A
  L2 hit                           424  N/A
  L3 hit                           664  HitM
  L3 hit                            10  N/A
  Local RAM hit                      2  N/A
  Remote RAM (1 hop) hit          8558  N/A
  Remote Cache (1 hop) hit           3  N/A
  Remote Cache (1 hop) hit           2  HitM
  Remote Cache (2 hops) hit         10  HitM
  Remote Cache (2 hops) hit          6  N/A
  Uncached hit                       4  N/A
[1]: https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/Documentation/perf-c2c.txt | 14 ++++++++----
tools/perf/Documentation/perf-mem.txt | 3 ++-
tools/perf/arch/x86/util/mem-events.c | 31 +++++++++++++++++++++++++--
3 files changed, 41 insertions(+), 7 deletions(-)
diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
index f1f7ae6b08d1..5c5eb2def83e 100644
--- a/tools/perf/Documentation/perf-c2c.txt
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -19,9 +19,10 @@ C2C stands for Cache To Cache.
The perf c2c tool provides means for Shared Data C2C/HITM analysis. It allows
you to track down the cacheline contentions.
-On x86, the tool is based on load latency and precise store facility events
+On Intel, the tool is based on load latency and precise store facility events
provided by Intel CPUs. On PowerPC, the tool uses random instruction sampling
-with thresholding feature.
+with thresholding feature. On AMD, the tool uses IBS op pmu (due to hardware
+limitations, perf c2c is not supported on Zen3 cpus).
These events provide:
- memory address of the access
@@ -49,7 +50,8 @@ RECORD OPTIONS
-l::
--ldlat::
- Configure mem-loads latency. (x86 only)
+ Configure mem-loads latency. Supported on Intel and Arm64 processors
+ only. Ignored on other archs.
-k::
--all-kernel::
@@ -135,11 +137,15 @@ Following perf record options are configured by default:
-W,-d,--phys-data,--sample-cpu
Unless specified otherwise with '-e' option, following events are monitored by
-default on x86:
+default on Intel:
cpu/mem-loads,ldlat=30/P
cpu/mem-stores/P
+following on AMD:
+
+ ibs_op//
+
and following on PowerPC:
cpu/mem-loads/
diff --git a/tools/perf/Documentation/perf-mem.txt b/tools/perf/Documentation/perf-mem.txt
index 66177511c5c4..005c95580b1e 100644
--- a/tools/perf/Documentation/perf-mem.txt
+++ b/tools/perf/Documentation/perf-mem.txt
@@ -85,7 +85,8 @@ RECORD OPTIONS
Be more verbose (show counter open errors, etc)
--ldlat <n>::
- Specify desired latency for loads event. (x86 only)
+ Specify desired latency for loads event. Supported on Intel and Arm64
+ processors only. Ignored on other archs.
In addition, for report all perf report options are valid, and for record
all perf record options.
diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index 5214370ca4e4..f683ac702247 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -1,7 +1,9 @@
// SPDX-License-Identifier: GPL-2.0
#include "util/pmu.h"
+#include "util/env.h"
#include "map_symbol.h"
#include "mem-events.h"
+#include "linux/string.h"
static char mem_loads_name[100];
static bool mem_loads_name__init;
@@ -12,18 +14,43 @@ static char mem_stores_name[100];
#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
-static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
+static struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
E("ldlat-loads", "%s/mem-loads,ldlat=%u/P", "%s/events/mem-loads"),
E("ldlat-stores", "%s/mem-stores/P", "%s/events/mem-stores"),
E(NULL, NULL, NULL),
};
+static struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
+ E(NULL, NULL, NULL),
+ E(NULL, NULL, NULL),
+ E("mem-ldst", "ibs_op//", "ibs_op"),
+};
+
+static int perf_mem_is_amd_cpu(void)
+{
+ struct perf_env env = { .total_mem = 0, };
+
+ perf_env__cpuid(&env);
+ if (env.cpuid && strstarts(env.cpuid, "AuthenticAMD"))
+ return 1;
+ return -1;
+}
+
struct perf_mem_event *perf_mem_events__ptr(int i)
{
+ /* 0: Uninitialized, 1: Yes, -1: No */
+ static int is_amd;
+
if (i >= PERF_MEM_EVENTS__MAX)
return NULL;
- return &perf_mem_events[i];
+ if (!is_amd)
+ is_amd = perf_mem_is_amd_cpu();
+
+ if (is_amd == 1)
+ return &perf_mem_events_amd[i];
+
+ return &perf_mem_events_intel[i];
}
bool is_mem_loads_aux_event(struct evsel *leader)
--
2.37.3
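perf_mem_events__ptr() in the patch above caches the vendor check in a tri-state static so the probe runs at most once. A minimal standalone sketch of that idiom follows. The demo_* names are hypothetical, and demo_cpuid_string is a stand-in for the string that perf_env__cpuid() would provide; its exact contents here are an assumption.

```c
#include <string.h>

/* Counts probe invocations so the caching behaviour can be observed. */
static int demo_probe_calls;

/* Stand-in for the cpuid string; only the vendor prefix matters here. */
static const char *demo_cpuid_string = "AuthenticAMD,25,1,1";

/* Returns 1 for AMD and -1 otherwise, never 0, mirroring the patch. */
static int demo_probe_is_amd(void)
{
	static const char vendor[] = "AuthenticAMD";

	demo_probe_calls++;
	if (demo_cpuid_string &&
	    !strncmp(demo_cpuid_string, vendor, sizeof(vendor) - 1))
		return 1;
	return -1;
}

static int demo_is_amd(void)
{
	/* 0: not probed yet, 1: AMD, -1: not AMD */
	static int cached;

	if (!cached)
		cached = demo_probe_is_amd();

	return cached == 1;
}
```

Reserving 0 for "not probed yet" is what lets a zero-initialized static double as the cache; a plain boolean could not distinguish "not AMD" from "not checked".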
* [PATCH v4 6/8] perf mem/c2c: Avoid printing empty lines for unsupported events
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
To: acme, peterz
Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
mark.rutland, alexander.shishkin, tglx, bp, x86,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
kim.phillips, santosh.shukla
Perf mem and c2c can be used with three different events: load, store and
combined load-store. Some architectures support only a subset of these
events, in which case perf prints an empty line for each unsupported
event. Avoid that.
For example, AMD Zen CPUs support only the combined load-store event and
do not support individual load and store events.
Before patch (an empty line is printed for each unsupported event):
  $ ./perf mem record -e list
  mem-ldst     : available
After patch:
  $ ./perf mem record -e list
  mem-ldst     : available
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/mem-events.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 8909dc7b14a7..6c7feecd2e04 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -156,11 +156,12 @@ void perf_mem_events__list(void)
for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
struct perf_mem_event *e = perf_mem_events__ptr(j);
- fprintf(stderr, "%-13s%-*s%s\n",
- e->tag ?: "",
- verbose > 0 ? 25 : 0,
- verbose > 0 ? perf_mem_events__name(j, NULL) : "",
- e->supported ? ": available" : "");
+ fprintf(stderr, "%-*s%-*s%s",
+ e->tag ? 13 : 0,
+ e->tag ? : "",
+ e->tag && verbose > 0 ? 25 : 0,
+ e->tag && verbose > 0 ? perf_mem_events__name(j, NULL) : "",
+ e->supported ? ": available\n" : "");
}
}
--
2.37.3
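The fix above relies on the "%-*s" conversion, which takes its field width from an argument, so an entry without a tag can be printed with width 0 and no trailing newline. A minimal sketch of the same idea, writing into a buffer instead of stderr, is shown below; demo_format_event is a hypothetical helper and it omits the verbose name column.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format one event line. The tag column is 13
 * characters wide when a tag exists and collapses to nothing otherwise.
 * The newline is part of the ": available" suffix, so an unsupported
 * event without a tag produces no output at all.
 */
static int demo_format_event(char *buf, size_t len, const char *tag,
			     int supported)
{
	return snprintf(buf, len, "%-*s%s",
			tag ? 13 : 0,
			tag ? tag : "",
			supported ? ": available\n" : "");
}
```

For the tag "mem-ldst" marked as supported, this produces the tag padded to 13 columns followed by ": available" and a newline; for a NULL tag marked as unsupported, the result is an empty string.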
* [PATCH v4 7/8] perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
To: acme, peterz
Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
mark.rutland, alexander.shishkin, tglx, bp, x86,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
kim.phillips, santosh.shukla
The hardware component that tracks outstanding L1 data cache misses is
called the LFB (Line Fill Buffer) on Intel and Arm. Similar components
exist on other architectures under different names; for example, it is
called the MAB (Miss Address Buffer) on AMD. Use 'LFB/MAB' instead of
just 'LFB'.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
tools/perf/util/mem-events.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 6c7feecd2e04..b3a91093069a 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -282,7 +282,7 @@ static const char * const mem_lvl[] = {
"HIT",
"MISS",
"L1",
- "LFB",
+ "LFB/MAB",
"L2",
"L3",
"Local RAM",
@@ -298,7 +298,7 @@ static const char * const mem_lvlnum[] = {
[PERF_MEM_LVLNUM_CXL] = "CXL",
[PERF_MEM_LVLNUM_IO] = "I/O",
[PERF_MEM_LVLNUM_ANY_CACHE] = "Any cache",
- [PERF_MEM_LVLNUM_LFB] = "LFB",
+ [PERF_MEM_LVLNUM_LFB] = "LFB/MAB",
[PERF_MEM_LVLNUM_RAM] = "RAM",
[PERF_MEM_LVLNUM_PMEM] = "PMEM",
[PERF_MEM_LVLNUM_NA] = "N/A",
--
2.37.3
* [PATCH v4 8/8] perf script: Add missing fields in usage hint
From: Ravi Bangoria @ 2022-10-06 15:39 UTC (permalink / raw)
To: acme, peterz
Cc: ravi.bangoria, jolsa, namhyung, eranian, irogers, jmario,
leo.yan, alisaidi, ak, kan.liang, dave.hansen, hpa, mingo,
mark.rutland, alexander.shishkin, tglx, bp, x86,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
kim.phillips, santosh.shukla
A few fields are missing from the usage message printed when an invalid
field option is passed. Add them to the list.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-script.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7fa467ed91dc..7ca238277d83 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3846,9 +3846,10 @@ int cmd_script(int argc, const char **argv)
"Valid types: hw,sw,trace,raw,synth. "
"Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
"addr,symoff,srcline,period,iregs,uregs,brstack,"
- "brstacksym,flags,bpf-output,brstackinsn,brstackinsnlen,brstackoff,"
- "callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc,tod,"
- "data_page_size,code_page_size,ins_lat",
+ "brstacksym,flags,data_src,weight,bpf-output,brstackinsn,"
+ "brstackinsnlen,brstackoff,callindent,insn,insnlen,synth,"
+ "phys_addr,metric,misc,srccode,ipc,tod,data_page_size,"
+ "code_page_size,ins_lat",
parse_output_fields),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
"system-wide collection from all CPUs"),
--
2.37.3