* [PATCH 0/4] perf mem amd: Fix few logic bugs
@ 2023-06-13 9:55 Ravi Bangoria
2023-06-13 9:55 ` [PATCH 1/4] perf pmus: Describe semantics of 'core_pmus' and 'other_pmus' Ravi Bangoria
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Ravi Bangoria @ 2023-06-13 9:55 UTC (permalink / raw)
To: acme
Cc: ravi.bangoria, irogers, jolsa, namhyung, mark.rutland, peterz,
adrian.hunter, kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
Recent pmu refactoring changes[1] introduced a notion of core vs other
pmus and made perf mem/c2c code depend only on core pmus, which is
logically wrong for AMD as perf mem/c2c on AMD depends on IBS OP pmu,
not the core pmu. Although user visible perf mem/c2c functionality is
still working fine, internal code logic is wrong. Fix those.
[1] https://lore.kernel.org/r/20230527072210.2900565-1-irogers@google.com
Ravi Bangoria (4):
perf pmus: Describe semantics of 'core_pmus' and 'other_pmus'
perf tool x86: Consolidate is_amd check into single function
perf mem amd: Fix perf_pmus__num_mem_pmus()
perf mem amd: Scan all PMUs instead of just core ones
tools/perf/arch/x86/util/Build | 1 +
tools/perf/arch/x86/util/env.c | 19 +++++++++++++++++++
tools/perf/arch/x86/util/env.h | 7 +++++++
tools/perf/arch/x86/util/evsel.c | 16 ++--------------
tools/perf/arch/x86/util/mem-events.c | 24 +++++++-----------------
tools/perf/arch/x86/util/pmu.c | 15 +++++++++++++++
tools/perf/util/mem-events.c | 16 ++++++++++++----
tools/perf/util/mem-events.h | 1 +
tools/perf/util/pmus.c | 15 ++++++++++++++-
9 files changed, 78 insertions(+), 36 deletions(-)
create mode 100644 tools/perf/arch/x86/util/env.c
create mode 100644 tools/perf/arch/x86/util/env.h
--
2.40.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/4] perf pmus: Describe semantics of 'core_pmus' and 'other_pmus'
2023-06-13 9:55 [PATCH 0/4] perf mem amd: Fix few logic bugs Ravi Bangoria
@ 2023-06-13 9:55 ` Ravi Bangoria
2023-06-13 9:55 ` [PATCH 2/4] perf tool x86: Consolidate is_amd check into single function Ravi Bangoria
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: Ravi Bangoria @ 2023-06-13 9:55 UTC (permalink / raw)
To: acme
Cc: ravi.bangoria, irogers, jolsa, namhyung, mark.rutland, peterz,
adrian.hunter, kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
'core_pmus' and 'other_pmus' are independent of hw core pmu and uncore
pmus. For example, AMD IBS PMUs are present in each SMT-thread but they
belongs to 'other_pmus'. Add a comment describing what these list
contains and how they are treated.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
tools/perf/util/pmus.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index e1d0a93147e5..e505d2fef828 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -12,6 +12,19 @@
#include "pmu.h"
#include "print-events.h"
+/*
+ * core_pmus: A PMU belongs to core_pmus if it's name is "cpu" or it's sysfs
+ * directory contains "cpus" file. All PMUs belonging to core_pmus
+ * must have pmu->is_core=1. If there are more than one PMUs in
+ * this list, perf interprets it as a heterogeneous platform.
+ * (FWIW, certain ARM platforms having heterogeneous cores uses
+ * homogeneous PMU in which case it will be treated as homogeneous
+ * platform by perf because core_pmus will have only one entry.)
+ * other_pmus: All other PMUs which are not part of core_pmus list. Does not
+ * matter whether it is a per SMT-thread or outside of the core in
+ * hw. i.e. PMUs belonging to other_pmus must have pmu->is_core=0
+ * but pmu->is_uncore could be 0 or 1.
+ */
static LIST_HEAD(core_pmus);
static LIST_HEAD(other_pmus);
static bool read_sysfs_core_pmus;
--
2.40.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/4] perf tool x86: Consolidate is_amd check into single function
2023-06-13 9:55 [PATCH 0/4] perf mem amd: Fix few logic bugs Ravi Bangoria
2023-06-13 9:55 ` [PATCH 1/4] perf pmus: Describe semantics of 'core_pmus' and 'other_pmus' Ravi Bangoria
@ 2023-06-13 9:55 ` Ravi Bangoria
2023-06-13 19:49 ` Arnaldo Carvalho de Melo
2023-06-13 19:49 ` Ian Rogers
2023-06-13 9:55 ` [PATCH 3/4] perf mem amd: Fix perf_pmus__num_mem_pmus() Ravi Bangoria
2023-06-13 9:55 ` [PATCH 4/4] perf mem amd: Scan all PMUs instead of just core ones Ravi Bangoria
3 siblings, 2 replies; 12+ messages in thread
From: Ravi Bangoria @ 2023-06-13 9:55 UTC (permalink / raw)
To: acme
Cc: ravi.bangoria, irogers, jolsa, namhyung, mark.rutland, peterz,
adrian.hunter, kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
There are multiple places where x86 specific code determines AMD vs
Intel arch and acts based on that. Consolidate those checks into a
single function.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
tools/perf/arch/x86/util/Build | 1 +
tools/perf/arch/x86/util/env.c | 19 +++++++++++++++++++
tools/perf/arch/x86/util/env.h | 7 +++++++
tools/perf/arch/x86/util/evsel.c | 16 ++--------------
tools/perf/arch/x86/util/mem-events.c | 19 ++-----------------
5 files changed, 31 insertions(+), 31 deletions(-)
create mode 100644 tools/perf/arch/x86/util/env.c
create mode 100644 tools/perf/arch/x86/util/env.h
diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 195ccfdef7aa..005907cb97d8 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -10,6 +10,7 @@ perf-y += evlist.o
perf-y += mem-events.o
perf-y += evsel.o
perf-y += iostat.o
+perf-y += env.o
perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
diff --git a/tools/perf/arch/x86/util/env.c b/tools/perf/arch/x86/util/env.c
new file mode 100644
index 000000000000..33b87f8ac1cc
--- /dev/null
+++ b/tools/perf/arch/x86/util/env.c
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "linux/string.h"
+#include "util/env.h"
+#include "env.h"
+
+bool x86__is_amd_cpu(void)
+{
+ struct perf_env env = { .total_mem = 0, };
+ static int is_amd; /* 0: Uninitialized, 1: Yes, -1: No */
+
+ if (is_amd)
+ goto ret;
+
+ perf_env__cpuid(&env);
+ is_amd = env.cpuid && strstarts(env.cpuid, "AuthenticAMD") ? 1 : -1;
+
+ret:
+ return is_amd >= 1 ? true : false;
+}
diff --git a/tools/perf/arch/x86/util/env.h b/tools/perf/arch/x86/util/env.h
new file mode 100644
index 000000000000..d78f080b6b3f
--- /dev/null
+++ b/tools/perf/arch/x86/util/env.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _X86_ENV_H
+#define _X86_ENV_H
+
+bool x86__is_amd_cpu(void);
+
+#endif /* _X86_ENV_H */
diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index 25da46c8cca9..512c2d885d24 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -8,6 +8,7 @@
#include "linux/string.h"
#include "evsel.h"
#include "util/debug.h"
+#include "env.h"
#define IBS_FETCH_L3MISSONLY (1ULL << 59)
#define IBS_OP_L3MISSONLY (1ULL << 16)
@@ -78,23 +79,10 @@ void arch__post_evsel_config(struct evsel *evsel, struct perf_event_attr *attr)
{
struct perf_pmu *evsel_pmu, *ibs_fetch_pmu, *ibs_op_pmu;
static int warned_once;
- /* 0: Uninitialized, 1: Yes, -1: No */
- static int is_amd;
- if (warned_once || is_amd == -1)
+ if (warned_once || !x86__is_amd_cpu())
return;
- if (!is_amd) {
- struct perf_env *env = evsel__env(evsel);
-
- if (!perf_env__cpuid(env) || !env->cpuid ||
- !strstarts(env->cpuid, "AuthenticAMD")) {
- is_amd = -1;
- return;
- }
- is_amd = 1;
- }
-
evsel_pmu = evsel__find_pmu(evsel);
if (!evsel_pmu)
return;
diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index 32879d12a8d5..a8a782bcb121 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -5,6 +5,7 @@
#include "map_symbol.h"
#include "mem-events.h"
#include "linux/string.h"
+#include "env.h"
static char mem_loads_name[100];
static bool mem_loads_name__init;
@@ -27,28 +28,12 @@ static struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
E("mem-ldst", "ibs_op//", "ibs_op"),
};
-static int perf_mem_is_amd_cpu(void)
-{
- struct perf_env env = { .total_mem = 0, };
-
- perf_env__cpuid(&env);
- if (env.cpuid && strstarts(env.cpuid, "AuthenticAMD"))
- return 1;
- return -1;
-}
-
struct perf_mem_event *perf_mem_events__ptr(int i)
{
- /* 0: Uninitialized, 1: Yes, -1: No */
- static int is_amd;
-
if (i >= PERF_MEM_EVENTS__MAX)
return NULL;
- if (!is_amd)
- is_amd = perf_mem_is_amd_cpu();
-
- if (is_amd == 1)
+ if (x86__is_amd_cpu())
return &perf_mem_events_amd[i];
return &perf_mem_events_intel[i];
--
2.40.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/4] perf mem amd: Fix perf_pmus__num_mem_pmus()
2023-06-13 9:55 [PATCH 0/4] perf mem amd: Fix few logic bugs Ravi Bangoria
2023-06-13 9:55 ` [PATCH 1/4] perf pmus: Describe semantics of 'core_pmus' and 'other_pmus' Ravi Bangoria
2023-06-13 9:55 ` [PATCH 2/4] perf tool x86: Consolidate is_amd check into single function Ravi Bangoria
@ 2023-06-13 9:55 ` Ravi Bangoria
2023-06-13 15:25 ` Ian Rogers
2023-06-13 9:55 ` [PATCH 4/4] perf mem amd: Scan all PMUs instead of just core ones Ravi Bangoria
3 siblings, 1 reply; 12+ messages in thread
From: Ravi Bangoria @ 2023-06-13 9:55 UTC (permalink / raw)
To: acme
Cc: ravi.bangoria, irogers, jolsa, namhyung, mark.rutland, peterz,
adrian.hunter, kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
AMD cpus does not contain hybrid cores. Also, perf mem/c2c on AMD
internally uses IBS OP PMU, not the core PMU.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
tools/perf/arch/x86/util/pmu.c | 15 +++++++++++++++
tools/perf/util/pmus.c | 2 +-
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
index 3c0de3370d7e..8a20d28d9927 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -14,6 +14,8 @@
#include "../../../util/intel-bts.h"
#include "../../../util/pmu.h"
#include "../../../util/fncache.h"
+#include "../../../util/pmus.h"
+#include "env.h"
struct pmu_alias {
char *name;
@@ -168,3 +170,16 @@ char *pmu_find_alias_name(const char *name)
return __pmu_find_alias_name(name);
}
+
+int perf_pmus__num_mem_pmus(void)
+{
+ /*
+ * AMD does not have hybrid cores and also uses IBS OP
+ * pmu for perf mem/c2c.
+ */
+ if (x86__is_amd_cpu())
+ return 1;
+
+ /* Intel uses core pmus for perf mem/c2c */
+ return perf_pmus__num_core_pmus();
+}
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index e505d2fef828..0ed943932945 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -240,7 +240,7 @@ const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str)
return NULL;
}
-int perf_pmus__num_mem_pmus(void)
+int __weak perf_pmus__num_mem_pmus(void)
{
/* All core PMUs are for mem events. */
return perf_pmus__num_core_pmus();
--
2.40.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/4] perf mem amd: Scan all PMUs instead of just core ones
2023-06-13 9:55 [PATCH 0/4] perf mem amd: Fix few logic bugs Ravi Bangoria
` (2 preceding siblings ...)
2023-06-13 9:55 ` [PATCH 3/4] perf mem amd: Fix perf_pmus__num_mem_pmus() Ravi Bangoria
@ 2023-06-13 9:55 ` Ravi Bangoria
2023-06-13 15:34 ` Ian Rogers
3 siblings, 1 reply; 12+ messages in thread
From: Ravi Bangoria @ 2023-06-13 9:55 UTC (permalink / raw)
To: acme
Cc: ravi.bangoria, irogers, jolsa, namhyung, mark.rutland, peterz,
adrian.hunter, kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
Scanning only core PMUs is not sufficient on AMD since perf mem on
AMD uses IBS OP PMU, which is independent of core PMU.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
tools/perf/arch/x86/util/mem-events.c | 5 +++++
tools/perf/util/mem-events.c | 16 ++++++++++++----
tools/perf/util/mem-events.h | 1 +
3 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
index a8a782bcb121..43af872e89a6 100644
--- a/tools/perf/arch/x86/util/mem-events.c
+++ b/tools/perf/arch/x86/util/mem-events.c
@@ -91,3 +91,8 @@ char *perf_mem_events__name(int i, char *pmu_name)
return (char *)e->name;
}
+
+bool perf_mem_events__via_core_pmus(void)
+{
+ return !x86__is_amd_cpu();
+}
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index be15aadb6b14..0c04f883d634 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -109,6 +109,14 @@ static bool perf_mem_event__supported(const char *mnt, char *sysfs_name)
return !stat(path, &st);
}
+bool __weak perf_mem_events__via_core_pmus(void)
+{
+ return true;
+}
+
+#define perf_mem_scan_next_pmu(pmu) \
+ (perf_mem_events__via_core_pmus() ? perf_pmus__scan_core(pmu) : perf_pmus__scan(pmu))
+
int perf_mem_events__init(void)
{
const char *mnt = sysfs__mount();
@@ -130,7 +138,7 @@ int perf_mem_events__init(void)
if (!e->tag)
continue;
- while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
+ while ((pmu = perf_mem_scan_next_pmu(pmu)) != NULL) {
scnprintf(sysfs_name, sizeof(sysfs_name), e->sysfs_name, pmu->name);
e->supported |= perf_mem_event__supported(mnt, sysfs_name);
}
@@ -165,7 +173,7 @@ static void perf_mem_events__print_unsupport_hybrid(struct perf_mem_event *e,
char sysfs_name[100];
struct perf_pmu *pmu = NULL;
- while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
+ while ((pmu = perf_mem_scan_next_pmu(pmu)) != NULL) {
scnprintf(sysfs_name, sizeof(sysfs_name), e->sysfs_name,
pmu->name);
if (!perf_mem_event__supported(mnt, sysfs_name)) {
@@ -188,7 +196,7 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
if (!e->record)
continue;
- if (perf_pmus__num_core_pmus() == 1) {
+ if (perf_pmus__num_mem_pmus() == 1) {
if (!e->supported) {
pr_err("failed: event '%s' not supported\n",
perf_mem_events__name(j, NULL));
@@ -203,7 +211,7 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
return -1;
}
- while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
+ while ((pmu = perf_mem_scan_next_pmu(pmu)) != NULL) {
rec_argv[i++] = "-e";
s = perf_mem_events__name(j, pmu->name);
if (s) {
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 12372309d60e..d650eb311113 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -36,6 +36,7 @@ enum {
extern unsigned int perf_mem_events__loads_ldlat;
int perf_mem_events__parse(const char *str);
+bool perf_mem_events__via_core_pmus(void);
int perf_mem_events__init(void);
char *perf_mem_events__name(int i, char *pmu_name);
--
2.40.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 3/4] perf mem amd: Fix perf_pmus__num_mem_pmus()
2023-06-13 9:55 ` [PATCH 3/4] perf mem amd: Fix perf_pmus__num_mem_pmus() Ravi Bangoria
@ 2023-06-13 15:25 ` Ian Rogers
2023-06-14 3:49 ` Ravi Bangoria
0 siblings, 1 reply; 12+ messages in thread
From: Ian Rogers @ 2023-06-13 15:25 UTC (permalink / raw)
To: Ravi Bangoria
Cc: acme, jolsa, namhyung, mark.rutland, peterz, adrian.hunter,
kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
On Tue, Jun 13, 2023 at 2:56 AM Ravi Bangoria <ravi.bangoria@amd.com> wrote:
>
> AMD cpus does not contain hybrid cores. Also, perf mem/c2c on AMD
> internally uses IBS OP PMU, not the core PMU.
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
> tools/perf/arch/x86/util/pmu.c | 15 +++++++++++++++
> tools/perf/util/pmus.c | 2 +-
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
> index 3c0de3370d7e..8a20d28d9927 100644
> --- a/tools/perf/arch/x86/util/pmu.c
> +++ b/tools/perf/arch/x86/util/pmu.c
> @@ -14,6 +14,8 @@
> #include "../../../util/intel-bts.h"
> #include "../../../util/pmu.h"
> #include "../../../util/fncache.h"
> +#include "../../../util/pmus.h"
> +#include "env.h"
>
> struct pmu_alias {
> char *name;
> @@ -168,3 +170,16 @@ char *pmu_find_alias_name(const char *name)
>
> return __pmu_find_alias_name(name);
> }
> +
> +int perf_pmus__num_mem_pmus(void)
> +{
> + /*
> + * AMD does not have hybrid cores and also uses IBS OP
> + * pmu for perf mem/c2c.
> + */
> + if (x86__is_amd_cpu())
> + return 1;
The code and comment seem out of sync here. For the hybrid part
perf_pmus__num_core_pmus() will yield 1 if there is no hybrid, so we
can just use perf_pmus__num_core_pmus(). For the IBS OP part, does
that mean that AMD should have 2 mem pmus? Or is IBS OP a core PMU?
Can we add this as an example in the core/other documentation in patch
1, as you've done for ARM, for clarity.
Thanks,
Ian
> +
> + /* Intel uses core pmus for perf mem/c2c */
> + return perf_pmus__num_core_pmus();
> +}
> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index e505d2fef828..0ed943932945 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -240,7 +240,7 @@ const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str)
> return NULL;
> }
>
> -int perf_pmus__num_mem_pmus(void)
> +int __weak perf_pmus__num_mem_pmus(void)
> {
> /* All core PMUs are for mem events. */
> return perf_pmus__num_core_pmus();
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 4/4] perf mem amd: Scan all PMUs instead of just core ones
2023-06-13 9:55 ` [PATCH 4/4] perf mem amd: Scan all PMUs instead of just core ones Ravi Bangoria
@ 2023-06-13 15:34 ` Ian Rogers
2023-06-14 3:58 ` Ravi Bangoria
0 siblings, 1 reply; 12+ messages in thread
From: Ian Rogers @ 2023-06-13 15:34 UTC (permalink / raw)
To: Ravi Bangoria
Cc: acme, jolsa, namhyung, mark.rutland, peterz, adrian.hunter,
kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
On Tue, Jun 13, 2023 at 2:56 AM Ravi Bangoria <ravi.bangoria@amd.com> wrote:
>
> Scanning only core PMUs is not sufficient on AMD since perf mem on
> AMD uses IBS OP PMU, which is independent of core PMU.
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
> tools/perf/arch/x86/util/mem-events.c | 5 +++++
> tools/perf/util/mem-events.c | 16 ++++++++++++----
> tools/perf/util/mem-events.h | 1 +
> 3 files changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
> index a8a782bcb121..43af872e89a6 100644
> --- a/tools/perf/arch/x86/util/mem-events.c
> +++ b/tools/perf/arch/x86/util/mem-events.c
> @@ -91,3 +91,8 @@ char *perf_mem_events__name(int i, char *pmu_name)
>
> return (char *)e->name;
> }
> +
> +bool perf_mem_events__via_core_pmus(void)
> +{
> + return !x86__is_amd_cpu();
> +}
> diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
> index be15aadb6b14..0c04f883d634 100644
> --- a/tools/perf/util/mem-events.c
> +++ b/tools/perf/util/mem-events.c
> @@ -109,6 +109,14 @@ static bool perf_mem_event__supported(const char *mnt, char *sysfs_name)
> return !stat(path, &st);
> }
>
> +bool __weak perf_mem_events__via_core_pmus(void)
> +{
> + return true;
> +}
> +
> +#define perf_mem_scan_next_pmu(pmu) \
> + (perf_mem_events__via_core_pmus() ? perf_pmus__scan_core(pmu) : perf_pmus__scan(pmu))
> +
> int perf_mem_events__init(void)
> {
> const char *mnt = sysfs__mount();
> @@ -130,7 +138,7 @@ int perf_mem_events__init(void)
> if (!e->tag)
> continue;
>
> - while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
> + while ((pmu = perf_mem_scan_next_pmu(pmu)) != NULL) {
> scnprintf(sysfs_name, sizeof(sysfs_name), e->sysfs_name, pmu->name);
> e->supported |= perf_mem_event__supported(mnt, sysfs_name);
> }
> @@ -165,7 +173,7 @@ static void perf_mem_events__print_unsupport_hybrid(struct perf_mem_event *e,
> char sysfs_name[100];
> struct perf_pmu *pmu = NULL;
>
> - while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
> + while ((pmu = perf_mem_scan_next_pmu(pmu)) != NULL) {
It was my mistake to optimize this, I think we can just go back to:
perf_pmus__scan(pmu)
which would remove a lot of the weak/macros etc. here. We can have a
comment as to why this is scan not scan_core, because of AMD. I plan
to further improve overhead of PMUs so I'm not worried about losing
the small performance win from this.
Thanks,
Ian
> scnprintf(sysfs_name, sizeof(sysfs_name), e->sysfs_name,
> pmu->name);
> if (!perf_mem_event__supported(mnt, sysfs_name)) {
> @@ -188,7 +196,7 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
> if (!e->record)
> continue;
>
> - if (perf_pmus__num_core_pmus() == 1) {
> + if (perf_pmus__num_mem_pmus() == 1) {
> if (!e->supported) {
> pr_err("failed: event '%s' not supported\n",
> perf_mem_events__name(j, NULL));
> @@ -203,7 +211,7 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
> return -1;
> }
>
> - while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
> + while ((pmu = perf_mem_scan_next_pmu(pmu)) != NULL) {
> rec_argv[i++] = "-e";
> s = perf_mem_events__name(j, pmu->name);
> if (s) {
> diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
> index 12372309d60e..d650eb311113 100644
> --- a/tools/perf/util/mem-events.h
> +++ b/tools/perf/util/mem-events.h
> @@ -36,6 +36,7 @@ enum {
> extern unsigned int perf_mem_events__loads_ldlat;
>
> int perf_mem_events__parse(const char *str);
> +bool perf_mem_events__via_core_pmus(void);
> int perf_mem_events__init(void);
>
> char *perf_mem_events__name(int i, char *pmu_name);
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/4] perf tool x86: Consolidate is_amd check into single function
2023-06-13 9:55 ` [PATCH 2/4] perf tool x86: Consolidate is_amd check into single function Ravi Bangoria
@ 2023-06-13 19:49 ` Arnaldo Carvalho de Melo
2023-06-13 19:49 ` Ian Rogers
1 sibling, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-06-13 19:49 UTC (permalink / raw)
To: Ravi Bangoria
Cc: irogers, jolsa, namhyung, mark.rutland, peterz, adrian.hunter,
kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
Em Tue, Jun 13, 2023 at 03:25:04PM +0530, Ravi Bangoria escreveu:
> There are multiple places where x86 specific code determines AMD vs
> Intel arch and acts based on that. Consolidate those checks into a
> single function.
I'm cherry picking this one now.
- Arnaldo
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
> tools/perf/arch/x86/util/Build | 1 +
> tools/perf/arch/x86/util/env.c | 19 +++++++++++++++++++
> tools/perf/arch/x86/util/env.h | 7 +++++++
> tools/perf/arch/x86/util/evsel.c | 16 ++--------------
> tools/perf/arch/x86/util/mem-events.c | 19 ++-----------------
> 5 files changed, 31 insertions(+), 31 deletions(-)
> create mode 100644 tools/perf/arch/x86/util/env.c
> create mode 100644 tools/perf/arch/x86/util/env.h
>
> diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
> index 195ccfdef7aa..005907cb97d8 100644
> --- a/tools/perf/arch/x86/util/Build
> +++ b/tools/perf/arch/x86/util/Build
> @@ -10,6 +10,7 @@ perf-y += evlist.o
> perf-y += mem-events.o
> perf-y += evsel.o
> perf-y += iostat.o
> +perf-y += env.o
>
> perf-$(CONFIG_DWARF) += dwarf-regs.o
> perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
> diff --git a/tools/perf/arch/x86/util/env.c b/tools/perf/arch/x86/util/env.c
> new file mode 100644
> index 000000000000..33b87f8ac1cc
> --- /dev/null
> +++ b/tools/perf/arch/x86/util/env.c
> @@ -0,0 +1,19 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include "linux/string.h"
> +#include "util/env.h"
> +#include "env.h"
> +
> +bool x86__is_amd_cpu(void)
> +{
> + struct perf_env env = { .total_mem = 0, };
> + static int is_amd; /* 0: Uninitialized, 1: Yes, -1: No */
> +
> + if (is_amd)
> + goto ret;
> +
> + perf_env__cpuid(&env);
> + is_amd = env.cpuid && strstarts(env.cpuid, "AuthenticAMD") ? 1 : -1;
> +
> +ret:
> + return is_amd >= 1 ? true : false;
> +}
> diff --git a/tools/perf/arch/x86/util/env.h b/tools/perf/arch/x86/util/env.h
> new file mode 100644
> index 000000000000..d78f080b6b3f
> --- /dev/null
> +++ b/tools/perf/arch/x86/util/env.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _X86_ENV_H
> +#define _X86_ENV_H
> +
> +bool x86__is_amd_cpu(void);
> +
> +#endif /* _X86_ENV_H */
> diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
> index 25da46c8cca9..512c2d885d24 100644
> --- a/tools/perf/arch/x86/util/evsel.c
> +++ b/tools/perf/arch/x86/util/evsel.c
> @@ -8,6 +8,7 @@
> #include "linux/string.h"
> #include "evsel.h"
> #include "util/debug.h"
> +#include "env.h"
>
> #define IBS_FETCH_L3MISSONLY (1ULL << 59)
> #define IBS_OP_L3MISSONLY (1ULL << 16)
> @@ -78,23 +79,10 @@ void arch__post_evsel_config(struct evsel *evsel, struct perf_event_attr *attr)
> {
> struct perf_pmu *evsel_pmu, *ibs_fetch_pmu, *ibs_op_pmu;
> static int warned_once;
> - /* 0: Uninitialized, 1: Yes, -1: No */
> - static int is_amd;
>
> - if (warned_once || is_amd == -1)
> + if (warned_once || !x86__is_amd_cpu())
> return;
>
> - if (!is_amd) {
> - struct perf_env *env = evsel__env(evsel);
> -
> - if (!perf_env__cpuid(env) || !env->cpuid ||
> - !strstarts(env->cpuid, "AuthenticAMD")) {
> - is_amd = -1;
> - return;
> - }
> - is_amd = 1;
> - }
> -
> evsel_pmu = evsel__find_pmu(evsel);
> if (!evsel_pmu)
> return;
> diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
> index 32879d12a8d5..a8a782bcb121 100644
> --- a/tools/perf/arch/x86/util/mem-events.c
> +++ b/tools/perf/arch/x86/util/mem-events.c
> @@ -5,6 +5,7 @@
> #include "map_symbol.h"
> #include "mem-events.h"
> #include "linux/string.h"
> +#include "env.h"
>
> static char mem_loads_name[100];
> static bool mem_loads_name__init;
> @@ -27,28 +28,12 @@ static struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
> E("mem-ldst", "ibs_op//", "ibs_op"),
> };
>
> -static int perf_mem_is_amd_cpu(void)
> -{
> - struct perf_env env = { .total_mem = 0, };
> -
> - perf_env__cpuid(&env);
> - if (env.cpuid && strstarts(env.cpuid, "AuthenticAMD"))
> - return 1;
> - return -1;
> -}
> -
> struct perf_mem_event *perf_mem_events__ptr(int i)
> {
> - /* 0: Uninitialized, 1: Yes, -1: No */
> - static int is_amd;
> -
> if (i >= PERF_MEM_EVENTS__MAX)
> return NULL;
>
> - if (!is_amd)
> - is_amd = perf_mem_is_amd_cpu();
> -
> - if (is_amd == 1)
> + if (x86__is_amd_cpu())
> return &perf_mem_events_amd[i];
>
> return &perf_mem_events_intel[i];
> --
> 2.40.1
>
--
- Arnaldo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/4] perf tool x86: Consolidate is_amd check into single function
2023-06-13 9:55 ` [PATCH 2/4] perf tool x86: Consolidate is_amd check into single function Ravi Bangoria
2023-06-13 19:49 ` Arnaldo Carvalho de Melo
@ 2023-06-13 19:49 ` Ian Rogers
2023-06-13 20:28 ` Arnaldo Carvalho de Melo
1 sibling, 1 reply; 12+ messages in thread
From: Ian Rogers @ 2023-06-13 19:49 UTC (permalink / raw)
To: Ravi Bangoria
Cc: acme, jolsa, namhyung, mark.rutland, peterz, adrian.hunter,
kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
On Tue, Jun 13, 2023 at 2:56 AM Ravi Bangoria <ravi.bangoria@amd.com> wrote:
>
> There are multiple places where x86 specific code determines AMD vs
> Intel arch and acts based on that. Consolidate those checks into a
> single function.
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
> ---
> tools/perf/arch/x86/util/Build | 1 +
> tools/perf/arch/x86/util/env.c | 19 +++++++++++++++++++
> tools/perf/arch/x86/util/env.h | 7 +++++++
> tools/perf/arch/x86/util/evsel.c | 16 ++--------------
> tools/perf/arch/x86/util/mem-events.c | 19 ++-----------------
> 5 files changed, 31 insertions(+), 31 deletions(-)
> create mode 100644 tools/perf/arch/x86/util/env.c
> create mode 100644 tools/perf/arch/x86/util/env.h
>
> diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
> index 195ccfdef7aa..005907cb97d8 100644
> --- a/tools/perf/arch/x86/util/Build
> +++ b/tools/perf/arch/x86/util/Build
> @@ -10,6 +10,7 @@ perf-y += evlist.o
> perf-y += mem-events.o
> perf-y += evsel.o
> perf-y += iostat.o
> +perf-y += env.o
>
> perf-$(CONFIG_DWARF) += dwarf-regs.o
> perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
> diff --git a/tools/perf/arch/x86/util/env.c b/tools/perf/arch/x86/util/env.c
> new file mode 100644
> index 000000000000..33b87f8ac1cc
> --- /dev/null
> +++ b/tools/perf/arch/x86/util/env.c
> @@ -0,0 +1,19 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include "linux/string.h"
> +#include "util/env.h"
> +#include "env.h"
> +
> +bool x86__is_amd_cpu(void)
> +{
> + struct perf_env env = { .total_mem = 0, };
> + static int is_amd; /* 0: Uninitialized, 1: Yes, -1: No */
> +
> + if (is_amd)
> + goto ret;
> +
> + perf_env__cpuid(&env);
> + is_amd = env.cpuid && strstarts(env.cpuid, "AuthenticAMD") ? 1 : -1;
> +
> +ret:
> + return is_amd >= 1 ? true : false;
> +}
> diff --git a/tools/perf/arch/x86/util/env.h b/tools/perf/arch/x86/util/env.h
> new file mode 100644
> index 000000000000..d78f080b6b3f
> --- /dev/null
> +++ b/tools/perf/arch/x86/util/env.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _X86_ENV_H
> +#define _X86_ENV_H
> +
> +bool x86__is_amd_cpu(void);
> +
> +#endif /* _X86_ENV_H */
> diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
> index 25da46c8cca9..512c2d885d24 100644
> --- a/tools/perf/arch/x86/util/evsel.c
> +++ b/tools/perf/arch/x86/util/evsel.c
> @@ -8,6 +8,7 @@
> #include "linux/string.h"
> #include "evsel.h"
> #include "util/debug.h"
> +#include "env.h"
>
> #define IBS_FETCH_L3MISSONLY (1ULL << 59)
> #define IBS_OP_L3MISSONLY (1ULL << 16)
> @@ -78,23 +79,10 @@ void arch__post_evsel_config(struct evsel *evsel, struct perf_event_attr *attr)
> {
> struct perf_pmu *evsel_pmu, *ibs_fetch_pmu, *ibs_op_pmu;
> static int warned_once;
> - /* 0: Uninitialized, 1: Yes, -1: No */
> - static int is_amd;
>
> - if (warned_once || is_amd == -1)
> + if (warned_once || !x86__is_amd_cpu())
> return;
>
> - if (!is_amd) {
> - struct perf_env *env = evsel__env(evsel);
> -
> - if (!perf_env__cpuid(env) || !env->cpuid ||
> - !strstarts(env->cpuid, "AuthenticAMD")) {
> - is_amd = -1;
> - return;
> - }
> - is_amd = 1;
> - }
> -
> evsel_pmu = evsel__find_pmu(evsel);
> if (!evsel_pmu)
> return;
> diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
> index 32879d12a8d5..a8a782bcb121 100644
> --- a/tools/perf/arch/x86/util/mem-events.c
> +++ b/tools/perf/arch/x86/util/mem-events.c
> @@ -5,6 +5,7 @@
> #include "map_symbol.h"
> #include "mem-events.h"
> #include "linux/string.h"
> +#include "env.h"
>
> static char mem_loads_name[100];
> static bool mem_loads_name__init;
> @@ -27,28 +28,12 @@ static struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
> E("mem-ldst", "ibs_op//", "ibs_op"),
> };
>
> -static int perf_mem_is_amd_cpu(void)
> -{
> - struct perf_env env = { .total_mem = 0, };
> -
> - perf_env__cpuid(&env);
> - if (env.cpuid && strstarts(env.cpuid, "AuthenticAMD"))
> - return 1;
> - return -1;
> -}
> -
> struct perf_mem_event *perf_mem_events__ptr(int i)
> {
> - /* 0: Uninitialized, 1: Yes, -1: No */
> - static int is_amd;
> -
> if (i >= PERF_MEM_EVENTS__MAX)
> return NULL;
>
> - if (!is_amd)
> - is_amd = perf_mem_is_amd_cpu();
> -
> - if (is_amd == 1)
> + if (x86__is_amd_cpu())
> return &perf_mem_events_amd[i];
>
> return &perf_mem_events_intel[i];
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/4] perf tool x86: Consolidate is_amd check into single function
2023-06-13 19:49 ` Ian Rogers
@ 2023-06-13 20:28 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-06-13 20:28 UTC (permalink / raw)
To: Ian Rogers
Cc: Ravi Bangoria, jolsa, namhyung, mark.rutland, peterz,
adrian.hunter, kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla
Em Tue, Jun 13, 2023 at 12:49:50PM -0700, Ian Rogers escreveu:
> On Tue, Jun 13, 2023 at 2:56 AM Ravi Bangoria <ravi.bangoria@amd.com> wrote:
> >
> > There are multiple places where x86 specific code determines AMD vs
> > Intel arch and acts based on that. Consolidate those checks into a
> > single function.
> >
> > Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
>
> Acked-by: Ian Rogers <irogers@google.com>
Thanks, added to the cset.
- Arnaldo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/4] perf mem amd: Fix perf_pmus__num_mem_pmus()
2023-06-13 15:25 ` Ian Rogers
@ 2023-06-14 3:49 ` Ravi Bangoria
0 siblings, 0 replies; 12+ messages in thread
From: Ravi Bangoria @ 2023-06-14 3:49 UTC (permalink / raw)
To: Ian Rogers
Cc: acme, jolsa, namhyung, mark.rutland, peterz, adrian.hunter,
kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla, Ravi Bangoria
>> +int perf_pmus__num_mem_pmus(void)
>> +{
>> + /*
>> + * AMD does not have hybrid cores and also uses IBS OP
>> + * pmu for perf mem/c2c.
>> + */
>> + if (x86__is_amd_cpu())
>> + return 1;
>
> The code and comment seem out of sync here. For the hybrid part
> perf_pmus__num_core_pmus() will yield 1 if there is no hybrid, so we
> can just use perf_pmus__num_core_pmus(). For the IBS OP part, does
> that mean that AMD should have 2 mem pmus? Or is IBS OP a core PMU?
Sure. Let me remove hybrid part from the comment.
There are two IBS pmus: ibs_fetch// and ibs_op//. Both of them are
independent of the core pmu (cpu//). An instance of all 3 pmus is
present in each hw SMT thread. And, perf mem/c2c internally uses
ibs_op// pmu on AMD. See tools/perf/arch/x86/util/mem-events.c
static struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
E(NULL, NULL, NULL),
E(NULL, NULL, NULL),
E("mem-ldst", "ibs_op//", "ibs_op"),
};
Hope this clarifies.
> Can we add this as an example in the core/other documentation in patch
> 1, as you've done for ARM, for clarity.
Sure. Let me also add IBS example in the patch #1 comment.
Thanks,
Ravi
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 4/4] perf mem amd: Scan all PMUs instead of just core ones
2023-06-13 15:34 ` Ian Rogers
@ 2023-06-14 3:58 ` Ravi Bangoria
0 siblings, 0 replies; 12+ messages in thread
From: Ravi Bangoria @ 2023-06-14 3:58 UTC (permalink / raw)
To: Ian Rogers
Cc: acme, jolsa, namhyung, mark.rutland, peterz, adrian.hunter,
kan.liang, james.clark, alisaidi, leo.yan, maddy,
linux-perf-users, linux-kernel, sandipan.das, ananth.narayan,
santosh.shukla, Ravi Bangoria
>> @@ -165,7 +173,7 @@ static void perf_mem_events__print_unsupport_hybrid(struct perf_mem_event *e,
>> char sysfs_name[100];
>> struct perf_pmu *pmu = NULL;
>>
>> - while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
>> + while ((pmu = perf_mem_scan_next_pmu(pmu)) != NULL) {
>
> It was my mistake to optimize this,
Not really. I mean, there was already a bug which just got exacerbated.
> I think we can just go back to:
> perf_pmus__scan(pmu)
> which would remove a lot of the weak/macros etc. here. We can have a
> comment as to why this is scan not scan_core, because of AMD. I plan
> to further improve overhead of PMUs so I'm not worried about losing
> the small performance win from this.
Sure. Let me do that.
Thanks,
Ravi
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2023-06-14 3:58 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-13 9:55 [PATCH 0/4] perf mem amd: Fix few logic bugs Ravi Bangoria
2023-06-13 9:55 ` [PATCH 1/4] perf pmus: Describe semantics of 'core_pmus' and 'other_pmus' Ravi Bangoria
2023-06-13 9:55 ` [PATCH 2/4] perf tool x86: Consolidate is_amd check into single function Ravi Bangoria
2023-06-13 19:49 ` Arnaldo Carvalho de Melo
2023-06-13 19:49 ` Ian Rogers
2023-06-13 20:28 ` Arnaldo Carvalho de Melo
2023-06-13 9:55 ` [PATCH 3/4] perf mem amd: Fix perf_pmus__num_mem_pmus() Ravi Bangoria
2023-06-13 15:25 ` Ian Rogers
2023-06-14 3:49 ` Ravi Bangoria
2023-06-13 9:55 ` [PATCH 4/4] perf mem amd: Scan all PMUs instead of just core ones Ravi Bangoria
2023-06-13 15:34 ` Ian Rogers
2023-06-14 3:58 ` Ravi Bangoria
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).