* [PATCH 0/5] Add perf stat default events for hybrid machines
@ 2022-06-07 1:33 zhengjun.xing
2022-06-07 1:33 ` [PATCH 1/5] perf stat: Revert "perf stat: Add default hybrid events" zhengjun.xing
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: zhengjun.xing @ 2022-06-07 1:33 UTC (permalink / raw)
To: acme, peterz, mingo, alexander.shishkin, jolsa
Cc: linux-kernel, linux-perf-users, irogers, adrian.hunter, ak,
kan.liang, zhengjun.xing
From: Zhengjun Xing <zhengjun.xing@linux.intel.com>
This patch series cleans up the existing perf stat defaults and adds support
for the perf metrics Topdown for the p-core PMU in the perf stat default. The
first 4 patches are clean-ups and fix the "--detailed" issue. The last patch
adds support for the perf metrics Topdown; Topdown support for the e-core PMU
will be implemented separately later.
Kan Liang (4):
perf stat: Revert "perf stat: Add default hybrid events"
perf evsel: Add arch_evsel__hw_name()
perf evlist: Always use arch_evlist__add_default_attrs()
perf x86 evlist: Add default hybrid events for perf stat
Zhengjun Xing (1):
perf stat: Add topdown metrics in the default perf stat on the hybrid
machine
tools/perf/arch/x86/util/evlist.c | 64 +++++++++++++++++++++++++-----
tools/perf/arch/x86/util/evsel.c | 20 ++++++++++
tools/perf/arch/x86/util/topdown.c | 51 ++++++++++++++++++++++++
tools/perf/arch/x86/util/topdown.h | 1 +
tools/perf/builtin-stat.c | 50 ++++-------------------
tools/perf/util/evlist.c | 11 +++--
tools/perf/util/evlist.h | 9 ++++-
tools/perf/util/evsel.c | 7 +++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/stat-display.c | 2 +-
tools/perf/util/topdown.c | 7 ++++
tools/perf/util/topdown.h | 3 +-
12 files changed, 166 insertions(+), 60 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/5] perf stat: Revert "perf stat: Add default hybrid events"
2022-06-07 1:33 [PATCH 0/5] Add perf stat default events for hybrid machines zhengjun.xing
@ 2022-06-07 1:33 ` zhengjun.xing
2022-06-07 1:33 ` [PATCH 2/5] perf evsel: Add arch_evsel__hw_name() zhengjun.xing
` (3 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: zhengjun.xing @ 2022-06-07 1:33 UTC (permalink / raw)
To: acme, peterz, mingo, alexander.shishkin, jolsa
Cc: linux-kernel, linux-perf-users, irogers, adrian.hunter, ak,
kan.liang, zhengjun.xing
From: Kan Liang <kan.liang@linux.intel.com>
This reverts commit ac2dc29edd21 ("perf stat: Add default hybrid
events").
Between this patch and the reverted patch, commit 6c1912898ed2
("perf parse-events: Rename parse_events_error functions") and
commit 07eafd4e053a ("perf parse-event: Add init and exit to
parse_event_error") cleaned up the parse_events_error_*() code. The
related changes are reverted as well.
The reverted patch is hard to extend to support new default
events, e.g., Topdown events, and the existing "--detailed" option
on a hybrid platform.
A new solution will be proposed in the following patch to enable the
perf stat default on a hybrid platform.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
---
tools/perf/builtin-stat.c | 30 ------------------------------
1 file changed, 30 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 4ce87a8eb7d7..6ac79d95f3b5 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1685,12 +1685,6 @@ static int add_default_attributes(void)
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },
-};
- struct perf_event_attr default_sw_attrs[] = {
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
};
/*
@@ -1947,30 +1941,6 @@ static int add_default_attributes(void)
}
if (!evsel_list->core.nr_entries) {
- if (perf_pmu__has_hybrid()) {
- struct parse_events_error errinfo;
- const char *hybrid_str = "cycles,instructions,branches,branch-misses";
-
- if (target__has_cpu(&target))
- default_sw_attrs[0].config = PERF_COUNT_SW_CPU_CLOCK;
-
- if (evlist__add_default_attrs(evsel_list,
- default_sw_attrs) < 0) {
- return -1;
- }
-
- parse_events_error__init(&errinfo);
- err = parse_events(evsel_list, hybrid_str, &errinfo);
- if (err) {
- fprintf(stderr,
- "Cannot set up hybrid events %s: %d\n",
- hybrid_str, err);
- parse_events_error__print(&errinfo, hybrid_str);
- }
- parse_events_error__exit(&errinfo);
- return err ? -1 : 0;
- }
-
if (target__has_cpu(&target))
default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;
--
2.25.1
* [PATCH 2/5] perf evsel: Add arch_evsel__hw_name()
2022-06-07 1:33 [PATCH 0/5] Add perf stat default events for hybrid machines zhengjun.xing
2022-06-07 1:33 ` [PATCH 1/5] perf stat: Revert "perf stat: Add default hybrid events" zhengjun.xing
@ 2022-06-07 1:33 ` zhengjun.xing
2022-06-07 1:33 ` [PATCH 3/5] perf evlist: Always use arch_evlist__add_default_attrs() zhengjun.xing
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: zhengjun.xing @ 2022-06-07 1:33 UTC (permalink / raw)
To: acme, peterz, mingo, alexander.shishkin, jolsa
Cc: linux-kernel, linux-perf-users, irogers, adrian.hunter, ak,
kan.liang, zhengjun.xing
From: Kan Liang <kan.liang@linux.intel.com>
Commit 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE") extends the two types into PMU-aware types for
a hybrid system. However, the current evsel__hw_name() doesn't take the
PMU type into account. It mistakenly returns "unknown-hardware" for a
hardware event with a specific PMU type.
Add an arch-specific arch_evsel__hw_name() to specially handle the PMU-aware
hardware events.
Currently, the extended PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE are only
supported on X86, so only the X86-specific arch_evsel__hw_name() is
implemented in this patch.
Nothing is changed for the other arches.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
---
tools/perf/arch/x86/util/evsel.c | 20 ++++++++++++++++++++
tools/perf/util/evsel.c | 7 ++++++-
tools/perf/util/evsel.h | 1 +
3 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index 3501399cef35..f6feb61d98a0 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -61,3 +61,23 @@ bool arch_evsel__must_be_in_group(const struct evsel *evsel)
(strcasestr(evsel->name, "slots") ||
strcasestr(evsel->name, "topdown"));
}
+
+int arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
+{
+ u64 event = evsel->core.attr.config & PERF_HW_EVENT_MASK;
+ u64 pmu = evsel->core.attr.config >> PERF_PMU_TYPE_SHIFT;
+ const char *event_name;
+
+ if (event < PERF_COUNT_HW_MAX && evsel__hw_names[event])
+ event_name = evsel__hw_names[event];
+ else
+ event_name = "unknown-hardware";
+
+ /* The PMU type is not required for the non-hybrid platform. */
+ if (!pmu)
+ return scnprintf(bf, size, "%s", event_name);
+
+ return scnprintf(bf, size, "%s/%s/",
+ evsel->pmu_name ? evsel->pmu_name : "cpu",
+ event_name);
+}
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ce499c5da8d7..782be377208f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -593,9 +593,14 @@ static int evsel__add_modifiers(struct evsel *evsel, char *bf, size_t size)
return r;
}
+int __weak arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
+{
+ return scnprintf(bf, size, "%s", __evsel__hw_name(evsel->core.attr.config));
+}
+
static int evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
{
- int r = scnprintf(bf, size, "%s", __evsel__hw_name(evsel->core.attr.config));
+ int r = arch_evsel__hw_name(evsel, bf, size);
return r + evsel__add_modifiers(evsel, bf + r, size - r);
}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 73ea48e94079..8dd3f04a5bdb 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -271,6 +271,7 @@ extern const char *const evsel__hw_names[PERF_COUNT_HW_MAX];
extern const char *const evsel__sw_names[PERF_COUNT_SW_MAX];
extern char *evsel__bpf_counter_events;
bool evsel__match_bpf_counter_events(const char *name);
+int arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size);
int __evsel__hw_cache_type_op_res_name(u8 type, u8 op, u8 result, char *bf, size_t size);
const char *evsel__name(struct evsel *evsel);
--
2.25.1
* [PATCH 3/5] perf evlist: Always use arch_evlist__add_default_attrs()
2022-06-07 1:33 [PATCH 0/5] Add perf stat default events for hybrid machines zhengjun.xing
2022-06-07 1:33 ` [PATCH 1/5] perf stat: Revert "perf stat: Add default hybrid events" zhengjun.xing
2022-06-07 1:33 ` [PATCH 2/5] perf evsel: Add arch_evsel__hw_name() zhengjun.xing
@ 2022-06-07 1:33 ` zhengjun.xing
2022-06-07 1:33 ` [PATCH 4/5] perf x86 evlist: Add default hybrid events for perf stat zhengjun.xing
2022-06-07 1:33 ` [PATCH 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine zhengjun.xing
4 siblings, 0 replies; 11+ messages in thread
From: zhengjun.xing @ 2022-06-07 1:33 UTC (permalink / raw)
To: acme, peterz, mingo, alexander.shishkin, jolsa
Cc: linux-kernel, linux-perf-users, irogers, adrian.hunter, ak,
kan.liang, zhengjun.xing
From: Kan Liang <kan.liang@linux.intel.com>
Current perf stat uses evlist__add_default_attrs() to add the
generic default attrs, and arch_evlist__add_default_attrs()
to add the arch-specific default attrs, e.g., Topdown for X86.
This works well on non-hybrid platforms. However, on a hybrid
platform, the hard-coded generic default attrs don't work.
Use arch_evlist__add_default_attrs() to replace
evlist__add_default_attrs(). The arch_evlist__add_default_attrs() is
modified to invoke the same __evlist__add_default_attrs() for the
generic default attrs. No functional change.
Add default_null_attrs[] to indicate the arch-specific attrs.
No functional change for the arch-specific default attrs either.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
---
tools/perf/arch/x86/util/evlist.c | 7 ++++++-
tools/perf/builtin-stat.c | 6 +++++-
tools/perf/util/evlist.c | 9 +++++++--
tools/perf/util/evlist.h | 7 +++++--
4 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
index 68f681ad54c1..777bdf182a58 100644
--- a/tools/perf/arch/x86/util/evlist.c
+++ b/tools/perf/arch/x86/util/evlist.c
@@ -8,8 +8,13 @@
#define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
#define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
-int arch_evlist__add_default_attrs(struct evlist *evlist)
+int arch_evlist__add_default_attrs(struct evlist *evlist,
+ struct perf_event_attr *attrs,
+ size_t nr_attrs)
{
+ if (nr_attrs)
+ return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
+
if (!pmu_have_event("cpu", "slots"))
return 0;
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6ac79d95f3b5..837c3ca91af1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1777,6 +1777,9 @@ static int add_default_attributes(void)
(PERF_COUNT_HW_CACHE_OP_PREFETCH << 8) |
(PERF_COUNT_HW_CACHE_RESULT_MISS << 16) },
};
+
+ struct perf_event_attr default_null_attrs[] = {};
+
/* Set attrs if no event is selected and !null_run: */
if (stat_config.null_run)
return 0;
@@ -1958,7 +1961,8 @@ static int add_default_attributes(void)
return -1;
stat_config.topdown_level = TOPDOWN_MAX_LEVEL;
- if (arch_evlist__add_default_attrs(evsel_list) < 0)
+ /* Platform specific attrs */
+ if (evlist__add_default_attrs(evsel_list, default_null_attrs) < 0)
return -1;
}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 48af7d379d82..efa5f006b5c6 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -342,9 +342,14 @@ int __evlist__add_default_attrs(struct evlist *evlist, struct perf_event_attr *a
return evlist__add_attrs(evlist, attrs, nr_attrs);
}
-__weak int arch_evlist__add_default_attrs(struct evlist *evlist __maybe_unused)
+__weak int arch_evlist__add_default_attrs(struct evlist *evlist,
+ struct perf_event_attr *attrs,
+ size_t nr_attrs)
{
- return 0;
+ if (!nr_attrs)
+ return 0;
+
+ return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
}
struct evsel *evlist__find_tracepoint_by_id(struct evlist *evlist, int id)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1bde9ccf4e7d..129095c0fe6d 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -107,10 +107,13 @@ static inline int evlist__add_default(struct evlist *evlist)
int __evlist__add_default_attrs(struct evlist *evlist,
struct perf_event_attr *attrs, size_t nr_attrs);
+int arch_evlist__add_default_attrs(struct evlist *evlist,
+ struct perf_event_attr *attrs,
+ size_t nr_attrs);
+
#define evlist__add_default_attrs(evlist, array) \
- __evlist__add_default_attrs(evlist, array, ARRAY_SIZE(array))
+ arch_evlist__add_default_attrs(evlist, array, ARRAY_SIZE(array))
-int arch_evlist__add_default_attrs(struct evlist *evlist);
struct evsel *arch_evlist__leader(struct list_head *list);
int evlist__add_dummy(struct evlist *evlist);
--
2.25.1
* [PATCH 4/5] perf x86 evlist: Add default hybrid events for perf stat
2022-06-07 1:33 [PATCH 0/5] Add perf stat default events for hybrid machines zhengjun.xing
` (2 preceding siblings ...)
2022-06-07 1:33 ` [PATCH 3/5] perf evlist: Always use arch_evlist__add_default_attrs() zhengjun.xing
@ 2022-06-07 1:33 ` zhengjun.xing
2022-06-09 0:04 ` Namhyung Kim
2022-06-07 1:33 ` [PATCH 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine zhengjun.xing
4 siblings, 1 reply; 11+ messages in thread
From: zhengjun.xing @ 2022-06-07 1:33 UTC (permalink / raw)
To: acme, peterz, mingo, alexander.shishkin, jolsa
Cc: linux-kernel, linux-perf-users, irogers, adrian.hunter, ak,
kan.liang, zhengjun.xing
From: Kan Liang <kan.liang@linux.intel.com>
Provide a new solution to replace the reverted commit ac2dc29edd21
("perf stat: Add default hybrid events").
For the default software attrs, nothing is changed.
For the default hardware attrs, create a new evsel for each hybrid PMU.
With the new solution, adding a new default attr will no longer require
special support for hybrid platforms.
Also, the "--detailed" option is now supported on hybrid platforms.
With the patch,
./perf stat -a -ddd sleep 1
Performance counter stats for 'system wide':
32,231.06 msec cpu-clock # 32.056 CPUs utilized
529 context-switches # 16.413 /sec
32 cpu-migrations # 0.993 /sec
69 page-faults # 2.141 /sec
176,754,151 cpu_core/cycles/ # 5.484 M/sec (41.65%)
161,695,280 cpu_atom/cycles/ # 5.017 M/sec (49.92%)
48,595,992 cpu_core/instructions/ # 1.508 M/sec (49.98%)
32,363,337 cpu_atom/instructions/ # 1.004 M/sec (58.26%)
10,088,639 cpu_core/branches/ # 313.010 K/sec (58.31%)
6,390,582 cpu_atom/branches/ # 198.274 K/sec (58.26%)
846,201 cpu_core/branch-misses/ # 26.254 K/sec (66.65%)
676,477 cpu_atom/branch-misses/ # 20.988 K/sec (58.27%)
14,290,070 cpu_core/L1-dcache-loads/ # 443.363 K/sec (66.66%)
9,983,532 cpu_atom/L1-dcache-loads/ # 309.749 K/sec (58.27%)
740,725 cpu_core/L1-dcache-load-misses/ # 22.982 K/sec (66.66%)
<not supported> cpu_atom/L1-dcache-load-misses/
480,441 cpu_core/LLC-loads/ # 14.906 K/sec (66.67%)
326,570 cpu_atom/LLC-loads/ # 10.132 K/sec (58.27%)
329 cpu_core/LLC-load-misses/ # 10.208 /sec (66.68%)
0 cpu_atom/LLC-load-misses/ # 0.000 /sec (58.32%)
<not supported> cpu_core/L1-icache-loads/
21,982,491 cpu_atom/L1-icache-loads/ # 682.028 K/sec (58.43%)
4,493,189 cpu_core/L1-icache-load-misses/ # 139.406 K/sec (33.34%)
4,711,404 cpu_atom/L1-icache-load-misses/ # 146.176 K/sec (50.08%)
13,713,090 cpu_core/dTLB-loads/ # 425.462 K/sec (33.34%)
9,384,727 cpu_atom/dTLB-loads/ # 291.170 K/sec (50.08%)
157,387 cpu_core/dTLB-load-misses/ # 4.883 K/sec (33.33%)
108,328 cpu_atom/dTLB-load-misses/ # 3.361 K/sec (50.08%)
<not supported> cpu_core/iTLB-loads/
<not supported> cpu_atom/iTLB-loads/
37,655 cpu_core/iTLB-load-misses/ # 1.168 K/sec (33.32%)
61,661 cpu_atom/iTLB-load-misses/ # 1.913 K/sec (50.03%)
<not supported> cpu_core/L1-dcache-prefetches/
<not supported> cpu_atom/L1-dcache-prefetches/
<not supported> cpu_core/L1-dcache-prefetch-misses/
<not supported> cpu_atom/L1-dcache-prefetch-misses/
1.005466919 seconds time elapsed
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
---
tools/perf/arch/x86/util/evlist.c | 52 ++++++++++++++++++++++++++++++-
tools/perf/util/evlist.c | 2 +-
tools/perf/util/evlist.h | 2 ++
3 files changed, 54 insertions(+), 2 deletions(-)
diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
index 777bdf182a58..1b3f9e1a2287 100644
--- a/tools/perf/arch/x86/util/evlist.c
+++ b/tools/perf/arch/x86/util/evlist.c
@@ -4,16 +4,66 @@
#include "util/evlist.h"
#include "util/parse-events.h"
#include "topdown.h"
+#include "util/event.h"
+#include "util/pmu-hybrid.h"
#define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
#define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
+static int ___evlist__add_default_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
+{
+ struct perf_cpu_map *cpus;
+ struct evsel *evsel, *n;
+ struct perf_pmu *pmu;
+ LIST_HEAD(head);
+ size_t i, j = 0;
+
+ for (i = 0; i < nr_attrs; i++)
+ event_attr_init(attrs + i);
+
+ if (!perf_pmu__has_hybrid())
+ return evlist__add_attrs(evlist, attrs, nr_attrs);
+
+ for (i = 0; i < nr_attrs; i++) {
+ if (attrs[i].type == PERF_TYPE_SOFTWARE) {
+ evsel = evsel__new_idx(attrs + i, evlist->core.nr_entries + j);
+ if (evsel == NULL)
+ goto out_delete_partial_list;
+ j++;
+ list_add_tail(&evsel->core.node, &head);
+ continue;
+ }
+
+ perf_pmu__for_each_hybrid_pmu(pmu) {
+ evsel = evsel__new_idx(attrs + i, evlist->core.nr_entries + j);
+ if (evsel == NULL)
+ goto out_delete_partial_list;
+ j++;
+ evsel->core.attr.config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
+ cpus = perf_cpu_map__get(pmu->cpus);
+ evsel->core.cpus = cpus;
+ evsel->core.own_cpus = perf_cpu_map__get(cpus);
+ evsel->pmu_name = strdup(pmu->name);
+ list_add_tail(&evsel->core.node, &head);
+ }
+ }
+
+ evlist__splice_list_tail(evlist, &head);
+
+ return 0;
+
+out_delete_partial_list:
+ __evlist__for_each_entry_safe(&head, n, evsel)
+ evsel__delete(evsel);
+ return -1;
+}
+
int arch_evlist__add_default_attrs(struct evlist *evlist,
struct perf_event_attr *attrs,
size_t nr_attrs)
{
if (nr_attrs)
- return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
+ return ___evlist__add_default_attrs(evlist, attrs, nr_attrs);
if (!pmu_have_event("cpu", "slots"))
return 0;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index efa5f006b5c6..5ff4b9504828 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -309,7 +309,7 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
return evsel;
}
-static int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
+int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
{
struct evsel *evsel, *n;
LIST_HEAD(head);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 129095c0fe6d..351ba2887a79 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -104,6 +104,8 @@ static inline int evlist__add_default(struct evlist *evlist)
return __evlist__add_default(evlist, true);
}
+int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs);
+
int __evlist__add_default_attrs(struct evlist *evlist,
struct perf_event_attr *attrs, size_t nr_attrs);
--
2.25.1
* [PATCH 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine
2022-06-07 1:33 [PATCH 0/5] Add perf stat default events for hybrid machines zhengjun.xing
` (3 preceding siblings ...)
2022-06-07 1:33 ` [PATCH 4/5] perf x86 evlist: Add default hybrid events for perf stat zhengjun.xing
@ 2022-06-07 1:33 ` zhengjun.xing
2022-06-09 0:09 ` Namhyung Kim
4 siblings, 1 reply; 11+ messages in thread
From: zhengjun.xing @ 2022-06-07 1:33 UTC (permalink / raw)
To: acme, peterz, mingo, alexander.shishkin, jolsa
Cc: linux-kernel, linux-perf-users, irogers, adrian.hunter, ak,
kan.liang, zhengjun.xing
From: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Topdown metrics are missing from the default perf stat on hybrid machines;
add them to the default perf stat for hybrid systems.
Currently, the perf metrics Topdown is supported for the p-core PMU in the
perf stat default; Topdown support for the e-core PMU will be implemented
separately later. The refactoring adds two x86-specific functions. Widen the
event-name column by 7 characters so that all metrics after the "#" become
aligned again.
The perf metrics Topdown feature is supported on the cpu_core PMU of ADL. The
dedicated perf metrics counter and fixed counter 3 are used for the
topdown events. Adding the topdown metrics doesn't trigger multiplexing.
Before:
# ./perf stat -a true
Performance counter stats for 'system wide':
53.70 msec cpu-clock # 25.736 CPUs utilized
80 context-switches # 1.490 K/sec
24 cpu-migrations # 446.951 /sec
52 page-faults # 968.394 /sec
2,788,555 cpu_core/cycles/ # 51.931 M/sec
851,129 cpu_atom/cycles/ # 15.851 M/sec
2,974,030 cpu_core/instructions/ # 55.385 M/sec
416,919 cpu_atom/instructions/ # 7.764 M/sec
586,136 cpu_core/branches/ # 10.916 M/sec
79,872 cpu_atom/branches/ # 1.487 M/sec
14,220 cpu_core/branch-misses/ # 264.819 K/sec
7,691 cpu_atom/branch-misses/ # 143.229 K/sec
0.002086438 seconds time elapsed
After:
# ./perf stat -a true
Performance counter stats for 'system wide':
61.39 msec cpu-clock # 24.874 CPUs utilized
76 context-switches # 1.238 K/sec
24 cpu-migrations # 390.968 /sec
52 page-faults # 847.097 /sec
2,753,695 cpu_core/cycles/ # 44.859 M/sec
903,899 cpu_atom/cycles/ # 14.725 M/sec
2,927,529 cpu_core/instructions/ # 47.690 M/sec
428,498 cpu_atom/instructions/ # 6.980 M/sec
581,299 cpu_core/branches/ # 9.470 M/sec
83,409 cpu_atom/branches/ # 1.359 M/sec
13,641 cpu_core/branch-misses/ # 222.216 K/sec
8,008 cpu_atom/branch-misses/ # 130.453 K/sec
14,761,308 cpu_core/slots/ # 240.466 M/sec
3,288,625 cpu_core/topdown-retiring/ # 22.3% retiring
1,323,323 cpu_core/topdown-bad-spec/ # 9.0% bad speculation
5,477,470 cpu_core/topdown-fe-bound/ # 37.1% frontend bound
4,679,199 cpu_core/topdown-be-bound/ # 31.7% backend bound
646,194 cpu_core/topdown-heavy-ops/ # 4.4% heavy operations # 17.9% light operations
1,244,999 cpu_core/topdown-br-mispredict/ # 8.4% branch mispredict # 0.5% machine clears
3,891,800 cpu_core/topdown-fetch-lat/ # 26.4% fetch latency # 10.7% fetch bandwidth
1,879,034 cpu_core/topdown-mem-bound/ # 12.7% memory bound # 19.0% Core bound
0.002467839 seconds time elapsed
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
---
tools/perf/arch/x86/util/evlist.c | 13 ++------
tools/perf/arch/x86/util/topdown.c | 51 ++++++++++++++++++++++++++++++
tools/perf/arch/x86/util/topdown.h | 1 +
tools/perf/builtin-stat.c | 14 ++------
tools/perf/util/stat-display.c | 2 +-
tools/perf/util/topdown.c | 7 ++++
tools/perf/util/topdown.h | 3 +-
7 files changed, 66 insertions(+), 25 deletions(-)
diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
index 1b3f9e1a2287..883559064818 100644
--- a/tools/perf/arch/x86/util/evlist.c
+++ b/tools/perf/arch/x86/util/evlist.c
@@ -3,12 +3,9 @@
#include "util/pmu.h"
#include "util/evlist.h"
#include "util/parse-events.h"
-#include "topdown.h"
#include "util/event.h"
#include "util/pmu-hybrid.h"
-
-#define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
-#define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
+#include "topdown.h"
static int ___evlist__add_default_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
{
@@ -65,13 +62,7 @@ int arch_evlist__add_default_attrs(struct evlist *evlist,
if (nr_attrs)
return ___evlist__add_default_attrs(evlist, attrs, nr_attrs);
- if (!pmu_have_event("cpu", "slots"))
- return 0;
-
- if (pmu_have_event("cpu", "topdown-heavy-ops"))
- return parse_events(evlist, TOPDOWN_L2_EVENTS, NULL);
- else
- return parse_events(evlist, TOPDOWN_L1_EVENTS, NULL);
+ return topdown_parse_events(evlist);
}
struct evsel *arch_evlist__leader(struct list_head *list)
diff --git a/tools/perf/arch/x86/util/topdown.c b/tools/perf/arch/x86/util/topdown.c
index f81a7cfe4d63..ba66e43a6b2a 100644
--- a/tools/perf/arch/x86/util/topdown.c
+++ b/tools/perf/arch/x86/util/topdown.c
@@ -3,9 +3,17 @@
#include "api/fs/fs.h"
#include "util/pmu.h"
#include "util/topdown.h"
+#include "util/evlist.h"
+#include "util/debug.h"
+#include "util/pmu-hybrid.h"
#include "topdown.h"
#include "evsel.h"
+#define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
+#define TOPDOWN_L1_EVENTS_CORE "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/}"
+#define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
+#define TOPDOWN_L2_EVENTS_CORE "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/,cpu_core/topdown-heavy-ops/,cpu_core/topdown-br-mispredict/,cpu_core/topdown-fetch-lat/,cpu_core/topdown-mem-bound/}"
+
/* Check whether there is a PMU which supports the perf metrics. */
bool topdown_sys_has_perf_metrics(void)
{
@@ -73,3 +81,46 @@ bool arch_topdown_sample_read(struct evsel *leader)
return false;
}
+
+const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn)
+{
+ const char *pmu_name = "cpu";
+
+ if (perf_pmu__has_hybrid()) {
+ if (!evlist->hybrid_pmu_name) {
+ if (warn)
+ pr_warning
+ ("WARNING: default to use cpu_core topdown events\n");
+ evlist->hybrid_pmu_name =
+ perf_pmu__hybrid_type_to_pmu("core");
+ }
+
+ pmu_name = evlist->hybrid_pmu_name;
+ }
+ return pmu_name;
+}
+
+int topdown_parse_events(struct evlist *evlist)
+{
+ const char *topdown_events;
+ const char *pmu_name;
+
+ if (!topdown_sys_has_perf_metrics())
+ return 0;
+
+ pmu_name = arch_get_topdown_pmu_name(evlist, false);
+
+ if (pmu_have_event(pmu_name, "topdown-heavy-ops")) {
+ if (!strcmp(pmu_name, "cpu_core"))
+ topdown_events = TOPDOWN_L2_EVENTS_CORE;
+ else
+ topdown_events = TOPDOWN_L2_EVENTS;
+ } else {
+ if (!strcmp(pmu_name, "cpu_core"))
+ topdown_events = TOPDOWN_L1_EVENTS_CORE;
+ else
+ topdown_events = TOPDOWN_L1_EVENTS;
+ }
+
+ return parse_events(evlist, topdown_events, NULL);
+}
diff --git a/tools/perf/arch/x86/util/topdown.h b/tools/perf/arch/x86/util/topdown.h
index 46bf9273e572..7eb81f042838 100644
--- a/tools/perf/arch/x86/util/topdown.h
+++ b/tools/perf/arch/x86/util/topdown.h
@@ -3,5 +3,6 @@
#define _TOPDOWN_H 1
bool topdown_sys_has_perf_metrics(void);
+int topdown_parse_events(struct evlist *evlist);
#endif
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 837c3ca91af1..c6b68be78f8c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -71,6 +71,7 @@
#include "util/bpf_counter.h"
#include "util/iostat.h"
#include "util/pmu-hybrid.h"
+ #include "util/topdown.h"
#include "asm/bug.h"
#include <linux/time64.h>
@@ -1858,22 +1859,11 @@ static int add_default_attributes(void)
unsigned int max_level = 1;
char *str = NULL;
bool warn = false;
- const char *pmu_name = "cpu";
+ const char *pmu_name = arch_get_topdown_pmu_name(evsel_list, true);
if (!force_metric_only)
stat_config.metric_only = true;
- if (perf_pmu__has_hybrid()) {
- if (!evsel_list->hybrid_pmu_name) {
- pr_warning("WARNING: default to use cpu_core topdown events\n");
- evsel_list->hybrid_pmu_name = perf_pmu__hybrid_type_to_pmu("core");
- }
-
- pmu_name = evsel_list->hybrid_pmu_name;
- if (!pmu_name)
- return -1;
- }
-
if (pmu_have_event(pmu_name, topdown_metric_L2_attrs[5])) {
metric_attrs = topdown_metric_L2_attrs;
max_level = 2;
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 606f09b09226..44045565c8f8 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -374,7 +374,7 @@ static void abs_printout(struct perf_stat_config *config,
config->csv_output ? 0 : config->unit_width,
evsel->unit, config->csv_sep);
- fprintf(output, "%-*s", config->csv_output ? 0 : 25, evsel__name(evsel));
+ fprintf(output, "%-*s", config->csv_output ? 0 : 32, evsel__name(evsel));
print_cgroup(config, evsel);
}
diff --git a/tools/perf/util/topdown.c b/tools/perf/util/topdown.c
index a369f84ceb6a..1090841550f7 100644
--- a/tools/perf/util/topdown.c
+++ b/tools/perf/util/topdown.c
@@ -65,3 +65,10 @@ __weak bool arch_topdown_sample_read(struct evsel *leader __maybe_unused)
{
return false;
}
+
+__weak const char *arch_get_topdown_pmu_name(struct evlist *evlist
+ __maybe_unused,
+ bool warn __maybe_unused)
+{
+ return "cpu";
+}
diff --git a/tools/perf/util/topdown.h b/tools/perf/util/topdown.h
index 118e75281f93..f9531528c559 100644
--- a/tools/perf/util/topdown.h
+++ b/tools/perf/util/topdown.h
@@ -2,11 +2,12 @@
#ifndef TOPDOWN_H
#define TOPDOWN_H 1
#include "evsel.h"
+#include "evlist.h"
bool arch_topdown_check_group(bool *warn);
void arch_topdown_group_warn(void);
bool arch_topdown_sample_read(struct evsel *leader);
-
+const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn);
int topdown_filter_events(const char **attr, char **str, bool use_group,
const char *pmu_name);
--
2.25.1
* Re: [PATCH 4/5] perf x86 evlist: Add default hybrid events for perf stat
2022-06-07 1:33 ` [PATCH 4/5] perf x86 evlist: Add default hybrid events for perf stat zhengjun.xing
@ 2022-06-09 0:04 ` Namhyung Kim
2022-06-09 12:47 ` Liang, Kan
0 siblings, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2022-06-09 0:04 UTC (permalink / raw)
To: Xing Zhengjun
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
alexander.shishkin, Jiri Olsa, linux-kernel, linux-perf-users,
Ian Rogers, Adrian Hunter, Andi Kleen, Kan Liang
Hello,
On Tue, Jun 7, 2022 at 12:31 AM <zhengjun.xing@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> Provide a new solution to replace the reverted commit ac2dc29edd21
> ("perf stat: Add default hybrid events").
>
> For the default software attrs, nothing is changed.
> For the default hardware attrs, create a new evsel for each hybrid pmu.
>
> With the new solution, adding a new default attr will not require the
> special support for the hybrid platform anymore.
>
> Also, the "--detailed" is supported on the hybrid platform
>
> With the patch,
>
> ./perf stat -a -ddd sleep 1
>
> Performance counter stats for 'system wide':
>
> 32,231.06 msec cpu-clock # 32.056 CPUs utilized
> 529 context-switches # 16.413 /sec
> 32 cpu-migrations # 0.993 /sec
> 69 page-faults # 2.141 /sec
> 176,754,151 cpu_core/cycles/ # 5.484 M/sec (41.65%)
> 161,695,280 cpu_atom/cycles/ # 5.017 M/sec (49.92%)
> 48,595,992 cpu_core/instructions/ # 1.508 M/sec (49.98%)
> 32,363,337 cpu_atom/instructions/ # 1.004 M/sec (58.26%)
> 10,088,639 cpu_core/branches/ # 313.010 K/sec (58.31%)
> 6,390,582 cpu_atom/branches/ # 198.274 K/sec (58.26%)
> 846,201 cpu_core/branch-misses/ # 26.254 K/sec (66.65%)
> 676,477 cpu_atom/branch-misses/ # 20.988 K/sec (58.27%)
> 14,290,070 cpu_core/L1-dcache-loads/ # 443.363 K/sec (66.66%)
> 9,983,532 cpu_atom/L1-dcache-loads/ # 309.749 K/sec (58.27%)
> 740,725 cpu_core/L1-dcache-load-misses/ # 22.982 K/sec (66.66%)
> <not supported> cpu_atom/L1-dcache-load-misses/
> 480,441 cpu_core/LLC-loads/ # 14.906 K/sec (66.67%)
> 326,570 cpu_atom/LLC-loads/ # 10.132 K/sec (58.27%)
> 329 cpu_core/LLC-load-misses/ # 10.208 /sec (66.68%)
> 0 cpu_atom/LLC-load-misses/ # 0.000 /sec (58.32%)
> <not supported> cpu_core/L1-icache-loads/
> 21,982,491 cpu_atom/L1-icache-loads/ # 682.028 K/sec (58.43%)
> 4,493,189 cpu_core/L1-icache-load-misses/ # 139.406 K/sec (33.34%)
> 4,711,404 cpu_atom/L1-icache-load-misses/ # 146.176 K/sec (50.08%)
> 13,713,090 cpu_core/dTLB-loads/ # 425.462 K/sec (33.34%)
> 9,384,727 cpu_atom/dTLB-loads/ # 291.170 K/sec (50.08%)
> 157,387 cpu_core/dTLB-load-misses/ # 4.883 K/sec (33.33%)
> 108,328 cpu_atom/dTLB-load-misses/ # 3.361 K/sec (50.08%)
> <not supported> cpu_core/iTLB-loads/
> <not supported> cpu_atom/iTLB-loads/
> 37,655 cpu_core/iTLB-load-misses/ # 1.168 K/sec (33.32%)
> 61,661 cpu_atom/iTLB-load-misses/ # 1.913 K/sec (50.03%)
> <not supported> cpu_core/L1-dcache-prefetches/
> <not supported> cpu_atom/L1-dcache-prefetches/
> <not supported> cpu_core/L1-dcache-prefetch-misses/
> <not supported> cpu_atom/L1-dcache-prefetch-misses/
>
> 1.005466919 seconds time elapsed
>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
> ---
> tools/perf/arch/x86/util/evlist.c | 52 ++++++++++++++++++++++++++++++-
> tools/perf/util/evlist.c | 2 +-
> tools/perf/util/evlist.h | 2 ++
> 3 files changed, 54 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
> index 777bdf182a58..1b3f9e1a2287 100644
> --- a/tools/perf/arch/x86/util/evlist.c
> +++ b/tools/perf/arch/x86/util/evlist.c
> @@ -4,16 +4,66 @@
> #include "util/evlist.h"
> #include "util/parse-events.h"
> #include "topdown.h"
> +#include "util/event.h"
> +#include "util/pmu-hybrid.h"
>
> #define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
> #define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
>
> +static int ___evlist__add_default_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
> +{
> + struct perf_cpu_map *cpus;
> + struct evsel *evsel, *n;
> + struct perf_pmu *pmu;
> + LIST_HEAD(head);
> + size_t i, j = 0;
> +
> + for (i = 0; i < nr_attrs; i++)
> + event_attr_init(attrs + i);
> +
> + if (!perf_pmu__has_hybrid())
> + return evlist__add_attrs(evlist, attrs, nr_attrs);
> +
> + for (i = 0; i < nr_attrs; i++) {
> + if (attrs[i].type == PERF_TYPE_SOFTWARE) {
> + evsel = evsel__new_idx(attrs + i, evlist->core.nr_entries + j);
Probably no need to calculate index (j) as it's updated
later when it goes to the evlist...
> + if (evsel == NULL)
> + goto out_delete_partial_list;
> + j++;
> + list_add_tail(&evsel->core.node, &head);
> + continue;
> + }
> +
> + perf_pmu__for_each_hybrid_pmu(pmu) {
> + evsel = evsel__new_idx(attrs + i, evlist->core.nr_entries + j);
> + if (evsel == NULL)
> + goto out_delete_partial_list;
> + j++;
> + evsel->core.attr.config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
> + cpus = perf_cpu_map__get(pmu->cpus);
> + evsel->core.cpus = cpus;
> + evsel->core.own_cpus = perf_cpu_map__get(cpus);
> + evsel->pmu_name = strdup(pmu->name);
> + list_add_tail(&evsel->core.node, &head);
> + }
> + }
> +
> + evlist__splice_list_tail(evlist, &head);
... like here.
Thanks,
Namhyung
> +
> + return 0;
> +
> +out_delete_partial_list:
> + __evlist__for_each_entry_safe(&head, n, evsel)
> + evsel__delete(evsel);
> + return -1;
> +}
* Re: [PATCH 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine
2022-06-07 1:33 ` [PATCH 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine zhengjun.xing
@ 2022-06-09 0:09 ` Namhyung Kim
2022-06-09 10:41 ` Xing Zhengjun
0 siblings, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2022-06-09 0:09 UTC (permalink / raw)
To: Xing Zhengjun
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
alexander.shishkin, Jiri Olsa, linux-kernel, linux-perf-users,
Ian Rogers, Adrian Hunter, Andi Kleen, Kan Liang
On Tue, Jun 7, 2022 at 1:08 AM <zhengjun.xing@linux.intel.com> wrote:
>
> From: Zhengjun Xing <zhengjun.xing@linux.intel.com>
>
> Topdown metrics are missing from the default perf stat on hybrid machines,
> so add them to the default perf stat for hybrid systems.
>
> Currently, we support the perf metrics Topdown for the p-core PMU in the
> perf stat default, the perf metrics Topdown support for e-core PMU will be
> implemented later separately. The refactoring adds two x86-specific
> functions, and widens the event name column by 7 chars so that all
> metrics after the "#" become aligned again.
>
> The perf metrics topdown feature is supported on the cpu_core of ADL. The
> dedicated perf metrics counter and the fixed counter 3 are used for the
> topdown events. Adding the topdown metrics doesn't trigger multiplexing.
>
> Before:
>
> # ./perf stat -a true
>
> Performance counter stats for 'system wide':
>
> 53.70 msec cpu-clock # 25.736 CPUs utilized
> 80 context-switches # 1.490 K/sec
> 24 cpu-migrations # 446.951 /sec
> 52 page-faults # 968.394 /sec
> 2,788,555 cpu_core/cycles/ # 51.931 M/sec
> 851,129 cpu_atom/cycles/ # 15.851 M/sec
> 2,974,030 cpu_core/instructions/ # 55.385 M/sec
> 416,919 cpu_atom/instructions/ # 7.764 M/sec
> 586,136 cpu_core/branches/ # 10.916 M/sec
> 79,872 cpu_atom/branches/ # 1.487 M/sec
> 14,220 cpu_core/branch-misses/ # 264.819 K/sec
> 7,691 cpu_atom/branch-misses/ # 143.229 K/sec
>
> 0.002086438 seconds time elapsed
>
> After:
>
> # ./perf stat -a true
>
> Performance counter stats for 'system wide':
>
> 61.39 msec cpu-clock # 24.874 CPUs utilized
> 76 context-switches # 1.238 K/sec
> 24 cpu-migrations # 390.968 /sec
> 52 page-faults # 847.097 /sec
> 2,753,695 cpu_core/cycles/ # 44.859 M/sec
> 903,899 cpu_atom/cycles/ # 14.725 M/sec
> 2,927,529 cpu_core/instructions/ # 47.690 M/sec
> 428,498 cpu_atom/instructions/ # 6.980 M/sec
> 581,299 cpu_core/branches/ # 9.470 M/sec
> 83,409 cpu_atom/branches/ # 1.359 M/sec
> 13,641 cpu_core/branch-misses/ # 222.216 K/sec
> 8,008 cpu_atom/branch-misses/ # 130.453 K/sec
> 14,761,308 cpu_core/slots/ # 240.466 M/sec
> 3,288,625 cpu_core/topdown-retiring/ # 22.3% retiring
> 1,323,323 cpu_core/topdown-bad-spec/ # 9.0% bad speculation
> 5,477,470 cpu_core/topdown-fe-bound/ # 37.1% frontend bound
> 4,679,199 cpu_core/topdown-be-bound/ # 31.7% backend bound
> 646,194 cpu_core/topdown-heavy-ops/ # 4.4% heavy operations # 17.9% light operations
> 1,244,999 cpu_core/topdown-br-mispredict/ # 8.4% branch mispredict # 0.5% machine clears
> 3,891,800 cpu_core/topdown-fetch-lat/ # 26.4% fetch latency # 10.7% fetch bandwidth
> 1,879,034 cpu_core/topdown-mem-bound/ # 12.7% memory bound # 19.0% Core bound
>
> 0.002467839 seconds time elapsed
>
> Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> ---
[SNIP]
> +const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn)
> +{
> + const char *pmu_name = "cpu";
> +
> + if (perf_pmu__has_hybrid()) {
> + if (!evlist->hybrid_pmu_name) {
> + if (warn)
> + pr_warning
> + ("WARNING: default to use cpu_core topdown events\n");
> + evlist->hybrid_pmu_name =
> + perf_pmu__hybrid_type_to_pmu("core");
This doesn't look good. Please consider reducing the
indent level like returning early as
if (!perf_pmu__has_hybrid())
return "cpu";
if (!evlist->hybrid_pmu_name) {
...
Thanks,
Namhyung
> + }
> +
> + pmu_name = evlist->hybrid_pmu_name;
> + }
> + return pmu_name;
> +}
* Re: [PATCH 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine
2022-06-09 0:09 ` Namhyung Kim
@ 2022-06-09 10:41 ` Xing Zhengjun
0 siblings, 0 replies; 11+ messages in thread
From: Xing Zhengjun @ 2022-06-09 10:41 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
alexander.shishkin, Jiri Olsa, linux-kernel, linux-perf-users,
Ian Rogers, Adrian Hunter, Andi Kleen, Kan Liang
On 6/9/2022 8:09 AM, Namhyung Kim wrote:
> On Tue, Jun 7, 2022 at 1:08 AM <zhengjun.xing@linux.intel.com> wrote:
>>
>> From: Zhengjun Xing <zhengjun.xing@linux.intel.com>
>>
>> Topdown metrics are missing from the default perf stat on hybrid machines,
>> so add them to the default perf stat for hybrid systems.
>>
>> Currently, we support the perf metrics Topdown for the p-core PMU in the
>> perf stat default, the perf metrics Topdown support for e-core PMU will be
>> implemented later separately. The refactoring adds two x86-specific
>> functions, and widens the event name column by 7 chars so that all
>> metrics after the "#" become aligned again.
>>
>> The perf metrics topdown feature is supported on the cpu_core of ADL. The
>> dedicated perf metrics counter and the fixed counter 3 are used for the
>> topdown events. Adding the topdown metrics doesn't trigger multiplexing.
>>
>> [SNIP]
>>
>> Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
>> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
> [SNIP]
>> +const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn)
>> +{
>> + const char *pmu_name = "cpu";
>> +
>> + if (perf_pmu__has_hybrid()) {
>> + if (!evlist->hybrid_pmu_name) {
>> + if (warn)
>> + pr_warning
>> + ("WARNING: default to use cpu_core topdown events\n");
>> + evlist->hybrid_pmu_name =
>> + perf_pmu__hybrid_type_to_pmu("core");
>
> This doesn't look good. Please consider reducing the
> indent level like returning early as
>
> if (!perf_pmu__has_hybrid())
> return "cpu";
>
> if (!evlist->hybrid_pmu_name) {
> ...
>
Thanks for the comments, I will update it in the next version.
> Thanks,
> Namhyung
>
>
>> + }
>> +
>> + pmu_name = evlist->hybrid_pmu_name;
>> + }
>> + return pmu_name;
>> +}
--
Zhengjun Xing
* Re: [PATCH 4/5] perf x86 evlist: Add default hybrid events for perf stat
2022-06-09 0:04 ` Namhyung Kim
@ 2022-06-09 12:47 ` Liang, Kan
2022-06-09 13:51 ` Xing Zhengjun
0 siblings, 1 reply; 11+ messages in thread
From: Liang, Kan @ 2022-06-09 12:47 UTC (permalink / raw)
To: Namhyung Kim, Xing Zhengjun
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
alexander.shishkin, Jiri Olsa, linux-kernel, linux-perf-users,
Ian Rogers, Adrian Hunter, Andi Kleen
On 6/8/2022 8:04 PM, Namhyung Kim wrote:
> Hello,
>
> On Tue, Jun 7, 2022 at 12:31 AM <zhengjun.xing@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> Provide a new solution to replace the reverted commit ac2dc29edd21
>> ("perf stat: Add default hybrid events").
>>
>> For the default software attrs, nothing is changed.
>> For the default hardware attrs, create a new evsel for each hybrid pmu.
>>
>> With the new solution, adding a new default attr will not require the
>> special support for the hybrid platform anymore.
>>
>> Also, the "--detailed" is supported on the hybrid platform
>>
>> With the patch,
>>
>> ./perf stat -a -ddd sleep 1
>>
>> [SNIP]
>> + for (i = 0; i < nr_attrs; i++) {
>> + if (attrs[i].type == PERF_TYPE_SOFTWARE) {
>> + evsel = evsel__new_idx(attrs + i, evlist->core.nr_entries + j);
>
> Probably no need to calculate index (j) as it's updated
> later when it goes to the evlist...
>
>
>> + if (evsel == NULL)
>> + goto out_delete_partial_list;
>> + j++;
>> + list_add_tail(&evsel->core.node, &head);
>> + continue;
>> + }
>> +
>> + perf_pmu__for_each_hybrid_pmu(pmu) {
>> + evsel = evsel__new_idx(attrs + i, evlist->core.nr_entries + j);
>> + if (evsel == NULL)
>> + goto out_delete_partial_list;
>> + j++;
>> + evsel->core.attr.config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
>> + cpus = perf_cpu_map__get(pmu->cpus);
>> + evsel->core.cpus = cpus;
>> + evsel->core.own_cpus = perf_cpu_map__get(cpus);
>> + evsel->pmu_name = strdup(pmu->name);
>> + list_add_tail(&evsel->core.node, &head);
>> + }
>> + }
>> +
>> + evlist__splice_list_tail(evlist, &head);
>
> ... like here.
Yes, the indices of all the new evsels will be updated when they are added to the evlist.
Zhengjun, could you please handle the patch? Just setting idx 0 for the new
evsels should be good enough.
Thanks,
Kan
>
> Thanks,
> Namhyung
>
>
>> +
>> + return 0;
>> +
>> +out_delete_partial_list:
>> + __evlist__for_each_entry_safe(&head, n, evsel)
>> + evsel__delete(evsel);
>> + return -1;
>> +}
* Re: [PATCH 4/5] perf x86 evlist: Add default hybrid events for perf stat
2022-06-09 12:47 ` Liang, Kan
@ 2022-06-09 13:51 ` Xing Zhengjun
0 siblings, 0 replies; 11+ messages in thread
From: Xing Zhengjun @ 2022-06-09 13:51 UTC (permalink / raw)
To: Liang, Kan, Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
alexander.shishkin, Jiri Olsa, linux-kernel, linux-perf-users,
Ian Rogers, Adrian Hunter, Andi Kleen
On 6/9/2022 8:47 PM, Liang, Kan wrote:
>
>
> On 6/8/2022 8:04 PM, Namhyung Kim wrote:
>> Hello,
>>
>> On Tue, Jun 7, 2022 at 12:31 AM <zhengjun.xing@linux.intel.com> wrote:
>>>
>>> From: Kan Liang <kan.liang@linux.intel.com>
>>>
>>> Provide a new solution to replace the reverted commit ac2dc29edd21
>>> ("perf stat: Add default hybrid events").
>>>
>>> For the default software attrs, nothing is changed.
>>> For the default hardware attrs, create a new evsel for each hybrid pmu.
>>>
>>> With the new solution, adding a new default attr will not require the
>>> special support for the hybrid platform anymore.
>>>
>>> Also, the "--detailed" is supported on the hybrid platform
>>>
>>> With the patch,
>>>
>>> ./perf stat -a -ddd sleep 1
>>>
>>> [SNIP]
>>> +        for (i = 0; i < nr_attrs; i++) {
>>> +                if (attrs[i].type == PERF_TYPE_SOFTWARE) {
>>> +                        evsel = evsel__new_idx(attrs + i, evlist->core.nr_entries + j);
>>
>> Probably no need to calculate index (j) as it's updated
>> later when it goes to the evlist...
>>
>>
>>> + if (evsel == NULL)
>>> + goto out_delete_partial_list;
>>> + j++;
>>> + list_add_tail(&evsel->core.node, &head);
>>> + continue;
>>> + }
>>> +
>>> + perf_pmu__for_each_hybrid_pmu(pmu) {
>>> +                        evsel = evsel__new_idx(attrs + i, evlist->core.nr_entries + j);
>>> + if (evsel == NULL)
>>> + goto out_delete_partial_list;
>>> + j++;
>>> +                        evsel->core.attr.config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
>>> + cpus = perf_cpu_map__get(pmu->cpus);
>>> + evsel->core.cpus = cpus;
>>> + evsel->core.own_cpus = perf_cpu_map__get(cpus);
>>> + evsel->pmu_name = strdup(pmu->name);
>>> + list_add_tail(&evsel->core.node, &head);
>>> + }
>>> + }
>>> +
>>> + evlist__splice_list_tail(evlist, &head);
>>
>> ... like here.
>
> Yes, the indices of all the new evsels will be updated when they are added to the evlist.
>
> Zhengjun, could you please handle the patch? Just setting idx 0 for the new
> evsels should be good enough.
>
>
Ok, I will update it in the new version.
> Thanks,
> Kan
>
>>
>> Thanks,
>> Namhyung
>>
>>
>>> +
>>> + return 0;
>>> +
>>> +out_delete_partial_list:
>>> + __evlist__for_each_entry_safe(&head, n, evsel)
>>> + evsel__delete(evsel);
>>> + return -1;
>>> +}
--
Zhengjun Xing
end of thread, other threads:[~2022-06-09 13:51 UTC | newest]
Thread overview: 11+ messages
2022-06-07 1:33 [PATCH 0/5] Add perf stat default events for hybrid machines zhengjun.xing
2022-06-07 1:33 ` [PATCH 1/5] perf stat: Revert "perf stat: Add default hybrid events" zhengjun.xing
2022-06-07 1:33 ` [PATCH 2/5] perf evsel: Add arch_evsel__hw_name() zhengjun.xing
2022-06-07 1:33 ` [PATCH 3/5] perf evlist: Always use arch_evlist__add_default_attrs() zhengjun.xing
2022-06-07 1:33 ` [PATCH 4/5] perf x86 evlist: Add default hybrid events for perf stat zhengjun.xing
2022-06-09 0:04 ` Namhyung Kim
2022-06-09 12:47 ` Liang, Kan
2022-06-09 13:51 ` Xing Zhengjun
2022-06-07 1:33 ` [PATCH 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine zhengjun.xing
2022-06-09 0:09 ` Namhyung Kim
2022-06-09 10:41 ` Xing Zhengjun