* [PATCH v2 1/2] perf evlist: Keep topdown counters in weak group
2022-05-12 6:13 [PATCH v2 0/2] Fix topdown event weak grouping Ian Rogers
@ 2022-05-12 6:13 ` Ian Rogers
2022-05-12 6:13 ` [PATCH v2 2/2] perf test: Add basic stat and topdown group test Ian Rogers
2022-05-13 14:25 ` [PATCH v2 0/2] Fix topdown event weak grouping Liang, Kan
2 siblings, 0 replies; 6+ messages in thread
From: Ian Rogers @ 2022-05-12 6:13 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Riccardo Mancini, Kim Phillips, Madhavan Srinivasan,
Shunsuke Nakamura, Florian Fischer, Andi Kleen, John Garry,
Zhengjun Xing, Adrian Hunter, James Clark, linux-perf-users,
linux-kernel
Cc: Stephane Eranian, Ian Rogers
On Intel Icelake, topdown events must always be grouped with a slots
event as leader. When a metric is parsed, a weak group is formed and
retried if perf_event_open fails. The retried events aren't grouped,
breaking the slots leader requirement. This change modifies the weak
group "reset" behavior so that topdown events aren't broken from the
group for the retry.
Before:
$ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1
Performance counter stats for 'system wide':
47,867,188,483 slots (92.27%)
<not supported> topdown-bad-spec
<not supported> topdown-be-bound
<not supported> topdown-fe-bound
<not supported> topdown-retiring
2,173,346,937 branch-instructions (92.27%)
10,540,253 branch-misses # 0.48% of all branches (92.29%)
96,291,140 bus-cycles (92.29%)
6,214,202 cache-misses # 20.120 % of all cache refs (92.29%)
30,886,082 cache-references (76.91%)
11,773,726,641 cpu-cycles (84.62%)
11,807,585,307 instructions # 1.00 insn per cycle (92.31%)
0 mem-loads (92.32%)
2,212,928,573 mem-stores (84.69%)
10,024,403,118 ref-cycles (92.35%)
16,232,978 baclears.any (92.35%)
23,832,633 ARITH.DIVIDER_ACTIVE (84.59%)
0.981070734 seconds time elapsed
After:
$ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1
Performance counter stats for 'system wide':
31040189283 slots (92.27%)
8997514811 topdown-bad-spec # 28.2% bad speculation (92.27%)
10997536028 topdown-be-bound # 34.5% backend bound (92.27%)
4778060526 topdown-fe-bound # 15.0% frontend bound (92.27%)
7086628768 topdown-retiring # 22.2% retiring (92.27%)
1417611942 branch-instructions (92.26%)
5285529 branch-misses # 0.37% of all branches (92.28%)
62922469 bus-cycles (92.29%)
1440708 cache-misses # 8.292 % of all cache refs (92.30%)
17374098 cache-references (76.94%)
8040889520 cpu-cycles (84.63%)
7709992319 instructions # 0.96 insn per cycle (92.32%)
0 mem-loads (92.32%)
1515669558 mem-stores (84.68%)
6542411177 ref-cycles (92.35%)
4154149 baclears.any (92.35%)
20556152 ARITH.DIVIDER_ACTIVE (84.59%)
1.010799593 seconds time elapsed
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/arch/x86/util/evsel.c | 12 ++++++++++++
tools/perf/util/evlist.c | 16 ++++++++++++++--
tools/perf/util/evsel.c | 10 ++++++++++
tools/perf/util/evsel.h | 3 +++
4 files changed, 39 insertions(+), 2 deletions(-)
diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index ac2899a25b7a..00cb4466b4ca 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -3,6 +3,7 @@
#include <stdlib.h>
#include "util/evsel.h"
#include "util/env.h"
+#include "util/pmu.h"
#include "linux/string.h"
void arch_evsel__set_sample_weight(struct evsel *evsel)
@@ -29,3 +30,14 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)
free(env.cpuid);
}
+
+bool arch_evsel__must_be_in_group(const struct evsel *evsel)
+{
+ if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
+ !pmu_have_event("cpu", "slots"))
+ return false;
+
+ return evsel->name &&
+ (!strcasecmp(evsel->name, "slots") ||
+ strcasestr(evsel->name, "topdown"));
+}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 52ea004ba01e..dfa65a383502 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1790,8 +1790,17 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
if (evsel__has_leader(c2, leader)) {
if (is_open && close)
perf_evsel__close(&c2->core);
- evsel__set_leader(c2, c2);
- c2->core.nr_members = 0;
+ /*
+ * We want to close all members of the group and reopen
+ * them. Some events, like Intel topdown, require being
+ * in a group and so keep these in the group.
+ */
+ if (!evsel__must_be_in_group(c2) && c2 != leader) {
+ evsel__set_leader(c2, c2);
+ c2->core.nr_members = 0;
+ leader->core.nr_members--;
+ }
+
/*
* Set this for all former members of the group
* to indicate they get reopened.
@@ -1799,6 +1808,9 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
c2->reset_group = true;
}
}
+ /* Reset the leader count if all entries were removed. */
+ if (leader->core.nr_members == 1)
+ leader->core.nr_members = 0;
return leader;
}
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5fd7924f8eb3..1cf967d689aa 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -3103,3 +3103,13 @@ int evsel__source_count(const struct evsel *evsel)
}
return count;
}
+
+bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused)
+{
+ return false;
+}
+
+bool evsel__must_be_in_group(const struct evsel *evsel)
+{
+ return arch_evsel__must_be_in_group(evsel);
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d4b04537ce6d..3e41b1712b86 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -498,6 +498,9 @@ bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
bool evsel__is_leader(struct evsel *evsel);
void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
int evsel__source_count(const struct evsel *evsel);
+bool evsel__must_be_in_group(const struct evsel *evsel);
+
+bool arch_evsel__must_be_in_group(const struct evsel *evsel);
/*
* Macro to swap the bit-field postition and size.
--
2.36.0.512.ge40c2bad7a-goog
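As an aside for readers following along, the event-name classification the
patch adds in arch_evsel__must_be_in_group() can be sketched in plain shell.
The function below is a hypothetical stand-in for illustration only: it
mirrors the strcasecmp("slots")/strcasestr("topdown") checks from the C code
but omits the PMU-name and pmu_have_event() tests.

```shell
# Hypothetical stand-in mirroring the name checks in
# arch_evsel__must_be_in_group(); the real code additionally requires the
# event to be on the "cpu" PMU and a "slots" event to exist there.
must_be_in_group() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        slots|*topdown*) return 0 ;;  # keep in the group on weak-group retry
        *)               return 1 ;;  # may be broken out of the group
    esac
}

must_be_in_group "slots" && echo "slots: keep in group"
must_be_in_group "topdown-retiring" && echo "topdown-retiring: keep in group"
must_be_in_group "cycles" || echo "cycles: may be broken out"
```

Case-insensitive matching matters because event names such as
ARITH.DIVIDER_ACTIVE mix cases on the command line.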
* [PATCH v2 2/2] perf test: Add basic stat and topdown group test
2022-05-12 6:13 [PATCH v2 0/2] Fix topdown event weak grouping Ian Rogers
2022-05-12 6:13 ` [PATCH v2 1/2] perf evlist: Keep topdown counters in weak group Ian Rogers
@ 2022-05-12 6:13 ` Ian Rogers
2022-05-13 14:25 ` [PATCH v2 0/2] Fix topdown event weak grouping Liang, Kan
2 siblings, 0 replies; 6+ messages in thread
From: Ian Rogers @ 2022-05-12 6:13 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Riccardo Mancini, Kim Phillips, Madhavan Srinivasan,
Shunsuke Nakamura, Florian Fischer, Andi Kleen, John Garry,
Zhengjun Xing, Adrian Hunter, James Clark, linux-perf-users,
linux-kernel
Cc: Stephane Eranian, Ian Rogers
Add a basic stat test.
Add two tests of grouping behavior for topdown events. Topdown events
are special as they must be grouped with the slots event first.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/tests/shell/stat.sh | 67 ++++++++++++++++++++++++++++++++++
1 file changed, 67 insertions(+)
create mode 100755 tools/perf/tests/shell/stat.sh
diff --git a/tools/perf/tests/shell/stat.sh b/tools/perf/tests/shell/stat.sh
new file mode 100755
index 000000000000..c7894764d4a6
--- /dev/null
+++ b/tools/perf/tests/shell/stat.sh
@@ -0,0 +1,67 @@
+#!/bin/sh
+# perf stat tests
+# SPDX-License-Identifier: GPL-2.0
+
+set -e
+
+err=0
+test_default_stat() {
+ echo "Basic stat command test"
+ if ! perf stat true 2>&1 | egrep -q "Performance counter stats for 'true':"
+ then
+ echo "Basic stat command test [Failed]"
+ err=1
+ return
+ fi
+ echo "Basic stat command test [Success]"
+}
+
+test_topdown_groups() {
+ # Topdown events must be grouped with the slots event first. Test that
+ # parse-events reorders this.
+ echo "Topdown event group test"
+ if ! perf stat -e '{slots,topdown-retiring}' true > /dev/null 2>&1
+ then
+ echo "Topdown event group test [Skipped event parsing failed]"
+ return
+ fi
+ if perf stat -e '{slots,topdown-retiring}' true 2>&1 | egrep -q "<not supported>"
+ then
+ echo "Topdown event group test [Failed events not supported]"
+ err=1
+ return
+ fi
+ if perf stat -e '{topdown-retiring,slots}' true 2>&1 | egrep -q "<not supported>"
+ then
+ echo "Topdown event group test [Failed slots not reordered first]"
+ err=1
+ return
+ fi
+ echo "Topdown event group test [Success]"
+}
+
+test_topdown_weak_groups() {
+ # Weak groups break if the perf_event_open of multiple grouped events
+ # fails. Breaking a topdown group causes the events to fail. Test a very large
+ # grouping to see that the topdown events aren't broken out.
+ echo "Topdown weak groups test"
+ ok_grouping="{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring},branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,cache-misses,cache-references"
+ if ! perf stat --no-merge -e "$ok_grouping" true > /dev/null 2>&1
+ then
+ echo "Topdown weak groups test [Skipped event parsing failed]"
+ return
+ fi
+ group_needs_break="{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,cache-misses,cache-references}:W"
+ if perf stat --no-merge -e "$group_needs_break" true 2>&1 | egrep -q "<not supported>"
+ then
+ echo "Topdown weak groups test [Failed events not supported]"
+ err=1
+ return
+ fi
+ echo "Topdown weak groups test [Success]"
+}
+
+test_default_stat
+test_topdown_groups
+test_topdown_weak_groups
+exit $err
--
2.36.0.512.ge40c2bad7a-goog
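The skip-versus-fail structure that stat.sh uses above can be reduced to a
small sketch: each test first probes whether the events parse and open at
all, skipping rather than failing on hardware without topdown support, and
only treats "<not supported>" in the output as a real failure. The helper
name below is made up for illustration, and "$@" stands in for a perf stat
invocation.

```shell
# Illustrative skeleton of the probe-then-check pattern in stat.sh;
# check_supported is a hypothetical helper, not part of the patch.
check_supported() {
    # Probe: if the command fails outright, skip (e.g. unknown events).
    if ! "$@" >/dev/null 2>&1; then
        echo "Skipped"
        return 0
    fi
    # Run for real: "<not supported>" in the output means the group broke.
    if "$@" 2>&1 | grep -q "<not supported>"; then
        echo "Failed"
        return 1
    fi
    echo "Success"
}

check_supported true    # probe and run succeed: prints "Success"
check_supported false   # probe fails: prints "Skipped"
```

Skipping on the probe keeps the test green on non-Intel and pre-Icelake
machines where the topdown events simply don't exist.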
* Re: [PATCH v2 0/2] Fix topdown event weak grouping
2022-05-12 6:13 [PATCH v2 0/2] Fix topdown event weak grouping Ian Rogers
2022-05-12 6:13 ` [PATCH v2 1/2] perf evlist: Keep topdown counters in weak group Ian Rogers
2022-05-12 6:13 ` [PATCH v2 2/2] perf test: Add basic stat and topdown group test Ian Rogers
@ 2022-05-13 14:25 ` Liang, Kan
2022-05-13 15:18 ` Ian Rogers
2022-05-13 15:19 ` Liang, Kan
2 siblings, 2 replies; 6+ messages in thread
From: Liang, Kan @ 2022-05-13 14:25 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Riccardo Mancini, Kim Phillips,
Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
Andi Kleen, John Garry, Zhengjun Xing, Adrian Hunter,
James Clark, linux-perf-users, linux-kernel
Cc: Stephane Eranian
On 5/12/2022 2:13 AM, Ian Rogers wrote:
> Keep topdown events within a group when a weak group is broken. This
> is a requirement as topdown events must form a group.
>
> Add perf stat testing including for required topdown event group
> behaviors.
>
> Note: as with existing topdown evsel/evlist code topdown events are
> assumed to be on the PMU "cpu". On Alderlake the PMU "cpu_core" should
> also be tested. Future changes can fix Alderlake.
I will send a follow-up patch to fix the weak grouping for the hybrid
platform shortly.
For the non-hybrid platform, the patch set looks good to me.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Thanks,
Kan
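For context on the hybrid follow-up mentioned above: on hybrid parts the
single "cpu" PMU is replaced by per-core-type PMUs under
/sys/bus/event_source/devices/, so the "cpu"-only checks in the patch would
not match them. The sketch below is illustrative only; the PMU names follow
the sysfs convention, but the classification is an assumption based on the
slots fixed counter being a big-core feature.

```shell
# Illustrative only: cpu is the non-hybrid PMU, cpu_core/cpu_atom appear
# on hybrid parts; the slots leader requirement applies to the big cores.
for pmu in cpu cpu_core cpu_atom; do
    case "$pmu" in
        cpu|cpu_core) echo "$pmu: slots/topdown grouping applies" ;;
        *)            echo "$pmu: no slots leader requirement" ;;
    esac
done
```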
>
> v2. Correct behavior wrt pmu prefixed events and avoid the test using
> deprecated events: Suggested-by: Liang, Kan <kan.liang@linux.intel.com>
>
> Ian Rogers (2):
> perf evlist: Keep topdown counters in weak group
> perf test: Add basic stat and topdown group test
>
> tools/perf/arch/x86/util/evsel.c | 12 ++++++
> tools/perf/tests/shell/stat.sh | 67 ++++++++++++++++++++++++++++++++
> tools/perf/util/evlist.c | 16 +++++++-
> tools/perf/util/evsel.c | 10 +++++
> tools/perf/util/evsel.h | 3 ++
> 5 files changed, 106 insertions(+), 2 deletions(-)
> create mode 100755 tools/perf/tests/shell/stat.sh
>