From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Ian Rogers <irogers@google.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Ahmad Yasin <ahmad.yasin@intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Stephane Eranian <eranian@google.com>,
Andi Kleen <ak@linux.intel.com>,
Perry Taylor <perry.taylor@intel.com>,
Samantha Alt <samantha.alt@intel.com>,
Caleb Biggers <caleb.biggers@intel.com>,
Weilin Wang <weilin.wang@intel.com>,
Edward Baker <edward.baker@intel.com>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Florian Fischer <florian.fischer@muhq.space>,
Rob Herring <robh@kernel.org>,
Zhengjun Xing <zhengjun.xing@linux.intel.com>,
John Garry <john.g.garry@oracle.com>,
Kajol Jain <kjain@linux.ibm.com>,
Sumanth Korikkar <sumanthk@linux.ibm.com>,
Thomas Richter <tmricht@linux.ibm.com>,
Tiezhu Yang <yangtiezhu@loongson.cn>,
Ravi Bangoria <ravi.bangoria@amd.com>,
Leo Yan <leo.yan@linaro.org>,
Yang Jihong <yangjihong1@huawei.com>,
James Clark <james.clark@arm.com>,
Suzuki Poulouse <suzuki.poulose@arm.com>,
Kang Minchul <tegongkang@gmail.com>,
Athira Rajeev <atrajeev@linux.vnet.ibm.com>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 00/40] Fix perf on Intel hybrid CPUs
Date: Wed, 26 Apr 2023 09:53:45 -0400
Message-ID: <bff481ba-e60a-763f-0aa0-3ee53302c480@linux.intel.com>
In-Reply-To: <20230426070050.1315519-1-irogers@google.com>
On 2023-04-26 3:00 a.m., Ian Rogers wrote:
> TL;DR: hybrid doesn't crash, json metrics work on hybrid on both PMUs
> or individually, event parsing doesn't always scan all PMUs, more and
> new tests that also run without hybrid, less code.
>
> The first patches were previously posted to improve metrics here:
> "perf stat: Introduce skippable evsels"
> https://lore.kernel.org/all/20230414051922.3625666-1-irogers@google.com/
> "perf vendor events intel: Add xxx metric constraints"
> https://lore.kernel.org/all/20230419005423.343862-1-irogers@google.com/
>
> Next are some general test improvements.
>
> Next event parsing is rewritten to not scan all PMUs for the benefit
> of raw and legacy cache parsing, instead these are handled by the
> lexer and a new term type. This ultimately removes the need for the
> event parser for hybrid to be recursive as legacy cache can be just a
> term. Tests are re-enabled for events with hyphens, so AMD's
> branch-brs event is now parsable.
>
> The cputype option is made a generic pmu filter flag and is tested
> even on non-hybrid systems.
>
> The final patches address specific json metric issues on hybrid, in
> both the json metrics and the metric code. They also bring in a new
> json option to not group events when matching a metricgroup, this
> helps reduce counter pressure for TopdownL1 and TopdownL2 metric
> groups. The updates to the script that updates the json are posted in:
> https://github.com/intel/perfmon/pull/73
>
> The patches add slightly more code than they remove, in areas like
> better json metric constraints and tests, but in the core util code,
> the removal of hybrid is a net reduction:
> 20 files changed, 631 insertions(+), 951 deletions(-)
>
> There's specific detail with each patch, but for now here is the 6.3
> output followed by that from perf-tools-next with the patch series
> applied. The tool is running on an Alderlake CPU on an elderly 5.15
> kernel:
>
> Events on hybrid that parse and pass tests:
> '''
> $ perf-6.3 version
> perf version 6.3.rc7.gb7bc77e2f2c7
> $ perf-6.3 test
> ...
> 6.1: Test event parsing : FAILED!
> ...
> $ perf test
> ...
> 6: Parse event definition strings :
> 6.1: Test event parsing : Ok
> 6.2: Parsing of all PMU events from sysfs : Ok
> 6.3: Parsing of given PMU events from sysfs : Ok
> 6.4: Parsing of aliased events from sysfs : Skip (no aliases in sysfs)
> 6.5: Parsing of aliased events : Ok
> 6.6: Parsing of terms (event modifiers) : Ok
> ...
> '''
>
> No event/metric running with json metrics and TopdownL1 on both PMUs:
> '''
> $ perf-6.3 stat -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 24,073.58 msec cpu-clock # 23.975 CPUs utilized
> 350 context-switches # 14.539 /sec
> 25 cpu-migrations # 1.038 /sec
> 66 page-faults # 2.742 /sec
> 21,257,199 cpu_core/cycles/ # 883.009 K/sec
> 2,162,192 cpu_atom/cycles/ # 89.816 K/sec
> 6,679,379 cpu_core/instructions/ # 277.457 K/sec
> 753,197 cpu_atom/instructions/ # 31.287 K/sec
> 1,300,647 cpu_core/branches/ # 54.028 K/sec
> 148,652 cpu_atom/branches/ # 6.175 K/sec
> 117,429 cpu_core/branch-misses/ # 4.878 K/sec
> 14,396 cpu_atom/branch-misses/ # 598.000 /sec
> 123,097,644 cpu_core/slots/ # 5.113 M/sec
> 9,241,207 cpu_core/topdown-retiring/ # 7.5% Retiring
> 8,903,288 cpu_core/topdown-bad-spec/ # 7.2% Bad Speculation
> 66,590,029 cpu_core/topdown-fe-bound/ # 54.1% Frontend Bound
> 38,397,500 cpu_core/topdown-be-bound/ # 31.2% Backend Bound
> 3,294,283 cpu_core/topdown-heavy-ops/ # 2.7% Heavy Operations # 4.8% Light Operations
> 8,855,769 cpu_core/topdown-br-mispredict/ # 7.2% Branch Mispredict # 0.0% Machine Clears
> 57,695,714 cpu_core/topdown-fetch-lat/ # 46.9% Fetch Latency # 7.2% Fetch Bandwidth
> 12,823,926 cpu_core/topdown-mem-bound/ # 10.4% Memory Bound # 20.8% Core Bound
>
> 1.004093622 seconds time elapsed
>
> $ perf stat -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 24,064.65 msec cpu-clock # 23.973 CPUs utilized
> 384 context-switches # 15.957 /sec
> 24 cpu-migrations # 0.997 /sec
> 71 page-faults # 2.950 /sec
> 19,737,646 cpu_core/cycles/ # 820.192 K/sec
> 122,018,505 cpu_atom/cycles/ # 5.070 M/sec (63.32%)
> 7,636,653 cpu_core/instructions/ # 317.339 K/sec
> 16,266,629 cpu_atom/instructions/ # 675.955 K/sec (72.50%)
> 1,552,995 cpu_core/branches/ # 64.534 K/sec
> 3,208,143 cpu_atom/branches/ # 133.314 K/sec (72.50%)
> 132,151 cpu_core/branch-misses/ # 5.491 K/sec
> 547,285 cpu_atom/branch-misses/ # 22.742 K/sec (72.49%)
> 32,110,597 cpu_atom/TOPDOWN_RETIRING.ALL/ # 1.334 M/sec
> # 18.4 % tma_bad_speculation (72.48%)
> 228,006,765 cpu_atom/TOPDOWN_FE_BOUND.ALL/ # 9.475 M/sec
> # 38.1 % tma_frontend_bound (72.47%)
> 225,866,251 cpu_atom/TOPDOWN_BE_BOUND.ALL/ # 9.386 M/sec
> # 37.7 % tma_backend_bound
> # 37.7 % tma_backend_bound_aux (72.73%)
> 119,748,254 cpu_atom/CPU_CLK_UNHALTED.CORE/ # 4.976 M/sec
> # 5.2 % tma_retiring (73.14%)
> 31,363,579 cpu_atom/TOPDOWN_RETIRING.ALL/ # 1.303 M/sec (73.37%)
> 227,907,321 cpu_atom/TOPDOWN_FE_BOUND.ALL/ # 9.471 M/sec (63.95%)
> 228,803,268 cpu_atom/TOPDOWN_BE_BOUND.ALL/ # 9.508 M/sec (63.55%)
> 113,357,334 cpu_core/TOPDOWN.SLOTS/ # 30.5 % tma_backend_bound
> # 9.2 % tma_retiring
> # 8.7 % tma_bad_speculation
> # 51.6 % tma_frontend_bound
> 10,451,044 cpu_core/topdown-retiring/
> 9,687,449 cpu_core/topdown-bad-spec/
> 58,703,214 cpu_core/topdown-fe-bound/
> 34,540,660 cpu_core/topdown-be-bound/
> 154,902 cpu_core/INT_MISC.UOP_DROPPING/ # 6.437 K/sec
>
> 1.003818397 seconds time elapsed
> '''
Thanks for the fixes. That should work for the -M or --topdown options.
But I don't think the above output is better than the 6.3 output for the
*default* of perf stat?
- The multiplexing in the atom core messes up the other events.
- The "M/sec" seems useless for the Topdown events.
- The tma_* is not a generic name.
  "Retiring" is much better than "tma_retiring" as a generic annotation.
  It should work for both x86 and Arm.
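To make the second point concrete, here is a minimal Python sketch (not part of perf; the helper name is made up for illustration) using the cpu_core counts from the 6.3 output quoted above. The percentages printed there appear to be each topdown count divided by slots, which is the figure a reader wants, whereas a per-second rate for the same counters says little:

```python
# Counts copied from the quoted perf-6.3 output (cpu_core PMU).
slots = 123_097_644
topdown = {
    "Retiring": 9_241_207,
    "Bad Speculation": 8_903_288,
    "Frontend Bound": 66_590_029,
    "Backend Bound": 38_397_500,
}


def share_of_slots(count, total_slots):
    """Hypothetical helper: return a count as a percentage of pipeline slots."""
    return 100.0 * count / total_slots


# One decimal place, matching the precision shown in the perf stat output.
shares = {name: round(share_of_slots(count, slots), 1)
          for name, count in topdown.items()}

for name, pct in shares.items():
    print(f"{pct:5.1f}%  {name}")
```

The four values come out as 7.5, 7.2, 54.1 and 31.2, the same as the annotations in the 6.3 output.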
As the default, it's better to provide a clean and generic output for
the end users.
If the users want to know more details, they can use -M or --topdown
options. The events/formats are expected to be different among ARCHs.
Also, there seems to be a bug with all of the atom Topdown events: they are
displayed twice.
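A rough way to check for the duplication is to count event names in the perf stat text. The sketch below is only an illustration under stated assumptions: the regular expression and the function name are my own, it is not how perf itself tracks events, and the sample lines are trimmed from the output quoted above.

```python
import re
from collections import Counter

# Assumed line shape: a comma-grouped count followed by a "pmu/event/" name.
EVENT_RE = re.compile(r"^\s*[\d,]+\s+(\S+/\S+/)")


def duplicated_events(stat_output):
    """Return the sorted event names that occur on more than one line."""
    names = []
    for line in stat_output.splitlines():
        match = EVENT_RE.match(line)
        if match:
            names.append(match.group(1))
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Lines trimmed from the perf-tools-next output quoted above.
sample = """\
32,110,597 cpu_atom/TOPDOWN_RETIRING.ALL/ # 1.334 M/sec
228,006,765 cpu_atom/TOPDOWN_FE_BOUND.ALL/ # 9.475 M/sec
225,866,251 cpu_atom/TOPDOWN_BE_BOUND.ALL/ # 9.386 M/sec
119,748,254 cpu_atom/CPU_CLK_UNHALTED.CORE/ # 4.976 M/sec
31,363,579 cpu_atom/TOPDOWN_RETIRING.ALL/ # 1.303 M/sec (73.37%)
227,907,321 cpu_atom/TOPDOWN_FE_BOUND.ALL/ # 9.471 M/sec (63.95%)
228,803,268 cpu_atom/TOPDOWN_BE_BOUND.ALL/ # 9.508 M/sec (63.55%)
"""

dups = duplicated_events(sample)
for name in dups:
    print(name)
```

On the sample it reports the three cpu_atom TOPDOWN_*.ALL events and leaves CPU_CLK_UNHALTED.CORE alone.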
Thanks,
Kan