From: Jing Zhang <renyu.zj@linux.alibaba.com> To: John Garry <john.g.garry@oracle.com>, Ian Rogers <irogers@google.com>, Xing Zhengjun <zhengjun.xing@linux.intel.com>, Will Deacon <will@kernel.org>, James Clark <james.clark@arm.com>, Mike Leach <mike.leach@linaro.org>, Leo Yan <leo.yan@linaro.org> Cc: linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>, Andrew Kilroy <andrew.kilroy@arm.com>, Shuai Xue <xueshuai@linux.alibaba.com>, Zhuo Song <zhuo.song@linux.alibaba.com>, Jing Zhang <renyu.zj@linux.alibaba.com> Subject: [PATCH v5 0/6] Add metrics for neoverse-n2 Date: Tue, 3 Jan 2023 19:39:30 +0800 [thread overview] Message-ID: <1672745976-2800146-1-git-send-email-renyu.zj@linux.alibaba.com> (raw) Changes since v4: - Add MPKI/PKI “ScaleUnit”; - Add acked-by from Ian Rogers; - Link: https://lore.kernel.org/all/1671799045-1108027-1-git-send-email-renyu.zj@linux.alibaba.com/ Changes since v3: - Add ipc_rate metric; - Drop the PublicDescription; - Describe PEutilization metrics in more detail; - Link: https://lore.kernel.org/all/1669310088-13482-1-git-send-email-renyu.zj@linux.alibaba.com/ Changes since v2: - Correct the furmula of Branch metrics; - Add more PE utilization metrics; - Add more TLB metrics; - Add “ScaleUnit” for some metrics; - Add a newline at the end of the file; - Link: https://lore.kernel.org/all/1668411720-3581-1-git-send-email-renyu.zj@linux.alibaba.com/ Changes since v1: - Corrected formula for topdown L1 due to wrong counts for stall_slot and stall_slot_frontend; - Link: https://lore.kernel.org/all/1667214694-89839-1-git-send-email-renyu.zj@linux.alibaba.com/ This series add six metricgroups for neoverse-n2, among which, the formula of topdown L1 is from ARM sbsa7.0 platform design document [0], D37-38. However, due to the wrong count of stall_slot and stall_slot_frontend on neoverse-n2, the real stall_slot and real stall_slot_frontend need to subtract cpu_cycles, so correct the expression of topdown metrics. Reference from ARM neoverse-n2 errata notice [1], D117. Since neoverse-n2 does not yet support topdown L2, metricgroups such as Cache, TLB, Branch, InstructionsMix, and PEutilization are added to help further analysis of performance bottlenecks. Reference from ARM PMU guide [2][3]. [0] https://documentation-service.arm.com/static/60250c7395978b529036da86?token= [1] https://documentation-service.arm.com/static/636a66a64e6cf12278ad89cb?token= [2] https://documentation-service.arm.com/static/628f8fa3dfaf015c2b76eae8?token= [3] https://documentation-service.arm.com/static/62cfe21e31ea212bb6627393?token= $./perf list ... Metric Groups: Branch: branch_miss_pred_rate [The rate of branches mis-predited to the overall branches] branch_mpki [The rate of branches mis-predicted per kilo instructions] branch_pki [The rate of branches retired per kilo instructions] Cache: l1d_cache_miss_rate [The rate of L1 D-Cache misses to the overall L1 D-Cache] l1d_cache_mpki [The rate of L1 D-Cache misses per kilo instructions] ... $sudo ./perf stat -M TLB false_sharing 2 Performance counter stats for 'false_sharing 2': 31,254 L2D_TLB # 19.6 % l2_tlb_miss_rate (42.51%) 6,130 L2D_TLB_REFILL (42.51%) 1,910 L1I_TLB_REFILL # 0.1 % l1i_tlb_miss_rate (42.93%) 2,306,689 L1I_TLB (42.93%) 324,924,247 L1D_TLB # 0.0 % l1d_tlb_miss_rate (43.98%) 22,688 L1D_TLB_REFILL (43.98%) 627,992 L1I_TLB # 0.0 % itlb_walk_rate (44.26%) 92 ITLB_WALK (44.26%) 772,445,613 INST_RETIRED # 0.0 MPKI itlb_mpki (43.94%) 88 ITLB_WALK (43.94%) 907 DTLB_WALK # 0.0 % dtlb_walk_rate (43.10%) 259,132,258 L1D_TLB (43.10%) 804,080,968 INST_RETIRED # 0.0 MPKI dtlb_mpki (42.22%) 937 DTLB_WALK (42.22%) 0.479544400 seconds time elapsed 1.264233000 seconds user 0.000000000 seconds sys $sudo ./perf stat -M TopDownL1 false_sharing 2 Performance counter stats for 'false_sharing 2': 4,310,905,590 cpu_cycles # 0.0 % bad_speculation # 4.0 % retiring (66.87%) 25,009,763,735 stall_slot (66.87%) 855,659,327 op_spec (66.87%) 854,335,288 op_retired (66.87%) 4,330,308,058 cpu_cycles # 27.1 % frontend_bound (66.99%) 10,207,186,460 stall_slot_frontend (66.99%) 4,316,583,673 cpu_cycles # 69.4 % backend_bound (66.65%) 14,979,136,808 stall_slot_backend (66.65%) 0.572056818 seconds time elapsed 1.572143000 seconds user 0.004010000 seconds sys Jing Zhang (6): perf vendor events arm64: Add topdown L1 metrics for neoverse-n2 perf vendor events arm64: Add TLB metrics for neoverse-n2 perf vendor events arm64: Add cache metrics for neoverse-n2 perf vendor events arm64: Add branch metrics for neoverse-n2 perf vendor events arm64: Add PE utilization metrics for neoverse-n2 perf vendor events arm64: Add instruction mix metrics for neoverse-n2 .../arch/arm64/arm/neoverse-n2/metrics.json | 286 +++++++++++++++++++++ 1 file changed, 286 insertions(+) create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json -- 1.8.3.1
WARNING: multiple messages have this Message-ID (diff)
From: Jing Zhang <renyu.zj@linux.alibaba.com> To: John Garry <john.g.garry@oracle.com>, Ian Rogers <irogers@google.com>, Xing Zhengjun <zhengjun.xing@linux.intel.com>, Will Deacon <will@kernel.org>, James Clark <james.clark@arm.com>, Mike Leach <mike.leach@linaro.org>, Leo Yan <leo.yan@linaro.org> Cc: linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>, Andrew Kilroy <andrew.kilroy@arm.com>, Shuai Xue <xueshuai@linux.alibaba.com>, Zhuo Song <zhuo.song@linux.alibaba.com>, Jing Zhang <renyu.zj@linux.alibaba.com> Subject: [PATCH v5 0/6] Add metrics for neoverse-n2 Date: Tue, 3 Jan 2023 19:39:30 +0800 [thread overview] Message-ID: <1672745976-2800146-1-git-send-email-renyu.zj@linux.alibaba.com> (raw) Changes since v4: - Add MPKI/PKI “ScaleUnit”; - Add acked-by from Ian Rogers; - Link: https://lore.kernel.org/all/1671799045-1108027-1-git-send-email-renyu.zj@linux.alibaba.com/ Changes since v3: - Add ipc_rate metric; - Drop the PublicDescription; - Describe PEutilization metrics in more detail; - Link: https://lore.kernel.org/all/1669310088-13482-1-git-send-email-renyu.zj@linux.alibaba.com/ Changes since v2: - Correct the furmula of Branch metrics; - Add more PE utilization metrics; - Add more TLB metrics; - Add “ScaleUnit” for some metrics; - Add a newline at the end of the file; - Link: https://lore.kernel.org/all/1668411720-3581-1-git-send-email-renyu.zj@linux.alibaba.com/ Changes since v1: - Corrected formula for topdown L1 due to wrong counts for stall_slot and stall_slot_frontend; - Link: https://lore.kernel.org/all/1667214694-89839-1-git-send-email-renyu.zj@linux.alibaba.com/ This series add six metricgroups for neoverse-n2, among which, the formula of topdown L1 is from ARM sbsa7.0 platform design document [0], D37-38. However, due to the wrong count of stall_slot and stall_slot_frontend on neoverse-n2, the real stall_slot and real stall_slot_frontend need to subtract cpu_cycles, so correct the expression of topdown metrics. Reference from ARM neoverse-n2 errata notice [1], D117. Since neoverse-n2 does not yet support topdown L2, metricgroups such as Cache, TLB, Branch, InstructionsMix, and PEutilization are added to help further analysis of performance bottlenecks. Reference from ARM PMU guide [2][3]. [0] https://documentation-service.arm.com/static/60250c7395978b529036da86?token= [1] https://documentation-service.arm.com/static/636a66a64e6cf12278ad89cb?token= [2] https://documentation-service.arm.com/static/628f8fa3dfaf015c2b76eae8?token= [3] https://documentation-service.arm.com/static/62cfe21e31ea212bb6627393?token= $./perf list ... Metric Groups: Branch: branch_miss_pred_rate [The rate of branches mis-predited to the overall branches] branch_mpki [The rate of branches mis-predicted per kilo instructions] branch_pki [The rate of branches retired per kilo instructions] Cache: l1d_cache_miss_rate [The rate of L1 D-Cache misses to the overall L1 D-Cache] l1d_cache_mpki [The rate of L1 D-Cache misses per kilo instructions] ... $sudo ./perf stat -M TLB false_sharing 2 Performance counter stats for 'false_sharing 2': 31,254 L2D_TLB # 19.6 % l2_tlb_miss_rate (42.51%) 6,130 L2D_TLB_REFILL (42.51%) 1,910 L1I_TLB_REFILL # 0.1 % l1i_tlb_miss_rate (42.93%) 2,306,689 L1I_TLB (42.93%) 324,924,247 L1D_TLB # 0.0 % l1d_tlb_miss_rate (43.98%) 22,688 L1D_TLB_REFILL (43.98%) 627,992 L1I_TLB # 0.0 % itlb_walk_rate (44.26%) 92 ITLB_WALK (44.26%) 772,445,613 INST_RETIRED # 0.0 MPKI itlb_mpki (43.94%) 88 ITLB_WALK (43.94%) 907 DTLB_WALK # 0.0 % dtlb_walk_rate (43.10%) 259,132,258 L1D_TLB (43.10%) 804,080,968 INST_RETIRED # 0.0 MPKI dtlb_mpki (42.22%) 937 DTLB_WALK (42.22%) 0.479544400 seconds time elapsed 1.264233000 seconds user 0.000000000 seconds sys $sudo ./perf stat -M TopDownL1 false_sharing 2 Performance counter stats for 'false_sharing 2': 4,310,905,590 cpu_cycles # 0.0 % bad_speculation # 4.0 % retiring (66.87%) 25,009,763,735 stall_slot (66.87%) 855,659,327 op_spec (66.87%) 854,335,288 op_retired (66.87%) 4,330,308,058 cpu_cycles # 27.1 % frontend_bound (66.99%) 10,207,186,460 stall_slot_frontend (66.99%) 4,316,583,673 cpu_cycles # 69.4 % backend_bound (66.65%) 14,979,136,808 stall_slot_backend (66.65%) 0.572056818 seconds time elapsed 1.572143000 seconds user 0.004010000 seconds sys Jing Zhang (6): perf vendor events arm64: Add topdown L1 metrics for neoverse-n2 perf vendor events arm64: Add TLB metrics for neoverse-n2 perf vendor events arm64: Add cache metrics for neoverse-n2 perf vendor events arm64: Add branch metrics for neoverse-n2 perf vendor events arm64: Add PE utilization metrics for neoverse-n2 perf vendor events arm64: Add instruction mix metrics for neoverse-n2 .../arch/arm64/arm/neoverse-n2/metrics.json | 286 +++++++++++++++++++++ 1 file changed, 286 insertions(+) create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json -- 1.8.3.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next reply other threads:[~2023-01-03 11:40 UTC|newest] Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-01-03 11:39 Jing Zhang [this message] 2023-01-03 11:39 ` [PATCH v5 0/6] Add metrics for neoverse-n2 Jing Zhang 2023-01-03 11:39 ` [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 " Jing Zhang 2023-01-03 11:39 ` Jing Zhang 2023-01-03 11:52 ` John Garry 2023-01-03 11:52 ` John Garry 2023-01-04 5:05 ` Jing Zhang 2023-01-04 5:05 ` Jing Zhang 2023-01-04 17:26 ` John Garry 2023-01-04 17:26 ` John Garry 2023-01-05 10:05 ` Jing Zhang 2023-01-05 10:05 ` Jing Zhang 2023-01-05 10:13 ` John Garry 2023-01-05 10:13 ` John Garry 2023-01-05 11:02 ` Jing Zhang 2023-01-05 11:02 ` Jing Zhang 2023-01-05 21:13 ` Ian Rogers 2023-01-05 21:13 ` Ian Rogers 2023-01-06 10:14 ` John Garry 2023-01-06 10:14 ` John Garry 2023-01-06 10:34 ` Jing Zhang 2023-01-06 10:34 ` Jing Zhang 2023-01-09 15:34 ` James Clark 2023-01-09 15:34 ` James Clark 2023-01-11 6:14 ` Ian Rogers 2023-01-11 6:14 ` Ian Rogers 2023-01-03 11:39 ` [PATCH v5 2/6] perf vendor events arm64: Add TLB " Jing Zhang 2023-01-03 11:39 ` Jing Zhang 2023-01-03 17:14 ` Ian Rogers 2023-01-03 17:14 ` Ian Rogers 2023-01-04 5:21 ` Jing Zhang 2023-01-04 5:21 ` Jing Zhang 2023-01-04 8:40 ` Jing Zhang 2023-01-04 8:40 ` Jing Zhang 2023-01-04 16:57 ` Ian Rogers 2023-01-04 16:57 ` Ian Rogers 2023-01-03 11:39 ` [PATCH v5 3/6] perf vendor events arm64: Add cache " Jing Zhang 2023-01-03 11:39 ` Jing Zhang 2023-01-03 11:39 ` [PATCH v5 4/6] perf vendor events arm64: Add branch " Jing Zhang 2023-01-03 11:39 ` Jing Zhang 2023-01-03 11:39 ` [PATCH v5 5/6] perf vendor events arm64: Add PE utilization " Jing Zhang 2023-01-03 11:39 ` Jing Zhang 2023-01-03 11:39 ` [PATCH v5 6/6] perf vendor events arm64: Add instruction mix " Jing Zhang 2023-01-03 11:39 ` Jing Zhang
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1672745976-2800146-1-git-send-email-renyu.zj@linux.alibaba.com \ --to=renyu.zj@linux.alibaba.com \ --cc=acme@kernel.org \ --cc=alexander.shishkin@linux.intel.com \ --cc=andrew.kilroy@arm.com \ --cc=irogers@google.com \ --cc=james.clark@arm.com \ --cc=john.g.garry@oracle.com \ --cc=jolsa@kernel.org \ --cc=leo.yan@linaro.org \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-perf-users@vger.kernel.org \ --cc=mark.rutland@arm.com \ --cc=mike.leach@linaro.org \ --cc=mingo@redhat.com \ --cc=namhyung@kernel.org \ --cc=peterz@infradead.org \ --cc=will@kernel.org \ --cc=xueshuai@linux.alibaba.com \ --cc=zhengjun.xing@linux.intel.com \ --cc=zhuo.song@linux.alibaba.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.