From: Jing Zhang <renyu.zj@linux.alibaba.com>
To: linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org,
    linux-kernel@vger.kernel.org, John Garry <john.garry@huawei.com>,
    Will Deacon <will@kernel.org>, James Clark <james.clark@arm.com>,
    Mike Leach <mike.leach@linaro.org>, Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>,
    Arnaldo Carvalho de Melo <acme@kernel.org>,
    Mark Rutland <mark.rutland@arm.com>,
    Alexander Shishkin <alexander.shishkin@linux.intel.com>,
    Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
    Andrew Kilroy <andrew.kilroy@arm.com>,
    Shuai Xue <xueshuai@linux.alibaba.com>,
    Zhuo Song <zhuo.song@linux.alibaba.com>,
    Jing Zhang <renyu.zj@linux.alibaba.com>
Subject: [RFC PATCH v2 0/6] Add metrics for neoverse-n2
Date: Mon, 14 Nov 2022 15:41:54 +0800
Message-ID: <1668411720-3581-1-git-send-email-renyu.zj@linux.alibaba.com>
In-Reply-To: <1667214694-89839-1-git-send-email-renyu.zj@linux.alibaba.com>

Changes since v1:
- Corrected the formula for topdown L1, because stall_slot and
  stall_slot_frontend are miscounted;
- Link: https://lore.kernel.org/all/1667214694-89839-1-git-send-email-renyu.zj@linux.alibaba.com/

This series adds six metricgroups for neoverse-n2. The topdown L1
formula is taken from this document:
https://documentation-service.arm.com/static/60250c7395978b529036da86?token=

On neoverse-n2, stall_slot and stall_slot_frontend are miscounted: the
real stall_slot and the real stall_slot_frontend are obtained by
subtracting cpu_cycles from the raw counts. The topdownL1 metrics are
therefore calculated from the corrected values.

Since neoverse-n2 does not yet support topdown L2, metricgroups such as
Cache, TLB, Branch, InstructionsMix, and PEutilization are added to help
further analysis of performance bottlenecks.

With this series on neoverse-n2:

$./perf list
...
Metric Groups:

Branch:
  branch_miss_pred_rate
       [The rate of branches mis-predited to the overall branches]
  branch_mpki
       [The rate of branches mis-predicted per kilo instructions]
  branch_pki
       [The rate of branches retired per kilo instructions]
Cache:
  l1d_cache_miss_rate
       [The rate of L1 D-Cache misses to the overall L1 D-Cache]
  l1d_cache_mpki
       [The rate of L1 D-Cache misses per kilo instructions]
...

$sudo ./perf stat -a -M TLB sleep 1

 Performance counter stats for 'system wide':

        35,861,936      L1I_TLB           #     0.00 itlb_walk_rate   (74.91%)
             5,661      ITLB_WALK                                     (74.91%)
        97,279,240      INST_RETIRED      #     0.07 itlb_mpki        (74.91%)
             6,851      ITLB_WALK                                     (74.91%)
            26,391      DTLB_WALK         #     0.00 dtlb_walk_rate   (75.07%)
        35,585,545      L1D_TLB                                       (75.07%)
        85,923,244      INST_RETIRED      #     0.35 dtlb_mpki        (75.11%)
            29,992      DTLB_WALK                                     (75.11%)

       1.003450755 seconds time elapsed

$sudo ./perf stat -M TopDownL1 false_sharing 2

 Performance counter stats for 'false_sharing 2':

     3,388,884,713      cpu_cycles            #     0.05 retiring
                                              #     0.00 wasted           (66.59%)
    19,495,064,576      stall_slot                                        (66.59%)
       838,235,126      op_spec                                           (66.59%)
       836,787,162      op_retired                                        (66.59%)
     3,380,520,038      cpu_cycles            #     0.29 frontend_bound   (67.15%)
     8,267,545,049      stall_slot_frontend                               (67.15%)
     3,389,138,804      cpu_cycles            #     0.67 backend_bound    (66.66%)
    11,337,766,816      stall_slot_backend                                (66.66%)

       0.442572628 seconds time elapsed

       1.235153000 seconds user
       0.000000000 seconds sys

Jing Zhang (6):
  perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  perf vendor events arm64: Add TLB metrics for neoverse-n2
  perf vendor events arm64: Add cache metrics for neoverse-n2
  perf vendor events arm64: Add branch metrics for neoverse-n2
  perf vendor events arm64: Add PE utilization metrics for neoverse-n2
  perf vendor events arm64: Add instruction mix metrics for neoverse-n2

 .../arch/arm64/arm/neoverse-n2/metrics.json | 247 +++++++++++++++++++++
 1 file changed, 247 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json

--
1.8.3.1
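
For readers who want to check the numbers in the perf stat output, the stall_slot correction described in the cover letter can be sketched roughly as below. This is only an illustration, not the metric expressions from the patches themselves: the function names are hypothetical, and the slot width of 5 per cycle is an assumption that does not appear in this message (it was picked because it reproduces the reported values). The sketch also uses a single cpu_cycles reading, whereas perf samples it once per multiplexed group.

```python
# Hypothetical sketch of the topdown L1 arithmetic with the stall_slot
# correction (subtract cpu_cycles from the raw counts).
# ASSUMPTION: 5 slots per cycle; this value is not stated in the message.
SLOTS_PER_CYCLE = 5


def topdown_l1(cpu_cycles, stall_slot, stall_slot_frontend,
               stall_slot_backend, op_spec, op_retired):
    """Return the four topdown L1 fractions from raw counter values."""
    total_slots = SLOTS_PER_CYCLE * cpu_cycles
    # Corrected counts: the raw events over-count by one per cycle.
    real_stall_slot = stall_slot - cpu_cycles
    real_stall_slot_frontend = stall_slot_frontend - cpu_cycles
    not_stalled = 1 - real_stall_slot / total_slots
    retired_ratio = op_retired / op_spec
    return {
        "frontend_bound": real_stall_slot_frontend / total_slots,
        "backend_bound": stall_slot_backend / total_slots,
        "retiring": retired_ratio * not_stalled,
        "wasted": (1 - retired_ratio) * not_stalled,
    }


def per_kilo_instructions(events, inst_retired):
    """Events per 1000 retired instructions (the *_mpki / *_pki metrics)."""
    return events / inst_retired * 1000


if __name__ == "__main__":
    # Counter values copied from the 'false_sharing 2' run above.
    metrics = topdown_l1(
        cpu_cycles=3_388_884_713,
        stall_slot=19_495_064_576,
        stall_slot_frontend=8_267_545_049,
        stall_slot_backend=11_337_766_816,
        op_spec=838_235_126,
        op_retired=836_787_162,
    )
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
    # Counter values copied from the TLB run above.
    print(f"itlb_mpki: {per_kilo_instructions(6_851, 97_279_240):.2f}")
    print(f"dtlb_mpki: {per_kilo_instructions(29_992, 85_923_244):.2f}")
```

Under these assumptions the results round to the values perf stat prints above (0.29 frontend_bound, 0.67 backend_bound, 0.05 retiring, 0.00 wasted, 0.07 itlb_mpki, 0.35 dtlb_mpki). The four topdown fractions do not sum to exactly 1 because the counters were multiplexed.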