From: Jing Zhang <renyu.zj@linux.alibaba.com>
To: linux-arm-kernel@lists.infradead.org,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	John Garry <john.garry@huawei.com>, Will Deacon <will@kernel.org>,
	James Clark <james.clark@arm.com>,
	Mike Leach <mike.leach@linaro.org>, Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Andrew Kilroy <andrew.kilroy@arm.com>,
	Shuai Xue <xueshuai@linux.alibaba.com>,
	Zhuo Song <zhuo.song@linux.alibaba.com>,
	Jing Zhang <renyu.zj@linux.alibaba.com>
Subject: [RFC PATCH v2 0/6] Add metrics for neoverse-n2
Date: Mon, 14 Nov 2022 15:41:54 +0800	[thread overview]
Message-ID: <1668411720-3581-1-git-send-email-renyu.zj@linux.alibaba.com> (raw)
In-Reply-To: <1667214694-89839-1-git-send-email-renyu.zj@linux.alibaba.com>

Changes since v1:
- Corrected the topdown L1 formula to account for the wrong counts of
  stall_slot and stall_slot_frontend;
- Link to v1: https://lore.kernel.org/all/1667214694-89839-1-git-send-email-renyu.zj@linux.alibaba.com/

This series adds six metric groups for neoverse-n2. The topdown L1 formula is
taken from the document:
https://documentation-service.arm.com/static/60250c7395978b529036da86?token=

On neoverse-n2, stall_slot and stall_slot_frontend are counted incorrectly:
cpu_cycles must be subtracted from each to obtain the real value. The topdownL1
metrics are therefore calculated from the corrected stall_slot and
stall_slot_frontend.
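
The correction can be illustrated with a short Python sketch. It is not the
content of metrics.json: the function name topdown_l1, the slot width of 5 per
cycle, and the exact shape of the four formulas are assumptions based on the
usual topdown L1 formulation. Only the subtraction of cpu_cycles from
stall_slot and stall_slot_frontend comes from the description above.

```python
# Illustrative sketch only. SLOTS_PER_CYCLE and the formula shapes are
# assumptions; this cover letter only states that cpu_cycles is subtracted
# from stall_slot and stall_slot_frontend.
SLOTS_PER_CYCLE = 5  # assumed slot width, not stated in this cover letter


def topdown_l1(cpu_cycles, stall_slot, stall_slot_frontend,
               stall_slot_backend, op_spec, op_retired,
               slots_per_cycle=SLOTS_PER_CYCLE):
    """Return the four topdown L1 fractions from raw event counts."""
    slots = slots_per_cycle * cpu_cycles

    # Correction described above: both counters include an extra cpu_cycles.
    real_stall_slot = stall_slot - cpu_cycles
    real_stall_slot_frontend = stall_slot_frontend - cpu_cycles

    # Fraction of slots that were not stalled, split by retired vs. wasted ops.
    not_stalled = 1.0 - real_stall_slot / slots
    retired_ratio = op_retired / op_spec

    return {
        "frontend_bound": real_stall_slot_frontend / slots,
        "backend_bound": stall_slot_backend / slots,
        "retiring": retired_ratio * not_stalled,
        "wasted": (1.0 - retired_ratio) * not_stalled,
    }
```

Feeding in the counts from the TopDownL1 run shown in this message (using the
first cpu_cycles value for every metric, although perf measures each group
separately) gives values that round to the reported 0.05 retiring, 0.00 wasted,
0.29 frontend_bound and 0.67 backend_bound.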

Since neoverse-n2 does not yet support topdown L2, metric groups such as Cache,
TLB, Branch, InstructionsMix, and PEutilization are added to help analyze
performance bottlenecks further.

With this series on neoverse-n2:

$./perf list

...
Metric Groups:

Branch:
  branch_miss_pred_rate
       [The rate of branches mis-predited to the overall branches]
  branch_mpki
       [The rate of branches mis-predicted per kilo instructions]
  branch_pki
       [The rate of branches retired per kilo instructions]
Cache:
  l1d_cache_miss_rate
       [The rate of L1 D-Cache misses to the overall L1 D-Cache]
  l1d_cache_mpki
       [The rate of L1 D-Cache misses per kilo instructions]
...


$sudo ./perf stat -a -M TLB sleep 1

 Performance counter stats for 'system wide':

        35,861,936      L1I_TLB                          #     0.00 itlb_walk_rate           (74.91%)
             5,661      ITLB_WALK                                                            (74.91%)
        97,279,240      INST_RETIRED                     #     0.07 itlb_mpki                (74.91%)
             6,851      ITLB_WALK                                                            (74.91%)
            26,391      DTLB_WALK                        #     0.00 dtlb_walk_rate           (75.07%)
        35,585,545      L1D_TLB                                                              (75.07%)
        85,923,244      INST_RETIRED                     #     0.35 dtlb_mpki                (75.11%)
            29,992      DTLB_WALK                                                            (75.11%)

       1.003450755 seconds time elapsed
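
The TLB numbers above can be cross-checked by hand. The sketch below assumes
that a walk rate is walks divided by TLB accesses and that an mpki value is
walks per 1000 retired instructions. The helper names walk_rate and mpki are
hypothetical and are not taken from metrics.json.

```python
# Illustrative sketch only. The formulas are inferred from the metric names
# and the counts printed above, not copied from metrics.json.

def walk_rate(walks, tlb_accesses):
    """Fraction of TLB accesses that resulted in a page-table walk."""
    return walks / tlb_accesses


def mpki(walks, inst_retired):
    """Page-table walks per 1000 retired instructions."""
    return 1000.0 * walks / inst_retired


# Counts taken from the 'perf stat -a -M TLB sleep 1' output above.
itlb_walk_rate = walk_rate(5661, 35861936)
itlb_mpki = mpki(6851, 97279240)
dtlb_walk_rate = walk_rate(26391, 35585545)
dtlb_mpki = mpki(29992, 85923244)

print(f"itlb_walk_rate {itlb_walk_rate:.2f}  itlb_mpki {itlb_mpki:.2f}")
print(f"dtlb_walk_rate {dtlb_walk_rate:.2f}  dtlb_mpki {dtlb_mpki:.2f}")
```

Each pair of events is counted in its own multiplexed group, which is why
ITLB_WALK and DTLB_WALK appear twice with slightly different values.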


$sudo ./perf stat -M TopDownL1 false_sharing 2

Performance counter stats for 'false_sharing 2':

     3,388,884,713      cpu_cycles                       #     0.05 retiring
                                                  #     0.00 wasted                   (66.59%)
    19,495,064,576      stall_slot                                                           (66.59%)
       838,235,126      op_spec                                                              (66.59%)
       836,787,162      op_retired                                                           (66.59%)
     3,380,520,038      cpu_cycles                       #     0.29 frontend_bound           (67.15%)
     8,267,545,049      stall_slot_frontend                                                  (67.15%)
     3,389,138,804      cpu_cycles                       #     0.67 backend_bound            (66.66%)
    11,337,766,816      stall_slot_backend                                                   (66.66%)

       0.442572628 seconds time elapsed

       1.235153000 seconds user
       0.000000000 seconds sys

Jing Zhang (6):
  perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  perf vendor events arm64: Add TLB metrics for neoverse-n2
  perf vendor events arm64: Add cache metrics for neoverse-n2
  perf vendor events arm64: Add branch metrics for neoverse-n2
  perf vendor events arm64: Add PE utilization metrics for neoverse-n2
  perf vendor events arm64: Add instruction mix metrics for neoverse-n2

 .../arch/arm64/arm/neoverse-n2/metrics.json        | 247 +++++++++++++++++++++
 1 file changed, 247 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json

-- 
1.8.3.1

