* [PATCH v5 0/6] Add metrics for neoverse-n2
@ 2023-01-03 11:39 ` Jing Zhang
  0 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-03 11:39 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song, Jing Zhang

Changes since v4:
- Add MPKI/PKI “ScaleUnit”;
- Add acked-by from Ian Rogers;
- Link: https://lore.kernel.org/all/1671799045-1108027-1-git-send-email-renyu.zj@linux.alibaba.com/

Changes since v3:
- Add ipc_rate metric;
- Drop the PublicDescription;
- Describe PEutilization metrics in more detail;
- Link: https://lore.kernel.org/all/1669310088-13482-1-git-send-email-renyu.zj@linux.alibaba.com/

Changes since v2:
- Correct the formula of Branch metrics;
- Add more PE utilization metrics;
- Add more TLB metrics;
- Add “ScaleUnit” for some metrics;
- Add a newline at the end of the file;
- Link: https://lore.kernel.org/all/1668411720-3581-1-git-send-email-renyu.zj@linux.alibaba.com/

Changes since v1:
- Corrected formula for topdown L1 due to wrong counts for stall_slot and
  stall_slot_frontend; 
- Link: https://lore.kernel.org/all/1667214694-89839-1-git-send-email-renyu.zj@linux.alibaba.com/


This series adds six metric groups for neoverse-n2. The topdown L1 formula is
taken from the ARM SBSA 7.0 platform design document [0], D37-38.

However, stall_slot and stall_slot_frontend are miscounted on neoverse-n2:
the real stall_slot and stall_slot_frontend values are the raw counts minus
cpu_cycles, so the topdown metric expressions are corrected accordingly.
See the ARM neoverse-n2 errata notice [1], D117.

Since neoverse-n2 does not yet support topdown L2, metric groups such as Cache,
TLB, Branch, InstructionsMix, and PEutilization are added to help further
analysis of performance bottlenecks. See the ARM PMU guides [2][3].

[0] https://documentation-service.arm.com/static/60250c7395978b529036da86?token=
[1] https://documentation-service.arm.com/static/636a66a64e6cf12278ad89cb?token=
[2] https://documentation-service.arm.com/static/628f8fa3dfaf015c2b76eae8?token=
[3] https://documentation-service.arm.com/static/62cfe21e31ea212bb6627393?token=


$./perf list
...
Metric Groups:

Branch:
  branch_miss_pred_rate
       [The rate of branches mis-predited to the overall branches]
  branch_mpki
       [The rate of branches mis-predicted per kilo instructions]
  branch_pki
       [The rate of branches retired per kilo instructions]
Cache:
  l1d_cache_miss_rate
       [The rate of L1 D-Cache misses to the overall L1 D-Cache]
  l1d_cache_mpki
       [The rate of L1 D-Cache misses per kilo instructions]
...


$sudo ./perf stat -M TLB false_sharing 2

 Performance counter stats for 'false_sharing 2':

            31,254      L2D_TLB                          #     19.6 %  l2_tlb_miss_rate      (42.51%)
             6,130      L2D_TLB_REFILL                                                       (42.51%)
             1,910      L1I_TLB_REFILL                   #      0.1 %  l1i_tlb_miss_rate     (42.93%)
         2,306,689      L1I_TLB                                                              (42.93%)
       324,924,247      L1D_TLB                          #      0.0 %  l1d_tlb_miss_rate     (43.98%)
            22,688      L1D_TLB_REFILL                                                       (43.98%)
           627,992      L1I_TLB                          #      0.0 %  itlb_walk_rate        (44.26%)
                92      ITLB_WALK                                                            (44.26%)
       772,445,613      INST_RETIRED                     #      0.0 MPKI  itlb_mpki          (43.94%)
                88      ITLB_WALK                                                            (43.94%)
               907      DTLB_WALK                        #      0.0 %  dtlb_walk_rate        (43.10%)
       259,132,258      L1D_TLB                                                              (43.10%)
       804,080,968      INST_RETIRED                     #      0.0 MPKI  dtlb_mpki          (42.22%)
               937      DTLB_WALK                                                            (42.22%)

       0.479544400 seconds time elapsed

       1.264233000 seconds user
       0.000000000 seconds sys
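
As a rough cross-check of the TLB output above, the ratios can be recomputed
by hand from the raw counts. The sketch below is illustrative only: the helper
names ratio_percent and per_kilo are hypothetical, the expressions follow the
MetricExpr strings of the TLB patch in this series, and the counts are copied
from the sample run (they come from multiplexed counters, so the results are
approximate).

```python
# Illustrative cross-check of the TLB metrics; helper names are hypothetical.


def ratio_percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a percentage (ScaleUnit "100%")."""
    return 100.0 * numerator / denominator


def per_kilo(events: int, instructions: int) -> float:
    """Return events per 1000 retired instructions (ScaleUnit "MPKI")."""
    return events / instructions * 1000


# Counts copied from the 'perf stat -M TLB' sample output.
l2_tlb_miss_rate = ratio_percent(6_130, 31_254)      # L2D_TLB_REFILL / L2D_TLB
l1i_tlb_miss_rate = ratio_percent(1_910, 2_306_689)  # L1I_TLB_REFILL / L1I_TLB
dtlb_mpki = per_kilo(937, 804_080_968)               # DTLB_WALK / INST_RETIRED * 1000

print(f"l2_tlb_miss_rate  {l2_tlb_miss_rate:.1f} %")
print(f"l1i_tlb_miss_rate {l1i_tlb_miss_rate:.1f} %")
print(f"dtlb_mpki         {dtlb_mpki:.1f} MPKI")
```

Each metric is paired with the counts from its own output rows because perf
schedules the metrics in separate multiplexed groups, which is why the same
event can appear twice with different values.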


$sudo ./perf stat -M TopDownL1 false_sharing 2

 Performance counter stats for 'false_sharing 2':

     4,310,905,590      cpu_cycles                       #      0.0 %  bad_speculation
                                                  #      4.0 %  retiring              (66.87%)
    25,009,763,735      stall_slot                                                           (66.87%)
       855,659,327      op_spec                                                              (66.87%)
       854,335,288      op_retired                                                           (66.87%)
     4,330,308,058      cpu_cycles                       #     27.1 %  frontend_bound        (66.99%)
    10,207,186,460      stall_slot_frontend                                                  (66.99%)
     4,316,583,673      cpu_cycles                       #     69.4 %  backend_bound         (66.65%)
    14,979,136,808      stall_slot_backend                                                   (66.65%)

       0.572056818 seconds time elapsed

       1.572143000 seconds user
       0.004010000 seconds sys
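
The topdown L1 percentages above can be re-derived from the raw counts with
the corrected expressions from patch 1/6. This is only a sketch: the function
names and the constant SLOT_FACTOR are hypothetical, and reading the literal 5
as "slots per cycle" is an assumption rather than something the patch states.

```python
# Illustrative re-computation of the topdown L1 metrics; names are hypothetical.

SLOT_FACTOR = 5  # the literal 5 in "5 * cpu_cycles"; presumably slots per cycle (assumption)


def frontend_bound(stall_slot_frontend: int, cpu_cycles: int) -> float:
    # stall_slot_frontend is corrected by subtracting cpu_cycles (errata workaround).
    return (stall_slot_frontend - cpu_cycles) / (SLOT_FACTOR * cpu_cycles)


def backend_bound(stall_slot_backend: int, cpu_cycles: int) -> float:
    # stall_slot_backend is used as-is in the MetricExpr.
    return stall_slot_backend / (SLOT_FACTOR * cpu_cycles)


def unstalled_fraction(stall_slot: int, cpu_cycles: int) -> float:
    # Fraction of slots not stalled, using the corrected stall_slot.
    return 1 - (stall_slot - cpu_cycles) / (SLOT_FACTOR * cpu_cycles)


def retiring(op_retired: int, op_spec: int, stall_slot: int, cpu_cycles: int) -> float:
    return (op_retired / op_spec) * unstalled_fraction(stall_slot, cpu_cycles)


def bad_speculation(op_retired: int, op_spec: int, stall_slot: int, cpu_cycles: int) -> float:
    return (1 - op_retired / op_spec) * unstalled_fraction(stall_slot, cpu_cycles)


# Counts copied from the 'perf stat -M TopDownL1' sample output, one group per metric.
fe = frontend_bound(10_207_186_460, 4_330_308_058)
be = backend_bound(14_979_136_808, 4_316_583_673)
ret = retiring(854_335_288, 855_659_327, 25_009_763_735, 4_310_905_590)
bad = bad_speculation(854_335_288, 855_659_327, 25_009_763_735, 4_310_905_590)

for name, value in [("frontend_bound", fe), ("backend_bound", be),
                    ("retiring", ret), ("bad_speculation", bad)]:
    print(f"{name:16s} {100 * value:5.1f} %")
```

The four values need not sum to exactly 100% here, since each metric is
computed from its own multiplexed counter group.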


Jing Zhang (6):
  perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  perf vendor events arm64: Add TLB metrics for neoverse-n2
  perf vendor events arm64: Add cache metrics for neoverse-n2
  perf vendor events arm64: Add branch metrics for neoverse-n2
  perf vendor events arm64: Add PE utilization metrics for neoverse-n2
  perf vendor events arm64: Add instruction mix metrics for neoverse-n2

 .../arch/arm64/arm/neoverse-n2/metrics.json        | 286 +++++++++++++++++++++
 1 file changed, 286 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json

-- 
1.8.3.1



* [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-03 11:39 ` Jing Zhang
@ 2023-01-03 11:39   ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-03 11:39 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song, Jing Zhang

The topdown L1 formula for neoverse-n2 is taken from the ARM SBSA 7.0
platform design document [0], D37-38.

However, stall_slot and stall_slot_frontend are miscounted on neoverse-n2:
the real stall_slot and stall_slot_frontend values are the raw counts minus
cpu_cycles, so the topdown metric expressions are corrected accordingly.
See the ARM neoverse-n2 errata notice [1], D117.

Since neoverse-n2 does not yet support topdown L2, metric groups such as
Cache, TLB, Branch, InstructionsMix, and PEutilization will be added in the
following patches to help further analysis of performance bottlenecks.
See the ARM PMU guides [2][3].

[0] https://documentation-service.arm.com/static/60250c7395978b529036da86?token=
[1] https://documentation-service.arm.com/static/636a66a64e6cf12278ad89cb?token=
[2] https://documentation-service.arm.com/static/628f8fa3dfaf015c2b76eae8?token=
[3] https://documentation-service.arm.com/static/62cfe21e31ea212bb6627393?token=

Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Acked-by: Ian Rogers <irogers@google.com>
---
 .../arch/arm64/arm/neoverse-n2/metrics.json        | 30 ++++++++++++++++++++++
 1 file changed, 30 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json

diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
new file mode 100644
index 0000000..c126f1bc
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
@@ -0,0 +1,30 @@
+[
+    {
+        "MetricExpr": "(stall_slot_frontend - cpu_cycles) / (5 * cpu_cycles)",
+        "BriefDescription": "Frontend bound L1 topdown metric",
+        "MetricGroup": "TopdownL1",
+        "MetricName": "frontend_bound",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "(1 - op_retired / op_spec) * (1 - (stall_slot - cpu_cycles) / (5 * cpu_cycles))",
+        "BriefDescription": "Bad speculation L1 topdown metric",
+        "MetricGroup": "TopdownL1",
+        "MetricName": "bad_speculation",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "(op_retired / op_spec) * (1 - (stall_slot - cpu_cycles) / (5 * cpu_cycles))",
+        "BriefDescription": "Retiring L1 topdown metric",
+        "MetricGroup": "TopdownL1",
+        "MetricName": "retiring",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "stall_slot_backend / (5 * cpu_cycles)",
+        "BriefDescription": "Backend Bound L1 topdown metric",
+        "MetricGroup": "TopdownL1",
+        "MetricName": "backend_bound",
+        "ScaleUnit": "100%"
+    }
+]
-- 
1.8.3.1



* [PATCH v5 2/6] perf vendor events arm64: Add TLB metrics for neoverse-n2
  2023-01-03 11:39 ` Jing Zhang
@ 2023-01-03 11:39   ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-03 11:39 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song, Jing Zhang

Add TLB related metrics.

Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Acked-by: Ian Rogers <irogers@google.com>
---
 .../arch/arm64/arm/neoverse-n2/metrics.json        | 49 ++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
index c126f1bc..8a74e07 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
@@ -26,5 +26,54 @@
         "MetricGroup": "TopdownL1",
         "MetricName": "backend_bound",
         "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
+        "BriefDescription": "The rate of L1D TLB refill to the overall L1D TLB lookups",
+        "MetricGroup": "TLB",
+        "MetricName": "l1d_tlb_miss_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
+        "BriefDescription": "The rate of L1I TLB refill to the overall L1I TLB lookups",
+        "MetricGroup": "TLB",
+        "MetricName": "l1i_tlb_miss_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
+        "BriefDescription": "The rate of L2D TLB refill to the overall L2D TLB lookups",
+        "MetricGroup": "TLB",
+        "MetricName": "l2_tlb_miss_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
+        "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
+        "MetricGroup": "TLB",
+        "MetricName": "dtlb_mpki",
+        "ScaleUnit": "MPKI"
+    },
+    {
+        "MetricExpr": "DTLB_WALK / L1D_TLB",
+        "BriefDescription": "The rate of DTLB Walks to the overall L1D TLB lookups",
+        "MetricGroup": "TLB",
+        "MetricName": "dtlb_walk_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
+        "BriefDescription": "The rate of TLB Walks per kilo instructions for instruction accesses",
+        "MetricGroup": "TLB",
+        "MetricName": "itlb_mpki",
+        "ScaleUnit": "MPKI"
+    },
+    {
+        "MetricExpr": "ITLB_WALK / L1I_TLB",
+        "BriefDescription": "The rate of ITLB Walks to the overall L1I TLB lookups",
+        "MetricGroup": "TLB",
+        "MetricName": "itlb_walk_rate",
+        "ScaleUnit": "100%"
     }
 ]
-- 
1.8.3.1



* [PATCH v5 3/6] perf vendor events arm64: Add cache metrics for neoverse-n2
  2023-01-03 11:39 ` Jing Zhang
@ 2023-01-03 11:39   ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-03 11:39 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song, Jing Zhang

Add cache related metrics.

Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Acked-by: Ian Rogers <irogers@google.com>
---
 .../arch/arm64/arm/neoverse-n2/metrics.json        | 77 ++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
index 8a74e07..f81b40d 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
@@ -75,5 +75,82 @@
         "MetricGroup": "TLB",
         "MetricName": "itlb_walk_rate",
         "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "L1I_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "The rate of L1 I-Cache misses per kilo instructions",
+        "MetricGroup": "Cache",
+        "MetricName": "l1i_cache_mpki",
+        "ScaleUnit": "MPKI"
+    },
+    {
+        "MetricExpr": "L1I_CACHE_REFILL / L1I_CACHE",
+        "BriefDescription": "The rate of L1 I-Cache misses to the overall L1 I-Cache",
+        "MetricGroup": "Cache",
+        "MetricName": "l1i_cache_miss_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "L1D_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "The rate of L1 D-Cache misses per kilo instructions",
+        "MetricGroup": "Cache",
+        "MetricName": "l1d_cache_mpki",
+        "ScaleUnit": "MPKI"
+    },
+    {
+        "MetricExpr": "L1D_CACHE_REFILL / L1D_CACHE",
+        "BriefDescription": "The rate of L1 D-Cache misses to the overall L1 D-Cache",
+        "MetricGroup": "Cache",
+        "MetricName": "l1d_cache_miss_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "L2D_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "The rate of L2 D-Cache misses per kilo instructions",
+        "MetricGroup": "Cache",
+        "MetricName": "l2d_cache_mpki",
+        "ScaleUnit": "MPKI"
+    },
+    {
+        "MetricExpr": "L2D_CACHE_REFILL / L2D_CACHE",
+        "BriefDescription": "The rate of L2 D-Cache misses to the overall L2 D-Cache",
+        "MetricGroup": "Cache",
+        "MetricName": "l2d_cache_miss_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "L3D_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "The rate of L3 D-Cache misses per kilo instructions",
+        "MetricGroup": "Cache",
+        "MetricName": "l3d_cache_mpki",
+        "ScaleUnit": "MPKI"
+    },
+    {
+        "MetricExpr": "L3D_CACHE_REFILL / L3D_CACHE",
+        "BriefDescription": "The rate of L3 D-Cache misses to the overall L3 D-Cache",
+        "MetricGroup": "Cache",
+        "MetricName": "l3d_cache_miss_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "LL_CACHE_MISS_RD / INST_RETIRED * 1000",
+        "BriefDescription": "The rate of LL Cache read misses per kilo instructions",
+        "MetricGroup": "Cache",
+        "MetricName": "ll_cache_read_mpki",
+        "ScaleUnit": "MPKI"
+    },
+    {
+        "MetricExpr": "LL_CACHE_MISS_RD / LL_CACHE_RD",
+        "BriefDescription": "The rate of LL Cache read misses to the overall LL Cache read",
+        "MetricGroup": "Cache",
+        "MetricName": "ll_cache_read_miss_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "(LL_CACHE_RD - LL_CACHE_MISS_RD) / LL_CACHE_RD",
+        "BriefDescription": "The rate of LL Cache read hit to the overall LL Cache read",
+        "MetricGroup": "Cache",
+        "MetricName": "ll_cache_read_hit_rate",
+        "ScaleUnit": "100%"
     }
 ]
-- 
1.8.3.1
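
For anyone extending metrics.json by hand, a small sanity check along these
lines may help catch a dropped key or a broken comma before building perf.
It is a sketch, not part of the series: REQUIRED_KEYS is simply the set of
keys every entry in these patches happens to use (not a requirement imposed
by perf), group_metrics is a hypothetical helper, and the two sample entries
are copied from the TLB and cache patches.

```python
import json
from collections import defaultdict

# Keys used by every entry in this series; an assumption, not a perf rule.
REQUIRED_KEYS = {"MetricExpr", "BriefDescription", "MetricGroup",
                 "MetricName", "ScaleUnit"}

# Two entries copied from the patches; a real check would read metrics.json.
SAMPLE = """
[
    {
        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
        "BriefDescription": "The rate of L1D TLB refill to the overall L1D TLB lookups",
        "MetricGroup": "TLB",
        "MetricName": "l1d_tlb_miss_rate",
        "ScaleUnit": "100%"
    },
    {
        "MetricExpr": "L1D_CACHE_REFILL / INST_RETIRED * 1000",
        "BriefDescription": "The rate of L1 D-Cache misses per kilo instructions",
        "MetricGroup": "Cache",
        "MetricName": "l1d_cache_mpki",
        "ScaleUnit": "MPKI"
    }
]
"""


def group_metrics(text: str) -> dict:
    """Parse a metrics JSON array and return {MetricGroup: [MetricName, ...]}."""
    groups = defaultdict(list)
    for entry in json.loads(text):  # raises on malformed JSON
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            name = entry.get("MetricName", "?")
            raise ValueError(f"{name}: missing {sorted(missing)}")
        groups[entry["MetricGroup"]].append(entry["MetricName"])
    return dict(groups)


print(group_metrics(SAMPLE))
```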


+        "MetricExpr": "(LL_CACHE_RD - LL_CACHE_MISS_RD) / LL_CACHE_RD",
+        "BriefDescription": "The rate of LL Cache read hit to the overall LL Cache read",
+        "MetricGroup": "Cache",
+        "MetricName": "ll_cache_read_hit_rate",
+        "ScaleUnit": "100%"
     }
 ]
-- 
1.8.3.1
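
As a rough illustration of the two formula shapes used by the cache metrics in this patch (misses per kilo instructions, and misses over total accesses), here is a minimal Python sketch. The counter values are made up for the example and do not come from any measurement, and the helper names mpki() and miss_rate() are hypothetical; perf evaluates the MetricExpr strings itself.

```python
def mpki(misses: int, instructions: int) -> float:
    """Misses per kilo instructions, same shape as 'X_REFILL / INST_RETIRED * 1000'."""
    return misses / instructions * 1000


def miss_rate(misses: int, accesses: int) -> float:
    """Fraction of accesses that missed, same shape as 'X_REFILL / X'."""
    return misses / accesses


# Hypothetical counter readings, chosen only to keep the arithmetic easy to follow.
counts = {
    "L1D_CACHE_REFILL": 2_500,
    "L1D_CACHE": 100_000,
    "INST_RETIRED": 1_000_000,
}

l1d_mpki = mpki(counts["L1D_CACHE_REFILL"], counts["INST_RETIRED"])
l1d_miss_rate = miss_rate(counts["L1D_CACHE_REFILL"], counts["L1D_CACHE"])

print(f"l1d_cache_mpki      = {l1d_mpki:.2f} MPKI")
# The "100%" ScaleUnit corresponds to multiplying the ratio by 100 for display.
print(f"l1d_cache_miss_rate = {l1d_miss_rate * 100:.2f}%")
```

With these made-up counts the L1 D-Cache figures come out at about 2.5 MPKI and a miss rate of about 2.5%.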


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 4/6] perf vendor events arm64: Add branch metrics for neoverse-n2
  2023-01-03 11:39 ` Jing Zhang
@ 2023-01-03 11:39   ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-03 11:39 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song, Jing Zhang

Add branch related metrics.

Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Acked-by: Ian Rogers <irogers@google.com>
---
 .../arch/arm64/arm/neoverse-n2/metrics.json         | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
index f81b40d..35e3710 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
@@ -152,5 +152,26 @@
         "MetricGroup": "Cache",
         "MetricName": "ll_cache_read_hit_rate",
         "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "BR_MIS_PRED_RETIRED / INST_RETIRED * 1000",
+        "BriefDescription": "The rate of branches mis-predicted per kilo instructions",
+        "MetricGroup": "Branch",
+        "MetricName": "branch_mpki",
+        "ScaleUnit": "MPKI"
+    },
+    {
+        "MetricExpr": "BR_RETIRED / INST_RETIRED * 1000",
+        "BriefDescription": "The rate of branches retired per kilo instructions",
+        "MetricGroup": "Branch",
+        "MetricName": "branch_pki",
+        "ScaleUnit": "PKI"
+    },
+    {
+        "MetricExpr": "BR_MIS_PRED_RETIRED / BR_RETIRED",
+        "BriefDescription": "The rate of branches mis-predicted to the overall branches",
+        "MetricGroup": "Branch",
+        "MetricName": "branch_miss_pred_rate",
+        "ScaleUnit": "100%"
     }
 ]
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 5/6] perf vendor events arm64: Add PE utilization metrics for neoverse-n2
  2023-01-03 11:39 ` Jing Zhang
@ 2023-01-03 11:39   ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-03 11:39 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song, Jing Zhang

Add PE utilization related metrics. In the cpu_utilization metric, cpu_cycles
is subtracted from stall_slot to correct the erroneous stall_slot count.

Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Acked-by: Ian Rogers <irogers@google.com>
---
 .../arch/arm64/arm/neoverse-n2/metrics.json        | 46 ++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
index 35e3710..94ca91f 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
@@ -173,5 +173,51 @@
         "MetricGroup": "Branch",
         "MetricName": "branch_miss_pred_rate",
         "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "instructions / CPU_CYCLES",
+        "BriefDescription": "The average number of instructions executed for each cycle.",
+        "MetricGroup": "PEutilization",
+        "MetricName": "ipc"
+    },
+    {
+        "MetricExpr": "ipc / 5",
+        "BriefDescription": "IPC percentage of peak. The peak of IPC is 5.",
+        "MetricGroup": "PEutilization",
+        "MetricName": "ipc_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "INST_RETIRED / CPU_CYCLES",
+        "BriefDescription": "Architecturally executed Instructions Per Cycle (IPC)",
+        "MetricGroup": "PEutilization",
+        "MetricName": "retired_ipc"
+    },
+    {
+        "MetricExpr": "INST_SPEC / CPU_CYCLES",
+        "BriefDescription": "Speculatively executed Instructions Per Cycle (IPC)",
+        "MetricGroup": "PEutilization",
+        "MetricName": "spec_ipc"
+    },
+    {
+        "MetricExpr": "OP_RETIRED / OP_SPEC",
+        "BriefDescription": "Of all the micro-operations issued, what percentage are retired (committed)",
+        "MetricGroup": "PEutilization",
+        "MetricName": "retired_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "1 - OP_RETIRED / OP_SPEC",
+        "BriefDescription": "Of all the micro-operations issued, what percentage are not retired (committed)",
+        "MetricGroup": "PEutilization",
+        "MetricName": "wasted_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "OP_RETIRED / OP_SPEC * (1 - (STALL_SLOT - CPU_CYCLES) / (CPU_CYCLES * 5))",
+        "BriefDescription": "The truly effective ratio of micro-operations executed by the CPU, which means that misprediction and stall are not included",
+        "MetricGroup": "PEutilization",
+        "MetricName": "cpu_utilization",
+        "ScaleUnit": "100%"
     }
 ]
-- 
1.8.3.1
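
To make the stall_slot correction in cpu_utilization concrete, here is a minimal Python sketch of the same expression evaluated step by step. The counter values are hypothetical, the function name is made up for the example, and reading the constant 5 as "slots per cycle" is an interpretation of the MetricExpr rather than something this patch states.

```python
# Constant taken from the MetricExpr; treating it as slots per cycle is an assumption.
SLOTS_PER_CYCLE = 5


def cpu_utilization(op_retired: int, op_spec: int,
                    stall_slot: int, cpu_cycles: int) -> float:
    """Evaluate OP_RETIRED / OP_SPEC * (1 - (STALL_SLOT - CPU_CYCLES) / (CPU_CYCLES * 5))."""
    # The commit message describes this subtraction as the correction
    # for the erroneous stall_slot count.
    corrected_stall_slot = stall_slot - cpu_cycles
    total_slots = cpu_cycles * SLOTS_PER_CYCLE
    retired_fraction = op_retired / op_spec
    return retired_fraction * (1 - corrected_stall_slot / total_slots)


# Hypothetical counter readings, not measured on any hardware.
example = cpu_utilization(op_retired=900, op_spec=1000,
                          stall_slot=2000, cpu_cycles=1000)
print(f"cpu_utilization = {example * 100:.1f}%")
```

With these numbers the corrected stall count is 1000 of 5000 slots (20%), 90% of speculated micro-operations retire, and the metric evaluates to roughly 72%.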


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 6/6] perf vendor events arm64: Add instruction mix metrics for neoverse-n2
  2023-01-03 11:39 ` Jing Zhang
@ 2023-01-03 11:39   ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-03 11:39 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song, Jing Zhang

Add instruction mix related metrics.

Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Acked-by: Ian Rogers <irogers@google.com>
---
 .../arch/arm64/arm/neoverse-n2/metrics.json        | 63 ++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
index 94ca91f..3bdde8b 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
@@ -219,5 +219,68 @@
         "MetricGroup": "PEutilization",
         "MetricName": "cpu_utilization",
         "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "LD_SPEC / INST_SPEC",
+        "BriefDescription": "The rate of load instructions speculatively executed to overall instructions speculatively executed",
+        "MetricGroup": "InstructionMix",
+        "MetricName": "load_spec_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "ST_SPEC / INST_SPEC",
+        "BriefDescription": "The rate of store instructions speculatively executed to overall instructions speculatively executed",
+        "MetricGroup": "InstructionMix",
+        "MetricName": "store_spec_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "DP_SPEC / INST_SPEC",
+        "BriefDescription": "The rate of integer data-processing instructions speculatively executed to overall instructions speculatively executed",
+        "MetricGroup": "InstructionMix",
+        "MetricName": "data_process_spec_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "ASE_SPEC / INST_SPEC",
+        "BriefDescription": "The rate of advanced SIMD instructions speculatively executed to overall instructions speculatively executed",
+        "MetricGroup": "InstructionMix",
+        "MetricName": "advanced_simd_spec_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "VFP_SPEC / INST_SPEC",
+        "BriefDescription": "The rate of floating point instructions speculatively executed to overall instructions speculatively executed",
+        "MetricGroup": "InstructionMix",
+        "MetricName": "float_point_spec_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "CRYPTO_SPEC / INST_SPEC",
+        "BriefDescription": "The rate of crypto instructions speculatively executed to overall instructions speculatively executed",
+        "MetricGroup": "InstructionMix",
+        "MetricName": "crypto_spec_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "BR_IMMED_SPEC / INST_SPEC",
+        "BriefDescription": "The rate of branch immediate instructions speculatively executed to overall instructions speculatively executed",
+        "MetricGroup": "InstructionMix",
+        "MetricName": "branch_immed_spec_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "BR_RETURN_SPEC / INST_SPEC",
+        "BriefDescription": "The rate of procedure return instructions speculatively executed to overall instructions speculatively executed",
+        "MetricGroup": "InstructionMix",
+        "MetricName": "branch_return_spec_rate",
+        "ScaleUnit": "100%"
+    },
+    {
+        "MetricExpr": "BR_INDIRECT_SPEC / INST_SPEC",
+        "BriefDescription": "The rate of indirect branch instructions speculatively executed to overall instructions speculatively executed",
+        "MetricGroup": "InstructionMix",
+        "MetricName": "branch_indirect_spec_rate",
+        "ScaleUnit": "100%"
     }
 ]
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-03 11:39   ` Jing Zhang
@ 2023-01-03 11:52     ` John Garry
  -1 siblings, 0 replies; 44+ messages in thread
From: John Garry @ 2023-01-03 11:52 UTC (permalink / raw)
  To: Jing Zhang, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song

On 03/01/2023 11:39, Jing Zhang wrote:
> The formula of topdown L1 on neoverse-n2 is from ARM sbsa7.0 platform
> design document [0], D37-38.

I think that I mentioned this before - if these metrics are coming 
from an sbsa doc, then they are standard. As such, we can make them 
"arch std events" and put them in a common json such as sbsa.json, so 
that other cores may reuse them.

You don't strictly have to do this now, but it would be better.

Thanks,
John

> 
> However, due to the wrong count of stall_slot and stall_slot_frontend on
> neoverse-n2, the real stall_slot and real stall_slot_frontend need to
> subtract cpu_cycles,  so correct the expression of topdown metrics.
> Reference from ARM neoverse-n2 errata notice [1], D117.
> 
> Since neoverse-n2 does not yet support topdown L2, metricgroups such as
> Cache, TLB, Branch, InstructionsMix, and PEutilization will be added to
> further analysis of performance bottlenecks in the following patches.
> Reference from ARM PMU guide [2][3].


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/6] perf vendor events arm64: Add TLB metrics for neoverse-n2
  2023-01-03 11:39   ` Jing Zhang
@ 2023-01-03 17:14     ` Ian Rogers
  -1 siblings, 0 replies; 44+ messages in thread
From: Ian Rogers @ 2023-01-03 17:14 UTC (permalink / raw)
  To: Jing Zhang
  Cc: John Garry, Xing Zhengjun, Will Deacon, James Clark, Mike Leach,
	Leo Yan, linux-arm-kernel, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andrew Kilroy, Shuai Xue, Zhuo Song

On Tue, Jan 3, 2023 at 3:39 AM Jing Zhang <renyu.zj@linux.alibaba.com> wrote:
>
> Add TLB related metrics.
>
> Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
> Acked-by: Ian Rogers <irogers@google.com>
> ---
>  .../arch/arm64/arm/neoverse-n2/metrics.json        | 49 ++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> index c126f1bc..8a74e07 100644
> --- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> +++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> @@ -26,5 +26,54 @@
>          "MetricGroup": "TopdownL1",
>          "MetricName": "backend_bound",
>          "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
> +        "BriefDescription": "The rate of L1D TLB refill to the overall L1D TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "l1d_tlb_miss_rate",
> +        "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
> +        "BriefDescription": "The rate of L1I TLB refill to the overall L1I TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "l1i_tlb_miss_rate",
> +        "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
> +        "BriefDescription": "The rate of L2D TLB refill to the overall L2D TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "l2_tlb_miss_rate",
> +        "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
> +        "MetricGroup": "TLB",
> +        "MetricName": "dtlb_mpki",
> +        "ScaleUnit": "MPKI"
> +    },
> +    {
> +        "MetricExpr": "DTLB_WALK / L1D_TLB",
> +        "BriefDescription": "The rate of DTLB Walks to the overall L1D TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "dtlb_walk_rate",
> +        "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for instruction accesses",
> +        "MetricGroup": "TLB",
> +        "MetricName": "itlb_mpki",
> +        "ScaleUnit": "MPKI"

Did you test this? IIRC if there is no number in the ScaleUnit then
the scale factor becomes 0 and the metric value is always multiplied
by zero. Perhaps:

"MetricName": "itlb_miss_rate",
"MetricExpr": "ITLB / INST_RETIRED"
"ScaleUnit": "1000MPKI"

Thanks,
Ian

> +    },
> +    {
> +        "MetricExpr": "ITLB_WALK / L1I_TLB",
> +        "BriefDescription": "The rate of ITLB Walks to the overall L1I TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "itlb_walk_rate",
> +        "ScaleUnit": "100%"
>      }
>  ]
> --
> 1.8.3.1
>
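
The concern raised in this reply can be pictured with a small Python sketch. It is only an illustration of the behaviour described above from memory ("IIRC"), not perf's actual parsing code: the function name split_scale_unit() and the regular expression are assumptions made for the example. The idea is that a ScaleUnit string is split into a leading number and a unit suffix, and that a string with no leading number yields a scale of 0.

```python
import re
from typing import Tuple


def split_scale_unit(scale_unit: str) -> Tuple[float, str]:
    """Split a ScaleUnit-like string into (scale, unit).

    Hypothetical model: a missing numeric prefix gives a scale of 0.0,
    which is the failure mode the reply above warns about.
    """
    match = re.match(r"\s*(\d+(?:\.\d+)?)", scale_unit)
    if match is None:
        return 0.0, scale_unit.strip()
    return float(match.group(1)), scale_unit[match.end():].strip()


raw_ratio = 0.0025  # hypothetical value of 'ITLB_WALK / INST_RETIRED'

for text in ("100%", "1000MPKI", "MPKI"):
    scale, unit = split_scale_unit(text)
    # Under this model the displayed value is the raw metric times the scale.
    print(f"{text!r}: scale={scale}, unit={unit!r}, shown={raw_ratio * scale}")
```

Under this model "MPKI" alone would zero out the displayed value, while "1000MPKI" moves the multiplication by 1000 out of the MetricExpr and into the scale, as suggested above. Whether perf really behaves this way for a number-less ScaleUnit is exactly what the reply asks the author to test.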

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/6] perf vendor events arm64: Add TLB metrics for neoverse-n2
@ 2023-01-03 17:14     ` Ian Rogers
  0 siblings, 0 replies; 44+ messages in thread
From: Ian Rogers @ 2023-01-03 17:14 UTC (permalink / raw)
  To: Jing Zhang
  Cc: John Garry, Xing Zhengjun, Will Deacon, James Clark, Mike Leach,
	Leo Yan, linux-arm-kernel, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andrew Kilroy, Shuai Xue, Zhuo Song

On Tue, Jan 3, 2023 at 3:39 AM Jing Zhang <renyu.zj@linux.alibaba.com> wrote:
>
> Add TLB related metrics.
>
> Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
> Acked-by: Ian Rogers <irogers@google.com>
> ---
>  .../arch/arm64/arm/neoverse-n2/metrics.json        | 49 ++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> index c126f1bc..8a74e07 100644
> --- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> +++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> @@ -26,5 +26,54 @@
>          "MetricGroup": "TopdownL1",
>          "MetricName": "backend_bound",
>          "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
> +        "BriefDescription": "The rate of L1D TLB refill to the overall L1D TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "l1d_tlb_miss_rate",
> +        "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
> +        "BriefDescription": "The rate of L1I TLB refill to the overall L1I TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "l1i_tlb_miss_rate",
> +        "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
> +        "BriefDescription": "The rate of L2D TLB refill to the overall L2D TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "l2_tlb_miss_rate",
> +        "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
> +        "MetricGroup": "TLB",
> +        "MetricName": "dtlb_mpki",
> +        "ScaleUnit": "MPKI"
> +    },
> +    {
> +        "MetricExpr": "DTLB_WALK / L1D_TLB",
> +        "BriefDescription": "The rate of DTLB Walks to the overall L1D TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "dtlb_walk_rate",
> +        "ScaleUnit": "100%"
> +    },
> +    {
> +        "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for instruction accesses",
> +        "MetricGroup": "TLB",
> +        "MetricName": "itlb_mpki",
> +        "ScaleUnit": "MPKI"

Did you test this? IIRC if there is no number in the ScaleUnit then
the scale factor becomes 0 and the metric value is always multiplied
by zero. Perhaps:

"MetricName": "itlb_miss_rate",
"MetricExpr": "ITLB / INST_RETIRED"
"ScaleUnit": "1000MPKI"

Thanks,
Ian
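
The ScaleUnit concern above can be sketched as follows. This is a minimal
illustration, assuming the scale is read as the leading number of the string
in the manner of C strtod(); split_scale_unit and displayed_value are
hypothetical helpers, not perf code.

```python
import re

# Hypothetical helper: mimics strtod()-style prefix parsing for plain
# decimal numbers. Whether perf parses "ScaleUnit" exactly this way is an
# assumption made for illustration.
_NUM_PREFIX = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def split_scale_unit(scale_unit):
    """Return (scale, unit); scale is 0.0 when there is no leading number."""
    m = _NUM_PREFIX.match(scale_unit)
    if m is None:
        # No digits consumed: a strtod()-like parser reports 0.
        return 0.0, scale_unit
    return float(m.group(0)), scale_unit[m.end():]


def displayed_value(metric_value, scale_unit):
    """Multiply the evaluated MetricExpr by the parsed scale."""
    scale, unit = split_scale_unit(scale_unit)
    return metric_value * scale, unit
```

Under these assumptions, "MPKI" parses to a scale of 0, so any metric value
is multiplied by zero, while "1000MPKI" and "100%" yield 1000 and 100.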

> +    },
> +    {
> +        "MetricExpr": "ITLB_WALK / L1I_TLB",
> +        "BriefDescription": "The rate of ITLB Walks to the overall L1I TLB lookups",
> +        "MetricGroup": "TLB",
> +        "MetricName": "itlb_walk_rate",
> +        "ScaleUnit": "100%"
>      }
>  ]
> --
> 1.8.3.1
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-03 11:52     ` John Garry
@ 2023-01-04  5:05       ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-04  5:05 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song



On 2023/1/3 7:52 PM, John Garry wrote:
> On 03/01/2023 11:39, Jing Zhang wrote:
>> The formula of topdown L1 on neoverse-n2 is from ARM sbsa7.0 platform
>> design document [0], D37-38.
> 
> I think that I mentioned this before - if these metrics are coming from an sbsa doc, then they are standard. As such, we can make them "arch std events" and put them in a common json such as sbsa.json, so that other cores may reuse them.
> 
> You don't strictly have to do this now, but it would be better.
> 

Hi John,

I would really like to do this, but as discussed earlier, the slot value differs on each architecture.
If I do not specify the value of the slot in sbsa.json, then in the json file of n2/v1, I need to
overwrite each topdown "MetricExpr". In other words, the metrics placed in the sbsa.json file only
reuse "BriefDescription", "MetricGroup" and "ScaleUnit". So I'm not sure if it's acceptable?
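
The reuse described above can be sketched as a simple override of a shared
entry. This is only an illustration under stated assumptions: the dictionaries,
the resolve helper, and the SLOTS_N2 / SLOTS_V1 placeholders are hypothetical
and are not the formulas from this series or the way perf merges JSON files.

```python
# Hypothetical shared entry, as it might appear in a common sbsa.json.
# It carries no MetricExpr because the slot value is core specific.
SBSA_TEMPLATE = {
    "MetricName": "frontend_bound",
    "BriefDescription": "Fraction of slots stalled in the frontend",
    "MetricGroup": "TopdownL1",
    "ScaleUnit": "100%",
}

# Hypothetical per-core overrides; SLOTS_N2 / SLOTS_V1 are placeholders
# for the core-specific slot value, not real event names.
N2_OVERRIDE = {"MetricExpr": "stall_slot_frontend / (SLOTS_N2 * cpu_cycles)"}
V1_OVERRIDE = {"MetricExpr": "stall_slot_frontend / (SLOTS_V1 * cpu_cycles)"}


def resolve(template, override):
    """Copy the shared fields, then let the per-core fields win."""
    merged = dict(template)
    merged.update(override)
    return merged


n2_metric = resolve(SBSA_TEMPLATE, N2_OVERRIDE)
v1_metric = resolve(SBSA_TEMPLATE, V1_OVERRIDE)
```

In this sketch only "BriefDescription", "MetricGroup", "ScaleUnit" and the
name come from the shared entry; every core still has to supply its own
"MetricExpr".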

In addition, James mentioned that if the units and names and group names of different architectures
are not unified, it will become complicated.

Perhaps we could do it later.

Thanks,
Jing

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/6] perf vendor events arm64: Add TLB metrics for neoverse-n2
  2023-01-03 17:14     ` Ian Rogers
@ 2023-01-04  5:21       ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-04  5:21 UTC (permalink / raw)
  To: Ian Rogers
  Cc: John Garry, Xing Zhengjun, Will Deacon, James Clark, Mike Leach,
	Leo Yan, linux-arm-kernel, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andrew Kilroy, Shuai Xue, Zhuo Song



On 2023/1/4 1:14 AM, Ian Rogers wrote:
> On Tue, Jan 3, 2023 at 3:39 AM Jing Zhang <renyu.zj@linux.alibaba.com> wrote:
>>
>> Add TLB related metrics.
>>
>> Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
>> Acked-by: Ian Rogers <irogers@google.com>
>> ---
>>  .../arch/arm64/arm/neoverse-n2/metrics.json        | 49 ++++++++++++++++++++++
>>  1 file changed, 49 insertions(+)
>>
>> diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
>> index c126f1bc..8a74e07 100644
>> --- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
>> +++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
>> @@ -26,5 +26,54 @@
>>          "MetricGroup": "TopdownL1",
>>          "MetricName": "backend_bound",
>>          "ScaleUnit": "100%"
>> +    },
>> +    {
>> +        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
>> +        "BriefDescription": "The rate of L1D TLB refill to the overall L1D TLB lookups",
>> +        "MetricGroup": "TLB",
>> +        "MetricName": "l1d_tlb_miss_rate",
>> +        "ScaleUnit": "100%"
>> +    },
>> +    {
>> +        "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
>> +        "BriefDescription": "The rate of L1I TLB refill to the overall L1I TLB lookups",
>> +        "MetricGroup": "TLB",
>> +        "MetricName": "l1i_tlb_miss_rate",
>> +        "ScaleUnit": "100%"
>> +    },
>> +    {
>> +        "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
>> +        "BriefDescription": "The rate of L2D TLB refill to the overall L2D TLB lookups",
>> +        "MetricGroup": "TLB",
>> +        "MetricName": "l2_tlb_miss_rate",
>> +        "ScaleUnit": "100%"
>> +    },
>> +    {
>> +        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
>> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
>> +        "MetricGroup": "TLB",
>> +        "MetricName": "dtlb_mpki",
>> +        "ScaleUnit": "MPKI"
>> +    },
>> +    {
>> +        "MetricExpr": "DTLB_WALK / L1D_TLB",
>> +        "BriefDescription": "The rate of DTLB Walks to the overall L1D TLB lookups",
>> +        "MetricGroup": "TLB",
>> +        "MetricName": "dtlb_walk_rate",
>> +        "ScaleUnit": "100%"
>> +    },
>> +    {
>> +        "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
>> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for instruction accesses",
>> +        "MetricGroup": "TLB",
>> +        "MetricName": "itlb_mpki",
>> +        "ScaleUnit": "MPKI"
> 
> Did you test this? IIRC if there is no number in the ScaleUnit then
> the scale factor becomes 0 and the metric value is always multiplied
> by zero. Perhaps:
> 
> "MetricName": "itlb_miss_rate",
> "MetricExpr": "ITLB / INST_RETIRED"
> "ScaleUnit": "1000MPKI"
> 
> Thanks,
> Ian
> 

You are absolutely right, I only tested TLB metrics. Sorry for not double checking. I will repost the corrected patches.

>> +    },
>> +    {
>> +        "MetricExpr": "ITLB_WALK / L1I_TLB",
>> +        "BriefDescription": "The rate of ITLB Walks to the overall L1I TLB lookups",
>> +        "MetricGroup": "TLB",
>> +        "MetricName": "itlb_walk_rate",
>> +        "ScaleUnit": "100%"
>>      }
>>  ]
>> --
>> 1.8.3.1
>>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/6] perf vendor events arm64: Add TLB metrics for neoverse-n2
  2023-01-04  5:21       ` Jing Zhang
@ 2023-01-04  8:40         ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-04  8:40 UTC (permalink / raw)
  To: Ian Rogers
  Cc: John Garry, Xing Zhengjun, Will Deacon, James Clark, Mike Leach,
	Leo Yan, linux-arm-kernel, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andrew Kilroy, Shuai Xue, Zhuo Song



On 2023/1/4 1:21 PM, Jing Zhang wrote:
> 
> 
> On 2023/1/4 1:14 AM, Ian Rogers wrote:
>> On Tue, Jan 3, 2023 at 3:39 AM Jing Zhang <renyu.zj@linux.alibaba.com> wrote:
>>>
>>> Add TLB related metrics.
>>>
>>> Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
>>> Acked-by: Ian Rogers <irogers@google.com>
>>> ---
>>>  .../arch/arm64/arm/neoverse-n2/metrics.json        | 49 ++++++++++++++++++++++
>>>  1 file changed, 49 insertions(+)
>>>
>>> diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
>>> index c126f1bc..8a74e07 100644
>>> --- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
>>> +++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
>>> @@ -26,5 +26,54 @@
>>>          "MetricGroup": "TopdownL1",
>>>          "MetricName": "backend_bound",
>>>          "ScaleUnit": "100%"
>>> +    },
>>> +    {
>>> +        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
>>> +        "BriefDescription": "The rate of L1D TLB refill to the overall L1D TLB lookups",
>>> +        "MetricGroup": "TLB",
>>> +        "MetricName": "l1d_tlb_miss_rate",
>>> +        "ScaleUnit": "100%"
>>> +    },
>>> +    {
>>> +        "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
>>> +        "BriefDescription": "The rate of L1I TLB refill to the overall L1I TLB lookups",
>>> +        "MetricGroup": "TLB",
>>> +        "MetricName": "l1i_tlb_miss_rate",
>>> +        "ScaleUnit": "100%"
>>> +    },
>>> +    {
>>> +        "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
>>> +        "BriefDescription": "The rate of L2D TLB refill to the overall L2D TLB lookups",
>>> +        "MetricGroup": "TLB",
>>> +        "MetricName": "l2_tlb_miss_rate",
>>> +        "ScaleUnit": "100%"
>>> +    },
>>> +    {
>>> +        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
>>> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
>>> +        "MetricGroup": "TLB",
>>> +        "MetricName": "dtlb_mpki",
>>> +        "ScaleUnit": "MPKI"
>>> +    },
>>> +    {
>>> +        "MetricExpr": "DTLB_WALK / L1D_TLB",
>>> +        "BriefDescription": "The rate of DTLB Walks to the overall L1D TLB lookups",
>>> +        "MetricGroup": "TLB",
>>> +        "MetricName": "dtlb_walk_rate",
>>> +        "ScaleUnit": "100%"
>>> +    },
>>> +    {
>>> +        "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
>>> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for instruction accesses",
>>> +        "MetricGroup": "TLB",
>>> +        "MetricName": "itlb_mpki",
>>> +        "ScaleUnit": "MPKI"
>>
>> Did you test this? IIRC if there is no number in the ScaleUnit then
>> the scale factor becomes 0 and the metric value is always multiplied
>> by zero. Perhaps:
>>
>> "MetricName": "itlb_miss_rate",
>> "MetricExpr": "ITLB / INST_RETIRED"
>> "ScaleUnit": "1000MPKI"
>>
>> Thanks,
>> Ian
>>
> 
> You are absolutely right, I only tested TLB metrics. Sorry for not double checking. I will repost the corrected patches.
> 

I rethought it. I want to change the ScaleUnit to "1MPKI" and keep the MetricExpr multiplied by 1000,
so that the "MetricExpr" expresses the per-kilo-instruction value, which is consistent with the
description in "BriefDescription". Like:
   {
        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
        "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
        "MetricGroup": "TLB",
        "MetricName": "dtlb_mpki",
        "ScaleUnit": "1MPKI"
    },


In addition, I think it is more reasonable for ScaleUnit to have a default scale factor of 1 when there
is no number. I want to try to fix this bug.
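
The proposed default can be sketched as below. This is a minimal
illustration of the idea, assuming strtod()-like prefix parsing;
split_scale_unit_default_one is a hypothetical helper, not the actual fix,
and the event counts are made-up sample values.

```python
import re

# Hypothetical helper: plain decimal prefix, similar in spirit to strtod().
_NUM_PREFIX = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def split_scale_unit_default_one(scale_unit):
    """Return (scale, unit); fall back to a scale of 1.0 if no number is given."""
    m = _NUM_PREFIX.match(scale_unit)
    if m is None:
        # Proposed behaviour: a bare unit such as "MPKI" means "times 1".
        return 1.0, scale_unit
    return float(m.group(0)), scale_unit[m.end():]


# Made-up sample counts for illustration only.
dtlb_walk = 50
inst_retired = 20000

# "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000"
metric_value = dtlb_walk / inst_retired * 1000

scale, unit = split_scale_unit_default_one("MPKI")
shown = metric_value * scale
```

With this fallback, "MPKI" and "1MPKI" would display the same value, and
strings that already carry a number, such as "100%", are unaffected.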

Ian, what's your opinion?


>>> +    },
>>> +    {
>>> +        "MetricExpr": "ITLB_WALK / L1I_TLB",
>>> +        "BriefDescription": "The rate of ITLB Walks to the overall L1I TLB lookups",
>>> +        "MetricGroup": "TLB",
>>> +        "MetricName": "itlb_walk_rate",
>>> +        "ScaleUnit": "100%"
>>>      }
>>>  ]
>>> --
>>> 1.8.3.1
>>>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/6] perf vendor events arm64: Add TLB metrics for neoverse-n2
  2023-01-04  8:40         ` Jing Zhang
@ 2023-01-04 16:57           ` Ian Rogers
  -1 siblings, 0 replies; 44+ messages in thread
From: Ian Rogers @ 2023-01-04 16:57 UTC (permalink / raw)
  To: Jing Zhang
  Cc: John Garry, Xing Zhengjun, Will Deacon, James Clark, Mike Leach,
	Leo Yan, linux-arm-kernel, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andrew Kilroy, Shuai Xue, Zhuo Song

On Wed, Jan 4, 2023 at 12:40 AM Jing Zhang <renyu.zj@linux.alibaba.com> wrote:
>
>
>
> On 2023/1/4 1:21 PM, Jing Zhang wrote:
> >
> >
> > On 2023/1/4 1:14 AM, Ian Rogers wrote:
> >> On Tue, Jan 3, 2023 at 3:39 AM Jing Zhang <renyu.zj@linux.alibaba.com> wrote:
> >>>
> >>> Add TLB related metrics.
> >>>
> >>> Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
> >>> Acked-by: Ian Rogers <irogers@google.com>
> >>> ---
> >>>  .../arch/arm64/arm/neoverse-n2/metrics.json        | 49 ++++++++++++++++++++++
> >>>  1 file changed, 49 insertions(+)
> >>>
> >>> diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> >>> index c126f1bc..8a74e07 100644
> >>> --- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> >>> +++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> >>> @@ -26,5 +26,54 @@
> >>>          "MetricGroup": "TopdownL1",
> >>>          "MetricName": "backend_bound",
> >>>          "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
> >>> +        "BriefDescription": "The rate of L1D TLB refill to the overall L1D TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "l1d_tlb_miss_rate",
> >>> +        "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
> >>> +        "BriefDescription": "The rate of L1I TLB refill to the overall L1I TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "l1i_tlb_miss_rate",
> >>> +        "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
> >>> +        "BriefDescription": "The rate of L2D TLB refill to the overall L2D TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "l2_tlb_miss_rate",
> >>> +        "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
> >>> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "dtlb_mpki",
> >>> +        "ScaleUnit": "MPKI"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "DTLB_WALK / L1D_TLB",
> >>> +        "BriefDescription": "The rate of DTLB Walks to the overall L1D TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "dtlb_walk_rate",
> >>> +        "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
> >>> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for instruction accesses",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "itlb_mpki",
> >>> +        "ScaleUnit": "MPKI"
> >>
> >> Did you test this? IIRC if there is no number in the ScaleUnit then
> >> the scale factor becomes 0 and the metric value is always multiplied
> >> by zero. Perhaps:
> >>
> >> "MetricName": "itlb_miss_rate",
> >> "MetricExpr": "ITLB / INST_RETIRED"
> >> "ScaleUnit": "1000MPKI"
> >>
> >> Thanks,
> >> Ian
> >>
> >
> > You are absolutely right, I only tested TLB metrics. Sorry for not double checking. I will repost the corrected patches.
> >
>
> I rethought it. I want to change the ScaleUnit to "1MPKI" and keep the MetricExpr multiplied by 1000,
> so that the "MetricExpr" expresses the per-kilo-instruction value, which is consistent with the
> description in "BriefDescription". Like:
>    {
>         "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
>         "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
>         "MetricGroup": "TLB",
>         "MetricName": "dtlb_mpki",
>         "ScaleUnit": "1MPKI"
>     },
>
>
> In addition, I think it is more reasonable for ScaleUnit to have a default scale factor of 1 when there
> is no number. I want to try to fix this bug.
>
> Ian, what's your opinion?

I like intention-revealing names; itlb_mpki is something of a soup of
characters to de-acronym-ify compared to itlb_miss_rate, but rate may
not be completely intuitive in that name. I'm happy to follow your
lead. Putting the 1000 in the ScaleUnit or the expression doesn't
matter, so again happy to follow what you think is best.
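
As a quick check of that equivalence, here is a small sketch. It assumes the
displayed value is simply the evaluated expression multiplied by the numeric
part of ScaleUnit; the function names and counts are hypothetical.

```python
def scaled_in_expr(walks, instructions):
    # "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000", "ScaleUnit": "1MPKI"
    return (walks / instructions * 1000) * 1


def scaled_in_unit(walks, instructions):
    # "MetricExpr": "ITLB_WALK / INST_RETIRED", "ScaleUnit": "1000MPKI"
    return (walks / instructions) * 1000


# Made-up sample counts for illustration only.
samples = [(50, 20000), (3, 1000000), (0, 12345)]
results = [(scaled_in_expr(w, i), scaled_in_unit(w, i)) for w, i in samples]
```

Both variants display the same number; the choice only affects whether the
raw MetricExpr already reads as a per-kilo-instruction value.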

Thanks,
Ian

> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "ITLB_WALK / L1I_TLB",
> >>> +        "BriefDescription": "The rate of ITLB Walks to the overall L1I TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "itlb_walk_rate",
> >>> +        "ScaleUnit": "100%"
> >>>      }
> >>>  ]
> >>> --
> >>> 1.8.3.1
> >>>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/6] perf vendor events arm64: Add TLB metrics for neoverse-n2
@ 2023-01-04 16:57           ` Ian Rogers
  0 siblings, 0 replies; 44+ messages in thread
From: Ian Rogers @ 2023-01-04 16:57 UTC (permalink / raw)
  To: Jing Zhang
  Cc: John Garry, Xing Zhengjun, Will Deacon, James Clark, Mike Leach,
	Leo Yan, linux-arm-kernel, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andrew Kilroy, Shuai Xue, Zhuo Song

On Wed, Jan 4, 2023 at 12:40 AM Jing Zhang <renyu.zj@linux.alibaba.com> wrote:
>
>
>
> 在 2023/1/4 下午1:21, Jing Zhang 写道:
> >
> >
> > 在 2023/1/4 上午1:14, Ian Rogers 写道:
> >> On Tue, Jan 3, 2023 at 3:39 AM Jing Zhang <renyu.zj@linux.alibaba.com> wrote:
> >>>
> >>> Add TLB related metrics.
> >>>
> >>> Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
> >>> Acked-by: Ian Rogers <irogers@google.com>
> >>> ---
> >>>  .../arch/arm64/arm/neoverse-n2/metrics.json        | 49 ++++++++++++++++++++++
> >>>  1 file changed, 49 insertions(+)
> >>>
> >>> diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> >>> index c126f1bc..8a74e07 100644
> >>> --- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> >>> +++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2/metrics.json
> >>> @@ -26,5 +26,54 @@
> >>>          "MetricGroup": "TopdownL1",
> >>>          "MetricName": "backend_bound",
> >>>          "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
> >>> +        "BriefDescription": "The rate of L1D TLB refill to the overall L1D TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "l1d_tlb_miss_rate",
> >>> +        "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
> >>> +        "BriefDescription": "The rate of L1I TLB refill to the overall L1I TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "l1i_tlb_miss_rate",
> >>> +        "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
> >>> +        "BriefDescription": "The rate of L2D TLB refill to the overall L2D TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "l2_tlb_miss_rate",
> >>> +        "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
> >>> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "dtlb_mpki",
> >>> +        "ScaleUnit": "MPKI"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "DTLB_WALK / L1D_TLB",
> >>> +        "BriefDescription": "The rate of DTLB Walks to the overall L1D TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "dtlb_walk_rate",
> >>> +        "ScaleUnit": "100%"
> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
> >>> +        "BriefDescription": "The rate of TLB Walks per kilo instructions for instruction accesses",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "itlb_mpki",
> >>> +        "ScaleUnit": "MPKI"
> >>
> >> Did you test this? IIRC if there is no number in the ScaleUnit then
> >> the scale factor becomes 0 and the metric value is always multiplied
> >> by zero. Perhaps:
> >>
> >> "MetricName": "itlb_miss_rate",
> >> "MetricExpr": "ITLB / INST_RETIRED"
> >> "ScaleUnit": "1000MPKI"
> >>
> >> Thanks,
> >> Ian
> >>
> >
> > You are absolutely right, I only tested TLB metrics. Sorry for not double checking. I will repost the corrected patches.
> >
>
> I rethought it. I want to change the ScaleUnit to "1MPKI" and keep the MetricExpr multiplied by 1000,
> so that the "MetricExpr" expresses the value of per kilo instruciton, which can be consistent with the
> description in "BriefDescription". Like:
>    {
>         "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
>         "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
>         "MetricGroup": "TLB",
>         "MetricName": "dtlb_mpki",
>         "ScaleUnit": "1MPKI"
>     },
>
>
> In addition, I think it is more reasonable for ScaleUnit to have a default scale factor of 1 when there
> is no number. I want to try to fix this bug.
>
> Ian, what's your opnion?

I like intention-revealing names. itlb_mpki is something of a soup of
characters to de-acronym-ify compared to itlb_miss_rate, but "rate" may
not be completely intuitive in that name. I'm happy to follow your
lead. Putting the 1000 in the ScaleUnit or the expression doesn't
matter, so again I'm happy to follow what you think is best.
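
(For illustration, a minimal Python sketch of the point above. It assumes
the ScaleUnit string is split strtod-style into a leading number and a
trailing unit, with a missing number giving a scale of 0 as described
earlier in this thread. split_scale_unit, scaled and default_scale are
hypothetical names, not perf code, and the counts are made up.)

```python
import re

# Leading decimal number, as a strtod-like parse would consume it.
_NUM_PREFIX = re.compile(r"\s*[-+]?(\d+\.?\d*|\.\d+)([eE][-+]?\d+)?")


def split_scale_unit(scale_unit, default_scale=0.0):
    """Split e.g. "1000MPKI" into (1000.0, "MPKI").

    With no leading number the scale falls back to default_scale:
    0.0 models the behaviour described in this thread, 1.0 models the
    proposed default.
    """
    m = _NUM_PREFIX.match(scale_unit)
    if m is None:
        return default_scale, scale_unit
    return float(m.group(0)), scale_unit[m.end():]


def scaled(value, scale_unit, default_scale=0.0):
    """Apply the ScaleUnit factor to an already evaluated MetricExpr."""
    scale, unit = split_scale_unit(scale_unit, default_scale)
    return value * scale, unit


itlb_walk, inst_retired = 50.0, 100000.0  # made-up counter values

# 1000 in the expression, "1MPKI" as the unit
in_expr = scaled(itlb_walk / inst_retired * 1000, "1MPKI")
# 1000 in the ScaleUnit instead
in_unit = scaled(itlb_walk / inst_retired, "1000MPKI")
# bare "MPKI": the scale is 0, so the metric is multiplied by zero
zeroed = scaled(itlb_walk / inst_retired * 1000, "MPKI")
```

Both placements give the same per-kilo-instruction value; only the bare
"MPKI" form loses it.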

Thanks,
Ian

> >>> +    },
> >>> +    {
> >>> +        "MetricExpr": "ITLB_WALK / L1I_TLB",
> >>> +        "BriefDescription": "The rate of ITLB Walks to the overall L1I TLB lookups",
> >>> +        "MetricGroup": "TLB",
> >>> +        "MetricName": "itlb_walk_rate",
> >>> +        "ScaleUnit": "100%"
> >>>      }
> >>>  ]
> >>> --
> >>> 1.8.3.1
> >>>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-04  5:05       ` Jing Zhang
@ 2023-01-04 17:26         ` John Garry
  -1 siblings, 0 replies; 44+ messages in thread
From: John Garry @ 2023-01-04 17:26 UTC (permalink / raw)
  To: Jing Zhang, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song

On 04/01/2023 05:05, Jing Zhang wrote:
> 
> 
> On 2023/1/3 7:52 PM, John Garry wrote:
>> On 03/01/2023 11:39, Jing Zhang wrote:
>>> The formula of topdown L1 on neoverse-n2 is from ARM sbsa7.0 platform
>>> design document [0], D37-38.
>>
>> I think that I mentioned this before - if these metrics are coming from an sbsa doc, then they are standard. As such, we can make them "arch std events" and put them in a common json such as sbsa.json, so that other cores may reuse them.
>>
>> You don't strictly have to do this now, but it would be better.
>>
> 
> Hi John,

Hi Jing,

> 
> I would really like to do this, but as discussed earlier, the slot value is different on each architecture.
> If I do not specify the value of the slot in sbsa.json, then in the json file of n2/v1, I need to
> overwrite each topdown "MetricExpr". In other words, the metrics placed in the sbsa.json file would only
> reuse "BriefDescription", "MetricGroup" and "ScaleUnit". So I'm not sure if that's acceptable.

I don't see a lot of value in that really.

However, for this value of slot, isn't this discoverable from a system 
register per core? Quoting the sbsa: "The IMPLEMENTATION DEFINED 
constant SLOTS is discoverable from the system register 
PMMIR_EL1.SLOTS." Did you consider how this could be used?

> 
> In addition, James mentioned that if the units and names and group names of different architectures
> are not unified, it will become complicated.
> 

Thanks,
John



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-04 17:26         ` John Garry
@ 2023-01-05 10:05           ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-05 10:05 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song



On 2023/1/5 1:26 AM, John Garry wrote:
> On 04/01/2023 05:05, Jing Zhang wrote:
>>
>>
>> On 2023/1/3 7:52 PM, John Garry wrote:
>>> On 03/01/2023 11:39, Jing Zhang wrote:
>>>> The formula of topdown L1 on neoverse-n2 is from ARM sbsa7.0 platform
>>>> design document [0], D37-38.
>>>
>>> I think that I mentioned this before - if these metrics are coming from an sbsa doc, then they are standard. As such, we can make them "arch std events" and put them in a common json such as sbsa.json, so that other cores may reuse them.
>>>
>>> You don't strictly have to do this now, but it would be better.
>>>
>>
>> Hi John,
> 
> Hi Jing,
> 
>>
>> I would really like to do this, but as discussed earlier, the slot value is different on each architecture.
>> If I do not specify the value of the slot in sbsa.json, then in the json file of n2/v1, I need to
>> overwrite each topdown "MetricExpr". In other words, the metrics placed in the sbsa.json file would only
>> reuse "BriefDescription", "MetricGroup" and "ScaleUnit". So I'm not sure if that's acceptable.
> 
> I don't see a lot of value in that really.
> 
> However, for this value of slot, isn't this discoverable from a system register per core? Quoting the sbsa: "The IMPLEMENTATION DEFINED constant SLOTS is discoverable from the system register PMMIR_EL1.SLOTS." Did you consider how this could be used?
> 


This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read from
/sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the
value read from there? As I currently understand it, parameters in MetricExpr only support events and constants.
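
A minimal Python sketch of reading that value from user space, assuming
each caps/slots file holds a single integer (decimal or 0x-prefixed hex).
read_pmu_slots and its sysfs_root parameter are hypothetical names used
here only for illustration.

```python
import glob
import os


def read_pmu_slots(sysfs_root="/sys/bus/event_source/devices"):
    """Return {pmu_name: slots} for every armv8_pmuv3_* PMU with caps/slots."""
    slots = {}
    pattern = os.path.join(sysfs_root, "armv8_pmuv3_*", "caps", "slots")
    for path in sorted(glob.glob(pattern)):
        # .../<pmu>/caps/slots -> <pmu>
        pmu = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as f:
            # base 0 accepts both "5" and "0x00000005"
            slots[pmu] = int(f.read().strip(), 0)
    return slots
```

On a machine without such a PMU the result is simply an empty dict.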


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-05 10:05           ` Jing Zhang
@ 2023-01-05 10:13             ` John Garry
  -1 siblings, 0 replies; 44+ messages in thread
From: John Garry @ 2023-01-05 10:13 UTC (permalink / raw)
  To: Jing Zhang, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song

On 05/01/2023 10:05, Jing Zhang wrote:
>> However, for this value of slot, isn't this discoverable from a system register per core? Quoting the sbsa: "The IMPLEMENTATION DEFINED constant SLOTS is discoverable from the system register PMMIR_EL1.SLOTS." Did you consider how this could be used?
>>
> 
> This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read from
> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the
> value read from there? As I currently understand it, parameters in MetricExpr only support events and constants.
> 

Maybe during runtime we could create a pseudo metric/event for SLOT. 
This metric would be created during init, and it always just returns the 
value which was read from PMMIR_EL1.

I'm not sure how well that would play with trying to resolve metrics
when building the generated pmu-events.c, but I don't think it's all too
difficult to achieve.

Have you actually read this value for the n2 core? Does it look correct?

Thanks,
John

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-05 10:13             ` John Garry
@ 2023-01-05 11:02               ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-05 11:02 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Xing Zhengjun, Will Deacon, James Clark,
	Mike Leach, Leo Yan
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song



On 2023/1/5 6:13 PM, John Garry wrote:
> Maybe during runtime we could create a pseudo metric/event for SLOT. This metric would be created during init, and it always just returns the value which was read from PMMIR_EL1.
> 
> I'm not sure how well that would play with trying to resolve metrics when building the generated pmu-events.c, but I don't think it's all too difficult to achieve.
> 

I'll try it in the v7 patch. I want to release the v6 patch first, to correct a mistake I made. :)

> Have you actually read this value for the n2 core? Does it look correct?

Yes, I read it on n2 and it has a value of 5, which is correct. If the
STALL_SLOT event is not implemented, PMMIR_EL1.SLOTS might read as zero.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-05 10:13             ` John Garry
@ 2023-01-05 21:13               ` Ian Rogers
  -1 siblings, 0 replies; 44+ messages in thread
From: Ian Rogers @ 2023-01-05 21:13 UTC (permalink / raw)
  To: John Garry
  Cc: Jing Zhang, Xing Zhengjun, Will Deacon, James Clark, Mike Leach,
	Leo Yan, linux-arm-kernel, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andrew Kilroy, Shuai Xue, Zhuo Song

On Thu, Jan 5, 2023 at 2:13 AM John Garry <john.g.garry@oracle.com> wrote:
>
> On 05/01/2023 10:05, Jing Zhang wrote:
> >> However, for this value of slot, isn't this discoverable from a system register per core? Quoting the sbsa: "The IMPLEMENTATION DEFINED constant SLOTS is discoverable from the system register PMMIR_EL1.SLOTS." Did you consider how this could be used?
> >>
> >
> > This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read from
> > /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the
> > value read from there? As I currently understand it, parameters in MetricExpr only support events and constants.
> >
>
> Maybe during runtime we could create a pseudo metric/event for SLOT.

For Intel we do this by just having a different constant for each
architecture. It is fairly easy to add a new "literal", so you could
add a #slots in expr__get_literal:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
Populating it would be the challenge :-)
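
As a rough Python sketch of the idea (not the C code in expr.c): a
literal such as #slots is looked up by name and replaced with a number
before the expression is evaluated. substitute_literals, the literals
table and the example expression are hypothetical; the value 5 is the n2
reading reported earlier in this thread.

```python
import math
import re


def substitute_literals(expr, literals):
    """Replace each "#name" token in expr with its numeric value.

    literals maps a name such as "#slots" to a zero-argument callable.
    Unknown literals become NaN so the metric shows up as not computable
    instead of silently using a wrong constant.
    """
    def repl(match):
        provider = literals.get(match.group(0))
        value = provider() if provider else math.nan
        return repr(float(value))

    return re.sub(r"#\w+", repl, expr)


literals = {"#slots": lambda: 5}  # hypothetical provider for the n2 value
expanded = substitute_literals("STALL_SLOT / (#slots * CPU_CYCLES)", literals)
```

The lookup itself is trivial; the provider behind "#slots" is where the
per-architecture work would go.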

Thanks,
Ian

> This metric would be created during init, and it always just returns the
> value which was read from PMMIR_EL1.
>
> I'm not sure how well that would play with trying to resolve metrics
> when building the generated pmu-events.c, but I don't think it's all too
> difficult to achieve.
>
> Have you actually read this value for the n2 core? Does it look correct?
>
> Thanks,
> John

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-05 21:13               ` Ian Rogers
@ 2023-01-06 10:14                 ` John Garry
  -1 siblings, 0 replies; 44+ messages in thread
From: John Garry @ 2023-01-06 10:14 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Jing Zhang, Xing Zhengjun, Will Deacon, James Clark, Mike Leach,
	Leo Yan, linux-arm-kernel, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andrew Kilroy, Shuai Xue, Zhuo Song

On 05/01/2023 21:13, Ian Rogers wrote:
>>> This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read from
>>> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the
>>> value read from there? As I currently understand it, parameters in MetricExpr only support events and constants.
>>>
>> Maybe during runtime we could create a pseudo metric/event for SLOT.
> For Intel we do this by just having a different constant for each
> architecture. It is fairly easy to add a new "literal", so you could
> add a #slots in expr__get_literal:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
> Populating it would be the challenge 😄

Thanks for the pointer. I think that the challenge in populating it 
really comes down to whether we would really want to make this generic.

I suppose that for arm64 we could have a method which accesses this 
PMMIR_EL1 register, while for other archs we could have a weak function 
which just returns NAN. If other archs want to use this key expr, they 
can add their own method.
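
A rough Python sketch of that arrangement, using plain dispatch in place
of a C weak symbol. slots_default, slots_arm64, get_slots and the
read_caps_slots callable are hypothetical names; treating a zero reading
as NaN is an assumption based on the note earlier in the thread that the
field may read as zero.

```python
import math
import platform


def slots_default():
    """Fallback for architectures with no slots source (the "weak" default)."""
    return math.nan


def slots_arm64(read_caps_slots):
    """arm64 variant; read_caps_slots returns the PMMIR_EL1.SLOTS value."""
    slots = read_caps_slots()
    # A zero reading means the slot count is unavailable, so report NaN too.
    return float(slots) if slots else math.nan


def get_slots(machine=None, read_caps_slots=lambda: 0):
    """Pick the arch-specific reader, falling back to the NaN default."""
    machine = machine or platform.machine()
    if machine in ("aarch64", "arm64"):
        return slots_arm64(read_caps_slots)
    return slots_default()
```

Returning NaN rather than 0 keeps a metric that divides by the slot
count from producing a misleading number on other architectures.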

Out of curiosity, do you know if x86 has such a capability to get this 
slot info from HW?

Thanks,
John


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-06 10:14                 ` John Garry
@ 2023-01-06 10:34                   ` Jing Zhang
  -1 siblings, 0 replies; 44+ messages in thread
From: Jing Zhang @ 2023-01-06 10:34 UTC (permalink / raw)
  To: John Garry, Ian Rogers
  Cc: Xing Zhengjun, Will Deacon, James Clark, Mike Leach, Leo Yan,
	linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song



On 2023/1/6 6:14 PM, John Garry wrote:
> On 05/01/2023 21:13, Ian Rogers wrote:
>>>> This may be a feasible idea. The value of slots comes from the register PMMIR_EL1, which I can read from
>>>> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I replace the slots in MetricExpr with the
>>>> value read from there? As I currently understand it, parameters in MetricExpr only support events and constants.
>>>>
>>> Maybe during runtime we could create a pseudo metric/event for SLOT.
>> For Intel we do this by just having a different constant for each
>> architecture. It is fairly easy to add a new "literal", so you could
>> add a #slots in expr__get_literal:
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
>> Populating it would be the challenge 😄

Yes! I was thinking along the same lines. I found this method from the SMT_on variable in icl_metrics.json, then I
tried it and it worked. So excited!

> 
> Thanks for the pointer. I think that the challenge in populating it really comes down to whether we would really want to make this generic.
> 
> I suppose that for arm64 we could have a method which accesses this PMMIR_EL1 register, while for other archs we could have a weak function which just returns NAN. If other archs want to use this key expr, they can add their own method.
> 

Now I have to use this method, because I just found out that neoverse-n2 has been changed to neoverse-n2-v2,
merging n2 and v2. n2 has 5 slots and v2 has 8. I will release the v6 patch and put the
metrics in the sbsa.json file. The metrics in sbsa.json are only applicable to arm64, so even if x86 cannot get
the slots value, there will be no conflict.


> Out of curiosity, do you know if x86 has such a capability to get this slot info from HW?
> 


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-06 10:14                 ` John Garry
@ 2023-01-09 15:34                   ` James Clark
  -1 siblings, 0 replies; 44+ messages in thread
From: James Clark @ 2023-01-09 15:34 UTC (permalink / raw)
  To: John Garry, Ian Rogers, Jing Zhang
  Cc: Xing Zhengjun, Will Deacon, Mike Leach, Leo Yan,
	linux-arm-kernel, linux-perf-users, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andrew Kilroy,
	Shuai Xue, Zhuo Song



On 06/01/2023 10:14, John Garry wrote:
> On 05/01/2023 21:13, Ian Rogers wrote:
>>>> This may be a feasible idea. The value of slots comes from the
>>>> register PMMIR_EL1, which I can read from
>>>> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I
>>>> replace the slots in MetricExpr with the
>>>> value read from there? As I currently understand it, parameters in
>>>> MetricExpr only support events and constants.
>>>>
>>> Maybe during runtime we could create a pseudo metric/event for SLOT.
>> For Intel we do this by just having a different constant for each
>> architecture. It is fairly easy to add a new "literal", so you could
>> add a #slots in expr__get_literal:
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
>> Populating it would be the challenge 😄
> 
> Thanks for the pointer. I think that the challenge in populating it
> really comes down to whether we would really want to make this generic.
> 
> I suppose that for arm64 we could have a method which accesses this
> PMMIR_EL1 register, while for other archs we could have a weak function
> which just returns NAN. If other archs want to use this key expr, they
> can add their own method.
> 

I wonder if it would be worthwhile, and even more generic, to add some
sort of accessor construct for files that contain a single integer. It
could also support a default value for when the file doesn't exist. For
example:

  "MetricExpr": "ITLB / {file://<pmu>/caps/slots(5)}"

It gets a bit fiddly because you might want to support absolute paths
and paths relative to whatever PMU is being used. But it could prevent
having to add some custom identifier and glue code for every possible
file that just has an integer in it.
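
To make the idea concrete, here is a minimal Python sketch of such an
accessor. The {file://...(default)} syntax is only the strawman from the
example above, and read_int_file, FILE_REF and expand_file_refs are
hypothetical names, not anything that exists in perf. It assumes the file
holds a single integer, in decimal or 0x-prefixed hex.

```python
import re
from pathlib import Path

# Hypothetical syntax from the example above: {file://<path>(<default>)}
FILE_REF = re.compile(r"\{file://(?P<path>[^()}]+)\((?P<default>-?\w+)\)\}")


def read_int_file(path, default):
    """Return the integer stored in `path`, or `default` if it can't be read."""
    try:
        # Base 0 accepts both decimal ("5") and prefixed hex ("0x05").
        return int(Path(path).read_text().strip(), 0)
    except (OSError, ValueError):
        return default


def expand_file_refs(expr, pmu_dir):
    """Replace each {file://...(default)} in `expr` with the integer it names.

    Paths beginning with "<pmu>/" are taken relative to `pmu_dir`; anything
    else is treated as an absolute path.
    """
    def substitute(match):
        path = match.group("path")
        if path.startswith("<pmu>/"):
            path = str(Path(pmu_dir) / path[len("<pmu>/"):])
        return str(read_int_file(path, int(match.group("default"), 0)))

    return FILE_REF.sub(substitute, expr)
```

With a PMU directory whose caps/slots file contains 8, the example
expression would expand to "ITLB / 8"; if the file is missing it would
expand to "ITLB / 5".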

It also wouldn't be possible to support the case where the file has
bitfields in it that need to be extracted, so maybe we shouldn't do it.
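
For comparison, the bitfield case needs an extra mask-and-shift step that
a plain integer accessor has no way to describe. A minimal Python sketch;
extract_field is a hypothetical helper, and the register value and field
positions below are made up for illustration.

```python
def extract_field(raw, low_bit, width):
    """Return the `width`-bit field of `raw` that starts at bit `low_bit`."""
    mask = (1 << width) - 1
    return (raw >> low_bit) & mask


# Made-up register value: bits [7:0] hold 0x05, bits [15:8] hold 0x12.
raw = 0x1205
low = extract_field(raw, 0, 8)
high = extract_field(raw, 8, 8)
```

Any accessor syntax would need somewhere to carry low_bit and width, which
is the extra complexity in question.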

James


* Re: [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 metrics for neoverse-n2
  2023-01-09 15:34                   ` James Clark
@ 2023-01-11  6:14                     ` Ian Rogers
  -1 siblings, 0 replies; 44+ messages in thread
From: Ian Rogers @ 2023-01-11  6:14 UTC (permalink / raw)
  To: James Clark
  Cc: John Garry, Jing Zhang, Xing Zhengjun, Will Deacon, Mike Leach,
	Leo Yan, linux-arm-kernel, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andrew Kilroy, Shuai Xue, Zhuo Song

On Mon, Jan 9, 2023 at 7:35 AM James Clark <james.clark@arm.com> wrote:
>
>
>
> On 06/01/2023 10:14, John Garry wrote:
> > On 05/01/2023 21:13, Ian Rogers wrote:
> >>>> This may be a feasible idea. The value of slots comes from the
> >>>> register PMMIR_EL1, which I can read in
> >>>> /sys/bus/event_source/devices/armv8_pmuv3_*/caps/slots. But how do I
> >>>> replace the slots in MetricExpr with the value read from that file?
> >>>> Currently I understand that parameters in MetricExpr only support
> >>>> events and constants.
> >>>>
> >>> Maybe during runtime we could create a pseudo metric/event for SLOT.
> >> For Intel we do this by just having a different constant for each
> >> architecture. It is fairly easy to add a new "literal", so you could
> >> add a #slots in expr__get_literal:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/expr.c?h=perf/core#n407
> >> Populating it would be the challenge 😄
> >
> > Thanks for the pointer. I think that the challenge in populating it
> > really comes down to whether we would really want to make this generic.
> >
> > I suppose that for arm64 we could have a method which accesses this
> > PMMIR_EL1 register, while for other archs we could have a weak function
> > which just returns NAN. If other archs want to use this key expr, they
> > can add their own method.
> >
>
> I wonder if it would be worthwhile, and even more generic, to add some
> sort of accessor construct for files that contain a single integer. It
> could also support a default value for when the file doesn't exist. For
> example:
>
>   "MetricExpr": "ITLB / {file://<pmu>/caps/slots(5)}"
>
> It gets a bit fiddly because you might want to support absolute paths
> and paths relative to whatever PMU is being used. But it could prevent
> having to add some custom identifier and glue code for every possible
> file that just has an integer in it.
>
> It also wouldn't be possible to support the case where the file has
> bitfields in it that need to be extracted, so maybe we shouldn't do it.
>
> James

Thanks James,

I think there are many opportunities to improve the metrics. One step
in this direction is:
https://lore.kernel.org/lkml/20221221223420.2157113-1-irogers@google.com/
(which is looking for reviews :-D ). Some areas we could improve include:
 - the expression code has support for longs, but I don't believe any
   metrics use it.
 - the modulus operator is weird and again unused.
 - I think divide (/) should behave like d_ratio, as aborting parsing
   is next to useless.
 - events like Intel's msr/tsc/ don't have to be programmed on every
   CPU/hyperthread, and doing so is quite wasteful.
 - we may have a read counter but no write counter, so being able to
   read a sibling CPU's or socket's read counter may inform about
   writes. This isn't currently expressible, as metrics are computed
   based on whatever the aggregation mode is; you can't get a
   particular count.
 - perf stat record/report don't compute metrics; they just provide
   counters.
 - the json format should resemble sysfs rather than being a flat
   list, and metrics and events in the list should be separated.
 - metrics use / as divide, and so @ is used in /'s place for event
   modifiers. BPF events use / as a directory separator.
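
As a rough illustration of the divide point, here is a minimal Python
sketch of a division that evaluates to 0 for a zero denominator instead of
failing. It assumes that d_ratio(x, y) evaluates to 0 when y is 0;
safe_divide is a hypothetical name, not a perf function.

```python
def safe_divide(numerator, denominator):
    """Divide, but evaluate to 0.0 instead of raising when the denominator is 0."""
    if denominator == 0:
        return 0.0
    return numerator / denominator
```

A metric whose denominator event counted nothing would then still produce
a value rather than abandoning the whole expression.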

For reading from the filesystem, I think it is a good idea. I'd like to
make it so that things like #num_dies become tool events and remove
the notion of literals. Perhaps we can make reading a file something
that is an event.

The current event parsing logic is overly complex, for example the
handling of '-', which has some legacy PMU separation properties. A
proposal mentioned at LPC was to have a new event parsing library that
doesn't carry legacy baggage. We can make the metric code use this, as
the metrics encode the events. If the new library fails to parse, the
code can fall back on the existing parser.

I'd like it if the event parsing logic more closely resembled the sysfs
style. I'd like it if we could have events in sysfs, built into the
tool (but with a layout resembling sysfs), and also allow events, etc.
to be added by having, say, a zip of a sysfs directory/file structure.
I'm hoping libraries like metric.py:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/pmu-events/metric.py
can be used by vendors, so that it is easy to update vendor-generated
json if/when the format changes.

Thanks,
Ian

end of thread, other threads:[~2023-01-11  6:15 UTC | newest]

Thread overview: 44+ messages
2023-01-03 11:39 [PATCH v5 0/6] Add metrics for neoverse-n2 Jing Zhang
2023-01-03 11:39 ` Jing Zhang
2023-01-03 11:39 ` [PATCH v5 1/6] perf vendor events arm64: Add topdown L1 " Jing Zhang
2023-01-03 11:39   ` Jing Zhang
2023-01-03 11:52   ` John Garry
2023-01-03 11:52     ` John Garry
2023-01-04  5:05     ` Jing Zhang
2023-01-04  5:05       ` Jing Zhang
2023-01-04 17:26       ` John Garry
2023-01-04 17:26         ` John Garry
2023-01-05 10:05         ` Jing Zhang
2023-01-05 10:05           ` Jing Zhang
2023-01-05 10:13           ` John Garry
2023-01-05 10:13             ` John Garry
2023-01-05 11:02             ` Jing Zhang
2023-01-05 11:02               ` Jing Zhang
2023-01-05 21:13             ` Ian Rogers
2023-01-05 21:13               ` Ian Rogers
2023-01-06 10:14               ` John Garry
2023-01-06 10:14                 ` John Garry
2023-01-06 10:34                 ` Jing Zhang
2023-01-06 10:34                   ` Jing Zhang
2023-01-09 15:34                 ` James Clark
2023-01-09 15:34                   ` James Clark
2023-01-11  6:14                   ` Ian Rogers
2023-01-11  6:14                     ` Ian Rogers
2023-01-03 11:39 ` [PATCH v5 2/6] perf vendor events arm64: Add TLB " Jing Zhang
2023-01-03 11:39   ` Jing Zhang
2023-01-03 17:14   ` Ian Rogers
2023-01-03 17:14     ` Ian Rogers
2023-01-04  5:21     ` Jing Zhang
2023-01-04  5:21       ` Jing Zhang
2023-01-04  8:40       ` Jing Zhang
2023-01-04  8:40         ` Jing Zhang
2023-01-04 16:57         ` Ian Rogers
2023-01-04 16:57           ` Ian Rogers
2023-01-03 11:39 ` [PATCH v5 3/6] perf vendor events arm64: Add cache " Jing Zhang
2023-01-03 11:39   ` Jing Zhang
2023-01-03 11:39 ` [PATCH v5 4/6] perf vendor events arm64: Add branch " Jing Zhang
2023-01-03 11:39   ` Jing Zhang
2023-01-03 11:39 ` [PATCH v5 5/6] perf vendor events arm64: Add PE utilization " Jing Zhang
2023-01-03 11:39   ` Jing Zhang
2023-01-03 11:39 ` [PATCH v5 6/6] perf vendor events arm64: Add instruction mix " Jing Zhang
2023-01-03 11:39   ` Jing Zhang
