linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/3] latest PMU events for zen1/zen2
@ 2020-02-25 19:28 Vijay Thakkar
  2020-02-25 19:28 ` [PATCH v2 1/3] perf vendor events amd: restrict model detection for zen1 based processors Vijay Thakkar
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Vijay Thakkar @ 2020-02-25 19:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Vijay Thakkar, Peter Zijlstra, Ingo Molnar, Kim Phillips,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

This series of patches brings the PMU events for AMD family 17h series
of processors up to date with the latest versions of the AMD processor
programming reference manuals.

The first patch changes the PMU events mapfile to be more selective
about the model number rather than blanket-detecting all family 17h
processors as having the same events directory. This is required for
the later patch that adds events for Zen2-based processors. In v2 of
the patch, the incorrect model-string regex was corrected to cover the
range 0 through 2f.

The second patch adds the PMU events for Zen2. In v2, the
ls_mab_alloc.loads umask is corrected. No events from Zen1 have been
removed.

Finally the third patch updates the zen1 PMU events to be in accordance
with the latest PPR version and bumps up the events version to v2. In v2
of the patch series, missing events (bp_dyn_ind_pred and bp_de_redirect)
were added and umasks were corrected for fpu_pipe_assignment.dual* and
ls_mab_alloc.loads.

Vijay Thakkar (3):
  perf vendor events amd: restrict model detection for zen1 based
    processors
  perf vendor events amd: add Zen2 events
  perf vendor events amd: update Zen1 events to V2

 .../pmu-events/arch/x86/amdfam17h/branch.json |  12 -
 .../pmu-events/arch/x86/amdzen1/branch.json   |  23 ++
 .../x86/{amdfam17h => amdzen1}/cache.json     |   0
 .../pmu-events/arch/x86/amdzen1/core.json     | 129 ++++++
 .../floating-point.json                       |  56 +++
 .../x86/{amdfam17h => amdzen1}/memory.json    |  18 +
 .../x86/{amdfam17h => amdzen1}/other.json     |   0
 .../pmu-events/arch/x86/amdzen2/branch.json   |  56 +++
 .../pmu-events/arch/x86/amdzen2/cache.json    | 375 ++++++++++++++++++
 .../arch/x86/{amdfam17h => amdzen2}/core.json |   0
 .../arch/x86/amdzen2/floating-point.json      | 128 ++++++
 .../pmu-events/arch/x86/amdzen2/memory.json   | 349 ++++++++++++++++
 .../pmu-events/arch/x86/amdzen2/other.json    | 137 +++++++
 tools/perf/pmu-events/arch/x86/mapfile.csv    |   3 +-
 14 files changed, 1273 insertions(+), 13 deletions(-)
 delete mode 100644 tools/perf/pmu-events/arch/x86/amdfam17h/branch.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/branch.json
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/cache.json (100%)
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/core.json
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/floating-point.json (63%)
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/memory.json (93%)
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/other.json (100%)
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/branch.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/cache.json
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen2}/core.json (100%)
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/other.json

-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/3] perf vendor events amd: restrict model detection for zen1 based processors
  2020-02-25 19:28 [PATCH v2 0/3] latest PMU events for zen1/zen2 Vijay Thakkar
@ 2020-02-25 19:28 ` Vijay Thakkar
  2020-02-25 19:28 ` [PATCH v2 2/3] perf vendor events amd: add Zen2 events Vijay Thakkar
  2020-02-25 19:28 ` [PATCH v2 3/3] perf vendor events amd: update Zen1 events to V2 Vijay Thakkar
  2 siblings, 0 replies; 12+ messages in thread
From: Vijay Thakkar @ 2020-02-25 19:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Vijay Thakkar, Peter Zijlstra, Ingo Molnar, Kim Phillips,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

This patch changes the previous blanket detection of AMD Family 17h
processors to be specific to Zen1-core-based products, by replacing
the model detection regex pattern [[:xdigit:]]+ with
([12][0-9A-F]|[0-9A-F]), restricting it to models 0 through 2f only.

This change is required to allow the addition of separate PMU events
for Zen2-core-based models in the following patches, as those models
belong to family 17h but have different PMCs. The current PMU events
directory has also been renamed from "amdfam17h" to "amdzen1" to
reflect this specificity.

Note that although this change does not break PMU counters for existing
zen1 based systems, it does disable the current set of counters for zen2
based systems. Counters for zen2 have been added in the following
patches in this patchset.
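
As a quick sanity check of the new pattern (a sketch only — perf
itself evaluates mapfile entries as POSIX extended regexes against the
full vendor-family-model string, not via Python):

```python
import re

# The new mapfile pattern for Zen1 (family 17h, models 0x00-0x2f).
# fullmatch() emulates perf anchoring the regex against the whole
# model field.
zen1_model = re.compile(r"([12][0-9A-F]|[0-9A-F])")

def is_zen1_model(model):
    """Return True if a hex model string (e.g. '1F') falls in 0x00-0x2f."""
    return zen1_model.fullmatch(model) is not None

# Zen1 models (0x00-0x2f) match...
assert is_zen1_model("0")
assert is_zen1_model("18")
assert is_zen1_model("2F")
# ...while Zen2 models such as 31h and 71h do not.
assert not is_zen1_model("31")
assert not is_zen1_model("71")
```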

Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
---
Changes in v2:
    - Change Zen1 model detection regex to include all models in range 0
    through 2F inclusive.

 .../perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/branch.json | 0
 .../perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/cache.json  | 0
 tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/core.json | 0
 .../arch/x86/{amdfam17h => amdzen1}/floating-point.json         | 0
 .../perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/memory.json | 0
 .../perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/other.json  | 0
 tools/perf/pmu-events/arch/x86/mapfile.csv                      | 2 +-
 7 files changed, 1 insertion(+), 1 deletion(-)
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/branch.json (100%)
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/cache.json (100%)
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/core.json (100%)
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/floating-point.json (100%)
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/memory.json (100%)
 rename tools/perf/pmu-events/arch/x86/{amdfam17h => amdzen1}/other.json (100%)

diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/branch.json b/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
similarity index 100%
rename from tools/perf/pmu-events/arch/x86/amdfam17h/branch.json
rename to tools/perf/pmu-events/arch/x86/amdzen1/branch.json
diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/cache.json b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
similarity index 100%
rename from tools/perf/pmu-events/arch/x86/amdfam17h/cache.json
rename to tools/perf/pmu-events/arch/x86/amdzen1/cache.json
diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/core.json b/tools/perf/pmu-events/arch/x86/amdzen1/core.json
similarity index 100%
rename from tools/perf/pmu-events/arch/x86/amdfam17h/core.json
rename to tools/perf/pmu-events/arch/x86/amdzen1/core.json
diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/floating-point.json b/tools/perf/pmu-events/arch/x86/amdzen1/floating-point.json
similarity index 100%
rename from tools/perf/pmu-events/arch/x86/amdfam17h/floating-point.json
rename to tools/perf/pmu-events/arch/x86/amdzen1/floating-point.json
diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/memory.json b/tools/perf/pmu-events/arch/x86/amdzen1/memory.json
similarity index 100%
rename from tools/perf/pmu-events/arch/x86/amdfam17h/memory.json
rename to tools/perf/pmu-events/arch/x86/amdzen1/memory.json
diff --git a/tools/perf/pmu-events/arch/x86/amdfam17h/other.json b/tools/perf/pmu-events/arch/x86/amdzen1/other.json
similarity index 100%
rename from tools/perf/pmu-events/arch/x86/amdfam17h/other.json
rename to tools/perf/pmu-events/arch/x86/amdzen1/other.json
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 745ced083844..82a9db00125e 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -36,4 +36,4 @@ GenuineIntel-6-55-[56789ABCDEF],v1,cascadelakex,core
 GenuineIntel-6-7D,v1,icelake,core
 GenuineIntel-6-7E,v1,icelake,core
 GenuineIntel-6-86,v1,tremontx,core
-AuthenticAMD-23-[[:xdigit:]]+,v1,amdfam17h,core
+AuthenticAMD-23-([12][0-9A-F]|[0-9A-F]),v1,amdzen1,core
-- 
2.25.1



* [PATCH v2 2/3] perf vendor events amd: add Zen2 events
  2020-02-25 19:28 [PATCH v2 0/3] latest PMU events for zen1/zen2 Vijay Thakkar
  2020-02-25 19:28 ` [PATCH v2 1/3] perf vendor events amd: restrict model detection for zen1 based processors Vijay Thakkar
@ 2020-02-25 19:28 ` Vijay Thakkar
  2020-02-26 22:09   ` Kim Phillips
  2020-02-25 19:28 ` [PATCH v2 3/3] perf vendor events amd: update Zen1 events to V2 Vijay Thakkar
  2 siblings, 1 reply; 12+ messages in thread
From: Vijay Thakkar @ 2020-02-25 19:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Vijay Thakkar, Peter Zijlstra, Ingo Molnar, Kim Phillips,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

This patch adds PMU events for AMD Zen2-core-based processors, namely,
Matisse (model 71h), Castle Peak (model 31h) and Rome (model 2xh), as
documented in the AMD Processor Programming Reference for Matisse [1].
Zen2 adds some counters that are not present in Zen1, and events for
them have been added in this patch. Some counters that were present in
Zen1 have also been removed for Zen2, having been confirmed to always
sample zero on Zen2. These added/removed counters have been omitted
for brevity but can be found here:
https://gist.github.com/thakkarV/5b12ca5fd7488eb2c42e451e40bdd5f3

Note that PPR for Zen2 [1] does not include some counters that were
documented in the PPR for Zen1 based processors [2]. After having tested
these counters, some of them that still work for zen2 systems have been
preserved in the events for zen2. The counters that are omitted in [1]
but are still measurable and non-zero on zen2 (tested on a Ryzen 3900X
system) are the following:

PMC 0x000 fpu_pipe_assignment.{total|total0|total1|total2|total3}
PMC 0x004 fp_num_mov_elim_scal_op.*
PMC 0x046 ls_tablewalker.*
PMC 0x062 l2_latency.l2_cycles_waiting_on_fills
PMC 0x063 l2_wcb_req.*
PMC 0x06D l2_fill_pending.l2_fill_busy
PMC 0x080 ic_fw32
PMC 0x081 ic_fw32_miss
PMC 0x086 bp_snp_re_sync
PMC 0x087 ic_fetch_stall.*
PMC 0x08C ic_cache_inval.*
PMC 0x099 bp_tlb_rel
PMC 0x0C7 ex_ret_brn_resync
PMC 0x28A ic_oc_mode_switch.*
L3PMC 0x001 l3_request_g1.*
L3PMC 0x006 l3_comb_clstr_state.*
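
For anyone who wants to probe one of the core PMCs above by hand
before the JSON lands, the PMC numbers can be turned into a raw perf
event. The helper below is a sketch of the family 17h core PERF_CTL
encoding (EventSelect[7:0] at bits 7:0, EventSelect[11:8] at bits
35:32, UnitMask at bits 15:8); the L3PMC events use a different PMU
and are not covered:

```python
def amd_core_raw_event(event, umask=0):
    """Build the raw config value for `perf stat -e rXXXX` on AMD
    family 17h core PMCs: EventSelect[7:0] -> bits 7:0,
    EventSelect[11:8] -> bits 35:32, UnitMask -> bits 15:8."""
    return ((event & 0xf00) << 24) | ((umask & 0xff) << 8) | (event & 0xff)

# ic_fw32 (PMC 0x080), no unit mask:
print(f"perf stat -e r{amd_core_raw_event(0x080):x}")
# ic_oc_mode_switch.* (PMC 0x28A), both unit mask bits:
print(f"perf stat -e r{amd_core_raw_event(0x28a, 0x3):x}")
```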

[1]: Processor Programming Reference (PPR) for AMD Family 17h Model 71h,
Revision B0 Processors, 56176 Rev 3.06 - Jul 17, 2019
[2]: Processor Programming Reference (PPR) for AMD Family 17h Models
01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019
All of the PPRs can be found at:
https://bugzilla.kernel.org/show_bug.cgi?id=206537

Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>

---
Changes in v2:
    - Correct UMask for ls_mab_alloc.loads

 .../pmu-events/arch/x86/amdzen2/branch.json   |  56 +++
 .../pmu-events/arch/x86/amdzen2/cache.json    | 375 ++++++++++++++++++
 .../pmu-events/arch/x86/amdzen2/core.json     | 134 +++++++
 .../arch/x86/amdzen2/floating-point.json      | 128 ++++++
 .../pmu-events/arch/x86/amdzen2/memory.json   | 349 ++++++++++++++++
 .../pmu-events/arch/x86/amdzen2/other.json    | 137 +++++++
 tools/perf/pmu-events/arch/x86/mapfile.csv    |   1 +
 7 files changed, 1180 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/branch.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/cache.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/core.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/other.json

diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/branch.json b/tools/perf/pmu-events/arch/x86/amdzen2/branch.json
new file mode 100644
index 000000000000..c1c1b856504d
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/branch.json
@@ -0,0 +1,56 @@
+[
+  {
+    "EventName": "bp_l1_btb_correct",
+    "EventCode": "0x8a",
+    "BriefDescription": "L1 BTB Correction."
+  },
+  {
+    "EventName": "bp_l2_btb_correct",
+    "EventCode": "0x8b",
+    "BriefDescription": "L2 BTB Correction."
+  },
+  {
+    "EventName": "bp_dyn_ind_pred",
+    "EventCode": "0x8e",
+    "BriefDescription": "Dynamic Indirect Predictions.",
+    "PublicDescription": "Indirect Branch Prediction for potential multi-target branch (speculative)."
+  },
+  {
+    "EventName": "bp_de_redirect",
+    "EventCode": "0x91",
+    "BriefDescription": "Decoder Overrides Existing Branch Prediction (speculative)."
+  },
+  {
+    "EventName": "bp_l1_tlb_fetch_hit",
+    "EventCode": "0x94",
+    "BriefDescription": "All instruction fetches.",
+    "PublicDescription": "The number of instruction fetches that hit in the L1 ITLB.",
+    "UMask": "0xFF"
+  },
+  {
+    "EventName": "bp_l1_tlb_fetch_hit.if1g",
+    "EventCode": "0x94",
+    "BriefDescription": "Instruction fetches to a 1GB page.",
+    "PublicDescription": "The number of instruction fetches that hit in the L1 ITLB.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "bp_l1_tlb_fetch_hit.if2m",
+    "EventCode": "0x94",
+    "BriefDescription": "Instruction fetches to a 2MB page.",
+    "PublicDescription": "The number of instruction fetches that hit in the L1 ITLB.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "bp_l1_tlb_fetch_hit.if4k",
+    "EventCode": "0x94",
+    "BriefDescription": "Instruction fetches to a 4KB page.",
+    "PublicDescription": "The number of instruction fetches that hit in the L1 ITLB.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "bp_tlb_rel",
+    "EventCode": "0x99",
+    "BriefDescription": "The number of ITLB reload requests."
+  }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/cache.json b/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
new file mode 100644
index 000000000000..aee22537b711
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
@@ -0,0 +1,375 @@
+[
+  {
+    "EventName": "l2_request_g1.rd_blk_l",
+    "EventCode": "0x60",
+    "BriefDescription": "Requests to L2 Group1.",
+    "PublicDescription": "Requests to L2 Group1.",
+    "UMask": "0x80"
+  },
+  {
+    "EventName": "l2_request_g1.rd_blk_x",
+    "EventCode": "0x60",
+    "BriefDescription": "Requests to L2 Group1.",
+    "PublicDescription": "Requests to L2 Group1.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "l2_request_g1.ls_rd_blk_c_s",
+    "EventCode": "0x60",
+    "BriefDescription": "Requests to L2 Group1.",
+    "PublicDescription": "Requests to L2 Group1.",
+    "UMask": "0x20"
+  },
+  {
+    "EventName": "l2_request_g1.cacheable_ic_read",
+    "EventCode": "0x60",
+    "BriefDescription": "Requests to L2 Group1.",
+    "PublicDescription": "Requests to L2 Group1.",
+    "UMask": "0x10"
+  },
+  {
+    "EventName": "l2_request_g1.change_to_x",
+    "EventCode": "0x60",
+    "BriefDescription": "Requests to L2 Group1.",
+    "PublicDescription": "Requests to L2 Group1.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "l2_request_g1.prefetch_l2",
+    "EventCode": "0x60",
+    "BriefDescription": "Requests to L2 Group1.",
+    "PublicDescription": "Requests to L2 Group1.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "l2_request_g1.l2_hw_pf",
+    "EventCode": "0x60",
+    "BriefDescription": "Requests to L2 Group1.",
+    "PublicDescription": "Requests to L2 Group1.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "l2_request_g1.other_requests",
+    "EventCode": "0x60",
+    "BriefDescription": "Events covered by l2_request_g2.",
+    "PublicDescription": "Requests to L2 Group1. Events covered by l2_request_g2.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "l2_request_g2.group1",
+    "EventCode": "0x61",
+    "BriefDescription": "All Group 1 commands not in unit0.",
+    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous. All Group 1 commands not in unit0.",
+    "UMask": "0x80"
+  },
+  {
+    "EventName": "l2_request_g2.ls_rd_sized",
+    "EventCode": "0x61",
+    "BriefDescription": "RdSized, RdSized32, RdSized64.",
+    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous. RdSized, RdSized32, RdSized64.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "l2_request_g2.ls_rd_sized_nc",
+    "EventCode": "0x61",
+    "BriefDescription": "RdSizedNC, RdSized32NC, RdSized64NC.",
+    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous. RdSizedNC, RdSized32NC, RdSized64NC.",
+    "UMask": "0x20"
+  },
+  {
+    "EventName": "l2_request_g2.ic_rd_sized",
+    "EventCode": "0x61",
+    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "UMask": "0x10"
+  },
+  {
+    "EventName": "l2_request_g2.ic_rd_sized_nc",
+    "EventCode": "0x61",
+    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "l2_request_g2.smc_inval",
+    "EventCode": "0x61",
+    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "l2_request_g2.bus_locks_originator",
+    "EventCode": "0x61",
+    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "l2_request_g2.bus_locks_responses",
+    "EventCode": "0x61",
+    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "l2_latency.l2_cycles_waiting_on_fills",
+    "EventCode": "0x62",
+    "BriefDescription": "Total cycles spent waiting for L2 fills to complete from L3 or memory, divided by four. Event counts are for both threads. To calculate average latency, the number of fills from both threads must be used.",
+    "PublicDescription": "Total cycles spent waiting for L2 fills to complete from L3 or memory, divided by four. Event counts are for both threads. To calculate average latency, the number of fills from both threads must be used.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "l2_wcb_req.wcb_write",
+    "EventCode": "0x63",
+    "PublicDescription": "LS (Load/Store unit) to L2 WCB (Write Combining Buffer) write requests.",
+    "BriefDescription": "LS to L2 WCB write requests.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "l2_wcb_req.wcb_close",
+    "EventCode": "0x63",
+    "BriefDescription": "LS to L2 WCB close requests.",
+    "PublicDescription": "LS (Load/Store unit) to L2 WCB (Write Combining Buffer) close requests.",
+    "UMask": "0x20"
+  },
+  {
+    "EventName": "l2_wcb_req.zero_byte_store",
+    "EventCode": "0x63",
+    "BriefDescription": "LS to L2 WCB zero byte store requests.",
+    "PublicDescription": "LS (Load/Store unit) to L2 WCB (Write Combining Buffer) zero byte store requests.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "l2_wcb_req.cl_zero",
+    "EventCode": "0x63",
+    "PublicDescription": "LS to L2 WCB cache line zeroing requests.",
+    "BriefDescription": "LS (Load/Store unit) to L2 WCB (Write Combining Buffer) cache line zeroing requests.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "l2_cache_req_stat.ls_rd_blk_cs",
+    "EventCode": "0x64",
+    "BriefDescription": "LS ReadBlock C/S Hit.",
+    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LS ReadBlock C/S Hit.",
+    "UMask": "0x80"
+  },
+  {
+    "EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_x",
+    "EventCode": "0x64",
+    "BriefDescription": "LS Read Block L Hit X.",
+    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LS Read Block L Hit X.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_s",
+    "EventCode": "0x64",
+    "BriefDescription": "LsRdBlkL Hit Shared.",
+    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LsRdBlkL Hit Shared.",
+    "UMask": "0x20"
+  },
+  {
+    "EventName": "l2_cache_req_stat.ls_rd_blk_x",
+    "EventCode": "0x64",
+    "BriefDescription": "LsRdBlkX/ChgToX Hit X.  Count RdBlkX finding Shared as a Miss.",
+    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LsRdBlkX/ChgToX Hit X.  Count RdBlkX finding Shared as a Miss.",
+    "UMask": "0x10"
+  },
+  {
+    "EventName": "l2_cache_req_stat.ls_rd_blk_c",
+    "EventCode": "0x64",
+    "BriefDescription": "LS Read Block C S L X Change to X Miss.",
+    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LS Read Block C S L X Change to X Miss.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "l2_cache_req_stat.ic_fill_hit_x",
+    "EventCode": "0x64",
+    "BriefDescription": "IC Fill Hit Exclusive Stale.",
+    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. IC Fill Hit Exclusive Stale.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "l2_cache_req_stat.ic_fill_hit_s",
+    "EventCode": "0x64",
+    "BriefDescription": "IC Fill Hit Shared.",
+    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. IC Fill Hit Shared.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "l2_cache_req_stat.ic_fill_miss",
+    "EventCode": "0x64",
+    "BriefDescription": "IC Fill Miss.",
+    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. IC Fill Miss.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "l2_fill_pending.l2_fill_busy",
+    "EventCode": "0x6d",
+    "BriefDescription": "Total cycles spent with one or more fill requests in flight from L2.",
+    "PublicDescription": "Total cycles spent with one or more fill requests in flight from L2.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "l2_pf_hit_l2",
+    "EventCode": "0x70",
+    "BriefDescription": "All L2 prefetches accepted by the L2 pipeline which hit the L2."
+  },
+  {
+    "EventName": "l2_pf_miss_l2_hit_l3",
+    "EventCode": "0x71",
+    "BriefDescription": "All L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3."
+  },
+  {
+    "EventName": "l2_pf_miss_l2_l3",
+    "EventCode": "0x72",
+    "BriefDescription": "All L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches."
+  },
+  {
+    "EventName": "ic_fw32",
+    "EventCode": "0x80",
+    "BriefDescription": "The number of 32B fetch windows transferred from IC pipe to DE instruction decoder (includes non-cacheable and cacheable fill responses)."
+  },
+  {
+    "EventName": "ic_fw32_miss",
+    "EventCode": "0x81",
+    "BriefDescription": "The number of 32B fetch windows tried to read the L1 IC and missed in the full tag."
+  },
+  {
+    "EventName": "ic_cache_fill_l2",
+    "EventCode": "0x82",
+    "BriefDescription": "The number of 64 byte instruction cache line was fulfilled from the L2 cache."
+  },
+  {
+    "EventName": "ic_cache_fill_sys",
+    "EventCode": "0x83",
+    "BriefDescription": "The number of 64 byte instruction cache line fulfilled from system memory or another cache."
+  },
+  {
+    "EventName": "bp_l1_tlb_miss_l2_hit",
+    "EventCode": "0x84",
+    "BriefDescription": "The number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB."
+  },
+  {
+    "EventName": "bp_l1_tlb_miss_l2_tlb_miss",
+    "EventCode": "0x85",
+    "BriefDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs.",
+    "UMask": "0xff"
+  },
+  {
+    "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if1g",
+    "EventCode": "0x85",
+    "BriefDescription": "Instruction fetches to a 1GB page.",
+    "PublicDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if2m",
+    "EventCode": "0x85",
+    "BriefDescription": "Instruction fetches to a 2MB page.",
+    "PublicDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if4k",
+    "EventCode": "0x85",
+    "BriefDescription": "Instruction fetches to a 4KB page.",
+    "PublicDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "bp_snp_re_sync",
+    "EventCode": "0x86",
+    "BriefDescription": "The number of pipeline restarts caused by invalidating probes that hit on the instruction stream currently being executed. This would happen if the active instruction stream was being modified by another processor in an MP system - typically a highly unlikely event."
+  },
+  {
+    "EventName": "ic_fetch_stall.ic_stall_any",
+    "EventCode": "0x87",
+    "BriefDescription": "IC pipe was stalled during this clock cycle for any reason (nothing valid in pipe ICM1).",
+    "PublicDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle for any reason (nothing valid in pipe ICM1).",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "ic_fetch_stall.ic_stall_dq_empty",
+    "EventCode": "0x87",
+    "BriefDescription": "IC pipe was stalled during this clock cycle (including IC to OC fetches) due to DQ empty.",
+    "PublicDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle (including IC to OC fetches) due to DQ empty.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ic_fetch_stall.ic_stall_back_pressure",
+    "EventCode": "0x87",
+    "BriefDescription": "IC pipe was stalled during this clock cycle (including IC to OC fetches) due to back-pressure.",
+    "PublicDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle (including IC to OC fetches) due to back-pressure.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ic_cache_inval.l2_invalidating_probe",
+    "EventCode": "0x8c",
+    "BriefDescription": "IC line invalidated due to L2 invalidating probe (external or LS).",
+    "PublicDescription": "The number of instruction cache lines invalidated. A non-SMC event is CMC (cross modifying code), either from the other thread of the core or another core. IC line invalidated due to L2 invalidating probe (external or LS).",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ic_cache_inval.fill_invalidated",
+    "EventCode": "0x8c",
+    "BriefDescription": "IC line invalidated due to overwriting fill response.",
+    "PublicDescription": "The number of instruction cache lines invalidated. A non-SMC event is CMC (cross modifying code), either from the other thread of the core or another core. IC line invalidated due to overwriting fill response.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ic_oc_mode_switch.oc_ic_mode_switch",
+    "EventCode": "0x28a",
+    "BriefDescription": "OC to IC mode switch.",
+    "PublicDescription": "OC Mode Switch. OC to IC mode switch.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ic_oc_mode_switch.ic_oc_mode_switch",
+    "EventCode": "0x28a",
+    "BriefDescription": "IC to OC mode switch.",
+    "PublicDescription": "OC Mode Switch. IC to OC mode switch.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "l3_request_g1.caching_l3_cache_accesses",
+    "EventCode": "0x01",
+    "BriefDescription": "Caching: L3 cache accesses",
+    "UMask": "0x80",
+    "Unit": "L3PMC"
+  },
+  {
+    "EventName": "l3_lookup_state.all_l3_req_typs",
+    "EventCode": "0x04",
+    "BriefDescription": "All L3 Request Types",
+    "UMask": "0xff",
+    "Unit": "L3PMC"
+  },
+  {
+    "EventName": "l3_comb_clstr_state.other_l3_miss_typs",
+    "EventCode": "0x06",
+    "BriefDescription": "Other L3 Miss Request Types",
+    "UMask": "0xfe",
+    "Unit": "L3PMC"
+  },
+  {
+    "EventName": "l3_comb_clstr_state.request_miss",
+    "EventCode": "0x06",
+    "BriefDescription": "L3 cache misses",
+    "UMask": "0x01",
+    "Unit": "L3PMC"
+  },
+  {
+    "EventName": "xi_sys_fill_latency",
+    "EventCode": "0x90",
+    "BriefDescription": "L3 Cache Miss Latency. Total cycles for all transactions divided by 16. Ignores SliceMask and ThreadMask.",
+    "UMask": "0x00",
+    "Unit": "L3PMC"
+  },
+  {
+    "EventName": "xi_ccx_sdp_req1.all_l3_miss_req_typs",
+    "EventCode": "0x9A",
+    "BriefDescription": "All L3 Miss Request Types. Ignores SliceMask and ThreadMask.",
+    "UMask": "0x3f",
+    "Unit": "L3PMC"
+  }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/core.json b/tools/perf/pmu-events/arch/x86/amdzen2/core.json
new file mode 100644
index 000000000000..1079544eeed5
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/core.json
@@ -0,0 +1,134 @@
+[
+  {
+    "EventName": "ex_ret_instr",
+    "EventCode": "0xc0",
+    "BriefDescription": "Retired Instructions."
+  },
+  {
+    "EventName": "ex_ret_cops",
+    "EventCode": "0xc1",
+    "BriefDescription": "Retired Uops.",
+    "PublicDescription": "The number of uOps retired. This includes all processor activity (instructions, exceptions, interrupts, microcode assists, etc.). The number of events logged per cycle can vary from 0 to 4."
+  },
+  {
+    "EventName": "ex_ret_brn",
+    "EventCode": "0xc2",
+    "BriefDescription": "Retired Branch Instructions.",
+    "PublicDescription": "The number of branch instructions retired. This includes all types of architectural control flow changes, including exceptions and interrupts."
+  },
+  {
+    "EventName": "ex_ret_brn_misp",
+    "EventCode": "0xc3",
+    "BriefDescription": "Retired Branch Instructions Mispredicted.",
+    "PublicDescription": "The number of branch instructions retired, of any type, that were not correctly predicted. This includes those for which prediction is not attempted (far control transfers, exceptions and interrupts)."
+  },
+  {
+    "EventName": "ex_ret_brn_tkn",
+    "EventCode": "0xc4",
+    "BriefDescription": "Retired Taken Branch Instructions.",
+    "PublicDescription": "The number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts."
+  },
+  {
+    "EventName": "ex_ret_brn_tkn_misp",
+    "EventCode": "0xc5",
+    "BriefDescription": "Retired Taken Branch Instructions Mispredicted.",
+    "PublicDescription": "The number of retired taken branch instructions that were mispredicted."
+  },
+  {
+    "EventName": "ex_ret_brn_far",
+    "EventCode": "0xc6",
+    "BriefDescription": "Retired Far Control Transfers.",
+    "PublicDescription": "The number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts. Far control transfers are not subject to branch prediction."
+  },
+  {
+    "EventName": "ex_ret_brn_resync",
+    "EventCode": "0xc7",
+    "BriefDescription": "Retired Branch Resyncs.",
+    "PublicDescription": "The number of resync branches. These reflect pipeline restarts due to certain microcode assists and events such as writes to the active instruction stream, among other things. Each occurrence reflects a restart penalty similar to a branch mispredict. This is relatively rare."
+  },
+  {
+    "EventName": "ex_ret_near_ret",
+    "EventCode": "0xc8",
+    "BriefDescription": "Retired Near Returns.",
+    "PublicDescription": "The number of near return instructions (RET or RET Iw) retired."
+  },
+  {
+    "EventName": "ex_ret_near_ret_mispred",
+    "EventCode": "0xc9",
+    "BriefDescription": "Retired Near Returns Mispredicted.",
+    "PublicDescription": "The number of near returns retired that were not correctly predicted by the return address predictor. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction."
+  },
+  {
+    "EventName": "ex_ret_brn_ind_misp",
+    "EventCode": "0xca",
+    "BriefDescription": "Retired Indirect Branch Instructions Mispredicted.",
+    "PublicDescription": "Retired Indirect Branch Instructions Mispredicted."
+  },
+  {
+    "EventName": "ex_ret_mmx_fp_instr.sse_instr",
+    "EventCode": "0xcb",
+    "BriefDescription": "SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41, SSE42, AVX).",
+    "PublicDescription": "The number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions it is not suitable for measuring MFLOPS. SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41, SSE42, AVX).",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "ex_ret_mmx_fp_instr.mmx_instr",
+    "EventCode": "0xcb",
+    "BriefDescription": "MMX instructions.",
+    "PublicDescription": "The number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions it is not suitable for measuring MFLOPS. MMX instructions.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ex_ret_mmx_fp_instr.x87_instr",
+    "EventCode": "0xcb",
+    "BriefDescription": "x87 instructions.",
+    "PublicDescription": "The number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions it is not suitable for measuring MFLOPS. x87 instructions.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ex_ret_cond",
+    "EventCode": "0xd1",
+    "BriefDescription": "Retired Conditional Branch Instructions."
+  },
+  {
+    "EventName": "ex_ret_cond_misp",
+    "EventCode": "0xd2",
+    "BriefDescription": "Retired Conditional Branch Instructions Mispredicted."
+  },
+  {
+    "EventName": "ex_div_busy",
+    "EventCode": "0xd3",
+    "BriefDescription": "Div Cycles Busy count."
+  },
+  {
+    "EventName": "ex_div_count",
+    "EventCode": "0xd4",
+    "BriefDescription": "Div Op Count."
+  },
+  {
+    "EventName": "ex_tagged_ibs_ops.ibs_count_rollover",
+    "EventCode": "0x1cf",
+    "BriefDescription": "Number of times an op could not be tagged by IBS because of a previous tagged op that has not retired.",
+    "PublicDescription": "Tagged IBS Ops. Number of times an op could not be tagged by IBS because of a previous tagged op that has not retired.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "ex_tagged_ibs_ops.ibs_tagged_ops_ret",
+    "EventCode": "0x1cf",
+    "BriefDescription": "Number of Ops tagged by IBS that retired.",
+    "PublicDescription": "Tagged IBS Ops. Number of Ops tagged by IBS that retired.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ex_tagged_ibs_ops.ibs_tagged_ops",
+    "EventCode": "0x1cf",
+    "BriefDescription": "Number of Ops tagged by IBS.",
+    "PublicDescription": "Tagged IBS Ops. Number of Ops tagged by IBS.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ex_ret_fus_brnch_inst",
+    "EventCode": "0x1d0",
+    "BriefDescription": "The number of fused retired branch instructions retired per cycle. The number of events logged per cycle can vary from 0 to 3."
+  }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json b/tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json
new file mode 100644
index 000000000000..df530b398f9d
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json
@@ -0,0 +1,128 @@
+[
+  {
+    "EventName": "fpu_pipe_assignment.total",
+    "EventCode": "0x00",
+    "BriefDescription": "Total number of fp uOps.",
+    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",
+    "UMask": "0xf"
+  },
+  {
+    "EventName": "fp_ret_sse_avx_ops.all",
+    "EventCode": "0x03",
+    "BriefDescription": "All FLOPS.",
+    "PublicDescription": "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15.",
+    "UMask": "0xff"
+  },
+  {
+    "EventName": "fp_ret_sse_avx_ops.mult_add_flops",
+    "EventCode": "0x03",
+    "BriefDescription": "Single precision multiply-add FLOPS. Multiply-add counts as 2 FLOPS.",
+    "PublicDescription": "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15. Single precision multiply-add FLOPS. Multiply-add counts as 2 FLOPS.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "fp_ret_sse_avx_ops.div_flops",
+    "EventCode": "0x03",
+    "BriefDescription": "Divide/square root FLOPS.",
+    "PublicDescription": "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15. Single-precision divide/square root FLOPS.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "fp_ret_sse_avx_ops.mult_flops",
+    "EventCode": "0x03",
+    "BriefDescription": "Multiply FLOPS.",
+    "PublicDescription": "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15. Single-precision multiply FLOPS.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "fp_ret_sse_avx_ops.add_sub_flops",
+    "EventCode": "0x03",
+    "BriefDescription": "Add/subtract FLOPS.",
+    "PublicDescription": "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15. Single-precision add/subtract FLOPS.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "fp_num_mov_elim_scal_op.optimized",
+    "EventCode": "0x04",
+    "BriefDescription": "Number of Scalar Ops optimized.",
+    "PublicDescription": "This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes. Number of Scalar Ops optimized.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "fp_num_mov_elim_scal_op.opt_potential",
+    "EventCode": "0x04",
+    "BriefDescription": "Number of Ops that are candidates for optimization (have Z-bit either set or pass).",
+    "PublicDescription": "This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes. Number of Ops that are candidates for optimization (have Z-bit either set or pass).",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "fp_num_mov_elim_scal_op.sse_mov_ops_elim",
+    "EventCode": "0x04",
+    "BriefDescription": "Number of SSE Move Ops eliminated.",
+    "PublicDescription": "This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes. Number of SSE Move Ops eliminated.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "fp_num_mov_elim_scal_op.sse_mov_ops",
+    "EventCode": "0x04",
+    "BriefDescription": "Number of SSE Move Ops.",
+    "PublicDescription": "This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes. Number of SSE Move Ops.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "fp_retired_ser_ops.sse_bot_ret",
+    "EventCode": "0x05",
+    "BriefDescription": "SSE bottom-executing uOps retired.",
+    "PublicDescription": "The number of serializing Ops retired. SSE bottom-executing uOps retired.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "fp_retired_ser_ops.sse_ctrl_ret",
+    "EventCode": "0x05",
+    "BriefDescription": "SSE control word mispredict traps due to mispredictions in RC, FTZ or DAZ, or changes in mask bits.",
+    "PublicDescription": "The number of serializing Ops retired. SSE control word mispredict traps due to mispredictions in RC, FTZ or DAZ, or changes in mask bits.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "fp_retired_ser_ops.x87_bot_ret",
+    "EventCode": "0x05",
+    "BriefDescription": "x87 bottom-executing uOps retired.",
+    "PublicDescription": "The number of serializing Ops retired. x87 bottom-executing uOps retired.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "fp_retired_ser_ops.x87_ctrl_ret",
+    "EventCode": "0x05",
+    "BriefDescription": "x87 control word mispredict traps due to mispredictions in RC or PC, or changes in mask bits.",
+    "PublicDescription": "The number of serializing Ops retired. x87 control word mispredict traps due to mispredictions in RC or PC, or changes in mask bits.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "fp_disp_faults.ymm_spill_fault",
+    "EventCode": "0x0e",
+    "BriefDescription": "YMM spill fault.",
+    "PublicDescription": "Floating Point Dispatch Faults.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "fp_disp_faults.ymm_fill_fault",
+    "EventCode": "0x0e",
+    "BriefDescription": "YMM fill fault.",
+    "PublicDescription": "Floating Point Dispatch Faults.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "fp_disp_faults.xmm_fill_fault",
+    "EventCode": "0x0e",
+    "BriefDescription": "XMM fill fault.",
+    "PublicDescription": "Floating Point Dispatch Faults.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "fp_disp_faults.x87_fill_fault",
+    "EventCode": "0x0e",
+    "BriefDescription": "x87 fill fault.",
+    "PublicDescription": "Floating Point Dispatch Faults.",
+    "UMask": "0x1"
+  }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/memory.json b/tools/perf/pmu-events/arch/x86/amdzen2/memory.json
new file mode 100644
index 000000000000..5c0f80588c61
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/memory.json
@@ -0,0 +1,349 @@
+[
+  {
+    "EventName": "ls_bad_status2.stli_other",
+    "EventCode": "0x24",
+    "BriefDescription": "Non-forwardable conflict; used to reduce STLI's via software. All reasons.",
+    "PublicDescription": "Store To Load Interlock (STLI) are loads that were unable to complete because of a possible match with an older store, and the older store could not do STLF for some reason. There are a number of reasons why this occurs, and this perfmon organizes them into three major groups. ",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_locks.spec_lock_hi_spec",
+    "EventCode": "0x25",
+    "BriefDescription": "High speculative cacheable lock speculation succeeded.",
+    "PublicDescription": "Retired lock instructions.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "ls_locks.spec_lock_lo_spec",
+    "EventCode": "0x25",
+    "BriefDescription": "Low speculative cacheable lock speculation succeeded.",
+    "PublicDescription": "Retired lock instructions.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "ls_locks.non_spec_lock",
+    "EventCode": "0x25",
+    "BriefDescription": "Non-speculative lock succeeded.",
+    "PublicDescription": "Retired lock instructions.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_locks.bus_lock",
+    "EventCode": "0x25",
+    "BriefDescription": "Comparable to legacy bus lock.",
+    "PublicDescription": "Retired lock instructions. Bus lock when a locked operations crosses a cache boundary or is done on an uncacheable memory type.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ls_ret_cl_flush",
+    "EventCode": "0x26",
+    "BriefDescription": "Number of retired CLFLUSH instructions."
+  },
+  {
+    "EventName": "ls_ret_cpuid",
+    "EventCode": "0x27",
+    "BriefDescription": "Number of retired CLFLUSH instructions."
+  },
+  {
+    "EventName": "ls_dispatch.ld_st_dispatch",
+    "EventCode": "0x29",
+    "BriefDescription": "Number of single ops that do load/store to an address.",
+    "PublicDescription": "Dispatch of a single op that performs a load from and store to the same memory address.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "ls_dispatch.store_dispatch",
+    "EventCode": "0x29",
+    "BriefDescription": "Number of stores dispatched.",
+    "PublicDescription": "Counts the number of operations dispatched to the LS unit. Unit Masks ADDed.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_dispatch.ld_dispatch",
+    "EventCode": "0x29",
+    "BriefDescription": "Number of loads dispatched.",
+    "PublicDescription": "Counts the number of operations dispatched to the LS unit. Unit Masks ADDed.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ls_smi_rx",
+    "EventCode": "0x2B",
+    "BriefDescription": "Number of SMIs received."
+  },
+  {
+    "EventName": "ls_int_taken",
+    "EventCode": "0x2C",
+    "BriefDescription": "Number of interrupts taken."
+  },
+  {
+    "EventName": "ls_rdtsc",
+    "EventCode": "0x2D",
+    "BriefDescription": "Number of reads of the TSC (RDTSC instructions). The count is speculative."
+  },
+  {
+    "EventName": "ls_stlf",
+    "EventCode": "0x35",
+    "BriefDescription": "Number of STLF hits."
+  },
+  {
+    "EventName": "ls_st_commit_cancel2.st_commit_cancel_wcb_full",
+    "EventCode": "0x37",
+    "BriefDescription": "A non-cacheable store and the non-cacheable commit buffer is full."
+  },
+  {
+    "EventName": "ls_dc_accesses",
+    "EventCode": "0x40",
+    "BriefDescription": "Number of accesses to the dcache for load/store references.",
+    "PublicDescription": "The number of accesses to the data cache for load and store references. This may include certain microcode scratchpad accesses, although these are generally rare. Each increment represents an eight-byte access, although the instruction may only be accessing a portion of that. This event is a speculative event."
+  },
+  {
+    "EventName": "ls_mab_alloc.dc_prefetcher",
+    "EventCode": "0x41",
+    "BriefDescription": "Data cache prefetcher miss.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "ls_mab_alloc.stores",
+    "EventCode": "0x41",
+    "BriefDescription": "Data cache store miss.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_mab_alloc.loads",
+    "EventCode": "0x41",
+    "BriefDescription": "Data cache load miss.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ls_refills_from_sys.ls_mabresp_rmt_dram",
+    "EventCode": "0x43",
+    "BriefDescription": "DRAM or IO from different die.",
+    "PublicDescription": "Demand Data Cache Fills by Data Source.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "ls_refills_from_sys.ls_mabresp_rmt_cache",
+    "EventCode": "0x43",
+    "BriefDescription": "Hit in cache; Remote CCX and the address's Home Node is on a different die.",
+    "PublicDescription": "Demand Data Cache Fills by Data Source.",
+    "UMask": "0x10"
+  },
+  {
+    "EventName": "ls_refills_from_sys.ls_mabresp_lcl_dram",
+    "EventCode": "0x43",
+    "BriefDescription": "DRAM or IO from this thread's die.",
+    "PublicDescription": "Demand Data Cache Fills by Data Source.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "ls_refills_from_sys.ls_mabresp_lcl_cache",
+    "EventCode": "0x43",
+    "BriefDescription": "Hit in cache; local CCX (not Local L2), or Remote CCX and the address's Home Node is on this thread's die.",
+    "PublicDescription": "Demand Data Cache Fills by Data Source.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_refills_from_sys.ls_mabresp_lcl_l2",
+    "EventCode": "0x43",
+    "BriefDescription": "Local L2 hit.",
+    "PublicDescription": "Demand Data Cache Fills by Data Source.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ls_l1_d_tlb_miss.all",
+    "EventCode": "0x45",
+    "BriefDescription": "L1 DTLB Miss or Reload off all sizes.",
+    "PublicDescription": "L1 DTLB Miss or Reload off all sizes.",
+    "UMask": "0xff"
+  },
+  {
+    "EventName": "ls_l1_d_tlb_miss.tlb_reload_1g_l2_miss",
+    "EventCode": "0x45",
+    "BriefDescription": "L1 DTLB Miss of a page of 1G size.",
+    "PublicDescription": "L1 DTLB Miss of a page of 1G size.",
+    "UMask": "0x80"
+  },
+  {
+    "EventName": "ls_l1_d_tlb_miss.tlb_reload_2m_l2_miss",
+    "EventCode": "0x45",
+    "BriefDescription": "L1 DTLB Miss of a page of 2M size.",
+    "PublicDescription": "L1 DTLB Miss of a page of 2M size.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "ls_l1_d_tlb_miss.tlb_reload_32k_l2_miss",
+    "EventCode": "0x45",
+    "BriefDescription": "L1 DTLB Miss of a page of 32K size.",
+    "PublicDescription": "L1 DTLB Miss of a page of 32K size.",
+    "UMask": "0x20"
+  },
+  {
+    "EventName": "ls_l1_d_tlb_miss.tlb_reload_4k_l2_miss",
+    "EventCode": "0x45",
+    "BriefDescription": "L1 DTLB Miss of a page of 4K size.",
+    "PublicDescription": "L1 DTLB Miss of a page of 4K size.",
+    "UMask": "0x10"
+  },
+  {
+    "EventName": "ls_l1_d_tlb_miss.tlb_reload_1g_l2_hit",
+    "EventCode": "0x45",
+    "BriefDescription": "L1 DTLB Reload of a page of 1G size.",
+    "PublicDescription": "L1 DTLB Reload of a page of 1G size.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "ls_l1_d_tlb_miss.tlb_reload_2m_l2_hit",
+    "EventCode": "0x45",
+    "BriefDescription": "L1 DTLB Reload of a page of 2M size.",
+    "PublicDescription": "L1 DTLB Reload of a page of 2M size.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "ls_l1_d_tlb_miss.tlb_reload_32k_l2_hit",
+    "EventCode": "0x45",
+    "BriefDescription": "L1 DTLB Reload of a page of 32K size.",
+    "PublicDescription": "L1 DTLB Reload of a page of 32K size.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_l1_d_tlb_miss.tlb_reload_4k_l2_hit",
+    "EventCode": "0x45",
+    "BriefDescription": "L1 DTLB Reload of a page of 4K size.",
+    "PublicDescription": "L1 DTLB Reload of a page of 4K size.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ls_tablewalker.perf_mon_tablewalk_alloc_iside",
+    "EventCode": "0x46",
+    "BriefDescription": "Tablewalker allocation.",
+    "PublicDescription": "Tablewalker allocation.",
+    "UMask": "0xc"
+  },
+  {
+    "EventName": "ls_tablewalker.perf_mon_tablewalk_alloc_dside",
+    "EventCode": "0x46",
+    "BriefDescription": "Tablewalker allocation.",
+    "PublicDescription": "Tablewalker allocation.",
+    "UMask": "0x3"
+  },
+  {
+    "EventName": "ls_misal_accesses",
+    "EventCode": "0x47",
+    "BriefDescription": "Misaligned loads."
+  },
+  {
+    "EventName": "ls_pref_instr_disp.prefetch_nta",
+    "EventCode": "0x4b",
+    "BriefDescription": "Software Prefetch Instructions (PREFETCHNTA instruction) Dispatched.",
+    "PublicDescription": "Software Prefetch Instructions (PREFETCHNTA instruction) Dispatched.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "ls_pref_instr_disp.store_prefetch_w",
+    "EventCode": "0x4b",
+    "BriefDescription": "Software Prefetch Instructions (3DNow PREFETCHW instruction) Dispatched.",
+    "PublicDescription": "Software Prefetch Instructions (3DNow PREFETCHW instruction) Dispatched.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_pref_instr_disp.load_prefetch_w",
+    "EventCode": "0x4b",
+    "BriefDescription": "Prefetch, Prefetch_T0_T1_T2.",
+    "PublicDescription": "Software Prefetch Instructions Dispatched. Prefetch, Prefetch_T0_T1_T2.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ls_inef_sw_pref.mab_mch_cnt",
+    "EventCode": "0x52",
+    "BriefDescription": "Software PREFETCH instruction saw a match on an already-allocated miss request buffer.",
+    "PublicDescription": "The number of software prefetches that did not fetch data outside of the processor core.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_inef_sw_pref.data_pipe_sw_pf_dc_hit",
+    "EventCode": "0x52",
+    "BriefDescription": "Software PREFETCH instruction saw a DC hit.",
+    "PublicDescription": "The number of software prefetches that did not fetch data outside of the processor core.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_rmt_dram",
+    "EventCode": "0x59",
+    "BriefDescription": "DRAM or IO from different die.",
+    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_rmt_cache",
+    "EventCode": "0x59",
+    "BriefDescription": "Hit in cache; Remote CCX and the address's Home Node is on a different die.",
+    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x10"
+  },
+  {
+    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_lcl_dram",
+    "EventCode": "0x59",
+    "BriefDescription": "DRAM or IO from this thread's die.",
+    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_lcl_cache",
+    "EventCode": "0x59",
+    "BriefDescription": "Hit in cache; local CCX (not Local L2), or Remote CCX and the address's Home Node is on this thread's die.",
+    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_lcl_l2",
+    "EventCode": "0x59",
+    "BriefDescription": "Local L2 hit.",
+    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_rmt_dram",
+    "EventCode": "0x5A",
+    "BriefDescription": "DRAM or IO from different die.",
+    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_rmt_cache",
+    "EventCode": "0x5A",
+    "BriefDescription": "Hit in cache; Remote CCX and the address's Home Node is on a different die.",
+    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x10"
+  },
+  {
+    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_lcl_dram",
+    "EventCode": "0x5A",
+    "BriefDescription": "DRAM or IO from this thread's die.",
+    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_lcl_cache",
+    "EventCode": "0x5A",
+    "BriefDescription": "Hit in cache; local CCX (not Local L2), or Remote CCX and the address's Home Node is on this thread's die.",
+    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_lcl_l2",
+    "EventCode": "0x5A",
+    "BriefDescription": "Local L2 hit.",
+    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "ls_not_halted_cyc",
+    "EventCode": "0x76",
+    "BriefDescription": "Cycles not in Halt."
+  },
+  {
+    "EventName": "ls_tlb_flush",
+    "EventCode": "0x78",
+    "BriefDescription": "All TLB Flushes"
+  }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/other.json b/tools/perf/pmu-events/arch/x86/amdzen2/other.json
new file mode 100644
index 000000000000..5d2b53a7465d
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/other.json
@@ -0,0 +1,137 @@
+[
+  {
+    "EventName": "de_dis_uop_queue_empty_di0",
+    "EventCode": "0xa9",
+    "BriefDescription": "Cycles where the Micro-Op Queue is empty."
+  },
+  {
+    "EventName": "de_dis_uops_from_decoder",
+    "EventCode": "0xaa",
+    "BriefDescription": "Ops dispatched from either the decoders, OpCache or both.",
+    "UMask": "0xff"
+  },
+  {
+    "EventName": "de_dis_uops_from_decoder.opcache_dispatched",
+    "EventCode": "0xaa",
+    "BriefDescription": "Count of dispatched Ops from OpCache.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "de_dis_uops_from_decoder.decoder_dispatched",
+    "EventCode": "0xaa",
+    "BriefDescription": "Count of dispatched Ops from Decoder.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls1.fp_misc_rsrc_stall",
+    "EventCode": "0xae",
+    "BriefDescription": "FP Miscellaneous resource unavailable",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x80"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls1.fp_sch_rsrc_stall",
+    "EventCode": "0xae",
+    "BriefDescription": "FP scheduler resource stall.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls1.fp_reg_file_rsrc_stall",
+    "EventCode": "0xae",
+    "BriefDescription": "Floating point register file resource stall.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x20"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls1.taken_branch_buffer_rsrc_stall",
+    "EventCode": "0xae",
+    "BriefDescription": "",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x10"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls1.int_sched_misc_token_stall",
+    "EventCode": "0xae",
+    "BriefDescription": "Integer Scheduler miscellaneous resource stall.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls1.store_queue_token_stall",
+    "EventCode": "0xae",
+    "BriefDescription": "Store queue resource stall.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls1.load_queue_token_stall",
+    "EventCode": "0xae",
+    "BriefDescription": "Load queue resource stall.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls1.int_phy_reg_file_token_stall",
+    "EventCode": "0xae",
+    "BriefDescription": "Integer Physical Register File resource stall.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x1"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls0.sc_agu_dispatch_stall",
+    "EventCode": "0xaf",
+    "BriefDescription": "SC AGU dispatch stall.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x80"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls0.retire_token_stall",
+    "EventCode": "0xaf",
+    "BriefDescription": "RETIRE Tokens unavailable.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls0.agsq_token_stall",
+    "EventCode": "0xaf",
+    "BriefDescription": "AGSQ Tokens unavailable.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. AGSQ Tokens unavailable.",
+    "UMask": "0x20"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls0.alu_token_stall",
+    "EventCode": "0xaf",
+    "BriefDescription": "ALU tokens total unavailable.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. ALU tokens total unavailable.",
+    "UMask": "0x10"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls0.alsq3_0_token_stall",
+    "EventCode": "0xaf",
+    "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls0.alsq3_token_stall",
+    "EventCode": "0xaf",
+    "BriefDescription": "ALSQ 3 Tokens unavailable.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. ALSQ 3 Tokens unavailable.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls0.alsq2_token_stall",
+    "EventCode": "0xaf",
+    "BriefDescription": "ALSQ 2 Tokens unavailable.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. ALSQ 2 Tokens unavailable.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "de_dis_dispatch_token_stalls0.alsq1_token_stall",
+    "EventCode": "0xaf",
+    "BriefDescription": "ALSQ 1 Tokens unavailable.",
+    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. ALSQ 1 Tokens unavailable.",
+    "UMask": "0x1"
+  }
+]
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 82a9db00125e..244a36e37a3a 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -37,3 +37,4 @@ GenuineIntel-6-7D,v1,icelake,core
 GenuineIntel-6-7E,v1,icelake,core
 GenuineIntel-6-86,v1,tremontx,core
 AuthenticAMD-23-([12][0-9A-F]|[0-9A-F]),v1,amdzen1,core
+AuthenticAMD-23-[[:xdigit:]]+,v1,amdzen2,core
-- 
2.25.1
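For context, perf selects a vendor events directory by matching the CPUID string against the mapfile.csv rows, so the specific zen1 model range (00h-2Fh) must win over the broader zen2 pattern. A rough sketch of that selection logic in Python (patterns anchored and `[[:xdigit:]]` expanded to `[0-9A-F]` for Python's re module; perf's own matcher differs in detail):

```python
import re

# mapfile.csv rows, in file order: the specific zen1 model range
# (0x00-0x2F) precedes the zen2 catch-all entry.
mapfile = [
    (r"AuthenticAMD-23-([12][0-9A-F]|[0-9A-F])$", "amdzen1"),
    (r"AuthenticAMD-23-[0-9A-F]+$", "amdzen2"),
]

def events_dir(cpuid):
    # Return the events directory for the first row that matches.
    for pattern, directory in mapfile:
        if re.match(pattern, cpuid):
            return directory
    return None

# Model 0x18 (e.g. Ryzen 3400G, zen1) vs. model 0x71 (zen2)
print(events_dir("AuthenticAMD-23-18"))  # amdzen1
print(events_dir("AuthenticAMD-23-71"))  # amdzen2
```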


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/3] perf vendor events amd: update Zen1 events to V2
  2020-02-25 19:28 [PATCH v2 0/3] latest PMU events for zen1/zen2 Vijay Thakkar
  2020-02-25 19:28 ` [PATCH v2 1/3] perf vendor events amd: restrict model detection for zen1 based processors Vijay Thakkar
  2020-02-25 19:28 ` [PATCH v2 2/3] perf vendor events amd: add Zen2 events Vijay Thakkar
@ 2020-02-25 19:28 ` Vijay Thakkar
  2020-02-25 22:53   ` Kim Phillips
  2 siblings, 1 reply; 12+ messages in thread
From: Vijay Thakkar @ 2020-02-25 19:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Vijay Thakkar, Peter Zijlstra, Ingo Molnar, Kim Phillips,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

This patch updates the PMCs for AMD Zen1 core based processors (Family
17h; Models 0 through 2F) to be in accordance with PMCs as
documented in the latest versions of the AMD Processor Programming
Reference [1] and [2].

PMCs added:
fpu_pipe_assignment.dual{0|1|2|3}
fpu_pipe_assignment.total{0|1|2|3}
ls_mab_alloc.dc_prefetcher
ls_mab_alloc.stores
ls_mab_alloc.loads
bp_dyn_ind_pred
bp_de_redirect

PMC removed:
ex_ret_cond_misp

Cumulative counts, fpu_pipe_assignment.total and
fpu_pipe_assignment.dual, existed in v1, but did not expose the
port-level counters.

ex_ret_cond_misp has been removed because it no longer appears in the
latest versions of the PPR and, when tested on a Ryzen 3400G system,
always samples zero.

[1]: Processor Programming Reference (PPR) for AMD Family 17h Models
01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019.
[2]: Processor Programming Reference (PPR) for AMD Family 17h Model 18h,
Revision B1 Processors, 55570-B1 Rev 3.14 - Sep 26, 2019.
All of the PPRs can be found at:
https://bugzilla.kernel.org/show_bug.cgi?id=206537

Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>

---
Changes in v2:
    - Correct the UMasks for fpu_pipe_assignment.dual* by left shifting
    all by 4 bits.
    - Correct UMask for ls_mab_alloc.loads
    - add bp_dyn_ind_pred (PMC0x08E)
    - add bp_de_redirect  (PMC0x091)
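(Illustrative aside: the 4-bit left shift described above can be checked numerically. If the per-pipe dual* umasks are correct, they should OR together into the aggregate fpu_pipe_assignment.dual mask (0xf0), just as the total* umasks OR into 0xf. A minimal sketch with the values from this patch:)

```python
# Per-pipe unit masks as added in this patch; the aggregate events
# carry the OR of their four per-pipe masks.
total = {"total0": 0x1, "total1": 0x2, "total2": 0x4, "total3": 0x8}
dual = {f"dual{i}": total[f"total{i}"] << 4 for i in range(4)}  # v2 fix: << 4

agg_total = 0
for mask in total.values():
    agg_total |= mask
agg_dual = 0
for mask in dual.values():
    agg_dual |= mask

print(hex(agg_total), hex(agg_dual))  # 0xf 0xf0
```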
 
 .../pmu-events/arch/x86/amdzen1/branch.json   | 11 ++++
 .../pmu-events/arch/x86/amdzen1/core.json     |  5 --
 .../arch/x86/amdzen1/floating-point.json      | 56 +++++++++++++++++++
 .../pmu-events/arch/x86/amdzen1/memory.json   | 18 ++++++
 tools/perf/pmu-events/arch/x86/mapfile.csv    |  2 +-
 5 files changed, 86 insertions(+), 6 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/branch.json b/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
index 93ddfd8053ca..a9943eeb8d6b 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
@@ -8,5 +8,16 @@
     "EventName": "bp_l2_btb_correct",
     "EventCode": "0x8b",
     "BriefDescription": "L2 BTB Correction."
+  },
+  {
+    "EventName": "bp_dyn_ind_pred",
+    "EventCode": "0x8e",
+    "BriefDescription": "Dynamic Indirect Predictions.",
+    "PublicDescription": "Indirect Branch Prediction for potential multi-target branch (speculative)."
+  },
+  {
+    "EventName": "bp_de_redirect",
+    "EventCode": "0x91",
+    "BriefDescription": "Decoder Overrides Existing Branch Prediction (speculative)."
   }
 ]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/core.json b/tools/perf/pmu-events/arch/x86/amdzen1/core.json
index 1079544eeed5..38994fb4b625 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen1/core.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/core.json
@@ -90,11 +90,6 @@
     "EventCode": "0xd1",
     "BriefDescription": "Retired Conditional Branch Instructions."
   },
-  {
-    "EventName": "ex_ret_cond_misp",
-    "EventCode": "0xd2",
-    "BriefDescription": "Retired Conditional Branch Instructions Mispredicted."
-  },
   {
     "EventName": "ex_div_busy",
     "EventCode": "0xd3",
diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/floating-point.json b/tools/perf/pmu-events/arch/x86/amdzen1/floating-point.json
index ea4711983d1d..351ebf00bd21 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen1/floating-point.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/floating-point.json
@@ -6,6 +6,34 @@
     "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3.",
     "UMask": "0xf0"
   },
+  {
+    "EventName": "fpu_pipe_assignment.dual3",
+    "EventCode": "0x00",
+    "BriefDescription": "Total number multi-pipe uOps to pipe 3.",
+    "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3.",
+    "UMask": "0x80"
+  },
+  {
+    "EventName": "fpu_pipe_assignment.dual2",
+    "EventCode": "0x00",
+    "BriefDescription": "Total number multi-pipe uOps to pipe 2.",
+    "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3.",
+    "UMask": "0x40"
+  },
+  {
+    "EventName": "fpu_pipe_assignment.dual1",
+    "EventCode": "0x00",
+    "BriefDescription": "Total number multi-pipe uOps to pipe 1.",
+    "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3.",
+    "UMask": "0x20"
+  },
+  {
+    "EventName": "fpu_pipe_assignment.dual0",
+    "EventCode": "0x00",
+    "BriefDescription": "Total number multi-pipe uOps to pipe 0.",
+    "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3.",
+    "UMask": "0x10"
+  },
   {
     "EventName": "fpu_pipe_assignment.total",
     "EventCode": "0x00",
@@ -13,6 +41,34 @@
     "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to Pipe 3.",
     "UMask": "0xf"
   },
+  {
+    "EventName": "fpu_pipe_assignment.total3",
+    "EventCode": "0x00",
+    "BriefDescription": "Total number of fp uOps on pipe 3.",
+    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "fpu_pipe_assignment.total2",
+    "EventCode": "0x00",
+    "BriefDescription": "Total number of fp uOps on pipe 2.",
+    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",
+    "UMask": "0x4"
+  },
+  {
+    "EventName": "fpu_pipe_assignment.total1",
+    "EventCode": "0x00",
+    "BriefDescription": "Total number of fp uOps on pipe 1.",
+    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "fpu_pipe_assignment.total0",
+    "EventCode": "0x00",
+    "BriefDescription": "Total number of fp uOps  on pipe 0.",
+    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",
+    "UMask": "0x1"
+  },
   {
     "EventName": "fp_sched_empty",
     "EventCode": "0x01",
diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/memory.json b/tools/perf/pmu-events/arch/x86/amdzen1/memory.json
index fa2d60d4def0..9206a1a131fa 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen1/memory.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/memory.json
@@ -37,6 +37,24 @@
     "EventCode": "0x40",
     "BriefDescription": "The number of accesses to the data cache for load and store references. This may include certain microcode scratchpad accesses, although these are generally rare. Each increment represents an eight-byte access, although the instruction may only be accessing a portion of that. This event is a speculative event."
   },
+  {
+    "EventName": "ls_mab_alloc.dc_prefetcher",
+    "EventCode": "0x41",
+    "BriefDescription": "Data cache prefetcher miss.",
+    "UMask": "0x8"
+  },
+  {
+    "EventName": "ls_mab_alloc.stores",
+    "EventCode": "0x41",
+    "BriefDescription": "Data cache store miss.",
+    "UMask": "0x2"
+  },
+  {
+    "EventName": "ls_mab_alloc.loads",
+    "EventCode": "0x41",
+    "BriefDescription": "Data cache load miss.",
+    "UMask": "0x01"
+  },
   {
     "EventName": "ls_l1_d_tlb_miss.all",
     "EventCode": "0x45",
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 244a36e37a3a..25b06cf98747 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -36,5 +36,5 @@ GenuineIntel-6-55-[56789ABCDEF],v1,cascadelakex,core
 GenuineIntel-6-7D,v1,icelake,core
 GenuineIntel-6-7E,v1,icelake,core
 GenuineIntel-6-86,v1,tremontx,core
-AuthenticAMD-23-([12][0-9A-F]|[0-9A-F]),v1,amdzen1,core
+AuthenticAMD-23-([12][0-9A-F]|[0-9A-F]),v2,amdzen1,core
 AuthenticAMD-23-[[:xdigit:]]+,v1,amdzen2,core
-- 
2.25.1



* Re: [PATCH v2 3/3] perf vendor events amd: update Zen1 events to V2
  2020-02-25 19:28 ` [PATCH v2 3/3] perf vendor events amd: update Zen1 events to V2 Vijay Thakkar
@ 2020-02-25 22:53   ` Kim Phillips
  2020-02-27 20:00     ` Vijay Thakkar
  0 siblings, 1 reply; 12+ messages in thread
From: Kim Phillips @ 2020-02-25 22:53 UTC (permalink / raw)
  To: Vijay Thakkar, Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Martin Liška, Jon Grimm, linux-kernel,
	linux-perf-users

Hi Vijay,

Thanks for your resubmission.

On 2/25/20 1:28 PM, Vijay Thakkar wrote:
> [1]: Processor Programming Reference (PPR) for AMD Family 17h Models
> 01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019.
> [2]: Processor Programming Reference (PPR) for AMD Family 17h Model 18h,
> Revision B1 Processors, 55570-B1 Rev 3.14 - Sep 26, 2019.

Events such as the FPU pipe assignment ones are not
included in the above docs.  So can you add this one
to your list of references, since it has them listed?:

OSRR for AMD Family 17h processors, Models 00h-2Fh, 56255 Rev 3.03 - July, 2018

> +++ b/tools/perf/pmu-events/arch/x86/amdzen1/floating-point.json
> @@ -6,6 +6,34 @@
>      "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3.",

Omit the trailing " to Pipe 3.", since this one's umask
represents all pipes.  Feel free to add "all pipes" instead.
I realize this isn't a line you're adding, but since we're
here, we might as well fix it.

>      "UMask": "0xf0"
>    },
> +  {
> +    "EventName": "fpu_pipe_assignment.dual3",
> +    "EventCode": "0x00",
> +    "BriefDescription": "Total number multi-pipe uOps to pipe 3.",
> +    "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3."
This one is ok.

> +    "UMask": "0x80"
> +  },
> +  {
> +    "EventName": "fpu_pipe_assignment.dual2",
> +    "EventCode": "0x00",
> +    "BriefDescription": "Total number multi-pipe uOps to pipe 2.",
> +    "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3.",

That trailing part of the string should say ..." to Pipe 2." , not Pipe 3.

> +    "UMask": "0x40"
> +  },
> +  {
> +    "EventName": "fpu_pipe_assignment.dual1",
> +    "EventCode": "0x00",
> +    "BriefDescription": "Total number multi-pipe uOps to pipe 1.",
> +    "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3.",
> +    "UMask": "0x20"
> +  },

That trailing part of the string should say ..." to Pipe 1." , not Pipe 3.

> +  {
> +    "EventName": "fpu_pipe_assignment.dual0",
> +    "EventCode": "0x00",
> +    "BriefDescription": "Total number multi-pipe uOps to pipe 0.",
> +    "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number multi-pipe uOps assigned to Pipe 3.",

That trailing part of the string should say ..." to Pipe 0." , not Pipe 3.

> +    "UMask": "0x10"
> +  },
>    {
>      "EventName": "fpu_pipe_assignment.total",
>      "EventCode": "0x00",
> @@ -13,6 +41,34 @@
>      "PublicDescription": "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to Pipe 3.",

Omit the trailing " to Pipe 3.", since this one's umask represents all pipes.

>      "UMask": "0xf"
>    },
> +  {
> +    "EventName": "fpu_pipe_assignment.total3",
> +    "EventCode": "0x00",
> +    "BriefDescription": "Total number of fp uOps on pipe 3.",
> +    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",

Please concatenate " Total3: Total number uOps assigned to Pipe 3." to the above string.

> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "fpu_pipe_assignment.total2",
> +    "EventCode": "0x00",
> +    "BriefDescription": "Total number of fp uOps on pipe 2.",
> +    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",

same here, but for pipe 2.

> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "fpu_pipe_assignment.total1",
> +    "EventCode": "0x00",
> +    "BriefDescription": "Total number of fp uOps on pipe 1.",
> +    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",

and here.

> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "fpu_pipe_assignment.total0",
> +    "EventCode": "0x00",
> +    "BriefDescription": "Total number of fp uOps  on pipe 0.",
> +    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",
> +    "UMask": "0x1"
> +  },

and here.

> +++ b/tools/perf/pmu-events/arch/x86/amdzen1/memory.json
> @@ -37,6 +37,24 @@
>      "EventCode": "0x40",
>      "BriefDescription": "The number of accesses to the data cache for load and store references. This may include certain microcode scratchpad accesses, although these are generally rare. Each increment represents an eight-byte access, although the instruction may only be accessing a portion of that. This event is a speculative event."
>    },
> +  {
> +    "EventName": "ls_mab_alloc.dc_prefetcher",
> +    "EventCode": "0x41",
> +    "BriefDescription": "Data cache prefetcher miss.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "ls_mab_alloc.stores",
> +    "EventCode": "0x41",
> +    "BriefDescription": "Data cache store miss.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_mab_alloc.loads",
> +    "EventCode": "0x41",
> +    "BriefDescription": "Data cache load miss.",
> +    "UMask": "0x01"
> +  },

Hm, PMCx041 didn't exist when I wrote commit 0e3b74e26280
"perf/x86/amd: Update generic hardware cache events for Family 17h",
and their counts don't seem to match up very well when running
various workloads.  The microarchitecture is likely to have changed
in this area from families prior to 17h, so a MAB alloc can likely
count different events than what is presumed here: a Data cache
load/store/prefetch miss.

I think it's safer to just leave the PPR text "LS MAB Allocates
by Type" as-is, instead of assuming they are L1 load/store misses.
What do you think?
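(Illustrative aside: one way to cross-check a sub-event split like this is to compare the OR of the sub-event umasks against the parent event counted via a raw encoding. Assuming the usual AMD PERF_CTL layout -- EventSelect[7:0] in bits 7:0, UnitMask in bits 15:8, EventSelect[11:8] in bits 35:32 -- a raw config for perf stat -e cpu/config=.../ can be built like so:)

```python
def amd_raw_config(event: int, umask: int) -> int:
    """Build a raw perf config for AMD Family 17h, assuming the
    PERF_CTL layout: EventSelect[7:0] in bits 7:0, UnitMask in
    bits 15:8, EventSelect[11:8] in bits 35:32."""
    assert 0 <= event <= 0xFFF and 0 <= umask <= 0xFF
    return (event & 0xFF) | (umask << 8) | ((event >> 8) << 32)

# PMCx041 with the three proposed sub-event umasks OR'd together:
# if the split were complete, this should track the parent count.
print(hex(amd_raw_config(0x41, 0x8 | 0x2 | 0x1)))  # 0xb41
```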

I'll review patches 1-2 tomorrow.

Thanks,

Kim


* Re: [PATCH v2 2/3] perf vendor events amd: add Zen2 events
  2020-02-25 19:28 ` [PATCH v2 2/3] perf vendor events amd: add Zen2 events Vijay Thakkar
@ 2020-02-26 22:09   ` Kim Phillips
  2020-02-28 16:00     ` Vijay Thakkar
  2020-02-28 17:34     ` Vijay Thakkar
  0 siblings, 2 replies; 12+ messages in thread
From: Kim Phillips @ 2020-02-26 22:09 UTC (permalink / raw)
  To: Vijay Thakkar, Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Martin Liška, Jon Grimm, linux-kernel,
	linux-perf-users

On 2/25/20 1:28 PM, Vijay Thakkar wrote:
<snip>
> +  {
> +    "EventName": "bp_l1_tlb_fetch_hit",
> +    "EventCode": "0x94",
> +    "BriefDescription": "All instruction fetches.",

Nit picking here, but "All instruction fetches" doesn't occur in
PMCx094's description.  I'm looking at both Model 71h and 31h
documentation.  This event specifically calls out the context of
the L1 ITLB and its *hits*, i.e., specifically this text:

> +    "PublicDescription": "The number of instruction fetches that hit in the L1 ITLB.",

So this is the text that should probably be in the BriefDescription
as well, but if both BriefDescription and PublicDescription are going
to be equal, just provide the BriefDescription version, since it
shows with and without the -v and --details flags to perf list.

> +    "UMask": "0xFF"
> +  },
> +  {
> +    "EventName": "bp_l1_tlb_fetch_hit.if1g",
> +    "EventCode": "0x94",
> +    "BriefDescription": "Instuction fetches to a 1GB page.",
                            ^^^^^^^^^^ Instruction

> +    "PublicDescription": "The number of instruction fetches that hit in the L1 ITLB.",

"The number of instruction fetches that hit in the L1 ITLB.  Instruction fetches to a 1 GB page."

> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "bp_l1_tlb_fetch_hit.if2m",
> +    "EventCode": "0x94",
> +    "BriefDescription": "Instuction fetches to a 2MB page.",
                            ^^^^^^^^^^ Instruction

> +    "PublicDescription": "The number of instruction fetches that hit in the L1 ITLB.",

Concatenate the BriefDescription text here.

> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "bp_l1_tlb_fetch_hit.if4k",
> +    "EventCode": "0x94",
> +    "BriefDescription": "Instuction fetches to a 4KB page.",
                            ^^^^^^^^^^ Instruction

> +    "PublicDescription": "The number of instruction fetches that hit in the L1 ITLB.",

Concatenate the BriefDescription text here.

<snip>

> diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/cache.json b/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
> new file mode 100644
> index 000000000000..aee22537b711
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
> @@ -0,0 +1,375 @@
> +[
> +  {
> +    "EventName": "l2_request_g1.rd_blk_l",
> +    "EventCode": "0x60",
> +    "BriefDescription": "Requests to L2 Group1.",
> +    "PublicDescription": "Requests to L2 Group1.",
> +    "UMask": "0x80"
> +  },

Where is the "Data Cache Reads (including hardware and software prefetch)."
text from the document(s)?

> +  {
> +    "EventName": "l2_request_g1.rd_blk_x",
> +    "EventCode": "0x60",
> +    "BriefDescription": "Requests to L2 Group1.",
> +    "PublicDescription": "Requests to L2 Group1.",
> +    "UMask": "0x40"
> +  },

Here too: missing unit mask description text.

> +  {
> +    "EventName": "l2_request_g1.ls_rd_blk_c_s",
> +    "EventCode": "0x60",
> +    "BriefDescription": "Requests to L2 Group1.",
> +    "PublicDescription": "Requests to L2 Group1.",
> +    "UMask": "0x20"
> +  },
> +  {
> +    "EventName": "l2_request_g1.cacheable_ic_read",
> +    "EventCode": "0x60",
> +    "BriefDescription": "Requests to L2 Group1.",
> +    "PublicDescription": "Requests to L2 Group1.",
> +    "UMask": "0x10"
> +  },
> +  {
> +    "EventName": "l2_request_g1.change_to_x",
> +    "EventCode": "0x60",
> +    "BriefDescription": "Requests to L2 Group1.",
> +    "PublicDescription": "Requests to L2 Group1.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "l2_request_g1.prefetch_l2",
> +    "EventCode": "0x60",
> +    "BriefDescription": "Requests to L2 Group1.",
> +    "PublicDescription": "Requests to L2 Group1.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "l2_request_g1.l2_hw_pf",
> +    "EventCode": "0x60",
> +    "BriefDescription": "Requests to L2 Group1.",
> +    "PublicDescription": "Requests to L2 Group1.",
> +    "UMask": "0x2"
> +  },

same with all the above.

> +  {
> +    "EventName": "l2_request_g1.other_requests",
> +    "EventCode": "0x60",
> +    "BriefDescription": "Events covered by l2_request_g2.",
> +    "PublicDescription": "Requests to L2 Group1. Events covered by l2_request_g2.",
> +    "UMask": "0x1"
> +  },

This text doesn't match the text in either of these:

PPR for AMD Family 17h Model 71h B0 - 56176 Rev 3.06 - Jul 17, 2019
PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 - Sep 12, 2019

The text in the above is an improved version from the Zen 1 variants.
Care to use the text from these documents?

> +  {
> +    "EventName": "l2_request_g2.group1",
> +    "EventCode": "0x61",
> +    "BriefDescription": "All Group 1 commands not in unit0.",
> +    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous. All Group 1 commands not in unit0.",
> +    "UMask": "0x80"
> +  },
> +  {
> +    "EventName": "l2_request_g2.ls_rd_sized",
> +    "EventCode": "0x61",
> +    "BriefDescription": "RdSized, RdSized32, RdSized64.",
> +    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous. RdSized, RdSized32, RdSized64.",
> +    "UMask": "0x40"
> +  },
> +  {
> +    "EventName": "l2_request_g2.ls_rd_sized_nc",
> +    "EventCode": "0x61",
> +    "BriefDescription": "RdSizedNC, RdSized32NC, RdSized64NC.",
> +    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous. RdSizedNC, RdSized32NC, RdSized64NC.",
> +    "UMask": "0x20"
> +  },
> +  {
> +    "EventName": "l2_request_g2.ic_rd_sized",
> +    "EventCode": "0x61",
> +    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "UMask": "0x10"
> +  },
> +  {
> +    "EventName": "l2_request_g2.ic_rd_sized_nc",
> +    "EventCode": "0x61",
> +    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "l2_request_g2.smc_inval",
> +    "EventCode": "0x61",
> +    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "l2_request_g2.bus_locks_originator",
> +    "EventCode": "0x61",
> +    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "l2_request_g2.bus_locks_responses",
> +    "EventCode": "0x61",
> +    "BriefDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "PublicDescription": "Multi-events in that LS and IF requests can be received simultaneous.",
> +    "UMask": "0x1"
> +  },

that last comment applies to all the above.

> +  {
> +    "EventName": "l2_latency.l2_cycles_waiting_on_fills",
> +    "EventCode": "0x62",
> +    "BriefDescription": "Total cycles spent waiting for L2 fills to complete from L3 or memory, divided by four. Event counts are for both threads. To calculate average latency, the number of fills from both threads must be used.",
> +    "PublicDescription": "Total cycles spent waiting for L2 fills to complete from L3 or memory, divided by four. Event counts are for both threads. To calculate average latency, the number of fills from both threads must be used.",
> +    "UMask": "0x1"
> +  },
> +  {
> +    "EventName": "l2_wcb_req.wcb_write",
> +    "EventCode": "0x63",
> +    "PublicDescription": "LS (Load/Store unit) to L2 WCB (Write Combining Buffer) write requests.",
> +    "BriefDescription": "LS to L2 WCB write requests.",
> +    "UMask": "0x40"
> +  },
> +  {
> +    "EventName": "l2_wcb_req.wcb_close",
> +    "EventCode": "0x63",
> +    "BriefDescription": "LS to L2 WCB close requests.",
> +    "PublicDescription": "LS (Load/Store unit) to L2 WCB (Write Combining Buffer) close requests.",
> +    "UMask": "0x20"
> +  },
> +  {
> +    "EventName": "l2_wcb_req.zero_byte_store",
> +    "EventCode": "0x63",
> +    "BriefDescription": "LS to L2 WCB zero byte store requests.",
> +    "PublicDescription": "LS (Load/Store unit) to L2 WCB (Write Combining Buffer) zero byte store requests.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "l2_wcb_req.cl_zero",
> +    "EventCode": "0x63",
> +    "PublicDescription": "LS to L2 WCB cache line zeroing requests.",
> +    "BriefDescription": "LS (Load/Store unit) to L2 WCB (Write Combining Buffer) cache line zeroing requests.",
> +    "UMask": "0x1"
> +  },


> +  {
> +    "EventName": "l2_cache_req_stat.ls_rd_blk_cs",
> +    "EventCode": "0x64",
> +    "BriefDescription": "LS ReadBlock C/S Hit.",
> +    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LS ReadBlock C/S Hit.",
> +    "UMask": "0x80"
> +  },

This text doesn't match that of the updated documents: please update it.

> +  {
> +    "EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_x",
> +    "EventCode": "0x64",
> +    "BriefDescription": "LS Read Block L Hit X.",
> +    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LS Read Block L Hit X.",
> +    "UMask": "0x40"
> +  },
> +  {
> +    "EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_s",
> +    "EventCode": "0x64",
> +    "BriefDescription": "LsRdBlkL Hit Shared.",
> +    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LsRdBlkL Hit Shared.",
> +    "UMask": "0x20"
> +  },
> +  {
> +    "EventName": "l2_cache_req_stat.ls_rd_blk_x",
> +    "EventCode": "0x64",
> +    "BriefDescription": "LsRdBlkX/ChgToX Hit X.  Count RdBlkX finding Shared as a Miss.",
> +    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LsRdBlkX/ChgToX Hit X.  Count RdBlkX finding Shared as a Miss.",
> +    "UMask": "0x10"
> +  },
> +  {
> +    "EventName": "l2_cache_req_stat.ls_rd_blk_c",
> +    "EventCode": "0x64",
> +    "BriefDescription": "LS Read Block C S L X Change to X Miss.",
> +    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. LS Read Block C S L X Change to X Miss.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "l2_cache_req_stat.ic_fill_hit_x",
> +    "EventCode": "0x64",
> +    "BriefDescription": "IC Fill Hit Exclusive Stale.",
> +    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. IC Fill Hit Exclusive Stale.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "l2_cache_req_stat.ic_fill_hit_s",
> +    "EventCode": "0x64",
> +    "BriefDescription": "IC Fill Hit Shared.",
> +    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. IC Fill Hit Shared.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "l2_cache_req_stat.ic_fill_miss",
> +    "EventCode": "0x64",
> +    "BriefDescription": "IC Fill Miss.",
> +    "PublicDescription": "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher. IC Fill Miss.",
> +    "UMask": "0x1"
> +  },

that last comment applies to all unit masks until here.

> +  {
> +    "EventName": "l2_pf_hit_l2",
> +    "EventCode": "0x70",
> +    "BriefDescription": "All L2 prefetches accepted by the L2 pipeline which hit the L2."
> +  },
> +  {
> +    "EventName": "l2_pf_miss_l2_hit_l3",
> +    "EventCode": "0x71",
> +    "BriefDescription": "All L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3."
> +  },
> +  {
> +    "EventName": "l2_pf_miss_l2_l3",
> +    "EventCode": "0x72",
> +    "BriefDescription": "All L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches."
> +  },

These events produce zero counts on Zen 2;
they need a "UMask": "0xff".
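For example, the first of the three would become something like this (sketch only; the 0x71 and 0x72 entries need the same "UMask" addition):

```json
  {
    "EventName": "l2_pf_hit_l2",
    "EventCode": "0x70",
    "BriefDescription": "All L2 prefetches accepted by the L2 pipeline which hit the L2.",
    "UMask": "0xff"
  },
```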

<snip>

> +  {
> +    "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if1g",
> +    "EventCode": "0x85",
> +    "BriefDescription": "Instruction fetches to a 1GB page.",
> +    "PublicDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if2m",
> +    "EventCode": "0x85",
> +    "BriefDescription": "Instruction fetches to a 2MB page.",
> +    "PublicDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if4k",
> +    "EventCode": "0x85",
> +    "BriefDescription": "Instruction fetches to a 4KB page.",
> +    "PublicDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs.",
> +    "UMask": "0x1"
> +  },

The BriefDescription text needs to be concatenated to the
PublicDescription text for all of the above events.
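For instance, the if4k entry would then read something like (sketch):

```json
  {
    "EventName": "bp_l1_tlb_miss_l2_tlb_miss.if4k",
    "EventCode": "0x85",
    "BriefDescription": "Instruction fetches to a 4KB page.",
    "PublicDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs. Instruction fetches to a 4KB page.",
    "UMask": "0x1"
  },
```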

<snip>

> +  {
> +    "EventName": "ex_ret_cops",
> +    "EventCode": "0xc1",
> +    "BriefDescription": "Retired Uops.",
> +    "PublicDescription": "The number of uOps retired. This includes all processor activity (instructions, exceptions, interrupts, microcode assists, etc.). The number of events logged per cycle can vary from 0 to 4."
> +  },

This text is not up to date with the latest documents.

> +  {
> +    "EventName": "ex_ret_fus_brnch_inst",
> +    "EventCode": "0x1d0",
> +    "BriefDescription": "The number of fused retired branch instructions retired per cycle. The number of events logged per cycle can vary from 0 to 3."
> +  }

This event's description text has been updated in the
latest documents; please use that text instead.

> diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json b/tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json
> new file mode 100644
> index 000000000000..df530b398f9d
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/x86/amdzen2/floating-point.json
> @@ -0,0 +1,128 @@
> +[
> +  {
> +    "EventName": "fpu_pipe_assignment.total",
> +    "EventCode": "0x00",
> +    "BriefDescription": "Total number of fp uOps.",
> +    "PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",
> +    "UMask": "0xf"
> +  },
> +  {
> +    "EventName": "fp_ret_sse_avx_ops.all",
> +    "EventCode": "0x03",
> +    "BriefDescription": "All FLOPS.",
> +    "PublicDescription": "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15.",
> +    "UMask": "0xff"
> +  },

The BriefDescription text needs to be concatenated to the
PublicDescription text for all of the above events.

<snip>
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/memory.json b/tools/perf/pmu-events/arch/x86/amdzen2/memory.json
> new file mode 100644
> index 000000000000..5c0f80588c61
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/x86/amdzen2/memory.json
> @@ -0,0 +1,349 @@
> +[
> +  {
> +    "EventName": "ls_bad_status2.stli_other",
> +    "EventCode": "0x24",
> +    "BriefDescription": "Non-forwardable conflict; used to reduce STLI's via software. All reasons.",
> +    "PublicDescription": "Store To Load Interlock (STLI) are loads that were unable to complete because of a possible match with an older store, and the older store could not do STLF for some reason. There are a number of reasons why this occurs, and this perfmon organizes them into three major groups. ",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_locks.spec_lock_hi_spec",
> +    "EventCode": "0x25",
> +    "BriefDescription": "High speculative cacheable lock speculation succeeded.",
> +    "PublicDescription": "Retired lock instructions.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "ls_locks.spec_lock_lo_spec",
> +    "EventCode": "0x25",
> +    "BriefDescription": "Low speculative cacheable lock speculation succeeded.",
> +    "PublicDescription": "Retired lock instructions.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "ls_locks.non_spec_lock",
> +    "EventCode": "0x25",
> +    "BriefDescription": "Non-speculative lock succeeded.",
> +    "PublicDescription": "Retired lock instructions.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_locks.bus_lock",
> +    "EventCode": "0x25",
> +    "BriefDescription": "Comparable to legacy bus lock.",
> +    "PublicDescription": "Retired lock instructions. Bus lock when a locked operations crosses a cache boundary or is done on an uncacheable memory type.",
> +    "UMask": "0x1"
> +  },

The BriefDescription text needs to be concatenated to the
PublicDescription text for all of the above events.

> +  {
> +    "EventName": "ls_ret_cl_flush",
> +    "EventCode": "0x26",
> +    "BriefDescription": "Number of retired CLFLUSH instructions."
> +  },
> +  {
> +    "EventName": "ls_ret_cpuid",
> +    "EventCode": "0x27",
> +    "BriefDescription": "Number of retired CLFLUSH instructions."

This text should read "The number of CPUID instructions retired."
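i.e., something like:

```json
  {
    "EventName": "ls_ret_cpuid",
    "EventCode": "0x27",
    "BriefDescription": "The number of CPUID instructions retired."
  },
```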

> +  },
> +  {
> +    "EventName": "ls_dispatch.ld_st_dispatch",
> +    "EventCode": "0x29",
> +    "BriefDescription": "Number of single ops that do load/store to an address.",
> +    "PublicDescription": "Dispatch of a single op that performs a load from and store to the same memory address.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "ls_dispatch.store_dispatch",
> +    "EventCode": "0x29",
> +    "BriefDescription": "Number of stores dispatched.",
> +    "PublicDescription": "Counts the number of operations dispatched to the LS unit. Unit Masks ADDed.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_dispatch.ld_dispatch",
> +    "EventCode": "0x29",
> +    "BriefDescription": "Number of loads dispatched.",
> +    "PublicDescription": "Counts the number of operations dispatched to the LS unit. Unit Masks ADDed.",
> +    "UMask": "0x1"
> +  },

The BriefDescription text needs to be concatenated to the
PublicDescription text for all of the above events.

<snip>

> +  {
> +    "EventName": "ls_refills_from_sys.ls_mabresp_rmt_dram",
> +    "EventCode": "0x43",
> +    "BriefDescription": "DRAM or IO from different die.",
> +    "PublicDescription": "Demand Data Cache Fills by Data Source.",
> +    "UMask": "0x40"
> +  },
> +  {
> +    "EventName": "ls_refills_from_sys.ls_mabresp_rmt_cache",
> +    "EventCode": "0x43",
> +    "BriefDescription": "Hit in cache; Remote CCX and the address's Home Node is on a different die.",
> +    "PublicDescription": "Demand Data Cache Fills by Data Source.",
> +    "UMask": "0x10"
> +  },
> +  {
> +    "EventName": "ls_refills_from_sys.ls_mabresp_lcl_dram",
> +    "EventCode": "0x43",
> +    "BriefDescription": "DRAM or IO from this thread's die.",
> +    "PublicDescription": "Demand Data Cache Fills by Data Source.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "ls_refills_from_sys.ls_mabresp_lcl_cache",
> +    "EventCode": "0x43",
> +    "BriefDescription": "Hit in cache; local CCX (not Local L2), or Remote CCX and the address's Home Node is on this thread's die.",
> +    "PublicDescription": "Demand Data Cache Fills by Data Source.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_refills_from_sys.ls_mabresp_lcl_l2",
> +    "EventCode": "0x43",
> +    "BriefDescription": "Local L2 hit.",
> +    "PublicDescription": "Demand Data Cache Fills by Data Source.",
> +    "UMask": "0x1"
> +  },

The BriefDescription text needs to be concatenated to the
PublicDescription text for all of the above events.

> +  {
> +    "EventName": "ls_l1_d_tlb_miss.all",
> +    "EventCode": "0x45",
> +    "BriefDescription": "L1 DTLB Miss or Reload off all sizes.",
> +    "PublicDescription": "L1 DTLB Miss or Reload off all sizes.",
> +    "UMask": "0xff"
> +  },
> +  {
> +    "EventName": "ls_l1_d_tlb_miss.tlb_reload_1g_l2_miss",
> +    "EventCode": "0x45",
> +    "BriefDescription": "L1 DTLB Miss of a page of 1G size.",
> +    "PublicDescription": "L1 DTLB Miss of a page of 1G size.",
> +    "UMask": "0x80"
> +  },
> +  {
> +    "EventName": "ls_l1_d_tlb_miss.tlb_reload_2m_l2_miss",
> +    "EventCode": "0x45",
> +    "BriefDescription": "L1 DTLB Miss of a page of 2M size.",
> +    "PublicDescription": "L1 DTLB Miss of a page of 2M size.",
> +    "UMask": "0x40"
> +  },
> +  {
> +    "EventName": "ls_l1_d_tlb_miss.tlb_reload_32k_l2_miss",
> +    "EventCode": "0x45",
> +    "BriefDescription": "L1 DTLB Miss of a page of 32K size.",
> +    "PublicDescription": "L1 DTLB Miss of a page of 32K size.",
> +    "UMask": "0x20"
> +  },
> +  {
> +    "EventName": "ls_l1_d_tlb_miss.tlb_reload_4k_l2_miss",
> +    "EventCode": "0x45",
> +    "BriefDescription": "L1 DTLB Miss of a page of 4K size.",
> +    "PublicDescription": "L1 DTLB Miss of a page of 4K size.",
> +    "UMask": "0x10"
> +  },
> +  {
> +    "EventName": "ls_l1_d_tlb_miss.tlb_reload_1g_l2_hit",
> +    "EventCode": "0x45",
> +    "BriefDescription": "L1 DTLB Reload of a page of 1G size.",
> +    "PublicDescription": "L1 DTLB Reload of a page of 1G size.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "ls_l1_d_tlb_miss.tlb_reload_2m_l2_hit",
> +    "EventCode": "0x45",
> +    "BriefDescription": "L1 DTLB Reload of a page of 2M size.",
> +    "PublicDescription": "L1 DTLB Reload of a page of 2M size.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "ls_l1_d_tlb_miss.tlb_reload_32k_l2_hit",
> +    "EventCode": "0x45",
> +    "BriefDescription": "L1 DTLB Reload of a page of 32K size.",
> +    "PublicDescription": "L1 DTLB Reload of a page of 32K size.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_l1_d_tlb_miss.tlb_reload_4k_l2_hit",
> +    "EventCode": "0x45",
> +    "BriefDescription": "L1 DTLB Reload of a page of 4K size.",
> +    "PublicDescription": "L1 DTLB Reload of a page of 4K size.",
> +    "UMask": "0x1"
> +  },

The above are not up to date, e.g., unit mask 0x2 is now
TlbReloadCoalescedPageHit.

Also, wherever BriefDescription equals PublicDescription, 
please only provide the BriefDescription.
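So, for example, the all-sizes entry would collapse to something like (sketch; modulo whatever the updated unit mask text turns out to be):

```json
  {
    "EventName": "ls_l1_d_tlb_miss.all",
    "EventCode": "0x45",
    "BriefDescription": "L1 DTLB Miss or Reload off all sizes.",
    "UMask": "0xff"
  },
```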

> +  {
> +    "EventName": "ls_pref_instr_disp.prefetch_nta",
> +    "EventCode": "0x4b",
> +    "BriefDescription": "Software Prefetch Instructions (PREFETCHNTA instruction) Dispatched.",
> +    "PublicDescription": "Software Prefetch Instructions (PREFETCHNTA instruction) Dispatched.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "ls_pref_instr_disp.store_prefetch_w",
> +    "EventCode": "0x4b",
> +    "BriefDescription": "Software Prefetch Instructions (3DNow PREFETCHW instruction) Dispatched.",
> +    "PublicDescription": "Software Prefetch Instructions (3DNow PREFETCHW instruction) Dispatched.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_pref_instr_disp.load_prefetch_w",
> +    "EventCode": "0x4b",
> +    "BriefDescription": "Prefetch, Prefetch_T0_T1_T2.",
> +    "PublicDescription": "Software Prefetch Instructions Dispatched. Prefetch, Prefetch_T0_T1_T2.",
> +    "UMask": "0x1"
> +  },

The text for the above events has been updated in the documents;
please update it here, too.

> +  {
> +    "EventName": "ls_inef_sw_pref.mab_mch_cnt",
> +    "EventCode": "0x52",
> +    "BriefDescription": "Software PREFETCH instruction saw a match on an already-allocated miss request buffer.",
> +    "PublicDescription": "The number of software prefetches that did not fetch data outside of the processor core.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_inef_sw_pref.data_pipe_sw_pf_dc_hit",
> +    "EventCode": "0x52",
> +    "BriefDescription": "Software PREFETCH instruction saw a DC hit.",
> +    "PublicDescription": "The number of software prefetches that did not fetch data outside of the processor core.",
> +    "UMask": "0x1"
> +  },
> +  {
> +    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_rmt_dram",
> +    "EventCode": "0x59",
> +    "BriefDescription": "DRAM or IO from different die.",
> +    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x40"
> +  },
> +  {
> +    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_rmt_cache",
> +    "EventCode": "0x59",
> +    "BriefDescription": "Hit in cache; Remote CCX and the address's Home Node is on a different die.",
> +    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x10"
> +  },
> +  {
> +    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_lcl_dram",
> +    "EventCode": "0x59",
> +    "BriefDescription": "DRAM or IO from this thread's die.",
> +    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_lcl_cache",
> +    "EventCode": "0x59",
> +    "BriefDescription": "Hit in cache; local CCX (not Local L2), or Remote CCX and the address's Home Node is on this thread's die.",
> +    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_sw_pf_dc_fill.ls_mabresp_lcl_l2",
> +    "EventCode": "0x59",
> +    "BriefDescription": "Local L2 hit.",
> +    "PublicDescription": "Software Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x1"
> +  },
> +  {
> +    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_rmt_dram",
> +    "EventCode": "0x5A",
> +    "BriefDescription": "DRAM or IO from different die.",
> +    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x40"
> +  },
> +  {
> +    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_rmt_cache",
> +    "EventCode": "0x5A",
> +    "BriefDescription": "Hit in cache; Remote CCX and the address's Home Node is on a different die.",
> +    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x10"
> +  },
> +  {
> +    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_lcl_dram",
> +    "EventCode": "0x5A",
> +    "BriefDescription": "DRAM or IO from this thread's die.",
> +    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_lcl_cache",
> +    "EventCode": "0x5A",
> +    "BriefDescription": "Hit in cache; local CCX (not Local L2), or Remote CCX and the address's Home Node is on this thread's die.",
> +    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "ls_hw_pf_dc_fill.ls_mabresp_lcl_l2",
> +    "EventCode": "0x5A",
> +    "BriefDescription": "Local L2 hit.",
> +    "PublicDescription": "Hardware Prefetch Data Cache Fills by Data Source.",
> +    "UMask": "0x1"
> +  },

The BriefDescription text needs to be concatenated to the
PublicDescription text for all of the above events.

> +  {
> +    "EventName": "de_dis_dispatch_token_stalls1.fp_misc_rsrc_stall",
> +    "EventCode": "0xae",
> +    "BriefDescription": "FP Miscellaneous resource unavailable",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> +    "UMask": "0x80"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls1.fp_sch_rsrc_stall",
> +    "EventCode": "0xae",
> +    "BriefDescription": "FP scheduler resource stall.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> +    "UMask": "0x40"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls1.fp_reg_file_rsrc_stall",
> +    "EventCode": "0xae",
> +    "BriefDescription": "Floating point register file resource stall.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> +    "UMask": "0x20"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls1.taken_branch_buffer_rsrc_stall",
> +    "EventCode": "0xae",
> +    "BriefDescription": "",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> +    "UMask": "0x10"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls1.int_sched_misc_token_stall",
> +    "EventCode": "0xae",
> +    "BriefDescription": "Integer Scheduler miscellaneous resource stall.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls1.store_queue_token_stall",
> +    "EventCode": "0xae",
> +    "BriefDescription": "Store queue resource stall.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls1.load_queue_token_stall",
> +    "EventCode": "0xae",
> +    "BriefDescription": "Load queue resource stall.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls1.int_phy_reg_file_token_stall",
> +    "EventCode": "0xae",
> +    "BriefDescription": "Integer Physical Register File resource stall.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> +    "UMask": "0x1"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls0.sc_agu_dispatch_stall",
> +    "EventCode": "0xaf",
> +    "BriefDescription": "SC AGU dispatch stall.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> +    "UMask": "0x80"
> +  },

All the above Public descriptions have "RETIRE Tokens unavailable." 
yet their BriefDescriptions are different.  Please fix.

> +  {
> +    "EventName": "de_dis_dispatch_token_stalls0.agsq_token_stall",
> +    "EventCode": "0xaf",
> +    "BriefDescription": "AGSQ Tokens unavailable.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. AGSQ Tokens unavailable.",
> +    "UMask": "0x20"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls0.alu_token_stall",
> +    "EventCode": "0xaf",
> +    "BriefDescription": "ALU tokens total unavailable.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. ALU tokens total unavailable.",
> +    "UMask": "0x10"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls0.alsq3_0_token_stall",
> +    "EventCode": "0xaf",
> +    "BriefDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall.",
> +    "UMask": "0x8"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls0.alsq3_token_stall",
> +    "EventCode": "0xaf",
> +    "BriefDescription": "ALSQ 3 Tokens unavailable.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. ALSQ 3 Tokens unavailable.",
> +    "UMask": "0x4"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls0.alsq2_token_stall",
> +    "EventCode": "0xaf",
> +    "BriefDescription": "ALSQ 2 Tokens unavailable.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. ALSQ 2 Tokens unavailable.",
> +    "UMask": "0x2"
> +  },
> +  {
> +    "EventName": "de_dis_dispatch_token_stalls0.alsq1_token_stall",
> +    "EventCode": "0xaf",
> +    "BriefDescription": "ALSQ 1 Tokens unavailable.",
> +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. ALSQ 1 Tokens unavailable.",
> +    "UMask": "0x1"
> +  }

All the above events have updated text in the documents; please
update the text here too.

Patch 1 of 3 looks OK to me.  Assuming there are no further
comments to this version, please make the requested changes
to patches 2 and 3 of the series, and resend as a v3.

Thanks!

Kim

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] perf vendor events amd: update Zen1 events to V2
  2020-02-25 22:53   ` Kim Phillips
@ 2020-02-27 20:00     ` Vijay Thakkar
  2020-02-27 21:20       ` Kim Phillips
  0 siblings, 1 reply; 12+ messages in thread
From: Vijay Thakkar @ 2020-02-27 20:00 UTC (permalink / raw)
  To: Kim Phillips
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

> OSRR for AMD Family 17h processors, Models 00h-2Fh, 56255 Rev 3.03 - July, 2018
I have included this in the v3 that I will submit later, along with all
the changes for the FPU counters. Sorry, I messed up copy-pasting the
text and forgot to change the trailing pipe number.
> and their counts don't seem to match up very well when running
> various workloads.  The microarchitecture is likely to have changed
> in this area from families prior to 17h, so a MAB alloc can likely
> count different events than what is presumed here: a Data cache
> load/store/prefetch miss.
> 
> I think it's safer to just leave the PPR text "LS MAB Allocates
> by Type" as-is, instead of assuming they are L1 load/store misses.
> What do you think?

I did some checking across PPRs, and this counter seems to have changed
names multiple times throughout the PPR revisions.

Zen1 PPR (54945 Rev 1.14 - April 15, 2017) lists a counter called
"LsMabAllocPipe" with 5 subcounters that have different names compared
to the ones we see in the mainline right now. PPRs for stepping B2
onwards change this to the 3-subcounter and primary counter name we see
right now. This public description still changes across various PPR
revisions, which is why I had this set to what it was. The latest PPR I
can find does indeed list it as "LS MAB Allocates by Type"; I will
change it to that with the suffix of the sub-counter name. Since the
same counter is in Zen2 as well, I will make the same changes there
too.

Let me know if this sounds good to you!
Best,
Vijay


* Re: [PATCH v2 3/3] perf vendor events amd: update Zen1 events to V2
  2020-02-27 20:00     ` Vijay Thakkar
@ 2020-02-27 21:20       ` Kim Phillips
  0 siblings, 0 replies; 12+ messages in thread
From: Kim Phillips @ 2020-02-27 21:20 UTC (permalink / raw)
  To: Vijay Thakkar
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

On 2/27/20 2:00 PM, Vijay Thakkar wrote:
>> OSRR for AMD Family 17h processors, Models 00h-2Fh, 56255 Rev 3.03 - July, 2018
> I have included this in the v3 that I will submit later, along with
> all the changes for the FPU counters. Sorry, I messed up copy-pasting
> the text and forgot to change the trailing pipe number.

Please also look at addressing the review comments for patch 2 of 3.

>> and their counts don't seem to match up very well when running
>> various workloads.  The microarchitecture is likely to have changed
>> in this area from families prior to 17h, so a MAB alloc can likely
>> count different events than what is presumed here: a Data cache
>> load/store/prefetch miss.
>>
>> I think it's safer to just leave the PPR text "LS MAB Allocates
>> by Type" as-is, instead of assuming they are L1 load/store misses.
>> What do you think?
> 
> I did some checking across PPRs, and this counter seems to have changed
> names multiple times throughout the PPR revisions.
> 
> Zen1 PPR (54945 Rev 1.14 - April 15, 2017) lists a counter called
> "LsMabAllocPipe" with 5 subcounters that have different names compared
> to the ones we see in the mainline right now. PPRs for stepping B2
> onwards change this to the 3-subcounter and primary counter name we
> see right now. This public description still changes across various
> PPR revisions, which is why I had this set to what it was. The latest
> PPR I can find does indeed list it as "LS MAB Allocates by Type"; I
> will change it to that with the suffix of the sub-counter name. Since
> the same counter is in Zen2 as well, I will make the same changes
> there too.

Thanks, yes, and if you look at the Software Optimization Guide that I
just added to the bugzilla [1], Figure 7 "Load-Store Unit" on page 40
shows a MAB block separate from the Data Cache block.

Kim

[1] https://bugzilla.kernel.org/show_bug.cgi?id=206537

* Re: [PATCH v2 2/3] perf vendor events amd: add Zen2 events
  2020-02-26 22:09   ` Kim Phillips
@ 2020-02-28 16:00     ` Vijay Thakkar
  2020-02-28 16:24       ` Kim Phillips
  2020-02-28 17:34     ` Vijay Thakkar
  1 sibling, 1 reply; 12+ messages in thread
From: Vijay Thakkar @ 2020-02-28 16:00 UTC (permalink / raw)
  To: Kim Phillips
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

> > +  {
> > +    "EventName": "ls_pref_instr_disp.prefetch_nta",
> > +    "EventCode": "0x4b",
> > +    "BriefDescription": "Software Prefetch Instructions (PREFETCHNTA instruction) Dispatched.",
> > +    "PublicDescription": "Software Prefetch Instructions (PREFETCHNTA instruction) Dispatched.",
> > +    "UMask": "0x4"
> > +  },
> > +  {
> > +    "EventName": "ls_pref_instr_disp.store_prefetch_w",
> > +    "EventCode": "0x4b",
> > +    "BriefDescription": "Software Prefetch Instructions (3DNow PREFETCHW instruction) Dispatched.",
> > +    "PublicDescription": "Software Prefetch Instructions (3DNow PREFETCHW instruction) Dispatched.",
> > +    "UMask": "0x2"
> > +  },
> > +  {
> > +    "EventName": "ls_pref_instr_disp.load_prefetch_w",
> > +    "EventCode": "0x4b",
> > +    "BriefDescription": "Prefetch, Prefetch_T0_T1_T2.",
> > +    "PublicDescription": "Software Prefetch Instructions Dispatched. Prefetch, Prefetch_T0_T1_T2.",
> > +    "UMask": "0x1"
> > +  },
These three are present in the PPR for model 71h (56176 Rev 3.06 - Jul
17, 2019) but are missing from the PPR for model 31h (55803 Rev 0.54 -
Sep 12, 2019). Not sure what to do about it. 

Similarly, PMC 0x0AF - Dispatch Resource Stall Cycles 0 only has one
subcounter in the model 31h PPR, whereas the PPR for 71h is the one that
contains the eight subcounters we see in the mainline right now.

There could be more subtle differences like these, since I have not
really compared the PPR versions that thoroughly. I was going with the
assumption that since both are for SoCs based on the Zen2, they would
have identical events. 

Otherwise, I have made all the other changes and corrections, and will
send in v3 after you suggest how to proceed about the above two.

Best,
Vijay

* Re: [PATCH v2 2/3] perf vendor events amd: add Zen2 events
  2020-02-28 16:00     ` Vijay Thakkar
@ 2020-02-28 16:24       ` Kim Phillips
  2020-02-28 17:27         ` Vijay Thakkar
  0 siblings, 1 reply; 12+ messages in thread
From: Kim Phillips @ 2020-02-28 16:24 UTC (permalink / raw)
  To: Vijay Thakkar
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

On 2/28/20 10:00 AM, Vijay Thakkar wrote:
>>> +  {
>>> +    "EventName": "ls_pref_instr_disp.prefetch_nta",
>>> +    "EventCode": "0x4b",
>>> +    "BriefDescription": "Software Prefetch Instructions (PREFETCHNTA instruction) Dispatched.",
>>> +    "PublicDescription": "Software Prefetch Instructions (PREFETCHNTA instruction) Dispatched.",
>>> +    "UMask": "0x4"
>>> +  },
>>> +  {
>>> +    "EventName": "ls_pref_instr_disp.store_prefetch_w",
>>> +    "EventCode": "0x4b",
>>> +    "BriefDescription": "Software Prefetch Instructions (3DNow PREFETCHW instruction) Dispatched.",
>>> +    "PublicDescription": "Software Prefetch Instructions (3DNow PREFETCHW instruction) Dispatched.",
>>> +    "UMask": "0x2"
>>> +  },
>>> +  {
>>> +    "EventName": "ls_pref_instr_disp.load_prefetch_w",
>>> +    "EventCode": "0x4b",
>>> +    "BriefDescription": "Prefetch, Prefetch_T0_T1_T2.",
>>> +    "PublicDescription": "Software Prefetch Instructions Dispatched. Prefetch, Prefetch_T0_T1_T2.",
>>> +    "UMask": "0x1"
>>> +  },
> These three are present in the PPR for model 71h (56176 Rev 3.06 - Jul
> 17, 2019) but are missing from the PPR for model 31h (55803 Rev 0.54 -
> Sep 12, 2019). Not sure what to do about it. 

They're producing nonzero counts on my model 31h, so leave them in?

> Similarly, PMC 0x0AF - Dispatch Resource Stall Cycles 0 only has one
> subcounter in the model 31h PPR, whereas the PPR for model 71h contains
> the eight subcounters currently in mainline.

I'm getting nonzero values on my model 31h for that event's
various unit masks, too.

> There could be more subtle differences like these, since I have not
> really compared the PPR versions that thoroughly. I was going with the
> assumption that since both are for SoCs based on the Zen2, they would
> have identical events. 

I think that's a reasonable assumption.

> Otherwise, I have made all the other changes and corrections, and will
> send v3 once you suggest how to proceed on the above two.

Thanks, I'd veer toward making them available despite differences in PPR
versions.

Thanks,

Kim


* Re: [PATCH v2 2/3] perf vendor events amd: add Zen2 events
  2020-02-28 16:24       ` Kim Phillips
@ 2020-02-28 17:27         ` Vijay Thakkar
  0 siblings, 0 replies; 12+ messages in thread
From: Vijay Thakkar @ 2020-02-28 17:27 UTC (permalink / raw)
  To: Kim Phillips
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

> They're producing nonzero counts on my model 31h, so leave them in?

> I'm getting nonzero values on my model 31h for that event's
> various unit masks, too.
Okay, will do.

> Thanks, I'd veer toward making them available despite differences in PPR
> versions.

Okay, that is consistent with what I have done so far for the Zen1
counters that are not mentioned in the Zen2 PPRs. Great! I will send v3
in a bit.

Best,
Vijay


* Re: [PATCH v2 2/3] perf vendor events amd: add Zen2 events
  2020-02-26 22:09   ` Kim Phillips
  2020-02-28 16:00     ` Vijay Thakkar
@ 2020-02-28 17:34     ` Vijay Thakkar
  1 sibling, 0 replies; 12+ messages in thread
From: Vijay Thakkar @ 2020-02-28 17:34 UTC (permalink / raw)
  To: Kim Phillips
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Martin Liška,
	Jon Grimm, linux-kernel, linux-perf-users

> > +  {
> > +    "EventName": "de_dis_dispatch_token_stalls0.sc_agu_dispatch_stall",
> > +    "EventCode": "0xaf",
> > +    "BriefDescription": "SC AGU dispatch stall.",
> > +    "PublicDescription": "Cycles where a dispatch group is valid but does not get dispatched due to a token stall. RETIRE Tokens unavailable.",
> > +    "UMask": "0x80"
> > +  },
> 
> All the above Public descriptions have "RETIRE Tokens unavailable." 
> yet their BriefDescriptions are different.  Please fix.

I just noticed that this one is not present at all in PPR 56176 Rev 3.06
(Jul 17, 2019), and that the umasks for the rest of the sub-counters in
PMC 0xAF are right-shifted by one compared to those in Zen1. As a
result, the counter previously at umask 0x1 is no longer present.
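To illustrate the shift (a sketch with an assumed Zen1 bit layout, not
values copied from either PPR):

```python
# Assumed Zen1 layout for the PMC 0xAF sub-counters: one bit each,
# umasks 0x01 through 0x80.
zen1_umasks = [1 << bit for bit in range(8)]

# The model 71h PPR describes the same sub-counters right-shifted by one.
zen2_umasks = [m >> 1 for m in zen1_umasks]

# The sub-counter that used to sit at umask 0x1 shifts to 0 and so
# effectively disappears; seven sub-counters remain.
surviving = [hex(m) for m in zen2_umasks if m]
print(surviving)
```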

I will update this counter to reflect what is documented in the model
71h PPR (56176 Rev 3.06, Jul 17, 2019).

-Vijay


end of thread, other threads:[~2020-02-28 17:34 UTC | newest]

Thread overview: 12+ messages
2020-02-25 19:28 [PATCH v2 0/3] latest PMU events for zen1/zen2 Vijay Thakkar
2020-02-25 19:28 ` [PATCH v2 1/3] perf vendor events amd: restrict model detection for zen1 based processors Vijay Thakkar
2020-02-25 19:28 ` [PATCH v2 2/3] perf vendor events amd: add Zen2 events Vijay Thakkar
2020-02-26 22:09   ` Kim Phillips
2020-02-28 16:00     ` Vijay Thakkar
2020-02-28 16:24       ` Kim Phillips
2020-02-28 17:27         ` Vijay Thakkar
2020-02-28 17:34     ` Vijay Thakkar
2020-02-25 19:28 ` [PATCH v2 3/3] perf vendor events amd: update Zen1 events to V2 Vijay Thakkar
2020-02-25 22:53   ` Kim Phillips
2020-02-27 20:00     ` Vijay Thakkar
2020-02-27 21:20       ` Kim Phillips
