All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/14] Explicit MCR handling and MTL steering
@ 2022-10-01  0:45 ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

Steering of multicast/replicated registers becomes a bit more
complicated on Meteor Lake.  Whereas previously the control register we
used to manage the steering was only used by our driver[*], software's
control of steering has now been consolidated with the controls for
various other hardware/firmware agents into a single register.  We can
no longer utilize pre-programmed implicit steering since other firmware
agents may change the steering target and not restore it afterward;
we'll need to explicitly steer all types of MCR registers (including the
GSLICE/COMPUTE/DSS ranges that have been handled implicitly in the
past).  Furthermore, since multiple agents will now be sharing a single
steering control register, races are possible.  To address this, the
hardware adds a new MCR semaphore register which is supposed to be used
to temporarily lock the steering while performing MCR operations.

It's going to become important for us to handle accesses of multicast
registers very explicitly going forward.  This series provides some prep
work for that by updating our register definitions to clearly define
registers as either MCR or non-MCR and ensure that we're using the
intel_gt_mcr_*() functions rather than intel_uncore_*() when operating
on MCR registers.  By declaring MCR registers as a new C type (i.e., not
an i915_reg_t) we can have the compiler help us find any mistakes where
non-MCR functions are used on MCR registers and vice-versa.

This series also includes the general MCR steering tables and logic, but
does not yet introduce the support for the semaphore register that will
be used to coordinate steering updates with other agents; we'll provide
that support in a separate series once this preparation work has landed.


[*] This is a bit of an oversimplification; there are some hardware and
software debug tools that use the same MCR_SELECTOR register that i915
does and which could potentially re-steer MCR accesses behind our back.
E.g., simply using IGT's "intel_reg" tool to write the MCR_SELECTOR
register at the wrong time could interfere with driver operation.  But
given that these debug facilities require root privileges to run and are
only used by people intentionally debugging the driver or hardware, we
can ignore such races for real-world usage.

v2:
 - Add MCR handling of TLB invalidation registers (since they're in the
   GAM range, their read behavior behaves a bit more like unicast
   registers, but they should still be treated as MCR for writes).
 - Add MCR rmw and 'wait_for_reg' operations; these are only typically
   needed for GAM registers.
 - Switched a few more instances of MISCCPCTL, HALF_SLICE_CHICKEN3, and
   the fault registers over to MCR handling where appropriate.  (Bala)

Matt Roper (14):
  drm/i915/gen8: Create separate reg definitions for new MCR registers
  drm/i915/xehp: Create separate reg definitions for new MCR registers
  drm/i915/gt: Drop a few unused register definitions
  drm/i915/gt: Correct prefix on a few registers
  drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation
  drm/i915/xehp: Check for faults on primary GAM
  drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()
  drm/i915: Define MCR registers explicitly
  drm/i915/gt: Always use MCR functions on multicast registers
  drm/i915/guc: Handle save/restore of MCR registers explicitly
  drm/i915/gt: Add MCR-specific workaround initializers
  drm/i915: Define multicast registers as a new type
  drm/i915/mtl: Add multicast steering for render GT
  drm/i915/mtl: Add multicast steering for media GT

 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   4 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   4 +-
 drivers/gpu/drm/i915/gt/intel_gt.c            | 126 ++++-
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 228 +++++++--
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h        |  24 +-
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 162 +++---
 drivers/gpu/drm/i915/gt/intel_gt_types.h      |   9 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  22 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h           |   2 +-
 drivers/gpu/drm/i915/gt/intel_mocs.c          |  12 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 474 +++++++++++-------
 .../gpu/drm/i915/gt/intel_workarounds_types.h |   9 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c    |   2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  60 ++-
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |   8 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c     |  12 +-
 drivers/gpu/drm/i915/gvt/handlers.c           |  19 +-
 drivers/gpu/drm/i915/gvt/mmio_context.c       |  16 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  12 +-
 drivers/gpu/drm/i915/i915_pci.c               |   1 +
 drivers/gpu/drm/i915/i915_reg_defs.h          |  22 +-
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c   |  12 +-
 drivers/gpu/drm/i915/intel_pm.c               |  20 +-
 23 files changed, 862 insertions(+), 398 deletions(-)

-- 
2.37.3


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 00/14] Explicit MCR handling and MTL steering
@ 2022-10-01  0:45 ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

Steering of multicast/replicated registers becomes a bit more
complicated on Meteor Lake.  Whereas previously the control register we
used to manage the steering was only used by our driver[*], software's
control of steering has now been consolidated with the controls for
various other hardware/firmware agents into a single register.  We can
no longer utilize pre-programmed implicit steering since other firmware
agents may change the steering target and not restore it afterward;
we'll need to explicitly steer all types of MCR registers (including the
GSLICE/COMPUTE/DSS ranges that have been handled implicitly in the
past).  Furthermore, since multiple agents will now be sharing a single
steering control register, races are possible.  To address this, the
hardware adds a new MCR semaphore register which is supposed to be used
to temporarily lock the steering while performing MCR operations.

It's going to become important for us to handle accesses of multicast
registers very explicitly going forward.  This series provides some prep
work for that by updating our register definitions to clearly define
registers as either MCR or non-MCR and ensure that we're using the
intel_gt_mcr_*() functions rather than intel_uncore_*() when operating
on MCR registers.  By declaring MCR registers as a new C type (i.e., not
an i915_reg_t) we can have the compiler help us find any mistakes where
non-MCR functions are used on MCR registers and vice-versa.

This series also includes the general MCR steering tables and logic, but
does not yet introduce the support for the semaphore register that will
be used to coordinate steering updates with other agents; we'll provide
that support in a separate series once this preparation work has landed.


[*] This is a bit of an oversimplification; there are some hardware and
software debug tools that use the same MCR_SELECTOR register that i915
does and which could potentially re-steer MCR accesses behind our back.
E.g., simply using IGT's "intel_reg" tool to write the MCR_SELECTOR
register at the wrong time could interfere with driver operation.  But
given that these debug facilities require root privileges to run and are
only used by people intentionally debugging the driver or hardware, we
can ignore such races for real-world usage.

v2:
 - Add MCR handling of TLB invalidation registers (since they're in the
   GAM range, their read behavior behaves a bit more like unicast
   registers, but they should still be treated as MCR for writes).
 - Add MCR rmw and 'wait_for_reg' operations; these are only typically
   needed for GAM registers.
 - Switched a few more instances of MISCCPCTL, HALF_SLICE_CHICKEN3, and
   the fault registers over to MCR handling where appropriate.  (Bala)

Matt Roper (14):
  drm/i915/gen8: Create separate reg definitions for new MCR registers
  drm/i915/xehp: Create separate reg definitions for new MCR registers
  drm/i915/gt: Drop a few unused register definitions
  drm/i915/gt: Correct prefix on a few registers
  drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation
  drm/i915/xehp: Check for faults on primary GAM
  drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()
  drm/i915: Define MCR registers explicitly
  drm/i915/gt: Always use MCR functions on multicast registers
  drm/i915/guc: Handle save/restore of MCR registers explicitly
  drm/i915/gt: Add MCR-specific workaround initializers
  drm/i915: Define multicast registers as a new type
  drm/i915/mtl: Add multicast steering for render GT
  drm/i915/mtl: Add multicast steering for media GT

 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   4 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   4 +-
 drivers/gpu/drm/i915/gt/intel_gt.c            | 126 ++++-
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 228 +++++++--
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h        |  24 +-
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 162 +++---
 drivers/gpu/drm/i915/gt/intel_gt_types.h      |   9 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  22 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h           |   2 +-
 drivers/gpu/drm/i915/gt/intel_mocs.c          |  12 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 474 +++++++++++-------
 .../gpu/drm/i915/gt/intel_workarounds_types.h |   9 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c    |   2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  60 ++-
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |   8 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c     |  12 +-
 drivers/gpu/drm/i915/gvt/handlers.c           |  19 +-
 drivers/gpu/drm/i915/gvt/mmio_context.c       |  16 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  12 +-
 drivers/gpu/drm/i915/i915_pci.c               |   1 +
 drivers/gpu/drm/i915/i915_reg_defs.h          |  22 +-
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c   |  12 +-
 drivers/gpu/drm/i915/intel_pm.c               |  20 +-
 23 files changed, 862 insertions(+), 398 deletions(-)

-- 
2.37.3


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v2 01/14] drm/i915/gen8: Create separate reg definitions for new MCR registers
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

Gen8 was the first time our hardware had multicast registers (or at
least the first time the multicast nature was exposed and MMIO accesses
could be steered).  There are some registers that transitioned from
singleton behavior to multicast during the gen7 -> gen8 transition;
let's duplicate the register definitions for those registers in
preparation for upcoming patches that will handle MCR registers in a
special manner.

The registers adjusted are:
 * MISCCPCTL
 * SAMPLER_INSTDONE
 * ROW_INSTDONE
 * ROW_CHICKEN2
 * HALF_SLICE_CHICKEN1
 * HALF_SLICE_CHICKEN3

v2:
 - Use the gen8 version of HALF_SLICE_CHICKEN3 in GVT's gen9 engine MMIO
   list.  (Bala)
 - Update to the gen8 version of MISCCPCTL in a couple new workarounds
   that were recently added for DG2/PVC.  (Bala)

Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  4 +--
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 11 +++++++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 26 +++++++++----------
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 +--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c     |  2 +-
 drivers/gpu/drm/i915/gvt/handlers.c           |  2 +-
 drivers/gpu/drm/i915/gvt/mmio_context.c       |  2 +-
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c   |  2 +-
 drivers/gpu/drm/i915/intel_pm.c               | 10 +++----
 9 files changed, 36 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 2ddcad497fa3..c408bac3c533 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1559,11 +1559,11 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
 		for_each_ss_steering(iter, engine->gt, slice, subslice) {
 			instdone->sampler[slice][subslice] =
 				intel_gt_mcr_read(engine->gt,
-						  GEN7_SAMPLER_INSTDONE,
+						  GEN8_SAMPLER_INSTDONE,
 						  slice, subslice);
 			instdone->row[slice][subslice] =
 				intel_gt_mcr_read(engine->gt,
-						  GEN7_ROW_INSTDONE,
+						  GEN8_ROW_INSTDONE,
 						  slice, subslice);
 		}
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 7f79bbf97828..ba4ce668042c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -651,6 +651,9 @@
 
 #define GEN7_MISCCPCTL				_MMIO(0x9424)
 #define   GEN7_DOP_CLOCK_GATE_ENABLE		(1 << 0)
+
+#define GEN8_MISCCPCTL				_MMIO(0x9424)
+#define   GEN8_DOP_CLOCK_GATE_ENABLE		REG_BIT(0)
 #define   GEN12_DOP_CLOCK_GATE_RENDER_ENABLE	REG_BIT(1)
 #define   GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE	(1 << 2)
 #define   GEN8_DOP_CLOCK_GATE_GUC_ENABLE	(1 << 4)
@@ -1072,18 +1075,22 @@
 #define GEN12_GAM_DONE				_MMIO(0xcf68)
 
 #define GEN7_HALF_SLICE_CHICKEN1		_MMIO(0xe100) /* IVB GT1 + VLV */
+#define GEN8_HALF_SLICE_CHICKEN1		_MMIO(0xe100)
 #define   GEN7_MAX_PS_THREAD_DEP		(8 << 12)
 #define   GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE	(1 << 10)
 #define   GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE	(1 << 4)
 #define   GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE	(1 << 3)
 
 #define GEN7_SAMPLER_INSTDONE			_MMIO(0xe160)
+#define GEN8_SAMPLER_INSTDONE			_MMIO(0xe160)
 #define GEN7_ROW_INSTDONE			_MMIO(0xe164)
+#define GEN8_ROW_INSTDONE			_MMIO(0xe164)
 
 #define HALF_SLICE_CHICKEN2			_MMIO(0xe180)
 #define   GEN8_ST_PO_DISABLE			(1 << 13)
 
-#define HALF_SLICE_CHICKEN3			_MMIO(0xe184)
+#define HSW_HALF_SLICE_CHICKEN3			_MMIO(0xe184)
+#define GEN8_HALF_SLICE_CHICKEN3		_MMIO(0xe184)
 #define   HSW_SAMPLE_C_PERFORMANCE		(1 << 9)
 #define   GEN8_CENTROID_PIXEL_OPT_DIS		(1 << 8)
 #define   GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC	(1 << 5)
@@ -1136,6 +1143,8 @@
 #define   DISABLE_EARLY_EOT			REG_BIT(1)
 
 #define GEN7_ROW_CHICKEN2			_MMIO(0xe4f4)
+
+#define GEN8_ROW_CHICKEN2			_MMIO(0xe4f4)
 #define   GEN12_DISABLE_READ_SUPPRESSION	REG_BIT(15)
 #define   GEN12_DISABLE_EARLY_READ		REG_BIT(14)
 #define   GEN12_ENABLE_LARGE_GRF_MODE		REG_BIT(12)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index ff5220694f46..7a0f0146ee01 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -295,10 +295,10 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine,
 	 * Also see the related UCGTCL1 write in bdw_init_clock_gating()
 	 * to disable EUTC clock gating.
 	 */
-	wa_masked_en(wal, GEN7_ROW_CHICKEN2,
+	wa_masked_en(wal, GEN8_ROW_CHICKEN2,
 		     DOP_CLOCK_GATING_DISABLE);
 
-	wa_masked_en(wal, HALF_SLICE_CHICKEN3,
+	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
 		     GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	wa_masked_en(wal, HDC_CHICKEN0,
@@ -386,7 +386,7 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 	    IS_KABYLAKE(i915) ||
 	    IS_COFFEELAKE(i915) ||
 	    IS_COMETLAKE(i915))
-		wa_masked_en(wal, HALF_SLICE_CHICKEN3,
+		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
 			     GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
@@ -490,7 +490,7 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1,
+	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
 		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
@@ -514,7 +514,7 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1,
+	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
 		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
@@ -1517,7 +1517,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS);
 
 	/* Wa_14015795083 */
-	wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
+	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
 static void
@@ -1526,7 +1526,7 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	pvc_init_mcr(gt, wal);
 
 	/* Wa_14015795083 */
-	wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
+	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
 static void
@@ -2117,7 +2117,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14013392000:dg2_g11 */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) ||
@@ -2163,7 +2163,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			    DISABLE_128B_EVICTION_COMMAND_UDW);
 
 		/* Wa_22012856258:dg2 */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2,
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_DISABLE_READ_SUPPRESSION);
 
 		/*
@@ -2260,7 +2260,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
 
 		/*
 		 * Wa_1407928979:tgl A*
@@ -2289,7 +2289,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	    IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2,
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_PUSH_CONST_DEREF_HOLD_DIS);
 
 		/*
@@ -2456,7 +2456,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_HASWELL(i915)) {
 		/* WaSampleCChickenBitEnable:hsw */
 		wa_masked_en(wal,
-			     HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE);
+			     HSW_HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE);
 
 		wa_masked_dis(wal,
 			      CACHE_MODE_0_GEN7,
@@ -2754,7 +2754,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 		wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 
 		/* Wa_14010449647:xehpsdv */
-		wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1,
+		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
 			     GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
 
 		/* Wa_18011725039:xehpsdv */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
index 8f1165146013..9495a7928bc8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
@@ -244,8 +244,8 @@ struct __ext_steer_reg {
 };
 
 static const struct __ext_steer_reg xe_extregs[] = {
-	{"GEN7_SAMPLER_INSTDONE", GEN7_SAMPLER_INSTDONE},
-	{"GEN7_ROW_INSTDONE", GEN7_ROW_INSTDONE}
+	{"GEN8_SAMPLER_INSTDONE", GEN8_SAMPLER_INSTDONE},
+	{"GEN8_ROW_INSTDONE", GEN8_ROW_INSTDONE}
 };
 
 static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index a0372735cddb..9229243992c2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -35,7 +35,7 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
 
 	if (GRAPHICS_VER(uncore->i915) == 9) {
 		/* DOP Clock Gating Enable for GuC clocks */
-		intel_uncore_rmw(uncore, GEN7_MISCCPCTL,
+		intel_uncore_rmw(uncore, GEN8_MISCCPCTL,
 				 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE);
 
 		/* allows for 5us (in 10ns units) before GT can go to RC6 */
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index daac2050d77d..700cc9688f47 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -2257,7 +2257,7 @@ static int init_generic_mmio_info(struct intel_gvt *gvt)
 	MMIO_DFH(_MMIO(0x2438), D_ALL, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x243c), D_ALL, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x7018), D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH(HSW_HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(GEN7_HALF_SLICE_CHICKEN1, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 
 	/* display */
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index 1c6e941c9666..d177884d8f7d 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -111,7 +111,7 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
 	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
 	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
 	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
-	{RCS0, HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
+	{RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
 	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
 	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
 	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
index 8279dc580a3e..638b77d64bf4 100644
--- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
+++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
@@ -102,7 +102,7 @@ static int iterate_generic_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(_MMIO(0x2438));
 	MMIO_D(_MMIO(0x243c));
 	MMIO_D(_MMIO(0x7018));
-	MMIO_D(HALF_SLICE_CHICKEN3);
+	MMIO_D(HSW_HALF_SLICE_CHICKEN3);
 	MMIO_D(GEN7_HALF_SLICE_CHICKEN1);
 	/* display */
 	MMIO_F(_MMIO(0x60220), 0x20);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2595ec5aeb77..c694433a7da5 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4339,8 +4339,8 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	u32 val;
 
 	/* WaTempDisableDOPClkGating:bdw */
-	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL);
-	intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL);
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
 	val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1);
 	val &= ~L3_PRIO_CREDITS_MASK;
@@ -4354,7 +4354,7 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	 */
 	intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1);
 	udelay(1);
-	intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, misccpctl);
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl);
 }
 
 static void icl_init_clock_gating(struct drm_i915_private *dev_priv)
@@ -4514,8 +4514,8 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 	gen9_init_clock_gating(dev_priv);
 
 	/* WaDisableDopClockGating:skl */
-	intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL) &
-		   ~GEN7_DOP_CLOCK_GATE_ENABLE);
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &
+		   ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
 	/* WAC6entrylatency:skl */
 	intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) |
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 01/14] drm/i915/gen8: Create separate reg definitions for new MCR registers
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

Gen8 was the first time our hardware had multicast registers (or at
least the first time the multicast nature was exposed and MMIO accesses
could be steered).  There are some registers that transitioned from
singleton behavior to multicast during the gen7 -> gen8 transition;
let's duplicate the register definitions for those registers in
preparation for upcoming patches that will handle MCR registers in a
special manner.

The registers adjusted are:
 * MISCCPCTL
 * SAMPLER_INSTDONE
 * ROW_INSTDONE
 * ROW_CHICKEN2
 * HALF_SLICE_CHICKEN1
 * HALF_SLICE_CHICKEN3

v2:
 - Use the gen8 version of HALF_SLICE_CHICKEN3 in GVT's gen9 engine MMIO
   list.  (Bala)
 - Update to the gen8 version of MISCCPCTL in a couple new workarounds
   that were recently added for DG2/PVC.  (Bala)

Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  4 +--
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 11 +++++++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 26 +++++++++----------
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 +--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c     |  2 +-
 drivers/gpu/drm/i915/gvt/handlers.c           |  2 +-
 drivers/gpu/drm/i915/gvt/mmio_context.c       |  2 +-
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c   |  2 +-
 drivers/gpu/drm/i915/intel_pm.c               | 10 +++----
 9 files changed, 36 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 2ddcad497fa3..c408bac3c533 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1559,11 +1559,11 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
 		for_each_ss_steering(iter, engine->gt, slice, subslice) {
 			instdone->sampler[slice][subslice] =
 				intel_gt_mcr_read(engine->gt,
-						  GEN7_SAMPLER_INSTDONE,
+						  GEN8_SAMPLER_INSTDONE,
 						  slice, subslice);
 			instdone->row[slice][subslice] =
 				intel_gt_mcr_read(engine->gt,
-						  GEN7_ROW_INSTDONE,
+						  GEN8_ROW_INSTDONE,
 						  slice, subslice);
 		}
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 7f79bbf97828..ba4ce668042c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -651,6 +651,9 @@
 
 #define GEN7_MISCCPCTL				_MMIO(0x9424)
 #define   GEN7_DOP_CLOCK_GATE_ENABLE		(1 << 0)
+
+#define GEN8_MISCCPCTL				_MMIO(0x9424)
+#define   GEN8_DOP_CLOCK_GATE_ENABLE		REG_BIT(0)
 #define   GEN12_DOP_CLOCK_GATE_RENDER_ENABLE	REG_BIT(1)
 #define   GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE	(1 << 2)
 #define   GEN8_DOP_CLOCK_GATE_GUC_ENABLE	(1 << 4)
@@ -1072,18 +1075,22 @@
 #define GEN12_GAM_DONE				_MMIO(0xcf68)
 
 #define GEN7_HALF_SLICE_CHICKEN1		_MMIO(0xe100) /* IVB GT1 + VLV */
+#define GEN8_HALF_SLICE_CHICKEN1		_MMIO(0xe100)
 #define   GEN7_MAX_PS_THREAD_DEP		(8 << 12)
 #define   GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE	(1 << 10)
 #define   GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE	(1 << 4)
 #define   GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE	(1 << 3)
 
 #define GEN7_SAMPLER_INSTDONE			_MMIO(0xe160)
+#define GEN8_SAMPLER_INSTDONE			_MMIO(0xe160)
 #define GEN7_ROW_INSTDONE			_MMIO(0xe164)
+#define GEN8_ROW_INSTDONE			_MMIO(0xe164)
 
 #define HALF_SLICE_CHICKEN2			_MMIO(0xe180)
 #define   GEN8_ST_PO_DISABLE			(1 << 13)
 
-#define HALF_SLICE_CHICKEN3			_MMIO(0xe184)
+#define HSW_HALF_SLICE_CHICKEN3			_MMIO(0xe184)
+#define GEN8_HALF_SLICE_CHICKEN3		_MMIO(0xe184)
 #define   HSW_SAMPLE_C_PERFORMANCE		(1 << 9)
 #define   GEN8_CENTROID_PIXEL_OPT_DIS		(1 << 8)
 #define   GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC	(1 << 5)
@@ -1136,6 +1143,8 @@
 #define   DISABLE_EARLY_EOT			REG_BIT(1)
 
 #define GEN7_ROW_CHICKEN2			_MMIO(0xe4f4)
+
+#define GEN8_ROW_CHICKEN2			_MMIO(0xe4f4)
 #define   GEN12_DISABLE_READ_SUPPRESSION	REG_BIT(15)
 #define   GEN12_DISABLE_EARLY_READ		REG_BIT(14)
 #define   GEN12_ENABLE_LARGE_GRF_MODE		REG_BIT(12)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index ff5220694f46..7a0f0146ee01 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -295,10 +295,10 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine,
 	 * Also see the related UCGTCL1 write in bdw_init_clock_gating()
 	 * to disable EUTC clock gating.
 	 */
-	wa_masked_en(wal, GEN7_ROW_CHICKEN2,
+	wa_masked_en(wal, GEN8_ROW_CHICKEN2,
 		     DOP_CLOCK_GATING_DISABLE);
 
-	wa_masked_en(wal, HALF_SLICE_CHICKEN3,
+	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
 		     GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	wa_masked_en(wal, HDC_CHICKEN0,
@@ -386,7 +386,7 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 	    IS_KABYLAKE(i915) ||
 	    IS_COFFEELAKE(i915) ||
 	    IS_COMETLAKE(i915))
-		wa_masked_en(wal, HALF_SLICE_CHICKEN3,
+		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
 			     GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
@@ -490,7 +490,7 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1,
+	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
 		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
@@ -514,7 +514,7 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1,
+	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
 		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
@@ -1517,7 +1517,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS);
 
 	/* Wa_14015795083 */
-	wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
+	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
 static void
@@ -1526,7 +1526,7 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	pvc_init_mcr(gt, wal);
 
 	/* Wa_14015795083 */
-	wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
+	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
 static void
@@ -2117,7 +2117,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14013392000:dg2_g11 */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) ||
@@ -2163,7 +2163,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			    DISABLE_128B_EVICTION_COMMAND_UDW);
 
 		/* Wa_22012856258:dg2 */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2,
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_DISABLE_READ_SUPPRESSION);
 
 		/*
@@ -2260,7 +2260,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
 
 		/*
 		 * Wa_1407928979:tgl A*
@@ -2289,7 +2289,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	    IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2,
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_PUSH_CONST_DEREF_HOLD_DIS);
 
 		/*
@@ -2456,7 +2456,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_HASWELL(i915)) {
 		/* WaSampleCChickenBitEnable:hsw */
 		wa_masked_en(wal,
-			     HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE);
+			     HSW_HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE);
 
 		wa_masked_dis(wal,
 			      CACHE_MODE_0_GEN7,
@@ -2754,7 +2754,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 		wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 
 		/* Wa_14010449647:xehpsdv */
-		wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1,
+		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
 			     GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
 
 		/* Wa_18011725039:xehpsdv */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
index 8f1165146013..9495a7928bc8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
@@ -244,8 +244,8 @@ struct __ext_steer_reg {
 };
 
 static const struct __ext_steer_reg xe_extregs[] = {
-	{"GEN7_SAMPLER_INSTDONE", GEN7_SAMPLER_INSTDONE},
-	{"GEN7_ROW_INSTDONE", GEN7_ROW_INSTDONE}
+	{"GEN8_SAMPLER_INSTDONE", GEN8_SAMPLER_INSTDONE},
+	{"GEN8_ROW_INSTDONE", GEN8_ROW_INSTDONE}
 };
 
 static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index a0372735cddb..9229243992c2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -35,7 +35,7 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
 
 	if (GRAPHICS_VER(uncore->i915) == 9) {
 		/* DOP Clock Gating Enable for GuC clocks */
-		intel_uncore_rmw(uncore, GEN7_MISCCPCTL,
+		intel_uncore_rmw(uncore, GEN8_MISCCPCTL,
 				 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE);
 
 		/* allows for 5us (in 10ns units) before GT can go to RC6 */
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index daac2050d77d..700cc9688f47 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -2257,7 +2257,7 @@ static int init_generic_mmio_info(struct intel_gvt *gvt)
 	MMIO_DFH(_MMIO(0x2438), D_ALL, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x243c), D_ALL, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x7018), D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH(HSW_HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(GEN7_HALF_SLICE_CHICKEN1, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 
 	/* display */
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index 1c6e941c9666..d177884d8f7d 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -111,7 +111,7 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
 	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
 	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
 	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
-	{RCS0, HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
+	{RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
 	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
 	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
 	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
index 8279dc580a3e..638b77d64bf4 100644
--- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
+++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
@@ -102,7 +102,7 @@ static int iterate_generic_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(_MMIO(0x2438));
 	MMIO_D(_MMIO(0x243c));
 	MMIO_D(_MMIO(0x7018));
-	MMIO_D(HALF_SLICE_CHICKEN3);
+	MMIO_D(HSW_HALF_SLICE_CHICKEN3);
 	MMIO_D(GEN7_HALF_SLICE_CHICKEN1);
 	/* display */
 	MMIO_F(_MMIO(0x60220), 0x20);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2595ec5aeb77..c694433a7da5 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4339,8 +4339,8 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	u32 val;
 
 	/* WaTempDisableDOPClkGating:bdw */
-	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL);
-	intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL);
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
 	val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1);
 	val &= ~L3_PRIO_CREDITS_MASK;
@@ -4354,7 +4354,7 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	 */
 	intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1);
 	udelay(1);
-	intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, misccpctl);
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl);
 }
 
 static void icl_init_clock_gating(struct drm_i915_private *dev_priv)
@@ -4514,8 +4514,8 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 	gen9_init_clock_gating(dev_priv);
 
 	/* WaDisableDopClockGating:skl */
-	intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL) &
-		   ~GEN7_DOP_CLOCK_GATE_ENABLE);
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &
+		   ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
 	/* WAC6entrylatency:skl */
 	intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) |
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 02/14] drm/i915/xehp: Create separate reg definitions for new MCR registers
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

Starting in Xe_HP, several registers our driver works with have been
converted from singleton registers into replicated registers with
multicast behavior.  Although the registers are still located at the
same MMIO offsets as on previous platforms, let's duplicate the register
definitions in preparation for upcoming patches that will handle
multicast registers in a special manner.

The registers that are now replicated on Xe_HP are:
 * PAT_INDEX (mslice replication)
 * FF_MODE2 (gslice replication)
 * COMMON_SLICE_CHICKEN3 (gslice replication)
 * SLICE_COMMON_ECO_CHICKEN1 (gslice replication)
 * SLICE_UNIT_LEVEL_CLKGATE (gslice replication)
 * LNCFCMOCS (lncf replication)

Note that there are a couple places in selftest_mocs.c where the
gen9 version of LNCFCMOCS is still used without regards for which
platform we're on.  Those cases are just doing an offset lookup and not
issuing any CPU reads/writes of the register, so the potentially
multicast nature of the register doesn't come into play.

v2:
 - Add commit message note about the unconditional GEN9_LNCFCMOCS usage
   in selftest_mocs.  (Bala)
 - Include some additional TLB registers.

Bspec: 66534
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c        |  4 ++--
 drivers/gpu/drm/i915/gt/intel_gt.c          | 18 ++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     | 26 +++++++++++++++------
 drivers/gpu/drm/i915/gt/intel_gtt.c         | 22 ++++++++++++++---
 drivers/gpu/drm/i915/gt/intel_gtt.h         |  2 +-
 drivers/gpu/drm/i915/gt/intel_mocs.c        |  5 +++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 24 +++++++++----------
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  |  7 ++++--
 8 files changed, 78 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index b31fe0fb013f..af5b98b56a49 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -986,7 +986,7 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 
 	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
 
-	setup_private_pat(ggtt->vm.gt->uncore);
+	setup_private_pat(ggtt->vm.gt);
 
 	return ggtt_probe_common(ggtt, size);
 }
@@ -1302,7 +1302,7 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt)
 		wbinvd_on_all_cpus();
 
 	if (GRAPHICS_VER(ggtt->vm.i915) >= 8)
-		setup_private_pat(ggtt->vm.gt->uncore);
+		setup_private_pat(ggtt->vm.gt);
 
 	intel_ggtt_restore_fences(ggtt);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index b367cfff48d5..445e171940fa 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -314,7 +314,11 @@ static void gen8_check_faults(struct intel_gt *gt)
 	i915_reg_t fault_reg, fault_data0_reg, fault_data1_reg;
 	u32 fault;
 
-	if (GRAPHICS_VER(gt->i915) >= 12) {
+	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) {
+		fault_reg = XEHP_RING_FAULT_REG;
+		fault_data0_reg = XEHP_FAULT_TLB_DATA0;
+		fault_data1_reg = XEHP_FAULT_TLB_DATA1;
+	} else if (GRAPHICS_VER(gt->i915) >= 12) {
 		fault_reg = GEN12_RING_FAULT_REG;
 		fault_data0_reg = GEN12_FAULT_TLB_DATA0;
 		fault_data1_reg = GEN12_FAULT_TLB_DATA1;
@@ -990,6 +994,13 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 		[COPY_ENGINE_CLASS]		= GEN12_BLT_TLB_INV_CR,
 		[COMPUTE_CLASS]			= GEN12_COMPCTX_TLB_INV_CR,
 	};
+	static const i915_reg_t xehp_regs[] = {
+		[RENDER_CLASS]			= XEHP_GFX_TLB_INV_CR,
+		[VIDEO_DECODE_CLASS]		= XEHP_VD_TLB_INV_CR,
+		[VIDEO_ENHANCEMENT_CLASS]	= XEHP_VE_TLB_INV_CR,
+		[COPY_ENGINE_CLASS]		= XEHP_BLT_TLB_INV_CR,
+		[COMPUTE_CLASS]			= XEHP_COMPCTX_TLB_INV_CR,
+	};
 	struct drm_i915_private *i915 = gt->i915;
 	struct intel_uncore *uncore = gt->uncore;
 	struct intel_engine_cs *engine;
@@ -998,7 +1009,10 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	const i915_reg_t *regs;
 	unsigned int num = 0;
 
-	if (GRAPHICS_VER(i915) == 12) {
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+		regs = xehp_regs;
+		num = ARRAY_SIZE(xehp_regs);
+	} else if (GRAPHICS_VER(i915) == 12) {
 		regs = gen12_regs;
 		num = ARRAY_SIZE(gen12_regs);
 	} else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) {
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index ba4ce668042c..68f45a748712 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -333,6 +333,7 @@
 #define GEN7_TLB_RD_ADDR			_MMIO(0x4700)
 
 #define GEN12_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
+#define XEHP_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
 
 #define XEHP_TILE0_ADDR_RANGE			_MMIO(0x4900)
 #define   XEHP_TILE_LMEM_RANGE_SHIFT		8
@@ -391,7 +392,8 @@
 #define   DIS_OVER_FETCH_CACHE			REG_BIT(1)
 #define   DIS_MULT_MISS_RD_SQUASH		REG_BIT(0)
 
-#define FF_MODE2				_MMIO(0x6604)
+#define GEN12_FF_MODE2				_MMIO(0x6604)
+#define XEHP_FF_MODE2				_MMIO(0x6604)
 #define   FF_MODE2_GS_TIMER_MASK		REG_GENMASK(31, 24)
 #define   FF_MODE2_GS_TIMER_224			REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224)
 #define   FF_MODE2_TDS_TIMER_MASK		REG_GENMASK(23, 16)
@@ -446,6 +448,7 @@
 #define GEN8_HDC_CHICKEN1			_MMIO(0x7304)
 
 #define GEN11_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
+#define XEHP_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
 #define   DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN	REG_BIT(12)
 #define   XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE	REG_BIT(12)
 #define   GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC	REG_BIT(11)
@@ -459,10 +462,9 @@
 #define   DISABLE_PIXEL_MASK_CAMMING		(1 << 14)
 
 #define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
-#define   GEN11_STATE_CACHE_REDIRECT_TO_CS	(1 << 11)
-
-#define SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
+#define XEHP_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)
+#define   GEN11_STATE_CACHE_REDIRECT_TO_CS	(1 << 11)
 
 #define GEN9_SLICE_PGCTL_ACK(slice)		_MMIO(0x804c + (slice) * 0x4)
 #define GEN10_SLICE_PGCTL_ACK(slice)		_MMIO(0x804c + ((slice) / 3) * 0x34 + \
@@ -707,7 +709,8 @@
 #define   GAMTLBVEBOX0_CLKGATE_DIS		REG_BIT(16)
 #define   LTCDD_CLKGATE_DIS			REG_BIT(10)
 
-#define SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
+#define GEN11_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
+#define XEHP_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
 #define   SARBUNIT_CLKGATE_DIS			(1 << 5)
 #define   RCCUNIT_CLKGATE_DIS			(1 << 7)
 #define   MSCUNIT_CLKGATE_DIS			(1 << 10)
@@ -722,7 +725,7 @@
 #define   VSUNIT_CLKGATE_DIS_TGL		REG_BIT(19)
 #define   PSDUNIT_CLKGATE_DIS			REG_BIT(5)
 
-#define SUBSLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x9524)
+#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE	_MMIO(0x9524)
 #define   DSS_ROUTER_CLKGATE_DIS		REG_BIT(28)
 #define   GWUNIT_CLKGATE_DIS			REG_BIT(16)
 
@@ -947,7 +950,8 @@
 
 /* MOCS (Memory Object Control State) registers */
 #define GEN9_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
-#define GEN9_LNCFCMOCS_REG_COUNT		32
+#define XEHP_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
+#define LNCFCMOCS_REG_COUNT			32
 
 #define GEN7_L3CNTLREG3				_MMIO(0xb024)
 
@@ -1039,11 +1043,14 @@
 #define GEN9_BLT_MOCS(i)			_MMIO(__GEN9_BCS0_MOCS0 + (i) * 4)
 
 #define GEN12_FAULT_TLB_DATA0			_MMIO(0xceb8)
+#define XEHP_FAULT_TLB_DATA0			_MMIO(0xceb8)
 #define GEN12_FAULT_TLB_DATA1			_MMIO(0xcebc)
+#define XEHP_FAULT_TLB_DATA1			_MMIO(0xcebc)
 #define   FAULT_VA_HIGH_BITS			(0xf << 0)
 #define   FAULT_GTT_SEL				(1 << 4)
 
 #define GEN12_RING_FAULT_REG			_MMIO(0xcec4)
+#define XEHP_RING_FAULT_REG			_MMIO(0xcec4)
 #define   GEN8_RING_FAULT_ENGINE_ID(x)		(((x) >> 12) & 0x7)
 #define   RING_FAULT_GTTSEL_MASK		(1 << 11)
 #define   RING_FAULT_SRCID(x)			(((x) >> 3) & 0xff)
@@ -1051,10 +1058,15 @@
 #define   RING_FAULT_VALID			(1 << 0)
 
 #define GEN12_GFX_TLB_INV_CR			_MMIO(0xced8)
+#define XEHP_GFX_TLB_INV_CR			_MMIO(0xced8)
 #define GEN12_VD_TLB_INV_CR			_MMIO(0xcedc)
+#define XEHP_VD_TLB_INV_CR			_MMIO(0xcedc)
 #define GEN12_VE_TLB_INV_CR			_MMIO(0xcee0)
+#define XEHP_VE_TLB_INV_CR			_MMIO(0xcee0)
 #define GEN12_BLT_TLB_INV_CR			_MMIO(0xcee4)
+#define XEHP_BLT_TLB_INV_CR			_MMIO(0xcee4)
 #define GEN12_COMPCTX_TLB_INV_CR		_MMIO(0xcf04)
+#define XEHP_COMPCTX_TLB_INV_CR			_MMIO(0xcf04)
 
 #define GEN12_MERT_MOD_CTRL			_MMIO(0xcf28)
 #define RENDER_MOD_CTRL				_MMIO(0xcf2c)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 2eaeba14319e..5ce98c30728a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -15,6 +15,7 @@
 #include "i915_trace.h"
 #include "i915_utils.h"
 #include "intel_gt.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_regs.h"
 #include "intel_gtt.h"
 
@@ -493,6 +494,18 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 	intel_uncore_write(uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB);
 }
 
+static void xehp_setup_private_ppat(struct intel_gt *gt)
+{
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
+}
+
 static void icl_setup_private_ppat(struct intel_uncore *uncore)
 {
 	intel_uncore_write(uncore,
@@ -585,13 +598,16 @@ static void chv_setup_private_ppat(struct intel_uncore *uncore)
 	intel_uncore_write(uncore, GEN8_PRIVATE_PAT_HI, upper_32_bits(pat));
 }
 
-void setup_private_pat(struct intel_uncore *uncore)
+void setup_private_pat(struct intel_gt *gt)
 {
-	struct drm_i915_private *i915 = uncore->i915;
+	struct intel_uncore *uncore = gt->uncore;
+	struct drm_i915_private *i915 = gt->i915;
 
 	GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
 
-	if (GRAPHICS_VER(i915) >= 12)
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+		xehp_setup_private_ppat(gt);
+	else if (GRAPHICS_VER(i915) >= 12)
 		tgl_setup_private_ppat(uncore);
 	else if (GRAPHICS_VER(i915) >= 11)
 		icl_setup_private_ppat(uncore);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index c0ca53cba9f0..5236c60f2a68 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -667,7 +667,7 @@ void ppgtt_unbind_vma(struct i915_address_space *vm,
 
 void gtt_write_workarounds(struct intel_gt *gt);
 
-void setup_private_pat(struct intel_uncore *uncore);
+void setup_private_pat(struct intel_gt *gt);
 
 int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index c6ebe2781076..06643701bf24 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -608,7 +608,10 @@ static void init_l3cc_table(struct intel_uncore *uncore,
 	u32 l3cc;
 
 	for_each_l3cc(l3cc, table, i)
-		intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
+		if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50))
+			intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc);
+		else
+			intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
 }
 
 void intel_mocs_init_engine(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 7a0f0146ee01..7c237eeeac85 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -572,7 +572,7 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 	wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
 			 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
 	wa_add(wal,
-	       FF_MODE2,
+	       XEHP_FF_MODE2,
 	       FF_MODE2_TDS_TIMER_MASK,
 	       FF_MODE2_TDS_TIMER_128,
 	       0, false);
@@ -599,7 +599,7 @@ static void gen12_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 	 * verification is ignored.
 	 */
 	wa_add(wal,
-	       FF_MODE2,
+	       GEN12_FF_MODE2,
 	       FF_MODE2_TDS_TIMER_MASK,
 	       FF_MODE2_TDS_TIMER_128,
 	       0, false);
@@ -637,7 +637,7 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine,
 	 * to Wa_1608008084.
 	 */
 	wa_add(wal,
-	       FF_MODE2,
+	       GEN12_FF_MODE2,
 	       FF_MODE2_GS_TIMER_MASK,
 	       FF_MODE2_GS_TIMER_224,
 	       0, false);
@@ -670,7 +670,7 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_14010469329:dg2_g10 */
-		wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
+		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
 			     XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
 
 		/*
@@ -678,12 +678,12 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 		 * Wa_22010613112:dg2_g10
 		 * Wa_14010698770:dg2_g10
 		 */
-		wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
+		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
 			     GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
 	}
 
 	/* Wa_16013271637:dg2 */
-	wa_masked_en(wal, SLICE_COMMON_ECO_CHICKEN1,
+	wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1,
 		     MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
 
 	/* Wa_14014947963:dg2 */
@@ -1265,14 +1265,14 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 
 	/* Wa_1406680159:icl,ehl */
 	wa_write_or(wal,
-		    SUBSLICE_UNIT_LEVEL_CLKGATE,
+		    GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
 		    GWUNIT_CLKGATE_DIS);
 
 	/* Wa_1607087056:icl,ehl,jsl */
 	if (IS_ICELAKE(i915) ||
 	    IS_JSL_EHL_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
 		wa_write_or(wal,
-			    SLICE_UNIT_LEVEL_CLKGATE,
+			    GEN11_SLICE_UNIT_LEVEL_CLKGATE,
 			    L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
 
 	/*
@@ -1332,7 +1332,7 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	/* Wa_1607087056:tgl also know as BUG:1409180338 */
 	if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
 		wa_write_or(wal,
-			    SLICE_UNIT_LEVEL_CLKGATE,
+			    GEN11_SLICE_UNIT_LEVEL_CLKGATE,
 			    L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
 
 	/* Wa_1408615072:tgl[a0] */
@@ -1351,7 +1351,7 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	/* Wa_1607087056:dg1 */
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
 		wa_write_or(wal,
-			    SLICE_UNIT_LEVEL_CLKGATE,
+			    GEN11_SLICE_UNIT_LEVEL_CLKGATE,
 			    L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
 
 	/* Wa_1409420604:dg1 */
@@ -1455,7 +1455,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 			    CG3DDISCFEG_CLKGATE_DIS);
 
 		/* Wa_14011006942:dg2 */
-		wa_write_or(wal, SUBSLICE_UNIT_LEVEL_CLKGATE,
+		wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
 			    DSS_ROUTER_CLKGATE_DIS);
 	}
 
@@ -1467,7 +1467,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 		wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS);
 
 		/* Wa_14011371254:dg2_g10 */
-		wa_write_or(wal, SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
+		wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
 
 		/* Wa_14011431319:dg2_g10 */
 		wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS |
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 657f0beb8e06..cc357fa0c270 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -373,8 +373,11 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
 					false);
 
 	/* add in local MOCS registers */
-	for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++)
-		ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
+	for (i = 0; i < LNCFCMOCS_REG_COUNT; i++)
+		if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+			ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false);
+		else
+			ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
 
 	return ret ? -1 : 0;
 }
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 02/14] drm/i915/xehp: Create separate reg definitions for new MCR registers
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

Starting in Xe_HP, several registers our driver works with have been
converted from singleton registers into replicated registers with
multicast behavior.  Although the registers are still located at the
same MMIO offsets as on previous platforms, let's duplicate the register
definitions in preparation for upcoming patches that will handle
multicast registers in a special manner.

The registers that are now replicated on Xe_HP are:
 * PAT_INDEX (mslice replication)
 * FF_MODE2 (gslice replication)
 * COMMON_SLICE_CHICKEN3 (gslice replication)
 * SLICE_COMMON_ECO_CHICKEN1 (gslice replication)
 * SLICE_UNIT_LEVEL_CLKGATE (gslice replication)
 * LNCFCMOCS (lncf replication)

Note that there are a couple places in selftest_mocs.c where the
gen9 version of LNCFCMOCS is still used without regards for which
platform we're on.  Those cases are just doing an offset lookup and not
issuing any CPU reads/writes of the register, so the potentially
multicast nature of the register doesn't come into play.

v2:
 - Add commit message note about the unconditional GEN9_LNCFCMOCS usage
   in selftest_mocs.  (Bala)
 - Include some additional TLB registers.

Bspec: 66534
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c        |  4 ++--
 drivers/gpu/drm/i915/gt/intel_gt.c          | 18 ++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     | 26 +++++++++++++++------
 drivers/gpu/drm/i915/gt/intel_gtt.c         | 22 ++++++++++++++---
 drivers/gpu/drm/i915/gt/intel_gtt.h         |  2 +-
 drivers/gpu/drm/i915/gt/intel_mocs.c        |  5 +++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 24 +++++++++----------
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  |  7 ++++--
 8 files changed, 78 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index b31fe0fb013f..af5b98b56a49 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -986,7 +986,7 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 
 	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
 
-	setup_private_pat(ggtt->vm.gt->uncore);
+	setup_private_pat(ggtt->vm.gt);
 
 	return ggtt_probe_common(ggtt, size);
 }
@@ -1302,7 +1302,7 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt)
 		wbinvd_on_all_cpus();
 
 	if (GRAPHICS_VER(ggtt->vm.i915) >= 8)
-		setup_private_pat(ggtt->vm.gt->uncore);
+		setup_private_pat(ggtt->vm.gt);
 
 	intel_ggtt_restore_fences(ggtt);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index b367cfff48d5..445e171940fa 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -314,7 +314,11 @@ static void gen8_check_faults(struct intel_gt *gt)
 	i915_reg_t fault_reg, fault_data0_reg, fault_data1_reg;
 	u32 fault;
 
-	if (GRAPHICS_VER(gt->i915) >= 12) {
+	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) {
+		fault_reg = XEHP_RING_FAULT_REG;
+		fault_data0_reg = XEHP_FAULT_TLB_DATA0;
+		fault_data1_reg = XEHP_FAULT_TLB_DATA1;
+	} else if (GRAPHICS_VER(gt->i915) >= 12) {
 		fault_reg = GEN12_RING_FAULT_REG;
 		fault_data0_reg = GEN12_FAULT_TLB_DATA0;
 		fault_data1_reg = GEN12_FAULT_TLB_DATA1;
@@ -990,6 +994,13 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 		[COPY_ENGINE_CLASS]		= GEN12_BLT_TLB_INV_CR,
 		[COMPUTE_CLASS]			= GEN12_COMPCTX_TLB_INV_CR,
 	};
+	static const i915_reg_t xehp_regs[] = {
+		[RENDER_CLASS]			= XEHP_GFX_TLB_INV_CR,
+		[VIDEO_DECODE_CLASS]		= XEHP_VD_TLB_INV_CR,
+		[VIDEO_ENHANCEMENT_CLASS]	= XEHP_VE_TLB_INV_CR,
+		[COPY_ENGINE_CLASS]		= XEHP_BLT_TLB_INV_CR,
+		[COMPUTE_CLASS]			= XEHP_COMPCTX_TLB_INV_CR,
+	};
 	struct drm_i915_private *i915 = gt->i915;
 	struct intel_uncore *uncore = gt->uncore;
 	struct intel_engine_cs *engine;
@@ -998,7 +1009,10 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	const i915_reg_t *regs;
 	unsigned int num = 0;
 
-	if (GRAPHICS_VER(i915) == 12) {
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+		regs = xehp_regs;
+		num = ARRAY_SIZE(xehp_regs);
+	} else if (GRAPHICS_VER(i915) == 12) {
 		regs = gen12_regs;
 		num = ARRAY_SIZE(gen12_regs);
 	} else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) {
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index ba4ce668042c..68f45a748712 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -333,6 +333,7 @@
 #define GEN7_TLB_RD_ADDR			_MMIO(0x4700)
 
 #define GEN12_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
+#define XEHP_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
 
 #define XEHP_TILE0_ADDR_RANGE			_MMIO(0x4900)
 #define   XEHP_TILE_LMEM_RANGE_SHIFT		8
@@ -391,7 +392,8 @@
 #define   DIS_OVER_FETCH_CACHE			REG_BIT(1)
 #define   DIS_MULT_MISS_RD_SQUASH		REG_BIT(0)
 
-#define FF_MODE2				_MMIO(0x6604)
+#define GEN12_FF_MODE2				_MMIO(0x6604)
+#define XEHP_FF_MODE2				_MMIO(0x6604)
 #define   FF_MODE2_GS_TIMER_MASK		REG_GENMASK(31, 24)
 #define   FF_MODE2_GS_TIMER_224			REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224)
 #define   FF_MODE2_TDS_TIMER_MASK		REG_GENMASK(23, 16)
@@ -446,6 +448,7 @@
 #define GEN8_HDC_CHICKEN1			_MMIO(0x7304)
 
 #define GEN11_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
+#define XEHP_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
 #define   DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN	REG_BIT(12)
 #define   XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE	REG_BIT(12)
 #define   GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC	REG_BIT(11)
@@ -459,10 +462,9 @@
 #define   DISABLE_PIXEL_MASK_CAMMING		(1 << 14)
 
 #define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
-#define   GEN11_STATE_CACHE_REDIRECT_TO_CS	(1 << 11)
-
-#define SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
+#define XEHP_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)
+#define   GEN11_STATE_CACHE_REDIRECT_TO_CS	(1 << 11)
 
 #define GEN9_SLICE_PGCTL_ACK(slice)		_MMIO(0x804c + (slice) * 0x4)
 #define GEN10_SLICE_PGCTL_ACK(slice)		_MMIO(0x804c + ((slice) / 3) * 0x34 + \
@@ -707,7 +709,8 @@
 #define   GAMTLBVEBOX0_CLKGATE_DIS		REG_BIT(16)
 #define   LTCDD_CLKGATE_DIS			REG_BIT(10)
 
-#define SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
+#define GEN11_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
+#define XEHP_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
 #define   SARBUNIT_CLKGATE_DIS			(1 << 5)
 #define   RCCUNIT_CLKGATE_DIS			(1 << 7)
 #define   MSCUNIT_CLKGATE_DIS			(1 << 10)
@@ -722,7 +725,7 @@
 #define   VSUNIT_CLKGATE_DIS_TGL		REG_BIT(19)
 #define   PSDUNIT_CLKGATE_DIS			REG_BIT(5)
 
-#define SUBSLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x9524)
+#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE	_MMIO(0x9524)
 #define   DSS_ROUTER_CLKGATE_DIS		REG_BIT(28)
 #define   GWUNIT_CLKGATE_DIS			REG_BIT(16)
 
@@ -947,7 +950,8 @@
 
 /* MOCS (Memory Object Control State) registers */
 #define GEN9_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
-#define GEN9_LNCFCMOCS_REG_COUNT		32
+#define XEHP_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
+#define LNCFCMOCS_REG_COUNT			32
 
 #define GEN7_L3CNTLREG3				_MMIO(0xb024)
 
@@ -1039,11 +1043,14 @@
 #define GEN9_BLT_MOCS(i)			_MMIO(__GEN9_BCS0_MOCS0 + (i) * 4)
 
 #define GEN12_FAULT_TLB_DATA0			_MMIO(0xceb8)
+#define XEHP_FAULT_TLB_DATA0			_MMIO(0xceb8)
 #define GEN12_FAULT_TLB_DATA1			_MMIO(0xcebc)
+#define XEHP_FAULT_TLB_DATA1			_MMIO(0xcebc)
 #define   FAULT_VA_HIGH_BITS			(0xf << 0)
 #define   FAULT_GTT_SEL				(1 << 4)
 
 #define GEN12_RING_FAULT_REG			_MMIO(0xcec4)
+#define XEHP_RING_FAULT_REG			_MMIO(0xcec4)
 #define   GEN8_RING_FAULT_ENGINE_ID(x)		(((x) >> 12) & 0x7)
 #define   RING_FAULT_GTTSEL_MASK		(1 << 11)
 #define   RING_FAULT_SRCID(x)			(((x) >> 3) & 0xff)
@@ -1051,10 +1058,15 @@
 #define   RING_FAULT_VALID			(1 << 0)
 
 #define GEN12_GFX_TLB_INV_CR			_MMIO(0xced8)
+#define XEHP_GFX_TLB_INV_CR			_MMIO(0xced8)
 #define GEN12_VD_TLB_INV_CR			_MMIO(0xcedc)
+#define XEHP_VD_TLB_INV_CR			_MMIO(0xcedc)
 #define GEN12_VE_TLB_INV_CR			_MMIO(0xcee0)
+#define XEHP_VE_TLB_INV_CR			_MMIO(0xcee0)
 #define GEN12_BLT_TLB_INV_CR			_MMIO(0xcee4)
+#define XEHP_BLT_TLB_INV_CR			_MMIO(0xcee4)
 #define GEN12_COMPCTX_TLB_INV_CR		_MMIO(0xcf04)
+#define XEHP_COMPCTX_TLB_INV_CR			_MMIO(0xcf04)
 
 #define GEN12_MERT_MOD_CTRL			_MMIO(0xcf28)
 #define RENDER_MOD_CTRL				_MMIO(0xcf2c)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 2eaeba14319e..5ce98c30728a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -15,6 +15,7 @@
 #include "i915_trace.h"
 #include "i915_utils.h"
 #include "intel_gt.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_regs.h"
 #include "intel_gtt.h"
 
@@ -493,6 +494,18 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 	intel_uncore_write(uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB);
 }
 
+static void xehp_setup_private_ppat(struct intel_gt *gt)
+{
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
+	intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
+}
+
 static void icl_setup_private_ppat(struct intel_uncore *uncore)
 {
 	intel_uncore_write(uncore,
@@ -585,13 +598,16 @@ static void chv_setup_private_ppat(struct intel_uncore *uncore)
 	intel_uncore_write(uncore, GEN8_PRIVATE_PAT_HI, upper_32_bits(pat));
 }
 
-void setup_private_pat(struct intel_uncore *uncore)
+void setup_private_pat(struct intel_gt *gt)
 {
-	struct drm_i915_private *i915 = uncore->i915;
+	struct intel_uncore *uncore = gt->uncore;
+	struct drm_i915_private *i915 = gt->i915;
 
 	GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
 
-	if (GRAPHICS_VER(i915) >= 12)
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+		xehp_setup_private_ppat(gt);
+	else if (GRAPHICS_VER(i915) >= 12)
 		tgl_setup_private_ppat(uncore);
 	else if (GRAPHICS_VER(i915) >= 11)
 		icl_setup_private_ppat(uncore);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index c0ca53cba9f0..5236c60f2a68 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -667,7 +667,7 @@ void ppgtt_unbind_vma(struct i915_address_space *vm,
 
 void gtt_write_workarounds(struct intel_gt *gt);
 
-void setup_private_pat(struct intel_uncore *uncore);
+void setup_private_pat(struct intel_gt *gt);
 
 int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index c6ebe2781076..06643701bf24 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -608,7 +608,10 @@ static void init_l3cc_table(struct intel_uncore *uncore,
 	u32 l3cc;
 
 	for_each_l3cc(l3cc, table, i)
-		intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
+		if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50))
+			intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc);
+		else
+			intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
 }
 
 void intel_mocs_init_engine(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 7a0f0146ee01..7c237eeeac85 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -572,7 +572,7 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 	wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
 			 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
 	wa_add(wal,
-	       FF_MODE2,
+	       XEHP_FF_MODE2,
 	       FF_MODE2_TDS_TIMER_MASK,
 	       FF_MODE2_TDS_TIMER_128,
 	       0, false);
@@ -599,7 +599,7 @@ static void gen12_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 	 * verification is ignored.
 	 */
 	wa_add(wal,
-	       FF_MODE2,
+	       GEN12_FF_MODE2,
 	       FF_MODE2_TDS_TIMER_MASK,
 	       FF_MODE2_TDS_TIMER_128,
 	       0, false);
@@ -637,7 +637,7 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine,
 	 * to Wa_1608008084.
 	 */
 	wa_add(wal,
-	       FF_MODE2,
+	       GEN12_FF_MODE2,
 	       FF_MODE2_GS_TIMER_MASK,
 	       FF_MODE2_GS_TIMER_224,
 	       0, false);
@@ -670,7 +670,7 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_14010469329:dg2_g10 */
-		wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
+		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
 			     XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
 
 		/*
@@ -678,12 +678,12 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 		 * Wa_22010613112:dg2_g10
 		 * Wa_14010698770:dg2_g10
 		 */
-		wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
+		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
 			     GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
 	}
 
 	/* Wa_16013271637:dg2 */
-	wa_masked_en(wal, SLICE_COMMON_ECO_CHICKEN1,
+	wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1,
 		     MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
 
 	/* Wa_14014947963:dg2 */
@@ -1265,14 +1265,14 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 
 	/* Wa_1406680159:icl,ehl */
 	wa_write_or(wal,
-		    SUBSLICE_UNIT_LEVEL_CLKGATE,
+		    GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
 		    GWUNIT_CLKGATE_DIS);
 
 	/* Wa_1607087056:icl,ehl,jsl */
 	if (IS_ICELAKE(i915) ||
 	    IS_JSL_EHL_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
 		wa_write_or(wal,
-			    SLICE_UNIT_LEVEL_CLKGATE,
+			    GEN11_SLICE_UNIT_LEVEL_CLKGATE,
 			    L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
 
 	/*
@@ -1332,7 +1332,7 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	/* Wa_1607087056:tgl also know as BUG:1409180338 */
 	if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
 		wa_write_or(wal,
-			    SLICE_UNIT_LEVEL_CLKGATE,
+			    GEN11_SLICE_UNIT_LEVEL_CLKGATE,
 			    L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
 
 	/* Wa_1408615072:tgl[a0] */
@@ -1351,7 +1351,7 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	/* Wa_1607087056:dg1 */
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
 		wa_write_or(wal,
-			    SLICE_UNIT_LEVEL_CLKGATE,
+			    GEN11_SLICE_UNIT_LEVEL_CLKGATE,
 			    L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
 
 	/* Wa_1409420604:dg1 */
@@ -1455,7 +1455,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 			    CG3DDISCFEG_CLKGATE_DIS);
 
 		/* Wa_14011006942:dg2 */
-		wa_write_or(wal, SUBSLICE_UNIT_LEVEL_CLKGATE,
+		wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
 			    DSS_ROUTER_CLKGATE_DIS);
 	}
 
@@ -1467,7 +1467,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 		wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS);
 
 		/* Wa_14011371254:dg2_g10 */
-		wa_write_or(wal, SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
+		wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
 
 		/* Wa_14011431319:dg2_g10 */
 		wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS |
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 657f0beb8e06..cc357fa0c270 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -373,8 +373,11 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
 					false);
 
 	/* add in local MOCS registers */
-	for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++)
-		ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
+	for (i = 0; i < LNCFCMOCS_REG_COUNT; i++)
+		if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+			ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false);
+		else
+			ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
 
 	return ret ? -1 : 0;
 }
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 03/14] drm/i915/gt: Drop a few unused register definitions
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

Let's drop a few register definitions that are unused anywhere in the
driver today.  Since the referenced offsets are part of what is now
considered a multicast register region, the current definitions would
not be correct for use on any future platform.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 68f45a748712..3a50e8e966aa 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -454,13 +454,6 @@
 #define   GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC	REG_BIT(11)
 #define   GEN12_DISABLE_CPS_AWARE_COLOR_PIPE	REG_BIT(9)
 
-/* GEN9 chicken */
-#define SLICE_ECO_CHICKEN0			_MMIO(0x7308)
-#define   PIXEL_MASK_CAMMING_DISABLE		(1 << 14)
-
-#define GEN9_SLICE_COMMON_ECO_CHICKEN0		_MMIO(0x7308)
-#define   DISABLE_PIXEL_MASK_CAMMING		(1 << 14)
-
 #define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
 #define XEHP_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)
@@ -967,11 +960,6 @@
 #define GEN7_L3LOG(slice, i)			_MMIO(0xb070 + (slice) * 0x200 + (i) * 4)
 #define   GEN7_L3LOG_SIZE			0x80
 
-#define GEN10_SCRATCH_LNCF2			_MMIO(0xb0a0)
-#define   PMFLUSHDONE_LNICRSDROP		(1 << 20)
-#define   PMFLUSH_GAPL3UNBLOCK			(1 << 21)
-#define   PMFLUSHDONE_LNEBLK			(1 << 22)
-
 #define XEHP_L3NODEARBCFG			_MMIO(0xb0b4)
 #define   XEHP_LNESPARE				REG_BIT(19)
 
@@ -986,9 +974,6 @@
 #define   L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)
 #define   L3_PRIO_CREDITS_MASK			((0x1f << 19) | (0x1f << 14))
 
-#define GEN10_L3_CHICKEN_MODE_REGISTER		_MMIO(0xb114)
-#define   GEN11_I2M_WRITE_DISABLE		(1 << 28)
-
 #define GEN8_L3SQCREG4				_MMIO(0xb118)
 #define   GEN11_LQSC_CLEAN_EVICT_DISABLE	(1 << 6)
 #define   GEN8_LQSC_RO_PERF_DIS			(1 << 27)
@@ -1191,8 +1176,6 @@
 #define SARB_CHICKEN1				_MMIO(0xe90c)
 #define   COMP_CKN_IN				REG_GENMASK(30, 29)
 
-#define GEN7_HALF_SLICE_CHICKEN1_GT2		_MMIO(0xf100)
-
 #define GEN7_ROW_CHICKEN2_GT2			_MMIO(0xf4f4)
 #define   DOP_CLOCK_GATING_DISABLE		(1 << 0)
 #define   PUSH_CONSTANT_DEREF_DISABLE		(1 << 8)
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 03/14] drm/i915/gt: Drop a few unused register definitions
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

Let's drop a few register definitions that are unused anywhere in the
driver today.  Since the referenced offsets are part of what is now
considered a multicast register region, the current definitions would
not be correct for use on any future platform.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 68f45a748712..3a50e8e966aa 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -454,13 +454,6 @@
 #define   GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC	REG_BIT(11)
 #define   GEN12_DISABLE_CPS_AWARE_COLOR_PIPE	REG_BIT(9)
 
-/* GEN9 chicken */
-#define SLICE_ECO_CHICKEN0			_MMIO(0x7308)
-#define   PIXEL_MASK_CAMMING_DISABLE		(1 << 14)
-
-#define GEN9_SLICE_COMMON_ECO_CHICKEN0		_MMIO(0x7308)
-#define   DISABLE_PIXEL_MASK_CAMMING		(1 << 14)
-
 #define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
 #define XEHP_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)
@@ -967,11 +960,6 @@
 #define GEN7_L3LOG(slice, i)			_MMIO(0xb070 + (slice) * 0x200 + (i) * 4)
 #define   GEN7_L3LOG_SIZE			0x80
 
-#define GEN10_SCRATCH_LNCF2			_MMIO(0xb0a0)
-#define   PMFLUSHDONE_LNICRSDROP		(1 << 20)
-#define   PMFLUSH_GAPL3UNBLOCK			(1 << 21)
-#define   PMFLUSHDONE_LNEBLK			(1 << 22)
-
 #define XEHP_L3NODEARBCFG			_MMIO(0xb0b4)
 #define   XEHP_LNESPARE				REG_BIT(19)
 
@@ -986,9 +974,6 @@
 #define   L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)
 #define   L3_PRIO_CREDITS_MASK			((0x1f << 19) | (0x1f << 14))
 
-#define GEN10_L3_CHICKEN_MODE_REGISTER		_MMIO(0xb114)
-#define   GEN11_I2M_WRITE_DISABLE		(1 << 28)
-
 #define GEN8_L3SQCREG4				_MMIO(0xb118)
 #define   GEN11_LQSC_CLEAN_EVICT_DISABLE	(1 << 6)
 #define   GEN8_LQSC_RO_PERF_DIS			(1 << 27)
@@ -1191,8 +1176,6 @@
 #define SARB_CHICKEN1				_MMIO(0xe90c)
 #define   COMP_CKN_IN				REG_GENMASK(30, 29)
 
-#define GEN7_HALF_SLICE_CHICKEN1_GT2		_MMIO(0xf100)
-
 #define GEN7_ROW_CHICKEN2_GT2			_MMIO(0xf4f4)
 #define   DOP_CLOCK_GATING_DISABLE		(1 << 0)
 #define   PUSH_CONSTANT_DEREF_DISABLE		(1 << 8)
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 04/14] drm/i915/gt: Correct prefix on a few registers
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

We have a few registers that have existed for several hardware
generations, but are only used by the driver on Xe_HP and beyond.  In
cases where the Xe_HP version of the register is now replicated and uses
multicast behavior, but earlier generations were singleton, let's change
the register prefix to "XEHP_" to help clarify that we're using the
newer multicast form of the register.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  8 ++++----
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 10 +++++-----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 3a50e8e966aa..86757dd22c13 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -486,7 +486,7 @@
 
 #define GEN8_RC6_CTX_INFO			_MMIO(0x8504)
 
-#define GEN12_SQCM				_MMIO(0x8724)
+#define XEHP_SQCM				_MMIO(0x8724)
 #define   EN_32B_ACCESS				REG_BIT(30)
 
 #define HSW_IDICR				_MMIO(0x9008)
@@ -989,7 +989,7 @@
 #define GEN11_SCRATCH2				_MMIO(0xb140)
 #define   GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE	(1 << 19)
 
-#define GEN11_L3SQCREG5				_MMIO(0xb158)
+#define XEHP_L3SQCREG5				_MMIO(0xb158)
 #define   L3_PWM_TIMER_INIT_VAL_MASK		REG_GENMASK(9, 0)
 
 #define MLTICTXCTL				_MMIO(0xb170)
@@ -1053,7 +1053,7 @@
 #define GEN12_COMPCTX_TLB_INV_CR		_MMIO(0xcf04)
 #define XEHP_COMPCTX_TLB_INV_CR			_MMIO(0xcf04)
 
-#define GEN12_MERT_MOD_CTRL			_MMIO(0xcf28)
+#define XEHP_MERT_MOD_CTRL			_MMIO(0xcf28)
 #define RENDER_MOD_CTRL				_MMIO(0xcf2c)
 #define COMP_MOD_CTRL				_MMIO(0xcf30)
 #define VDBX_MOD_CTRL				_MMIO(0xcf34)
@@ -1155,7 +1155,7 @@
 #define EU_PERF_CNTL1				_MMIO(0xe558)
 #define EU_PERF_CNTL5				_MMIO(0xe55c)
 
-#define GEN12_HDC_CHICKEN0			_MMIO(0xe5f0)
+#define XEHP_HDC_CHICKEN0			_MMIO(0xe5f0)
 #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
 #define ICL_HDC_MODE				_MMIO(0xe5f4)
 
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 7c237eeeac85..7bb878581d00 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -569,7 +569,7 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 				   struct i915_wa_list *wal)
 {
 	wa_masked_en(wal, CHICKEN_RASTER_2, TBIMR_FAST_CLIP);
-	wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
+	wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
 			 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
 	wa_add(wal,
 	       XEHP_FF_MODE2,
@@ -1514,7 +1514,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	 * recommended tuning settings documented in the bspec's
 	 * performance guide section.
 	 */
-	wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS);
+	wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS);
 
 	/* Wa_14015795083 */
 	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
@@ -2170,7 +2170,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_22010960976:dg2
 		 * Wa_14013347512:dg2
 		 */
-		wa_masked_dis(wal, GEN12_HDC_CHICKEN0,
+		wa_masked_dis(wal, XEHP_HDC_CHICKEN0,
 			      LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK);
 	}
 
@@ -2223,7 +2223,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) ||
 	    IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14012362059:dg2 */
-		wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_B0, STEP_FOREVER) ||
@@ -2764,7 +2764,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 		}
 
 		/* Wa_14012362059:xehpsdv */
-		wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 
 		/* Wa_14014368820:xehpsdv */
 		wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS |
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 04/14] drm/i915/gt: Correct prefix on a few registers
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

We have a few registers that have existed for several hardware
generations, but are only used by the driver on Xe_HP and beyond.  In
cases where the Xe_HP version of the register is now replicated and uses
multicast behavior, but earlier generations were singleton, let's change
the register prefix to "XEHP_" to help clarify that we're using the
newer multicast form of the register.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  8 ++++----
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 10 +++++-----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 3a50e8e966aa..86757dd22c13 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -486,7 +486,7 @@
 
 #define GEN8_RC6_CTX_INFO			_MMIO(0x8504)
 
-#define GEN12_SQCM				_MMIO(0x8724)
+#define XEHP_SQCM				_MMIO(0x8724)
 #define   EN_32B_ACCESS				REG_BIT(30)
 
 #define HSW_IDICR				_MMIO(0x9008)
@@ -989,7 +989,7 @@
 #define GEN11_SCRATCH2				_MMIO(0xb140)
 #define   GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE	(1 << 19)
 
-#define GEN11_L3SQCREG5				_MMIO(0xb158)
+#define XEHP_L3SQCREG5				_MMIO(0xb158)
 #define   L3_PWM_TIMER_INIT_VAL_MASK		REG_GENMASK(9, 0)
 
 #define MLTICTXCTL				_MMIO(0xb170)
@@ -1053,7 +1053,7 @@
 #define GEN12_COMPCTX_TLB_INV_CR		_MMIO(0xcf04)
 #define XEHP_COMPCTX_TLB_INV_CR			_MMIO(0xcf04)
 
-#define GEN12_MERT_MOD_CTRL			_MMIO(0xcf28)
+#define XEHP_MERT_MOD_CTRL			_MMIO(0xcf28)
 #define RENDER_MOD_CTRL				_MMIO(0xcf2c)
 #define COMP_MOD_CTRL				_MMIO(0xcf30)
 #define VDBX_MOD_CTRL				_MMIO(0xcf34)
@@ -1155,7 +1155,7 @@
 #define EU_PERF_CNTL1				_MMIO(0xe558)
 #define EU_PERF_CNTL5				_MMIO(0xe55c)
 
-#define GEN12_HDC_CHICKEN0			_MMIO(0xe5f0)
+#define XEHP_HDC_CHICKEN0			_MMIO(0xe5f0)
 #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
 #define ICL_HDC_MODE				_MMIO(0xe5f4)
 
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 7c237eeeac85..7bb878581d00 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -569,7 +569,7 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 				   struct i915_wa_list *wal)
 {
 	wa_masked_en(wal, CHICKEN_RASTER_2, TBIMR_FAST_CLIP);
-	wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
+	wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
 			 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
 	wa_add(wal,
 	       XEHP_FF_MODE2,
@@ -1514,7 +1514,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	 * recommended tuning settings documented in the bspec's
 	 * performance guide section.
 	 */
-	wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS);
+	wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS);
 
 	/* Wa_14015795083 */
 	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
@@ -2170,7 +2170,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_22010960976:dg2
 		 * Wa_14013347512:dg2
 		 */
-		wa_masked_dis(wal, GEN12_HDC_CHICKEN0,
+		wa_masked_dis(wal, XEHP_HDC_CHICKEN0,
 			      LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK);
 	}
 
@@ -2223,7 +2223,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) ||
 	    IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14012362059:dg2 */
-		wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_B0, STEP_FOREVER) ||
@@ -2764,7 +2764,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 		}
 
 		/* Wa_14012362059:xehpsdv */
-		wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 
 		/* Wa_14014368820:xehpsdv */
 		wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS |
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 05/14] drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

There are cases where we wish to read from any non-terminated MCR
register instance (or the primary instance in the case of GAM ranges),
clear/set some bits, and then write the value back out to the register
in a multicast manner.  Adding a "multicast RMW" will avoid the need to
open-code this.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 24 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h |  3 +++
 2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index a2047a68ea7a..962d3e974384 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -302,6 +302,30 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va
 	intel_uncore_write_fw(gt->uncore, reg, value);
 }
 
+/**
+ * intel_gt_mcr_multicast_rmw - Performs a multicast RMW operations
+ * @gt: GT structure
+ * @reg: the MCR register to read and write
+ * @clear: bits to clear during RMW
+ * @set: bits to set during RMW
+ *
+ * Performs a read-modify-write on an MCR register in a multicast manner.
+ * This operation only makes sense on MCR registers where all instances are
+ * expected to have the same value.  The read will target any non-terminated
+ * instance and the write will be applied to all instances.
+ *
+ * This function assumes the caller is already holding any necessary forcewake
+ * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should
+ * be obtained automatically.
+ */
+void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
+				u32 clear, u32 set)
+{
+	u32 val = intel_gt_mcr_read_any(gt, reg);
+
+	intel_gt_mcr_multicast_write(gt, reg, (val & ~clear) | set);
+}
+
 /*
  * reg_needs_read_steering - determine whether a register read requires
  *     explicit steering
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
index 77a8b11c287d..e47e75e66835 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -24,6 +24,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
 void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
 				     i915_reg_t reg, u32 value);
 
+void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
+				u32 clear, u32 set);
+
 void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
 					     i915_reg_t reg,
 					     u8 *group, u8 *instance);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 05/14] drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

There are cases where we wish to read from any non-terminated MCR
register instance (or the primary instance in the case of GAM ranges),
clear/set some bits, and then write the value back out to the register
in a multicast manner.  Adding a "multicast RMW" will avoid the need to
open-code this.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 24 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h |  3 +++
 2 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index a2047a68ea7a..962d3e974384 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -302,6 +302,30 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va
 	intel_uncore_write_fw(gt->uncore, reg, value);
 }
 
+/**
+ * intel_gt_mcr_multicast_rmw - Performs a multicast RMW operations
+ * @gt: GT structure
+ * @reg: the MCR register to read and write
+ * @clear: bits to clear during RMW
+ * @set: bits to set during RMW
+ *
+ * Performs a read-modify-write on an MCR register in a multicast manner.
+ * This operation only makes sense on MCR registers where all instances are
+ * expected to have the same value.  The read will target any non-terminated
+ * instance and the write will be applied to all instances.
+ *
+ * This function assumes the caller is already holding any necessary forcewake
+ * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should
+ * be obtained automatically.
+ */
+void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
+				u32 clear, u32 set)
+{
+	u32 val = intel_gt_mcr_read_any(gt, reg);
+
+	intel_gt_mcr_multicast_write(gt, reg, (val & ~clear) | set);
+}
+
 /*
  * reg_needs_read_steering - determine whether a register read requires
  *     explicit steering
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
index 77a8b11c287d..e47e75e66835 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -24,6 +24,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
 void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
 				     i915_reg_t reg, u32 value);
 
+void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
+				u32 clear, u32 set);
+
 void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
 					     i915_reg_t reg,
 					     u8 *group, u8 *instance);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 06/14] drm/i915/xehp: Check for faults on primary GAM
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

On Xe_HP the fault registers are now in a multicast register range.
However as part of the GAM these registers follow special rules and we
need only read from the "primary" GAM's instance to get the information
we need.  So a single intel_gt_mcr_read_any() (which will automatically
steer to the primary GAM) is sufficient; we don't need to loop over each
instance of the MCR register.

v2:
 - Update more instances of fault registers.  (Bala)

Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c    | 52 +++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gpu_error.c | 12 +++++--
 2 files changed, 55 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 445e171940fa..e14f159ad9fc 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -270,7 +270,11 @@ intel_gt_clear_error_registers(struct intel_gt *gt,
 				   I915_MASTER_ERROR_INTERRUPT);
 	}
 
-	if (GRAPHICS_VER(i915) >= 12) {
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+		intel_gt_mcr_multicast_rmw(gt, XEHP_RING_FAULT_REG,
+					   RING_FAULT_VALID, 0);
+		intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG);
+	} else if (GRAPHICS_VER(i915) >= 12) {
 		rmw_clear(uncore, GEN12_RING_FAULT_REG, RING_FAULT_VALID);
 		intel_uncore_posting_read(uncore, GEN12_RING_FAULT_REG);
 	} else if (GRAPHICS_VER(i915) >= 8) {
@@ -308,17 +312,49 @@ static void gen6_check_faults(struct intel_gt *gt)
 	}
 }
 
+static void xehp_check_faults(struct intel_gt *gt)
+{
+	u32 fault;
+
+	/*
+	 * Although the fault register now lives in an MCR register range,
+	 * the GAM registers are special and we only truly need to read
+	 * the "primary" GAM instance rather than handling each instance
+	 * individually.  intel_gt_mcr_read_any() will automatically steer
+	 * toward the primary instance.
+	 */
+	fault = intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG);
+	if (fault & RING_FAULT_VALID) {
+		u32 fault_data0, fault_data1;
+		u64 fault_addr;
+
+		fault_data0 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA0);
+		fault_data1 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA1);
+
+		fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) |
+			     ((u64)fault_data0 << 12);
+
+		drm_dbg(&gt->i915->drm, "Unexpected fault\n"
+			"\tAddr: 0x%08x_%08x\n"
+			"\tAddress space: %s\n"
+			"\tEngine ID: %d\n"
+			"\tSource ID: %d\n"
+			"\tType: %d\n",
+			upper_32_bits(fault_addr), lower_32_bits(fault_addr),
+			fault_data1 & FAULT_GTT_SEL ? "GGTT" : "PPGTT",
+			GEN8_RING_FAULT_ENGINE_ID(fault),
+			RING_FAULT_SRCID(fault),
+			RING_FAULT_FAULT_TYPE(fault));
+	}
+}
+
 static void gen8_check_faults(struct intel_gt *gt)
 {
 	struct intel_uncore *uncore = gt->uncore;
 	i915_reg_t fault_reg, fault_data0_reg, fault_data1_reg;
 	u32 fault;
 
-	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) {
-		fault_reg = XEHP_RING_FAULT_REG;
-		fault_data0_reg = XEHP_FAULT_TLB_DATA0;
-		fault_data1_reg = XEHP_FAULT_TLB_DATA1;
-	} else if (GRAPHICS_VER(gt->i915) >= 12) {
+	if (GRAPHICS_VER(gt->i915) >= 12) {
 		fault_reg = GEN12_RING_FAULT_REG;
 		fault_data0_reg = GEN12_FAULT_TLB_DATA0;
 		fault_data1_reg = GEN12_FAULT_TLB_DATA1;
@@ -358,7 +394,9 @@ void intel_gt_check_and_clear_faults(struct intel_gt *gt)
 	struct drm_i915_private *i915 = gt->i915;
 
 	/* From GEN8 onwards we only have one 'All Engine Fault Register' */
-	if (GRAPHICS_VER(i915) >= 8)
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+		xehp_check_faults(gt);
+	else if (GRAPHICS_VER(i915) >= 8)
 		gen8_check_faults(gt);
 	else if (GRAPHICS_VER(i915) >= 6)
 		gen6_check_faults(gt);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 9ea2fe34e7d3..f2d53edcd2ee 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1221,7 +1221,10 @@ static void engine_record_registers(struct intel_engine_coredump *ee)
 	if (GRAPHICS_VER(i915) >= 6) {
 		ee->rc_psmi = ENGINE_READ(engine, RING_PSMI_CTL);
 
-		if (GRAPHICS_VER(i915) >= 12)
+		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+			ee->fault_reg = intel_gt_mcr_read_any(engine->gt,
+							      XEHP_RING_FAULT_REG);
+		else if (GRAPHICS_VER(i915) >= 12)
 			ee->fault_reg = intel_uncore_read(engine->uncore,
 							  GEN12_RING_FAULT_REG);
 		else if (GRAPHICS_VER(i915) >= 8)
@@ -1820,7 +1823,12 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt)
 	if (GRAPHICS_VER(i915) == 7)
 		gt->err_int = intel_uncore_read(uncore, GEN7_ERR_INT);
 
-	if (GRAPHICS_VER(i915) >= 12) {
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+		gt->fault_data0 = intel_gt_mcr_read_any((struct intel_gt *)gt->_gt,
+							XEHP_FAULT_TLB_DATA0);
+		gt->fault_data1 = intel_gt_mcr_read_any((struct intel_gt *)gt->_gt,
+							XEHP_FAULT_TLB_DATA1);
+	} else if (GRAPHICS_VER(i915) >= 12) {
 		gt->fault_data0 = intel_uncore_read(uncore,
 						    GEN12_FAULT_TLB_DATA0);
 		gt->fault_data1 = intel_uncore_read(uncore,
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 06/14] drm/i915/xehp: Check for faults on primary GAM
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

On Xe_HP the fault registers are now in a multicast register range.
However as part of the GAM these registers follow special rules and we
need only read from the "primary" GAM's instance to get the information
we need.  So a single intel_gt_mcr_read_any() (which will automatically
steer to the primary GAM) is sufficient; we don't need to loop over each
instance of the MCR register.

v2:
 - Update more instances of fault registers.  (Bala)

Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c    | 52 +++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gpu_error.c | 12 +++++--
 2 files changed, 55 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 445e171940fa..e14f159ad9fc 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -270,7 +270,11 @@ intel_gt_clear_error_registers(struct intel_gt *gt,
 				   I915_MASTER_ERROR_INTERRUPT);
 	}
 
-	if (GRAPHICS_VER(i915) >= 12) {
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+		intel_gt_mcr_multicast_rmw(gt, XEHP_RING_FAULT_REG,
+					   RING_FAULT_VALID, 0);
+		intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG);
+	} else if (GRAPHICS_VER(i915) >= 12) {
 		rmw_clear(uncore, GEN12_RING_FAULT_REG, RING_FAULT_VALID);
 		intel_uncore_posting_read(uncore, GEN12_RING_FAULT_REG);
 	} else if (GRAPHICS_VER(i915) >= 8) {
@@ -308,17 +312,49 @@ static void gen6_check_faults(struct intel_gt *gt)
 	}
 }
 
+static void xehp_check_faults(struct intel_gt *gt)
+{
+	u32 fault;
+
+	/*
+	 * Although the fault register now lives in an MCR register range,
+	 * the GAM registers are special and we only truly need to read
+	 * the "primary" GAM instance rather than handling each instance
+	 * individually.  intel_gt_mcr_read_any() will automatically steer
+	 * toward the primary instance.
+	 */
+	fault = intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG);
+	if (fault & RING_FAULT_VALID) {
+		u32 fault_data0, fault_data1;
+		u64 fault_addr;
+
+		fault_data0 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA0);
+		fault_data1 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA1);
+
+		fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) |
+			     ((u64)fault_data0 << 12);
+
+		drm_dbg(&gt->i915->drm, "Unexpected fault\n"
+			"\tAddr: 0x%08x_%08x\n"
+			"\tAddress space: %s\n"
+			"\tEngine ID: %d\n"
+			"\tSource ID: %d\n"
+			"\tType: %d\n",
+			upper_32_bits(fault_addr), lower_32_bits(fault_addr),
+			fault_data1 & FAULT_GTT_SEL ? "GGTT" : "PPGTT",
+			GEN8_RING_FAULT_ENGINE_ID(fault),
+			RING_FAULT_SRCID(fault),
+			RING_FAULT_FAULT_TYPE(fault));
+	}
+}
+
 static void gen8_check_faults(struct intel_gt *gt)
 {
 	struct intel_uncore *uncore = gt->uncore;
 	i915_reg_t fault_reg, fault_data0_reg, fault_data1_reg;
 	u32 fault;
 
-	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) {
-		fault_reg = XEHP_RING_FAULT_REG;
-		fault_data0_reg = XEHP_FAULT_TLB_DATA0;
-		fault_data1_reg = XEHP_FAULT_TLB_DATA1;
-	} else if (GRAPHICS_VER(gt->i915) >= 12) {
+	if (GRAPHICS_VER(gt->i915) >= 12) {
 		fault_reg = GEN12_RING_FAULT_REG;
 		fault_data0_reg = GEN12_FAULT_TLB_DATA0;
 		fault_data1_reg = GEN12_FAULT_TLB_DATA1;
@@ -358,7 +394,9 @@ void intel_gt_check_and_clear_faults(struct intel_gt *gt)
 	struct drm_i915_private *i915 = gt->i915;
 
 	/* From GEN8 onwards we only have one 'All Engine Fault Register' */
-	if (GRAPHICS_VER(i915) >= 8)
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+		xehp_check_faults(gt);
+	else if (GRAPHICS_VER(i915) >= 8)
 		gen8_check_faults(gt);
 	else if (GRAPHICS_VER(i915) >= 6)
 		gen6_check_faults(gt);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 9ea2fe34e7d3..f2d53edcd2ee 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1221,7 +1221,10 @@ static void engine_record_registers(struct intel_engine_coredump *ee)
 	if (GRAPHICS_VER(i915) >= 6) {
 		ee->rc_psmi = ENGINE_READ(engine, RING_PSMI_CTL);
 
-		if (GRAPHICS_VER(i915) >= 12)
+		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+			ee->fault_reg = intel_gt_mcr_read_any(engine->gt,
+							      XEHP_RING_FAULT_REG);
+		else if (GRAPHICS_VER(i915) >= 12)
 			ee->fault_reg = intel_uncore_read(engine->uncore,
 							  GEN12_RING_FAULT_REG);
 		else if (GRAPHICS_VER(i915) >= 8)
@@ -1820,7 +1823,12 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt)
 	if (GRAPHICS_VER(i915) == 7)
 		gt->err_int = intel_uncore_read(uncore, GEN7_ERR_INT);
 
-	if (GRAPHICS_VER(i915) >= 12) {
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+		gt->fault_data0 = intel_gt_mcr_read_any((struct intel_gt *)gt->_gt,
+							XEHP_FAULT_TLB_DATA0);
+		gt->fault_data1 = intel_gt_mcr_read_any((struct intel_gt *)gt->_gt,
+							XEHP_FAULT_TLB_DATA1);
+	} else if (GRAPHICS_VER(i915) >= 12) {
 		gt->fault_data0 = intel_uncore_read(uncore,
 						    GEN12_FAULT_TLB_DATA0);
 		gt->fault_data1 = intel_uncore_read(uncore,
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 07/14] drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

Xe_HP has some MCR registers that need to be polled for completion of
operations like TLB invalidation.  Those registers are in the GAM range,
which rolls up the status from each unit into the 'primary' instance's
value.  This makes it useful to have a dedicated 'wait for register'
function that handles this on MCR registers, similar to the
__intel_wait_for_register_fw() function we already have for regular
registers.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 55 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h |  7 ++++
 2 files changed, 62 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index 962d3e974384..a2bc77c33835 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -564,3 +564,58 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
 		return;
 	}
 }
+
+/**
+ * intel_gt_mcr_wait_for_reg_fw - wait until MCR register matches expected state
+ * @gt: GT structure
+ * @reg: the register to read
+ * @mask: mask to apply to register value
+ * @value: value to wait for
+ * @fast_timeout_us: fast timeout in microsecond for atomic/tight wait
+ * @slow_timeout_ms: slow timeout in millisecond
+ *
+ * This routine waits until the target register @reg contains the expected
+ * @value after applying the @mask, i.e. it waits until ::
+ *
+ *     (intel_gt_mcr_read_any_fw(gt, reg) & mask) == value
+ *
+ * Otherwise, the wait will timeout after @slow_timeout_ms milliseconds.
+ * For atomic context @slow_timeout_ms must be zero and @fast_timeout_us
+ * must be not larger than 20,0000 microseconds.
+ *
+ * This function is basically an MCR-friendly version of
+ * __intel_wait_for_register_fw().  Generally this function will only be used
+ * on GAM registers which are a bit special --- although they're MCR registers,
+ * reads (e.g., waiting for status updates) are always directed to the primary
+ * instance.
+ *
+ * Note that this routine assumes the caller holds forcewake asserted, it is
+ * not suitable for very long waits.
+ *
+ * Return: 0 if the register matches the desired condition, or -ETIMEDOUT.
+ */
+int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
+				 i915_reg_t reg,
+				 u32 mask,
+				 u32 value,
+				 unsigned int fast_timeout_us,
+				 unsigned int slow_timeout_ms)
+{
+	u32 reg_value = 0;
+#define done (((reg_value = intel_gt_mcr_read_any_fw(gt, reg)) & mask) == value)
+	int ret;
+
+	/* Catch any overuse of this function */
+	might_sleep_if(slow_timeout_ms);
+	GEM_BUG_ON(fast_timeout_us > 20000);
+	GEM_BUG_ON(!fast_timeout_us && !slow_timeout_ms);
+
+	ret = -ETIMEDOUT;
+	if (fast_timeout_us && fast_timeout_us <= 20000)
+		ret = _wait_for_atomic(done, fast_timeout_us, 0);
+	if (ret && slow_timeout_ms)
+		ret = wait_for(done, slow_timeout_ms);
+
+	return ret;
+#undef done
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
index e47e75e66835..96979af01f2b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -37,6 +37,13 @@ void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
 void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
 				  unsigned int *group, unsigned int *instance);
 
+int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
+				 i915_reg_t reg,
+				 u32 mask,
+				 u32 value,
+				 unsigned int fast_timeout_us,
+				 unsigned int slow_timeout_ms);
+
 /*
  * Helper for for_each_ss_steering loop.  On pre-Xe_HP platforms, subslice
  * presence is determined by using the group/instance as direct lookups in the
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 07/14] drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

Xe_HP has some MCR registers that need to be polled for completion of
operations like TLB invalidation.  Those registers are in the GAM range,
which rolls up the status from each unit into the 'primary' instance's
value.  This makes it useful to have a dedicated 'wait for register'
function that handles this on MCR registers, similar to the
__intel_wait_for_register_fw() function we already have for regular
registers.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c | 55 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h |  7 ++++
 2 files changed, 62 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index 962d3e974384..a2bc77c33835 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -564,3 +564,58 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
 		return;
 	}
 }
+
+/**
+ * intel_gt_mcr_wait_for_reg_fw - wait until MCR register matches expected state
+ * @gt: GT structure
+ * @reg: the register to read
+ * @mask: mask to apply to register value
+ * @value: value to wait for
+ * @fast_timeout_us: fast timeout in microsecond for atomic/tight wait
+ * @slow_timeout_ms: slow timeout in millisecond
+ *
+ * This routine waits until the target register @reg contains the expected
+ * @value after applying the @mask, i.e. it waits until ::
+ *
+ *     (intel_gt_mcr_read_any_fw(gt, reg) & mask) == value
+ *
+ * Otherwise, the wait will timeout after @slow_timeout_ms milliseconds.
+ * For atomic context @slow_timeout_ms must be zero and @fast_timeout_us
+ * must be not larger than 20,0000 microseconds.
+ *
+ * This function is basically an MCR-friendly version of
+ * __intel_wait_for_register_fw().  Generally this function will only be used
+ * on GAM registers which are a bit special --- although they're MCR registers,
+ * reads (e.g., waiting for status updates) are always directed to the primary
+ * instance.
+ *
+ * Note that this routine assumes the caller holds forcewake asserted, it is
+ * not suitable for very long waits.
+ *
+ * Return: 0 if the register matches the desired condition, or -ETIMEDOUT.
+ */
+int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
+				 i915_reg_t reg,
+				 u32 mask,
+				 u32 value,
+				 unsigned int fast_timeout_us,
+				 unsigned int slow_timeout_ms)
+{
+	u32 reg_value = 0;
+#define done (((reg_value = intel_gt_mcr_read_any_fw(gt, reg)) & mask) == value)
+	int ret;
+
+	/* Catch any overuse of this function */
+	might_sleep_if(slow_timeout_ms);
+	GEM_BUG_ON(fast_timeout_us > 20000);
+	GEM_BUG_ON(!fast_timeout_us && !slow_timeout_ms);
+
+	ret = -ETIMEDOUT;
+	if (fast_timeout_us && fast_timeout_us <= 20000)
+		ret = _wait_for_atomic(done, fast_timeout_us, 0);
+	if (ret && slow_timeout_ms)
+		ret = wait_for(done, slow_timeout_ms);
+
+	return ret;
+#undef done
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
index e47e75e66835..96979af01f2b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -37,6 +37,13 @@ void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
 void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
 				  unsigned int *group, unsigned int *instance);
 
+int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
+				 i915_reg_t reg,
+				 u32 mask,
+				 u32 value,
+				 unsigned int fast_timeout_us,
+				 unsigned int slow_timeout_ms);
+
 /*
  * Helper for for_each_ss_steering loop.  On pre-Xe_HP platforms, subslice
  * presence is determined by using the group/instance as direct lookups in the
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 08/14] drm/i915: Define MCR registers explicitly
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

Rather than using the same _MMIO() macro to define MCR registers as
singleton registers, let's use a new MCR_REG() macro to make it clear
that these registers are special and should be handled accordingly.  For
now MCR_REG() will still generate an i915_reg_t with the given offset,
but we'll change that in future patches.

Bspec: 66673, 66696, 66534, 67609
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h | 134 ++++++++++++------------
 1 file changed, 68 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 86757dd22c13..901ea03b9fd8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -8,6 +8,8 @@
 
 #include "i915_reg_defs.h"
 
+#define MCR_REG(offset)	_MMIO(offset)
+
 /* RPM unit config (Gen8+) */
 #define RPM_CONFIG0				_MMIO(0xd00)
 #define   GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT	3
@@ -333,12 +335,12 @@
 #define GEN7_TLB_RD_ADDR			_MMIO(0x4700)
 
 #define GEN12_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
-#define XEHP_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
+#define XEHP_PAT_INDEX(index)			MCR_REG(0x4800 + (index) * 4)
 
-#define XEHP_TILE0_ADDR_RANGE			_MMIO(0x4900)
+#define XEHP_TILE0_ADDR_RANGE			MCR_REG(0x4900)
 #define   XEHP_TILE_LMEM_RANGE_SHIFT		8
 
-#define XEHP_FLAT_CCS_BASE_ADDR			_MMIO(0x4910)
+#define XEHP_FLAT_CCS_BASE_ADDR			MCR_REG(0x4910)
 #define   XEHP_CCS_BASE_SHIFT			8
 
 #define GAMTARBMODE				_MMIO(0x4a08)
@@ -388,18 +390,18 @@
 #define CHICKEN_RASTER_2			_MMIO(0x6208)
 #define   TBIMR_FAST_CLIP			REG_BIT(5)
 
-#define VFLSKPD					_MMIO(0x62a8)
+#define VFLSKPD					MCR_REG(0x62a8)
 #define   DIS_OVER_FETCH_CACHE			REG_BIT(1)
 #define   DIS_MULT_MISS_RD_SQUASH		REG_BIT(0)
 
 #define GEN12_FF_MODE2				_MMIO(0x6604)
-#define XEHP_FF_MODE2				_MMIO(0x6604)
+#define XEHP_FF_MODE2				MCR_REG(0x6604)
 #define   FF_MODE2_GS_TIMER_MASK		REG_GENMASK(31, 24)
 #define   FF_MODE2_GS_TIMER_224			REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224)
 #define   FF_MODE2_TDS_TIMER_MASK		REG_GENMASK(23, 16)
 #define   FF_MODE2_TDS_TIMER_128		REG_FIELD_PREP(FF_MODE2_TDS_TIMER_MASK, 4)
 
-#define XEHPG_INSTDONE_GEOM_SVG			_MMIO(0x666c)
+#define XEHPG_INSTDONE_GEOM_SVG			MCR_REG(0x666c)
 
 #define CACHE_MODE_0_GEN7			_MMIO(0x7000) /* IVB+ */
 #define   RC_OP_FLUSH_ENABLE			(1 << 0)
@@ -448,14 +450,14 @@
 #define GEN8_HDC_CHICKEN1			_MMIO(0x7304)
 
 #define GEN11_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
-#define XEHP_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
+#define XEHP_COMMON_SLICE_CHICKEN3		MCR_REG(0x7304)
 #define   DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN	REG_BIT(12)
 #define   XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE	REG_BIT(12)
 #define   GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC	REG_BIT(11)
 #define   GEN12_DISABLE_CPS_AWARE_COLOR_PIPE	REG_BIT(9)
 
 #define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
-#define XEHP_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
+#define XEHP_SLICE_COMMON_ECO_CHICKEN1		MCR_REG(0x731c)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)
 #define   GEN11_STATE_CACHE_REDIRECT_TO_CS	(1 << 11)
 
@@ -486,7 +488,7 @@
 
 #define GEN8_RC6_CTX_INFO			_MMIO(0x8504)
 
-#define XEHP_SQCM				_MMIO(0x8724)
+#define XEHP_SQCM				MCR_REG(0x8724)
 #define   EN_32B_ACCESS				REG_BIT(30)
 
 #define HSW_IDICR				_MMIO(0x9008)
@@ -647,7 +649,7 @@
 #define GEN7_MISCCPCTL				_MMIO(0x9424)
 #define   GEN7_DOP_CLOCK_GATE_ENABLE		(1 << 0)
 
-#define GEN8_MISCCPCTL				_MMIO(0x9424)
+#define GEN8_MISCCPCTL				MCR_REG(0x9424)
 #define   GEN8_DOP_CLOCK_GATE_ENABLE		REG_BIT(0)
 #define   GEN12_DOP_CLOCK_GATE_RENDER_ENABLE	REG_BIT(1)
 #define   GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE	(1 << 2)
@@ -703,7 +705,7 @@
 #define   LTCDD_CLKGATE_DIS			REG_BIT(10)
 
 #define GEN11_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
-#define XEHP_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
+#define XEHP_SLICE_UNIT_LEVEL_CLKGATE		MCR_REG(0x94d4)
 #define   SARBUNIT_CLKGATE_DIS			(1 << 5)
 #define   RCCUNIT_CLKGATE_DIS			(1 << 7)
 #define   MSCUNIT_CLKGATE_DIS			(1 << 10)
@@ -711,27 +713,27 @@
 #define   L3_CLKGATE_DIS			REG_BIT(16)
 #define   L3_CR2X_CLKGATE_DIS			REG_BIT(17)
 
-#define SCCGCTL94DC				_MMIO(0x94dc)
+#define SCCGCTL94DC				MCR_REG(0x94dc)
 #define   CG3DDISURB				REG_BIT(14)
 
 #define UNSLICE_UNIT_LEVEL_CLKGATE2		_MMIO(0x94e4)
 #define   VSUNIT_CLKGATE_DIS_TGL		REG_BIT(19)
 #define   PSDUNIT_CLKGATE_DIS			REG_BIT(5)
 
-#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE	_MMIO(0x9524)
+#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE	MCR_REG(0x9524)
 #define   DSS_ROUTER_CLKGATE_DIS		REG_BIT(28)
 #define   GWUNIT_CLKGATE_DIS			REG_BIT(16)
 
-#define SUBSLICE_UNIT_LEVEL_CLKGATE2		_MMIO(0x9528)
+#define SUBSLICE_UNIT_LEVEL_CLKGATE2		MCR_REG(0x9528)
 #define   CPSSUNIT_CLKGATE_DIS			REG_BIT(9)
 
-#define SSMCGCTL9530				_MMIO(0x9530)
+#define SSMCGCTL9530				MCR_REG(0x9530)
 #define   RTFUNIT_CLKGATE_DIS			REG_BIT(18)
 
-#define GEN10_DFR_RATIO_EN_AND_CHICKEN		_MMIO(0x9550)
+#define GEN10_DFR_RATIO_EN_AND_CHICKEN		MCR_REG(0x9550)
 #define   DFR_DISABLE				(1 << 9)
 
-#define INF_UNIT_LEVEL_CLKGATE			_MMIO(0x9560)
+#define INF_UNIT_LEVEL_CLKGATE			MCR_REG(0x9560)
 #define   CGPSF_CLKGATE_DIS			(1 << 3)
 
 #define MICRO_BP0_0				_MMIO(0x9800)
@@ -943,7 +945,7 @@
 
 /* MOCS (Memory Object Control State) registers */
 #define GEN9_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
-#define XEHP_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
+#define XEHP_LNCFCMOCS(i)			MCR_REG(0xb020 + (i) * 4)	/* L3 Cache Control */
 #define LNCFCMOCS_REG_COUNT			32
 
 #define GEN7_L3CNTLREG3				_MMIO(0xb024)
@@ -960,10 +962,10 @@
 #define GEN7_L3LOG(slice, i)			_MMIO(0xb070 + (slice) * 0x200 + (i) * 4)
 #define   GEN7_L3LOG_SIZE			0x80
 
-#define XEHP_L3NODEARBCFG			_MMIO(0xb0b4)
+#define XEHP_L3NODEARBCFG			MCR_REG(0xb0b4)
 #define   XEHP_LNESPARE				REG_BIT(19)
 
-#define GEN8_L3SQCREG1				_MMIO(0xb100)
+#define GEN8_L3SQCREG1				MCR_REG(0xb100)
 /*
  * Note that on CHV the following has an off-by-one error wrt. to BSpec.
  * Using the formula in BSpec leads to a hang, while the formula here works
@@ -974,28 +976,28 @@
 #define   L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)
 #define   L3_PRIO_CREDITS_MASK			((0x1f << 19) | (0x1f << 14))
 
-#define GEN8_L3SQCREG4				_MMIO(0xb118)
+#define GEN8_L3SQCREG4				MCR_REG(0xb118)
 #define   GEN11_LQSC_CLEAN_EVICT_DISABLE	(1 << 6)
 #define   GEN8_LQSC_RO_PERF_DIS			(1 << 27)
 #define   GEN8_LQSC_FLUSH_COHERENT_LINES	(1 << 21)
 #define   GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE	REG_BIT(22)
 
-#define GEN9_SCRATCH1				_MMIO(0xb11c)
+#define GEN9_SCRATCH1				MCR_REG(0xb11c)
 #define   EVICTION_PERF_FIX_ENABLE		REG_BIT(8)
 
-#define BDW_SCRATCH1				_MMIO(0xb11c)
+#define BDW_SCRATCH1				MCR_REG(0xb11c)
 #define   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE	(1 << 2)
 
-#define GEN11_SCRATCH2				_MMIO(0xb140)
+#define GEN11_SCRATCH2				MCR_REG(0xb140)
 #define   GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE	(1 << 19)
 
-#define XEHP_L3SQCREG5				_MMIO(0xb158)
+#define XEHP_L3SQCREG5				MCR_REG(0xb158)
 #define   L3_PWM_TIMER_INIT_VAL_MASK		REG_GENMASK(9, 0)
 
-#define MLTICTXCTL				_MMIO(0xb170)
+#define MLTICTXCTL				MCR_REG(0xb170)
 #define   TDONRENDER				REG_BIT(2)
 
-#define XEHP_L3SCQREG7				_MMIO(0xb188)
+#define XEHP_L3SCQREG7				MCR_REG(0xb188)
 #define   BLEND_FILL_CACHING_OPT_DIS		REG_BIT(3)
 
 #define XEHPC_L3SCRUB				_MMIO(0xb18c)
@@ -1003,7 +1005,7 @@
 #define   SCRUB_RATE_PER_BANK_MASK		REG_GENMASK(2, 0)
 #define   SCRUB_RATE_4B_PER_CLK			REG_FIELD_PREP(SCRUB_RATE_PER_BANK_MASK, 0x6)
 
-#define L3SQCREG1_CCS0				_MMIO(0xb200)
+#define L3SQCREG1_CCS0				MCR_REG(0xb200)
 #define   FLUSHALLNONCOH			REG_BIT(5)
 
 #define GEN11_GLBLINVL				_MMIO(0xb404)
@@ -1028,14 +1030,14 @@
 #define GEN9_BLT_MOCS(i)			_MMIO(__GEN9_BCS0_MOCS0 + (i) * 4)
 
 #define GEN12_FAULT_TLB_DATA0			_MMIO(0xceb8)
-#define XEHP_FAULT_TLB_DATA0			_MMIO(0xceb8)
+#define XEHP_FAULT_TLB_DATA0			MCR_REG(0xceb8)
 #define GEN12_FAULT_TLB_DATA1			_MMIO(0xcebc)
-#define XEHP_FAULT_TLB_DATA1			_MMIO(0xcebc)
+#define XEHP_FAULT_TLB_DATA1			MCR_REG(0xcebc)
 #define   FAULT_VA_HIGH_BITS			(0xf << 0)
 #define   FAULT_GTT_SEL				(1 << 4)
 
 #define GEN12_RING_FAULT_REG			_MMIO(0xcec4)
-#define XEHP_RING_FAULT_REG			_MMIO(0xcec4)
+#define XEHP_RING_FAULT_REG			MCR_REG(0xcec4)
 #define   GEN8_RING_FAULT_ENGINE_ID(x)		(((x) >> 12) & 0x7)
 #define   RING_FAULT_GTTSEL_MASK		(1 << 11)
 #define   RING_FAULT_SRCID(x)			(((x) >> 3) & 0xff)
@@ -1043,21 +1045,21 @@
 #define   RING_FAULT_VALID			(1 << 0)
 
 #define GEN12_GFX_TLB_INV_CR			_MMIO(0xced8)
-#define XEHP_GFX_TLB_INV_CR			_MMIO(0xced8)
+#define XEHP_GFX_TLB_INV_CR			MCR_REG(0xced8)
 #define GEN12_VD_TLB_INV_CR			_MMIO(0xcedc)
-#define XEHP_VD_TLB_INV_CR			_MMIO(0xcedc)
+#define XEHP_VD_TLB_INV_CR			MCR_REG(0xcedc)
 #define GEN12_VE_TLB_INV_CR			_MMIO(0xcee0)
-#define XEHP_VE_TLB_INV_CR			_MMIO(0xcee0)
+#define XEHP_VE_TLB_INV_CR			MCR_REG(0xcee0)
 #define GEN12_BLT_TLB_INV_CR			_MMIO(0xcee4)
-#define XEHP_BLT_TLB_INV_CR			_MMIO(0xcee4)
+#define XEHP_BLT_TLB_INV_CR			MCR_REG(0xcee4)
 #define GEN12_COMPCTX_TLB_INV_CR		_MMIO(0xcf04)
-#define XEHP_COMPCTX_TLB_INV_CR			_MMIO(0xcf04)
+#define XEHP_COMPCTX_TLB_INV_CR			MCR_REG(0xcf04)
 
-#define XEHP_MERT_MOD_CTRL			_MMIO(0xcf28)
-#define RENDER_MOD_CTRL				_MMIO(0xcf2c)
-#define COMP_MOD_CTRL				_MMIO(0xcf30)
-#define VDBX_MOD_CTRL				_MMIO(0xcf34)
-#define VEBX_MOD_CTRL				_MMIO(0xcf38)
+#define XEHP_MERT_MOD_CTRL			MCR_REG(0xcf28)
+#define RENDER_MOD_CTRL				MCR_REG(0xcf2c)
+#define COMP_MOD_CTRL				MCR_REG(0xcf30)
+#define VDBX_MOD_CTRL				MCR_REG(0xcf34)
+#define VEBX_MOD_CTRL				MCR_REG(0xcf38)
 #define   FORCE_MISS_FTLB			REG_BIT(3)
 
 #define GEN12_GAMSTLB_CTRL			_MMIO(0xcf4c)
@@ -1072,52 +1074,52 @@
 #define GEN12_GAM_DONE				_MMIO(0xcf68)
 
 #define GEN7_HALF_SLICE_CHICKEN1		_MMIO(0xe100) /* IVB GT1 + VLV */
-#define GEN8_HALF_SLICE_CHICKEN1		_MMIO(0xe100)
+#define GEN8_HALF_SLICE_CHICKEN1		MCR_REG(0xe100)
 #define   GEN7_MAX_PS_THREAD_DEP		(8 << 12)
 #define   GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE	(1 << 10)
 #define   GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE	(1 << 4)
 #define   GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE	(1 << 3)
 
 #define GEN7_SAMPLER_INSTDONE			_MMIO(0xe160)
-#define GEN8_SAMPLER_INSTDONE			_MMIO(0xe160)
+#define GEN8_SAMPLER_INSTDONE			MCR_REG(0xe160)
 #define GEN7_ROW_INSTDONE			_MMIO(0xe164)
-#define GEN8_ROW_INSTDONE			_MMIO(0xe164)
+#define GEN8_ROW_INSTDONE			MCR_REG(0xe164)
 
-#define HALF_SLICE_CHICKEN2			_MMIO(0xe180)
+#define HALF_SLICE_CHICKEN2			MCR_REG(0xe180)
 #define   GEN8_ST_PO_DISABLE			(1 << 13)
 
 #define HSW_HALF_SLICE_CHICKEN3			_MMIO(0xe184)
-#define GEN8_HALF_SLICE_CHICKEN3		_MMIO(0xe184)
+#define GEN8_HALF_SLICE_CHICKEN3		MCR_REG(0xe184)
 #define   HSW_SAMPLE_C_PERFORMANCE		(1 << 9)
 #define   GEN8_CENTROID_PIXEL_OPT_DIS		(1 << 8)
 #define   GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC	(1 << 5)
 #define   GEN8_SAMPLER_POWER_BYPASS_DIS		(1 << 1)
 
-#define GEN9_HALF_SLICE_CHICKEN5		_MMIO(0xe188)
+#define GEN9_HALF_SLICE_CHICKEN5		MCR_REG(0xe188)
 #define   GEN9_DG_MIRROR_FIX_ENABLE		(1 << 5)
 #define   GEN9_CCS_TLB_PREFETCH_ENABLE		(1 << 3)
 
-#define GEN10_SAMPLER_MODE			_MMIO(0xe18c)
+#define GEN10_SAMPLER_MODE			MCR_REG(0xe18c)
 #define   ENABLE_SMALLPL			REG_BIT(15)
 #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
 #define   GEN11_SAMPLER_ENABLE_HEADLESS_MSG	REG_BIT(5)
 
-#define GEN9_HALF_SLICE_CHICKEN7		_MMIO(0xe194)
+#define GEN9_HALF_SLICE_CHICKEN7		MCR_REG(0xe194)
 #define   DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA	REG_BIT(15)
 #define   GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR	REG_BIT(8)
 #define   GEN9_ENABLE_YV12_BUGFIX		REG_BIT(4)
 #define   GEN9_ENABLE_GPGPU_PREEMPTION		REG_BIT(2)
 
-#define GEN10_CACHE_MODE_SS			_MMIO(0xe420)
+#define GEN10_CACHE_MODE_SS			MCR_REG(0xe420)
 #define   ENABLE_EU_COUNT_FOR_TDL_FLUSH		REG_BIT(10)
 #define   DISABLE_ECC				REG_BIT(5)
 #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
 #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
 
-#define EU_PERF_CNTL0				_MMIO(0xe458)
-#define EU_PERF_CNTL4				_MMIO(0xe45c)
+#define EU_PERF_CNTL0				MCR_REG(0xe458)
+#define EU_PERF_CNTL4				MCR_REG(0xe45c)
 
-#define GEN9_ROW_CHICKEN4			_MMIO(0xe48c)
+#define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
 #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
 #define   XEHP_DIS_BBL_SYSPIPE			REG_BIT(11)
 #define   GEN12_DISABLE_TDL_PUSH		REG_BIT(9)
@@ -1129,7 +1131,7 @@
 #define HSW_ROW_CHICKEN3			_MMIO(0xe49c)
 #define   HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE	(1 << 6)
 
-#define GEN8_ROW_CHICKEN			_MMIO(0xe4f0)
+#define GEN8_ROW_CHICKEN			MCR_REG(0xe4f0)
 #define   FLOW_CONTROL_ENABLE			REG_BIT(15)
 #define   UGM_BACKUP_MODE			REG_BIT(13)
 #define   MDQ_ARBITRATION_MODE			REG_BIT(12)
@@ -1141,39 +1143,39 @@
 
 #define GEN7_ROW_CHICKEN2			_MMIO(0xe4f4)
 
-#define GEN8_ROW_CHICKEN2			_MMIO(0xe4f4)
+#define GEN8_ROW_CHICKEN2			MCR_REG(0xe4f4)
 #define   GEN12_DISABLE_READ_SUPPRESSION	REG_BIT(15)
 #define   GEN12_DISABLE_EARLY_READ		REG_BIT(14)
 #define   GEN12_ENABLE_LARGE_GRF_MODE		REG_BIT(12)
 #define   GEN12_PUSH_CONST_DEREF_HOLD_DIS	REG_BIT(8)
 
-#define RT_CTRL					_MMIO(0xe530)
+#define RT_CTRL					MCR_REG(0xe530)
 #define   DIS_NULL_QUERY			REG_BIT(10)
 #define   STACKID_CTRL				REG_GENMASK(6, 5)
 #define   STACKID_CTRL_512			REG_FIELD_PREP(STACKID_CTRL, 0x2)
 
-#define EU_PERF_CNTL1				_MMIO(0xe558)
-#define EU_PERF_CNTL5				_MMIO(0xe55c)
+#define EU_PERF_CNTL1				MCR_REG(0xe558)
+#define EU_PERF_CNTL5				MCR_REG(0xe55c)
 
-#define XEHP_HDC_CHICKEN0			_MMIO(0xe5f0)
+#define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
 #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
-#define ICL_HDC_MODE				_MMIO(0xe5f4)
+#define ICL_HDC_MODE				MCR_REG(0xe5f4)
 
-#define EU_PERF_CNTL2				_MMIO(0xe658)
-#define EU_PERF_CNTL6				_MMIO(0xe65c)
-#define EU_PERF_CNTL3				_MMIO(0xe758)
+#define EU_PERF_CNTL2				MCR_REG(0xe658)
+#define EU_PERF_CNTL6				MCR_REG(0xe65c)
+#define EU_PERF_CNTL3				MCR_REG(0xe758)
 
-#define LSC_CHICKEN_BIT_0			_MMIO(0xe7c8)
+#define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
 #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
 #define   FORCE_1_SUB_MESSAGE_PER_FRAGMENT	REG_BIT(15)
-#define LSC_CHICKEN_BIT_0_UDW			_MMIO(0xe7c8 + 4)
+#define LSC_CHICKEN_BIT_0_UDW			MCR_REG(0xe7c8 + 4)
 #define   DIS_CHAIN_2XSIMD8			REG_BIT(55 - 32)
 #define   FORCE_SLM_FENCE_SCOPE_TO_TILE		REG_BIT(42 - 32)
 #define   FORCE_UGM_FENCE_SCOPE_TO_TILE		REG_BIT(41 - 32)
 #define   MAXREQS_PER_BANK			REG_GENMASK(39 - 32, 37 - 32)
 #define   DISABLE_128B_EVICTION_COMMAND_UDW	REG_BIT(36 - 32)
 
-#define SARB_CHICKEN1				_MMIO(0xe90c)
+#define SARB_CHICKEN1				MCR_REG(0xe90c)
 #define   COMP_CKN_IN				REG_GENMASK(30, 29)
 
 #define GEN7_ROW_CHICKEN2_GT2			_MMIO(0xf4f4)
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 08/14] drm/i915: Define MCR registers explicitly
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

Rather than using the same _MMIO() macro to define MCR registers as
singleton registers, let's use a new MCR_REG() macro to make it clear
that these registers are special and should be handled accordingly.  For
now MCR_REG() will still generate an i915_reg_t with the given offset,
but we'll change that in future patches.

Bspec: 66673, 66696, 66534, 67609
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h | 134 ++++++++++++------------
 1 file changed, 68 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 86757dd22c13..901ea03b9fd8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -8,6 +8,8 @@
 
 #include "i915_reg_defs.h"
 
+#define MCR_REG(offset)	_MMIO(offset)
+
 /* RPM unit config (Gen8+) */
 #define RPM_CONFIG0				_MMIO(0xd00)
 #define   GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT	3
@@ -333,12 +335,12 @@
 #define GEN7_TLB_RD_ADDR			_MMIO(0x4700)
 
 #define GEN12_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
-#define XEHP_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
+#define XEHP_PAT_INDEX(index)			MCR_REG(0x4800 + (index) * 4)
 
-#define XEHP_TILE0_ADDR_RANGE			_MMIO(0x4900)
+#define XEHP_TILE0_ADDR_RANGE			MCR_REG(0x4900)
 #define   XEHP_TILE_LMEM_RANGE_SHIFT		8
 
-#define XEHP_FLAT_CCS_BASE_ADDR			_MMIO(0x4910)
+#define XEHP_FLAT_CCS_BASE_ADDR			MCR_REG(0x4910)
 #define   XEHP_CCS_BASE_SHIFT			8
 
 #define GAMTARBMODE				_MMIO(0x4a08)
@@ -388,18 +390,18 @@
 #define CHICKEN_RASTER_2			_MMIO(0x6208)
 #define   TBIMR_FAST_CLIP			REG_BIT(5)
 
-#define VFLSKPD					_MMIO(0x62a8)
+#define VFLSKPD					MCR_REG(0x62a8)
 #define   DIS_OVER_FETCH_CACHE			REG_BIT(1)
 #define   DIS_MULT_MISS_RD_SQUASH		REG_BIT(0)
 
 #define GEN12_FF_MODE2				_MMIO(0x6604)
-#define XEHP_FF_MODE2				_MMIO(0x6604)
+#define XEHP_FF_MODE2				MCR_REG(0x6604)
 #define   FF_MODE2_GS_TIMER_MASK		REG_GENMASK(31, 24)
 #define   FF_MODE2_GS_TIMER_224			REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224)
 #define   FF_MODE2_TDS_TIMER_MASK		REG_GENMASK(23, 16)
 #define   FF_MODE2_TDS_TIMER_128		REG_FIELD_PREP(FF_MODE2_TDS_TIMER_MASK, 4)
 
-#define XEHPG_INSTDONE_GEOM_SVG			_MMIO(0x666c)
+#define XEHPG_INSTDONE_GEOM_SVG			MCR_REG(0x666c)
 
 #define CACHE_MODE_0_GEN7			_MMIO(0x7000) /* IVB+ */
 #define   RC_OP_FLUSH_ENABLE			(1 << 0)
@@ -448,14 +450,14 @@
 #define GEN8_HDC_CHICKEN1			_MMIO(0x7304)
 
 #define GEN11_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
-#define XEHP_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
+#define XEHP_COMMON_SLICE_CHICKEN3		MCR_REG(0x7304)
 #define   DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN	REG_BIT(12)
 #define   XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE	REG_BIT(12)
 #define   GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC	REG_BIT(11)
 #define   GEN12_DISABLE_CPS_AWARE_COLOR_PIPE	REG_BIT(9)
 
 #define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
-#define XEHP_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
+#define XEHP_SLICE_COMMON_ECO_CHICKEN1		MCR_REG(0x731c)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)
 #define   GEN11_STATE_CACHE_REDIRECT_TO_CS	(1 << 11)
 
@@ -486,7 +488,7 @@
 
 #define GEN8_RC6_CTX_INFO			_MMIO(0x8504)
 
-#define XEHP_SQCM				_MMIO(0x8724)
+#define XEHP_SQCM				MCR_REG(0x8724)
 #define   EN_32B_ACCESS				REG_BIT(30)
 
 #define HSW_IDICR				_MMIO(0x9008)
@@ -647,7 +649,7 @@
 #define GEN7_MISCCPCTL				_MMIO(0x9424)
 #define   GEN7_DOP_CLOCK_GATE_ENABLE		(1 << 0)
 
-#define GEN8_MISCCPCTL				_MMIO(0x9424)
+#define GEN8_MISCCPCTL				MCR_REG(0x9424)
 #define   GEN8_DOP_CLOCK_GATE_ENABLE		REG_BIT(0)
 #define   GEN12_DOP_CLOCK_GATE_RENDER_ENABLE	REG_BIT(1)
 #define   GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE	(1 << 2)
@@ -703,7 +705,7 @@
 #define   LTCDD_CLKGATE_DIS			REG_BIT(10)
 
 #define GEN11_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
-#define XEHP_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
+#define XEHP_SLICE_UNIT_LEVEL_CLKGATE		MCR_REG(0x94d4)
 #define   SARBUNIT_CLKGATE_DIS			(1 << 5)
 #define   RCCUNIT_CLKGATE_DIS			(1 << 7)
 #define   MSCUNIT_CLKGATE_DIS			(1 << 10)
@@ -711,27 +713,27 @@
 #define   L3_CLKGATE_DIS			REG_BIT(16)
 #define   L3_CR2X_CLKGATE_DIS			REG_BIT(17)
 
-#define SCCGCTL94DC				_MMIO(0x94dc)
+#define SCCGCTL94DC				MCR_REG(0x94dc)
 #define   CG3DDISURB				REG_BIT(14)
 
 #define UNSLICE_UNIT_LEVEL_CLKGATE2		_MMIO(0x94e4)
 #define   VSUNIT_CLKGATE_DIS_TGL		REG_BIT(19)
 #define   PSDUNIT_CLKGATE_DIS			REG_BIT(5)
 
-#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE	_MMIO(0x9524)
+#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE	MCR_REG(0x9524)
 #define   DSS_ROUTER_CLKGATE_DIS		REG_BIT(28)
 #define   GWUNIT_CLKGATE_DIS			REG_BIT(16)
 
-#define SUBSLICE_UNIT_LEVEL_CLKGATE2		_MMIO(0x9528)
+#define SUBSLICE_UNIT_LEVEL_CLKGATE2		MCR_REG(0x9528)
 #define   CPSSUNIT_CLKGATE_DIS			REG_BIT(9)
 
-#define SSMCGCTL9530				_MMIO(0x9530)
+#define SSMCGCTL9530				MCR_REG(0x9530)
 #define   RTFUNIT_CLKGATE_DIS			REG_BIT(18)
 
-#define GEN10_DFR_RATIO_EN_AND_CHICKEN		_MMIO(0x9550)
+#define GEN10_DFR_RATIO_EN_AND_CHICKEN		MCR_REG(0x9550)
 #define   DFR_DISABLE				(1 << 9)
 
-#define INF_UNIT_LEVEL_CLKGATE			_MMIO(0x9560)
+#define INF_UNIT_LEVEL_CLKGATE			MCR_REG(0x9560)
 #define   CGPSF_CLKGATE_DIS			(1 << 3)
 
 #define MICRO_BP0_0				_MMIO(0x9800)
@@ -943,7 +945,7 @@
 
 /* MOCS (Memory Object Control State) registers */
 #define GEN9_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
-#define XEHP_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
+#define XEHP_LNCFCMOCS(i)			MCR_REG(0xb020 + (i) * 4)	/* L3 Cache Control */
 #define LNCFCMOCS_REG_COUNT			32
 
 #define GEN7_L3CNTLREG3				_MMIO(0xb024)
@@ -960,10 +962,10 @@
 #define GEN7_L3LOG(slice, i)			_MMIO(0xb070 + (slice) * 0x200 + (i) * 4)
 #define   GEN7_L3LOG_SIZE			0x80
 
-#define XEHP_L3NODEARBCFG			_MMIO(0xb0b4)
+#define XEHP_L3NODEARBCFG			MCR_REG(0xb0b4)
 #define   XEHP_LNESPARE				REG_BIT(19)
 
-#define GEN8_L3SQCREG1				_MMIO(0xb100)
+#define GEN8_L3SQCREG1				MCR_REG(0xb100)
 /*
  * Note that on CHV the following has an off-by-one error wrt. to BSpec.
  * Using the formula in BSpec leads to a hang, while the formula here works
@@ -974,28 +976,28 @@
 #define   L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)
 #define   L3_PRIO_CREDITS_MASK			((0x1f << 19) | (0x1f << 14))
 
-#define GEN8_L3SQCREG4				_MMIO(0xb118)
+#define GEN8_L3SQCREG4				MCR_REG(0xb118)
 #define   GEN11_LQSC_CLEAN_EVICT_DISABLE	(1 << 6)
 #define   GEN8_LQSC_RO_PERF_DIS			(1 << 27)
 #define   GEN8_LQSC_FLUSH_COHERENT_LINES	(1 << 21)
 #define   GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE	REG_BIT(22)
 
-#define GEN9_SCRATCH1				_MMIO(0xb11c)
+#define GEN9_SCRATCH1				MCR_REG(0xb11c)
 #define   EVICTION_PERF_FIX_ENABLE		REG_BIT(8)
 
-#define BDW_SCRATCH1				_MMIO(0xb11c)
+#define BDW_SCRATCH1				MCR_REG(0xb11c)
 #define   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE	(1 << 2)
 
-#define GEN11_SCRATCH2				_MMIO(0xb140)
+#define GEN11_SCRATCH2				MCR_REG(0xb140)
 #define   GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE	(1 << 19)
 
-#define XEHP_L3SQCREG5				_MMIO(0xb158)
+#define XEHP_L3SQCREG5				MCR_REG(0xb158)
 #define   L3_PWM_TIMER_INIT_VAL_MASK		REG_GENMASK(9, 0)
 
-#define MLTICTXCTL				_MMIO(0xb170)
+#define MLTICTXCTL				MCR_REG(0xb170)
 #define   TDONRENDER				REG_BIT(2)
 
-#define XEHP_L3SCQREG7				_MMIO(0xb188)
+#define XEHP_L3SCQREG7				MCR_REG(0xb188)
 #define   BLEND_FILL_CACHING_OPT_DIS		REG_BIT(3)
 
 #define XEHPC_L3SCRUB				_MMIO(0xb18c)
@@ -1003,7 +1005,7 @@
 #define   SCRUB_RATE_PER_BANK_MASK		REG_GENMASK(2, 0)
 #define   SCRUB_RATE_4B_PER_CLK			REG_FIELD_PREP(SCRUB_RATE_PER_BANK_MASK, 0x6)
 
-#define L3SQCREG1_CCS0				_MMIO(0xb200)
+#define L3SQCREG1_CCS0				MCR_REG(0xb200)
 #define   FLUSHALLNONCOH			REG_BIT(5)
 
 #define GEN11_GLBLINVL				_MMIO(0xb404)
@@ -1028,14 +1030,14 @@
 #define GEN9_BLT_MOCS(i)			_MMIO(__GEN9_BCS0_MOCS0 + (i) * 4)
 
 #define GEN12_FAULT_TLB_DATA0			_MMIO(0xceb8)
-#define XEHP_FAULT_TLB_DATA0			_MMIO(0xceb8)
+#define XEHP_FAULT_TLB_DATA0			MCR_REG(0xceb8)
 #define GEN12_FAULT_TLB_DATA1			_MMIO(0xcebc)
-#define XEHP_FAULT_TLB_DATA1			_MMIO(0xcebc)
+#define XEHP_FAULT_TLB_DATA1			MCR_REG(0xcebc)
 #define   FAULT_VA_HIGH_BITS			(0xf << 0)
 #define   FAULT_GTT_SEL				(1 << 4)
 
 #define GEN12_RING_FAULT_REG			_MMIO(0xcec4)
-#define XEHP_RING_FAULT_REG			_MMIO(0xcec4)
+#define XEHP_RING_FAULT_REG			MCR_REG(0xcec4)
 #define   GEN8_RING_FAULT_ENGINE_ID(x)		(((x) >> 12) & 0x7)
 #define   RING_FAULT_GTTSEL_MASK		(1 << 11)
 #define   RING_FAULT_SRCID(x)			(((x) >> 3) & 0xff)
@@ -1043,21 +1045,21 @@
 #define   RING_FAULT_VALID			(1 << 0)
 
 #define GEN12_GFX_TLB_INV_CR			_MMIO(0xced8)
-#define XEHP_GFX_TLB_INV_CR			_MMIO(0xced8)
+#define XEHP_GFX_TLB_INV_CR			MCR_REG(0xced8)
 #define GEN12_VD_TLB_INV_CR			_MMIO(0xcedc)
-#define XEHP_VD_TLB_INV_CR			_MMIO(0xcedc)
+#define XEHP_VD_TLB_INV_CR			MCR_REG(0xcedc)
 #define GEN12_VE_TLB_INV_CR			_MMIO(0xcee0)
-#define XEHP_VE_TLB_INV_CR			_MMIO(0xcee0)
+#define XEHP_VE_TLB_INV_CR			MCR_REG(0xcee0)
 #define GEN12_BLT_TLB_INV_CR			_MMIO(0xcee4)
-#define XEHP_BLT_TLB_INV_CR			_MMIO(0xcee4)
+#define XEHP_BLT_TLB_INV_CR			MCR_REG(0xcee4)
 #define GEN12_COMPCTX_TLB_INV_CR		_MMIO(0xcf04)
-#define XEHP_COMPCTX_TLB_INV_CR			_MMIO(0xcf04)
+#define XEHP_COMPCTX_TLB_INV_CR			MCR_REG(0xcf04)
 
-#define XEHP_MERT_MOD_CTRL			_MMIO(0xcf28)
-#define RENDER_MOD_CTRL				_MMIO(0xcf2c)
-#define COMP_MOD_CTRL				_MMIO(0xcf30)
-#define VDBX_MOD_CTRL				_MMIO(0xcf34)
-#define VEBX_MOD_CTRL				_MMIO(0xcf38)
+#define XEHP_MERT_MOD_CTRL			MCR_REG(0xcf28)
+#define RENDER_MOD_CTRL				MCR_REG(0xcf2c)
+#define COMP_MOD_CTRL				MCR_REG(0xcf30)
+#define VDBX_MOD_CTRL				MCR_REG(0xcf34)
+#define VEBX_MOD_CTRL				MCR_REG(0xcf38)
 #define   FORCE_MISS_FTLB			REG_BIT(3)
 
 #define GEN12_GAMSTLB_CTRL			_MMIO(0xcf4c)
@@ -1072,52 +1074,52 @@
 #define GEN12_GAM_DONE				_MMIO(0xcf68)
 
 #define GEN7_HALF_SLICE_CHICKEN1		_MMIO(0xe100) /* IVB GT1 + VLV */
-#define GEN8_HALF_SLICE_CHICKEN1		_MMIO(0xe100)
+#define GEN8_HALF_SLICE_CHICKEN1		MCR_REG(0xe100)
 #define   GEN7_MAX_PS_THREAD_DEP		(8 << 12)
 #define   GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE	(1 << 10)
 #define   GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE	(1 << 4)
 #define   GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE	(1 << 3)
 
 #define GEN7_SAMPLER_INSTDONE			_MMIO(0xe160)
-#define GEN8_SAMPLER_INSTDONE			_MMIO(0xe160)
+#define GEN8_SAMPLER_INSTDONE			MCR_REG(0xe160)
 #define GEN7_ROW_INSTDONE			_MMIO(0xe164)
-#define GEN8_ROW_INSTDONE			_MMIO(0xe164)
+#define GEN8_ROW_INSTDONE			MCR_REG(0xe164)
 
-#define HALF_SLICE_CHICKEN2			_MMIO(0xe180)
+#define HALF_SLICE_CHICKEN2			MCR_REG(0xe180)
 #define   GEN8_ST_PO_DISABLE			(1 << 13)
 
 #define HSW_HALF_SLICE_CHICKEN3			_MMIO(0xe184)
-#define GEN8_HALF_SLICE_CHICKEN3		_MMIO(0xe184)
+#define GEN8_HALF_SLICE_CHICKEN3		MCR_REG(0xe184)
 #define   HSW_SAMPLE_C_PERFORMANCE		(1 << 9)
 #define   GEN8_CENTROID_PIXEL_OPT_DIS		(1 << 8)
 #define   GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC	(1 << 5)
 #define   GEN8_SAMPLER_POWER_BYPASS_DIS		(1 << 1)
 
-#define GEN9_HALF_SLICE_CHICKEN5		_MMIO(0xe188)
+#define GEN9_HALF_SLICE_CHICKEN5		MCR_REG(0xe188)
 #define   GEN9_DG_MIRROR_FIX_ENABLE		(1 << 5)
 #define   GEN9_CCS_TLB_PREFETCH_ENABLE		(1 << 3)
 
-#define GEN10_SAMPLER_MODE			_MMIO(0xe18c)
+#define GEN10_SAMPLER_MODE			MCR_REG(0xe18c)
 #define   ENABLE_SMALLPL			REG_BIT(15)
 #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
 #define   GEN11_SAMPLER_ENABLE_HEADLESS_MSG	REG_BIT(5)
 
-#define GEN9_HALF_SLICE_CHICKEN7		_MMIO(0xe194)
+#define GEN9_HALF_SLICE_CHICKEN7		MCR_REG(0xe194)
 #define   DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA	REG_BIT(15)
 #define   GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR	REG_BIT(8)
 #define   GEN9_ENABLE_YV12_BUGFIX		REG_BIT(4)
 #define   GEN9_ENABLE_GPGPU_PREEMPTION		REG_BIT(2)
 
-#define GEN10_CACHE_MODE_SS			_MMIO(0xe420)
+#define GEN10_CACHE_MODE_SS			MCR_REG(0xe420)
 #define   ENABLE_EU_COUNT_FOR_TDL_FLUSH		REG_BIT(10)
 #define   DISABLE_ECC				REG_BIT(5)
 #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
 #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
 
-#define EU_PERF_CNTL0				_MMIO(0xe458)
-#define EU_PERF_CNTL4				_MMIO(0xe45c)
+#define EU_PERF_CNTL0				MCR_REG(0xe458)
+#define EU_PERF_CNTL4				MCR_REG(0xe45c)
 
-#define GEN9_ROW_CHICKEN4			_MMIO(0xe48c)
+#define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
 #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
 #define   XEHP_DIS_BBL_SYSPIPE			REG_BIT(11)
 #define   GEN12_DISABLE_TDL_PUSH		REG_BIT(9)
@@ -1129,7 +1131,7 @@
 #define HSW_ROW_CHICKEN3			_MMIO(0xe49c)
 #define   HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE	(1 << 6)
 
-#define GEN8_ROW_CHICKEN			_MMIO(0xe4f0)
+#define GEN8_ROW_CHICKEN			MCR_REG(0xe4f0)
 #define   FLOW_CONTROL_ENABLE			REG_BIT(15)
 #define   UGM_BACKUP_MODE			REG_BIT(13)
 #define   MDQ_ARBITRATION_MODE			REG_BIT(12)
@@ -1141,39 +1143,39 @@
 
 #define GEN7_ROW_CHICKEN2			_MMIO(0xe4f4)
 
-#define GEN8_ROW_CHICKEN2			_MMIO(0xe4f4)
+#define GEN8_ROW_CHICKEN2			MCR_REG(0xe4f4)
 #define   GEN12_DISABLE_READ_SUPPRESSION	REG_BIT(15)
 #define   GEN12_DISABLE_EARLY_READ		REG_BIT(14)
 #define   GEN12_ENABLE_LARGE_GRF_MODE		REG_BIT(12)
 #define   GEN12_PUSH_CONST_DEREF_HOLD_DIS	REG_BIT(8)
 
-#define RT_CTRL					_MMIO(0xe530)
+#define RT_CTRL					MCR_REG(0xe530)
 #define   DIS_NULL_QUERY			REG_BIT(10)
 #define   STACKID_CTRL				REG_GENMASK(6, 5)
 #define   STACKID_CTRL_512			REG_FIELD_PREP(STACKID_CTRL, 0x2)
 
-#define EU_PERF_CNTL1				_MMIO(0xe558)
-#define EU_PERF_CNTL5				_MMIO(0xe55c)
+#define EU_PERF_CNTL1				MCR_REG(0xe558)
+#define EU_PERF_CNTL5				MCR_REG(0xe55c)
 
-#define XEHP_HDC_CHICKEN0			_MMIO(0xe5f0)
+#define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
 #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
-#define ICL_HDC_MODE				_MMIO(0xe5f4)
+#define ICL_HDC_MODE				MCR_REG(0xe5f4)
 
-#define EU_PERF_CNTL2				_MMIO(0xe658)
-#define EU_PERF_CNTL6				_MMIO(0xe65c)
-#define EU_PERF_CNTL3				_MMIO(0xe758)
+#define EU_PERF_CNTL2				MCR_REG(0xe658)
+#define EU_PERF_CNTL6				MCR_REG(0xe65c)
+#define EU_PERF_CNTL3				MCR_REG(0xe758)
 
-#define LSC_CHICKEN_BIT_0			_MMIO(0xe7c8)
+#define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
 #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
 #define   FORCE_1_SUB_MESSAGE_PER_FRAGMENT	REG_BIT(15)
-#define LSC_CHICKEN_BIT_0_UDW			_MMIO(0xe7c8 + 4)
+#define LSC_CHICKEN_BIT_0_UDW			MCR_REG(0xe7c8 + 4)
 #define   DIS_CHAIN_2XSIMD8			REG_BIT(55 - 32)
 #define   FORCE_SLM_FENCE_SCOPE_TO_TILE		REG_BIT(42 - 32)
 #define   FORCE_UGM_FENCE_SCOPE_TO_TILE		REG_BIT(41 - 32)
 #define   MAXREQS_PER_BANK			REG_GENMASK(39 - 32, 37 - 32)
 #define   DISABLE_128B_EVICTION_COMMAND_UDW	REG_BIT(36 - 32)
 
-#define SARB_CHICKEN1				_MMIO(0xe90c)
+#define SARB_CHICKEN1				MCR_REG(0xe90c)
 #define   COMP_CKN_IN				REG_GENMASK(30, 29)
 
 #define GEN7_ROW_CHICKEN2_GT2			_MMIO(0xf4f4)
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 09/14] drm/i915/gt: Always use MCR functions on multicast registers
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

Rather than relying on the implicit behavior of intel_uncore_*()
functions, let's always use the intel_gt_mcr_*() functions to operate on
multicast/replicated registers.

v2:
 - Add TLB invalidation registers

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c        | 42 ++++++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_mocs.c      | 13 +++----
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 12 ++++---
 drivers/gpu/drm/i915/intel_pm.c           | 20 ++++++-----
 4 files changed, 56 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index e14f159ad9fc..e763dc719d3a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -1017,6 +1017,32 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8,
 	return rb;
 }
 
+/*
+ * HW architecture suggest typical invalidation time at 40us,
+ * with pessimistic cases up to 100us and a recommendation to
+ * cap at 1ms. We go a bit higher just in case.
+ */
+#define TLB_INVAL_TIMEOUT_US 100
+#define TLB_INVAL_TIMEOUT_MS 4
+
+/*
+ * On Xe_HP the TLB invalidation registers are located at the same MMIO offsets
+ * but are now considered MCR registers.  Since they exist within a GAM range,
+ * the primary instance of the register rolls up the status from each unit.
+ */
+static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb)
+{
+	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+		return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0,
+						    TLB_INVAL_TIMEOUT_US,
+						    TLB_INVAL_TIMEOUT_MS);
+	else
+		return __intel_wait_for_register_fw(gt->uncore, rb.reg, rb.bit, 0,
+						    TLB_INVAL_TIMEOUT_US,
+						    TLB_INVAL_TIMEOUT_MS,
+						    NULL);
+}
+
 static void mmio_invalidate_full(struct intel_gt *gt)
 {
 	static const i915_reg_t gen8_regs[] = {
@@ -1099,22 +1125,12 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	for_each_engine_masked(engine, gt, awake, tmp) {
 		struct reg_and_bit rb;
 
-		/*
-		 * HW architecture suggest typical invalidation time at 40us,
-		 * with pessimistic cases up to 100us and a recommendation to
-		 * cap at 1ms. We go a bit higher just in case.
-		 */
-		const unsigned int timeout_us = 100;
-		const unsigned int timeout_ms = 4;
-
 		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
-		if (__intel_wait_for_register_fw(uncore,
-						 rb.reg, rb.bit, 0,
-						 timeout_us, timeout_ms,
-						 NULL))
+
+		if (wait_for_invalidate(gt, rb))
 			drm_err_ratelimited(&gt->i915->drm,
 					    "%s TLB invalidation did not complete in %ums!\n",
-					    engine->name, timeout_ms);
+					    engine->name, TLB_INVAL_TIMEOUT_MS);
 	}
 
 	/*
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 06643701bf24..89ef1e06bf1d 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -7,6 +7,7 @@
 
 #include "intel_engine.h"
 #include "intel_gt.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_regs.h"
 #include "intel_mocs.h"
 #include "intel_ring.h"
@@ -601,17 +602,17 @@ static u32 l3cc_combine(u16 low, u16 high)
 	     0; \
 	     i++)
 
-static void init_l3cc_table(struct intel_uncore *uncore,
+static void init_l3cc_table(struct intel_gt *gt,
 			    const struct drm_i915_mocs_table *table)
 {
 	unsigned int i;
 	u32 l3cc;
 
 	for_each_l3cc(l3cc, table, i)
-		if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50))
-			intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc);
+		if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+			intel_gt_mcr_multicast_write_fw(gt, XEHP_LNCFCMOCS(i), l3cc);
 		else
-			intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
+			intel_uncore_write_fw(gt->uncore, GEN9_LNCFCMOCS(i), l3cc);
 }
 
 void intel_mocs_init_engine(struct intel_engine_cs *engine)
@@ -631,7 +632,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
 		init_mocs_table(engine, &table);
 
 	if (flags & HAS_RENDER_L3CC && engine->class == RENDER_CLASS)
-		init_l3cc_table(engine->uncore, &table);
+		init_l3cc_table(engine->gt, &table);
 }
 
 static u32 global_mocs_offset(void)
@@ -667,7 +668,7 @@ void intel_mocs_init(struct intel_gt *gt)
 	 * memory transactions including guc transactions
 	 */
 	if (flags & HAS_RENDER_L3CC)
-		init_l3cc_table(gt->uncore, &table);
+		init_l3cc_table(gt, &table);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index 9229243992c2..5b86b2e286e0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -10,12 +10,15 @@
  */
 
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_mcr.h"
 #include "gt/intel_gt_regs.h"
 #include "intel_guc_fw.h"
 #include "i915_drv.h"
 
-static void guc_prepare_xfer(struct intel_uncore *uncore)
+static void guc_prepare_xfer(struct intel_gt *gt)
 {
+	struct intel_uncore *uncore = gt->uncore;
+
 	u32 shim_flags = GUC_ENABLE_READ_CACHE_LOGIC |
 			 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA |
 			 GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA |
@@ -35,8 +38,9 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
 
 	if (GRAPHICS_VER(uncore->i915) == 9) {
 		/* DOP Clock Gating Enable for GuC clocks */
-		intel_uncore_rmw(uncore, GEN8_MISCCPCTL,
-				 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE);
+		intel_gt_mcr_multicast_write(gt, GEN8_MISCCPCTL,
+					     GEN8_DOP_CLOCK_GATE_GUC_ENABLE |
+					     intel_gt_mcr_read_any(gt, GEN8_MISCCPCTL));
 
 		/* allows for 5us (in 10ns units) before GT can go to RC6 */
 		intel_uncore_write(uncore, GUC_ARAT_C6DIS, 0x1FF);
@@ -168,7 +172,7 @@ int intel_guc_fw_upload(struct intel_guc *guc)
 	struct intel_uncore *uncore = gt->uncore;
 	int ret;
 
-	guc_prepare_xfer(uncore);
+	guc_prepare_xfer(gt);
 
 	/*
 	 * Note that GuC needs the CSS header plus uKernel code to be copied
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c694433a7da5..381f3d7ef8a7 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -30,6 +30,8 @@
 #include "display/skl_watermark.h"
 
 #include "gt/intel_engine_regs.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_mcr.h"
 #include "gt/intel_gt_regs.h"
 
 #include "i915_drv.h"
@@ -4339,22 +4341,23 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	u32 val;
 
 	/* WaTempDisableDOPClkGating:bdw */
-	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL);
-	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
+	misccpctl = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
+				     misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
-	val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1);
+	val = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
 	val &= ~L3_PRIO_CREDITS_MASK;
 	val |= L3_GENERAL_PRIO_CREDITS(general_prio_credits);
 	val |= L3_HIGH_PRIO_CREDITS(high_prio_credits);
-	intel_uncore_write(&dev_priv->uncore, GEN8_L3SQCREG1, val);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_L3SQCREG1, val);
 
 	/*
 	 * Wait at least 100 clocks before re-enabling clock gating.
 	 * See the definition of L3SQCREG1 in BSpec.
 	 */
-	intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1);
+	intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
 	udelay(1);
-	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL, misccpctl);
 }
 
 static void icl_init_clock_gating(struct drm_i915_private *dev_priv)
@@ -4514,8 +4517,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 	gen9_init_clock_gating(dev_priv);
 
 	/* WaDisableDopClockGating:skl */
-	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &
-		   ~GEN8_DOP_CLOCK_GATE_ENABLE);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
+				     intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL) &
+				     ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
 	/* WAC6entrylatency:skl */
 	intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) |
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 09/14] drm/i915/gt: Always use MCR functions on multicast registers
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

Rather than relying on the implicit behavior of intel_uncore_*()
functions, let's always use the intel_gt_mcr_*() functions to operate on
multicast/replicated registers.

v2:
 - Add TLB invalidation registers

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c        | 42 ++++++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_mocs.c      | 13 +++----
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 12 ++++---
 drivers/gpu/drm/i915/intel_pm.c           | 20 ++++++-----
 4 files changed, 56 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index e14f159ad9fc..e763dc719d3a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -1017,6 +1017,32 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8,
 	return rb;
 }
 
+/*
+ * HW architecture suggest typical invalidation time at 40us,
+ * with pessimistic cases up to 100us and a recommendation to
+ * cap at 1ms. We go a bit higher just in case.
+ */
+#define TLB_INVAL_TIMEOUT_US 100
+#define TLB_INVAL_TIMEOUT_MS 4
+
+/*
+ * On Xe_HP the TLB invalidation registers are located at the same MMIO offsets
+ * but are now considered MCR registers.  Since they exist within a GAM range,
+ * the primary instance of the register rolls up the status from each unit.
+ */
+static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb)
+{
+	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+		return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0,
+						    TLB_INVAL_TIMEOUT_US,
+						    TLB_INVAL_TIMEOUT_MS);
+	else
+		return __intel_wait_for_register_fw(gt->uncore, rb.reg, rb.bit, 0,
+						    TLB_INVAL_TIMEOUT_US,
+						    TLB_INVAL_TIMEOUT_MS,
+						    NULL);
+}
+
 static void mmio_invalidate_full(struct intel_gt *gt)
 {
 	static const i915_reg_t gen8_regs[] = {
@@ -1099,22 +1125,12 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	for_each_engine_masked(engine, gt, awake, tmp) {
 		struct reg_and_bit rb;
 
-		/*
-		 * HW architecture suggest typical invalidation time at 40us,
-		 * with pessimistic cases up to 100us and a recommendation to
-		 * cap at 1ms. We go a bit higher just in case.
-		 */
-		const unsigned int timeout_us = 100;
-		const unsigned int timeout_ms = 4;
-
 		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
-		if (__intel_wait_for_register_fw(uncore,
-						 rb.reg, rb.bit, 0,
-						 timeout_us, timeout_ms,
-						 NULL))
+
+		if (wait_for_invalidate(gt, rb))
 			drm_err_ratelimited(&gt->i915->drm,
 					    "%s TLB invalidation did not complete in %ums!\n",
-					    engine->name, timeout_ms);
+					    engine->name, TLB_INVAL_TIMEOUT_MS);
 	}
 
 	/*
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 06643701bf24..89ef1e06bf1d 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -7,6 +7,7 @@
 
 #include "intel_engine.h"
 #include "intel_gt.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_regs.h"
 #include "intel_mocs.h"
 #include "intel_ring.h"
@@ -601,17 +602,17 @@ static u32 l3cc_combine(u16 low, u16 high)
 	     0; \
 	     i++)
 
-static void init_l3cc_table(struct intel_uncore *uncore,
+static void init_l3cc_table(struct intel_gt *gt,
 			    const struct drm_i915_mocs_table *table)
 {
 	unsigned int i;
 	u32 l3cc;
 
 	for_each_l3cc(l3cc, table, i)
-		if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50))
-			intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc);
+		if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+			intel_gt_mcr_multicast_write_fw(gt, XEHP_LNCFCMOCS(i), l3cc);
 		else
-			intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
+			intel_uncore_write_fw(gt->uncore, GEN9_LNCFCMOCS(i), l3cc);
 }
 
 void intel_mocs_init_engine(struct intel_engine_cs *engine)
@@ -631,7 +632,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
 		init_mocs_table(engine, &table);
 
 	if (flags & HAS_RENDER_L3CC && engine->class == RENDER_CLASS)
-		init_l3cc_table(engine->uncore, &table);
+		init_l3cc_table(engine->gt, &table);
 }
 
 static u32 global_mocs_offset(void)
@@ -667,7 +668,7 @@ void intel_mocs_init(struct intel_gt *gt)
 	 * memory transactions including guc transactions
 	 */
 	if (flags & HAS_RENDER_L3CC)
-		init_l3cc_table(gt->uncore, &table);
+		init_l3cc_table(gt, &table);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index 9229243992c2..5b86b2e286e0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -10,12 +10,15 @@
  */
 
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_mcr.h"
 #include "gt/intel_gt_regs.h"
 #include "intel_guc_fw.h"
 #include "i915_drv.h"
 
-static void guc_prepare_xfer(struct intel_uncore *uncore)
+static void guc_prepare_xfer(struct intel_gt *gt)
 {
+	struct intel_uncore *uncore = gt->uncore;
+
 	u32 shim_flags = GUC_ENABLE_READ_CACHE_LOGIC |
 			 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA |
 			 GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA |
@@ -35,8 +38,9 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
 
 	if (GRAPHICS_VER(uncore->i915) == 9) {
 		/* DOP Clock Gating Enable for GuC clocks */
-		intel_uncore_rmw(uncore, GEN8_MISCCPCTL,
-				 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE);
+		intel_gt_mcr_multicast_write(gt, GEN8_MISCCPCTL,
+					     GEN8_DOP_CLOCK_GATE_GUC_ENABLE |
+					     intel_gt_mcr_read_any(gt, GEN8_MISCCPCTL));
 
 		/* allows for 5us (in 10ns units) before GT can go to RC6 */
 		intel_uncore_write(uncore, GUC_ARAT_C6DIS, 0x1FF);
@@ -168,7 +172,7 @@ int intel_guc_fw_upload(struct intel_guc *guc)
 	struct intel_uncore *uncore = gt->uncore;
 	int ret;
 
-	guc_prepare_xfer(uncore);
+	guc_prepare_xfer(gt);
 
 	/*
 	 * Note that GuC needs the CSS header plus uKernel code to be copied
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c694433a7da5..381f3d7ef8a7 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -30,6 +30,8 @@
 #include "display/skl_watermark.h"
 
 #include "gt/intel_engine_regs.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_mcr.h"
 #include "gt/intel_gt_regs.h"
 
 #include "i915_drv.h"
@@ -4339,22 +4341,23 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	u32 val;
 
 	/* WaTempDisableDOPClkGating:bdw */
-	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL);
-	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
+	misccpctl = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
+				     misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
-	val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1);
+	val = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
 	val &= ~L3_PRIO_CREDITS_MASK;
 	val |= L3_GENERAL_PRIO_CREDITS(general_prio_credits);
 	val |= L3_HIGH_PRIO_CREDITS(high_prio_credits);
-	intel_uncore_write(&dev_priv->uncore, GEN8_L3SQCREG1, val);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_L3SQCREG1, val);
 
 	/*
 	 * Wait at least 100 clocks before re-enabling clock gating.
 	 * See the definition of L3SQCREG1 in BSpec.
 	 */
-	intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1);
+	intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
 	udelay(1);
-	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL, misccpctl);
 }
 
 static void icl_init_clock_gating(struct drm_i915_private *dev_priv)
@@ -4514,8 +4517,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 	gen9_init_clock_gating(dev_priv);
 
 	/* WaDisableDopClockGating:skl */
-	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &
-		   ~GEN8_DOP_CLOCK_GATE_ENABLE);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
+				     intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL) &
+				     ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
 	/* WAC6entrylatency:skl */
 	intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) |
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 10/14] drm/i915/guc: Handle save/restore of MCR registers explicitly
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

MCR registers can be placed on the GuC's save/restore list, but at the
moment they are always handled in a multicast manner (i.e., the GuC
reads one instance to save the value and then does a multicast write to
restore that single value to all instances).  In the future the GuC will
probably give us an alternate interface to do unicast per-instance
save/restore operations, so we should be very clear about which
registers on the list are MCR registers (and in the future which
save/restore behavior we want for them).

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 55 +++++++++++++---------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index cc357fa0c270..de923fb82301 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -278,24 +278,16 @@ __mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg)
 	return slot;
 }
 
-#define GUC_REGSET_STEERING(group, instance) ( \
-	FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \
-	FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \
-	GUC_REGSET_NEEDS_STEERING \
-)
-
 static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 					  struct temp_regset *regset,
-					  i915_reg_t reg, u32 flags)
+					  u32 offset, u32 flags)
 {
 	u32 count = regset->storage_used - (regset->registers - regset->storage);
-	u32 offset = i915_mmio_reg_offset(reg);
 	struct guc_mmio_reg entry = {
 		.offset = offset,
 		.flags = flags,
 	};
 	struct guc_mmio_reg *slot;
-	u8 group, inst;
 
 	/*
 	 * The mmio list is built using separate lists within the driver.
@@ -307,17 +299,6 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 		    sizeof(entry), guc_mmio_reg_cmp))
 		return 0;
 
-	/*
-	 * The GuC doesn't have a default steering, so we need to explicitly
-	 * steer all registers that need steering. However, we do not keep track
-	 * of all the steering ranges, only of those that have a chance of using
-	 * a non-default steering from the i915 pov. Instead of adding such
-	 * tracking, it is easier to just program the default steering for all
-	 * regs that don't need a non-default one.
-	 */
-	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
-	entry.flags |= GUC_REGSET_STEERING(group, inst);
-
 	slot = __mmio_reg_add(regset, &entry);
 	if (IS_ERR(slot))
 		return PTR_ERR(slot);
@@ -335,6 +316,38 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 
 #define GUC_MMIO_REG_ADD(gt, regset, reg, masked) \
 	guc_mmio_reg_add(gt, \
+			 regset, \
+			 i915_mmio_reg_offset(reg), \
+			 (masked) ? GUC_REGSET_MASKED : 0)
+
+#define GUC_REGSET_STEERING(group, instance) ( \
+	FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \
+	FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \
+	GUC_REGSET_NEEDS_STEERING \
+)
+
+static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
+					 struct temp_regset *regset,
+					 i915_reg_t reg, u32 flags)
+{
+	u8 group, inst;
+
+	/*
+	 * The GuC doesn't have a default steering, so we need to explicitly
+	 * steer all registers that need steering. However, we do not keep track
+	 * of all the steering ranges, only of those that have a chance of using
+	 * a non-default steering from the i915 pov. Instead of adding such
+	 * tracking, it is easier to just program the default steering for all
+	 * regs that don't need a non-default one.
+	 */
+	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
+	flags |= GUC_REGSET_STEERING(group, inst);
+
+	return guc_mmio_reg_add(gt, regset, i915_mmio_reg_offset(reg), flags);
+}
+
+#define GUC_MCR_REG_ADD(gt, regset, reg, masked) \
+	guc_mcr_reg_add(gt, \
 			 regset, \
 			 (reg), \
 			 (masked) ? GUC_REGSET_MASKED : 0)
@@ -375,7 +388,7 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
 	/* add in local MOCS registers */
 	for (i = 0; i < LNCFCMOCS_REG_COUNT; i++)
 		if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
-			ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false);
+			ret |= GUC_MCR_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false);
 		else
 			ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
 
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 10/14] drm/i915/guc: Handle save/restore of MCR registers explicitly
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

MCR registers can be placed on the GuC's save/restore list, but at the
moment they are always handled in a multicast manner (i.e., the GuC
reads one instance to save the value and then does a multicast write to
restore that single value to all instances).  In the future the GuC will
probably give us an alternate interface to do unicast per-instance
save/restore operations, so we should be very clear about which
registers on the list are MCR registers (and in the future which
save/restore behavior we want for them).

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 55 +++++++++++++---------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index cc357fa0c270..de923fb82301 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -278,24 +278,16 @@ __mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg)
 	return slot;
 }
 
-#define GUC_REGSET_STEERING(group, instance) ( \
-	FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \
-	FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \
-	GUC_REGSET_NEEDS_STEERING \
-)
-
 static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 					  struct temp_regset *regset,
-					  i915_reg_t reg, u32 flags)
+					  u32 offset, u32 flags)
 {
 	u32 count = regset->storage_used - (regset->registers - regset->storage);
-	u32 offset = i915_mmio_reg_offset(reg);
 	struct guc_mmio_reg entry = {
 		.offset = offset,
 		.flags = flags,
 	};
 	struct guc_mmio_reg *slot;
-	u8 group, inst;
 
 	/*
 	 * The mmio list is built using separate lists within the driver.
@@ -307,17 +299,6 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 		    sizeof(entry), guc_mmio_reg_cmp))
 		return 0;
 
-	/*
-	 * The GuC doesn't have a default steering, so we need to explicitly
-	 * steer all registers that need steering. However, we do not keep track
-	 * of all the steering ranges, only of those that have a chance of using
-	 * a non-default steering from the i915 pov. Instead of adding such
-	 * tracking, it is easier to just program the default steering for all
-	 * regs that don't need a non-default one.
-	 */
-	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
-	entry.flags |= GUC_REGSET_STEERING(group, inst);
-
 	slot = __mmio_reg_add(regset, &entry);
 	if (IS_ERR(slot))
 		return PTR_ERR(slot);
@@ -335,6 +316,38 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 
 #define GUC_MMIO_REG_ADD(gt, regset, reg, masked) \
 	guc_mmio_reg_add(gt, \
+			 regset, \
+			 i915_mmio_reg_offset(reg), \
+			 (masked) ? GUC_REGSET_MASKED : 0)
+
+#define GUC_REGSET_STEERING(group, instance) ( \
+	FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \
+	FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \
+	GUC_REGSET_NEEDS_STEERING \
+)
+
+static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
+					 struct temp_regset *regset,
+					 i915_reg_t reg, u32 flags)
+{
+	u8 group, inst;
+
+	/*
+	 * The GuC doesn't have a default steering, so we need to explicitly
+	 * steer all registers that need steering. However, we do not keep track
+	 * of all the steering ranges, only of those that have a chance of using
+	 * a non-default steering from the i915 pov. Instead of adding such
+	 * tracking, it is easier to just program the default steering for all
+	 * regs that don't need a non-default one.
+	 */
+	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
+	flags |= GUC_REGSET_STEERING(group, inst);
+
+	return guc_mmio_reg_add(gt, regset, i915_mmio_reg_offset(reg), flags);
+}
+
+#define GUC_MCR_REG_ADD(gt, regset, reg, masked) \
+	guc_mcr_reg_add(gt, \
 			 regset, \
 			 (reg), \
 			 (masked) ? GUC_REGSET_MASKED : 0)
@@ -375,7 +388,7 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
 	/* add in local MOCS registers */
 	for (i = 0; i < LNCFCMOCS_REG_COUNT; i++)
 		if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
-			ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false);
+			ret |= GUC_MCR_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false);
 		else
 			ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
 
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 11/14] drm/i915/gt: Add MCR-specific workaround initializers
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

Let's be more explicit about which of our workarounds are updating MCR
registers.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 428 +++++++++++-------
 .../gpu/drm/i915/gt/intel_workarounds_types.h |   4 +-
 2 files changed, 261 insertions(+), 171 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 7bb878581d00..c5cd89aa13c9 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -166,12 +166,33 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
 	_wa_add(wal, &wa);
 }
 
+static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
+		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
+{
+	struct i915_wa wa = {
+		.reg  = reg,
+		.clr  = clear,
+		.set  = set,
+		.read = read_mask,
+		.masked_reg = masked_reg,
+		.is_mcr = 1,
+	};
+
+	_wa_add(wal, &wa);
+}
+
 static void
 wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
 {
 	wa_add(wal, reg, clear, set, clear, false);
 }
 
+static void
+wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
+{
+	wa_mcr_add(wal, reg, clear, set, clear, false);
+}
+
 static void
 wa_write(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 {
@@ -184,12 +205,24 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 	wa_write_clr_set(wal, reg, set, set);
 }
 
+static void
+wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
+{
+	wa_mcr_write_clr_set(wal, reg, set, set);
+}
+
 static void
 wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
 {
 	wa_write_clr_set(wal, reg, clr, 0);
 }
 
+static void
+wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
+{
+	wa_mcr_write_clr_set(wal, reg, clr, 0);
+}
+
 /*
  * WA operations on "masked register". A masked register has the upper 16 bits
  * documented as "masked" in b-spec. Its purpose is to allow writing to just a
@@ -207,12 +240,24 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 	wa_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
 }
 
+static void
+wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+{
+	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
+}
+
 static void
 wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 {
 	wa_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
 }
 
+static void
+wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+{
+	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
+}
+
 static void
 wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
 		    u32 mask, u32 val)
@@ -220,6 +265,13 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
 	wa_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
 }
 
+static void
+wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
+			u32 mask, u32 val)
+{
+	wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
+}
+
 static void gen6_ctx_workarounds_init(struct intel_engine_cs *engine,
 				      struct i915_wa_list *wal)
 {
@@ -241,7 +293,7 @@ static void gen8_ctx_workarounds_init(struct intel_engine_cs *engine,
 	wa_masked_en(wal, RING_MI_MODE(RENDER_RING_BASE), ASYNC_FLIP_PERF_DISABLE);
 
 	/* WaDisablePartialInstShootdown:bdw,chv */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN,
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 		     PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
 	/* Use Force Non-Coherent whenever executing a 3D context. This is a
@@ -288,18 +340,18 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine,
 	gen8_ctx_workarounds_init(engine, wal);
 
 	/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
 
 	/* WaDisableDopClockGating:bdw
 	 *
 	 * Also see the related UCGTCL1 write in bdw_init_clock_gating()
 	 * to disable EUTC clock gating.
 	 */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN2,
-		     DOP_CLOCK_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
+			 DOP_CLOCK_GATING_DISABLE);
 
-	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
-		     GEN8_SAMPLER_POWER_BYPASS_DIS);
+	wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
+			 GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	wa_masked_en(wal, HDC_CHICKEN0,
 		     /* WaForceContextSaveRestoreNonCoherent:bdw */
@@ -314,7 +366,7 @@ static void chv_ctx_workarounds_init(struct intel_engine_cs *engine,
 	gen8_ctx_workarounds_init(engine, wal);
 
 	/* WaDisableThreadStallDopClockGating:chv */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
 
 	/* Improve HiZ throughput on CHV. */
 	wa_masked_en(wal, HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
@@ -333,21 +385,21 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 		 */
 		wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
 			     GEN9_PBE_COMPRESSED_HASH_SELECTION);
-		wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
-			     GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
+		wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
+				 GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
 	}
 
 	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
 	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN,
-		     FLOW_CONTROL_ENABLE |
-		     PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
+			 FLOW_CONTROL_ENABLE |
+			 PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
 	/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
 	/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
-	wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
-		     GEN9_ENABLE_YV12_BUGFIX |
-		     GEN9_ENABLE_GPGPU_PREEMPTION);
+	wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
+			 GEN9_ENABLE_YV12_BUGFIX |
+			 GEN9_ENABLE_GPGPU_PREEMPTION);
 
 	/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
 	/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
@@ -356,8 +408,8 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE);
 
 	/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
-	wa_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5,
-		      GEN9_CCS_TLB_PREFETCH_ENABLE);
+	wa_mcr_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5,
+			  GEN9_CCS_TLB_PREFETCH_ENABLE);
 
 	/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
 	wa_masked_en(wal, HDC_CHICKEN0,
@@ -386,11 +438,11 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 	    IS_KABYLAKE(i915) ||
 	    IS_COFFEELAKE(i915) ||
 	    IS_COMETLAKE(i915))
-		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
-			     GEN8_SAMPLER_POWER_BYPASS_DIS);
+		wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
+				 GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
-	wa_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
+	wa_mcr_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
 
 	/*
 	 * Supporting preemption with fine-granularity requires changes in the
@@ -469,8 +521,8 @@ static void bxt_ctx_workarounds_init(struct intel_engine_cs *engine,
 	gen9_ctx_workarounds_init(engine, wal);
 
 	/* WaDisableThreadStallDopClockGating:bxt */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN,
-		     STALL_DOP_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
+			 STALL_DOP_GATING_DISABLE);
 
 	/* WaToEnableHwFixForPushConstHWBug:bxt */
 	wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
@@ -490,8 +542,8 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
-		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
+			 GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
 static void glk_ctx_workarounds_init(struct intel_engine_cs *engine,
@@ -514,8 +566,8 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
-		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
+			 GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
 static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
@@ -534,13 +586,13 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 	 * (the register is whitelisted in hardware now, so UMDs can opt in
 	 * for coherency if they have a good reason).
 	 */
-	wa_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT);
+	wa_mcr_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT);
 
 	/* WaEnableFloatBlendOptimization:icl */
-	wa_add(wal, GEN10_CACHE_MODE_SS, 0,
-	       _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE),
-	       0 /* write-only, so skip validation */,
-	       true);
+	wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0,
+		   _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE),
+		   0 /* write-only, so skip validation */,
+		   true);
 
 	/* WaDisableGPGPUMidThreadPreemption:icl */
 	wa_masked_field_set(wal, GEN8_CS_CHICKEN1,
@@ -548,8 +600,8 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			    GEN9_PREEMPT_GPGPU_THREAD_GROUP_LEVEL);
 
 	/* allow headerless messages for preemptible GPGPU context */
-	wa_masked_en(wal, GEN10_SAMPLER_MODE,
-		     GEN11_SAMPLER_ENABLE_HEADLESS_MSG);
+	wa_mcr_masked_en(wal, GEN10_SAMPLER_MODE,
+			 GEN11_SAMPLER_ENABLE_HEADLESS_MSG);
 
 	/* Wa_1604278689:icl,ehl */
 	wa_write(wal, IVB_FBC_RT_BASE, 0xFFFFFFFF & ~ILK_FBC_RT_VALID);
@@ -558,7 +610,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			 0xFFFFFFFF);
 
 	/* Wa_1406306137:icl,ehl */
-	wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU);
+	wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU);
 }
 
 /*
@@ -569,13 +621,13 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 				   struct i915_wa_list *wal)
 {
 	wa_masked_en(wal, CHICKEN_RASTER_2, TBIMR_FAST_CLIP);
-	wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
-			 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
-	wa_add(wal,
-	       XEHP_FF_MODE2,
-	       FF_MODE2_TDS_TIMER_MASK,
-	       FF_MODE2_TDS_TIMER_128,
-	       0, false);
+	wa_mcr_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
+			     REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
+	wa_mcr_add(wal,
+		   XEHP_FF_MODE2,
+		   FF_MODE2_TDS_TIMER_MASK,
+		   FF_MODE2_TDS_TIMER_128,
+		   0, false);
 }
 
 /*
@@ -664,27 +716,27 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 
 	/* Wa_16011186671:dg2_g11 */
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
-		wa_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH);
-		wa_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE);
+		wa_mcr_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH);
+		wa_mcr_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_14010469329:dg2_g10 */
-		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
-			     XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
+		wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
+				 XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
 
 		/*
 		 * Wa_22010465075:dg2_g10
 		 * Wa_22010613112:dg2_g10
 		 * Wa_14010698770:dg2_g10
 		 */
-		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
-			     GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
+		wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
+				 GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
 	}
 
 	/* Wa_16013271637:dg2 */
-	wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1,
-		     MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
+	wa_mcr_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1,
+			 MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
 
 	/* Wa_14014947963:dg2 */
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_B0, STEP_FOREVER) ||
@@ -1264,9 +1316,9 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 		    PSDUNIT_CLKGATE_DIS);
 
 	/* Wa_1406680159:icl,ehl */
-	wa_write_or(wal,
-		    GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
-		    GWUNIT_CLKGATE_DIS);
+	wa_mcr_write_or(wal,
+			GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
+			GWUNIT_CLKGATE_DIS);
 
 	/* Wa_1607087056:icl,ehl,jsl */
 	if (IS_ICELAKE(i915) ||
@@ -1279,7 +1331,7 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	 * This is not a documented workaround, but rather an optimization
 	 * to reduce sampler power.
 	 */
-	wa_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
+	wa_mcr_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
 }
 
 /*
@@ -1313,7 +1365,7 @@ gen12_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	wa_14011060649(gt, wal);
 
 	/* Wa_14011059788:tgl,rkl,adl-s,dg1,adl-p */
-	wa_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
+	wa_mcr_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
 }
 
 static void
@@ -1325,9 +1377,9 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 
 	/* Wa_1409420604:tgl */
 	if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
-		wa_write_or(wal,
-			    SUBSLICE_UNIT_LEVEL_CLKGATE2,
-			    CPSSUNIT_CLKGATE_DIS);
+		wa_mcr_write_or(wal,
+				SUBSLICE_UNIT_LEVEL_CLKGATE2,
+				CPSSUNIT_CLKGATE_DIS);
 
 	/* Wa_1607087056:tgl also know as BUG:1409180338 */
 	if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
@@ -1356,9 +1408,9 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 
 	/* Wa_1409420604:dg1 */
 	if (IS_DG1(i915))
-		wa_write_or(wal,
-			    SUBSLICE_UNIT_LEVEL_CLKGATE2,
-			    CPSSUNIT_CLKGATE_DIS);
+		wa_mcr_write_or(wal,
+				SUBSLICE_UNIT_LEVEL_CLKGATE2,
+				CPSSUNIT_CLKGATE_DIS);
 
 	/* Wa_1408615072:dg1 */
 	/* Empirical testing shows this register is unaffected by engine reset. */
@@ -1375,7 +1427,7 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	xehp_init_mcr(gt, wal);
 
 	/* Wa_1409757795:xehpsdv */
-	wa_write_or(wal, SCCGCTL94DC, CG3DDISURB);
+	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
 
 	/* Wa_16011155590:xehpsdv */
 	if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
@@ -1455,8 +1507,8 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 			    CG3DDISCFEG_CLKGATE_DIS);
 
 		/* Wa_14011006942:dg2 */
-		wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
-			    DSS_ROUTER_CLKGATE_DIS);
+		wa_mcr_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
+				DSS_ROUTER_CLKGATE_DIS);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_B0)) {
@@ -1467,7 +1519,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 		wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS);
 
 		/* Wa_14011371254:dg2_g10 */
-		wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
+		wa_mcr_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
 
 		/* Wa_14011431319:dg2_g10 */
 		wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS |
@@ -1503,21 +1555,21 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 			    GAMEDIA_CLKGATE_DIS);
 
 		/* Wa_14011028019:dg2_g10 */
-		wa_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS);
+		wa_mcr_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS);
 	}
 
 	/* Wa_14014830051:dg2 */
-	wa_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN);
+	wa_mcr_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN);
 
 	/*
 	 * The following are not actually "workarounds" but rather
 	 * recommended tuning settings documented in the bspec's
 	 * performance guide section.
 	 */
-	wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS);
+	wa_mcr_write_or(wal, XEHP_SQCM, EN_32B_ACCESS);
 
 	/* Wa_14015795083 */
-	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
+	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
 static void
@@ -1526,7 +1578,7 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	pvc_init_mcr(gt, wal);
 
 	/* Wa_14015795083 */
-	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
+	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
 static void
@@ -1638,14 +1690,25 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
 		u32 val, old = 0;
 
 		/* open-coded rmw due to steering */
-		old = wa->clr ? intel_gt_mcr_read_any_fw(gt, wa->reg) : 0;
+		if (wa->clr)
+			old = wa->is_mcr ?
+				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_uncore_read_fw(uncore, wa->reg);
 		val = (old & ~wa->clr) | wa->set;
-		if (val != old || !wa->clr)
-			intel_uncore_write_fw(uncore, wa->reg, val);
+		if (val != old || !wa->clr) {
+			if (wa->is_mcr)
+				intel_gt_mcr_multicast_write_fw(gt, wa->reg, val);
+			else
+				intel_uncore_write_fw(uncore, wa->reg, val);
+		}
+
+		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
+			u32 val = wa->is_mcr ?
+				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_uncore_read_fw(uncore, wa->reg);
 
-		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
-			wa_verify(wa, intel_gt_mcr_read_any_fw(gt, wa->reg),
-				  wal->name, "application");
+			wa_verify(wa, val, wal->name, "application");
+		}
 	}
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
@@ -1674,8 +1737,9 @@ static bool wa_list_verify(struct intel_gt *gt,
 	intel_uncore_forcewake_get__locked(uncore, fw);
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
-		ok &= wa_verify(wa,
-				intel_gt_mcr_read_any_fw(gt, wa->reg),
+		ok &= wa_verify(wa, wa->is_mcr ?
+				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_uncore_read_fw(uncore, wa->reg),
 				wal->name, from);
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
@@ -1721,12 +1785,36 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
 	_wa_add(wal, &wa);
 }
 
+static void
+whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
+{
+	struct i915_wa wa = {
+		.reg = reg,
+		.is_mcr = 1,
+	};
+
+	if (GEM_DEBUG_WARN_ON(wal->count >= RING_MAX_NONPRIV_SLOTS))
+		return;
+
+	if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags)))
+		return;
+
+	wa.reg.reg |= flags;
+	_wa_add(wal, &wa);
+}
+
 static void
 whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
 {
 	whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
 }
 
+static void
+whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
+{
+	whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
+}
+
 static void gen9_whitelist_build(struct i915_wa_list *w)
 {
 	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
@@ -1752,7 +1840,7 @@ static void skl_whitelist_build(struct intel_engine_cs *engine)
 	gen9_whitelist_build(w);
 
 	/* WaDisableLSQCROPERFforOCL:skl */
-	whitelist_reg(w, GEN8_L3SQCREG4);
+	whitelist_mcr_reg(w, GEN8_L3SQCREG4);
 }
 
 static void bxt_whitelist_build(struct intel_engine_cs *engine)
@@ -1773,7 +1861,7 @@ static void kbl_whitelist_build(struct intel_engine_cs *engine)
 	gen9_whitelist_build(w);
 
 	/* WaDisableLSQCROPERFforOCL:kbl */
-	whitelist_reg(w, GEN8_L3SQCREG4);
+	whitelist_mcr_reg(w, GEN8_L3SQCREG4);
 }
 
 static void glk_whitelist_build(struct intel_engine_cs *engine)
@@ -1838,10 +1926,10 @@ static void icl_whitelist_build(struct intel_engine_cs *engine)
 	switch (engine->class) {
 	case RENDER_CLASS:
 		/* WaAllowUMDToModifyHalfSliceChicken7:icl */
-		whitelist_reg(w, GEN9_HALF_SLICE_CHICKEN7);
+		whitelist_mcr_reg(w, GEN9_HALF_SLICE_CHICKEN7);
 
 		/* WaAllowUMDToModifySamplerMode:icl */
-		whitelist_reg(w, GEN10_SAMPLER_MODE);
+		whitelist_mcr_reg(w, GEN10_SAMPLER_MODE);
 
 		/* WaEnableStateCacheRedirectToCS:icl */
 		whitelist_reg(w, GEN9_SLICE_COMMON_ECO_CHICKEN1);
@@ -2117,21 +2205,21 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14013392000:dg2_g11 */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) ||
 	    IS_DG2_G11(i915) || IS_DG2_G12(i915)) {
 		/* Wa_1509727124:dg2 */
-		wa_masked_en(wal, GEN10_SAMPLER_MODE,
-			     SC_DISABLE_POWER_OPTIMIZATION_EBB);
+		wa_mcr_masked_en(wal, GEN10_SAMPLER_MODE,
+				 SC_DISABLE_POWER_OPTIMIZATION_EBB);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0) ||
 	    IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14012419201:dg2 */
-		wa_masked_en(wal, GEN9_ROW_CHICKEN4,
-			     GEN12_DISABLE_HDR_PAST_PAYLOAD_HOLD_FIX);
+		wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4,
+				 GEN12_DISABLE_HDR_PAST_PAYLOAD_HOLD_FIX);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_C0) ||
@@ -2140,13 +2228,13 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_22012826095:dg2
 		 * Wa_22013059131:dg2
 		 */
-		wa_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW,
-				 MAXREQS_PER_BANK,
-				 REG_FIELD_PREP(MAXREQS_PER_BANK, 2));
+		wa_mcr_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW,
+				     MAXREQS_PER_BANK,
+				     REG_FIELD_PREP(MAXREQS_PER_BANK, 2));
 
 		/* Wa_22013059131:dg2 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0,
-			    FORCE_1_SUB_MESSAGE_PER_FRAGMENT);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0,
+				FORCE_1_SUB_MESSAGE_PER_FRAGMENT);
 	}
 
 	/* Wa_1308578152:dg2_g10 when first gslice is fused off */
@@ -2159,19 +2247,19 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) ||
 	    IS_DG2_G11(i915) || IS_DG2_G12(i915)) {
 		/* Wa_22013037850:dg2 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW,
-			    DISABLE_128B_EVICTION_COMMAND_UDW);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW,
+				DISABLE_128B_EVICTION_COMMAND_UDW);
 
 		/* Wa_22012856258:dg2 */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_DISABLE_READ_SUPPRESSION);
 
 		/*
 		 * Wa_22010960976:dg2
 		 * Wa_14013347512:dg2
 		 */
-		wa_masked_dis(wal, XEHP_HDC_CHICKEN0,
-			      LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK);
+		wa_mcr_masked_dis(wal, XEHP_HDC_CHICKEN0,
+				  LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) {
@@ -2179,7 +2267,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_1608949956:dg2_g10
 		 * Wa_14010198302:dg2_g10
 		 */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN,
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 			     MDQ_ARBITRATION_MODE | UGM_BACKUP_MODE);
 
 		/*
@@ -2188,31 +2276,31 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * LSC_CHICKEN_BIT_0 always reads back as 0 is this stepping,
 		 * so ignoring verification.
 		 */
-		wa_add(wal, LSC_CHICKEN_BIT_0_UDW, 0,
-		       FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE,
-		       0, false);
+		wa_mcr_add(wal, LSC_CHICKEN_BIT_0_UDW, 0,
+			   FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE,
+			   0, false);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_22010430635:dg2 */
-		wa_masked_en(wal,
-			     GEN9_ROW_CHICKEN4,
-			     GEN12_DISABLE_GRF_CLEAR);
+		wa_mcr_masked_en(wal,
+				 GEN9_ROW_CHICKEN4,
+				 GEN12_DISABLE_GRF_CLEAR);
 
 		/* Wa_14010648519:dg2 */
-		wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
+		wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 	}
 
 	/* Wa_14013202645:dg2 */
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_C0) ||
 	    IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0))
-		wa_write_or(wal, RT_CTRL, DIS_NULL_QUERY);
+		wa_mcr_write_or(wal, RT_CTRL, DIS_NULL_QUERY);
 
 	/* Wa_22012532006:dg2 */
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_C0) ||
 	    IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0))
-		wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
-			     DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA);
+		wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
+				 DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA);
 
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_14010680813:dg2_g10 */
@@ -2223,17 +2311,17 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) ||
 	    IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14012362059:dg2 */
-		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_B0, STEP_FOREVER) ||
 	    IS_DG2_G10(i915)) {
 		/* Wa_22014600077:dg2 */
-		wa_add(wal, GEN10_CACHE_MODE_SS, 0,
-		       _MASKED_BIT_ENABLE(ENABLE_EU_COUNT_FOR_TDL_FLUSH),
-		       0 /* Wa_14012342262 :write-only reg, so skip
-			    verification */,
-		       true);
+		wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0,
+			   _MASKED_BIT_ENABLE(ENABLE_EU_COUNT_FOR_TDL_FLUSH),
+			   0 /* Wa_14012342262 :write-only reg, so skip
+				verification */,
+			   true);
 	}
 
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
@@ -2260,7 +2348,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
 
 		/*
 		 * Wa_1407928979:tgl A*
@@ -2289,14 +2377,14 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	    IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
-			     GEN12_PUSH_CONST_DEREF_HOLD_DIS);
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
+				 GEN12_PUSH_CONST_DEREF_HOLD_DIS);
 
 		/*
 		 * Wa_1409085225:tgl
 		 * Wa_14010229206:tgl,rkl,dg1[a0],adl-s,adl-p
 		 */
-		wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH);
+		wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH);
 	}
 
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
@@ -2320,9 +2408,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915) ||
 	    IS_ALDERLAKE_S(i915) || IS_ALDERLAKE_P(i915)) {
 		/* Wa_1406941453:tgl,rkl,dg1,adl-s,adl-p */
-		wa_masked_en(wal,
-			     GEN10_SAMPLER_MODE,
-			     ENABLE_SMALLPL);
+		wa_mcr_masked_en(wal,
+				 GEN10_SAMPLER_MODE,
+				 ENABLE_SMALLPL);
 	}
 
 	if (GRAPHICS_VER(i915) == 11) {
@@ -2356,9 +2444,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_1405733216:icl
 		 * Formerly known as WaDisableCleanEvicts
 		 */
-		wa_write_or(wal,
-			    GEN8_L3SQCREG4,
-			    GEN11_LQSC_CLEAN_EVICT_DISABLE);
+		wa_mcr_write_or(wal,
+				GEN8_L3SQCREG4,
+				GEN11_LQSC_CLEAN_EVICT_DISABLE);
 
 		/* Wa_1606682166:icl */
 		wa_write_or(wal,
@@ -2366,10 +2454,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			    GEN7_DISABLE_SAMPLER_PREFETCH);
 
 		/* Wa_1409178092:icl */
-		wa_write_clr_set(wal,
-				 GEN11_SCRATCH2,
-				 GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE,
-				 0);
+		wa_mcr_write_clr_set(wal,
+				     GEN11_SCRATCH2,
+				     GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE,
+				     0);
 
 		/* WaEnable32PlaneMode:icl */
 		wa_masked_en(wal, GEN9_CSFE_CHICKEN1_RCS,
@@ -2427,30 +2515,30 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			     GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE);
 
 		/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
-		wa_write_or(wal,
-			    BDW_SCRATCH1,
-			    GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+		wa_mcr_write_or(wal,
+				BDW_SCRATCH1,
+				GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
 
 		/* WaProgramL3SqcReg1DefaultForPerf:bxt,glk */
 		if (IS_GEN9_LP(i915))
-			wa_write_clr_set(wal,
-					 GEN8_L3SQCREG1,
-					 L3_PRIO_CREDITS_MASK,
-					 L3_GENERAL_PRIO_CREDITS(62) |
-					 L3_HIGH_PRIO_CREDITS(2));
+			wa_mcr_write_clr_set(wal,
+					     GEN8_L3SQCREG1,
+					     L3_PRIO_CREDITS_MASK,
+					     L3_GENERAL_PRIO_CREDITS(62) |
+					     L3_HIGH_PRIO_CREDITS(2));
 
 		/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
-		wa_write_or(wal,
-			    GEN8_L3SQCREG4,
-			    GEN8_LQSC_FLUSH_COHERENT_LINES);
+		wa_mcr_write_or(wal,
+				GEN8_L3SQCREG4,
+				GEN8_LQSC_FLUSH_COHERENT_LINES);
 
 		/* Disable atomics in L3 to prevent unrecoverable hangs */
 		wa_write_clr_set(wal, GEN9_SCRATCH_LNCF1,
 				 GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0);
-		wa_write_clr_set(wal, GEN8_L3SQCREG4,
-				 GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
-		wa_write_clr_set(wal, GEN9_SCRATCH1,
-				 EVICTION_PERF_FIX_ENABLE, 0);
+		wa_mcr_write_clr_set(wal, GEN8_L3SQCREG4,
+				     GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
+		wa_mcr_write_clr_set(wal, GEN9_SCRATCH1,
+				     EVICTION_PERF_FIX_ENABLE, 0);
 	}
 
 	if (IS_HASWELL(i915)) {
@@ -2664,7 +2752,7 @@ ccs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 {
 	if (IS_PVC_CT_STEP(engine->i915, STEP_A0, STEP_C0)) {
 		/* Wa_14014999345:pvc */
-		wa_masked_en(wal, GEN10_CACHE_MODE_SS, DISABLE_ECC);
+		wa_mcr_masked_en(wal, GEN10_CACHE_MODE_SS, DISABLE_ECC);
 	}
 }
 
@@ -2690,8 +2778,8 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915,
 	}
 
 	if (IS_DG2(i915)) {
-		wa_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS);
-		wa_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512);
+		wa_mcr_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS);
+		wa_mcr_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512);
 
 		/*
 		 * This is also listed as Wa_22012654132 for certain DG2
@@ -2702,10 +2790,10 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915,
 		 * back for verification on DG2 (due to Wa_14012342262), so
 		 * we need to explicitly skip the readback.
 		 */
-		wa_add(wal, GEN10_CACHE_MODE_SS, 0,
-		       _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC),
-		       0 /* write-only, so skip validation */,
-		       true);
+		wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0,
+			   _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC),
+			   0 /* write-only, so skip validation */,
+			   true);
 	}
 
 	/*
@@ -2714,8 +2802,8 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915,
 	 * platforms.
 	 */
 	if (INTEL_INFO(i915)->tuning_thread_rr_after_dep)
-		wa_masked_field_set(wal, GEN9_ROW_CHICKEN4, THREAD_EX_ARB_MODE,
-				    THREAD_EX_ARB_MODE_RR_AFTER_DEP);
+		wa_mcr_masked_field_set(wal, GEN9_ROW_CHICKEN4, THREAD_EX_ARB_MODE,
+					THREAD_EX_ARB_MODE_RR_AFTER_DEP);
 }
 
 /*
@@ -2741,30 +2829,30 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 
 	if (IS_XEHPSDV(i915)) {
 		/* Wa_1409954639 */
-		wa_masked_en(wal,
-			     GEN8_ROW_CHICKEN,
-			     SYSTOLIC_DOP_CLOCK_GATING_DIS);
+		wa_mcr_masked_en(wal,
+				 GEN8_ROW_CHICKEN,
+				 SYSTOLIC_DOP_CLOCK_GATING_DIS);
 
 		/* Wa_1607196519 */
-		wa_masked_en(wal,
-			     GEN9_ROW_CHICKEN4,
-			     GEN12_DISABLE_GRF_CLEAR);
+		wa_mcr_masked_en(wal,
+				 GEN9_ROW_CHICKEN4,
+				 GEN12_DISABLE_GRF_CLEAR);
 
 		/* Wa_14010670810:xehpsdv */
-		wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
+		wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 
 		/* Wa_14010449647:xehpsdv */
-		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
-			     GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
+		wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
+				 GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
 
 		/* Wa_18011725039:xehpsdv */
 		if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A1, STEP_B0)) {
-			wa_masked_dis(wal, MLTICTXCTL, TDONRENDER);
-			wa_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH);
+			wa_mcr_masked_dis(wal, MLTICTXCTL, TDONRENDER);
+			wa_mcr_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH);
 		}
 
 		/* Wa_14012362059:xehpsdv */
-		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 
 		/* Wa_14014368820:xehpsdv */
 		wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS |
@@ -2773,19 +2861,19 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 
 	if (IS_DG2(i915) || IS_PONTEVECCHIO(i915)) {
 		/* Wa_14015227452:dg2,pvc */
-		wa_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE);
+		wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE);
 
 		/* Wa_22014226127:dg2,pvc */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE);
 
 		/* Wa_16015675438:dg2,pvc */
 		wa_masked_en(wal, FF_SLICE_CS_CHICKEN2, GEN12_PERF_FIX_BALANCING_CFE_DISABLE);
 
 		/* Wa_18018781329:dg2,pvc */
-		wa_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB);
-		wa_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB);
-		wa_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB);
-		wa_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB);
 	}
 
 	if (IS_DG2(i915)) {
@@ -2793,7 +2881,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 		 * Wa_16011620976:dg2_g11
 		 * Wa_22015475538:dg2
 		 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8);
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
index 8a4b6de4e754..f05b37e56fa9 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
@@ -15,7 +15,9 @@ struct i915_wa {
 	u32		clr;
 	u32		set;
 	u32		read;
-	bool		masked_reg;
+
+	u32		masked_reg:1;
+	u32		is_mcr:1;
 };
 
 struct i915_wa_list {
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 11/14] drm/i915/gt: Add MCR-specific workaround initializers
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

Let's be more explicit about which of our workarounds are updating MCR
registers.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 428 +++++++++++-------
 .../gpu/drm/i915/gt/intel_workarounds_types.h |   4 +-
 2 files changed, 261 insertions(+), 171 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 7bb878581d00..c5cd89aa13c9 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -166,12 +166,33 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
 	_wa_add(wal, &wa);
 }
 
+static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
+		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
+{
+	struct i915_wa wa = {
+		.reg  = reg,
+		.clr  = clear,
+		.set  = set,
+		.read = read_mask,
+		.masked_reg = masked_reg,
+		.is_mcr = 1,
+	};
+
+	_wa_add(wal, &wa);
+}
+
 static void
 wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
 {
 	wa_add(wal, reg, clear, set, clear, false);
 }
 
+static void
+wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
+{
+	wa_mcr_add(wal, reg, clear, set, clear, false);
+}
+
 static void
 wa_write(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 {
@@ -184,12 +205,24 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 	wa_write_clr_set(wal, reg, set, set);
 }
 
+static void
+wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
+{
+	wa_mcr_write_clr_set(wal, reg, set, set);
+}
+
 static void
 wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
 {
 	wa_write_clr_set(wal, reg, clr, 0);
 }
 
+static void
+wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
+{
+	wa_mcr_write_clr_set(wal, reg, clr, 0);
+}
+
 /*
  * WA operations on "masked register". A masked register has the upper 16 bits
  * documented as "masked" in b-spec. Its purpose is to allow writing to just a
@@ -207,12 +240,24 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 	wa_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
 }
 
+static void
+wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+{
+	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
+}
+
 static void
 wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 {
 	wa_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
 }
 
+static void
+wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+{
+	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
+}
+
 static void
 wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
 		    u32 mask, u32 val)
@@ -220,6 +265,13 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
 	wa_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
 }
 
+static void
+wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
+			u32 mask, u32 val)
+{
+	wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
+}
+
 static void gen6_ctx_workarounds_init(struct intel_engine_cs *engine,
 				      struct i915_wa_list *wal)
 {
@@ -241,7 +293,7 @@ static void gen8_ctx_workarounds_init(struct intel_engine_cs *engine,
 	wa_masked_en(wal, RING_MI_MODE(RENDER_RING_BASE), ASYNC_FLIP_PERF_DISABLE);
 
 	/* WaDisablePartialInstShootdown:bdw,chv */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN,
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 		     PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
 	/* Use Force Non-Coherent whenever executing a 3D context. This is a
@@ -288,18 +340,18 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine,
 	gen8_ctx_workarounds_init(engine, wal);
 
 	/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
 
 	/* WaDisableDopClockGating:bdw
 	 *
 	 * Also see the related UCGTCL1 write in bdw_init_clock_gating()
 	 * to disable EUTC clock gating.
 	 */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN2,
-		     DOP_CLOCK_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
+			 DOP_CLOCK_GATING_DISABLE);
 
-	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
-		     GEN8_SAMPLER_POWER_BYPASS_DIS);
+	wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
+			 GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	wa_masked_en(wal, HDC_CHICKEN0,
 		     /* WaForceContextSaveRestoreNonCoherent:bdw */
@@ -314,7 +366,7 @@ static void chv_ctx_workarounds_init(struct intel_engine_cs *engine,
 	gen8_ctx_workarounds_init(engine, wal);
 
 	/* WaDisableThreadStallDopClockGating:chv */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
 
 	/* Improve HiZ throughput on CHV. */
 	wa_masked_en(wal, HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
@@ -333,21 +385,21 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 		 */
 		wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
 			     GEN9_PBE_COMPRESSED_HASH_SELECTION);
-		wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
-			     GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
+		wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
+				 GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
 	}
 
 	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
 	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN,
-		     FLOW_CONTROL_ENABLE |
-		     PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
+			 FLOW_CONTROL_ENABLE |
+			 PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
 	/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
 	/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
-	wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
-		     GEN9_ENABLE_YV12_BUGFIX |
-		     GEN9_ENABLE_GPGPU_PREEMPTION);
+	wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
+			 GEN9_ENABLE_YV12_BUGFIX |
+			 GEN9_ENABLE_GPGPU_PREEMPTION);
 
 	/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
 	/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
@@ -356,8 +408,8 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE);
 
 	/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
-	wa_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5,
-		      GEN9_CCS_TLB_PREFETCH_ENABLE);
+	wa_mcr_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5,
+			  GEN9_CCS_TLB_PREFETCH_ENABLE);
 
 	/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
 	wa_masked_en(wal, HDC_CHICKEN0,
@@ -386,11 +438,11 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 	    IS_KABYLAKE(i915) ||
 	    IS_COFFEELAKE(i915) ||
 	    IS_COMETLAKE(i915))
-		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
-			     GEN8_SAMPLER_POWER_BYPASS_DIS);
+		wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
+				 GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
-	wa_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
+	wa_mcr_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
 
 	/*
 	 * Supporting preemption with fine-granularity requires changes in the
@@ -469,8 +521,8 @@ static void bxt_ctx_workarounds_init(struct intel_engine_cs *engine,
 	gen9_ctx_workarounds_init(engine, wal);
 
 	/* WaDisableThreadStallDopClockGating:bxt */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN,
-		     STALL_DOP_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
+			 STALL_DOP_GATING_DISABLE);
 
 	/* WaToEnableHwFixForPushConstHWBug:bxt */
 	wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
@@ -490,8 +542,8 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
-		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
+			 GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
 static void glk_ctx_workarounds_init(struct intel_engine_cs *engine,
@@ -514,8 +566,8 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
-		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
+			 GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
 static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
@@ -534,13 +586,13 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 	 * (the register is whitelisted in hardware now, so UMDs can opt in
 	 * for coherency if they have a good reason).
 	 */
-	wa_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT);
+	wa_mcr_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT);
 
 	/* WaEnableFloatBlendOptimization:icl */
-	wa_add(wal, GEN10_CACHE_MODE_SS, 0,
-	       _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE),
-	       0 /* write-only, so skip validation */,
-	       true);
+	wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0,
+		   _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE),
+		   0 /* write-only, so skip validation */,
+		   true);
 
 	/* WaDisableGPGPUMidThreadPreemption:icl */
 	wa_masked_field_set(wal, GEN8_CS_CHICKEN1,
@@ -548,8 +600,8 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			    GEN9_PREEMPT_GPGPU_THREAD_GROUP_LEVEL);
 
 	/* allow headerless messages for preemptible GPGPU context */
-	wa_masked_en(wal, GEN10_SAMPLER_MODE,
-		     GEN11_SAMPLER_ENABLE_HEADLESS_MSG);
+	wa_mcr_masked_en(wal, GEN10_SAMPLER_MODE,
+			 GEN11_SAMPLER_ENABLE_HEADLESS_MSG);
 
 	/* Wa_1604278689:icl,ehl */
 	wa_write(wal, IVB_FBC_RT_BASE, 0xFFFFFFFF & ~ILK_FBC_RT_VALID);
@@ -558,7 +610,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			 0xFFFFFFFF);
 
 	/* Wa_1406306137:icl,ehl */
-	wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU);
+	wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU);
 }
 
 /*
@@ -569,13 +621,13 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 				   struct i915_wa_list *wal)
 {
 	wa_masked_en(wal, CHICKEN_RASTER_2, TBIMR_FAST_CLIP);
-	wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
-			 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
-	wa_add(wal,
-	       XEHP_FF_MODE2,
-	       FF_MODE2_TDS_TIMER_MASK,
-	       FF_MODE2_TDS_TIMER_128,
-	       0, false);
+	wa_mcr_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
+			     REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
+	wa_mcr_add(wal,
+		   XEHP_FF_MODE2,
+		   FF_MODE2_TDS_TIMER_MASK,
+		   FF_MODE2_TDS_TIMER_128,
+		   0, false);
 }
 
 /*
@@ -664,27 +716,27 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 
 	/* Wa_16011186671:dg2_g11 */
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
-		wa_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH);
-		wa_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE);
+		wa_mcr_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH);
+		wa_mcr_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_14010469329:dg2_g10 */
-		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
-			     XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
+		wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
+				 XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
 
 		/*
 		 * Wa_22010465075:dg2_g10
 		 * Wa_22010613112:dg2_g10
 		 * Wa_14010698770:dg2_g10
 		 */
-		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
-			     GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
+		wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
+				 GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
 	}
 
 	/* Wa_16013271637:dg2 */
-	wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1,
-		     MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
+	wa_mcr_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1,
+			 MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
 
 	/* Wa_14014947963:dg2 */
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_B0, STEP_FOREVER) ||
@@ -1264,9 +1316,9 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 		    PSDUNIT_CLKGATE_DIS);
 
 	/* Wa_1406680159:icl,ehl */
-	wa_write_or(wal,
-		    GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
-		    GWUNIT_CLKGATE_DIS);
+	wa_mcr_write_or(wal,
+			GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
+			GWUNIT_CLKGATE_DIS);
 
 	/* Wa_1607087056:icl,ehl,jsl */
 	if (IS_ICELAKE(i915) ||
@@ -1279,7 +1331,7 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	 * This is not a documented workaround, but rather an optimization
 	 * to reduce sampler power.
 	 */
-	wa_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
+	wa_mcr_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
 }
 
 /*
@@ -1313,7 +1365,7 @@ gen12_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	wa_14011060649(gt, wal);
 
 	/* Wa_14011059788:tgl,rkl,adl-s,dg1,adl-p */
-	wa_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
+	wa_mcr_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
 }
 
 static void
@@ -1325,9 +1377,9 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 
 	/* Wa_1409420604:tgl */
 	if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
-		wa_write_or(wal,
-			    SUBSLICE_UNIT_LEVEL_CLKGATE2,
-			    CPSSUNIT_CLKGATE_DIS);
+		wa_mcr_write_or(wal,
+				SUBSLICE_UNIT_LEVEL_CLKGATE2,
+				CPSSUNIT_CLKGATE_DIS);
 
 	/* Wa_1607087056:tgl also know as BUG:1409180338 */
 	if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
@@ -1356,9 +1408,9 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 
 	/* Wa_1409420604:dg1 */
 	if (IS_DG1(i915))
-		wa_write_or(wal,
-			    SUBSLICE_UNIT_LEVEL_CLKGATE2,
-			    CPSSUNIT_CLKGATE_DIS);
+		wa_mcr_write_or(wal,
+				SUBSLICE_UNIT_LEVEL_CLKGATE2,
+				CPSSUNIT_CLKGATE_DIS);
 
 	/* Wa_1408615072:dg1 */
 	/* Empirical testing shows this register is unaffected by engine reset. */
@@ -1375,7 +1427,7 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	xehp_init_mcr(gt, wal);
 
 	/* Wa_1409757795:xehpsdv */
-	wa_write_or(wal, SCCGCTL94DC, CG3DDISURB);
+	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
 
 	/* Wa_16011155590:xehpsdv */
 	if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
@@ -1455,8 +1507,8 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 			    CG3DDISCFEG_CLKGATE_DIS);
 
 		/* Wa_14011006942:dg2 */
-		wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
-			    DSS_ROUTER_CLKGATE_DIS);
+		wa_mcr_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
+				DSS_ROUTER_CLKGATE_DIS);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_B0)) {
@@ -1467,7 +1519,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 		wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS);
 
 		/* Wa_14011371254:dg2_g10 */
-		wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
+		wa_mcr_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
 
 		/* Wa_14011431319:dg2_g10 */
 		wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS |
@@ -1503,21 +1555,21 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 			    GAMEDIA_CLKGATE_DIS);
 
 		/* Wa_14011028019:dg2_g10 */
-		wa_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS);
+		wa_mcr_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS);
 	}
 
 	/* Wa_14014830051:dg2 */
-	wa_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN);
+	wa_mcr_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN);
 
 	/*
 	 * The following are not actually "workarounds" but rather
 	 * recommended tuning settings documented in the bspec's
 	 * performance guide section.
 	 */
-	wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS);
+	wa_mcr_write_or(wal, XEHP_SQCM, EN_32B_ACCESS);
 
 	/* Wa_14015795083 */
-	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
+	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
 static void
@@ -1526,7 +1578,7 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	pvc_init_mcr(gt, wal);
 
 	/* Wa_14015795083 */
-	wa_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
+	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
 static void
@@ -1638,14 +1690,25 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
 		u32 val, old = 0;
 
 		/* open-coded rmw due to steering */
-		old = wa->clr ? intel_gt_mcr_read_any_fw(gt, wa->reg) : 0;
+		if (wa->clr)
+			old = wa->is_mcr ?
+				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_uncore_read_fw(uncore, wa->reg);
 		val = (old & ~wa->clr) | wa->set;
-		if (val != old || !wa->clr)
-			intel_uncore_write_fw(uncore, wa->reg, val);
+		if (val != old || !wa->clr) {
+			if (wa->is_mcr)
+				intel_gt_mcr_multicast_write_fw(gt, wa->reg, val);
+			else
+				intel_uncore_write_fw(uncore, wa->reg, val);
+		}
+
+		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
+			u32 val = wa->is_mcr ?
+				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_uncore_read_fw(uncore, wa->reg);
 
-		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
-			wa_verify(wa, intel_gt_mcr_read_any_fw(gt, wa->reg),
-				  wal->name, "application");
+			wa_verify(wa, val, wal->name, "application");
+		}
 	}
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
@@ -1674,8 +1737,9 @@ static bool wa_list_verify(struct intel_gt *gt,
 	intel_uncore_forcewake_get__locked(uncore, fw);
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
-		ok &= wa_verify(wa,
-				intel_gt_mcr_read_any_fw(gt, wa->reg),
+		ok &= wa_verify(wa, wa->is_mcr ?
+				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_uncore_read_fw(uncore, wa->reg),
 				wal->name, from);
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
@@ -1721,12 +1785,36 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
 	_wa_add(wal, &wa);
 }
 
+static void
+whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
+{
+	struct i915_wa wa = {
+		.reg = reg,
+		.is_mcr = 1,
+	};
+
+	if (GEM_DEBUG_WARN_ON(wal->count >= RING_MAX_NONPRIV_SLOTS))
+		return;
+
+	if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags)))
+		return;
+
+	wa.reg.reg |= flags;
+	_wa_add(wal, &wa);
+}
+
 static void
 whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
 {
 	whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
 }
 
+static void
+whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
+{
+	whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
+}
+
 static void gen9_whitelist_build(struct i915_wa_list *w)
 {
 	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
@@ -1752,7 +1840,7 @@ static void skl_whitelist_build(struct intel_engine_cs *engine)
 	gen9_whitelist_build(w);
 
 	/* WaDisableLSQCROPERFforOCL:skl */
-	whitelist_reg(w, GEN8_L3SQCREG4);
+	whitelist_mcr_reg(w, GEN8_L3SQCREG4);
 }
 
 static void bxt_whitelist_build(struct intel_engine_cs *engine)
@@ -1773,7 +1861,7 @@ static void kbl_whitelist_build(struct intel_engine_cs *engine)
 	gen9_whitelist_build(w);
 
 	/* WaDisableLSQCROPERFforOCL:kbl */
-	whitelist_reg(w, GEN8_L3SQCREG4);
+	whitelist_mcr_reg(w, GEN8_L3SQCREG4);
 }
 
 static void glk_whitelist_build(struct intel_engine_cs *engine)
@@ -1838,10 +1926,10 @@ static void icl_whitelist_build(struct intel_engine_cs *engine)
 	switch (engine->class) {
 	case RENDER_CLASS:
 		/* WaAllowUMDToModifyHalfSliceChicken7:icl */
-		whitelist_reg(w, GEN9_HALF_SLICE_CHICKEN7);
+		whitelist_mcr_reg(w, GEN9_HALF_SLICE_CHICKEN7);
 
 		/* WaAllowUMDToModifySamplerMode:icl */
-		whitelist_reg(w, GEN10_SAMPLER_MODE);
+		whitelist_mcr_reg(w, GEN10_SAMPLER_MODE);
 
 		/* WaEnableStateCacheRedirectToCS:icl */
 		whitelist_reg(w, GEN9_SLICE_COMMON_ECO_CHICKEN1);
@@ -2117,21 +2205,21 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14013392000:dg2_g11 */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) ||
 	    IS_DG2_G11(i915) || IS_DG2_G12(i915)) {
 		/* Wa_1509727124:dg2 */
-		wa_masked_en(wal, GEN10_SAMPLER_MODE,
-			     SC_DISABLE_POWER_OPTIMIZATION_EBB);
+		wa_mcr_masked_en(wal, GEN10_SAMPLER_MODE,
+				 SC_DISABLE_POWER_OPTIMIZATION_EBB);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0) ||
 	    IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14012419201:dg2 */
-		wa_masked_en(wal, GEN9_ROW_CHICKEN4,
-			     GEN12_DISABLE_HDR_PAST_PAYLOAD_HOLD_FIX);
+		wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4,
+				 GEN12_DISABLE_HDR_PAST_PAYLOAD_HOLD_FIX);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_C0) ||
@@ -2140,13 +2228,13 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_22012826095:dg2
 		 * Wa_22013059131:dg2
 		 */
-		wa_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW,
-				 MAXREQS_PER_BANK,
-				 REG_FIELD_PREP(MAXREQS_PER_BANK, 2));
+		wa_mcr_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW,
+				     MAXREQS_PER_BANK,
+				     REG_FIELD_PREP(MAXREQS_PER_BANK, 2));
 
 		/* Wa_22013059131:dg2 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0,
-			    FORCE_1_SUB_MESSAGE_PER_FRAGMENT);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0,
+				FORCE_1_SUB_MESSAGE_PER_FRAGMENT);
 	}
 
 	/* Wa_1308578152:dg2_g10 when first gslice is fused off */
@@ -2159,19 +2247,19 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) ||
 	    IS_DG2_G11(i915) || IS_DG2_G12(i915)) {
 		/* Wa_22013037850:dg2 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW,
-			    DISABLE_128B_EVICTION_COMMAND_UDW);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW,
+				DISABLE_128B_EVICTION_COMMAND_UDW);
 
 		/* Wa_22012856258:dg2 */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_DISABLE_READ_SUPPRESSION);
 
 		/*
 		 * Wa_22010960976:dg2
 		 * Wa_14013347512:dg2
 		 */
-		wa_masked_dis(wal, XEHP_HDC_CHICKEN0,
-			      LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK);
+		wa_mcr_masked_dis(wal, XEHP_HDC_CHICKEN0,
+				  LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) {
@@ -2179,7 +2267,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_1608949956:dg2_g10
 		 * Wa_14010198302:dg2_g10
 		 */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN,
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 			     MDQ_ARBITRATION_MODE | UGM_BACKUP_MODE);
 
 		/*
@@ -2188,31 +2276,31 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * LSC_CHICKEN_BIT_0 always reads back as 0 is this stepping,
 		 * so ignoring verification.
 		 */
-		wa_add(wal, LSC_CHICKEN_BIT_0_UDW, 0,
-		       FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE,
-		       0, false);
+		wa_mcr_add(wal, LSC_CHICKEN_BIT_0_UDW, 0,
+			   FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE,
+			   0, false);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_22010430635:dg2 */
-		wa_masked_en(wal,
-			     GEN9_ROW_CHICKEN4,
-			     GEN12_DISABLE_GRF_CLEAR);
+		wa_mcr_masked_en(wal,
+				 GEN9_ROW_CHICKEN4,
+				 GEN12_DISABLE_GRF_CLEAR);
 
 		/* Wa_14010648519:dg2 */
-		wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
+		wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 	}
 
 	/* Wa_14013202645:dg2 */
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_C0) ||
 	    IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0))
-		wa_write_or(wal, RT_CTRL, DIS_NULL_QUERY);
+		wa_mcr_write_or(wal, RT_CTRL, DIS_NULL_QUERY);
 
 	/* Wa_22012532006:dg2 */
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_C0) ||
 	    IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0))
-		wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
-			     DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA);
+		wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
+				 DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA);
 
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_14010680813:dg2_g10 */
@@ -2223,17 +2311,17 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) ||
 	    IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14012362059:dg2 */
-		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_B0, STEP_FOREVER) ||
 	    IS_DG2_G10(i915)) {
 		/* Wa_22014600077:dg2 */
-		wa_add(wal, GEN10_CACHE_MODE_SS, 0,
-		       _MASKED_BIT_ENABLE(ENABLE_EU_COUNT_FOR_TDL_FLUSH),
-		       0 /* Wa_14012342262 :write-only reg, so skip
-			    verification */,
-		       true);
+		wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0,
+			   _MASKED_BIT_ENABLE(ENABLE_EU_COUNT_FOR_TDL_FLUSH),
+			   0 /* Wa_14012342262 :write-only reg, so skip
+				verification */,
+			   true);
 	}
 
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
@@ -2260,7 +2348,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
 
 		/*
 		 * Wa_1407928979:tgl A*
@@ -2289,14 +2377,14 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	    IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
-			     GEN12_PUSH_CONST_DEREF_HOLD_DIS);
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
+				 GEN12_PUSH_CONST_DEREF_HOLD_DIS);
 
 		/*
 		 * Wa_1409085225:tgl
 		 * Wa_14010229206:tgl,rkl,dg1[a0],adl-s,adl-p
 		 */
-		wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH);
+		wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH);
 	}
 
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
@@ -2320,9 +2408,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915) ||
 	    IS_ALDERLAKE_S(i915) || IS_ALDERLAKE_P(i915)) {
 		/* Wa_1406941453:tgl,rkl,dg1,adl-s,adl-p */
-		wa_masked_en(wal,
-			     GEN10_SAMPLER_MODE,
-			     ENABLE_SMALLPL);
+		wa_mcr_masked_en(wal,
+				 GEN10_SAMPLER_MODE,
+				 ENABLE_SMALLPL);
 	}
 
 	if (GRAPHICS_VER(i915) == 11) {
@@ -2356,9 +2444,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_1405733216:icl
 		 * Formerly known as WaDisableCleanEvicts
 		 */
-		wa_write_or(wal,
-			    GEN8_L3SQCREG4,
-			    GEN11_LQSC_CLEAN_EVICT_DISABLE);
+		wa_mcr_write_or(wal,
+				GEN8_L3SQCREG4,
+				GEN11_LQSC_CLEAN_EVICT_DISABLE);
 
 		/* Wa_1606682166:icl */
 		wa_write_or(wal,
@@ -2366,10 +2454,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			    GEN7_DISABLE_SAMPLER_PREFETCH);
 
 		/* Wa_1409178092:icl */
-		wa_write_clr_set(wal,
-				 GEN11_SCRATCH2,
-				 GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE,
-				 0);
+		wa_mcr_write_clr_set(wal,
+				     GEN11_SCRATCH2,
+				     GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE,
+				     0);
 
 		/* WaEnable32PlaneMode:icl */
 		wa_masked_en(wal, GEN9_CSFE_CHICKEN1_RCS,
@@ -2427,30 +2515,30 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			     GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE);
 
 		/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
-		wa_write_or(wal,
-			    BDW_SCRATCH1,
-			    GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+		wa_mcr_write_or(wal,
+				BDW_SCRATCH1,
+				GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
 
 		/* WaProgramL3SqcReg1DefaultForPerf:bxt,glk */
 		if (IS_GEN9_LP(i915))
-			wa_write_clr_set(wal,
-					 GEN8_L3SQCREG1,
-					 L3_PRIO_CREDITS_MASK,
-					 L3_GENERAL_PRIO_CREDITS(62) |
-					 L3_HIGH_PRIO_CREDITS(2));
+			wa_mcr_write_clr_set(wal,
+					     GEN8_L3SQCREG1,
+					     L3_PRIO_CREDITS_MASK,
+					     L3_GENERAL_PRIO_CREDITS(62) |
+					     L3_HIGH_PRIO_CREDITS(2));
 
 		/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
-		wa_write_or(wal,
-			    GEN8_L3SQCREG4,
-			    GEN8_LQSC_FLUSH_COHERENT_LINES);
+		wa_mcr_write_or(wal,
+				GEN8_L3SQCREG4,
+				GEN8_LQSC_FLUSH_COHERENT_LINES);
 
 		/* Disable atomics in L3 to prevent unrecoverable hangs */
 		wa_write_clr_set(wal, GEN9_SCRATCH_LNCF1,
 				 GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0);
-		wa_write_clr_set(wal, GEN8_L3SQCREG4,
-				 GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
-		wa_write_clr_set(wal, GEN9_SCRATCH1,
-				 EVICTION_PERF_FIX_ENABLE, 0);
+		wa_mcr_write_clr_set(wal, GEN8_L3SQCREG4,
+				     GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
+		wa_mcr_write_clr_set(wal, GEN9_SCRATCH1,
+				     EVICTION_PERF_FIX_ENABLE, 0);
 	}
 
 	if (IS_HASWELL(i915)) {
@@ -2664,7 +2752,7 @@ ccs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 {
 	if (IS_PVC_CT_STEP(engine->i915, STEP_A0, STEP_C0)) {
 		/* Wa_14014999345:pvc */
-		wa_masked_en(wal, GEN10_CACHE_MODE_SS, DISABLE_ECC);
+		wa_mcr_masked_en(wal, GEN10_CACHE_MODE_SS, DISABLE_ECC);
 	}
 }
 
@@ -2690,8 +2778,8 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915,
 	}
 
 	if (IS_DG2(i915)) {
-		wa_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS);
-		wa_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512);
+		wa_mcr_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS);
+		wa_mcr_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512);
 
 		/*
 		 * This is also listed as Wa_22012654132 for certain DG2
@@ -2702,10 +2790,10 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915,
 		 * back for verification on DG2 (due to Wa_14012342262), so
 		 * we need to explicitly skip the readback.
 		 */
-		wa_add(wal, GEN10_CACHE_MODE_SS, 0,
-		       _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC),
-		       0 /* write-only, so skip validation */,
-		       true);
+		wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0,
+			   _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC),
+			   0 /* write-only, so skip validation */,
+			   true);
 	}
 
 	/*
@@ -2714,8 +2802,8 @@ add_render_compute_tuning_settings(struct drm_i915_private *i915,
 	 * platforms.
 	 */
 	if (INTEL_INFO(i915)->tuning_thread_rr_after_dep)
-		wa_masked_field_set(wal, GEN9_ROW_CHICKEN4, THREAD_EX_ARB_MODE,
-				    THREAD_EX_ARB_MODE_RR_AFTER_DEP);
+		wa_mcr_masked_field_set(wal, GEN9_ROW_CHICKEN4, THREAD_EX_ARB_MODE,
+					THREAD_EX_ARB_MODE_RR_AFTER_DEP);
 }
 
 /*
@@ -2741,30 +2829,30 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 
 	if (IS_XEHPSDV(i915)) {
 		/* Wa_1409954639 */
-		wa_masked_en(wal,
-			     GEN8_ROW_CHICKEN,
-			     SYSTOLIC_DOP_CLOCK_GATING_DIS);
+		wa_mcr_masked_en(wal,
+				 GEN8_ROW_CHICKEN,
+				 SYSTOLIC_DOP_CLOCK_GATING_DIS);
 
 		/* Wa_1607196519 */
-		wa_masked_en(wal,
-			     GEN9_ROW_CHICKEN4,
-			     GEN12_DISABLE_GRF_CLEAR);
+		wa_mcr_masked_en(wal,
+				 GEN9_ROW_CHICKEN4,
+				 GEN12_DISABLE_GRF_CLEAR);
 
 		/* Wa_14010670810:xehpsdv */
-		wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
+		wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 
 		/* Wa_14010449647:xehpsdv */
-		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
-			     GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
+		wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
+				 GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
 
 		/* Wa_18011725039:xehpsdv */
 		if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A1, STEP_B0)) {
-			wa_masked_dis(wal, MLTICTXCTL, TDONRENDER);
-			wa_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH);
+			wa_mcr_masked_dis(wal, MLTICTXCTL, TDONRENDER);
+			wa_mcr_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH);
 		}
 
 		/* Wa_14012362059:xehpsdv */
-		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 
 		/* Wa_14014368820:xehpsdv */
 		wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS |
@@ -2773,19 +2861,19 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 
 	if (IS_DG2(i915) || IS_PONTEVECCHIO(i915)) {
 		/* Wa_14015227452:dg2,pvc */
-		wa_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE);
+		wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE);
 
 		/* Wa_22014226127:dg2,pvc */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE);
 
 		/* Wa_16015675438:dg2,pvc */
 		wa_masked_en(wal, FF_SLICE_CS_CHICKEN2, GEN12_PERF_FIX_BALANCING_CFE_DISABLE);
 
 		/* Wa_18018781329:dg2,pvc */
-		wa_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB);
-		wa_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB);
-		wa_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB);
-		wa_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB);
 	}
 
 	if (IS_DG2(i915)) {
@@ -2793,7 +2881,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 		 * Wa_16011620976:dg2_g11
 		 * Wa_22015475538:dg2
 		 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8);
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
index 8a4b6de4e754..f05b37e56fa9 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
@@ -15,7 +15,9 @@ struct i915_wa {
 	u32		clr;
 	u32		set;
 	u32		read;
-	bool		masked_reg;
+
+	u32		masked_reg:1;
+	u32		is_mcr:1;
 };
 
 struct i915_wa_list {
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

Rather than treating multicast registers as 'i915_reg_t' let's define
them as a completely new type.  This will allow the compiler to help us
make sure we're using multicast-aware functions to operate on multicast
registers.

This plan does break down a bit in places where we're just maintaining
heterogeneous lists of registers (e.g., various MMIO whitelists used by
perf, GVT, etc.) rather than performing reads/writes.  We only really
care about the offset in those cases, so for now we can "cast" the
registers as non-MCR, leaving us with a list of i915_reg_t's, but we may
want to look for better ways to store mixed collections of i915_reg_t
and i915_mcr_reg_t in the future.

v2:
 - Add TLB invalidation registers

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c            | 32 +++++++----
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 53 ++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h        | 18 +++----
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 27 +++++++---
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 32 +++++------
 .../gpu/drm/i915/gt/intel_workarounds_types.h |  5 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c    |  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 +-
 drivers/gpu/drm/i915/gvt/handlers.c           | 17 +++---
 drivers/gpu/drm/i915/gvt/mmio_context.c       | 16 +++---
 drivers/gpu/drm/i915/i915_reg_defs.h          | 22 +++-----
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c   | 10 ++--
 13 files changed, 140 insertions(+), 100 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index e763dc719d3a..27dbb9e4bd6c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -991,7 +991,10 @@ void intel_gt_info_print(const struct intel_gt_info *info,
 }
 
 struct reg_and_bit {
-	i915_reg_t reg;
+	union {
+		i915_reg_t reg;
+		i915_mcr_reg_t mcr_reg;
+	};
 	u32 bit;
 };
 
@@ -1033,7 +1036,7 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8,
 static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb)
 {
 	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
-		return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0,
+		return intel_gt_mcr_wait_for_reg_fw(gt, rb.mcr_reg, rb.bit, 0,
 						    TLB_INVAL_TIMEOUT_US,
 						    TLB_INVAL_TIMEOUT_MS);
 	else
@@ -1058,7 +1061,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 		[COPY_ENGINE_CLASS]		= GEN12_BLT_TLB_INV_CR,
 		[COMPUTE_CLASS]			= GEN12_COMPCTX_TLB_INV_CR,
 	};
-	static const i915_reg_t xehp_regs[] = {
+	static const i915_mcr_reg_t xehp_regs[] = {
 		[RENDER_CLASS]			= XEHP_GFX_TLB_INV_CR,
 		[VIDEO_DECODE_CLASS]		= XEHP_VD_TLB_INV_CR,
 		[VIDEO_ENHANCEMENT_CLASS]	= XEHP_VE_TLB_INV_CR,
@@ -1074,7 +1077,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	unsigned int num = 0;
 
 	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
-		regs = xehp_regs;
+		regs = NULL;
 		num = ARRAY_SIZE(xehp_regs);
 	} else if (GRAPHICS_VER(i915) == 12) {
 		regs = gen12_regs;
@@ -1101,11 +1104,17 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 		if (!intel_engine_pm_is_awake(engine))
 			continue;
 
-		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
-		if (!i915_mmio_reg_offset(rb.reg))
-			continue;
+		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+			intel_gt_mcr_multicast_write_fw(gt,
+							xehp_regs[engine->class],
+							BIT(engine->instance));
+		} else {
+			rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+			if (!i915_mmio_reg_offset(rb.reg))
+				continue;
 
-		intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+			intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+		}
 		awake |= engine->mask;
 	}
 
@@ -1125,7 +1134,12 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	for_each_engine_masked(engine, gt, awake, tmp) {
 		struct reg_and_bit rb;
 
-		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+			rb.mcr_reg = xehp_regs[engine->class];
+			rb.bit = BIT(engine->instance);
+		} else {
+			rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+		}
 
 		if (wait_for_invalidate(gt, rb))
 			drm_err_ratelimited(&gt->i915->drm,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index a2bc77c33835..9e2caba64f19 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -150,6 +150,19 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 	}
 }
 
+/*
+ * Although the rest of the driver should use MCR-specific functions to
+ * read/write MCR registers, we still use the regular intel_uncore_* functions
+ * internally to implement those, so we need a way for the functions in this
+ * file to "cast" an i915_mcr_reg_t into an i915_reg_t.
+ */
+static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
+{
+	i915_reg_t r = { .reg = mcr.reg };
+
+	return r;
+}
+
 /*
  * rw_with_mcr_steering_fw - Access a register with specific MCR steering
  * @uncore: pointer to struct intel_uncore
@@ -164,7 +177,7 @@ void intel_gt_mcr_init(struct intel_gt *gt)
  * Caller needs to make sure the relevant forcewake wells are up.
  */
 static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
-				   i915_reg_t reg, u8 rw_flag,
+				   i915_mcr_reg_t reg, u8 rw_flag,
 				   int group, int instance, u32 value)
 {
 	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
@@ -201,9 +214,9 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
 
 	if (rw_flag == FW_REG_READ)
-		val = intel_uncore_read_fw(uncore, reg);
+		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
 	else
-		intel_uncore_write_fw(uncore, reg, value);
+		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
 
 	mcr &= ~mcr_mask;
 	mcr |= old_mcr & mcr_mask;
@@ -214,14 +227,14 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 }
 
 static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
-				i915_reg_t reg, u8 rw_flag,
+				i915_mcr_reg_t reg, u8 rw_flag,
 				int group, int instance,
 				u32 value)
 {
 	enum forcewake_domains fw_domains;
 	u32 val;
 
-	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
+	fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg),
 						    rw_flag);
 	fw_domains |= intel_uncore_forcewake_for_reg(uncore,
 						     GEN8_MCR_SELECTOR,
@@ -249,7 +262,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
  * group/instance.
  */
 u32 intel_gt_mcr_read(struct intel_gt *gt,
-		      i915_reg_t reg,
+		      i915_mcr_reg_t reg,
 		      int group, int instance)
 {
 	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, instance, 0);
@@ -266,7 +279,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
  * Write an MCR register in unicast mode after steering toward a specific
  * group/instance.
  */
-void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value,
+void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value,
 				int group, int instance)
 {
 	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, value);
@@ -281,9 +294,9 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value,
  * Write an MCR register in multicast mode to update all instances.
  */
 void intel_gt_mcr_multicast_write(struct intel_gt *gt,
-				i915_reg_t reg, u32 value)
+				i915_mcr_reg_t reg, u32 value)
 {
-	intel_uncore_write(gt->uncore, reg, value);
+	intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value);
 }
 
 /**
@@ -297,9 +310,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
  * domains; use intel_gt_mcr_multicast_write() in cases where forcewake should
  * be obtained automatically.
  */
-void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 value)
+void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value)
 {
-	intel_uncore_write_fw(gt->uncore, reg, value);
+	intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value);
 }
 
 /**
@@ -318,7 +331,7 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va
  * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should
  * be obtained automatically.
  */
-void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
+void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
 				u32 clear, u32 set)
 {
 	u32 val = intel_gt_mcr_read_any(gt, reg);
@@ -341,10 +354,10 @@ void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
  * for @type steering too.
  */
 static bool reg_needs_read_steering(struct intel_gt *gt,
-				    i915_reg_t reg,
+				    i915_mcr_reg_t reg,
 				    enum intel_steering_type type)
 {
-	const u32 offset = i915_mmio_reg_offset(reg);
+	const u32 offset = i915_mmio_reg_offset(mcr_reg_cast(reg));
 	const struct intel_mmio_range *entry;
 
 	if (likely(!gt->steering_table[type]))
@@ -424,7 +437,7 @@ static void get_nonterminated_steering(struct intel_gt *gt,
  * steering.
  */
 void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
-					     i915_reg_t reg,
+					     i915_mcr_reg_t reg,
 					     u8 *group, u8 *instance)
 {
 	int type;
@@ -453,7 +466,7 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
  *
  * Returns the value from a non-terminated instance of @reg.
  */
-u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
+u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
 {
 	int type;
 	u8 group, instance;
@@ -467,7 +480,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
 		}
 	}
 
-	return intel_uncore_read_fw(gt->uncore, reg);
+	return intel_uncore_read_fw(gt->uncore, mcr_reg_cast(reg));
 }
 
 /**
@@ -480,7 +493,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
  *
  * Returns the value from a non-terminated instance of @reg.
  */
-u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
+u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
 {
 	int type;
 	u8 group, instance;
@@ -494,7 +507,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
 		}
 	}
 
-	return intel_uncore_read(gt->uncore, reg);
+	return intel_uncore_read(gt->uncore, mcr_reg_cast(reg));
 }
 
 static void report_steering_type(struct drm_printer *p,
@@ -595,7 +608,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
  * Return: 0 if the register matches the desired condition, or -ETIMEDOUT.
  */
 int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
-				 i915_reg_t reg,
+				 i915_mcr_reg_t reg,
 				 u32 mask,
 				 u32 value,
 				 unsigned int fast_timeout_us,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
index 96979af01f2b..7fb574d31d73 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -11,24 +11,24 @@
 void intel_gt_mcr_init(struct intel_gt *gt);
 
 u32 intel_gt_mcr_read(struct intel_gt *gt,
-		      i915_reg_t reg,
+		      i915_mcr_reg_t reg,
 		      int group, int instance);
-u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
-u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
+u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg);
+u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg);
 
 void intel_gt_mcr_unicast_write(struct intel_gt *gt,
-				i915_reg_t reg, u32 value,
+				i915_mcr_reg_t reg, u32 value,
 				int group, int instance);
 void intel_gt_mcr_multicast_write(struct intel_gt *gt,
-				  i915_reg_t reg, u32 value);
+				  i915_mcr_reg_t reg, u32 value);
 void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
-				     i915_reg_t reg, u32 value);
+				     i915_mcr_reg_t reg, u32 value);
 
-void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
+void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
 				u32 clear, u32 set);
 
 void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
-					     i915_reg_t reg,
+					     i915_mcr_reg_t reg,
 					     u8 *group, u8 *instance);
 
 void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
@@ -38,7 +38,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
 				  unsigned int *group, unsigned int *instance);
 
 int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
-				 i915_reg_t reg,
+				 i915_mcr_reg_t reg,
 				 u32 mask,
 				 u32 value,
 				 unsigned int fast_timeout_us,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 901ea03b9fd8..b7341f01ec9f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -8,7 +8,18 @@
 
 #include "i915_reg_defs.h"
 
-#define MCR_REG(offset)	_MMIO(offset)
+#define MCR_REG(offset)	((const i915_mcr_reg_t){ .reg = (offset) })
+
+/*
+ * The perf control registers are technically multicast registers, but the
+ * driver never needs to read/write them directly; we only use them to build
+ * lists of registers (where they're mixed in with other non-MCR registers)
+ * and then operate on the offset directly.  For now we'll just define them
+ * as non-multicast so we can place them on the same list, but we may want
+ * to try to come up with a better way to handle heterogeneous lists of
+ * registers in the future.
+ */
+#define PERF_REG(offset)			_MMIO(offset)
 
 /* RPM unit config (Gen8+) */
 #define RPM_CONFIG0				_MMIO(0xd00)
@@ -1116,8 +1127,8 @@
 #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
 #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
 
-#define EU_PERF_CNTL0				MCR_REG(0xe458)
-#define EU_PERF_CNTL4				MCR_REG(0xe45c)
+#define EU_PERF_CNTL0				PERF_REG(0xe458)
+#define EU_PERF_CNTL4				PERF_REG(0xe45c)
 
 #define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
 #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
@@ -1154,16 +1165,16 @@
 #define   STACKID_CTRL				REG_GENMASK(6, 5)
 #define   STACKID_CTRL_512			REG_FIELD_PREP(STACKID_CTRL, 0x2)
 
-#define EU_PERF_CNTL1				MCR_REG(0xe558)
-#define EU_PERF_CNTL5				MCR_REG(0xe55c)
+#define EU_PERF_CNTL1				PERF_REG(0xe558)
+#define EU_PERF_CNTL5				PERF_REG(0xe55c)
 
 #define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
 #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
 #define ICL_HDC_MODE				MCR_REG(0xe5f4)
 
-#define EU_PERF_CNTL2				MCR_REG(0xe658)
-#define EU_PERF_CNTL6				MCR_REG(0xe65c)
-#define EU_PERF_CNTL3				MCR_REG(0xe758)
+#define EU_PERF_CNTL2				PERF_REG(0xe658)
+#define EU_PERF_CNTL6				PERF_REG(0xe65c)
+#define EU_PERF_CNTL3				PERF_REG(0xe758)
 
 #define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
 #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index c5cd89aa13c9..174c74dcda99 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -166,11 +166,11 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
 	_wa_add(wal, &wa);
 }
 
-static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
+static void wa_mcr_add(struct i915_wa_list *wal, i915_mcr_reg_t reg,
 		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
 {
 	struct i915_wa wa = {
-		.reg  = reg,
+		.mcr_reg = reg,
 		.clr  = clear,
 		.set  = set,
 		.read = read_mask,
@@ -188,7 +188,7 @@ wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
 }
 
 static void
-wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
+wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set)
 {
 	wa_mcr_add(wal, reg, clear, set, clear, false);
 }
@@ -206,7 +206,7 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 }
 
 static void
-wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
+wa_mcr_write_or(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set)
 {
 	wa_mcr_write_clr_set(wal, reg, set, set);
 }
@@ -218,7 +218,7 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
 }
 
 static void
-wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
+wa_mcr_write_clr(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clr)
 {
 	wa_mcr_write_clr_set(wal, reg, clr, 0);
 }
@@ -241,7 +241,7 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 }
 
 static void
-wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+wa_mcr_masked_en(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
 {
 	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
 }
@@ -253,7 +253,7 @@ wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 }
 
 static void
-wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+wa_mcr_masked_dis(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
 {
 	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
 }
@@ -266,7 +266,7 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
 }
 
 static void
-wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
+wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_mcr_reg_t reg,
 			u32 mask, u32 val)
 {
 	wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
@@ -1692,19 +1692,19 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
 		/* open-coded rmw due to steering */
 		if (wa->clr)
 			old = wa->is_mcr ?
-				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
 				intel_uncore_read_fw(uncore, wa->reg);
 		val = (old & ~wa->clr) | wa->set;
 		if (val != old || !wa->clr) {
 			if (wa->is_mcr)
-				intel_gt_mcr_multicast_write_fw(gt, wa->reg, val);
+				intel_gt_mcr_multicast_write_fw(gt, wa->mcr_reg, val);
 			else
 				intel_uncore_write_fw(uncore, wa->reg, val);
 		}
 
 		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
 			u32 val = wa->is_mcr ?
-				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
 				intel_uncore_read_fw(uncore, wa->reg);
 
 			wa_verify(wa, val, wal->name, "application");
@@ -1738,7 +1738,7 @@ static bool wa_list_verify(struct intel_gt *gt,
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
 		ok &= wa_verify(wa, wa->is_mcr ?
-				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
 				intel_uncore_read_fw(uncore, wa->reg),
 				wal->name, from);
 
@@ -1786,10 +1786,10 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
 }
 
 static void
-whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
+whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 flags)
 {
 	struct i915_wa wa = {
-		.reg = reg,
+		.mcr_reg = reg,
 		.is_mcr = 1,
 	};
 
@@ -1799,7 +1799,7 @@ whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
 	if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags)))
 		return;
 
-	wa.reg.reg |= flags;
+	wa.mcr_reg.reg |= flags;
 	_wa_add(wal, &wa);
 }
 
@@ -1810,7 +1810,7 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
 }
 
 static void
-whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
+whitelist_mcr_reg(struct i915_wa_list *wal, i915_mcr_reg_t reg)
 {
 	whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
index f05b37e56fa9..7c8b01d00043 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
@@ -11,7 +11,10 @@
 #include "i915_reg_defs.h"
 
 struct i915_wa {
-	i915_reg_t	reg;
+	union {
+		i915_reg_t	reg;
+		i915_mcr_reg_t	mcr_reg;
+	};
 	u32		clr;
 	u32		set;
 	u32		read;
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 67a9aab801dd..21b1edc052f8 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -991,7 +991,7 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg)
 	/* Alas, we must pardon some whitelists. Mistakes already made */
 	static const struct regmask pardon[] = {
 		{ GEN9_CTX_PREEMPT_REG, 9 },
-		{ GEN8_L3SQCREG4, 9 },
+		{ _MMIO(0xb118), 9 }, /* GEN8_L3SQCREG4 */
 	};
 
 	return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon));
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index de923fb82301..34ef4f36e660 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -328,7 +328,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 
 static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
 					 struct temp_regset *regset,
-					 i915_reg_t reg, u32 flags)
+					 i915_mcr_reg_t reg, u32 flags)
 {
 	u8 group, inst;
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
index 9495a7928bc8..d5c03e7a7843 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
@@ -240,7 +240,7 @@ static void guc_capture_free_extlists(struct __guc_mmio_reg_descr_group *reglist
 
 struct __ext_steer_reg {
 	const char *name;
-	i915_reg_t reg;
+	i915_mcr_reg_t reg;
 };
 
 static const struct __ext_steer_reg xe_extregs[] = {
@@ -252,7 +252,7 @@ static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
 			   const struct __ext_steer_reg *extlist,
 			   int slice_id, int subslice_id)
 {
-	ext->reg = extlist->reg;
+	ext->reg = _MMIO(i915_mmio_reg_offset(extlist->reg));
 	ext->flags = FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id);
 	ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id);
 	ext->regname = extlist->name;
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index 700cc9688f47..a52c4645cdbe 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -734,7 +734,7 @@ static i915_reg_t force_nonpriv_white_list[] = {
 	_MMIO(0x770c),
 	_MMIO(0x83a8),
 	_MMIO(0xb110),
-	GEN8_L3SQCREG4,//_MMIO(0xb118)
+	_MMIO(0xb118),
 	_MMIO(0xe100),
 	_MMIO(0xe18c),
 	_MMIO(0xe48c),
@@ -2140,6 +2140,9 @@ static int csfe_chicken1_mmio_write(struct intel_vgpu *vgpu,
 #define MMIO_DFH(reg, d, f, r, w) \
 	MMIO_F(reg, 4, f, 0, 0, d, r, w)
 
+#define MMIO_DFH_MCR(reg, d, f, r, w) \
+	MMIO_F(_MMIO(i915_mmio_reg_offset(reg)), 4, f, 0, 0, d, r, w)
+
 #define MMIO_GM(reg, d, r, w) \
 	MMIO_F(reg, 4, F_GMADR, 0xFFFFF000, 0, d, r, w)
 
@@ -2529,15 +2532,15 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
 
 	MMIO_DFH(HDC_CHICKEN0, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 
-	MMIO_DFH(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
-		NULL, NULL);
+	MMIO_DFH_MCR(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
+		     NULL, NULL);
 	MMIO_DFH(GEN7_ROW_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
 		NULL, NULL);
 	MMIO_DFH(GEN8_UCGCTL6, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
 
 	MMIO_DFH(_MMIO(0xb1f0), D_BDW, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xb1c0), D_BDW, F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
 
@@ -2550,7 +2553,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
 
 	MMIO_DFH(_MMIO(0xe194), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xe188), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x2580), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 
 	MMIO_DFH(_MMIO(0x2248), D_BDW, F_CMD_ACCESS, NULL, NULL);
@@ -2699,7 +2702,7 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
 	MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 3)), D_SKL_PLUS, NULL, NULL);
 	MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 4)), D_SKL_PLUS, NULL, NULL);
 
-	MMIO_DFH(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
 
 	MMIO_F(GEN9_GFX_MOCS(0), 0x7f8, F_CMD_ACCESS, 0, 0, D_SKL_PLUS,
 		NULL, NULL);
@@ -2769,7 +2772,7 @@ static int init_bxt_mmio_info(struct intel_gvt *gvt)
 	MMIO_DH(BXT_PORT_TX_DW3_LN0(DPIO_PHY1, DPIO_CH0), D_BXT,
 		bxt_port_tx_dw3_read, NULL);
 	MMIO_DH(BXT_DE_PLL_ENABLE, D_BXT, NULL, bxt_de_pll_enable_write);
-	MMIO_DFH(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(GEN8_L3CNTLREG, D_BXT, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x20D8), D_BXT, F_CMD_ACCESS, NULL, NULL);
 	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40, F_CMD_ACCESS,
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index d177884d8f7d..7f40055b0474 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -44,6 +44,8 @@
 
 #define GEN9_MOCS_SIZE		64
 
+#define MCR_CAST(mcr)	_MMIO(i915_mcr_reg_offset(mcr))
+
 /* Raw offset is appened to each line for convenience. */
 static struct engine_mmio gen8_engine_mmio_list[] __cacheline_aligned = {
 	{RCS0, RING_MODE_GEN7(RENDER_RING_BASE), 0xffff, false}, /* 0x229c */
@@ -106,15 +108,15 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
 	{RCS0, GEN8_CS_CHICKEN1, 0xffff, true}, /* 0x2580 */
 	{RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */
 	{RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */
-	{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
-	{RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
+	{RCS0, _MMIO(0xb118), 0, false}, /* GEN8_L3SQCREG4 */
+	{RCS0, _MMIO(0xb11c), 0, false}, /* GEN9_SCRATCH1 */
 	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
 	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
-	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
-	{RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
-	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
-	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
-	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
+	{RCS0, _MMIO(0xe180), 0xffff, true}, /* HALF_SLICE_CHICKEN2 */
+	{RCS0, _MMIO(0xe184), 0xffff, true}, /* 0xe184 */
+	{RCS0, _MMIO(0xe188), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN5 */
+	{RCS0, _MMIO(0xe194), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN7 */
+	{RCS0, _MMIO(0xe4f0), 0xffff, true}, /* GEN8_ROW_CHICKEN */
 	{RCS0, TRVATTL3PTRDW(0), 0, true}, /* 0x4de0 */
 	{RCS0, TRVATTL3PTRDW(1), 0, true}, /* 0x4de4 */
 	{RCS0, TRNULLDETCT, 0, true}, /* 0x4de8 */
diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
index 8f486f77609f..e823869b9afd 100644
--- a/drivers/gpu/drm/i915/i915_reg_defs.h
+++ b/drivers/gpu/drm/i915/i915_reg_defs.h
@@ -104,22 +104,16 @@ typedef struct {
 
 #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
 
-#define INVALID_MMIO_REG _MMIO(0)
-
-static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
-{
-	return reg.reg;
-}
+typedef struct {
+	u32 reg;
+} i915_mcr_reg_t;
 
-static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
-{
-	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
-}
+#define INVALID_MMIO_REG _MMIO(0)
 
-static inline bool i915_mmio_reg_valid(i915_reg_t reg)
-{
-	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
-}
+/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
+#define i915_mmio_reg_offset(r) (r.reg)
+#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
+#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
 
 #define VLV_DISPLAY_BASE		0x180000
 
diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
index 638b77d64bf4..8c1515f5c04c 100644
--- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
+++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
@@ -816,10 +816,10 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(GEN8_EU_DISABLE1);
 	MMIO_D(GEN8_EU_DISABLE2);
 	MMIO_D(_MMIO(0xfdc));
-	MMIO_D(GEN8_ROW_CHICKEN);
+	MMIO_D(_MMIO(0xe4f0));	/* GEN8_ROW_CHICKEN */
 	MMIO_D(GEN7_ROW_CHICKEN2);
 	MMIO_D(GEN8_UCGCTL6);
-	MMIO_D(GEN8_L3SQCREG4);
+	MMIO_D(_MMIO(0xb118));	/* GEN8_L3SQCREG4 */
 	MMIO_D(GEN9_SCRATCH_LNCF1);
 	MMIO_F(_MMIO(0x24d0), 48);
 	MMIO_D(_MMIO(0x44484));
@@ -831,7 +831,7 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(_MMIO(0x65f10));
 	MMIO_D(_MMIO(0xe194));
 	MMIO_D(_MMIO(0xe188));
-	MMIO_D(HALF_SLICE_CHICKEN2);
+	MMIO_D(_MMIO(0xe180));	/* HALF_SLICE_CHICKEN2 */
 	MMIO_D(_MMIO(0x2580));
 	MMIO_D(_MMIO(0xe220));
 	MMIO_D(_MMIO(0xe230));
@@ -1025,7 +1025,7 @@ static int iterate_skl_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(DMC_SSP_BASE);
 	MMIO_D(DMC_HTP_SKL);
 	MMIO_D(DMC_LAST_WRITE);
-	MMIO_D(BDW_SCRATCH1);
+	MMIO_D(_MMIO(0xb11c));	/* BDW_SCRATCH1 */
 	MMIO_D(SKL_DFSM);
 	MMIO_D(DISPIO_CR_TX_BMU_CR0);
 	MMIO_F(GEN9_GFX_MOCS(0), 0x7f8);
@@ -1228,7 +1228,7 @@ static int iterate_bxt_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(GEN8_PUSHBUS_ENABLE);
 	MMIO_D(GEN8_PUSHBUS_SHIFT);
 	MMIO_D(GEN6_GFXPAUSE);
-	MMIO_D(GEN8_L3SQCREG1);
+	MMIO_D(_MMIO(0xb100));	/* GEN8_L3SQCREG1 */
 	MMIO_D(GEN8_L3CNTLREG);
 	MMIO_D(_MMIO(0x20D8));
 	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

Rather than treating multicast registers as 'i915_reg_t' let's define
them as a completely new type.  This will allow the compiler to help us
make sure we're using multicast-aware functions to operate on multicast
registers.

This plan does break down a bit in places where we're just maintaining
heterogeneous lists of registers (e.g., various MMIO whitelists used by
perf, GVT, etc.) rather than performing reads/writes.  We only really
care about the offset in those cases, so for now we can "cast" the
registers as non-MCR, leaving us with a list of i915_reg_t's, but we may
want to look for better ways to store mixed collections of i915_reg_t
and i915_mcr_reg_t in the future.

v2:
 - Add TLB invalidation registers

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c            | 32 +++++++----
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 53 ++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h        | 18 +++----
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 27 +++++++---
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 32 +++++------
 .../gpu/drm/i915/gt/intel_workarounds_types.h |  5 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c    |  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 +-
 drivers/gpu/drm/i915/gvt/handlers.c           | 17 +++---
 drivers/gpu/drm/i915/gvt/mmio_context.c       | 16 +++---
 drivers/gpu/drm/i915/i915_reg_defs.h          | 22 +++-----
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c   | 10 ++--
 13 files changed, 140 insertions(+), 100 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index e763dc719d3a..27dbb9e4bd6c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -991,7 +991,10 @@ void intel_gt_info_print(const struct intel_gt_info *info,
 }
 
 struct reg_and_bit {
-	i915_reg_t reg;
+	union {
+		i915_reg_t reg;
+		i915_mcr_reg_t mcr_reg;
+	};
 	u32 bit;
 };
 
@@ -1033,7 +1036,7 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8,
 static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb)
 {
 	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
-		return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0,
+		return intel_gt_mcr_wait_for_reg_fw(gt, rb.mcr_reg, rb.bit, 0,
 						    TLB_INVAL_TIMEOUT_US,
 						    TLB_INVAL_TIMEOUT_MS);
 	else
@@ -1058,7 +1061,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 		[COPY_ENGINE_CLASS]		= GEN12_BLT_TLB_INV_CR,
 		[COMPUTE_CLASS]			= GEN12_COMPCTX_TLB_INV_CR,
 	};
-	static const i915_reg_t xehp_regs[] = {
+	static const i915_mcr_reg_t xehp_regs[] = {
 		[RENDER_CLASS]			= XEHP_GFX_TLB_INV_CR,
 		[VIDEO_DECODE_CLASS]		= XEHP_VD_TLB_INV_CR,
 		[VIDEO_ENHANCEMENT_CLASS]	= XEHP_VE_TLB_INV_CR,
@@ -1074,7 +1077,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	unsigned int num = 0;
 
 	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
-		regs = xehp_regs;
+		regs = NULL;
 		num = ARRAY_SIZE(xehp_regs);
 	} else if (GRAPHICS_VER(i915) == 12) {
 		regs = gen12_regs;
@@ -1101,11 +1104,17 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 		if (!intel_engine_pm_is_awake(engine))
 			continue;
 
-		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
-		if (!i915_mmio_reg_offset(rb.reg))
-			continue;
+		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+			intel_gt_mcr_multicast_write_fw(gt,
+							xehp_regs[engine->class],
+							BIT(engine->instance));
+		} else {
+			rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+			if (!i915_mmio_reg_offset(rb.reg))
+				continue;
 
-		intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+			intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+		}
 		awake |= engine->mask;
 	}
 
@@ -1125,7 +1134,12 @@ static void mmio_invalidate_full(struct intel_gt *gt)
 	for_each_engine_masked(engine, gt, awake, tmp) {
 		struct reg_and_bit rb;
 
-		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+			rb.mcr_reg = xehp_regs[engine->class];
+			rb.bit = BIT(engine->instance);
+		} else {
+			rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+		}
 
 		if (wait_for_invalidate(gt, rb))
 			drm_err_ratelimited(&gt->i915->drm,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index a2bc77c33835..9e2caba64f19 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -150,6 +150,19 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 	}
 }
 
+/*
+ * Although the rest of the driver should use MCR-specific functions to
+ * read/write MCR registers, we still use the regular intel_uncore_* functions
+ * internally to implement those, so we need a way for the functions in this
+ * file to "cast" an i915_mcr_reg_t into an i915_reg_t.
+ */
+static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
+{
+	i915_reg_t r = { .reg = mcr.reg };
+
+	return r;
+}
+
 /*
  * rw_with_mcr_steering_fw - Access a register with specific MCR steering
  * @uncore: pointer to struct intel_uncore
@@ -164,7 +177,7 @@ void intel_gt_mcr_init(struct intel_gt *gt)
  * Caller needs to make sure the relevant forcewake wells are up.
  */
 static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
-				   i915_reg_t reg, u8 rw_flag,
+				   i915_mcr_reg_t reg, u8 rw_flag,
 				   int group, int instance, u32 value)
 {
 	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
@@ -201,9 +214,9 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
 
 	if (rw_flag == FW_REG_READ)
-		val = intel_uncore_read_fw(uncore, reg);
+		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
 	else
-		intel_uncore_write_fw(uncore, reg, value);
+		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
 
 	mcr &= ~mcr_mask;
 	mcr |= old_mcr & mcr_mask;
@@ -214,14 +227,14 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 }
 
 static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
-				i915_reg_t reg, u8 rw_flag,
+				i915_mcr_reg_t reg, u8 rw_flag,
 				int group, int instance,
 				u32 value)
 {
 	enum forcewake_domains fw_domains;
 	u32 val;
 
-	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
+	fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg),
 						    rw_flag);
 	fw_domains |= intel_uncore_forcewake_for_reg(uncore,
 						     GEN8_MCR_SELECTOR,
@@ -249,7 +262,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
  * group/instance.
  */
 u32 intel_gt_mcr_read(struct intel_gt *gt,
-		      i915_reg_t reg,
+		      i915_mcr_reg_t reg,
 		      int group, int instance)
 {
 	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, instance, 0);
@@ -266,7 +279,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
  * Write an MCR register in unicast mode after steering toward a specific
  * group/instance.
  */
-void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value,
+void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value,
 				int group, int instance)
 {
 	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, value);
@@ -281,9 +294,9 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value,
  * Write an MCR register in multicast mode to update all instances.
  */
 void intel_gt_mcr_multicast_write(struct intel_gt *gt,
-				i915_reg_t reg, u32 value)
+				i915_mcr_reg_t reg, u32 value)
 {
-	intel_uncore_write(gt->uncore, reg, value);
+	intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value);
 }
 
 /**
@@ -297,9 +310,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
  * domains; use intel_gt_mcr_multicast_write() in cases where forcewake should
  * be obtained automatically.
  */
-void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 value)
+void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value)
 {
-	intel_uncore_write_fw(gt->uncore, reg, value);
+	intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value);
 }
 
 /**
@@ -318,7 +331,7 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va
  * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should
  * be obtained automatically.
  */
-void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
+void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
 				u32 clear, u32 set)
 {
 	u32 val = intel_gt_mcr_read_any(gt, reg);
@@ -341,10 +354,10 @@ void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
  * for @type steering too.
  */
 static bool reg_needs_read_steering(struct intel_gt *gt,
-				    i915_reg_t reg,
+				    i915_mcr_reg_t reg,
 				    enum intel_steering_type type)
 {
-	const u32 offset = i915_mmio_reg_offset(reg);
+	const u32 offset = i915_mmio_reg_offset(mcr_reg_cast(reg));
 	const struct intel_mmio_range *entry;
 
 	if (likely(!gt->steering_table[type]))
@@ -424,7 +437,7 @@ static void get_nonterminated_steering(struct intel_gt *gt,
  * steering.
  */
 void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
-					     i915_reg_t reg,
+					     i915_mcr_reg_t reg,
 					     u8 *group, u8 *instance)
 {
 	int type;
@@ -453,7 +466,7 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
  *
  * Returns the value from a non-terminated instance of @reg.
  */
-u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
+u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
 {
 	int type;
 	u8 group, instance;
@@ -467,7 +480,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
 		}
 	}
 
-	return intel_uncore_read_fw(gt->uncore, reg);
+	return intel_uncore_read_fw(gt->uncore, mcr_reg_cast(reg));
 }
 
 /**
@@ -480,7 +493,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
  *
  * Returns the value from a non-terminated instance of @reg.
  */
-u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
+u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
 {
 	int type;
 	u8 group, instance;
@@ -494,7 +507,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
 		}
 	}
 
-	return intel_uncore_read(gt->uncore, reg);
+	return intel_uncore_read(gt->uncore, mcr_reg_cast(reg));
 }
 
 static void report_steering_type(struct drm_printer *p,
@@ -595,7 +608,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
  * Return: 0 if the register matches the desired condition, or -ETIMEDOUT.
  */
 int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
-				 i915_reg_t reg,
+				 i915_mcr_reg_t reg,
 				 u32 mask,
 				 u32 value,
 				 unsigned int fast_timeout_us,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
index 96979af01f2b..7fb574d31d73 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -11,24 +11,24 @@
 void intel_gt_mcr_init(struct intel_gt *gt);
 
 u32 intel_gt_mcr_read(struct intel_gt *gt,
-		      i915_reg_t reg,
+		      i915_mcr_reg_t reg,
 		      int group, int instance);
-u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
-u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
+u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg);
+u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg);
 
 void intel_gt_mcr_unicast_write(struct intel_gt *gt,
-				i915_reg_t reg, u32 value,
+				i915_mcr_reg_t reg, u32 value,
 				int group, int instance);
 void intel_gt_mcr_multicast_write(struct intel_gt *gt,
-				  i915_reg_t reg, u32 value);
+				  i915_mcr_reg_t reg, u32 value);
 void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
-				     i915_reg_t reg, u32 value);
+				     i915_mcr_reg_t reg, u32 value);
 
-void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
+void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
 				u32 clear, u32 set);
 
 void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
-					     i915_reg_t reg,
+					     i915_mcr_reg_t reg,
 					     u8 *group, u8 *instance);
 
 void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
@@ -38,7 +38,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
 				  unsigned int *group, unsigned int *instance);
 
 int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
-				 i915_reg_t reg,
+				 i915_mcr_reg_t reg,
 				 u32 mask,
 				 u32 value,
 				 unsigned int fast_timeout_us,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 901ea03b9fd8..b7341f01ec9f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -8,7 +8,18 @@
 
 #include "i915_reg_defs.h"
 
-#define MCR_REG(offset)	_MMIO(offset)
+#define MCR_REG(offset)	((const i915_mcr_reg_t){ .reg = (offset) })
+
+/*
+ * The perf control registers are technically multicast registers, but the
+ * driver never needs to read/write them directly; we only use them to build
+ * lists of registers (where they're mixed in with other non-MCR registers)
+ * and then operate on the offset directly.  For now we'll just define them
+ * as non-multicast so we can place them on the same list, but we may want
+ * to try to come up with a better way to handle heterogeneous lists of
+ * registers in the future.
+ */
+#define PERF_REG(offset)			_MMIO(offset)
 
 /* RPM unit config (Gen8+) */
 #define RPM_CONFIG0				_MMIO(0xd00)
@@ -1116,8 +1127,8 @@
 #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
 #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
 
-#define EU_PERF_CNTL0				MCR_REG(0xe458)
-#define EU_PERF_CNTL4				MCR_REG(0xe45c)
+#define EU_PERF_CNTL0				PERF_REG(0xe458)
+#define EU_PERF_CNTL4				PERF_REG(0xe45c)
 
 #define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
 #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
@@ -1154,16 +1165,16 @@
 #define   STACKID_CTRL				REG_GENMASK(6, 5)
 #define   STACKID_CTRL_512			REG_FIELD_PREP(STACKID_CTRL, 0x2)
 
-#define EU_PERF_CNTL1				MCR_REG(0xe558)
-#define EU_PERF_CNTL5				MCR_REG(0xe55c)
+#define EU_PERF_CNTL1				PERF_REG(0xe558)
+#define EU_PERF_CNTL5				PERF_REG(0xe55c)
 
 #define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
 #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
 #define ICL_HDC_MODE				MCR_REG(0xe5f4)
 
-#define EU_PERF_CNTL2				MCR_REG(0xe658)
-#define EU_PERF_CNTL6				MCR_REG(0xe65c)
-#define EU_PERF_CNTL3				MCR_REG(0xe758)
+#define EU_PERF_CNTL2				PERF_REG(0xe658)
+#define EU_PERF_CNTL6				PERF_REG(0xe65c)
+#define EU_PERF_CNTL3				PERF_REG(0xe758)
 
 #define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
 #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index c5cd89aa13c9..174c74dcda99 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -166,11 +166,11 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
 	_wa_add(wal, &wa);
 }
 
-static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
+static void wa_mcr_add(struct i915_wa_list *wal, i915_mcr_reg_t reg,
 		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
 {
 	struct i915_wa wa = {
-		.reg  = reg,
+		.mcr_reg = reg,
 		.clr  = clear,
 		.set  = set,
 		.read = read_mask,
@@ -188,7 +188,7 @@ wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
 }
 
 static void
-wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
+wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set)
 {
 	wa_mcr_add(wal, reg, clear, set, clear, false);
 }
@@ -206,7 +206,7 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 }
 
 static void
-wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
+wa_mcr_write_or(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set)
 {
 	wa_mcr_write_clr_set(wal, reg, set, set);
 }
@@ -218,7 +218,7 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
 }
 
 static void
-wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
+wa_mcr_write_clr(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clr)
 {
 	wa_mcr_write_clr_set(wal, reg, clr, 0);
 }
@@ -241,7 +241,7 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 }
 
 static void
-wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+wa_mcr_masked_en(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
 {
 	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
 }
@@ -253,7 +253,7 @@ wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 }
 
 static void
-wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+wa_mcr_masked_dis(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
 {
 	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
 }
@@ -266,7 +266,7 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
 }
 
 static void
-wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
+wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_mcr_reg_t reg,
 			u32 mask, u32 val)
 {
 	wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
@@ -1692,19 +1692,19 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
 		/* open-coded rmw due to steering */
 		if (wa->clr)
 			old = wa->is_mcr ?
-				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
 				intel_uncore_read_fw(uncore, wa->reg);
 		val = (old & ~wa->clr) | wa->set;
 		if (val != old || !wa->clr) {
 			if (wa->is_mcr)
-				intel_gt_mcr_multicast_write_fw(gt, wa->reg, val);
+				intel_gt_mcr_multicast_write_fw(gt, wa->mcr_reg, val);
 			else
 				intel_uncore_write_fw(uncore, wa->reg, val);
 		}
 
 		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
 			u32 val = wa->is_mcr ?
-				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
 				intel_uncore_read_fw(uncore, wa->reg);
 
 			wa_verify(wa, val, wal->name, "application");
@@ -1738,7 +1738,7 @@ static bool wa_list_verify(struct intel_gt *gt,
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
 		ok &= wa_verify(wa, wa->is_mcr ?
-				intel_gt_mcr_read_any_fw(gt, wa->reg) :
+				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
 				intel_uncore_read_fw(uncore, wa->reg),
 				wal->name, from);
 
@@ -1786,10 +1786,10 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
 }
 
 static void
-whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
+whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 flags)
 {
 	struct i915_wa wa = {
-		.reg = reg,
+		.mcr_reg = reg,
 		.is_mcr = 1,
 	};
 
@@ -1799,7 +1799,7 @@ whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
 	if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags)))
 		return;
 
-	wa.reg.reg |= flags;
+	wa.mcr_reg.reg |= flags;
 	_wa_add(wal, &wa);
 }
 
@@ -1810,7 +1810,7 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
 }
 
 static void
-whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
+whitelist_mcr_reg(struct i915_wa_list *wal, i915_mcr_reg_t reg)
 {
 	whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
index f05b37e56fa9..7c8b01d00043 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
@@ -11,7 +11,10 @@
 #include "i915_reg_defs.h"
 
 struct i915_wa {
-	i915_reg_t	reg;
+	union {
+		i915_reg_t	reg;
+		i915_mcr_reg_t	mcr_reg;
+	};
 	u32		clr;
 	u32		set;
 	u32		read;
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 67a9aab801dd..21b1edc052f8 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -991,7 +991,7 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg)
 	/* Alas, we must pardon some whitelists. Mistakes already made */
 	static const struct regmask pardon[] = {
 		{ GEN9_CTX_PREEMPT_REG, 9 },
-		{ GEN8_L3SQCREG4, 9 },
+		{ _MMIO(0xb118), 9 }, /* GEN8_L3SQCREG4 */
 	};
 
 	return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon));
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index de923fb82301..34ef4f36e660 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -328,7 +328,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 
 static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
 					 struct temp_regset *regset,
-					 i915_reg_t reg, u32 flags)
+					 i915_mcr_reg_t reg, u32 flags)
 {
 	u8 group, inst;
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
index 9495a7928bc8..d5c03e7a7843 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
@@ -240,7 +240,7 @@ static void guc_capture_free_extlists(struct __guc_mmio_reg_descr_group *reglist
 
 struct __ext_steer_reg {
 	const char *name;
-	i915_reg_t reg;
+	i915_mcr_reg_t reg;
 };
 
 static const struct __ext_steer_reg xe_extregs[] = {
@@ -252,7 +252,7 @@ static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
 			   const struct __ext_steer_reg *extlist,
 			   int slice_id, int subslice_id)
 {
-	ext->reg = extlist->reg;
+	ext->reg = _MMIO(i915_mmio_reg_offset(extlist->reg));
 	ext->flags = FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id);
 	ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id);
 	ext->regname = extlist->name;
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index 700cc9688f47..a52c4645cdbe 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -734,7 +734,7 @@ static i915_reg_t force_nonpriv_white_list[] = {
 	_MMIO(0x770c),
 	_MMIO(0x83a8),
 	_MMIO(0xb110),
-	GEN8_L3SQCREG4,//_MMIO(0xb118)
+	_MMIO(0xb118),
 	_MMIO(0xe100),
 	_MMIO(0xe18c),
 	_MMIO(0xe48c),
@@ -2140,6 +2140,9 @@ static int csfe_chicken1_mmio_write(struct intel_vgpu *vgpu,
 #define MMIO_DFH(reg, d, f, r, w) \
 	MMIO_F(reg, 4, f, 0, 0, d, r, w)
 
+#define MMIO_DFH_MCR(reg, d, f, r, w) \
+	MMIO_F(_MMIO(i915_mmio_reg_offset(reg)), 4, f, 0, 0, d, r, w)
+
 #define MMIO_GM(reg, d, r, w) \
 	MMIO_F(reg, 4, F_GMADR, 0xFFFFF000, 0, d, r, w)
 
@@ -2529,15 +2532,15 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
 
 	MMIO_DFH(HDC_CHICKEN0, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 
-	MMIO_DFH(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
-		NULL, NULL);
+	MMIO_DFH_MCR(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
+		     NULL, NULL);
 	MMIO_DFH(GEN7_ROW_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
 		NULL, NULL);
 	MMIO_DFH(GEN8_UCGCTL6, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
 
 	MMIO_DFH(_MMIO(0xb1f0), D_BDW, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xb1c0), D_BDW, F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
 
@@ -2550,7 +2553,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
 
 	MMIO_DFH(_MMIO(0xe194), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xe188), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x2580), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 
 	MMIO_DFH(_MMIO(0x2248), D_BDW, F_CMD_ACCESS, NULL, NULL);
@@ -2699,7 +2702,7 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
 	MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 3)), D_SKL_PLUS, NULL, NULL);
 	MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 4)), D_SKL_PLUS, NULL, NULL);
 
-	MMIO_DFH(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
 
 	MMIO_F(GEN9_GFX_MOCS(0), 0x7f8, F_CMD_ACCESS, 0, 0, D_SKL_PLUS,
 		NULL, NULL);
@@ -2769,7 +2772,7 @@ static int init_bxt_mmio_info(struct intel_gvt *gvt)
 	MMIO_DH(BXT_PORT_TX_DW3_LN0(DPIO_PHY1, DPIO_CH0), D_BXT,
 		bxt_port_tx_dw3_read, NULL);
 	MMIO_DH(BXT_DE_PLL_ENABLE, D_BXT, NULL, bxt_de_pll_enable_write);
-	MMIO_DFH(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(GEN8_L3CNTLREG, D_BXT, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x20D8), D_BXT, F_CMD_ACCESS, NULL, NULL);
 	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40, F_CMD_ACCESS,
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index d177884d8f7d..7f40055b0474 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -44,6 +44,8 @@
 
 #define GEN9_MOCS_SIZE		64
 
+#define MCR_CAST(mcr)	_MMIO(i915_mcr_reg_offset(mcr))
+
 /* Raw offset is appened to each line for convenience. */
 static struct engine_mmio gen8_engine_mmio_list[] __cacheline_aligned = {
 	{RCS0, RING_MODE_GEN7(RENDER_RING_BASE), 0xffff, false}, /* 0x229c */
@@ -106,15 +108,15 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
 	{RCS0, GEN8_CS_CHICKEN1, 0xffff, true}, /* 0x2580 */
 	{RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */
 	{RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */
-	{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
-	{RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
+	{RCS0, _MMIO(0xb118), 0, false}, /* GEN8_L3SQCREG4 */
+	{RCS0, _MMIO(0xb11c), 0, false}, /* GEN9_SCRATCH1 */
 	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
 	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
-	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
-	{RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
-	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
-	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
-	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
+	{RCS0, _MMIO(0xe180), 0xffff, true}, /* HALF_SLICE_CHICKEN2 */
+	{RCS0, _MMIO(0xe184), 0xffff, true}, /* 0xe184 */
+	{RCS0, _MMIO(0xe188), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN5 */
+	{RCS0, _MMIO(0xe194), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN7 */
+	{RCS0, _MMIO(0xe4f0), 0xffff, true}, /* GEN8_ROW_CHICKEN */
 	{RCS0, TRVATTL3PTRDW(0), 0, true}, /* 0x4de0 */
 	{RCS0, TRVATTL3PTRDW(1), 0, true}, /* 0x4de4 */
 	{RCS0, TRNULLDETCT, 0, true}, /* 0x4de8 */
diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
index 8f486f77609f..e823869b9afd 100644
--- a/drivers/gpu/drm/i915/i915_reg_defs.h
+++ b/drivers/gpu/drm/i915/i915_reg_defs.h
@@ -104,22 +104,16 @@ typedef struct {
 
 #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
 
-#define INVALID_MMIO_REG _MMIO(0)
-
-static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
-{
-	return reg.reg;
-}
+typedef struct {
+	u32 reg;
+} i915_mcr_reg_t;
 
-static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
-{
-	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
-}
+#define INVALID_MMIO_REG _MMIO(0)
 
-static inline bool i915_mmio_reg_valid(i915_reg_t reg)
-{
-	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
-}
+/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
+#define i915_mmio_reg_offset(r) (r.reg)
+#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
+#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
 
 #define VLV_DISPLAY_BASE		0x180000
 
diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
index 638b77d64bf4..8c1515f5c04c 100644
--- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
+++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
@@ -816,10 +816,10 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(GEN8_EU_DISABLE1);
 	MMIO_D(GEN8_EU_DISABLE2);
 	MMIO_D(_MMIO(0xfdc));
-	MMIO_D(GEN8_ROW_CHICKEN);
+	MMIO_D(_MMIO(0xe4f0));	/* GEN8_ROW_CHICKEN */
 	MMIO_D(GEN7_ROW_CHICKEN2);
 	MMIO_D(GEN8_UCGCTL6);
-	MMIO_D(GEN8_L3SQCREG4);
+	MMIO_D(_MMIO(0xb118));	/* GEN8_L3SQCREG4 */
 	MMIO_D(GEN9_SCRATCH_LNCF1);
 	MMIO_F(_MMIO(0x24d0), 48);
 	MMIO_D(_MMIO(0x44484));
@@ -831,7 +831,7 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(_MMIO(0x65f10));
 	MMIO_D(_MMIO(0xe194));
 	MMIO_D(_MMIO(0xe188));
-	MMIO_D(HALF_SLICE_CHICKEN2);
+	MMIO_D(_MMIO(0xe180));	/* HALF_SLICE_CHICKEN2 */
 	MMIO_D(_MMIO(0x2580));
 	MMIO_D(_MMIO(0xe220));
 	MMIO_D(_MMIO(0xe230));
@@ -1025,7 +1025,7 @@ static int iterate_skl_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(DMC_SSP_BASE);
 	MMIO_D(DMC_HTP_SKL);
 	MMIO_D(DMC_LAST_WRITE);
-	MMIO_D(BDW_SCRATCH1);
+	MMIO_D(_MMIO(0xb11c));	/* BDW_SCRATCH1 */
 	MMIO_D(SKL_DFSM);
 	MMIO_D(DISPIO_CR_TX_BMU_CR0);
 	MMIO_F(GEN9_GFX_MOCS(0), 0x7f8);
@@ -1228,7 +1228,7 @@ static int iterate_bxt_mmio(struct intel_gvt_mmio_table_iter *iter)
 	MMIO_D(GEN8_PUSHBUS_ENABLE);
 	MMIO_D(GEN8_PUSHBUS_SHIFT);
 	MMIO_D(GEN6_GFXPAUSE);
-	MMIO_D(GEN8_L3SQCREG1);
+	MMIO_D(_MMIO(0xb100));	/* GEN8_L3SQCREG1 */
 	MMIO_D(GEN8_L3CNTLREG);
 	MMIO_D(_MMIO(0x20D8));
 	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 13/14] drm/i915/mtl: Add multicast steering for render GT
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx
  Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel,
	Radhakrishna Sripada

MTL once again changes the multicast register types and steering
details.  Key changes from past platforms:
 * The number of instances of some MCR types (NODE, OAAL2, and GAM) vary
   according to the MTL subplatform and cannot be read from fuse
   registers.
 * The MCR steering register (and its bitfields) has changed.

Unlike past platforms, we will be explicitly steering all types of MCR
accesses, including those for "SLICE" and "DSS" ranges; we no longer
rely on implicit steering.  On previous platforms, various
hardware/firmware agents that needed to access registers typically had
their own steering control registers, allowing them to perform multicast
steering without clobbering the CPU/kernel steering.  Starting with MTL,
more of these agents now share a single steering register (0xFD4) and it
is no longer safe for us to assume that the value will remain unchanged
from how we initialized it during startup.  There is also a slight
chance of race conditions between the driver and a hardware/firmware
agent, so the hardware provides a semaphore register that can be used to
coordinate access to the steering register.  Support for the semaphore
register will be introduced in a future patch.

Bspec: 67788, 67112
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 85 ++++++++++++++++++---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  5 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h    |  8 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 ++++-
 drivers/gpu/drm/i915/i915_pci.c             |  1 +
 5 files changed, 102 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index 9e2caba64f19..a1fa71b47831 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -41,6 +41,7 @@ static const char * const intel_steering_types[] = {
 	"MSLICE",
 	"LNCF",
 	"GAM",
+	"DSS",
 	"INSTANCE 0",
 };
 
@@ -99,9 +100,40 @@ static const struct intel_mmio_range pvc_instance0_steering_table[] = {
 	{},
 };
 
+static const struct intel_mmio_range mtl3d_instance0_steering_table[] = {
+	{ 0x000B00, 0x000BFF },         /* SQIDI */
+	{ 0x001000, 0x001FFF },         /* SQIDI */
+	{ 0x004000, 0x0048FF },         /* GAM */
+	{ 0x008700, 0x0087FF },         /* SQIDI */
+	{ 0x00B000, 0x00B0FF },         /* NODE */
+	{ 0x00C800, 0x00CFFF },         /* GAM */
+	{ 0x00D880, 0x00D8FF },         /* NODE */
+	{ 0x00DD00, 0x00DDFF },         /* OAAL2 */
+	{},
+};
+
+static const struct intel_mmio_range mtl3d_l3bank_steering_table[] = {
+	{ 0x00B100, 0x00B3FF },
+	{},
+};
+
+/* DSS steering is used for SLICE ranges as well */
+static const struct intel_mmio_range mtl3d_dss_steering_table[] = {
+	{ 0x005200, 0x0052FF },		/* SLICE */
+	{ 0x005500, 0x007FFF },		/* SLICE */
+	{ 0x008140, 0x00815F },		/* SLICE (0x8140-0x814F), DSS (0x8150-0x815F) */
+	{ 0x0094D0, 0x00955F },		/* SLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */
+	{ 0x009680, 0x0096FF },		/* DSS */
+	{ 0x00D800, 0x00D87F },		/* SLICE */
+	{ 0x00DC00, 0x00DCFF },		/* SLICE */
+	{ 0x00DE80, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
+};
+
 void intel_gt_mcr_init(struct intel_gt *gt)
 {
 	struct drm_i915_private *i915 = gt->i915;
+	unsigned long fuse;
+	int i;
 
 	/*
 	 * An mslice is unavailable only if both the meml3 for the slice is
@@ -119,7 +151,22 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 			drm_warn(&i915->drm, "mslice mask all zero!\n");
 	}
 
-	if (IS_PONTEVECCHIO(i915)) {
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) &&
+	    gt->type == GT_PRIMARY) {
+		fuse = REG_FIELD_GET(GT_L3_EXC_MASK,
+				     intel_uncore_read(gt->uncore, XEHP_FUSE4));
+
+		/*
+		 * Despite the register field being named "exclude mask" the
+		 * bits actually represent enabled banks (two banks per bit).
+		 */
+		for_each_set_bit(i, &fuse, 3)
+			gt->info.l3bank_mask |= (0x3 << 2*i);
+
+		gt->steering_table[INSTANCE0] = mtl3d_instance0_steering_table;
+		gt->steering_table[L3BANK] = mtl3d_l3bank_steering_table;
+		gt->steering_table[DSS] = mtl3d_dss_steering_table;
+	} else if (IS_PONTEVECCHIO(i915)) {
 		gt->steering_table[INSTANCE0] = pvc_instance0_steering_table;
 	} else if (IS_DG2(i915)) {
 		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
@@ -184,7 +231,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 
 	lockdep_assert_held(&uncore->lock);
 
-	if (GRAPHICS_VER(uncore->i915) >= 11) {
+	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) {
+		/* No need to save old steering reg value */
+		intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR,
+				      REG_FIELD_PREP(MTL_MCR_GROUPID, group) |
+				      REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance));
+	} else if (GRAPHICS_VER(uncore->i915) >= 11) {
 		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
 		mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance);
 
@@ -202,26 +254,30 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 		 */
 		if (rw_flag == FW_REG_WRITE)
 			mcr_mask |= GEN11_MCR_MULTICAST;
+
+		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
+
+		mcr &= ~mcr_mask;
+		mcr |= mcr_ss;
+		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
 	} else {
 		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
 		mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance);
-	}
 
-	old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
+		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
 
-	mcr &= ~mcr_mask;
-	mcr |= mcr_ss;
-	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
+		mcr &= ~mcr_mask;
+		mcr |= mcr_ss;
+		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
+	}
 
 	if (rw_flag == FW_REG_READ)
 		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
 	else
 		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
 
-	mcr &= ~mcr_mask;
-	mcr |= old_mcr & mcr_mask;
-
-	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
+	if (GRAPHICS_VER_FULL(uncore->i915) < IP_VER(12, 70))
+		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
 
 	return val;
 }
@@ -385,6 +441,8 @@ static void get_nonterminated_steering(struct intel_gt *gt,
 				       enum intel_steering_type type,
 				       u8 *group, u8 *instance)
 {
+	u32 dss;
+
 	switch (type) {
 	case L3BANK:
 		*group = 0;		/* unused */
@@ -408,6 +466,11 @@ static void get_nonterminated_steering(struct intel_gt *gt,
 		*group = IS_DG2(gt->i915) ? 1 : 0;
 		*instance = 0;
 		break;
+	case DSS:
+		dss = intel_sseu_find_first_xehp_dss(&gt->info.sseu, 0, 0);
+		*group = dss / GEN_DSS_PER_GSLICE;
+		*instance = dss % GEN_DSS_PER_GSLICE;
+		break;
 	case INSTANCE0:
 		/*
 		 * There are a lot of MCR types for which instance (0, 0)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index b7341f01ec9f..c5b9671097e3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -59,6 +59,7 @@
 #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
 
 #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
+#define MTL_MCR_SELECTOR			_MMIO(0xfd4)
 #define SF_MCR_SELECTOR				_MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)
 #define GAM_MCR_SELECTOR			_MMIO(0xfe0)
@@ -71,6 +72,8 @@
 #define   GEN11_MCR_SLICE_MASK			GEN11_MCR_SLICE(0xf)
 #define   GEN11_MCR_SUBSLICE(subslice)		(((subslice) & 0x7) << 24)
 #define   GEN11_MCR_SUBSLICE_MASK		GEN11_MCR_SUBSLICE(0x7)
+#define   MTL_MCR_GROUPID			REG_GENMASK(11, 8)
+#define   MTL_MCR_INSTANCEID			REG_GENMASK(3, 0)
 
 #define IPEIR_I965				_MMIO(0x2064)
 #define IPEHR_I965				_MMIO(0x2068)
@@ -531,6 +534,8 @@
 #define   GEN6_MBCTL_BOOT_FETCH_MECH		(1 << 0)
 
 /* Fuse readout registers for GT */
+#define XEHP_FUSE4				_MMIO(0x9114)
+#define   GT_L3_EXC_MASK			REG_GENMASK(6, 4)
 #define	GEN10_MIRROR_FUSE3			_MMIO(0x9118)
 #define   GEN10_L3BANK_PAIR_COUNT		4
 #define   GEN10_L3BANK_MASK			0x0F
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 30003d68fd51..77ecbd6be331 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -60,6 +60,7 @@ enum intel_steering_type {
 	MSLICE,
 	LNCF,
 	GAM,
+	DSS,
 
 	/*
 	 * On some platforms there are multiple types of MCR registers that
@@ -255,8 +256,6 @@ struct intel_gt {
 
 		intel_engine_mask_t engine_mask;
 
-		u32 l3bank_mask;
-
 		u8 num_engines;
 
 		/* General presence of SFC units */
@@ -268,7 +267,10 @@ struct intel_gt {
 		/* Slice/subslice/EU info */
 		struct sseu_dev_info sseu;
 
-		unsigned long mslice_mask;
+		union {
+			unsigned long mslice_mask;
+			unsigned long l3bank_mask;
+		};
 
 		/** @hwconfig: hardware configuration data */
 		struct intel_hwconfig hwconfig;
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 174c74dcda99..9be048da7fb3 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1581,12 +1581,28 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
+static void
+mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
+{
+	/*
+	 * Unlike older platforms, we no longer setup implicit steering here;
+	 * all MCR accesses are explicitly steered.
+	 */
+	if (drm_debug_enabled(DRM_UT_DRIVER)) {
+		struct drm_printer p = drm_debug_printer("MCR Steering:");
+
+		intel_gt_mcr_report_steering(&p, gt, false);
+	}
+}
+
 static void
 gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
 {
 	struct drm_i915_private *i915 = gt->i915;
 
-	if (IS_PONTEVECCHIO(i915))
+	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
+		mtl_3d_gt_workarounds_init(gt, wal);
+	else if (IS_PONTEVECCHIO(i915))
 		pvc_gt_workarounds_init(gt, wal);
 	else if (IS_DG2(i915))
 		dg2_gt_workarounds_init(gt, wal);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 38460a0bd7cb..bd1d8e0205a6 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1145,6 +1145,7 @@ static const struct intel_device_info mtl_info = {
 	.extra_gt_list = xelpmp_extra_gt,
 	.has_flat_ccs = 0,
 	.has_gmd_id = 1,
+	.has_mslice_steering = 0,
 	.has_snoop = 1,
 	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
 	.__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 13/14] drm/i915/mtl: Add multicast steering for render GT
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

MTL once again changes the multicast register types and steering
details.  Key changes from past platforms:
 * The number of instances of some MCR types (NODE, OAAL2, and GAM) vary
   according to the MTL subplatform and cannot be read from fuse
   registers.
 * The MCR steering register (and its bitfields) has changed.

Unlike past platforms, we will be explicitly steering all types of MCR
accesses, including those for "SLICE" and "DSS" ranges; we no longer
rely on implicit steering.  On previous platforms, various
hardware/firmware agents that needed to access registers typically had
their own steering control registers, allowing them to perform multicast
steering without clobbering the CPU/kernel steering.  Starting with MTL,
more of these agents now share a single steering register (0xFD4) and it
is no longer safe for us to assume that the value will remain unchanged
from how we initialized it during startup.  There is also a slight
chance of race conditions between the driver and a hardware/firmware
agent, so the hardware provides a semaphore register that can be used to
coordinate access to the steering register.  Support for the semaphore
register will be introduced in a future patch.

Bspec: 67788, 67112
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 85 ++++++++++++++++++---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  5 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h    |  8 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 ++++-
 drivers/gpu/drm/i915/i915_pci.c             |  1 +
 5 files changed, 102 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index 9e2caba64f19..a1fa71b47831 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -41,6 +41,7 @@ static const char * const intel_steering_types[] = {
 	"MSLICE",
 	"LNCF",
 	"GAM",
+	"DSS",
 	"INSTANCE 0",
 };
 
@@ -99,9 +100,40 @@ static const struct intel_mmio_range pvc_instance0_steering_table[] = {
 	{},
 };
 
+static const struct intel_mmio_range mtl3d_instance0_steering_table[] = {
+	{ 0x000B00, 0x000BFF },         /* SQIDI */
+	{ 0x001000, 0x001FFF },         /* SQIDI */
+	{ 0x004000, 0x0048FF },         /* GAM */
+	{ 0x008700, 0x0087FF },         /* SQIDI */
+	{ 0x00B000, 0x00B0FF },         /* NODE */
+	{ 0x00C800, 0x00CFFF },         /* GAM */
+	{ 0x00D880, 0x00D8FF },         /* NODE */
+	{ 0x00DD00, 0x00DDFF },         /* OAAL2 */
+	{},
+};
+
+static const struct intel_mmio_range mtl3d_l3bank_steering_table[] = {
+	{ 0x00B100, 0x00B3FF },
+	{},
+};
+
+/* DSS steering is used for SLICE ranges as well */
+static const struct intel_mmio_range mtl3d_dss_steering_table[] = {
+	{ 0x005200, 0x0052FF },		/* SLICE */
+	{ 0x005500, 0x007FFF },		/* SLICE */
+	{ 0x008140, 0x00815F },		/* SLICE (0x8140-0x814F), DSS (0x8150-0x815F) */
+	{ 0x0094D0, 0x00955F },		/* SLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */
+	{ 0x009680, 0x0096FF },		/* DSS */
+	{ 0x00D800, 0x00D87F },		/* SLICE */
+	{ 0x00DC00, 0x00DCFF },		/* SLICE */
+	{ 0x00DE80, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
+};
+
 void intel_gt_mcr_init(struct intel_gt *gt)
 {
 	struct drm_i915_private *i915 = gt->i915;
+	unsigned long fuse;
+	int i;
 
 	/*
 	 * An mslice is unavailable only if both the meml3 for the slice is
@@ -119,7 +151,22 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 			drm_warn(&i915->drm, "mslice mask all zero!\n");
 	}
 
-	if (IS_PONTEVECCHIO(i915)) {
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) &&
+	    gt->type == GT_PRIMARY) {
+		fuse = REG_FIELD_GET(GT_L3_EXC_MASK,
+				     intel_uncore_read(gt->uncore, XEHP_FUSE4));
+
+		/*
+		 * Despite the register field being named "exclude mask" the
+		 * bits actually represent enabled banks (two banks per bit).
+		 */
+		for_each_set_bit(i, &fuse, 3)
+			gt->info.l3bank_mask |= (0x3 << 2*i);
+
+		gt->steering_table[INSTANCE0] = mtl3d_instance0_steering_table;
+		gt->steering_table[L3BANK] = mtl3d_l3bank_steering_table;
+		gt->steering_table[DSS] = mtl3d_dss_steering_table;
+	} else if (IS_PONTEVECCHIO(i915)) {
 		gt->steering_table[INSTANCE0] = pvc_instance0_steering_table;
 	} else if (IS_DG2(i915)) {
 		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
@@ -184,7 +231,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 
 	lockdep_assert_held(&uncore->lock);
 
-	if (GRAPHICS_VER(uncore->i915) >= 11) {
+	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) {
+		/* No need to save old steering reg value */
+		intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR,
+				      REG_FIELD_PREP(MTL_MCR_GROUPID, group) |
+				      REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance));
+	} else if (GRAPHICS_VER(uncore->i915) >= 11) {
 		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
 		mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance);
 
@@ -202,26 +254,30 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 		 */
 		if (rw_flag == FW_REG_WRITE)
 			mcr_mask |= GEN11_MCR_MULTICAST;
+
+		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
+
+		mcr &= ~mcr_mask;
+		mcr |= mcr_ss;
+		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
 	} else {
 		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
 		mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance);
-	}
 
-	old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
+		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
 
-	mcr &= ~mcr_mask;
-	mcr |= mcr_ss;
-	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
+		mcr &= ~mcr_mask;
+		mcr |= mcr_ss;
+		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
+	}
 
 	if (rw_flag == FW_REG_READ)
 		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
 	else
 		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
 
-	mcr &= ~mcr_mask;
-	mcr |= old_mcr & mcr_mask;
-
-	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
+	if (GRAPHICS_VER_FULL(uncore->i915) < IP_VER(12, 70))
+		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
 
 	return val;
 }
@@ -385,6 +441,8 @@ static void get_nonterminated_steering(struct intel_gt *gt,
 				       enum intel_steering_type type,
 				       u8 *group, u8 *instance)
 {
+	u32 dss;
+
 	switch (type) {
 	case L3BANK:
 		*group = 0;		/* unused */
@@ -408,6 +466,11 @@ static void get_nonterminated_steering(struct intel_gt *gt,
 		*group = IS_DG2(gt->i915) ? 1 : 0;
 		*instance = 0;
 		break;
+	case DSS:
+		dss = intel_sseu_find_first_xehp_dss(&gt->info.sseu, 0, 0);
+		*group = dss / GEN_DSS_PER_GSLICE;
+		*instance = dss % GEN_DSS_PER_GSLICE;
+		break;
 	case INSTANCE0:
 		/*
 		 * There are a lot of MCR types for which instance (0, 0)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index b7341f01ec9f..c5b9671097e3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -59,6 +59,7 @@
 #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
 
 #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
+#define MTL_MCR_SELECTOR			_MMIO(0xfd4)
 #define SF_MCR_SELECTOR				_MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)
 #define GAM_MCR_SELECTOR			_MMIO(0xfe0)
@@ -71,6 +72,8 @@
 #define   GEN11_MCR_SLICE_MASK			GEN11_MCR_SLICE(0xf)
 #define   GEN11_MCR_SUBSLICE(subslice)		(((subslice) & 0x7) << 24)
 #define   GEN11_MCR_SUBSLICE_MASK		GEN11_MCR_SUBSLICE(0x7)
+#define   MTL_MCR_GROUPID			REG_GENMASK(11, 8)
+#define   MTL_MCR_INSTANCEID			REG_GENMASK(3, 0)
 
 #define IPEIR_I965				_MMIO(0x2064)
 #define IPEHR_I965				_MMIO(0x2068)
@@ -531,6 +534,8 @@
 #define   GEN6_MBCTL_BOOT_FETCH_MECH		(1 << 0)
 
 /* Fuse readout registers for GT */
+#define XEHP_FUSE4				_MMIO(0x9114)
+#define   GT_L3_EXC_MASK			REG_GENMASK(6, 4)
 #define	GEN10_MIRROR_FUSE3			_MMIO(0x9118)
 #define   GEN10_L3BANK_PAIR_COUNT		4
 #define   GEN10_L3BANK_MASK			0x0F
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 30003d68fd51..77ecbd6be331 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -60,6 +60,7 @@ enum intel_steering_type {
 	MSLICE,
 	LNCF,
 	GAM,
+	DSS,
 
 	/*
 	 * On some platforms there are multiple types of MCR registers that
@@ -255,8 +256,6 @@ struct intel_gt {
 
 		intel_engine_mask_t engine_mask;
 
-		u32 l3bank_mask;
-
 		u8 num_engines;
 
 		/* General presence of SFC units */
@@ -268,7 +267,10 @@ struct intel_gt {
 		/* Slice/subslice/EU info */
 		struct sseu_dev_info sseu;
 
-		unsigned long mslice_mask;
+		union {
+			unsigned long mslice_mask;
+			unsigned long l3bank_mask;
+		};
 
 		/** @hwconfig: hardware configuration data */
 		struct intel_hwconfig hwconfig;
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 174c74dcda99..9be048da7fb3 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1581,12 +1581,28 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
 }
 
+static void
+mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
+{
+	/*
+	 * Unlike older platforms, we no longer setup implicit steering here;
+	 * all MCR accesses are explicitly steered.
+	 */
+	if (drm_debug_enabled(DRM_UT_DRIVER)) {
+		struct drm_printer p = drm_debug_printer("MCR Steering:");
+
+		intel_gt_mcr_report_steering(&p, gt, false);
+	}
+}
+
 static void
 gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
 {
 	struct drm_i915_private *i915 = gt->i915;
 
-	if (IS_PONTEVECCHIO(i915))
+	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
+		mtl_3d_gt_workarounds_init(gt, wal);
+	else if (IS_PONTEVECCHIO(i915))
 		pvc_gt_workarounds_init(gt, wal);
 	else if (IS_DG2(i915))
 		dg2_gt_workarounds_init(gt, wal);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 38460a0bd7cb..bd1d8e0205a6 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1145,6 +1145,7 @@ static const struct intel_device_info mtl_info = {
 	.extra_gt_list = xelpmp_extra_gt,
 	.has_flat_ccs = 0,
 	.has_gmd_id = 1,
+	.has_mslice_steering = 0,
 	.has_snoop = 1,
 	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
 	.__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v2 14/14] drm/i915/mtl: Add multicast steering for media GT
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
@ 2022-10-01  0:45   ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

MTL's media GT only has a single type of steering ("OAADDRM") which
selects between media slice 0 and media slice 1.  We'll always steer to
media slice 0 unless it is fused off (which is the case when VD0, VE0,
and SFC0 are all reported as unavailable).

Bspec: 67789
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 19 +++++++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 +++++++++++++++++-
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index a1fa71b47831..a580723bdd04 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -42,6 +42,7 @@ static const char * const intel_steering_types[] = {
 	"LNCF",
 	"GAM",
 	"DSS",
+	"OADDRM",
 	"INSTANCE 0",
 };
 
@@ -129,6 +130,12 @@ static const struct intel_mmio_range mtl3d_dss_steering_table[] = {
 	{ 0x00DE80, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
 };
 
+static const struct intel_mmio_range xelpmp_oaddrm_steering_table[] = {
+	{ 0x393200, 0x39323F },
+	{ 0x393400, 0x3934FF },
+};
+
+
 void intel_gt_mcr_init(struct intel_gt *gt)
 {
 	struct drm_i915_private *i915 = gt->i915;
@@ -151,8 +158,9 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 			drm_warn(&i915->drm, "mslice mask all zero!\n");
 	}
 
-	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) &&
-	    gt->type == GT_PRIMARY) {
+	if (MEDIA_VER(i915) >= 13 && gt->type == GT_MEDIA) {
+		gt->steering_table[OADDRM] = xelpmp_oaddrm_steering_table;
+	} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) {
 		fuse = REG_FIELD_GET(GT_L3_EXC_MASK,
 				     intel_uncore_read(gt->uncore, XEHP_FUSE4));
 
@@ -479,6 +487,13 @@ static void get_nonterminated_steering(struct intel_gt *gt,
 		*group = 0;
 		*instance = 0;
 		break;
+	case OADDRM:
+		if ((VDBOX_MASK(gt) | VEBOX_MASK(gt) | gt->info.sfc_mask) & BIT(0))
+			*group = 0;
+		else
+			*group = 1;
+		*instance = 0;
+		break;
 	default:
 		MISSING_CASE(type);
 		*group = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 77ecbd6be331..5d049e0c66f5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -61,6 +61,7 @@ enum intel_steering_type {
 	LNCF,
 	GAM,
 	DSS,
+	OADDRM,
 
 	/*
 	 * On some platforms there are multiple types of MCR registers that
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 9be048da7fb3..744d83e5bb1f 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1595,12 +1595,28 @@ mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	}
 }
 
+static void
+mtl_media_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
+{
+	/*
+	 * Unlike older platforms, we no longer setup implicit steering here;
+	 * all MCR accesses are explicitly steered.
+	 */
+	if (drm_debug_enabled(DRM_UT_DRIVER)) {
+		struct drm_printer p = drm_debug_printer("MCR Steering:");
+
+		intel_gt_mcr_report_steering(&p, gt, false);
+	}
+}
+
 static void
 gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
 {
 	struct drm_i915_private *i915 = gt->i915;
 
-	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
+	if (IS_METEORLAKE(i915) && gt->type == GT_MEDIA)
+		mtl_media_gt_workarounds_init(gt, wal);
+	else if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
 		mtl_3d_gt_workarounds_init(gt, wal);
 	else if (IS_PONTEVECCHIO(i915))
 		pvc_gt_workarounds_init(gt, wal);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] [PATCH v2 14/14] drm/i915/mtl: Add multicast steering for media GT
@ 2022-10-01  0:45   ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-01  0:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

MTL's media GT only has a single type of steering ("OAADDRM") which
selects between media slice 0 and media slice 1.  We'll always steer to
media slice 0 unless it is fused off (which is the case when VD0, VE0,
and SFC0 are all reported as unavailable).

Bspec: 67789
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 19 +++++++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 +++++++++++++++++-
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index a1fa71b47831..a580723bdd04 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -42,6 +42,7 @@ static const char * const intel_steering_types[] = {
 	"LNCF",
 	"GAM",
 	"DSS",
+	"OADDRM",
 	"INSTANCE 0",
 };
 
@@ -129,6 +130,12 @@ static const struct intel_mmio_range mtl3d_dss_steering_table[] = {
 	{ 0x00DE80, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
 };
 
+static const struct intel_mmio_range xelpmp_oaddrm_steering_table[] = {
+	{ 0x393200, 0x39323F },
+	{ 0x393400, 0x3934FF },
+};
+
+
 void intel_gt_mcr_init(struct intel_gt *gt)
 {
 	struct drm_i915_private *i915 = gt->i915;
@@ -151,8 +158,9 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 			drm_warn(&i915->drm, "mslice mask all zero!\n");
 	}
 
-	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) &&
-	    gt->type == GT_PRIMARY) {
+	if (MEDIA_VER(i915) >= 13 && gt->type == GT_MEDIA) {
+		gt->steering_table[OADDRM] = xelpmp_oaddrm_steering_table;
+	} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) {
 		fuse = REG_FIELD_GET(GT_L3_EXC_MASK,
 				     intel_uncore_read(gt->uncore, XEHP_FUSE4));
 
@@ -479,6 +487,13 @@ static void get_nonterminated_steering(struct intel_gt *gt,
 		*group = 0;
 		*instance = 0;
 		break;
+	case OADDRM:
+		if ((VDBOX_MASK(gt) | VEBOX_MASK(gt) | gt->info.sfc_mask) & BIT(0))
+			*group = 0;
+		else
+			*group = 1;
+		*instance = 0;
+		break;
 	default:
 		MISSING_CASE(type);
 		*group = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 77ecbd6be331..5d049e0c66f5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -61,6 +61,7 @@ enum intel_steering_type {
 	LNCF,
 	GAM,
 	DSS,
+	OADDRM,
 
 	/*
 	 * On some platforms there are multiple types of MCR registers that
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 9be048da7fb3..744d83e5bb1f 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1595,12 +1595,28 @@ mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	}
 }
 
+static void
+mtl_media_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
+{
+	/*
+	 * Unlike older platforms, we no longer setup implicit steering here;
+	 * all MCR accesses are explicitly steered.
+	 */
+	if (drm_debug_enabled(DRM_UT_DRIVER)) {
+		struct drm_printer p = drm_debug_printer("MCR Steering:");
+
+		intel_gt_mcr_report_steering(&p, gt, false);
+	}
+}
+
 static void
 gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
 {
 	struct drm_i915_private *i915 = gt->i915;
 
-	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
+	if (IS_METEORLAKE(i915) && gt->type == GT_MEDIA)
+		mtl_media_gt_workarounds_init(gt, wal);
+	else if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
 		mtl_3d_gt_workarounds_init(gt, wal);
 	else if (IS_PONTEVECCHIO(i915))
 		pvc_gt_workarounds_init(gt, wal);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Explicit MCR handling and MTL steering (rev2)
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
                   ` (14 preceding siblings ...)
  (?)
@ 2022-10-01  1:49 ` Patchwork
  -1 siblings, 0 replies; 54+ messages in thread
From: Patchwork @ 2022-10-01  1:49 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

== Series Details ==

Series: Explicit MCR handling and MTL steering (rev2)
URL   : https://patchwork.freedesktop.org/series/108755/
State : warning

== Summary ==

Error: dim checkpatch failed
bf9edd88aea6 drm/i915/gen8: Create separate reg definitions for new MCR registers
-:292: WARNING:LONG_LINE: line length of 103 exceeds 100 columns
#292: FILE: drivers/gpu/drm/i915/intel_pm.c:4343:
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);

-:311: WARNING:LONG_LINE: line length of 116 exceeds 100 columns
#311: FILE: drivers/gpu/drm/i915/intel_pm.c:4517:
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &

total: 0 errors, 2 warnings, 0 checks, 223 lines checked
0fc47efe5eb9 drm/i915/xehp: Create separate reg definitions for new MCR registers
c3383e4218c1 drm/i915/gt: Drop a few unused register definitions
d41aca1d73e0 drm/i915/gt: Correct prefix on a few registers
0da9a749484d drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation
7f4fc3c493ef drm/i915/xehp: Check for faults on primary GAM
2af2f36fab78 drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()
a01a5757e7d7 drm/i915: Define MCR registers explicitly
-:149: WARNING:LONG_LINE_COMMENT: line length of 102 exceeds 100 columns
#149: FILE: drivers/gpu/drm/i915/gt/intel_gt_regs.h:948:
+#define XEHP_LNCFCMOCS(i)			MCR_REG(0xb020 + (i) * 4)	/* L3 Cache Control */

total: 0 errors, 1 warnings, 0 checks, 351 lines checked
01a8cd41ae1b drm/i915/gt: Always use MCR functions on multicast registers
bedd7ad259b9 drm/i915/guc: Handle save/restore of MCR registers explicitly
0c36060edf2c drm/i915/gt: Add MCR-specific workaround initializers
-:119: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#119: FILE: drivers/gpu/drm/i915/gt/intel_workarounds.c:297:
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 		     PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);

-:631: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#631: FILE: drivers/gpu/drm/i915/gt/intel_workarounds.c:2255:
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_DISABLE_READ_SUPPRESSION);

-:650: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#650: FILE: drivers/gpu/drm/i915/gt/intel_workarounds.c:2271:
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 			     MDQ_ARBITRATION_MODE | UGM_BACKUP_MODE);

-:714: WARNING:BLOCK_COMMENT_STYLE: Block comments use * on subsequent lines
#714: FILE: drivers/gpu/drm/i915/gt/intel_workarounds.c:2323:
+			   0 /* Wa_14012342262 :write-only reg, so skip
+				verification */,

total: 0 errors, 1 warnings, 3 checks, 899 lines checked
d8c8cc59fd59 drm/i915: Define multicast registers as a new type
-:704: WARNING:NEW_TYPEDEFS: do not add new typedefs
#704: FILE: drivers/gpu/drm/i915/i915_reg_defs.h:107:
+typedef struct {

total: 0 errors, 1 warnings, 0 checks, 636 lines checked
d8d24aeb0f55 drm/i915/mtl: Add multicast steering for render GT
-:99: CHECK:SPACING: spaces preferred around that '*' (ctx:VxV)
#99: FILE: drivers/gpu/drm/i915/gt/intel_gt_mcr.c:164:
+			gt->info.l3bank_mask |= (0x3 << 2*i);
 			                                 ^

-:127: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#127: FILE: drivers/gpu/drm/i915/gt/intel_gt_mcr.c:258:
+		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);

-:138: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#138: FILE: drivers/gpu/drm/i915/gt/intel_gt_mcr.c:267:
+		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);

total: 0 errors, 0 warnings, 3 checks, 226 lines checked
f82a42d420ed drm/i915/mtl: Add multicast steering for media GT
-:35: CHECK:LINE_SPACING: Please don't use multiple blank lines
#35: FILE: drivers/gpu/drm/i915/gt/intel_gt_mcr.c:138:
+
+

total: 0 errors, 0 warnings, 1 checks, 79 lines checked



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Explicit MCR handling and MTL steering (rev2)
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
                   ` (15 preceding siblings ...)
  (?)
@ 2022-10-01  1:49 ` Patchwork
  -1 siblings, 0 replies; 54+ messages in thread
From: Patchwork @ 2022-10-01  1:49 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

== Series Details ==

Series: Explicit MCR handling and MTL steering (rev2)
URL   : https://patchwork.freedesktop.org/series/108755/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for Explicit MCR handling and MTL steering (rev2)
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
                   ` (16 preceding siblings ...)
  (?)
@ 2022-10-01  2:13 ` Patchwork
  -1 siblings, 0 replies; 54+ messages in thread
From: Patchwork @ 2022-10-01  2:13 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 15312 bytes --]

== Series Details ==

Series: Explicit MCR handling and MTL steering (rev2)
URL   : https://patchwork.freedesktop.org/series/108755/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_12204 -> Patchwork_108755v2
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_108755v2 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_108755v2, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/index.html

Participating hosts (48 -> 46)
------------------------------

  Additional (3): bat-dg2-11 fi-rkl-11600 fi-tgl-dsi 
  Missing    (5): fi-kbl-soraka fi-hsw-4200u fi-tgl-u2 fi-ctg-p8600 fi-bdw-samus 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_108755v2:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@execlists:
    - fi-bsw-nick:        [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-bsw-nick/igt@i915_selftest@live@execlists.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-bsw-nick/igt@i915_selftest@live@execlists.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live@execlists:
    - {fi-ehl-2}:         [PASS][3] -> [DMESG-WARN][4] +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-ehl-2/igt@i915_selftest@live@execlists.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-ehl-2/igt@i915_selftest@live@execlists.html

  * igt@i915_selftest@live@perf:
    - {bat-dg2-11}:       NOTRUN -> [DMESG-WARN][5] +51 similar issues
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/bat-dg2-11/igt@i915_selftest@live@perf.html

  * igt@i915_selftest@live@requests:
    - {bat-jsl-3}:        [PASS][6] -> [DMESG-WARN][7] +1 similar issue
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/bat-jsl-3/igt@i915_selftest@live@requests.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/bat-jsl-3/igt@i915_selftest@live@requests.html
    - {bat-jsl-1}:        [PASS][8] -> [DMESG-WARN][9] +1 similar issue
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/bat-jsl-1/igt@i915_selftest@live@requests.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/bat-jsl-1/igt@i915_selftest@live@requests.html
    - {fi-jsl-1}:         [PASS][10] -> [DMESG-WARN][11] +1 similar issue
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-jsl-1/igt@i915_selftest@live@requests.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-jsl-1/igt@i915_selftest@live@requests.html

  * igt@i915_selftest@live@workarounds:
    - {bat-dg2-11}:       NOTRUN -> [DMESG-FAIL][12]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/bat-dg2-11/igt@i915_selftest@live@workarounds.html

  
Known issues
------------

  Here are the changes found in Patchwork_108755v2 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_huc_copy@huc-copy:
    - fi-rkl-11600:       NOTRUN -> [SKIP][13] ([i915#2190])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@gem_huc_copy@huc-copy.html

  * igt@gem_lmem_swapping@basic:
    - fi-bdw-gvtdvm:      NOTRUN -> [SKIP][14] ([fdo#109271]) +19 similar issues
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-bdw-gvtdvm/igt@gem_lmem_swapping@basic.html
    - fi-rkl-11600:       NOTRUN -> [SKIP][15] ([i915#4613]) +3 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@gem_lmem_swapping@basic.html

  * igt@gem_tiled_pread_basic:
    - fi-rkl-11600:       NOTRUN -> [SKIP][16] ([i915#3282])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@gem_tiled_pread_basic.html

  * igt@i915_pm_backlight@basic-brightness:
    - fi-rkl-11600:       NOTRUN -> [SKIP][17] ([i915#3012])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@i915_pm_backlight@basic-brightness.html

  * igt@i915_selftest@live:
    - fi-cfl-8109u:       NOTRUN -> [INCOMPLETE][18] ([i915#6114])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-cfl-8109u/igt@i915_selftest@live.html

  * igt@i915_selftest@live@hangcheck:
    - fi-hsw-4770:        [PASS][19] -> [INCOMPLETE][20] ([i915#4785])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html

  * igt@i915_suspend@basic-s2idle-without-i915:
    - fi-bdw-gvtdvm:      NOTRUN -> [INCOMPLETE][21] ([i915#4817])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-bdw-gvtdvm/igt@i915_suspend@basic-s2idle-without-i915.html

  * igt@i915_suspend@basic-s3-without-i915:
    - fi-rkl-11600:       NOTRUN -> [INCOMPLETE][22] ([i915#5982])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@i915_suspend@basic-s3-without-i915.html

  * igt@kms_chamelium@hdmi-edid-read:
    - fi-rkl-11600:       NOTRUN -> [SKIP][23] ([fdo#111827]) +7 similar issues
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@kms_chamelium@hdmi-edid-read.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor:
    - fi-rkl-11600:       NOTRUN -> [SKIP][24] ([i915#4103])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@kms_cursor_legacy@basic-busy-flip-before-cursor.html

  * igt@kms_force_connector_basic@force-load-detect:
    - fi-rkl-11600:       NOTRUN -> [SKIP][25] ([fdo#109285] / [i915#4098])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@kms_force_connector_basic@force-load-detect.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-cfl-8109u:       [PASS][26] -> [DMESG-FAIL][27] ([i915#62]) +1 similar issue
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-cfl-8109u/igt@kms_frontbuffer_tracking@basic.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-cfl-8109u/igt@kms_frontbuffer_tracking@basic.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-c-dp-1:
    - fi-cfl-8109u:       [PASS][28] -> [DMESG-WARN][29] ([i915#62]) +10 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-cfl-8109u/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-c-dp-1.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-cfl-8109u/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-c-dp-1.html

  * igt@kms_psr@primary_page_flip:
    - fi-rkl-11600:       NOTRUN -> [SKIP][30] ([i915#1072]) +3 similar issues
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@kms_psr@primary_page_flip.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - fi-rkl-11600:       NOTRUN -> [SKIP][31] ([i915#3555] / [i915#4098])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@kms_setmode@basic-clone-single-crtc.html

  * igt@prime_vgem@basic-read:
    - fi-rkl-11600:       NOTRUN -> [SKIP][32] ([fdo#109295] / [i915#3291] / [i915#3708]) +2 similar issues
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@prime_vgem@basic-read.html

  * igt@prime_vgem@basic-userptr:
    - fi-rkl-11600:       NOTRUN -> [SKIP][33] ([fdo#109295] / [i915#3301] / [i915#3708])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-11600/igt@prime_vgem@basic-userptr.html

  * igt@runner@aborted:
    - fi-hsw-4770:        NOTRUN -> [FAIL][34] ([fdo#109271] / [i915#4312] / [i915#5594])
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-hsw-4770/igt@runner@aborted.html
    - fi-cfl-8109u:       NOTRUN -> [FAIL][35] ([i915#4312])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-cfl-8109u/igt@runner@aborted.html
    - fi-bsw-nick:        NOTRUN -> [FAIL][36] ([fdo#109271] / [i915#4312])
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-bsw-nick/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s0@smem:
    - {bat-adlm-1}:       [DMESG-WARN][37] ([i915#2867]) -> [PASS][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/bat-adlm-1/igt@gem_exec_suspend@basic-s0@smem.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/bat-adlm-1/igt@gem_exec_suspend@basic-s0@smem.html

  * igt@i915_module_load@load:
    - fi-bdw-gvtdvm:      [DMESG-WARN][39] ([i915#6540]) -> [PASS][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-bdw-gvtdvm/igt@i915_module_load@load.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-bdw-gvtdvm/igt@i915_module_load@load.html

  * igt@i915_selftest@live@mman:
    - fi-rkl-guc:         [INCOMPLETE][41] ([i915#6794]) -> [PASS][42]
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-rkl-guc/igt@i915_selftest@live@mman.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-rkl-guc/igt@i915_selftest@live@mman.html

  * igt@i915_selftest@live@reset:
    - {bat-rpls-1}:       [DMESG-FAIL][43] ([i915#4983]) -> [PASS][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/bat-rpls-1/igt@i915_selftest@live@reset.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/bat-rpls-1/igt@i915_selftest@live@reset.html

  * igt@kms_force_connector_basic@force-connector-state:
    - fi-bdw-gvtdvm:      [DMESG-WARN][45] ([i915#5922]) -> [PASS][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-bdw-gvtdvm/igt@kms_force_connector_basic@force-connector-state.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/fi-bdw-gvtdvm/igt@kms_force_connector_basic@force-connector-state.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109284]: https://bugs.freedesktop.org/show_bug.cgi?id=109284
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109295]: https://bugs.freedesktop.org/show_bug.cgi?id=109295
  [fdo#110189]: https://bugs.freedesktop.org/show_bug.cgi?id=110189
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1155]: https://gitlab.freedesktop.org/drm/intel/issues/1155
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2582]: https://gitlab.freedesktop.org/drm/intel/issues/2582
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#3012]: https://gitlab.freedesktop.org/drm/intel/issues/3012
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3291]: https://gitlab.freedesktop.org/drm/intel/issues/3291
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3595]: https://gitlab.freedesktop.org/drm/intel/issues/3595
  [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708
  [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077
  [i915#4079]: https://gitlab.freedesktop.org/drm/intel/issues/4079
  [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083
  [i915#4098]: https://gitlab.freedesktop.org/drm/intel/issues/4098
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4212]: https://gitlab.freedesktop.org/drm/intel/issues/4212
  [i915#4213]: https://gitlab.freedesktop.org/drm/intel/issues/4213
  [i915#4215]: https://gitlab.freedesktop.org/drm/intel/issues/4215
  [i915#4258]: https://gitlab.freedesktop.org/drm/intel/issues/4258
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4785]: https://gitlab.freedesktop.org/drm/intel/issues/4785
  [i915#4817]: https://gitlab.freedesktop.org/drm/intel/issues/4817
  [i915#4873]: https://gitlab.freedesktop.org/drm/intel/issues/4873
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#5190]: https://gitlab.freedesktop.org/drm/intel/issues/5190
  [i915#5274]: https://gitlab.freedesktop.org/drm/intel/issues/5274
  [i915#5537]: https://gitlab.freedesktop.org/drm/intel/issues/5537
  [i915#5594]: https://gitlab.freedesktop.org/drm/intel/issues/5594
  [i915#5828]: https://gitlab.freedesktop.org/drm/intel/issues/5828
  [i915#5922]: https://gitlab.freedesktop.org/drm/intel/issues/5922
  [i915#5982]: https://gitlab.freedesktop.org/drm/intel/issues/5982
  [i915#6114]: https://gitlab.freedesktop.org/drm/intel/issues/6114
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#6540]: https://gitlab.freedesktop.org/drm/intel/issues/6540
  [i915#6621]: https://gitlab.freedesktop.org/drm/intel/issues/6621
  [i915#6794]: https://gitlab.freedesktop.org/drm/intel/issues/6794
  [i915#6816]: https://gitlab.freedesktop.org/drm/intel/issues/6816
  [i915#6856]: https://gitlab.freedesktop.org/drm/intel/issues/6856


Build changes
-------------

  * Linux: CI_DRM_12204 -> Patchwork_108755v2

  CI-20190529: 20190529
  CI_DRM_12204: fd2f9b9a4178e667adad268a662eb8a9c0ddc8f8 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6670: d618e9865fe5cbaf511ca43503abad442605d0a5 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_108755v2: fd2f9b9a4178e667adad268a662eb8a9c0ddc8f8 @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

93d1d05b3df7 drm/i915/mtl: Add multicast steering for media GT
aedfd9c150a9 drm/i915/mtl: Add multicast steering for render GT
80ad14ecf8b3 drm/i915: Define multicast registers as a new type
6dc4ec1667c3 drm/i915/gt: Add MCR-specific workaround initializers
de4a4556e221 drm/i915/guc: Handle save/restore of MCR registers explicitly
f7138865e8b1 drm/i915/gt: Always use MCR functions on multicast registers
e1a33c9ae908 drm/i915: Define MCR registers explicitly
06efc77cd1fc drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()
ab55919659e0 drm/i915/xehp: Check for faults on primary GAM
c2591f17e69f drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation
e8a0e1a1edac drm/i915/gt: Correct prefix on a few registers
c82d5521a8dc drm/i915/gt: Drop a few unused register definitions
6153291fc50b drm/i915/xehp: Create separate reg definitions for new MCR registers
ddd20309b197 drm/i915/gen8: Create separate reg definitions for new MCR registers

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v2/index.html

[-- Attachment #2: Type: text/html, Size: 15930 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Explicit MCR handling and MTL steering (rev3)
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
                   ` (17 preceding siblings ...)
  (?)
@ 2022-10-01  5:35 ` Patchwork
  -1 siblings, 0 replies; 54+ messages in thread
From: Patchwork @ 2022-10-01  5:35 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

== Series Details ==

Series: Explicit MCR handling and MTL steering (rev3)
URL   : https://patchwork.freedesktop.org/series/108755/
State : warning

== Summary ==

Error: dim checkpatch failed
98e3505e9ca1 drm/i915/gen8: Create separate reg definitions for new MCR registers
-:292: WARNING:LONG_LINE: line length of 103 exceeds 100 columns
#292: FILE: drivers/gpu/drm/i915/intel_pm.c:4343:
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);

-:311: WARNING:LONG_LINE: line length of 116 exceeds 100 columns
#311: FILE: drivers/gpu/drm/i915/intel_pm.c:4517:
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &

total: 0 errors, 2 warnings, 0 checks, 223 lines checked
b00aea4326d3 drm/i915/xehp: Create separate reg definitions for new MCR registers
010e40263fbf drm/i915/gt: Drop a few unused register definitions
059a86ae20fc drm/i915/gt: Correct prefix on a few registers
7b8d6d5b1a51 drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation
035142257827 drm/i915/xehp: Check for faults on primary GAM
2e41a4442059 drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()
ccf70c008b7c drm/i915: Define MCR registers explicitly
-:149: WARNING:LONG_LINE_COMMENT: line length of 102 exceeds 100 columns
#149: FILE: drivers/gpu/drm/i915/gt/intel_gt_regs.h:948:
+#define XEHP_LNCFCMOCS(i)			MCR_REG(0xb020 + (i) * 4)	/* L3 Cache Control */

total: 0 errors, 1 warnings, 0 checks, 351 lines checked
d3286964cb22 drm/i915/gt: Always use MCR functions on multicast registers
a00c2e3cdbd1 drm/i915/guc: Handle save/restore of MCR registers explicitly
5454314ef6c3 drm/i915/gt: Add MCR-specific workaround initializers
-:119: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#119: FILE: drivers/gpu/drm/i915/gt/intel_workarounds.c:297:
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 		     PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);

-:631: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#631: FILE: drivers/gpu/drm/i915/gt/intel_workarounds.c:2255:
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_DISABLE_READ_SUPPRESSION);

-:650: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#650: FILE: drivers/gpu/drm/i915/gt/intel_workarounds.c:2271:
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 			     MDQ_ARBITRATION_MODE | UGM_BACKUP_MODE);

-:714: WARNING:BLOCK_COMMENT_STYLE: Block comments use * on subsequent lines
#714: FILE: drivers/gpu/drm/i915/gt/intel_workarounds.c:2323:
+			   0 /* Wa_14012342262 :write-only reg, so skip
+				verification */,

total: 0 errors, 1 warnings, 3 checks, 899 lines checked
48ce34e8c471 drm/i915: Define multicast registers as a new type
-:704: WARNING:NEW_TYPEDEFS: do not add new typedefs
#704: FILE: drivers/gpu/drm/i915/i915_reg_defs.h:107:
+typedef struct {

total: 0 errors, 1 warnings, 0 checks, 636 lines checked
628485d2e33d drm/i915/mtl: Add multicast steering for render GT
-:99: CHECK:SPACING: spaces preferred around that '*' (ctx:VxV)
#99: FILE: drivers/gpu/drm/i915/gt/intel_gt_mcr.c:164:
+			gt->info.l3bank_mask |= (0x3 << 2*i);
 			                                 ^

-:127: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#127: FILE: drivers/gpu/drm/i915/gt/intel_gt_mcr.c:258:
+		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);

-:138: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#138: FILE: drivers/gpu/drm/i915/gt/intel_gt_mcr.c:267:
+		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);

total: 0 errors, 0 warnings, 3 checks, 226 lines checked
ac8a45eb9091 drm/i915/mtl: Add multicast steering for media GT
-:35: CHECK:LINE_SPACING: Please don't use multiple blank lines
#35: FILE: drivers/gpu/drm/i915/gt/intel_gt_mcr.c:138:
+
+

total: 0 errors, 0 warnings, 1 checks, 79 lines checked



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Explicit MCR handling and MTL steering (rev3)
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
                   ` (18 preceding siblings ...)
  (?)
@ 2022-10-01  5:35 ` Patchwork
  -1 siblings, 0 replies; 54+ messages in thread
From: Patchwork @ 2022-10-01  5:35 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

== Series Details ==

Series: Explicit MCR handling and MTL steering (rev3)
URL   : https://patchwork.freedesktop.org/series/108755/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for Explicit MCR handling and MTL steering (rev3)
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
                   ` (19 preceding siblings ...)
  (?)
@ 2022-10-01  5:58 ` Patchwork
  -1 siblings, 0 replies; 54+ messages in thread
From: Patchwork @ 2022-10-01  5:58 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 12931 bytes --]

== Series Details ==

Series: Explicit MCR handling and MTL steering (rev3)
URL   : https://patchwork.freedesktop.org/series/108755/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12204 -> Patchwork_108755v3
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/index.html

Participating hosts (48 -> 46)
------------------------------

  Additional (2): bat-dg2-11 fi-rkl-11600 
  Missing    (4): fi-ctg-p8600 fi-hsw-4200u fi-bdw-samus fi-tgl-u2 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_108755v3:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live@execlists:
    - {fi-ehl-2}:         [PASS][1] -> [DMESG-WARN][2] +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-ehl-2/igt@i915_selftest@live@execlists.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-ehl-2/igt@i915_selftest@live@execlists.html

  * igt@i915_selftest@live@perf:
    - {bat-dg2-11}:       NOTRUN -> [DMESG-WARN][3] +49 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/bat-dg2-11/igt@i915_selftest@live@perf.html

  * igt@i915_selftest@live@requests:
    - {bat-jsl-3}:        [PASS][4] -> [DMESG-WARN][5] +1 similar issue
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/bat-jsl-3/igt@i915_selftest@live@requests.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/bat-jsl-3/igt@i915_selftest@live@requests.html
    - {bat-jsl-1}:        [PASS][6] -> [DMESG-WARN][7] +1 similar issue
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/bat-jsl-1/igt@i915_selftest@live@requests.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/bat-jsl-1/igt@i915_selftest@live@requests.html
    - {fi-jsl-1}:         [PASS][8] -> [DMESG-WARN][9] +1 similar issue
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-jsl-1/igt@i915_selftest@live@requests.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-jsl-1/igt@i915_selftest@live@requests.html

  * igt@i915_selftest@live@workarounds:
    - {bat-dg2-11}:       NOTRUN -> [DMESG-FAIL][10] +1 similar issue
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/bat-dg2-11/igt@i915_selftest@live@workarounds.html

  
Known issues
------------

  Here are the changes found in Patchwork_108755v3 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_huc_copy@huc-copy:
    - fi-rkl-11600:       NOTRUN -> [SKIP][11] ([i915#2190])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@gem_huc_copy@huc-copy.html

  * igt@gem_lmem_swapping@basic:
    - fi-bdw-gvtdvm:      NOTRUN -> [SKIP][12] ([fdo#109271]) +19 similar issues
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-bdw-gvtdvm/igt@gem_lmem_swapping@basic.html
    - fi-rkl-11600:       NOTRUN -> [SKIP][13] ([i915#4613]) +3 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@gem_lmem_swapping@basic.html

  * igt@gem_tiled_pread_basic:
    - fi-rkl-11600:       NOTRUN -> [SKIP][14] ([i915#3282])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@gem_tiled_pread_basic.html

  * igt@i915_pm_backlight@basic-brightness:
    - fi-rkl-11600:       NOTRUN -> [SKIP][15] ([i915#3012])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@i915_pm_backlight@basic-brightness.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-bxt-dsi:         [PASS][16] -> [DMESG-FAIL][17] ([i915#5334])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-bxt-dsi/igt@i915_selftest@live@gt_heartbeat.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-bxt-dsi/igt@i915_selftest@live@gt_heartbeat.html

  * igt@i915_suspend@basic-s2idle-without-i915:
    - fi-bdw-gvtdvm:      NOTRUN -> [INCOMPLETE][18] ([i915#4817])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-bdw-gvtdvm/igt@i915_suspend@basic-s2idle-without-i915.html

  * igt@i915_suspend@basic-s3-without-i915:
    - fi-rkl-11600:       NOTRUN -> [INCOMPLETE][19] ([i915#5982])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@i915_suspend@basic-s3-without-i915.html
    - fi-bdw-5557u:       [PASS][20] -> [INCOMPLETE][21] ([i915#146] / [i915#6712])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-bdw-5557u/igt@i915_suspend@basic-s3-without-i915.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-bdw-5557u/igt@i915_suspend@basic-s3-without-i915.html

  * igt@kms_chamelium@hdmi-edid-read:
    - fi-rkl-11600:       NOTRUN -> [SKIP][22] ([fdo#111827]) +7 similar issues
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@kms_chamelium@hdmi-edid-read.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor:
    - fi-rkl-11600:       NOTRUN -> [SKIP][23] ([i915#4103])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@kms_cursor_legacy@basic-busy-flip-before-cursor.html

  * igt@kms_force_connector_basic@force-load-detect:
    - fi-rkl-11600:       NOTRUN -> [SKIP][24] ([fdo#109285] / [i915#4098])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@kms_force_connector_basic@force-load-detect.html

  * igt@kms_psr@primary_page_flip:
    - fi-rkl-11600:       NOTRUN -> [SKIP][25] ([i915#1072]) +3 similar issues
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@kms_psr@primary_page_flip.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - fi-rkl-11600:       NOTRUN -> [SKIP][26] ([i915#3555] / [i915#4098])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@kms_setmode@basic-clone-single-crtc.html

  * igt@prime_vgem@basic-read:
    - fi-rkl-11600:       NOTRUN -> [SKIP][27] ([fdo#109295] / [i915#3291] / [i915#3708]) +2 similar issues
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@prime_vgem@basic-read.html

  * igt@prime_vgem@basic-userptr:
    - fi-rkl-11600:       NOTRUN -> [SKIP][28] ([fdo#109295] / [i915#3301] / [i915#3708])
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-11600/igt@prime_vgem@basic-userptr.html

  
#### Possible fixes ####

  * igt@i915_module_load@load:
    - fi-bdw-gvtdvm:      [DMESG-WARN][29] ([i915#6540]) -> [PASS][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-bdw-gvtdvm/igt@i915_module_load@load.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-bdw-gvtdvm/igt@i915_module_load@load.html

  * igt@i915_selftest@live@mman:
    - fi-rkl-guc:         [INCOMPLETE][31] ([i915#6794]) -> [PASS][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-rkl-guc/igt@i915_selftest@live@mman.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-rkl-guc/igt@i915_selftest@live@mman.html

  * igt@i915_selftest@live@reset:
    - {bat-rpls-1}:       [DMESG-FAIL][33] ([i915#4983]) -> [PASS][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/bat-rpls-1/igt@i915_selftest@live@reset.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/bat-rpls-1/igt@i915_selftest@live@reset.html

  * igt@i915_selftest@live@slpc:
    - {bat-adln-1}:       [DMESG-FAIL][35] -> [PASS][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/bat-adln-1/igt@i915_selftest@live@slpc.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/bat-adln-1/igt@i915_selftest@live@slpc.html

  * igt@kms_force_connector_basic@force-connector-state:
    - fi-bdw-gvtdvm:      [DMESG-WARN][37] ([i915#5922]) -> [PASS][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/fi-bdw-gvtdvm/igt@kms_force_connector_basic@force-connector-state.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/fi-bdw-gvtdvm/igt@kms_force_connector_basic@force-connector-state.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109295]: https://bugs.freedesktop.org/show_bug.cgi?id=109295
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1155]: https://gitlab.freedesktop.org/drm/intel/issues/1155
  [i915#146]: https://gitlab.freedesktop.org/drm/intel/issues/146
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2582]: https://gitlab.freedesktop.org/drm/intel/issues/2582
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#3012]: https://gitlab.freedesktop.org/drm/intel/issues/3012
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3291]: https://gitlab.freedesktop.org/drm/intel/issues/3291
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3595]: https://gitlab.freedesktop.org/drm/intel/issues/3595
  [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708
  [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077
  [i915#4079]: https://gitlab.freedesktop.org/drm/intel/issues/4079
  [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083
  [i915#4098]: https://gitlab.freedesktop.org/drm/intel/issues/4098
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4212]: https://gitlab.freedesktop.org/drm/intel/issues/4212
  [i915#4213]: https://gitlab.freedesktop.org/drm/intel/issues/4213
  [i915#4215]: https://gitlab.freedesktop.org/drm/intel/issues/4215
  [i915#4258]: https://gitlab.freedesktop.org/drm/intel/issues/4258
  [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4817]: https://gitlab.freedesktop.org/drm/intel/issues/4817
  [i915#4873]: https://gitlab.freedesktop.org/drm/intel/issues/4873
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#5190]: https://gitlab.freedesktop.org/drm/intel/issues/5190
  [i915#5274]: https://gitlab.freedesktop.org/drm/intel/issues/5274
  [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
  [i915#5922]: https://gitlab.freedesktop.org/drm/intel/issues/5922
  [i915#5982]: https://gitlab.freedesktop.org/drm/intel/issues/5982
  [i915#6540]: https://gitlab.freedesktop.org/drm/intel/issues/6540
  [i915#6621]: https://gitlab.freedesktop.org/drm/intel/issues/6621
  [i915#6712]: https://gitlab.freedesktop.org/drm/intel/issues/6712
  [i915#6794]: https://gitlab.freedesktop.org/drm/intel/issues/6794
  [i915#6816]: https://gitlab.freedesktop.org/drm/intel/issues/6816
  [i915#6818]: https://gitlab.freedesktop.org/drm/intel/issues/6818


Build changes
-------------

  * Linux: CI_DRM_12204 -> Patchwork_108755v3

  CI-20190529: 20190529
  CI_DRM_12204: fd2f9b9a4178e667adad268a662eb8a9c0ddc8f8 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6670: d618e9865fe5cbaf511ca43503abad442605d0a5 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_108755v3: fd2f9b9a4178e667adad268a662eb8a9c0ddc8f8 @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

91c2151e11e0 drm/i915/mtl: Add multicast steering for media GT
c2dc34bc7bb4 drm/i915/mtl: Add multicast steering for render GT
c79be27e323e drm/i915: Define multicast registers as a new type
2b7e9a043d62 drm/i915/gt: Add MCR-specific workaround initializers
e5ff01d93551 drm/i915/guc: Handle save/restore of MCR registers explicitly
ad3e593670a9 drm/i915/gt: Always use MCR functions on multicast registers
d472dca55bf6 drm/i915: Define MCR registers explicitly
a2778b13ccbc drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw()
360719a71dac drm/i915/xehp: Check for faults on primary GAM
a496bad77f38 drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation
2280cd54d540 drm/i915/gt: Correct prefix on a few registers
0097cfececb4 drm/i915/gt: Drop a few unused register definitions
ebc278da47ba drm/i915/xehp: Create separate reg definitions for new MCR registers
0124fe4a5788 drm/i915/gen8: Create separate reg definitions for new MCR registers

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/index.html

[-- Attachment #2: Type: text/html, Size: 13266 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for Explicit MCR handling and MTL steering (rev3)
  2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
                   ` (20 preceding siblings ...)
  (?)
@ 2022-10-01 21:41 ` Patchwork
  -1 siblings, 0 replies; 54+ messages in thread
From: Patchwork @ 2022-10-01 21:41 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 25201 bytes --]

== Series Details ==

Series: Explicit MCR handling and MTL steering (rev3)
URL   : https://patchwork.freedesktop.org/series/108755/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_12204_full -> Patchwork_108755v3_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_108755v3_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_108755v3_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (9 -> 9)
------------------------------

  No changes in participating hosts

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_108755v3_full:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_pm_rpm@system-suspend-execbuf:
    - shard-iclb:         [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb6/igt@i915_pm_rpm@system-suspend-execbuf.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb6/igt@i915_pm_rpm@system-suspend-execbuf.html

  * igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-render:
    - shard-tglb:         [PASS][3] -> [INCOMPLETE][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-tglb2/igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-render.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-tglb8/igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-render.html

  
Known issues
------------

  Here are the changes found in Patchwork_108755v3_full that come from known issues:

### CI changes ###

#### Possible fixes ####

  * boot:
    - shard-apl:          ([PASS][5], [PASS][6], [PASS][7], [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], [FAIL][25], [PASS][26], [PASS][27], [PASS][28], [PASS][29]) ([i915#4386]) -> ([PASS][30], [PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], [PASS][49], [PASS][50], [PASS][51], [PASS][52], [PASS][53], [PASS][54])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl3/boot.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl8/boot.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl8/boot.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl8/boot.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl8/boot.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl1/boot.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl1/boot.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl7/boot.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl1/boot.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl1/boot.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl1/boot.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl7/boot.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl2/boot.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl7/boot.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl7/boot.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl7/boot.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl6/boot.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl6/boot.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl6/boot.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl3/boot.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl2/boot.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl2/boot.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl2/boot.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl3/boot.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl3/boot.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/boot.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/boot.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/boot.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/boot.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl2/boot.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl2/boot.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl2/boot.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl2/boot.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl3/boot.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl3/boot.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl3/boot.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl3/boot.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl3/boot.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl6/boot.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl6/boot.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl6/boot.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl6/boot.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl7/boot.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl7/boot.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl7/boot.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl7/boot.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl8/boot.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl8/boot.html
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl8/boot.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl8/boot.html

  

### IGT changes ###

#### Issues hit ####

  * igt@gem_create@create-massive:
    - shard-apl:          NOTRUN -> [DMESG-WARN][55] ([i915#4991])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl3/igt@gem_create@create-massive.html

  * igt@gem_ctx_persistence@legacy-engines-cleanup:
    - shard-snb:          NOTRUN -> [SKIP][56] ([fdo#109271] / [i915#1099])
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-snb4/igt@gem_ctx_persistence@legacy-engines-cleanup.html

  * igt@gem_exec_balancer@parallel-bb-first:
    - shard-iclb:         [PASS][57] -> [SKIP][58] ([i915#4525]) +1 similar issue
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb2/igt@gem_exec_balancer@parallel-bb-first.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb8/igt@gem_exec_balancer@parallel-bb-first.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-glk:          [PASS][59] -> [FAIL][60] ([i915#2842])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-glk3/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-glk3/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_lmem_swapping@parallel-multi:
    - shard-apl:          NOTRUN -> [SKIP][61] ([fdo#109271] / [i915#4613])
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@gem_lmem_swapping@parallel-multi.html

  * igt@gem_softpin@evict-single-offset:
    - shard-apl:          NOTRUN -> [FAIL][62] ([i915#4171])
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@gem_softpin@evict-single-offset.html

  * igt@gem_workarounds@suspend-resume-context:
    - shard-apl:          [PASS][63] -> [DMESG-WARN][64] ([i915#180])
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl2/igt@gem_workarounds@suspend-resume-context.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl6/igt@gem_workarounds@suspend-resume-context.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-glk:          [PASS][65] -> [DMESG-WARN][66] ([i915#5566] / [i915#716])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-glk7/igt@gen9_exec_parse@allowed-single.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-glk2/igt@gen9_exec_parse@allowed-single.html

  * igt@i915_pm_dc@dc9-dpms:
    - shard-iclb:         [PASS][67] -> [SKIP][68] ([i915#4281])
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb6/igt@i915_pm_dc@dc9-dpms.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb3/igt@i915_pm_dc@dc9-dpms.html

  * igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp:
    - shard-apl:          NOTRUN -> [SKIP][69] ([fdo#109271] / [i915#1937])
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp.html

  * igt@i915_pm_rps@engine-order:
    - shard-apl:          [PASS][70] -> [FAIL][71] ([i915#6537])
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl3/igt@i915_pm_rps@engine-order.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl3/igt@i915_pm_rps@engine-order.html

  * igt@i915_selftest@live@hangcheck:
    - shard-tglb:         [PASS][72] -> [DMESG-WARN][73] ([i915#5591])
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-tglb2/igt@i915_selftest@live@hangcheck.html
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-tglb5/igt@i915_selftest@live@hangcheck.html

  * igt@kms_ccs@pipe-a-crc-primary-rotation-180-y_tiled_gen12_mc_ccs:
    - shard-apl:          NOTRUN -> [SKIP][74] ([fdo#109271] / [i915#3886]) +5 similar issues
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@kms_ccs@pipe-a-crc-primary-rotation-180-y_tiled_gen12_mc_ccs.html

  * igt@kms_chamelium@vga-frame-dump:
    - shard-apl:          NOTRUN -> [SKIP][75] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl3/igt@kms_chamelium@vga-frame-dump.html

  * igt@kms_flip@2x-nonexisting-fb:
    - shard-apl:          NOTRUN -> [SKIP][76] ([fdo#109271]) +98 similar issues
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@kms_flip@2x-nonexisting-fb.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@a-dp1:
    - shard-apl:          [PASS][77] -> [FAIL][78] ([i915#79])
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl3/igt@kms_flip@flip-vs-expired-vblank-interruptible@a-dp1.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@kms_flip@flip-vs-expired-vblank-interruptible@a-dp1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-4tile-to-64bpp-4tile-downscaling@pipe-a-valid-mode:
    - shard-iclb:         NOTRUN -> [SKIP][79] ([i915#2587] / [i915#2672])
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb6/igt@kms_flip_scaled_crc@flip-32bpp-4tile-to-64bpp-4tile-downscaling@pipe-a-valid-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-linear-to-32bpp-linear-downscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][80] ([i915#3555]) +1 similar issue
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb2/igt@kms_flip_scaled_crc@flip-64bpp-linear-to-32bpp-linear-downscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][81] ([i915#2672]) +6 similar issues
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb2/igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][82] ([i915#2672] / [i915#3555])
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb3/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling@pipe-a-default-mode.html

  * igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-indfb-draw-mmap-cpu:
    - shard-snb:          NOTRUN -> [SKIP][83] ([fdo#109271]) +37 similar issues
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-snb4/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-indfb-draw-mmap-cpu.html

  * igt@kms_plane_scaling@plane-scaler-with-clipping-clamping-pixel-formats@pipe-b-edp-1:
    - shard-iclb:         [PASS][84] -> [SKIP][85] ([i915#5176]) +1 similar issue
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb6/igt@kms_plane_scaling@plane-scaler-with-clipping-clamping-pixel-formats@pipe-b-edp-1.html
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb3/igt@kms_plane_scaling@plane-scaler-with-clipping-clamping-pixel-formats@pipe-b-edp-1.html

  * igt@kms_psr@psr2_primary_mmap_cpu:
    - shard-iclb:         [PASS][86] -> [SKIP][87] ([fdo#109441]) +2 similar issues
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb2/igt@kms_psr@psr2_primary_mmap_cpu.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb6/igt@kms_psr@psr2_primary_mmap_cpu.html

  * igt@kms_psr_stress_test@flip-primary-invalidate-overlay:
    - shard-tglb:         [PASS][88] -> [SKIP][89] ([i915#5519])
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-tglb8/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-tglb3/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html
    - shard-iclb:         [PASS][90] -> [SKIP][91] ([i915#5519])
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb8/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb5/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html

  * igt@kms_vblank@pipe-d-wait-idle:
    - shard-apl:          NOTRUN -> [SKIP][92] ([fdo#109271] / [i915#533])
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@kms_vblank@pipe-d-wait-idle.html

  * igt@perf_pmu@module-unload:
    - shard-iclb:         [PASS][93] -> [DMESG-WARN][94] ([i915#2867])
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb6/igt@perf_pmu@module-unload.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb6/igt@perf_pmu@module-unload.html

  * igt@sysfs_clients@sema-50:
    - shard-apl:          NOTRUN -> [SKIP][95] ([fdo#109271] / [i915#2994])
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@sysfs_clients@sema-50.html

  
#### Possible fixes ####

  * igt@gem_exec_balancer@parallel-contexts:
    - shard-iclb:         [SKIP][96] ([i915#4525]) -> [PASS][97] +1 similar issue
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb7/igt@gem_exec_balancer@parallel-contexts.html
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb1/igt@gem_exec_balancer@parallel-contexts.html

  * igt@gem_huc_copy@huc-copy:
    - shard-tglb:         [SKIP][98] ([i915#2190]) -> [PASS][99]
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-tglb7/igt@gem_huc_copy@huc-copy.html
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-tglb2/igt@gem_huc_copy@huc-copy.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-apl:          [DMESG-WARN][100] ([i915#5566] / [i915#716]) -> [PASS][101]
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl7/igt@gen9_exec_parse@allowed-single.html
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@gen9_exec_parse@allowed-single.html

  * igt@gen9_exec_parse@bb-large:
    - shard-apl:          [TIMEOUT][102] ([i915#4639]) -> [PASS][103]
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl2/igt@gen9_exec_parse@bb-large.html
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl6/igt@gen9_exec_parse@bb-large.html

  * igt@i915_selftest@live@hangcheck:
    - shard-snb:          [INCOMPLETE][104] ([i915#6992]) -> [PASS][105]
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-snb4/igt@i915_selftest@live@hangcheck.html
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-snb4/igt@i915_selftest@live@hangcheck.html

  * igt@i915_suspend@forcewake:
    - shard-apl:          [DMESG-WARN][106] ([i915#180]) -> [PASS][107] +1 similar issue
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-apl1/igt@i915_suspend@forcewake.html
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-apl1/igt@i915_suspend@forcewake.html

  * igt@kms_addfb_basic@legacy-format:
    - shard-tglb:         [INCOMPLETE][108] ([i915#6987]) -> [PASS][109]
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-tglb3/igt@kms_addfb_basic@legacy-format.html
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-tglb7/igt@kms_addfb_basic@legacy-format.html

  * igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions-varying-size:
    - shard-glk:          [FAIL][110] ([i915#2346]) -> [PASS][111]
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-glk2/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions-varying-size.html
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-glk8/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions-varying-size.html

  * igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5@pipe-a-edp-1:
    - shard-iclb:         [SKIP][112] ([i915#5235]) -> [PASS][113] +2 similar issues
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb2/igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5@pipe-a-edp-1.html
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb8/igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-5@pipe-a-edp-1.html

  * igt@kms_psr@psr2_cursor_blt:
    - shard-iclb:         [SKIP][114] ([fdo#109441]) -> [PASS][115] +3 similar issues
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb8/igt@kms_psr@psr2_cursor_blt.html
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb2/igt@kms_psr@psr2_cursor_blt.html

  * igt@perf@polling-parameterized:
    - shard-tglb:         [FAIL][116] ([i915#5639]) -> [PASS][117]
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-tglb7/igt@perf@polling-parameterized.html
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-tglb2/igt@perf@polling-parameterized.html

  
#### Warnings ####

  * igt@i915_pm_dc@dc3co-vpb-simulation:
    - shard-iclb:         [SKIP][118] ([i915#588]) -> [SKIP][119] ([i915#658])
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb2/igt@i915_pm_dc@dc3co-vpb-simulation.html
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb6/igt@i915_pm_dc@dc3co-vpb-simulation.html

  * igt@i915_pm_rc6_residency@rc6-idle@rcs0:
    - shard-iclb:         [WARN][120] ([i915#2684]) -> [FAIL][121] ([i915#2684])
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb3/igt@i915_pm_rc6_residency@rc6-idle@rcs0.html
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb3/igt@i915_pm_rc6_residency@rc6-idle@rcs0.html

  * igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf:
    - shard-iclb:         [SKIP][122] ([i915#658]) -> [SKIP][123] ([i915#2920])
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb8/igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf.html
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb2/igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area:
    - shard-iclb:         [SKIP][124] ([fdo#111068] / [i915#658]) -> [SKIP][125] ([i915#2920]) +1 similar issue
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb4/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area.html
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb2/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area.html

  * igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-big-fb:
    - shard-iclb:         [SKIP][126] ([i915#2920]) -> [SKIP][127] ([i915#658])
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-iclb2/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-big-fb.html
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/shard-iclb8/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-big-fb.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1099]: https://gitlab.freedesktop.org/drm/intel/issues/1099
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1937]: https://gitlab.freedesktop.org/drm/intel/issues/1937
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2587]: https://gitlab.freedesktop.org/drm/intel/issues/2587
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#2684]: https://gitlab.freedesktop.org/drm/intel/issues/2684
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#2994]: https://gitlab.freedesktop.org/drm/intel/issues/2994
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886
  [i915#4171]: https://gitlab.freedesktop.org/drm/intel/issues/4171
  [i915#4281]: https://gitlab.freedesktop.org/drm/intel/issues/4281
  [i915#4386]: https://gitlab.freedesktop.org/drm/intel/issues/4386
  [i915#4525]: https://gitlab.freedesktop.org/drm/intel/issues/4525
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4639]: https://gitlab.freedesktop.org/drm/intel/issues/4639
  [i915#4991]: https://gitlab.freedesktop.org/drm/intel/issues/4991
  [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176
  [i915#5235]: https://gitlab.freedesktop.org/drm/intel/issues/5235
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#5519]: https://gitlab.freedesktop.org/drm/intel/issues/5519
  [i915#5566]: https://gitlab.freedesktop.org/drm/intel/issues/5566
  [i915#5591]: https://gitlab.freedesktop.org/drm/intel/issues/5591
  [i915#5639]: https://gitlab.freedesktop.org/drm/intel/issues/5639
  [i915#588]: https://gitlab.freedesktop.org/drm/intel/issues/588
  [i915#6537]: https://gitlab.freedesktop.org/drm/intel/issues/6537
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#6987]: https://gitlab.freedesktop.org/drm/intel/issues/6987
  [i915#6992]: https://gitlab.freedesktop.org/drm/intel/issues/6992
  [i915#716]: https://gitlab.freedesktop.org/drm/intel/issues/716
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79


Build changes
-------------

  * Linux: CI_DRM_12204 -> Patchwork_108755v3

  CI-20190529: 20190529
  CI_DRM_12204: fd2f9b9a4178e667adad268a662eb8a9c0ddc8f8 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6670: d618e9865fe5cbaf511ca43503abad442605d0a5 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_108755v3: fd2f9b9a4178e667adad268a662eb8a9c0ddc8f8 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_108755v3/index.html

[-- Attachment #2: Type: text/html, Size: 28744 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 14/14] drm/i915/mtl: Add multicast steering for media GT
  2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
  (?)
@ 2022-10-03  8:56   ` Tvrtko Ursulin
  2022-10-03 19:32     ` Matt Roper
  -1 siblings, 1 reply; 54+ messages in thread
From: Tvrtko Ursulin @ 2022-10-03  8:56 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel


Hi Matt,

On 01/10/2022 01:45, Matt Roper wrote:
> MTL's media GT only has a single type of steering ("OAADDRM") which
> selects between media slice 0 and media slice 1.  We'll always steer to
> media slice 0 unless it is fused off (which is the case when VD0, VE0,
> and SFC0 are all reported as unavailable).
> 
> Bspec: 67789
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 19 +++++++++++++++++--
>   drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
>   drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 +++++++++++++++++-
>   3 files changed, 35 insertions(+), 3 deletions(-)

[snip]

> +static void
> +mtl_media_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> +{
> +	/*
> +	 * Unlike older platforms, we no longer setup implicit steering here;
> +	 * all MCR accesses are explicitly steered.
> +	 */
> +	if (drm_debug_enabled(DRM_UT_DRIVER)) {
> +		struct drm_printer p = drm_debug_printer("MCR Steering:");
> +
> +		intel_gt_mcr_report_steering(&p, gt, false);
> +	}
> +}
> +
>   static void
>   gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
>   {
>   	struct drm_i915_private *i915 = gt->i915;
>   
> -	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
> +	if (IS_METEORLAKE(i915) && gt->type == GT_MEDIA)
> +		mtl_media_gt_workarounds_init(gt, wal);
> +	else if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
>   		mtl_3d_gt_workarounds_init(gt, wal);
>   	else if (IS_PONTEVECCHIO(i915))
>   		pvc_gt_workarounds_init(gt, wal);

Casually reading only - wouldn't it be nicer if the if-ladder in here 
(gt_init_workarounds) would have a single case per platform, and then 
you'd fork further (3d vs media) in MTL specific function?

Also, series ends up with mtl_media_gt_workarounds_init and 
mtl_3d_gt_workarounds_init apparently 100% identical. You will need two 
copies in the future?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 14/14] drm/i915/mtl: Add multicast steering for media GT
  2022-10-03  8:56   ` Tvrtko Ursulin
@ 2022-10-03 19:32     ` Matt Roper
  2022-10-04 10:13       ` Tvrtko Ursulin
  0 siblings, 1 reply; 54+ messages in thread
From: Matt Roper @ 2022-10-03 19:32 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: ravi.kumar.vodapalli, intel-gfx, dri-devel

On Mon, Oct 03, 2022 at 09:56:18AM +0100, Tvrtko Ursulin wrote:
> 
> Hi Matt,
> 
> On 01/10/2022 01:45, Matt Roper wrote:
> > MTL's media GT only has a single type of steering ("OAADDRM") which
> > selects between media slice 0 and media slice 1.  We'll always steer to
> > media slice 0 unless it is fused off (which is the case when VD0, VE0,
> > and SFC0 are all reported as unavailable).
> > 
> > Bspec: 67789
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 19 +++++++++++++++++--
> >   drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
> >   drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 +++++++++++++++++-
> >   3 files changed, 35 insertions(+), 3 deletions(-)
> 
> [snip]
> 
> > +static void
> > +mtl_media_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> > +{
> > +	/*
> > +	 * Unlike older platforms, we no longer setup implicit steering here;
> > +	 * all MCR accesses are explicitly steered.
> > +	 */
> > +	if (drm_debug_enabled(DRM_UT_DRIVER)) {
> > +		struct drm_printer p = drm_debug_printer("MCR Steering:");
> > +
> > +		intel_gt_mcr_report_steering(&p, gt, false);
> > +	}
> > +}
> > +
> >   static void
> >   gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
> >   {
> >   	struct drm_i915_private *i915 = gt->i915;
> > -	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
> > +	if (IS_METEORLAKE(i915) && gt->type == GT_MEDIA)
> > +		mtl_media_gt_workarounds_init(gt, wal);
> > +	else if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
> >   		mtl_3d_gt_workarounds_init(gt, wal);
> >   	else if (IS_PONTEVECCHIO(i915))
> >   		pvc_gt_workarounds_init(gt, wal);
> 
> Casually reading only - wouldn't it be nicer if the if-ladder in here
> (gt_init_workarounds) would have a single case per platform, and then you'd
> fork further (3d vs media) in MTL specific function?

Actually, that reminds me that we probably need to change this in a
different direction --- starting with MTL, we should stop tying
workarounds to the platform (IS_METEORLAKE) but rather tie them to the
IP version (i.e., GRAPHICS_VER or MEDIA_VER) since in the future the
same chiplets can potentially be re-used on multiple platforms.
Conversely, you could also potentially have variants of the same
"platform" (e.g., MTL) that incorporate chiplets with different IP
versions (and thus need distinct lists of workarounds and such).

At the moment MTL is the only platform we have with the standalone media
design so there's no potential for mix-and-match of chiplets yet, and
IS_METEORLAKE works fine in the short term, but we do need to start
planning ahead and moving off of platform checks in areas of the driver
like this.

> 
> Also, series ends up with mtl_media_gt_workarounds_init and
> mtl_3d_gt_workarounds_init apparently 100% identical. You will need two
> copies in the future?

Yes, the two GTs are expected to end up with completely different sets
of workarounds once the platform is enabled.  We've just been delaying
on actually sending the new MTL workarounds upstream to give the
workaround database a bit more time to settle.


Matt

> 
> Regards,
> 
> Tvrtko

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 14/14] drm/i915/mtl: Add multicast steering for media GT
  2022-10-03 19:32     ` Matt Roper
@ 2022-10-04 10:13       ` Tvrtko Ursulin
  0 siblings, 0 replies; 54+ messages in thread
From: Tvrtko Ursulin @ 2022-10-04 10:13 UTC (permalink / raw)
  To: Matt Roper; +Cc: ravi.kumar.vodapalli, intel-gfx, dri-devel


On 03/10/2022 20:32, Matt Roper wrote:
> On Mon, Oct 03, 2022 at 09:56:18AM +0100, Tvrtko Ursulin wrote:
>>
>> Hi Matt,
>>
>> On 01/10/2022 01:45, Matt Roper wrote:
>>> MTL's media GT only has a single type of steering ("OAADDRM") which
>>> selects between media slice 0 and media slice 1.  We'll always steer to
>>> media slice 0 unless it is fused off (which is the case when VD0, VE0,
>>> and SFC0 are all reported as unavailable).
>>>
>>> Bspec: 67789
>>> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 19 +++++++++++++++++--
>>>    drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
>>>    drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 +++++++++++++++++-
>>>    3 files changed, 35 insertions(+), 3 deletions(-)
>>
>> [snip]
>>
>>> +static void
>>> +mtl_media_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>>> +{
>>> +	/*
>>> +	 * Unlike older platforms, we no longer setup implicit steering here;
>>> +	 * all MCR accesses are explicitly steered.
>>> +	 */
>>> +	if (drm_debug_enabled(DRM_UT_DRIVER)) {
>>> +		struct drm_printer p = drm_debug_printer("MCR Steering:");
>>> +
>>> +		intel_gt_mcr_report_steering(&p, gt, false);
>>> +	}
>>> +}
>>> +
>>>    static void
>>>    gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    {
>>>    	struct drm_i915_private *i915 = gt->i915;
>>> -	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
>>> +	if (IS_METEORLAKE(i915) && gt->type == GT_MEDIA)
>>> +		mtl_media_gt_workarounds_init(gt, wal);
>>> +	else if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
>>>    		mtl_3d_gt_workarounds_init(gt, wal);
>>>    	else if (IS_PONTEVECCHIO(i915))
>>>    		pvc_gt_workarounds_init(gt, wal);
>>
>> Casually reading only - wouldn't it be nicer if the if-ladder in here
>> (gt_init_workarounds) would have a single case per platform, and then you'd
>> fork further (3d vs media) in MTL specific function?
> 
> Actually, that reminds me that we probably need to change this in a
> different direction --- starting with MTL, we should stop tying
> workarounds to the platform (IS_METEORLAKE) but rather tie them to the
> IP version (i.e., GRAPHICS_VER or MEDIA_VER) since in the future the
> same chiplets can potentially be re-used on multiple platforms.
> Conversely, you could also potentially have variants of the same
> "platform" (e.g., MTL) that incorporate chiplets with different IP
> versions (and thus need distinct lists of workarounds and such).
> 
> At the moment MTL is the only platform we have with the standalone media
> design so there's no potential for mix-and-match of chiplets yet, and
> IS_METEORLAKE works fine in the short term, but we do need to start
> planning ahead and moving off of platform checks in areas of the driver
> like this.
> 
>>
>> Also, series ends up with mtl_media_gt_workarounds_init and
>> mtl_3d_gt_workarounds_init apparently 100% identical. You will need two
>> copies in the future?
> 
> Yes, the two GTs are expected to end up with completely different sets
> of workarounds once the platform is enabled.  We've just been delaying
> on actually sending the new MTL workarounds upstream to give the
> workaround database a bit more time to settle.

Ah yes, I misread the banner printed from those two "as no workaround 
will be programmed from here" and thought why you'd need two copies of a 
nearly empty function and two identical comments. My bad.

You will end up with three instances of "if debug report steering" so 
could in theory add a helper for that. For some minimal value of 
consolidation.. up to you.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
  2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
@ 2022-10-04 12:56     ` Jani Nikula
  -1 siblings, 0 replies; 54+ messages in thread
From: Jani Nikula @ 2022-10-04 12:56 UTC (permalink / raw)
  To: Matt Roper, intel-gfx
  Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

On Fri, 30 Sep 2022, Matt Roper <matthew.d.roper@intel.com> wrote:
> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
> index 8f486f77609f..e823869b9afd 100644
> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> @@ -104,22 +104,16 @@ typedef struct {
>  
>  #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
>  
> -#define INVALID_MMIO_REG _MMIO(0)
> -
> -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
> -{
> -	return reg.reg;
> -}
> +typedef struct {
> +	u32 reg;
> +} i915_mcr_reg_t;
>  
> -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
> -{
> -	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
> -}
> +#define INVALID_MMIO_REG _MMIO(0)
>  
> -static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> -{
> -	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
> -}
> +/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
> +#define i915_mmio_reg_offset(r) (r.reg)
> +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
> +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
>  

I don't really like losing the type safety here. The whole and only
purpose of typedeffing i915_reg_t as a struct instead of just u32 was
the strict type safety.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
@ 2022-10-04 12:56     ` Jani Nikula
  0 siblings, 0 replies; 54+ messages in thread
From: Jani Nikula @ 2022-10-04 12:56 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

On Fri, 30 Sep 2022, Matt Roper <matthew.d.roper@intel.com> wrote:
> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
> index 8f486f77609f..e823869b9afd 100644
> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> @@ -104,22 +104,16 @@ typedef struct {
>  
>  #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
>  
> -#define INVALID_MMIO_REG _MMIO(0)
> -
> -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
> -{
> -	return reg.reg;
> -}
> +typedef struct {
> +	u32 reg;
> +} i915_mcr_reg_t;
>  
> -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
> -{
> -	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
> -}
> +#define INVALID_MMIO_REG _MMIO(0)
>  
> -static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> -{
> -	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
> -}
> +/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
> +#define i915_mmio_reg_offset(r) (r.reg)
> +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
> +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
>  

I don't really like losing the type safety here. The whole and only
purpose of typedeffing i915_reg_t as a struct instead of just u32 was
the strict type safety.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
  2022-10-04 12:56     ` [Intel-gfx] " Jani Nikula
@ 2022-10-04 13:00       ` Jani Nikula
  -1 siblings, 0 replies; 54+ messages in thread
From: Jani Nikula @ 2022-10-04 13:00 UTC (permalink / raw)
  To: Matt Roper, intel-gfx
  Cc: ravi.kumar.vodapalli, balasubramani.vivekanandan, dri-devel

On Tue, 04 Oct 2022, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> On Fri, 30 Sep 2022, Matt Roper <matthew.d.roper@intel.com> wrote:
>> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
>> index 8f486f77609f..e823869b9afd 100644
>> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
>> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
>> @@ -104,22 +104,16 @@ typedef struct {
>>  
>>  #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
>>  
>> -#define INVALID_MMIO_REG _MMIO(0)
>> -
>> -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
>> -{
>> -	return reg.reg;
>> -}
>> +typedef struct {
>> +	u32 reg;
>> +} i915_mcr_reg_t;
>>  
>> -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
>> -{
>> -	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
>> -}
>> +#define INVALID_MMIO_REG _MMIO(0)
>>  
>> -static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>> -{
>> -	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
>> -}
>> +/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
>> +#define i915_mmio_reg_offset(r) (r.reg)
>> +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
>> +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
>>  
>
> I don't really like losing the type safety here. The whole and only
> purpose of typedeffing i915_reg_t as a struct instead of just u32 was
> the strict type safety.

PS. Changes like this should really be highlighted better, in the commit
subject and title. Now it feels like it's hidden within a big commit
within a big series, without even mentioning it in the commit message.


BR,
Jani.


>
> BR,
> Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
@ 2022-10-04 13:00       ` Jani Nikula
  0 siblings, 0 replies; 54+ messages in thread
From: Jani Nikula @ 2022-10-04 13:00 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

On Tue, 04 Oct 2022, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> On Fri, 30 Sep 2022, Matt Roper <matthew.d.roper@intel.com> wrote:
>> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
>> index 8f486f77609f..e823869b9afd 100644
>> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
>> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
>> @@ -104,22 +104,16 @@ typedef struct {
>>  
>>  #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
>>  
>> -#define INVALID_MMIO_REG _MMIO(0)
>> -
>> -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
>> -{
>> -	return reg.reg;
>> -}
>> +typedef struct {
>> +	u32 reg;
>> +} i915_mcr_reg_t;
>>  
>> -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
>> -{
>> -	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
>> -}
>> +#define INVALID_MMIO_REG _MMIO(0)
>>  
>> -static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>> -{
>> -	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
>> -}
>> +/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
>> +#define i915_mmio_reg_offset(r) (r.reg)
>> +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
>> +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
>>  
>
> I don't really like losing the type safety here. The whole and only
> purpose of typedeffing i915_reg_t as a struct instead of just u32 was
> the strict type safety.

PS. Changes like this should really be highlighted better, in the commit
subject and title. Now it feels like it's hidden within a big commit
within a big series, without even mentioning it in the commit message.


BR,
Jani.


>
> BR,
> Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
  2022-10-04 13:00       ` [Intel-gfx] " Jani Nikula
@ 2022-10-05  1:01         ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-05  1:01 UTC (permalink / raw)
  To: Jani Nikula
  Cc: ravi.kumar.vodapalli, intel-gfx, balasubramani.vivekanandan, dri-devel

On Tue, Oct 04, 2022 at 04:00:57PM +0300, Jani Nikula wrote:
> On Tue, 04 Oct 2022, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> > On Fri, 30 Sep 2022, Matt Roper <matthew.d.roper@intel.com> wrote:
> >> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
> >> index 8f486f77609f..e823869b9afd 100644
> >> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> >> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> >> @@ -104,22 +104,16 @@ typedef struct {
> >>  
> >>  #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
> >>  
> >> -#define INVALID_MMIO_REG _MMIO(0)
> >> -
> >> -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
> >> -{
> >> -	return reg.reg;
> >> -}
> >> +typedef struct {
> >> +	u32 reg;
> >> +} i915_mcr_reg_t;
> >>  
> >> -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
> >> -{
> >> -	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
> >> -}
> >> +#define INVALID_MMIO_REG _MMIO(0)
> >>  
> >> -static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> >> -{
> >> -	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
> >> -}
> >> +/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
> >> +#define i915_mmio_reg_offset(r) (r.reg)
> >> +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
> >> +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
> >>  
> >
> > I don't really like losing the type safety here. The whole and only
> > purpose of typedeffing i915_reg_t as a struct instead of just u32 was
> > the strict type safety.
> 
> PS. Changes like this should really be highlighted better, in the commit
> subject and title. Now it feels like it's hidden within a big commit
> within a big series, without even mentioning it in the commit message.

This patch adds the type safety we've been missing until now.  The only
thing that's slightly less safe is the i915_mmio_reg_offset() macro
itself since it allows either of the two split register types to be
passed to the same macro for simplicity.  In theory if you had some
other structure that also had a 'reg' member it could sneak through
here, but most type mistakes (e.g., passing an integer) would still
cause a build failure.

If we want to make this specific macro more bullet-proof, while still
minimizing thrash elsewhere, we could re-write this as

    #define i915_mmio_reg_offset(r) \
         _Generic((r), i915_reg_t: (r).reg, i915_mcr_reg_t: (r).reg)

which I believe should result in a build failure if anything other than
an i915_reg_t or i915_mcr_reg_t is passed in.

We could also just split this out into separate MCR vs non-MCR offset
lookup functions as we've done with other things in this series.  But if
I recall correctly that leads to annoying thrash in stuff like
cmdparser, gvt, perf, etc. that never actually wanted registers to begin
with, but rather lists/ranges of MMIO offsets without a care for the
type of register that lives at the offset.


Matt

> 
> 
> BR,
> Jani.
> 
> 
> >
> > BR,
> > Jani.
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
@ 2022-10-05  1:01         ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-05  1:01 UTC (permalink / raw)
  To: Jani Nikula; +Cc: ravi.kumar.vodapalli, intel-gfx, dri-devel

On Tue, Oct 04, 2022 at 04:00:57PM +0300, Jani Nikula wrote:
> On Tue, 04 Oct 2022, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> > On Fri, 30 Sep 2022, Matt Roper <matthew.d.roper@intel.com> wrote:
> >> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
> >> index 8f486f77609f..e823869b9afd 100644
> >> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> >> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> >> @@ -104,22 +104,16 @@ typedef struct {
> >>  
> >>  #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
> >>  
> >> -#define INVALID_MMIO_REG _MMIO(0)
> >> -
> >> -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
> >> -{
> >> -	return reg.reg;
> >> -}
> >> +typedef struct {
> >> +	u32 reg;
> >> +} i915_mcr_reg_t;
> >>  
> >> -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
> >> -{
> >> -	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
> >> -}
> >> +#define INVALID_MMIO_REG _MMIO(0)
> >>  
> >> -static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> >> -{
> >> -	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
> >> -}
> >> +/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
> >> +#define i915_mmio_reg_offset(r) (r.reg)
> >> +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
> >> +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
> >>  
> >
> > I don't really like losing the type safety here. The whole and only
> > purpose of typedeffing i915_reg_t as a struct instead of just u32 was
> > the strict type safety.
> 
> PS. Changes like this should really be highlighted better, in the commit
> subject and title. Now it feels like it's hidden within a big commit
> within a big series, without even mentioning it in the commit message.

This patch adds the type safety we've been missing until now.  The only
thing that's slightly less safe is the i915_mmio_reg_offset() macro
itself since it allows either of the two split register types to be
passed to the same macro for simplicity.  In theory if you had some
other structure that also had a 'reg' member it could sneak through
here, but most type mistakes (e.g., passing an integer) would still
cause a build failure.

If we want to make this specific macro more bullet-proof, while still
minimizing thrash elsewhere, we could re-write this as

    #define i915_mmio_reg_offset(r) \
         _Generic((r), i915_reg_t: (r).reg, i915_mcr_reg_t: (r).reg)

which I believe should result in a build failure if anything other than
an i915_reg_t or i915_mcr_reg_t is passed in.

We could also just split this out into separate MCR vs non-MCR offset
lookup functions as we've done with other things in this series.  But if
I recall correctly that leads to annoying thrash in stuff like
cmdparser, gvt, perf, etc. that never actually wanted registers to begin
with, but rather lists/ranges of MMIO offsets without a care for the
type of register that lives at the offset.


Matt

> 
> 
> BR,
> Jani.
> 
> 
> >
> > BR,
> > Jani.
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 09/14] drm/i915/gt: Always use MCR functions on multicast registers
  2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
@ 2022-10-12 16:18     ` Balasubramani Vivekanandan
  -1 siblings, 0 replies; 54+ messages in thread
From: Balasubramani Vivekanandan @ 2022-10-12 16:18 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

On 30.09.2022 17:45, Matt Roper wrote:
> Rather than relying on the implicit behavior of intel_uncore_*()
> functions, let's always use the intel_gt_mcr_*() functions to operate on
> multicast/replicated registers.
> 
> v2:
>  - Add TLB invalidation registers
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c        | 42 ++++++++++++++++-------
>  drivers/gpu/drm/i915/gt/intel_mocs.c      | 13 +++----
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 12 ++++---
>  drivers/gpu/drm/i915/intel_pm.c           | 20 ++++++-----
>  4 files changed, 56 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index e14f159ad9fc..e763dc719d3a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -1017,6 +1017,32 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8,
>  	return rb;
>  }
>  
> +/*
> + * HW architecture suggest typical invalidation time at 40us,
> + * with pessimistic cases up to 100us and a recommendation to
> + * cap at 1ms. We go a bit higher just in case.
> + */
> +#define TLB_INVAL_TIMEOUT_US 100
> +#define TLB_INVAL_TIMEOUT_MS 4
> +
> +/*
> + * On Xe_HP the TLB invalidation registers are located at the same MMIO offsets
> + * but are now considered MCR registers.  Since they exist within a GAM range,
> + * the primary instance of the register rolls up the status from each unit.
> + */
> +static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb)
> +{
> +	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
> +		return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0,
> +						    TLB_INVAL_TIMEOUT_US,
> +						    TLB_INVAL_TIMEOUT_MS);
> +	else
> +		return __intel_wait_for_register_fw(gt->uncore, rb.reg, rb.bit, 0,
> +						    TLB_INVAL_TIMEOUT_US,
> +						    TLB_INVAL_TIMEOUT_MS,
> +						    NULL);
> +}
> +
>  static void mmio_invalidate_full(struct intel_gt *gt)
>  {

mmio_invalidate_full() contains a non-MCR register write function - 
`intel_uncore_write_fw(uncore, rb.reg, rb.bit)`
This function is used for MCR registers as well.

I notice in the later patches of the series mmio_invalidate_full() is
modified to use the MCR functions for MCR registers. But I think those
changes should be part of this patch.


Regards,
Bala

>  	static const i915_reg_t gen8_regs[] = {
> @@ -1099,22 +1125,12 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  	for_each_engine_masked(engine, gt, awake, tmp) {
>  		struct reg_and_bit rb;
>  
> -		/*
> -		 * HW architecture suggest typical invalidation time at 40us,
> -		 * with pessimistic cases up to 100us and a recommendation to
> -		 * cap at 1ms. We go a bit higher just in case.
> -		 */
> -		const unsigned int timeout_us = 100;
> -		const unsigned int timeout_ms = 4;
> -
>  		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> -		if (__intel_wait_for_register_fw(uncore,
> -						 rb.reg, rb.bit, 0,
> -						 timeout_us, timeout_ms,
> -						 NULL))
> +
> +		if (wait_for_invalidate(gt, rb))
>  			drm_err_ratelimited(&gt->i915->drm,
>  					    "%s TLB invalidation did not complete in %ums!\n",
> -					    engine->name, timeout_ms);
> +					    engine->name, TLB_INVAL_TIMEOUT_MS);
>  	}
>  
>  	/*
> diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
> index 06643701bf24..89ef1e06bf1d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> @@ -7,6 +7,7 @@
>  
>  #include "intel_engine.h"
>  #include "intel_gt.h"
> +#include "intel_gt_mcr.h"
>  #include "intel_gt_regs.h"
>  #include "intel_mocs.h"
>  #include "intel_ring.h"
> @@ -601,17 +602,17 @@ static u32 l3cc_combine(u16 low, u16 high)
>  	     0; \
>  	     i++)
>  
> -static void init_l3cc_table(struct intel_uncore *uncore,
> +static void init_l3cc_table(struct intel_gt *gt,
>  			    const struct drm_i915_mocs_table *table)
>  {
>  	unsigned int i;
>  	u32 l3cc;
>  
>  	for_each_l3cc(l3cc, table, i)
> -		if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50))
> -			intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc);
> +		if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
> +			intel_gt_mcr_multicast_write_fw(gt, XEHP_LNCFCMOCS(i), l3cc);
>  		else
> -			intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
> +			intel_uncore_write_fw(gt->uncore, GEN9_LNCFCMOCS(i), l3cc);
>  }
>  
>  void intel_mocs_init_engine(struct intel_engine_cs *engine)
> @@ -631,7 +632,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
>  		init_mocs_table(engine, &table);
>  
>  	if (flags & HAS_RENDER_L3CC && engine->class == RENDER_CLASS)
> -		init_l3cc_table(engine->uncore, &table);
> +		init_l3cc_table(engine->gt, &table);
>  }
>  
>  static u32 global_mocs_offset(void)
> @@ -667,7 +668,7 @@ void intel_mocs_init(struct intel_gt *gt)
>  	 * memory transactions including guc transactions
>  	 */
>  	if (flags & HAS_RENDER_L3CC)
> -		init_l3cc_table(gt->uncore, &table);
> +		init_l3cc_table(gt, &table);
>  }
>  
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> index 9229243992c2..5b86b2e286e0 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> @@ -10,12 +10,15 @@
>   */
>  
>  #include "gt/intel_gt.h"
> +#include "gt/intel_gt_mcr.h"
>  #include "gt/intel_gt_regs.h"
>  #include "intel_guc_fw.h"
>  #include "i915_drv.h"
>  
> -static void guc_prepare_xfer(struct intel_uncore *uncore)
> +static void guc_prepare_xfer(struct intel_gt *gt)
>  {
> +	struct intel_uncore *uncore = gt->uncore;
> +
>  	u32 shim_flags = GUC_ENABLE_READ_CACHE_LOGIC |
>  			 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA |
>  			 GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA |
> @@ -35,8 +38,9 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
>  
>  	if (GRAPHICS_VER(uncore->i915) == 9) {
>  		/* DOP Clock Gating Enable for GuC clocks */
> -		intel_uncore_rmw(uncore, GEN8_MISCCPCTL,
> -				 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE);
> +		intel_gt_mcr_multicast_write(gt, GEN8_MISCCPCTL,
> +					     GEN8_DOP_CLOCK_GATE_GUC_ENABLE |
> +					     intel_gt_mcr_read_any(gt, GEN8_MISCCPCTL));
>  
>  		/* allows for 5us (in 10ns units) before GT can go to RC6 */
>  		intel_uncore_write(uncore, GUC_ARAT_C6DIS, 0x1FF);
> @@ -168,7 +172,7 @@ int intel_guc_fw_upload(struct intel_guc *guc)
>  	struct intel_uncore *uncore = gt->uncore;
>  	int ret;
>  
> -	guc_prepare_xfer(uncore);
> +	guc_prepare_xfer(gt);
>  
>  	/*
>  	 * Note that GuC needs the CSS header plus uKernel code to be copied
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index c694433a7da5..381f3d7ef8a7 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -30,6 +30,8 @@
>  #include "display/skl_watermark.h"
>  
>  #include "gt/intel_engine_regs.h"
> +#include "gt/intel_gt.h"
> +#include "gt/intel_gt_mcr.h"
>  #include "gt/intel_gt_regs.h"
>  
>  #include "i915_drv.h"
> @@ -4339,22 +4341,23 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
>  	u32 val;
>  
>  	/* WaTempDisableDOPClkGating:bdw */
> -	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL);
> -	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
> +	misccpctl = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL);
> +	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
> +				     misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
>  
> -	val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1);
> +	val = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
>  	val &= ~L3_PRIO_CREDITS_MASK;
>  	val |= L3_GENERAL_PRIO_CREDITS(general_prio_credits);
>  	val |= L3_HIGH_PRIO_CREDITS(high_prio_credits);
> -	intel_uncore_write(&dev_priv->uncore, GEN8_L3SQCREG1, val);
> +	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_L3SQCREG1, val);
>  
>  	/*
>  	 * Wait at least 100 clocks before re-enabling clock gating.
>  	 * See the definition of L3SQCREG1 in BSpec.
>  	 */
> -	intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1);
> +	intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
>  	udelay(1);
> -	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl);
> +	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL, misccpctl);
>  }
>  
>  static void icl_init_clock_gating(struct drm_i915_private *dev_priv)
> @@ -4514,8 +4517,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
>  	gen9_init_clock_gating(dev_priv);
>  
>  	/* WaDisableDopClockGating:skl */
> -	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &
> -		   ~GEN8_DOP_CLOCK_GATE_ENABLE);
> +	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
> +				     intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL) &
> +				     ~GEN8_DOP_CLOCK_GATE_ENABLE);
>  
>  	/* WAC6entrylatency:skl */
>  	intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) |
> -- 
> 2.37.3
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 09/14] drm/i915/gt: Always use MCR functions on multicast registers
@ 2022-10-12 16:18     ` Balasubramani Vivekanandan
  0 siblings, 0 replies; 54+ messages in thread
From: Balasubramani Vivekanandan @ 2022-10-12 16:18 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

On 30.09.2022 17:45, Matt Roper wrote:
> Rather than relying on the implicit behavior of intel_uncore_*()
> functions, let's always use the intel_gt_mcr_*() functions to operate on
> multicast/replicated registers.
> 
> v2:
>  - Add TLB invalidation registers
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c        | 42 ++++++++++++++++-------
>  drivers/gpu/drm/i915/gt/intel_mocs.c      | 13 +++----
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 12 ++++---
>  drivers/gpu/drm/i915/intel_pm.c           | 20 ++++++-----
>  4 files changed, 56 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index e14f159ad9fc..e763dc719d3a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -1017,6 +1017,32 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8,
>  	return rb;
>  }
>  
> +/*
> + * HW architecture suggest typical invalidation time at 40us,
> + * with pessimistic cases up to 100us and a recommendation to
> + * cap at 1ms. We go a bit higher just in case.
> + */
> +#define TLB_INVAL_TIMEOUT_US 100
> +#define TLB_INVAL_TIMEOUT_MS 4
> +
> +/*
> + * On Xe_HP the TLB invalidation registers are located at the same MMIO offsets
> + * but are now considered MCR registers.  Since they exist within a GAM range,
> + * the primary instance of the register rolls up the status from each unit.
> + */
> +static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb)
> +{
> +	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
> +		return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0,
> +						    TLB_INVAL_TIMEOUT_US,
> +						    TLB_INVAL_TIMEOUT_MS);
> +	else
> +		return __intel_wait_for_register_fw(gt->uncore, rb.reg, rb.bit, 0,
> +						    TLB_INVAL_TIMEOUT_US,
> +						    TLB_INVAL_TIMEOUT_MS,
> +						    NULL);
> +}
> +
>  static void mmio_invalidate_full(struct intel_gt *gt)
>  {

mmio_invalidate_full() contains a non-MCR register write function - 
`intel_uncore_write_fw(uncore, rb.reg, rb.bit)`
This function is used for MCR registers as well.

I notice in the later patches of the series mmio_invalidate_full() is
modified to use the MCR functions for MCR registers. But I think those
changes should be part of this patch.


Regards,
Bala

>  	static const i915_reg_t gen8_regs[] = {
> @@ -1099,22 +1125,12 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  	for_each_engine_masked(engine, gt, awake, tmp) {
>  		struct reg_and_bit rb;
>  
> -		/*
> -		 * HW architecture suggest typical invalidation time at 40us,
> -		 * with pessimistic cases up to 100us and a recommendation to
> -		 * cap at 1ms. We go a bit higher just in case.
> -		 */
> -		const unsigned int timeout_us = 100;
> -		const unsigned int timeout_ms = 4;
> -
>  		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> -		if (__intel_wait_for_register_fw(uncore,
> -						 rb.reg, rb.bit, 0,
> -						 timeout_us, timeout_ms,
> -						 NULL))
> +
> +		if (wait_for_invalidate(gt, rb))
>  			drm_err_ratelimited(&gt->i915->drm,
>  					    "%s TLB invalidation did not complete in %ums!\n",
> -					    engine->name, timeout_ms);
> +					    engine->name, TLB_INVAL_TIMEOUT_MS);
>  	}
>  
>  	/*
> diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
> index 06643701bf24..89ef1e06bf1d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> @@ -7,6 +7,7 @@
>  
>  #include "intel_engine.h"
>  #include "intel_gt.h"
> +#include "intel_gt_mcr.h"
>  #include "intel_gt_regs.h"
>  #include "intel_mocs.h"
>  #include "intel_ring.h"
> @@ -601,17 +602,17 @@ static u32 l3cc_combine(u16 low, u16 high)
>  	     0; \
>  	     i++)
>  
> -static void init_l3cc_table(struct intel_uncore *uncore,
> +static void init_l3cc_table(struct intel_gt *gt,
>  			    const struct drm_i915_mocs_table *table)
>  {
>  	unsigned int i;
>  	u32 l3cc;
>  
>  	for_each_l3cc(l3cc, table, i)
> -		if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50))
> -			intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc);
> +		if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
> +			intel_gt_mcr_multicast_write_fw(gt, XEHP_LNCFCMOCS(i), l3cc);
>  		else
> -			intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
> +			intel_uncore_write_fw(gt->uncore, GEN9_LNCFCMOCS(i), l3cc);
>  }
>  
>  void intel_mocs_init_engine(struct intel_engine_cs *engine)
> @@ -631,7 +632,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
>  		init_mocs_table(engine, &table);
>  
>  	if (flags & HAS_RENDER_L3CC && engine->class == RENDER_CLASS)
> -		init_l3cc_table(engine->uncore, &table);
> +		init_l3cc_table(engine->gt, &table);
>  }
>  
>  static u32 global_mocs_offset(void)
> @@ -667,7 +668,7 @@ void intel_mocs_init(struct intel_gt *gt)
>  	 * memory transactions including guc transactions
>  	 */
>  	if (flags & HAS_RENDER_L3CC)
> -		init_l3cc_table(gt->uncore, &table);
> +		init_l3cc_table(gt, &table);
>  }
>  
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> index 9229243992c2..5b86b2e286e0 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> @@ -10,12 +10,15 @@
>   */
>  
>  #include "gt/intel_gt.h"
> +#include "gt/intel_gt_mcr.h"
>  #include "gt/intel_gt_regs.h"
>  #include "intel_guc_fw.h"
>  #include "i915_drv.h"
>  
> -static void guc_prepare_xfer(struct intel_uncore *uncore)
> +static void guc_prepare_xfer(struct intel_gt *gt)
>  {
> +	struct intel_uncore *uncore = gt->uncore;
> +
>  	u32 shim_flags = GUC_ENABLE_READ_CACHE_LOGIC |
>  			 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA |
>  			 GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA |
> @@ -35,8 +38,9 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
>  
>  	if (GRAPHICS_VER(uncore->i915) == 9) {
>  		/* DOP Clock Gating Enable for GuC clocks */
> -		intel_uncore_rmw(uncore, GEN8_MISCCPCTL,
> -				 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE);
> +		intel_gt_mcr_multicast_write(gt, GEN8_MISCCPCTL,
> +					     GEN8_DOP_CLOCK_GATE_GUC_ENABLE |
> +					     intel_gt_mcr_read_any(gt, GEN8_MISCCPCTL));
>  
>  		/* allows for 5us (in 10ns units) before GT can go to RC6 */
>  		intel_uncore_write(uncore, GUC_ARAT_C6DIS, 0x1FF);
> @@ -168,7 +172,7 @@ int intel_guc_fw_upload(struct intel_guc *guc)
>  	struct intel_uncore *uncore = gt->uncore;
>  	int ret;
>  
> -	guc_prepare_xfer(uncore);
> +	guc_prepare_xfer(gt);
>  
>  	/*
>  	 * Note that GuC needs the CSS header plus uKernel code to be copied
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index c694433a7da5..381f3d7ef8a7 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -30,6 +30,8 @@
>  #include "display/skl_watermark.h"
>  
>  #include "gt/intel_engine_regs.h"
> +#include "gt/intel_gt.h"
> +#include "gt/intel_gt_mcr.h"
>  #include "gt/intel_gt_regs.h"
>  
>  #include "i915_drv.h"
> @@ -4339,22 +4341,23 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
>  	u32 val;
>  
>  	/* WaTempDisableDOPClkGating:bdw */
> -	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL);
> -	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
> +	misccpctl = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL);
> +	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
> +				     misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
>  
> -	val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1);
> +	val = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
>  	val &= ~L3_PRIO_CREDITS_MASK;
>  	val |= L3_GENERAL_PRIO_CREDITS(general_prio_credits);
>  	val |= L3_HIGH_PRIO_CREDITS(high_prio_credits);
> -	intel_uncore_write(&dev_priv->uncore, GEN8_L3SQCREG1, val);
> +	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_L3SQCREG1, val);
>  
>  	/*
>  	 * Wait at least 100 clocks before re-enabling clock gating.
>  	 * See the definition of L3SQCREG1 in BSpec.
>  	 */
> -	intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1);
> +	intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
>  	udelay(1);
> -	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl);
> +	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL, misccpctl);
>  }
>  
>  static void icl_init_clock_gating(struct drm_i915_private *dev_priv)
> @@ -4514,8 +4517,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
>  	gen9_init_clock_gating(dev_priv);
>  
>  	/* WaDisableDopClockGating:skl */
> -	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &
> -		   ~GEN8_DOP_CLOCK_GATE_ENABLE);
> +	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
> +				     intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL) &
> +				     ~GEN8_DOP_CLOCK_GATE_ENABLE);
>  
>  	/* WAC6entrylatency:skl */
>  	intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) |
> -- 
> 2.37.3
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
  2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
@ 2022-10-13  7:21     ` Balasubramani Vivekanandan
  -1 siblings, 0 replies; 54+ messages in thread
From: Balasubramani Vivekanandan @ 2022-10-13  7:21 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

On 30.09.2022 17:45, Matt Roper wrote:
> Rather than treating multicast registers as 'i915_reg_t' let's define
> them as a completely new type.  This will allow the compiler to help us
> make sure we're using multicast-aware functions to operate on multicast
> registers.
> 
> This plan does break down a bit in places where we're just maintaining
> heterogeneous lists of registers (e.g., various MMIO whitelists used by
> perf, GVT, etc.) rather than performing reads/writes.  We only really
> care about the offset in those cases, so for now we can "cast" the
> registers as non-MCR, leaving us with a list of i915_reg_t's, but we may
> want to look for better ways to store mixed collections of i915_reg_t
> and i915_mcr_reg_t in the future.
> 
> v2:
>  - Add TLB invalidation registers
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c            | 32 +++++++----
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 53 ++++++++++++-------
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.h        | 18 +++----
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 27 +++++++---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c   | 32 +++++------
>  .../gpu/drm/i915/gt/intel_workarounds_types.h |  5 +-
>  .../gpu/drm/i915/gt/selftest_workarounds.c    |  2 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  2 +-
>  .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 +-
>  drivers/gpu/drm/i915/gvt/handlers.c           | 17 +++---
>  drivers/gpu/drm/i915/gvt/mmio_context.c       | 16 +++---
>  drivers/gpu/drm/i915/i915_reg_defs.h          | 22 +++-----
>  drivers/gpu/drm/i915/intel_gvt_mmio_table.c   | 10 ++--
>  13 files changed, 140 insertions(+), 100 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index e763dc719d3a..27dbb9e4bd6c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -991,7 +991,10 @@ void intel_gt_info_print(const struct intel_gt_info *info,
>  }
>  
>  struct reg_and_bit {
> -	i915_reg_t reg;
> +	union {
> +		i915_reg_t reg;
> +		i915_mcr_reg_t mcr_reg;
> +	};
>  	u32 bit;
>  };
>  
> @@ -1033,7 +1036,7 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8,
>  static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb)
>  {
>  	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
> -		return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0,
> +		return intel_gt_mcr_wait_for_reg_fw(gt, rb.mcr_reg, rb.bit, 0,
>  						    TLB_INVAL_TIMEOUT_US,
>  						    TLB_INVAL_TIMEOUT_MS);
>  	else
> @@ -1058,7 +1061,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  		[COPY_ENGINE_CLASS]		= GEN12_BLT_TLB_INV_CR,
>  		[COMPUTE_CLASS]			= GEN12_COMPCTX_TLB_INV_CR,
>  	};
> -	static const i915_reg_t xehp_regs[] = {
> +	static const i915_mcr_reg_t xehp_regs[] = {
>  		[RENDER_CLASS]			= XEHP_GFX_TLB_INV_CR,
>  		[VIDEO_DECODE_CLASS]		= XEHP_VD_TLB_INV_CR,
>  		[VIDEO_ENHANCEMENT_CLASS]	= XEHP_VE_TLB_INV_CR,
> @@ -1074,7 +1077,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  	unsigned int num = 0;
>  
>  	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
> -		regs = xehp_regs;
> +		regs = NULL;
>  		num = ARRAY_SIZE(xehp_regs);
The change looks a bit odd. Can you assign regs = NULL at the definition
of it and remove the assignment here?

>  	} else if (GRAPHICS_VER(i915) == 12) {
>  		regs = gen12_regs;
> @@ -1101,11 +1104,17 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  		if (!intel_engine_pm_is_awake(engine))
>  			continue;
>  
> -		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> -		if (!i915_mmio_reg_offset(rb.reg))
> -			continue;
> +		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
> +			intel_gt_mcr_multicast_write_fw(gt,
> +							xehp_regs[engine->class],
> +							BIT(engine->instance));
> +		} else {
> +			rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> +			if (!i915_mmio_reg_offset(rb.reg))
> +				continue;
>  
> -		intel_uncore_write_fw(uncore, rb.reg, rb.bit);
> +			intel_uncore_write_fw(uncore, rb.reg, rb.bit);
> +		}
>  		awake |= engine->mask;
>  	}
>  
> @@ -1125,7 +1134,12 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  	for_each_engine_masked(engine, gt, awake, tmp) {
>  		struct reg_and_bit rb;
>  
> -		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> +		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
> +			rb.mcr_reg = xehp_regs[engine->class];
> +			rb.bit = BIT(engine->instance);
> +		} else {
> +			rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> +		}
>  
>  		if (wait_for_invalidate(gt, rb))
>  			drm_err_ratelimited(&gt->i915->drm,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index a2bc77c33835..9e2caba64f19 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -150,6 +150,19 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>  	}
>  }
>  
> +/*
> + * Although the rest of the driver should use MCR-specific functions to
> + * read/write MCR registers, we still use the regular intel_uncore_* functions
> + * internally to implement those, so we need a way for the functions in this
> + * file to "cast" an i915_mcr_reg_t into an i915_reg_t.
> + */
> +static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
> +{
> +	i915_reg_t r = { .reg = mcr.reg };
> +
> +	return r;
> +}
> +
>  /*
>   * rw_with_mcr_steering_fw - Access a register with specific MCR steering
>   * @uncore: pointer to struct intel_uncore
> @@ -164,7 +177,7 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>   * Caller needs to make sure the relevant forcewake wells are up.
>   */
>  static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> -				   i915_reg_t reg, u8 rw_flag,
> +				   i915_mcr_reg_t reg, u8 rw_flag,
>  				   int group, int instance, u32 value)
>  {
>  	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
> @@ -201,9 +214,9 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>  	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
>  
>  	if (rw_flag == FW_REG_READ)
> -		val = intel_uncore_read_fw(uncore, reg);
> +		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
>  	else
> -		intel_uncore_write_fw(uncore, reg, value);
> +		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
>  
>  	mcr &= ~mcr_mask;
>  	mcr |= old_mcr & mcr_mask;
> @@ -214,14 +227,14 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>  }
>  
>  static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
> -				i915_reg_t reg, u8 rw_flag,
> +				i915_mcr_reg_t reg, u8 rw_flag,
>  				int group, int instance,
>  				u32 value)
>  {
>  	enum forcewake_domains fw_domains;
>  	u32 val;
>  
> -	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
> +	fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg),
>  						    rw_flag);
>  	fw_domains |= intel_uncore_forcewake_for_reg(uncore,
>  						     GEN8_MCR_SELECTOR,
> @@ -249,7 +262,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
>   * group/instance.
>   */
>  u32 intel_gt_mcr_read(struct intel_gt *gt,
> -		      i915_reg_t reg,
> +		      i915_mcr_reg_t reg,
>  		      int group, int instance)
>  {
>  	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, instance, 0);
> @@ -266,7 +279,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
>   * Write an MCR register in unicast mode after steering toward a specific
>   * group/instance.
>   */
> -void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value,
> +void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value,
>  				int group, int instance)
>  {
>  	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, value);
> @@ -281,9 +294,9 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value,
>   * Write an MCR register in multicast mode to update all instances.
>   */
>  void intel_gt_mcr_multicast_write(struct intel_gt *gt,
> -				i915_reg_t reg, u32 value)
> +				i915_mcr_reg_t reg, u32 value)
>  {
> -	intel_uncore_write(gt->uncore, reg, value);
> +	intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value);
>  }
>  
>  /**
> @@ -297,9 +310,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
>   * domains; use intel_gt_mcr_multicast_write() in cases where forcewake should
>   * be obtained automatically.
>   */
> -void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 value)
> +void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value)
>  {
> -	intel_uncore_write_fw(gt->uncore, reg, value);
> +	intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value);
>  }
>  
>  /**
> @@ -318,7 +331,7 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va
>   * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should
>   * be obtained automatically.
>   */
> -void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
> +void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
>  				u32 clear, u32 set)
>  {
>  	u32 val = intel_gt_mcr_read_any(gt, reg);
> @@ -341,10 +354,10 @@ void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
>   * for @type steering too.
>   */
>  static bool reg_needs_read_steering(struct intel_gt *gt,
> -				    i915_reg_t reg,
> +				    i915_mcr_reg_t reg,
>  				    enum intel_steering_type type)
>  {
> -	const u32 offset = i915_mmio_reg_offset(reg);
> +	const u32 offset = i915_mmio_reg_offset(mcr_reg_cast(reg));
>  	const struct intel_mmio_range *entry;
>  
>  	if (likely(!gt->steering_table[type]))
> @@ -424,7 +437,7 @@ static void get_nonterminated_steering(struct intel_gt *gt,
>   * steering.
>   */
>  void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
> -					     i915_reg_t reg,
> +					     i915_mcr_reg_t reg,
>  					     u8 *group, u8 *instance)
>  {
>  	int type;
> @@ -453,7 +466,7 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
>   *
>   * Returns the value from a non-terminated instance of @reg.
>   */
> -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
> +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
>  {
>  	int type;
>  	u8 group, instance;
> @@ -467,7 +480,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
>  		}
>  	}
>  
> -	return intel_uncore_read_fw(gt->uncore, reg);
> +	return intel_uncore_read_fw(gt->uncore, mcr_reg_cast(reg));
>  }
>  
>  /**
> @@ -480,7 +493,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
>   *
>   * Returns the value from a non-terminated instance of @reg.
>   */
> -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
> +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
>  {
>  	int type;
>  	u8 group, instance;
> @@ -494,7 +507,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
>  		}
>  	}
>  
> -	return intel_uncore_read(gt->uncore, reg);
> +	return intel_uncore_read(gt->uncore, mcr_reg_cast(reg));
>  }
>  
>  static void report_steering_type(struct drm_printer *p,
> @@ -595,7 +608,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
>   * Return: 0 if the register matches the desired condition, or -ETIMEDOUT.
>   */
>  int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
> -				 i915_reg_t reg,
> +				 i915_mcr_reg_t reg,
>  				 u32 mask,
>  				 u32 value,
>  				 unsigned int fast_timeout_us,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> index 96979af01f2b..7fb574d31d73 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> @@ -11,24 +11,24 @@
>  void intel_gt_mcr_init(struct intel_gt *gt);
>  
>  u32 intel_gt_mcr_read(struct intel_gt *gt,
> -		      i915_reg_t reg,
> +		      i915_mcr_reg_t reg,
>  		      int group, int instance);
> -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
> -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
> +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg);
> +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg);
>  
>  void intel_gt_mcr_unicast_write(struct intel_gt *gt,
> -				i915_reg_t reg, u32 value,
> +				i915_mcr_reg_t reg, u32 value,
>  				int group, int instance);
>  void intel_gt_mcr_multicast_write(struct intel_gt *gt,
> -				  i915_reg_t reg, u32 value);
> +				  i915_mcr_reg_t reg, u32 value);
>  void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
> -				     i915_reg_t reg, u32 value);
> +				     i915_mcr_reg_t reg, u32 value);
>  
> -void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
> +void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
>  				u32 clear, u32 set);
>  
>  void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
> -					     i915_reg_t reg,
> +					     i915_mcr_reg_t reg,
>  					     u8 *group, u8 *instance);
>  
>  void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
> @@ -38,7 +38,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
>  				  unsigned int *group, unsigned int *instance);
>  
>  int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
> -				 i915_reg_t reg,
> +				 i915_mcr_reg_t reg,
>  				 u32 mask,
>  				 u32 value,
>  				 unsigned int fast_timeout_us,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index 901ea03b9fd8..b7341f01ec9f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -8,7 +8,18 @@
>  
>  #include "i915_reg_defs.h"
>  
> -#define MCR_REG(offset)	_MMIO(offset)
> +#define MCR_REG(offset)	((const i915_mcr_reg_t){ .reg = (offset) })
> +
> +/*
> + * The perf control registers are technically multicast registers, but the
> + * driver never needs to read/write them directly; we only use them to build
> + * lists of registers (where they're mixed in with other non-MCR registers)
> + * and then operate on the offset directly.  For now we'll just define them
> + * as non-multicast so we can place them on the same list, but we may want
> + * to try to come up with a better way to handle heterogeneous lists of
> + * registers in the future.
> + */
> +#define PERF_REG(offset)			_MMIO(offset)
>  
>  /* RPM unit config (Gen8+) */
>  #define RPM_CONFIG0				_MMIO(0xd00)
> @@ -1116,8 +1127,8 @@
>  #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
>  #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
>  
> -#define EU_PERF_CNTL0				MCR_REG(0xe458)
> -#define EU_PERF_CNTL4				MCR_REG(0xe45c)
> +#define EU_PERF_CNTL0				PERF_REG(0xe458)
> +#define EU_PERF_CNTL4				PERF_REG(0xe45c)
>  
>  #define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
>  #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
> @@ -1154,16 +1165,16 @@
>  #define   STACKID_CTRL				REG_GENMASK(6, 5)
>  #define   STACKID_CTRL_512			REG_FIELD_PREP(STACKID_CTRL, 0x2)
>  
> -#define EU_PERF_CNTL1				MCR_REG(0xe558)
> -#define EU_PERF_CNTL5				MCR_REG(0xe55c)
> +#define EU_PERF_CNTL1				PERF_REG(0xe558)
> +#define EU_PERF_CNTL5				PERF_REG(0xe55c)
>  
>  #define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
>  #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
>  #define ICL_HDC_MODE				MCR_REG(0xe5f4)
>  
> -#define EU_PERF_CNTL2				MCR_REG(0xe658)
> -#define EU_PERF_CNTL6				MCR_REG(0xe65c)
> -#define EU_PERF_CNTL3				MCR_REG(0xe758)
> +#define EU_PERF_CNTL2				PERF_REG(0xe658)
> +#define EU_PERF_CNTL6				PERF_REG(0xe65c)
> +#define EU_PERF_CNTL3				PERF_REG(0xe758)
>  
>  #define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
>  #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index c5cd89aa13c9..174c74dcda99 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -166,11 +166,11 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
>  	_wa_add(wal, &wa);
>  }
>  
> -static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
> +static void wa_mcr_add(struct i915_wa_list *wal, i915_mcr_reg_t reg,
>  		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
>  {
>  	struct i915_wa wa = {
> -		.reg  = reg,
> +		.mcr_reg = reg,
>  		.clr  = clear,
>  		.set  = set,
>  		.read = read_mask,
> @@ -188,7 +188,7 @@ wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
>  }
>  
>  static void
> -wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
> +wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set)
>  {
>  	wa_mcr_add(wal, reg, clear, set, clear, false);
>  }
> @@ -206,7 +206,7 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
>  }
>  
>  static void
> -wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
> +wa_mcr_write_or(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set)
>  {
>  	wa_mcr_write_clr_set(wal, reg, set, set);
>  }
> @@ -218,7 +218,7 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
>  }
>  
>  static void
> -wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
> +wa_mcr_write_clr(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clr)
>  {
>  	wa_mcr_write_clr_set(wal, reg, clr, 0);
>  }
> @@ -241,7 +241,7 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
>  }
>  
>  static void
> -wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> +wa_mcr_masked_en(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
>  {
>  	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
>  }
> @@ -253,7 +253,7 @@ wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
>  }
>  
>  static void
> -wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> +wa_mcr_masked_dis(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
>  {
>  	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
>  }
> @@ -266,7 +266,7 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
>  }
>  
>  static void
> -wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
> +wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_mcr_reg_t reg,
>  			u32 mask, u32 val)
>  {
>  	wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
> @@ -1692,19 +1692,19 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
>  		/* open-coded rmw due to steering */
>  		if (wa->clr)
>  			old = wa->is_mcr ?
> -				intel_gt_mcr_read_any_fw(gt, wa->reg) :
> +				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
>  				intel_uncore_read_fw(uncore, wa->reg);
>  		val = (old & ~wa->clr) | wa->set;
>  		if (val != old || !wa->clr) {
>  			if (wa->is_mcr)
> -				intel_gt_mcr_multicast_write_fw(gt, wa->reg, val);
> +				intel_gt_mcr_multicast_write_fw(gt, wa->mcr_reg, val);
>  			else
>  				intel_uncore_write_fw(uncore, wa->reg, val);
>  		}
>  
>  		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
>  			u32 val = wa->is_mcr ?
> -				intel_gt_mcr_read_any_fw(gt, wa->reg) :
> +				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
>  				intel_uncore_read_fw(uncore, wa->reg);
>  
>  			wa_verify(wa, val, wal->name, "application");
> @@ -1738,7 +1738,7 @@ static bool wa_list_verify(struct intel_gt *gt,
>  
>  	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
>  		ok &= wa_verify(wa, wa->is_mcr ?
> -				intel_gt_mcr_read_any_fw(gt, wa->reg) :
> +				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
>  				intel_uncore_read_fw(uncore, wa->reg),
>  				wal->name, from);
>  
> @@ -1786,10 +1786,10 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
>  }
>  
>  static void
> -whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
> +whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 flags)
>  {
>  	struct i915_wa wa = {
> -		.reg = reg,
> +		.mcr_reg = reg,
>  		.is_mcr = 1,
>  	};
>  
> @@ -1799,7 +1799,7 @@ whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
>  	if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags)))
>  		return;
>  
> -	wa.reg.reg |= flags;
> +	wa.mcr_reg.reg |= flags;
>  	_wa_add(wal, &wa);
>  }
>  
> @@ -1810,7 +1810,7 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
>  }
>  
>  static void
> -whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
> +whitelist_mcr_reg(struct i915_wa_list *wal, i915_mcr_reg_t reg)
>  {
>  	whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
>  }
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
> index f05b37e56fa9..7c8b01d00043 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
> @@ -11,7 +11,10 @@
>  #include "i915_reg_defs.h"
>  
>  struct i915_wa {
> -	i915_reg_t	reg;
> +	union {
> +		i915_reg_t	reg;
> +		i915_mcr_reg_t	mcr_reg;
> +	};
>  	u32		clr;
>  	u32		set;
>  	u32		read;
> diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> index 67a9aab801dd..21b1edc052f8 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> @@ -991,7 +991,7 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg)
>  	/* Alas, we must pardon some whitelists. Mistakes already made */
>  	static const struct regmask pardon[] = {
>  		{ GEN9_CTX_PREEMPT_REG, 9 },
> -		{ GEN8_L3SQCREG4, 9 },
> +		{ _MMIO(0xb118), 9 }, /* GEN8_L3SQCREG4 */
>  	};
>  
>  	return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon));
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> index de923fb82301..34ef4f36e660 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> @@ -328,7 +328,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
>  
>  static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
>  					 struct temp_regset *regset,
> -					 i915_reg_t reg, u32 flags)
> +					 i915_mcr_reg_t reg, u32 flags)
>  {
>  	u8 group, inst;
>  
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> index 9495a7928bc8..d5c03e7a7843 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> @@ -240,7 +240,7 @@ static void guc_capture_free_extlists(struct __guc_mmio_reg_descr_group *reglist
>  
>  struct __ext_steer_reg {
>  	const char *name;
> -	i915_reg_t reg;
> +	i915_mcr_reg_t reg;
>  };
>  
>  static const struct __ext_steer_reg xe_extregs[] = {
> @@ -252,7 +252,7 @@ static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
>  			   const struct __ext_steer_reg *extlist,
>  			   int slice_id, int subslice_id)
>  {
> -	ext->reg = extlist->reg;
> +	ext->reg = _MMIO(i915_mmio_reg_offset(extlist->reg));
>  	ext->flags = FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id);
>  	ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id);
>  	ext->regname = extlist->name;
> diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
> index 700cc9688f47..a52c4645cdbe 100644
> --- a/drivers/gpu/drm/i915/gvt/handlers.c
> +++ b/drivers/gpu/drm/i915/gvt/handlers.c
> @@ -734,7 +734,7 @@ static i915_reg_t force_nonpriv_white_list[] = {
>  	_MMIO(0x770c),
>  	_MMIO(0x83a8),
>  	_MMIO(0xb110),
> -	GEN8_L3SQCREG4,//_MMIO(0xb118)
> +	_MMIO(0xb118),
>  	_MMIO(0xe100),
>  	_MMIO(0xe18c),
>  	_MMIO(0xe48c),
> @@ -2140,6 +2140,9 @@ static int csfe_chicken1_mmio_write(struct intel_vgpu *vgpu,
>  #define MMIO_DFH(reg, d, f, r, w) \
>  	MMIO_F(reg, 4, f, 0, 0, d, r, w)
>  
> +#define MMIO_DFH_MCR(reg, d, f, r, w) \
> +	MMIO_F(_MMIO(i915_mmio_reg_offset(reg)), 4, f, 0, 0, d, r, w)
> +
>  #define MMIO_GM(reg, d, r, w) \
>  	MMIO_F(reg, 4, F_GMADR, 0xFFFFF000, 0, d, r, w)
>  
> @@ -2529,15 +2532,15 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
>  
>  	MMIO_DFH(HDC_CHICKEN0, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>  
> -	MMIO_DFH(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
> -		NULL, NULL);
> +	MMIO_DFH_MCR(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
> +		     NULL, NULL);
>  	MMIO_DFH(GEN7_ROW_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
>  		NULL, NULL);
>  	MMIO_DFH(GEN8_UCGCTL6, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
>  
>  	MMIO_DFH(_MMIO(0xb1f0), D_BDW, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0xb1c0), D_BDW, F_CMD_ACCESS, NULL, NULL);
> -	MMIO_DFH(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
>  
> @@ -2550,7 +2553,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
>  
>  	MMIO_DFH(_MMIO(0xe194), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0xe188), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> -	MMIO_DFH(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0x2580), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>  
>  	MMIO_DFH(_MMIO(0x2248), D_BDW, F_CMD_ACCESS, NULL, NULL);
> @@ -2699,7 +2702,7 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
>  	MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 3)), D_SKL_PLUS, NULL, NULL);
>  	MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 4)), D_SKL_PLUS, NULL, NULL);
>  
> -	MMIO_DFH(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
>  
>  	MMIO_F(GEN9_GFX_MOCS(0), 0x7f8, F_CMD_ACCESS, 0, 0, D_SKL_PLUS,
>  		NULL, NULL);
> @@ -2769,7 +2772,7 @@ static int init_bxt_mmio_info(struct intel_gvt *gvt)
>  	MMIO_DH(BXT_PORT_TX_DW3_LN0(DPIO_PHY1, DPIO_CH0), D_BXT,
>  		bxt_port_tx_dw3_read, NULL);
>  	MMIO_DH(BXT_DE_PLL_ENABLE, D_BXT, NULL, bxt_de_pll_enable_write);
> -	MMIO_DFH(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(GEN8_L3CNTLREG, D_BXT, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0x20D8), D_BXT, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40, F_CMD_ACCESS,
> diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
> index d177884d8f7d..7f40055b0474 100644
> --- a/drivers/gpu/drm/i915/gvt/mmio_context.c
> +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
> @@ -44,6 +44,8 @@
>  
>  #define GEN9_MOCS_SIZE		64
>  
> +#define MCR_CAST(mcr)	_MMIO(i915_mcr_reg_offset(mcr))
> +
s/i915_mcr_reg_offset/i915_mmio_reg_offset

Macro is defined but unused. I believe the intention is to replace the
hex values of the register offets with this macro. 

Regards,
Bala

>  /* Raw offset is appened to each line for convenience. */
>  static struct engine_mmio gen8_engine_mmio_list[] __cacheline_aligned = {
>  	{RCS0, RING_MODE_GEN7(RENDER_RING_BASE), 0xffff, false}, /* 0x229c */
> @@ -106,15 +108,15 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
>  	{RCS0, GEN8_CS_CHICKEN1, 0xffff, true}, /* 0x2580 */
>  	{RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */
>  	{RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */
> -	{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
> -	{RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
> +	{RCS0, _MMIO(0xb118), 0, false}, /* GEN8_L3SQCREG4 */
> +	{RCS0, _MMIO(0xb11c), 0, false}, /* GEN9_SCRATCH1 */
>  	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
>  	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
> -	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
> -	{RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
> -	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
> -	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
> -	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
> +	{RCS0, _MMIO(0xe180), 0xffff, true}, /* HALF_SLICE_CHICKEN2 */
> +	{RCS0, _MMIO(0xe184), 0xffff, true}, /* 0xe184 */
> +	{RCS0, _MMIO(0xe188), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN5 */
> +	{RCS0, _MMIO(0xe194), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN7 */
> +	{RCS0, _MMIO(0xe4f0), 0xffff, true}, /* GEN8_ROW_CHICKEN */
>  	{RCS0, TRVATTL3PTRDW(0), 0, true}, /* 0x4de0 */
>  	{RCS0, TRVATTL3PTRDW(1), 0, true}, /* 0x4de4 */
>  	{RCS0, TRNULLDETCT, 0, true}, /* 0x4de8 */
> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
> index 8f486f77609f..e823869b9afd 100644
> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> @@ -104,22 +104,16 @@ typedef struct {
>  
>  #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
>  
> -#define INVALID_MMIO_REG _MMIO(0)
> -
> -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
> -{
> -	return reg.reg;
> -}
> +typedef struct {
> +	u32 reg;
> +} i915_mcr_reg_t;
>  
> -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
> -{
> -	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
> -}
> +#define INVALID_MMIO_REG _MMIO(0)
>  
> -static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> -{
> -	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
> -}
> +/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
> +#define i915_mmio_reg_offset(r) (r.reg)
> +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
> +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
>  
>  #define VLV_DISPLAY_BASE		0x180000
>  
> diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
> index 638b77d64bf4..8c1515f5c04c 100644
> --- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
> +++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
> @@ -816,10 +816,10 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
>  	MMIO_D(GEN8_EU_DISABLE1);
>  	MMIO_D(GEN8_EU_DISABLE2);
>  	MMIO_D(_MMIO(0xfdc));
> -	MMIO_D(GEN8_ROW_CHICKEN);
> +	MMIO_D(_MMIO(0xe4f0));	/* GEN8_ROW_CHICKEN */
>  	MMIO_D(GEN7_ROW_CHICKEN2);
>  	MMIO_D(GEN8_UCGCTL6);
> -	MMIO_D(GEN8_L3SQCREG4);
> +	MMIO_D(_MMIO(0xb118));	/* GEN8_L3SQCREG4 */
>  	MMIO_D(GEN9_SCRATCH_LNCF1);
>  	MMIO_F(_MMIO(0x24d0), 48);
>  	MMIO_D(_MMIO(0x44484));
> @@ -831,7 +831,7 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
>  	MMIO_D(_MMIO(0x65f10));
>  	MMIO_D(_MMIO(0xe194));
>  	MMIO_D(_MMIO(0xe188));
> -	MMIO_D(HALF_SLICE_CHICKEN2);
> +	MMIO_D(_MMIO(0xe180));	/* HALF_SLICE_CHICKEN2 */
>  	MMIO_D(_MMIO(0x2580));
>  	MMIO_D(_MMIO(0xe220));
>  	MMIO_D(_MMIO(0xe230));
> @@ -1025,7 +1025,7 @@ static int iterate_skl_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
>  	MMIO_D(DMC_SSP_BASE);
>  	MMIO_D(DMC_HTP_SKL);
>  	MMIO_D(DMC_LAST_WRITE);
> -	MMIO_D(BDW_SCRATCH1);
> +	MMIO_D(_MMIO(0xb11c));	/* BDW_SCRATCH1 */
>  	MMIO_D(SKL_DFSM);
>  	MMIO_D(DISPIO_CR_TX_BMU_CR0);
>  	MMIO_F(GEN9_GFX_MOCS(0), 0x7f8);
> @@ -1228,7 +1228,7 @@ static int iterate_bxt_mmio(struct intel_gvt_mmio_table_iter *iter)
>  	MMIO_D(GEN8_PUSHBUS_ENABLE);
>  	MMIO_D(GEN8_PUSHBUS_SHIFT);
>  	MMIO_D(GEN6_GFXPAUSE);
> -	MMIO_D(GEN8_L3SQCREG1);
> +	MMIO_D(_MMIO(0xb100));	/* GEN8_L3SQCREG1 */
>  	MMIO_D(GEN8_L3CNTLREG);
>  	MMIO_D(_MMIO(0x20D8));
>  	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40);
> -- 
> 2.37.3
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 12/14] drm/i915: Define multicast registers as a new type
@ 2022-10-13  7:21     ` Balasubramani Vivekanandan
  0 siblings, 0 replies; 54+ messages in thread
From: Balasubramani Vivekanandan @ 2022-10-13  7:21 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

On 30.09.2022 17:45, Matt Roper wrote:
> Rather than treating multicast registers as 'i915_reg_t' let's define
> them as a completely new type.  This will allow the compiler to help us
> make sure we're using multicast-aware functions to operate on multicast
> registers.
> 
> This plan does break down a bit in places where we're just maintaining
> heterogeneous lists of registers (e.g., various MMIO whitelists used by
> perf, GVT, etc.) rather than performing reads/writes.  We only really
> care about the offset in those cases, so for now we can "cast" the
> registers as non-MCR, leaving us with a list of i915_reg_t's, but we may
> want to look for better ways to store mixed collections of i915_reg_t
> and i915_mcr_reg_t in the future.
> 
> v2:
>  - Add TLB invalidation registers
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c            | 32 +++++++----
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 53 ++++++++++++-------
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.h        | 18 +++----
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 27 +++++++---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c   | 32 +++++------
>  .../gpu/drm/i915/gt/intel_workarounds_types.h |  5 +-
>  .../gpu/drm/i915/gt/selftest_workarounds.c    |  2 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  2 +-
>  .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 +-
>  drivers/gpu/drm/i915/gvt/handlers.c           | 17 +++---
>  drivers/gpu/drm/i915/gvt/mmio_context.c       | 16 +++---
>  drivers/gpu/drm/i915/i915_reg_defs.h          | 22 +++-----
>  drivers/gpu/drm/i915/intel_gvt_mmio_table.c   | 10 ++--
>  13 files changed, 140 insertions(+), 100 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index e763dc719d3a..27dbb9e4bd6c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -991,7 +991,10 @@ void intel_gt_info_print(const struct intel_gt_info *info,
>  }
>  
>  struct reg_and_bit {
> -	i915_reg_t reg;
> +	union {
> +		i915_reg_t reg;
> +		i915_mcr_reg_t mcr_reg;
> +	};
>  	u32 bit;
>  };
>  
> @@ -1033,7 +1036,7 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8,
>  static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb)
>  {
>  	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
> -		return intel_gt_mcr_wait_for_reg_fw(gt, rb.reg, rb.bit, 0,
> +		return intel_gt_mcr_wait_for_reg_fw(gt, rb.mcr_reg, rb.bit, 0,
>  						    TLB_INVAL_TIMEOUT_US,
>  						    TLB_INVAL_TIMEOUT_MS);
>  	else
> @@ -1058,7 +1061,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  		[COPY_ENGINE_CLASS]		= GEN12_BLT_TLB_INV_CR,
>  		[COMPUTE_CLASS]			= GEN12_COMPCTX_TLB_INV_CR,
>  	};
> -	static const i915_reg_t xehp_regs[] = {
> +	static const i915_mcr_reg_t xehp_regs[] = {
>  		[RENDER_CLASS]			= XEHP_GFX_TLB_INV_CR,
>  		[VIDEO_DECODE_CLASS]		= XEHP_VD_TLB_INV_CR,
>  		[VIDEO_ENHANCEMENT_CLASS]	= XEHP_VE_TLB_INV_CR,
> @@ -1074,7 +1077,7 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  	unsigned int num = 0;
>  
>  	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
> -		regs = xehp_regs;
> +		regs = NULL;
>  		num = ARRAY_SIZE(xehp_regs);
The change looks a bit odd. Can you assign regs = NULL at the definition
of it and remove the assignment here?

>  	} else if (GRAPHICS_VER(i915) == 12) {
>  		regs = gen12_regs;
> @@ -1101,11 +1104,17 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  		if (!intel_engine_pm_is_awake(engine))
>  			continue;
>  
> -		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> -		if (!i915_mmio_reg_offset(rb.reg))
> -			continue;
> +		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
> +			intel_gt_mcr_multicast_write_fw(gt,
> +							xehp_regs[engine->class],
> +							BIT(engine->instance));
> +		} else {
> +			rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> +			if (!i915_mmio_reg_offset(rb.reg))
> +				continue;
>  
> -		intel_uncore_write_fw(uncore, rb.reg, rb.bit);
> +			intel_uncore_write_fw(uncore, rb.reg, rb.bit);
> +		}
>  		awake |= engine->mask;
>  	}
>  
> @@ -1125,7 +1134,12 @@ static void mmio_invalidate_full(struct intel_gt *gt)
>  	for_each_engine_masked(engine, gt, awake, tmp) {
>  		struct reg_and_bit rb;
>  
> -		rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> +		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
> +			rb.mcr_reg = xehp_regs[engine->class];
> +			rb.bit = BIT(engine->instance);
> +		} else {
> +			rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
> +		}
>  
>  		if (wait_for_invalidate(gt, rb))
>  			drm_err_ratelimited(&gt->i915->drm,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index a2bc77c33835..9e2caba64f19 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -150,6 +150,19 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>  	}
>  }
>  
> +/*
> + * Although the rest of the driver should use MCR-specific functions to
> + * read/write MCR registers, we still use the regular intel_uncore_* functions
> + * internally to implement those, so we need a way for the functions in this
> + * file to "cast" an i915_mcr_reg_t into an i915_reg_t.
> + */
> +static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
> +{
> +	i915_reg_t r = { .reg = mcr.reg };
> +
> +	return r;
> +}
> +
>  /*
>   * rw_with_mcr_steering_fw - Access a register with specific MCR steering
>   * @uncore: pointer to struct intel_uncore
> @@ -164,7 +177,7 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>   * Caller needs to make sure the relevant forcewake wells are up.
>   */
>  static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> -				   i915_reg_t reg, u8 rw_flag,
> +				   i915_mcr_reg_t reg, u8 rw_flag,
>  				   int group, int instance, u32 value)
>  {
>  	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
> @@ -201,9 +214,9 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>  	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
>  
>  	if (rw_flag == FW_REG_READ)
> -		val = intel_uncore_read_fw(uncore, reg);
> +		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
>  	else
> -		intel_uncore_write_fw(uncore, reg, value);
> +		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
>  
>  	mcr &= ~mcr_mask;
>  	mcr |= old_mcr & mcr_mask;
> @@ -214,14 +227,14 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>  }
>  
>  static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
> -				i915_reg_t reg, u8 rw_flag,
> +				i915_mcr_reg_t reg, u8 rw_flag,
>  				int group, int instance,
>  				u32 value)
>  {
>  	enum forcewake_domains fw_domains;
>  	u32 val;
>  
> -	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
> +	fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg),
>  						    rw_flag);
>  	fw_domains |= intel_uncore_forcewake_for_reg(uncore,
>  						     GEN8_MCR_SELECTOR,
> @@ -249,7 +262,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
>   * group/instance.
>   */
>  u32 intel_gt_mcr_read(struct intel_gt *gt,
> -		      i915_reg_t reg,
> +		      i915_mcr_reg_t reg,
>  		      int group, int instance)
>  {
>  	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, instance, 0);
> @@ -266,7 +279,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
>   * Write an MCR register in unicast mode after steering toward a specific
>   * group/instance.
>   */
> -void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value,
> +void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value,
>  				int group, int instance)
>  {
>  	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, value);
> @@ -281,9 +294,9 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value,
>   * Write an MCR register in multicast mode to update all instances.
>   */
>  void intel_gt_mcr_multicast_write(struct intel_gt *gt,
> -				i915_reg_t reg, u32 value)
> +				i915_mcr_reg_t reg, u32 value)
>  {
> -	intel_uncore_write(gt->uncore, reg, value);
> +	intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value);
>  }
>  
>  /**
> @@ -297,9 +310,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
>   * domains; use intel_gt_mcr_multicast_write() in cases where forcewake should
>   * be obtained automatically.
>   */
> -void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 value)
> +void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_mcr_reg_t reg, u32 value)
>  {
> -	intel_uncore_write_fw(gt->uncore, reg, value);
> +	intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value);
>  }
>  
>  /**
> @@ -318,7 +331,7 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 va
>   * domains; use intel_gt_mcr_multicast_rmw() in cases where forcewake should
>   * be obtained automatically.
>   */
> -void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
> +void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
>  				u32 clear, u32 set)
>  {
>  	u32 val = intel_gt_mcr_read_any(gt, reg);
> @@ -341,10 +354,10 @@ void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
>   * for @type steering too.
>   */
>  static bool reg_needs_read_steering(struct intel_gt *gt,
> -				    i915_reg_t reg,
> +				    i915_mcr_reg_t reg,
>  				    enum intel_steering_type type)
>  {
> -	const u32 offset = i915_mmio_reg_offset(reg);
> +	const u32 offset = i915_mmio_reg_offset(mcr_reg_cast(reg));
>  	const struct intel_mmio_range *entry;
>  
>  	if (likely(!gt->steering_table[type]))
> @@ -424,7 +437,7 @@ static void get_nonterminated_steering(struct intel_gt *gt,
>   * steering.
>   */
>  void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
> -					     i915_reg_t reg,
> +					     i915_mcr_reg_t reg,
>  					     u8 *group, u8 *instance)
>  {
>  	int type;
> @@ -453,7 +466,7 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
>   *
>   * Returns the value from a non-terminated instance of @reg.
>   */
> -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
> +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
>  {
>  	int type;
>  	u8 group, instance;
> @@ -467,7 +480,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
>  		}
>  	}
>  
> -	return intel_uncore_read_fw(gt->uncore, reg);
> +	return intel_uncore_read_fw(gt->uncore, mcr_reg_cast(reg));
>  }
>  
>  /**
> @@ -480,7 +493,7 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
>   *
>   * Returns the value from a non-terminated instance of @reg.
>   */
> -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
> +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
>  {
>  	int type;
>  	u8 group, instance;
> @@ -494,7 +507,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
>  		}
>  	}
>  
> -	return intel_uncore_read(gt->uncore, reg);
> +	return intel_uncore_read(gt->uncore, mcr_reg_cast(reg));
>  }
>  
>  static void report_steering_type(struct drm_printer *p,
> @@ -595,7 +608,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
>   * Return: 0 if the register matches the desired condition, or -ETIMEDOUT.
>   */
>  int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
> -				 i915_reg_t reg,
> +				 i915_mcr_reg_t reg,
>  				 u32 mask,
>  				 u32 value,
>  				 unsigned int fast_timeout_us,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> index 96979af01f2b..7fb574d31d73 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> @@ -11,24 +11,24 @@
>  void intel_gt_mcr_init(struct intel_gt *gt);
>  
>  u32 intel_gt_mcr_read(struct intel_gt *gt,
> -		      i915_reg_t reg,
> +		      i915_mcr_reg_t reg,
>  		      int group, int instance);
> -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
> -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
> +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg);
> +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg);
>  
>  void intel_gt_mcr_unicast_write(struct intel_gt *gt,
> -				i915_reg_t reg, u32 value,
> +				i915_mcr_reg_t reg, u32 value,
>  				int group, int instance);
>  void intel_gt_mcr_multicast_write(struct intel_gt *gt,
> -				  i915_reg_t reg, u32 value);
> +				  i915_mcr_reg_t reg, u32 value);
>  void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
> -				     i915_reg_t reg, u32 value);
> +				     i915_mcr_reg_t reg, u32 value);
>  
> -void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_reg_t reg,
> +void intel_gt_mcr_multicast_rmw(struct intel_gt *gt, i915_mcr_reg_t reg,
>  				u32 clear, u32 set);
>  
>  void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
> -					     i915_reg_t reg,
> +					     i915_mcr_reg_t reg,
>  					     u8 *group, u8 *instance);
>  
>  void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
> @@ -38,7 +38,7 @@ void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
>  				  unsigned int *group, unsigned int *instance);
>  
>  int intel_gt_mcr_wait_for_reg_fw(struct intel_gt *gt,
> -				 i915_reg_t reg,
> +				 i915_mcr_reg_t reg,
>  				 u32 mask,
>  				 u32 value,
>  				 unsigned int fast_timeout_us,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index 901ea03b9fd8..b7341f01ec9f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -8,7 +8,18 @@
>  
>  #include "i915_reg_defs.h"
>  
> -#define MCR_REG(offset)	_MMIO(offset)
> +#define MCR_REG(offset)	((const i915_mcr_reg_t){ .reg = (offset) })
> +
> +/*
> + * The perf control registers are technically multicast registers, but the
> + * driver never needs to read/write them directly; we only use them to build
> + * lists of registers (where they're mixed in with other non-MCR registers)
> + * and then operate on the offset directly.  For now we'll just define them
> + * as non-multicast so we can place them on the same list, but we may want
> + * to try to come up with a better way to handle heterogeneous lists of
> + * registers in the future.
> + */
> +#define PERF_REG(offset)			_MMIO(offset)
>  
>  /* RPM unit config (Gen8+) */
>  #define RPM_CONFIG0				_MMIO(0xd00)
> @@ -1116,8 +1127,8 @@
>  #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
>  #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
>  
> -#define EU_PERF_CNTL0				MCR_REG(0xe458)
> -#define EU_PERF_CNTL4				MCR_REG(0xe45c)
> +#define EU_PERF_CNTL0				PERF_REG(0xe458)
> +#define EU_PERF_CNTL4				PERF_REG(0xe45c)
>  
>  #define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
>  #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
> @@ -1154,16 +1165,16 @@
>  #define   STACKID_CTRL				REG_GENMASK(6, 5)
>  #define   STACKID_CTRL_512			REG_FIELD_PREP(STACKID_CTRL, 0x2)
>  
> -#define EU_PERF_CNTL1				MCR_REG(0xe558)
> -#define EU_PERF_CNTL5				MCR_REG(0xe55c)
> +#define EU_PERF_CNTL1				PERF_REG(0xe558)
> +#define EU_PERF_CNTL5				PERF_REG(0xe55c)
>  
>  #define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
>  #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
>  #define ICL_HDC_MODE				MCR_REG(0xe5f4)
>  
> -#define EU_PERF_CNTL2				MCR_REG(0xe658)
> -#define EU_PERF_CNTL6				MCR_REG(0xe65c)
> -#define EU_PERF_CNTL3				MCR_REG(0xe758)
> +#define EU_PERF_CNTL2				PERF_REG(0xe658)
> +#define EU_PERF_CNTL6				PERF_REG(0xe65c)
> +#define EU_PERF_CNTL3				PERF_REG(0xe758)
>  
>  #define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
>  #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index c5cd89aa13c9..174c74dcda99 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -166,11 +166,11 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
>  	_wa_add(wal, &wa);
>  }
>  
> -static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
> +static void wa_mcr_add(struct i915_wa_list *wal, i915_mcr_reg_t reg,
>  		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
>  {
>  	struct i915_wa wa = {
> -		.reg  = reg,
> +		.mcr_reg = reg,
>  		.clr  = clear,
>  		.set  = set,
>  		.read = read_mask,
> @@ -188,7 +188,7 @@ wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
>  }
>  
>  static void
> -wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
> +wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set)
>  {
>  	wa_mcr_add(wal, reg, clear, set, clear, false);
>  }
> @@ -206,7 +206,7 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
>  }
>  
>  static void
> -wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
> +wa_mcr_write_or(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set)
>  {
>  	wa_mcr_write_clr_set(wal, reg, set, set);
>  }
> @@ -218,7 +218,7 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
>  }
>  
>  static void
> -wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
> +wa_mcr_write_clr(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clr)
>  {
>  	wa_mcr_write_clr_set(wal, reg, clr, 0);
>  }
> @@ -241,7 +241,7 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
>  }
>  
>  static void
> -wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> +wa_mcr_masked_en(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
>  {
>  	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
>  }
> @@ -253,7 +253,7 @@ wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
>  }
>  
>  static void
> -wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> +wa_mcr_masked_dis(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
>  {
>  	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
>  }
> @@ -266,7 +266,7 @@ wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
>  }
>  
>  static void
> -wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
> +wa_mcr_masked_field_set(struct i915_wa_list *wal, i915_mcr_reg_t reg,
>  			u32 mask, u32 val)
>  {
>  	wa_mcr_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
> @@ -1692,19 +1692,19 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
>  		/* open-coded rmw due to steering */
>  		if (wa->clr)
>  			old = wa->is_mcr ?
> -				intel_gt_mcr_read_any_fw(gt, wa->reg) :
> +				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
>  				intel_uncore_read_fw(uncore, wa->reg);
>  		val = (old & ~wa->clr) | wa->set;
>  		if (val != old || !wa->clr) {
>  			if (wa->is_mcr)
> -				intel_gt_mcr_multicast_write_fw(gt, wa->reg, val);
> +				intel_gt_mcr_multicast_write_fw(gt, wa->mcr_reg, val);
>  			else
>  				intel_uncore_write_fw(uncore, wa->reg, val);
>  		}
>  
>  		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) {
>  			u32 val = wa->is_mcr ?
> -				intel_gt_mcr_read_any_fw(gt, wa->reg) :
> +				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
>  				intel_uncore_read_fw(uncore, wa->reg);
>  
>  			wa_verify(wa, val, wal->name, "application");
> @@ -1738,7 +1738,7 @@ static bool wa_list_verify(struct intel_gt *gt,
>  
>  	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
>  		ok &= wa_verify(wa, wa->is_mcr ?
> -				intel_gt_mcr_read_any_fw(gt, wa->reg) :
> +				intel_gt_mcr_read_any_fw(gt, wa->mcr_reg) :
>  				intel_uncore_read_fw(uncore, wa->reg),
>  				wal->name, from);
>  
> @@ -1786,10 +1786,10 @@ whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
>  }
>  
>  static void
> -whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
> +whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 flags)
>  {
>  	struct i915_wa wa = {
> -		.reg = reg,
> +		.mcr_reg = reg,
>  		.is_mcr = 1,
>  	};
>  
> @@ -1799,7 +1799,7 @@ whitelist_mcr_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags)
>  	if (GEM_DEBUG_WARN_ON(!is_nonpriv_flags_valid(flags)))
>  		return;
>  
> -	wa.reg.reg |= flags;
> +	wa.mcr_reg.reg |= flags;
>  	_wa_add(wal, &wa);
>  }
>  
> @@ -1810,7 +1810,7 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
>  }
>  
>  static void
> -whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
> +whitelist_mcr_reg(struct i915_wa_list *wal, i915_mcr_reg_t reg)
>  {
>  	whitelist_mcr_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
>  }
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
> index f05b37e56fa9..7c8b01d00043 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
> @@ -11,7 +11,10 @@
>  #include "i915_reg_defs.h"
>  
>  struct i915_wa {
> -	i915_reg_t	reg;
> +	union {
> +		i915_reg_t	reg;
> +		i915_mcr_reg_t	mcr_reg;
> +	};
>  	u32		clr;
>  	u32		set;
>  	u32		read;
> diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> index 67a9aab801dd..21b1edc052f8 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> @@ -991,7 +991,7 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg)
>  	/* Alas, we must pardon some whitelists. Mistakes already made */
>  	static const struct regmask pardon[] = {
>  		{ GEN9_CTX_PREEMPT_REG, 9 },
> -		{ GEN8_L3SQCREG4, 9 },
> +		{ _MMIO(0xb118), 9 }, /* GEN8_L3SQCREG4 */
>  	};
>  
>  	return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon));
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> index de923fb82301..34ef4f36e660 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> @@ -328,7 +328,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
>  
>  static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
>  					 struct temp_regset *regset,
> -					 i915_reg_t reg, u32 flags)
> +					 i915_mcr_reg_t reg, u32 flags)
>  {
>  	u8 group, inst;
>  
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> index 9495a7928bc8..d5c03e7a7843 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> @@ -240,7 +240,7 @@ static void guc_capture_free_extlists(struct __guc_mmio_reg_descr_group *reglist
>  
>  struct __ext_steer_reg {
>  	const char *name;
> -	i915_reg_t reg;
> +	i915_mcr_reg_t reg;
>  };
>  
>  static const struct __ext_steer_reg xe_extregs[] = {
> @@ -252,7 +252,7 @@ static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
>  			   const struct __ext_steer_reg *extlist,
>  			   int slice_id, int subslice_id)
>  {
> -	ext->reg = extlist->reg;
> +	ext->reg = _MMIO(i915_mmio_reg_offset(extlist->reg));
>  	ext->flags = FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id);
>  	ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id);
>  	ext->regname = extlist->name;
> diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
> index 700cc9688f47..a52c4645cdbe 100644
> --- a/drivers/gpu/drm/i915/gvt/handlers.c
> +++ b/drivers/gpu/drm/i915/gvt/handlers.c
> @@ -734,7 +734,7 @@ static i915_reg_t force_nonpriv_white_list[] = {
>  	_MMIO(0x770c),
>  	_MMIO(0x83a8),
>  	_MMIO(0xb110),
> -	GEN8_L3SQCREG4,//_MMIO(0xb118)
> +	_MMIO(0xb118),
>  	_MMIO(0xe100),
>  	_MMIO(0xe18c),
>  	_MMIO(0xe48c),
> @@ -2140,6 +2140,9 @@ static int csfe_chicken1_mmio_write(struct intel_vgpu *vgpu,
>  #define MMIO_DFH(reg, d, f, r, w) \
>  	MMIO_F(reg, 4, f, 0, 0, d, r, w)
>  
> +#define MMIO_DFH_MCR(reg, d, f, r, w) \
> +	MMIO_F(_MMIO(i915_mmio_reg_offset(reg)), 4, f, 0, 0, d, r, w)
> +
>  #define MMIO_GM(reg, d, r, w) \
>  	MMIO_F(reg, 4, F_GMADR, 0xFFFFF000, 0, d, r, w)
>  
> @@ -2529,15 +2532,15 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
>  
>  	MMIO_DFH(HDC_CHICKEN0, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>  
> -	MMIO_DFH(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
> -		NULL, NULL);
> +	MMIO_DFH_MCR(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
> +		     NULL, NULL);
>  	MMIO_DFH(GEN7_ROW_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
>  		NULL, NULL);
>  	MMIO_DFH(GEN8_UCGCTL6, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
>  
>  	MMIO_DFH(_MMIO(0xb1f0), D_BDW, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0xb1c0), D_BDW, F_CMD_ACCESS, NULL, NULL);
> -	MMIO_DFH(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
>  
> @@ -2550,7 +2553,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
>  
>  	MMIO_DFH(_MMIO(0xe194), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0xe188), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> -	MMIO_DFH(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0x2580), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>  
>  	MMIO_DFH(_MMIO(0x2248), D_BDW, F_CMD_ACCESS, NULL, NULL);
> @@ -2699,7 +2702,7 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
>  	MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 3)), D_SKL_PLUS, NULL, NULL);
>  	MMIO_DH(_MMIO(_REG_701C4(PIPE_C, 4)), D_SKL_PLUS, NULL, NULL);
>  
> -	MMIO_DFH(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
>  
>  	MMIO_F(GEN9_GFX_MOCS(0), 0x7f8, F_CMD_ACCESS, 0, 0, D_SKL_PLUS,
>  		NULL, NULL);
> @@ -2769,7 +2772,7 @@ static int init_bxt_mmio_info(struct intel_gvt *gvt)
>  	MMIO_DH(BXT_PORT_TX_DW3_LN0(DPIO_PHY1, DPIO_CH0), D_BXT,
>  		bxt_port_tx_dw3_read, NULL);
>  	MMIO_DH(BXT_DE_PLL_ENABLE, D_BXT, NULL, bxt_de_pll_enable_write);
> -	MMIO_DFH(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(GEN8_L3CNTLREG, D_BXT, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_DFH(_MMIO(0x20D8), D_BXT, F_CMD_ACCESS, NULL, NULL);
>  	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40, F_CMD_ACCESS,
> diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
> index d177884d8f7d..7f40055b0474 100644
> --- a/drivers/gpu/drm/i915/gvt/mmio_context.c
> +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
> @@ -44,6 +44,8 @@
>  
>  #define GEN9_MOCS_SIZE		64
>  
> +#define MCR_CAST(mcr)	_MMIO(i915_mcr_reg_offset(mcr))
> +
s/i915_mcr_reg_offset/i915_mmio_reg_offset

Macro is defined but unused. I believe the intention is to replace the
hex values of the register offets with this macro. 

Regards,
Bala

>  /* Raw offset is appened to each line for convenience. */
>  static struct engine_mmio gen8_engine_mmio_list[] __cacheline_aligned = {
>  	{RCS0, RING_MODE_GEN7(RENDER_RING_BASE), 0xffff, false}, /* 0x229c */
> @@ -106,15 +108,15 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
>  	{RCS0, GEN8_CS_CHICKEN1, 0xffff, true}, /* 0x2580 */
>  	{RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */
>  	{RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */
> -	{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
> -	{RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
> +	{RCS0, _MMIO(0xb118), 0, false}, /* GEN8_L3SQCREG4 */
> +	{RCS0, _MMIO(0xb11c), 0, false}, /* GEN9_SCRATCH1 */
>  	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
>  	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
> -	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
> -	{RCS0, GEN8_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
> -	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
> -	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
> -	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
> +	{RCS0, _MMIO(0xe180), 0xffff, true}, /* HALF_SLICE_CHICKEN2 */
> +	{RCS0, _MMIO(0xe184), 0xffff, true}, /* 0xe184 */
> +	{RCS0, _MMIO(0xe188), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN5 */
> +	{RCS0, _MMIO(0xe194), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN7 */
> +	{RCS0, _MMIO(0xe4f0), 0xffff, true}, /* GEN8_ROW_CHICKEN */
>  	{RCS0, TRVATTL3PTRDW(0), 0, true}, /* 0x4de0 */
>  	{RCS0, TRVATTL3PTRDW(1), 0, true}, /* 0x4de4 */
>  	{RCS0, TRNULLDETCT, 0, true}, /* 0x4de8 */
> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
> index 8f486f77609f..e823869b9afd 100644
> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> @@ -104,22 +104,16 @@ typedef struct {
>  
>  #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
>  
> -#define INVALID_MMIO_REG _MMIO(0)
> -
> -static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
> -{
> -	return reg.reg;
> -}
> +typedef struct {
> +	u32 reg;
> +} i915_mcr_reg_t;
>  
> -static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
> -{
> -	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
> -}
> +#define INVALID_MMIO_REG _MMIO(0)
>  
> -static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> -{
> -	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
> -}
> +/* These macros can be used on either i915_reg_t or i915_mcr_reg_t */
> +#define i915_mmio_reg_offset(r) (r.reg)
> +#define i915_mmio_reg_equal(a, b) (i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b))
> +#define i915_mmio_reg_valid(r) (!i915_mmio_reg_equal(r, INVALID_MMIO_REG))
>  
>  #define VLV_DISPLAY_BASE		0x180000
>  
> diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
> index 638b77d64bf4..8c1515f5c04c 100644
> --- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
> +++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c
> @@ -816,10 +816,10 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
>  	MMIO_D(GEN8_EU_DISABLE1);
>  	MMIO_D(GEN8_EU_DISABLE2);
>  	MMIO_D(_MMIO(0xfdc));
> -	MMIO_D(GEN8_ROW_CHICKEN);
> +	MMIO_D(_MMIO(0xe4f0));	/* GEN8_ROW_CHICKEN */
>  	MMIO_D(GEN7_ROW_CHICKEN2);
>  	MMIO_D(GEN8_UCGCTL6);
> -	MMIO_D(GEN8_L3SQCREG4);
> +	MMIO_D(_MMIO(0xb118));	/* GEN8_L3SQCREG4 */
>  	MMIO_D(GEN9_SCRATCH_LNCF1);
>  	MMIO_F(_MMIO(0x24d0), 48);
>  	MMIO_D(_MMIO(0x44484));
> @@ -831,7 +831,7 @@ static int iterate_bdw_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
>  	MMIO_D(_MMIO(0x65f10));
>  	MMIO_D(_MMIO(0xe194));
>  	MMIO_D(_MMIO(0xe188));
> -	MMIO_D(HALF_SLICE_CHICKEN2);
> +	MMIO_D(_MMIO(0xe180));	/* HALF_SLICE_CHICKEN2 */
>  	MMIO_D(_MMIO(0x2580));
>  	MMIO_D(_MMIO(0xe220));
>  	MMIO_D(_MMIO(0xe230));
> @@ -1025,7 +1025,7 @@ static int iterate_skl_plus_mmio(struct intel_gvt_mmio_table_iter *iter)
>  	MMIO_D(DMC_SSP_BASE);
>  	MMIO_D(DMC_HTP_SKL);
>  	MMIO_D(DMC_LAST_WRITE);
> -	MMIO_D(BDW_SCRATCH1);
> +	MMIO_D(_MMIO(0xb11c));	/* BDW_SCRATCH1 */
>  	MMIO_D(SKL_DFSM);
>  	MMIO_D(DISPIO_CR_TX_BMU_CR0);
>  	MMIO_F(GEN9_GFX_MOCS(0), 0x7f8);
> @@ -1228,7 +1228,7 @@ static int iterate_bxt_mmio(struct intel_gvt_mmio_table_iter *iter)
>  	MMIO_D(GEN8_PUSHBUS_ENABLE);
>  	MMIO_D(GEN8_PUSHBUS_SHIFT);
>  	MMIO_D(GEN6_GFXPAUSE);
> -	MMIO_D(GEN8_L3SQCREG1);
> +	MMIO_D(_MMIO(0xb100));	/* GEN8_L3SQCREG1 */
>  	MMIO_D(GEN8_L3CNTLREG);
>  	MMIO_D(_MMIO(0x20D8));
>  	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40);
> -- 
> 2.37.3
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 13/14] drm/i915/mtl: Add multicast steering for render GT
  2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
@ 2022-10-14 16:02     ` Balasubramani Vivekanandan
  -1 siblings, 0 replies; 54+ messages in thread
From: Balasubramani Vivekanandan @ 2022-10-14 16:02 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: ravi.kumar.vodapalli, dri-devel

On 30.09.2022 17:45, Matt Roper wrote:
> MTL once again changes the multicast register types and steering
> details.  Key changes from past platforms:
>  * The number of instances of some MCR types (NODE, OAAL2, and GAM) vary
>    according to the MTL subplatform and cannot be read from fuse
>    registers.
>  * The MCR steering register (and its bitfields) has changed.
> 
> Unlike past platforms, we will be explicitly steering all types of MCR
> accesses, including those for "SLICE" and "DSS" ranges; we no longer
> rely on implicit steering.  On previous platforms, various
> hardware/firmware agents that needed to access registers typically had
> their own steering control registers, allowing them to perform multicast
> steering without clobbering the CPU/kernel steering.  Starting with MTL,
> more of these agents now share a single steering register (0xFD4) and it
> is no longer safe for us to assume that the value will remain unchanged
> from how we initialized it during startup.  There is also a slight
> chance of race conditions between the driver and a hardware/firmware
> agent, so the hardware provides a semaphore register that can be used to
> coordinate access to the steering register.  Support for the semaphore
> register will be introduced in a future patch.
> 
> Bspec: 67788, 67112
> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 85 ++++++++++++++++++---
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  5 ++
>  drivers/gpu/drm/i915/gt/intel_gt_types.h    |  8 +-
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 ++++-
>  drivers/gpu/drm/i915/i915_pci.c             |  1 +
>  5 files changed, 102 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index 9e2caba64f19..a1fa71b47831 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -41,6 +41,7 @@ static const char * const intel_steering_types[] = {
>  	"MSLICE",
>  	"LNCF",
>  	"GAM",
> +	"DSS",
>  	"INSTANCE 0",
>  };
>  
> @@ -99,9 +100,40 @@ static const struct intel_mmio_range pvc_instance0_steering_table[] = {
>  	{},
>  };
>  
> +static const struct intel_mmio_range mtl3d_instance0_steering_table[] = {
> +	{ 0x000B00, 0x000BFF },         /* SQIDI */
> +	{ 0x001000, 0x001FFF },         /* SQIDI */
> +	{ 0x004000, 0x0048FF },         /* GAM */
> +	{ 0x008700, 0x0087FF },         /* SQIDI */
> +	{ 0x00B000, 0x00B0FF },         /* NODE */
> +	{ 0x00C800, 0x00CFFF },         /* GAM */
> +	{ 0x00D880, 0x00D8FF },         /* NODE */
> +	{ 0x00DD00, 0x00DDFF },         /* OAAL2 */
> +	{},
> +};
> +
> +static const struct intel_mmio_range mtl3d_l3bank_steering_table[] = {
> +	{ 0x00B100, 0x00B3FF },
> +	{},
> +};
> +
> +/* DSS steering is used for SLICE ranges as well */
> +static const struct intel_mmio_range mtl3d_dss_steering_table[] = {
> +	{ 0x005200, 0x0052FF },		/* SLICE */
> +	{ 0x005500, 0x007FFF },		/* SLICE */
> +	{ 0x008140, 0x00815F },		/* SLICE (0x8140-0x814F), DSS (0x8150-0x815F) */
> +	{ 0x0094D0, 0x00955F },		/* SLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */
> +	{ 0x009680, 0x0096FF },		/* DSS */
> +	{ 0x00D800, 0x00D87F },		/* SLICE */
> +	{ 0x00DC00, 0x00DCFF },		/* SLICE */
> +	{ 0x00DE80, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
> +};
> +
>  void intel_gt_mcr_init(struct intel_gt *gt)
>  {
>  	struct drm_i915_private *i915 = gt->i915;
> +	unsigned long fuse;
> +	int i;
>  
>  	/*
>  	 * An mslice is unavailable only if both the meml3 for the slice is
> @@ -119,7 +151,22 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>  			drm_warn(&i915->drm, "mslice mask all zero!\n");
>  	}
>  
> -	if (IS_PONTEVECCHIO(i915)) {
> +	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) &&
> +	    gt->type == GT_PRIMARY) {
> +		fuse = REG_FIELD_GET(GT_L3_EXC_MASK,
> +				     intel_uncore_read(gt->uncore, XEHP_FUSE4));
> +
> +		/*
> +		 * Despite the register field being named "exclude mask" the
> +		 * bits actually represent enabled banks (two banks per bit).
> +		 */
> +		for_each_set_bit(i, &fuse, 3)
> +			gt->info.l3bank_mask |= (0x3 << 2*i);
> +
> +		gt->steering_table[INSTANCE0] = mtl3d_instance0_steering_table;
> +		gt->steering_table[L3BANK] = mtl3d_l3bank_steering_table;
> +		gt->steering_table[DSS] = mtl3d_dss_steering_table;
> +	} else if (IS_PONTEVECCHIO(i915)) {
>  		gt->steering_table[INSTANCE0] = pvc_instance0_steering_table;
>  	} else if (IS_DG2(i915)) {
>  		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
> @@ -184,7 +231,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>  
>  	lockdep_assert_held(&uncore->lock);
>  
> -	if (GRAPHICS_VER(uncore->i915) >= 11) {
> +	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) {
> +		/* No need to save old steering reg value */
> +		intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR,
> +				      REG_FIELD_PREP(MTL_MCR_GROUPID, group) |
> +				      REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance));

I think we need to clear the MULTICAST bit in the MTL_MCR_SELECTOR
register here since we are doing unicast r/w.

And in the functions intel_gt_mcr_multicast_write() and
intel_gt_mcr_multicast_write_fw(), we need to set the MULTICAST bit
in the MTL_MCR_SELECTOR register for MTL since we no more rely on
default values for the MTL_MCR_SELECTOR.

Can I also suggest to optimize the groupid and instanceid calculation
during non-terminating unicast MCR read/write by caching the groupid and
instanceid in some variables instead of calculating it everytime?
Because from MTL we have replaced all MCR register reads with implicit
steering to intel_gt_mcr_read_any/_fw call, and each call of that
function does groupip/instanceid calculation.

Regards,
Bala

> +	} else if (GRAPHICS_VER(uncore->i915) >= 11) {
>  		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
>  		mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance);
>  
> @@ -202,26 +254,30 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>  		 */
>  		if (rw_flag == FW_REG_WRITE)
>  			mcr_mask |= GEN11_MCR_MULTICAST;
> +
> +		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> +
> +		mcr &= ~mcr_mask;
> +		mcr |= mcr_ss;
> +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
>  	} else {
>  		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
>  		mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance);
> -	}
>  
> -	old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> +		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
>  
> -	mcr &= ~mcr_mask;
> -	mcr |= mcr_ss;
> -	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> +		mcr &= ~mcr_mask;
> +		mcr |= mcr_ss;
> +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> +	}
>  
>  	if (rw_flag == FW_REG_READ)
>  		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
>  	else
>  		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
>  
> -	mcr &= ~mcr_mask;
> -	mcr |= old_mcr & mcr_mask;
> -
> -	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> +	if (GRAPHICS_VER_FULL(uncore->i915) < IP_VER(12, 70))
> +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
>  
>  	return val;
>  }
> @@ -385,6 +441,8 @@ static void get_nonterminated_steering(struct intel_gt *gt,
>  				       enum intel_steering_type type,
>  				       u8 *group, u8 *instance)
>  {
> +	u32 dss;
> +
>  	switch (type) {
>  	case L3BANK:
>  		*group = 0;		/* unused */
> @@ -408,6 +466,11 @@ static void get_nonterminated_steering(struct intel_gt *gt,
>  		*group = IS_DG2(gt->i915) ? 1 : 0;
>  		*instance = 0;
>  		break;
> +	case DSS:
> +		dss = intel_sseu_find_first_xehp_dss(&gt->info.sseu, 0, 0);
> +		*group = dss / GEN_DSS_PER_GSLICE;
> +		*instance = dss % GEN_DSS_PER_GSLICE;
> +		break;
>  	case INSTANCE0:
>  		/*
>  		 * There are a lot of MCR types for which instance (0, 0)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index b7341f01ec9f..c5b9671097e3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -59,6 +59,7 @@
>  #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
>  
>  #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
> +#define MTL_MCR_SELECTOR			_MMIO(0xfd4)
>  #define SF_MCR_SELECTOR				_MMIO(0xfd8)
>  #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)
>  #define GAM_MCR_SELECTOR			_MMIO(0xfe0)
> @@ -71,6 +72,8 @@
>  #define   GEN11_MCR_SLICE_MASK			GEN11_MCR_SLICE(0xf)
>  #define   GEN11_MCR_SUBSLICE(subslice)		(((subslice) & 0x7) << 24)
>  #define   GEN11_MCR_SUBSLICE_MASK		GEN11_MCR_SUBSLICE(0x7)
> +#define   MTL_MCR_GROUPID			REG_GENMASK(11, 8)
> +#define   MTL_MCR_INSTANCEID			REG_GENMASK(3, 0)
>  
>  #define IPEIR_I965				_MMIO(0x2064)
>  #define IPEHR_I965				_MMIO(0x2068)
> @@ -531,6 +534,8 @@
>  #define   GEN6_MBCTL_BOOT_FETCH_MECH		(1 << 0)
>  
>  /* Fuse readout registers for GT */
> +#define XEHP_FUSE4				_MMIO(0x9114)
> +#define   GT_L3_EXC_MASK			REG_GENMASK(6, 4)
>  #define	GEN10_MIRROR_FUSE3			_MMIO(0x9118)
>  #define   GEN10_L3BANK_PAIR_COUNT		4
>  #define   GEN10_L3BANK_MASK			0x0F
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index 30003d68fd51..77ecbd6be331 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -60,6 +60,7 @@ enum intel_steering_type {
>  	MSLICE,
>  	LNCF,
>  	GAM,
> +	DSS,
>  
>  	/*
>  	 * On some platforms there are multiple types of MCR registers that
> @@ -255,8 +256,6 @@ struct intel_gt {
>  
>  		intel_engine_mask_t engine_mask;
>  
> -		u32 l3bank_mask;
> -
>  		u8 num_engines;
>  
>  		/* General presence of SFC units */
> @@ -268,7 +267,10 @@ struct intel_gt {
>  		/* Slice/subslice/EU info */
>  		struct sseu_dev_info sseu;
>  
> -		unsigned long mslice_mask;
> +		union {
> +			unsigned long mslice_mask;
> +			unsigned long l3bank_mask;
> +		};
>  
>  		/** @hwconfig: hardware configuration data */
>  		struct intel_hwconfig hwconfig;
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 174c74dcda99..9be048da7fb3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1581,12 +1581,28 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>  	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
>  }
>  
> +static void
> +mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> +{
> +	/*
> +	 * Unlike older platforms, we no longer setup implicit steering here;
> +	 * all MCR accesses are explicitly steered.
> +	 */
> +	if (drm_debug_enabled(DRM_UT_DRIVER)) {
> +		struct drm_printer p = drm_debug_printer("MCR Steering:");
> +
> +		intel_gt_mcr_report_steering(&p, gt, false);
> +	}
> +}
> +
>  static void
>  gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
>  {
>  	struct drm_i915_private *i915 = gt->i915;
>  
> -	if (IS_PONTEVECCHIO(i915))
> +	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
> +		mtl_3d_gt_workarounds_init(gt, wal);
> +	else if (IS_PONTEVECCHIO(i915))
>  		pvc_gt_workarounds_init(gt, wal);
>  	else if (IS_DG2(i915))
>  		dg2_gt_workarounds_init(gt, wal);
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 38460a0bd7cb..bd1d8e0205a6 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -1145,6 +1145,7 @@ static const struct intel_device_info mtl_info = {
>  	.extra_gt_list = xelpmp_extra_gt,
>  	.has_flat_ccs = 0,
>  	.has_gmd_id = 1,
> +	.has_mslice_steering = 0,
>  	.has_snoop = 1,
>  	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
>  	.__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),
> -- 
> 2.37.3
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 13/14] drm/i915/mtl: Add multicast steering for render GT
@ 2022-10-14 16:02     ` Balasubramani Vivekanandan
  0 siblings, 0 replies; 54+ messages in thread
From: Balasubramani Vivekanandan @ 2022-10-14 16:02 UTC (permalink / raw)
  To: Matt Roper, intel-gfx
  Cc: ravi.kumar.vodapalli, dri-devel, Radhakrishna Sripada

On 30.09.2022 17:45, Matt Roper wrote:
> MTL once again changes the multicast register types and steering
> details.  Key changes from past platforms:
>  * The number of instances of some MCR types (NODE, OAAL2, and GAM) vary
>    according to the MTL subplatform and cannot be read from fuse
>    registers.
>  * The MCR steering register (and its bitfields) has changed.
> 
> Unlike past platforms, we will be explicitly steering all types of MCR
> accesses, including those for "SLICE" and "DSS" ranges; we no longer
> rely on implicit steering.  On previous platforms, various
> hardware/firmware agents that needed to access registers typically had
> their own steering control registers, allowing them to perform multicast
> steering without clobbering the CPU/kernel steering.  Starting with MTL,
> more of these agents now share a single steering register (0xFD4) and it
> is no longer safe for us to assume that the value will remain unchanged
> from how we initialized it during startup.  There is also a slight
> chance of race conditions between the driver and a hardware/firmware
> agent, so the hardware provides a semaphore register that can be used to
> coordinate access to the steering register.  Support for the semaphore
> register will be introduced in a future patch.
> 
> Bspec: 67788, 67112
> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 85 ++++++++++++++++++---
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  5 ++
>  drivers/gpu/drm/i915/gt/intel_gt_types.h    |  8 +-
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 ++++-
>  drivers/gpu/drm/i915/i915_pci.c             |  1 +
>  5 files changed, 102 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index 9e2caba64f19..a1fa71b47831 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -41,6 +41,7 @@ static const char * const intel_steering_types[] = {
>  	"MSLICE",
>  	"LNCF",
>  	"GAM",
> +	"DSS",
>  	"INSTANCE 0",
>  };
>  
> @@ -99,9 +100,40 @@ static const struct intel_mmio_range pvc_instance0_steering_table[] = {
>  	{},
>  };
>  
> +static const struct intel_mmio_range mtl3d_instance0_steering_table[] = {
> +	{ 0x000B00, 0x000BFF },         /* SQIDI */
> +	{ 0x001000, 0x001FFF },         /* SQIDI */
> +	{ 0x004000, 0x0048FF },         /* GAM */
> +	{ 0x008700, 0x0087FF },         /* SQIDI */
> +	{ 0x00B000, 0x00B0FF },         /* NODE */
> +	{ 0x00C800, 0x00CFFF },         /* GAM */
> +	{ 0x00D880, 0x00D8FF },         /* NODE */
> +	{ 0x00DD00, 0x00DDFF },         /* OAAL2 */
> +	{},
> +};
> +
> +static const struct intel_mmio_range mtl3d_l3bank_steering_table[] = {
> +	{ 0x00B100, 0x00B3FF },
> +	{},
> +};
> +
> +/* DSS steering is used for SLICE ranges as well */
> +static const struct intel_mmio_range mtl3d_dss_steering_table[] = {
> +	{ 0x005200, 0x0052FF },		/* SLICE */
> +	{ 0x005500, 0x007FFF },		/* SLICE */
> +	{ 0x008140, 0x00815F },		/* SLICE (0x8140-0x814F), DSS (0x8150-0x815F) */
> +	{ 0x0094D0, 0x00955F },		/* SLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */
> +	{ 0x009680, 0x0096FF },		/* DSS */
> +	{ 0x00D800, 0x00D87F },		/* SLICE */
> +	{ 0x00DC00, 0x00DCFF },		/* SLICE */
> +	{ 0x00DE80, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
> +};
> +
>  void intel_gt_mcr_init(struct intel_gt *gt)
>  {
>  	struct drm_i915_private *i915 = gt->i915;
> +	unsigned long fuse;
> +	int i;
>  
>  	/*
>  	 * An mslice is unavailable only if both the meml3 for the slice is
> @@ -119,7 +151,22 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>  			drm_warn(&i915->drm, "mslice mask all zero!\n");
>  	}
>  
> -	if (IS_PONTEVECCHIO(i915)) {
> +	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) &&
> +	    gt->type == GT_PRIMARY) {
> +		fuse = REG_FIELD_GET(GT_L3_EXC_MASK,
> +				     intel_uncore_read(gt->uncore, XEHP_FUSE4));
> +
> +		/*
> +		 * Despite the register field being named "exclude mask" the
> +		 * bits actually represent enabled banks (two banks per bit).
> +		 */
> +		for_each_set_bit(i, &fuse, 3)
> +			gt->info.l3bank_mask |= (0x3 << 2*i);
> +
> +		gt->steering_table[INSTANCE0] = mtl3d_instance0_steering_table;
> +		gt->steering_table[L3BANK] = mtl3d_l3bank_steering_table;
> +		gt->steering_table[DSS] = mtl3d_dss_steering_table;
> +	} else if (IS_PONTEVECCHIO(i915)) {
>  		gt->steering_table[INSTANCE0] = pvc_instance0_steering_table;
>  	} else if (IS_DG2(i915)) {
>  		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
> @@ -184,7 +231,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>  
>  	lockdep_assert_held(&uncore->lock);
>  
> -	if (GRAPHICS_VER(uncore->i915) >= 11) {
> +	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) {
> +		/* No need to save old steering reg value */
> +		intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR,
> +				      REG_FIELD_PREP(MTL_MCR_GROUPID, group) |
> +				      REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance));

I think we need to clear the MULTICAST bit in the MTL_MCR_SELECTOR
register here since we are doing unicast r/w.

And in the functions intel_gt_mcr_multicast_write() and
intel_gt_mcr_multicast_write_fw(), we need to set the MULTICAST bit
in the MTL_MCR_SELECTOR register for MTL since we no more rely on
default values for the MTL_MCR_SELECTOR.

Can I also suggest to optimize the groupid and instanceid calculation
during non-terminating unicast MCR read/write by caching the groupid and
instanceid in some variables instead of calculating it everytime?
Because from MTL we have replaced all MCR register reads with implicit
steering to intel_gt_mcr_read_any/_fw call, and each call of that
function does groupip/instanceid calculation.

Regards,
Bala

> +	} else if (GRAPHICS_VER(uncore->i915) >= 11) {
>  		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
>  		mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance);
>  
> @@ -202,26 +254,30 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>  		 */
>  		if (rw_flag == FW_REG_WRITE)
>  			mcr_mask |= GEN11_MCR_MULTICAST;
> +
> +		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> +
> +		mcr &= ~mcr_mask;
> +		mcr |= mcr_ss;
> +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
>  	} else {
>  		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
>  		mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance);
> -	}
>  
> -	old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> +		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
>  
> -	mcr &= ~mcr_mask;
> -	mcr |= mcr_ss;
> -	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> +		mcr &= ~mcr_mask;
> +		mcr |= mcr_ss;
> +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> +	}
>  
>  	if (rw_flag == FW_REG_READ)
>  		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
>  	else
>  		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
>  
> -	mcr &= ~mcr_mask;
> -	mcr |= old_mcr & mcr_mask;
> -
> -	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> +	if (GRAPHICS_VER_FULL(uncore->i915) < IP_VER(12, 70))
> +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
>  
>  	return val;
>  }
> @@ -385,6 +441,8 @@ static void get_nonterminated_steering(struct intel_gt *gt,
>  				       enum intel_steering_type type,
>  				       u8 *group, u8 *instance)
>  {
> +	u32 dss;
> +
>  	switch (type) {
>  	case L3BANK:
>  		*group = 0;		/* unused */
> @@ -408,6 +466,11 @@ static void get_nonterminated_steering(struct intel_gt *gt,
>  		*group = IS_DG2(gt->i915) ? 1 : 0;
>  		*instance = 0;
>  		break;
> +	case DSS:
> +		dss = intel_sseu_find_first_xehp_dss(&gt->info.sseu, 0, 0);
> +		*group = dss / GEN_DSS_PER_GSLICE;
> +		*instance = dss % GEN_DSS_PER_GSLICE;
> +		break;
>  	case INSTANCE0:
>  		/*
>  		 * There are a lot of MCR types for which instance (0, 0)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index b7341f01ec9f..c5b9671097e3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -59,6 +59,7 @@
>  #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
>  
>  #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
> +#define MTL_MCR_SELECTOR			_MMIO(0xfd4)
>  #define SF_MCR_SELECTOR				_MMIO(0xfd8)
>  #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)
>  #define GAM_MCR_SELECTOR			_MMIO(0xfe0)
> @@ -71,6 +72,8 @@
>  #define   GEN11_MCR_SLICE_MASK			GEN11_MCR_SLICE(0xf)
>  #define   GEN11_MCR_SUBSLICE(subslice)		(((subslice) & 0x7) << 24)
>  #define   GEN11_MCR_SUBSLICE_MASK		GEN11_MCR_SUBSLICE(0x7)
> +#define   MTL_MCR_GROUPID			REG_GENMASK(11, 8)
> +#define   MTL_MCR_INSTANCEID			REG_GENMASK(3, 0)
>  
>  #define IPEIR_I965				_MMIO(0x2064)
>  #define IPEHR_I965				_MMIO(0x2068)
> @@ -531,6 +534,8 @@
>  #define   GEN6_MBCTL_BOOT_FETCH_MECH		(1 << 0)
>  
>  /* Fuse readout registers for GT */
> +#define XEHP_FUSE4				_MMIO(0x9114)
> +#define   GT_L3_EXC_MASK			REG_GENMASK(6, 4)
>  #define	GEN10_MIRROR_FUSE3			_MMIO(0x9118)
>  #define   GEN10_L3BANK_PAIR_COUNT		4
>  #define   GEN10_L3BANK_MASK			0x0F
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index 30003d68fd51..77ecbd6be331 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -60,6 +60,7 @@ enum intel_steering_type {
>  	MSLICE,
>  	LNCF,
>  	GAM,
> +	DSS,
>  
>  	/*
>  	 * On some platforms there are multiple types of MCR registers that
> @@ -255,8 +256,6 @@ struct intel_gt {
>  
>  		intel_engine_mask_t engine_mask;
>  
> -		u32 l3bank_mask;
> -
>  		u8 num_engines;
>  
>  		/* General presence of SFC units */
> @@ -268,7 +267,10 @@ struct intel_gt {
>  		/* Slice/subslice/EU info */
>  		struct sseu_dev_info sseu;
>  
> -		unsigned long mslice_mask;
> +		union {
> +			unsigned long mslice_mask;
> +			unsigned long l3bank_mask;
> +		};
>  
>  		/** @hwconfig: hardware configuration data */
>  		struct intel_hwconfig hwconfig;
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 174c74dcda99..9be048da7fb3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1581,12 +1581,28 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>  	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
>  }
>  
> +static void
> +mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> +{
> +	/*
> +	 * Unlike older platforms, we no longer setup implicit steering here;
> +	 * all MCR accesses are explicitly steered.
> +	 */
> +	if (drm_debug_enabled(DRM_UT_DRIVER)) {
> +		struct drm_printer p = drm_debug_printer("MCR Steering:");
> +
> +		intel_gt_mcr_report_steering(&p, gt, false);
> +	}
> +}
> +
>  static void
>  gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
>  {
>  	struct drm_i915_private *i915 = gt->i915;
>  
> -	if (IS_PONTEVECCHIO(i915))
> +	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
> +		mtl_3d_gt_workarounds_init(gt, wal);
> +	else if (IS_PONTEVECCHIO(i915))
>  		pvc_gt_workarounds_init(gt, wal);
>  	else if (IS_DG2(i915))
>  		dg2_gt_workarounds_init(gt, wal);
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 38460a0bd7cb..bd1d8e0205a6 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -1145,6 +1145,7 @@ static const struct intel_device_info mtl_info = {
>  	.extra_gt_list = xelpmp_extra_gt,
>  	.has_flat_ccs = 0,
>  	.has_gmd_id = 1,
> +	.has_mslice_steering = 0,
>  	.has_snoop = 1,
>  	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
>  	.__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),
> -- 
> 2.37.3
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v2 13/14] drm/i915/mtl: Add multicast steering for render GT
  2022-10-14 16:02     ` Balasubramani Vivekanandan
@ 2022-10-14 22:46       ` Matt Roper
  -1 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-14 22:46 UTC (permalink / raw)
  To: Balasubramani Vivekanandan
  Cc: ravi.kumar.vodapalli, intel-gfx, dri-devel, Radhakrishna Sripada

On Fri, Oct 14, 2022 at 09:32:55PM +0530, Balasubramani Vivekanandan wrote:
> On 30.09.2022 17:45, Matt Roper wrote:
> > MTL once again changes the multicast register types and steering
> > details.  Key changes from past platforms:
> >  * The number of instances of some MCR types (NODE, OAAL2, and GAM) vary
> >    according to the MTL subplatform and cannot be read from fuse
> >    registers.
> >  * The MCR steering register (and its bitfields) has changed.
> > 
> > Unlike past platforms, we will be explicitly steering all types of MCR
> > accesses, including those for "SLICE" and "DSS" ranges; we no longer
> > rely on implicit steering.  On previous platforms, various
> > hardware/firmware agents that needed to access registers typically had
> > their own steering control registers, allowing them to perform multicast
> > steering without clobbering the CPU/kernel steering.  Starting with MTL,
> > more of these agents now share a single steering register (0xFD4) and it
> > is no longer safe for us to assume that the value will remain unchanged
> > from how we initialized it during startup.  There is also a slight
> > chance of race conditions between the driver and a hardware/firmware
> > agent, so the hardware provides a semaphore register that can be used to
> > coordinate access to the steering register.  Support for the semaphore
> > register will be introduced in a future patch.
> > 
> > Bspec: 67788, 67112
> > Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 85 ++++++++++++++++++---
> >  drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  5 ++
> >  drivers/gpu/drm/i915/gt/intel_gt_types.h    |  8 +-
> >  drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 ++++-
> >  drivers/gpu/drm/i915/i915_pci.c             |  1 +
> >  5 files changed, 102 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > index 9e2caba64f19..a1fa71b47831 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > @@ -41,6 +41,7 @@ static const char * const intel_steering_types[] = {
> >  	"MSLICE",
> >  	"LNCF",
> >  	"GAM",
> > +	"DSS",
> >  	"INSTANCE 0",
> >  };
> >  
> > @@ -99,9 +100,40 @@ static const struct intel_mmio_range pvc_instance0_steering_table[] = {
> >  	{},
> >  };
> >  
> > +static const struct intel_mmio_range mtl3d_instance0_steering_table[] = {
> > +	{ 0x000B00, 0x000BFF },         /* SQIDI */
> > +	{ 0x001000, 0x001FFF },         /* SQIDI */
> > +	{ 0x004000, 0x0048FF },         /* GAM */
> > +	{ 0x008700, 0x0087FF },         /* SQIDI */
> > +	{ 0x00B000, 0x00B0FF },         /* NODE */
> > +	{ 0x00C800, 0x00CFFF },         /* GAM */
> > +	{ 0x00D880, 0x00D8FF },         /* NODE */
> > +	{ 0x00DD00, 0x00DDFF },         /* OAAL2 */
> > +	{},
> > +};
> > +
> > +static const struct intel_mmio_range mtl3d_l3bank_steering_table[] = {
> > +	{ 0x00B100, 0x00B3FF },
> > +	{},
> > +};
> > +
> > +/* DSS steering is used for SLICE ranges as well */
> > +static const struct intel_mmio_range mtl3d_dss_steering_table[] = {
> > +	{ 0x005200, 0x0052FF },		/* SLICE */
> > +	{ 0x005500, 0x007FFF },		/* SLICE */
> > +	{ 0x008140, 0x00815F },		/* SLICE (0x8140-0x814F), DSS (0x8150-0x815F) */
> > +	{ 0x0094D0, 0x00955F },		/* SLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */
> > +	{ 0x009680, 0x0096FF },		/* DSS */
> > +	{ 0x00D800, 0x00D87F },		/* SLICE */
> > +	{ 0x00DC00, 0x00DCFF },		/* SLICE */
> > +	{ 0x00DE80, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
> > +};
> > +
> >  void intel_gt_mcr_init(struct intel_gt *gt)
> >  {
> >  	struct drm_i915_private *i915 = gt->i915;
> > +	unsigned long fuse;
> > +	int i;
> >  
> >  	/*
> >  	 * An mslice is unavailable only if both the meml3 for the slice is
> > @@ -119,7 +151,22 @@ void intel_gt_mcr_init(struct intel_gt *gt)
> >  			drm_warn(&i915->drm, "mslice mask all zero!\n");
> >  	}
> >  
> > -	if (IS_PONTEVECCHIO(i915)) {
> > +	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) &&
> > +	    gt->type == GT_PRIMARY) {
> > +		fuse = REG_FIELD_GET(GT_L3_EXC_MASK,
> > +				     intel_uncore_read(gt->uncore, XEHP_FUSE4));
> > +
> > +		/*
> > +		 * Despite the register field being named "exclude mask" the
> > +		 * bits actually represent enabled banks (two banks per bit).
> > +		 */
> > +		for_each_set_bit(i, &fuse, 3)
> > +			gt->info.l3bank_mask |= (0x3 << 2*i);
> > +
> > +		gt->steering_table[INSTANCE0] = mtl3d_instance0_steering_table;
> > +		gt->steering_table[L3BANK] = mtl3d_l3bank_steering_table;
> > +		gt->steering_table[DSS] = mtl3d_dss_steering_table;
> > +	} else if (IS_PONTEVECCHIO(i915)) {
> >  		gt->steering_table[INSTANCE0] = pvc_instance0_steering_table;
> >  	} else if (IS_DG2(i915)) {
> >  		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
> > @@ -184,7 +231,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> >  
> >  	lockdep_assert_held(&uncore->lock);
> >  
> > -	if (GRAPHICS_VER(uncore->i915) >= 11) {
> > +	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) {
> > +		/* No need to save old steering reg value */
> > +		intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR,
> > +				      REG_FIELD_PREP(MTL_MCR_GROUPID, group) |
> > +				      REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance));
> 
> I think we need to clear the MULTICAST bit in the MTL_MCR_SELECTOR
> register here since we are doing unicast r/w.

Since this is a write operation rather than a RMW, the resulting
register value is exactly the fields we set here (group and instance)
and the multicast bit is guaranteed to be clear after this.

However that does remind me that even though we don't need to restore
the original steering target, we do need to restore the MULTICAST bit to
'1' at the end of this function:

        "Prior to releasing the semaphore, multicast is returned to the
        default (1)."

We haven't implemented use of the hardware semaphore yet (that will come
in a future series), but in general any agent operating on MCR registers
should be guaranteed to have multicast behavior by default.  I'll make
that update in the next version.

And I guess we should probably also _set_ the multicast bit when doing
reads since reads don't actually care about the value of that bit
(they're always inherently a unicast operation), but based on
Wa_22013088509 it's safest to just leave multicast enabled at all times
except when we have a specific need to clear it (i.e., unicast writes).

> 
> And in the functions intel_gt_mcr_multicast_write() and
> intel_gt_mcr_multicast_write_fw(), we need to set the MULTICAST bit
> in the MTL_MCR_SELECTOR register for MTL since we no more rely on
> default values for the MTL_MCR_SELECTOR.

Based on the above, the one thing we're supposed to be guaranteed is
that the multicast bit is always set, unless we've intentionally
switched it to unicast mode temporarily (and in that case we're supposed
to switch it back to multicast at the end).  But it doesn't hurt to
explicitly set that bit, just to be safe; I'll include that too.

> 
> Can I also suggest to optimize the groupid and instanceid calculation
> during non-terminating unicast MCR read/write by caching the groupid and
> instanceid in some variables instead of calculating it everytime?
> Because from MTL we have replaced all MCR register reads with implicit
> steering to intel_gt_mcr_read_any/_fw call, and each call of that
> function does groupip/instanceid calculation.

Are you suggesting that get_nonterminated_steering() should return
pre-computed values rather than calculating them on each call?  We could
probably do that, but it kind of seems like a micro-optimization that
might not provide much actual benefit.  The heaviest steering
calculations basically boil down to an __ffs() operation which should
already be super fast and assembly-optimized.  And given that most of
the places that 'read any' operations happen are stuff like workarounds,
error state dumps, etc. that aren't on a hot path, I'm not sure if it's
worth adding the extra complexity unless we think there's a place where
it could measurably improve performance.


Matt

> 
> Regards,
> Bala
> 
> > +	} else if (GRAPHICS_VER(uncore->i915) >= 11) {
> >  		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
> >  		mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance);
> >  
> > @@ -202,26 +254,30 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> >  		 */
> >  		if (rw_flag == FW_REG_WRITE)
> >  			mcr_mask |= GEN11_MCR_MULTICAST;
> > +
> > +		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> > +
> > +		mcr &= ~mcr_mask;
> > +		mcr |= mcr_ss;
> > +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> >  	} else {
> >  		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
> >  		mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance);
> > -	}
> >  
> > -	old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> > +		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> >  
> > -	mcr &= ~mcr_mask;
> > -	mcr |= mcr_ss;
> > -	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> > +		mcr &= ~mcr_mask;
> > +		mcr |= mcr_ss;
> > +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> > +	}
> >  
> >  	if (rw_flag == FW_REG_READ)
> >  		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
> >  	else
> >  		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
> >  
> > -	mcr &= ~mcr_mask;
> > -	mcr |= old_mcr & mcr_mask;
> > -
> > -	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> > +	if (GRAPHICS_VER_FULL(uncore->i915) < IP_VER(12, 70))
> > +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> >  
> >  	return val;
> >  }
> > @@ -385,6 +441,8 @@ static void get_nonterminated_steering(struct intel_gt *gt,
> >  				       enum intel_steering_type type,
> >  				       u8 *group, u8 *instance)
> >  {
> > +	u32 dss;
> > +
> >  	switch (type) {
> >  	case L3BANK:
> >  		*group = 0;		/* unused */
> > @@ -408,6 +466,11 @@ static void get_nonterminated_steering(struct intel_gt *gt,
> >  		*group = IS_DG2(gt->i915) ? 1 : 0;
> >  		*instance = 0;
> >  		break;
> > +	case DSS:
> > +		dss = intel_sseu_find_first_xehp_dss(&gt->info.sseu, 0, 0);
> > +		*group = dss / GEN_DSS_PER_GSLICE;
> > +		*instance = dss % GEN_DSS_PER_GSLICE;
> > +		break;
> >  	case INSTANCE0:
> >  		/*
> >  		 * There are a lot of MCR types for which instance (0, 0)
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > index b7341f01ec9f..c5b9671097e3 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > @@ -59,6 +59,7 @@
> >  #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
> >  
> >  #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
> > +#define MTL_MCR_SELECTOR			_MMIO(0xfd4)
> >  #define SF_MCR_SELECTOR				_MMIO(0xfd8)
> >  #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)
> >  #define GAM_MCR_SELECTOR			_MMIO(0xfe0)
> > @@ -71,6 +72,8 @@
> >  #define   GEN11_MCR_SLICE_MASK			GEN11_MCR_SLICE(0xf)
> >  #define   GEN11_MCR_SUBSLICE(subslice)		(((subslice) & 0x7) << 24)
> >  #define   GEN11_MCR_SUBSLICE_MASK		GEN11_MCR_SUBSLICE(0x7)
> > +#define   MTL_MCR_GROUPID			REG_GENMASK(11, 8)
> > +#define   MTL_MCR_INSTANCEID			REG_GENMASK(3, 0)
> >  
> >  #define IPEIR_I965				_MMIO(0x2064)
> >  #define IPEHR_I965				_MMIO(0x2068)
> > @@ -531,6 +534,8 @@
> >  #define   GEN6_MBCTL_BOOT_FETCH_MECH		(1 << 0)
> >  
> >  /* Fuse readout registers for GT */
> > +#define XEHP_FUSE4				_MMIO(0x9114)
> > +#define   GT_L3_EXC_MASK			REG_GENMASK(6, 4)
> >  #define	GEN10_MIRROR_FUSE3			_MMIO(0x9118)
> >  #define   GEN10_L3BANK_PAIR_COUNT		4
> >  #define   GEN10_L3BANK_MASK			0x0F
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > index 30003d68fd51..77ecbd6be331 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > @@ -60,6 +60,7 @@ enum intel_steering_type {
> >  	MSLICE,
> >  	LNCF,
> >  	GAM,
> > +	DSS,
> >  
> >  	/*
> >  	 * On some platforms there are multiple types of MCR registers that
> > @@ -255,8 +256,6 @@ struct intel_gt {
> >  
> >  		intel_engine_mask_t engine_mask;
> >  
> > -		u32 l3bank_mask;
> > -
> >  		u8 num_engines;
> >  
> >  		/* General presence of SFC units */
> > @@ -268,7 +267,10 @@ struct intel_gt {
> >  		/* Slice/subslice/EU info */
> >  		struct sseu_dev_info sseu;
> >  
> > -		unsigned long mslice_mask;
> > +		union {
> > +			unsigned long mslice_mask;
> > +			unsigned long l3bank_mask;
> > +		};
> >  
> >  		/** @hwconfig: hardware configuration data */
> >  		struct intel_hwconfig hwconfig;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 174c74dcda99..9be048da7fb3 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -1581,12 +1581,28 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >  	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
> >  }
> >  
> > +static void
> > +mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> > +{
> > +	/*
> > +	 * Unlike older platforms, we no longer setup implicit steering here;
> > +	 * all MCR accesses are explicitly steered.
> > +	 */
> > +	if (drm_debug_enabled(DRM_UT_DRIVER)) {
> > +		struct drm_printer p = drm_debug_printer("MCR Steering:");
> > +
> > +		intel_gt_mcr_report_steering(&p, gt, false);
> > +	}
> > +}
> > +
> >  static void
> >  gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
> >  {
> >  	struct drm_i915_private *i915 = gt->i915;
> >  
> > -	if (IS_PONTEVECCHIO(i915))
> > +	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
> > +		mtl_3d_gt_workarounds_init(gt, wal);
> > +	else if (IS_PONTEVECCHIO(i915))
> >  		pvc_gt_workarounds_init(gt, wal);
> >  	else if (IS_DG2(i915))
> >  		dg2_gt_workarounds_init(gt, wal);
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> > index 38460a0bd7cb..bd1d8e0205a6 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -1145,6 +1145,7 @@ static const struct intel_device_info mtl_info = {
> >  	.extra_gt_list = xelpmp_extra_gt,
> >  	.has_flat_ccs = 0,
> >  	.has_gmd_id = 1,
> > +	.has_mslice_steering = 0,
> >  	.has_snoop = 1,
> >  	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
> >  	.__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),
> > -- 
> > 2.37.3
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Intel-gfx] [PATCH v2 13/14] drm/i915/mtl: Add multicast steering for render GT
@ 2022-10-14 22:46       ` Matt Roper
  0 siblings, 0 replies; 54+ messages in thread
From: Matt Roper @ 2022-10-14 22:46 UTC (permalink / raw)
  To: Balasubramani Vivekanandan; +Cc: ravi.kumar.vodapalli, intel-gfx, dri-devel

On Fri, Oct 14, 2022 at 09:32:55PM +0530, Balasubramani Vivekanandan wrote:
> On 30.09.2022 17:45, Matt Roper wrote:
> > MTL once again changes the multicast register types and steering
> > details.  Key changes from past platforms:
> >  * The number of instances of some MCR types (NODE, OAAL2, and GAM) vary
> >    according to the MTL subplatform and cannot be read from fuse
> >    registers.
> >  * The MCR steering register (and its bitfields) has changed.
> > 
> > Unlike past platforms, we will be explicitly steering all types of MCR
> > accesses, including those for "SLICE" and "DSS" ranges; we no longer
> > rely on implicit steering.  On previous platforms, various
> > hardware/firmware agents that needed to access registers typically had
> > their own steering control registers, allowing them to perform multicast
> > steering without clobbering the CPU/kernel steering.  Starting with MTL,
> > more of these agents now share a single steering register (0xFD4) and it
> > is no longer safe for us to assume that the value will remain unchanged
> > from how we initialized it during startup.  There is also a slight
> > chance of race conditions between the driver and a hardware/firmware
> > agent, so the hardware provides a semaphore register that can be used to
> > coordinate access to the steering register.  Support for the semaphore
> > register will be introduced in a future patch.
> > 
> > Bspec: 67788, 67112
> > Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 85 ++++++++++++++++++---
> >  drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  5 ++
> >  drivers/gpu/drm/i915/gt/intel_gt_types.h    |  8 +-
> >  drivers/gpu/drm/i915/gt/intel_workarounds.c | 18 ++++-
> >  drivers/gpu/drm/i915/i915_pci.c             |  1 +
> >  5 files changed, 102 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > index 9e2caba64f19..a1fa71b47831 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > @@ -41,6 +41,7 @@ static const char * const intel_steering_types[] = {
> >  	"MSLICE",
> >  	"LNCF",
> >  	"GAM",
> > +	"DSS",
> >  	"INSTANCE 0",
> >  };
> >  
> > @@ -99,9 +100,40 @@ static const struct intel_mmio_range pvc_instance0_steering_table[] = {
> >  	{},
> >  };
> >  
> > +static const struct intel_mmio_range mtl3d_instance0_steering_table[] = {
> > +	{ 0x000B00, 0x000BFF },         /* SQIDI */
> > +	{ 0x001000, 0x001FFF },         /* SQIDI */
> > +	{ 0x004000, 0x0048FF },         /* GAM */
> > +	{ 0x008700, 0x0087FF },         /* SQIDI */
> > +	{ 0x00B000, 0x00B0FF },         /* NODE */
> > +	{ 0x00C800, 0x00CFFF },         /* GAM */
> > +	{ 0x00D880, 0x00D8FF },         /* NODE */
> > +	{ 0x00DD00, 0x00DDFF },         /* OAAL2 */
> > +	{},
> > +};
> > +
> > +static const struct intel_mmio_range mtl3d_l3bank_steering_table[] = {
> > +	{ 0x00B100, 0x00B3FF },
> > +	{},
> > +};
> > +
> > +/* DSS steering is used for SLICE ranges as well */
> > +static const struct intel_mmio_range mtl3d_dss_steering_table[] = {
> > +	{ 0x005200, 0x0052FF },		/* SLICE */
> > +	{ 0x005500, 0x007FFF },		/* SLICE */
> > +	{ 0x008140, 0x00815F },		/* SLICE (0x8140-0x814F), DSS (0x8150-0x815F) */
> > +	{ 0x0094D0, 0x00955F },		/* SLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */
> > +	{ 0x009680, 0x0096FF },		/* DSS */
> > +	{ 0x00D800, 0x00D87F },		/* SLICE */
> > +	{ 0x00DC00, 0x00DCFF },		/* SLICE */
> > +	{ 0x00DE80, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
> > +};
> > +
> >  void intel_gt_mcr_init(struct intel_gt *gt)
> >  {
> >  	struct drm_i915_private *i915 = gt->i915;
> > +	unsigned long fuse;
> > +	int i;
> >  
> >  	/*
> >  	 * An mslice is unavailable only if both the meml3 for the slice is
> > @@ -119,7 +151,22 @@ void intel_gt_mcr_init(struct intel_gt *gt)
> >  			drm_warn(&i915->drm, "mslice mask all zero!\n");
> >  	}
> >  
> > -	if (IS_PONTEVECCHIO(i915)) {
> > +	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70) &&
> > +	    gt->type == GT_PRIMARY) {
> > +		fuse = REG_FIELD_GET(GT_L3_EXC_MASK,
> > +				     intel_uncore_read(gt->uncore, XEHP_FUSE4));
> > +
> > +		/*
> > +		 * Despite the register field being named "exclude mask" the
> > +		 * bits actually represent enabled banks (two banks per bit).
> > +		 */
> > +		for_each_set_bit(i, &fuse, 3)
> > +			gt->info.l3bank_mask |= (0x3 << 2*i);
> > +
> > +		gt->steering_table[INSTANCE0] = mtl3d_instance0_steering_table;
> > +		gt->steering_table[L3BANK] = mtl3d_l3bank_steering_table;
> > +		gt->steering_table[DSS] = mtl3d_dss_steering_table;
> > +	} else if (IS_PONTEVECCHIO(i915)) {
> >  		gt->steering_table[INSTANCE0] = pvc_instance0_steering_table;
> >  	} else if (IS_DG2(i915)) {
> >  		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
> > @@ -184,7 +231,12 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> >  
> >  	lockdep_assert_held(&uncore->lock);
> >  
> > -	if (GRAPHICS_VER(uncore->i915) >= 11) {
> > +	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 70)) {
> > +		/* No need to save old steering reg value */
> > +		intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR,
> > +				      REG_FIELD_PREP(MTL_MCR_GROUPID, group) |
> > +				      REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance));
> 
> I think we need to clear the MULTICAST bit in the MTL_MCR_SELECTOR
> register here since we are doing unicast r/w.

Since this is a write operation rather than a RMW, the resulting
register value is exactly the fields we set here (group and instance)
and the multicast bit is guaranteed to be clear after this.

However that does remind me that even though we don't need to restore
the original steering target, we do need to restore the MULTICAST bit to
'1' at the end of this function:

        "Prior to releasing the semaphore, multicast is returned to the
        default (1)."

We haven't implemented use of the hardware semaphore yet (that will come
in a future series), but in general any agent operating on MCR registers
should be guaranteed to have multicast behavior by default.  I'll make
that update in the next version.

And I guess we should probably also _set_ the multicast bit when doing
reads since reads don't actually care about the value of that bit
(they're always inherently a unicast operation), but based on
Wa_22013088509 it's safest to just leave multicast enabled at all times
except when we have a specific need to clear it (i.e., unicast writes).

> 
> And in the functions intel_gt_mcr_multicast_write() and
> intel_gt_mcr_multicast_write_fw(), we need to set the MULTICAST bit
> in the MTL_MCR_SELECTOR register for MTL since we no more rely on
> default values for the MTL_MCR_SELECTOR.

Based on the above, the one thing we're supposed to be guaranteed is
that the multicast bit is always set, unless we've intentionally
switched it to unicast mode temporarily (and in that case we're supposed
to switch it back to multicast at the end).  But it doesn't hurt to
explicitly set that bit, just to be safe; I'll include that too.

> 
> Can I also suggest to optimize the groupid and instanceid calculation
> during non-terminating unicast MCR read/write by caching the groupid and
> instanceid in some variables instead of calculating it everytime?
> Because from MTL we have replaced all MCR register reads with implicit
> steering to intel_gt_mcr_read_any/_fw call, and each call of that
> function does groupip/instanceid calculation.

Are you suggesting that get_nonterminated_steering() should return
pre-computed values rather than calculating them on each call?  We could
probably do that, but it kind of seems like a micro-optimization that
might not provide much actual benefit.  The heaviest steering
calculations basically boil down to an __ffs() operation which should
already be super fast and assembly-optimized.  And given that most of
the places that 'read any' operations happen are stuff like workarounds,
error state dumps, etc. that aren't on a hot path, I'm not sure if it's
worth adding the extra complexity unless we think there's a place where
it could measurably improve performance.


Matt

> 
> Regards,
> Bala
> 
> > +	} else if (GRAPHICS_VER(uncore->i915) >= 11) {
> >  		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
> >  		mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance);
> >  
> > @@ -202,26 +254,30 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> >  		 */
> >  		if (rw_flag == FW_REG_WRITE)
> >  			mcr_mask |= GEN11_MCR_MULTICAST;
> > +
> > +		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> > +
> > +		mcr &= ~mcr_mask;
> > +		mcr |= mcr_ss;
> > +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> >  	} else {
> >  		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
> >  		mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance);
> > -	}
> >  
> > -	old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> > +		old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
> >  
> > -	mcr &= ~mcr_mask;
> > -	mcr |= mcr_ss;
> > -	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> > +		mcr &= ~mcr_mask;
> > +		mcr |= mcr_ss;
> > +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> > +	}
> >  
> >  	if (rw_flag == FW_REG_READ)
> >  		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
> >  	else
> >  		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
> >  
> > -	mcr &= ~mcr_mask;
> > -	mcr |= old_mcr & mcr_mask;
> > -
> > -	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> > +	if (GRAPHICS_VER_FULL(uncore->i915) < IP_VER(12, 70))
> > +		intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> >  
> >  	return val;
> >  }
> > @@ -385,6 +441,8 @@ static void get_nonterminated_steering(struct intel_gt *gt,
> >  				       enum intel_steering_type type,
> >  				       u8 *group, u8 *instance)
> >  {
> > +	u32 dss;
> > +
> >  	switch (type) {
> >  	case L3BANK:
> >  		*group = 0;		/* unused */
> > @@ -408,6 +466,11 @@ static void get_nonterminated_steering(struct intel_gt *gt,
> >  		*group = IS_DG2(gt->i915) ? 1 : 0;
> >  		*instance = 0;
> >  		break;
> > +	case DSS:
> > +		dss = intel_sseu_find_first_xehp_dss(&gt->info.sseu, 0, 0);
> > +		*group = dss / GEN_DSS_PER_GSLICE;
> > +		*instance = dss % GEN_DSS_PER_GSLICE;
> > +		break;
> >  	case INSTANCE0:
> >  		/*
> >  		 * There are a lot of MCR types for which instance (0, 0)
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > index b7341f01ec9f..c5b9671097e3 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > @@ -59,6 +59,7 @@
> >  #define GMD_ID_MEDIA				_MMIO(MTL_MEDIA_GSI_BASE + 0xd8c)
> >  
> >  #define MCFG_MCR_SELECTOR			_MMIO(0xfd0)
> > +#define MTL_MCR_SELECTOR			_MMIO(0xfd4)
> >  #define SF_MCR_SELECTOR				_MMIO(0xfd8)
> >  #define GEN8_MCR_SELECTOR			_MMIO(0xfdc)
> >  #define GAM_MCR_SELECTOR			_MMIO(0xfe0)
> > @@ -71,6 +72,8 @@
> >  #define   GEN11_MCR_SLICE_MASK			GEN11_MCR_SLICE(0xf)
> >  #define   GEN11_MCR_SUBSLICE(subslice)		(((subslice) & 0x7) << 24)
> >  #define   GEN11_MCR_SUBSLICE_MASK		GEN11_MCR_SUBSLICE(0x7)
> > +#define   MTL_MCR_GROUPID			REG_GENMASK(11, 8)
> > +#define   MTL_MCR_INSTANCEID			REG_GENMASK(3, 0)
> >  
> >  #define IPEIR_I965				_MMIO(0x2064)
> >  #define IPEHR_I965				_MMIO(0x2068)
> > @@ -531,6 +534,8 @@
> >  #define   GEN6_MBCTL_BOOT_FETCH_MECH		(1 << 0)
> >  
> >  /* Fuse readout registers for GT */
> > +#define XEHP_FUSE4				_MMIO(0x9114)
> > +#define   GT_L3_EXC_MASK			REG_GENMASK(6, 4)
> >  #define	GEN10_MIRROR_FUSE3			_MMIO(0x9118)
> >  #define   GEN10_L3BANK_PAIR_COUNT		4
> >  #define   GEN10_L3BANK_MASK			0x0F
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > index 30003d68fd51..77ecbd6be331 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > @@ -60,6 +60,7 @@ enum intel_steering_type {
> >  	MSLICE,
> >  	LNCF,
> >  	GAM,
> > +	DSS,
> >  
> >  	/*
> >  	 * On some platforms there are multiple types of MCR registers that
> > @@ -255,8 +256,6 @@ struct intel_gt {
> >  
> >  		intel_engine_mask_t engine_mask;
> >  
> > -		u32 l3bank_mask;
> > -
> >  		u8 num_engines;
> >  
> >  		/* General presence of SFC units */
> > @@ -268,7 +267,10 @@ struct intel_gt {
> >  		/* Slice/subslice/EU info */
> >  		struct sseu_dev_info sseu;
> >  
> > -		unsigned long mslice_mask;
> > +		union {
> > +			unsigned long mslice_mask;
> > +			unsigned long l3bank_mask;
> > +		};
> >  
> >  		/** @hwconfig: hardware configuration data */
> >  		struct intel_hwconfig hwconfig;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 174c74dcda99..9be048da7fb3 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -1581,12 +1581,28 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >  	wa_mcr_write_clr(wal, GEN8_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
> >  }
> >  
> > +static void
> > +mtl_3d_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> > +{
> > +	/*
> > +	 * Unlike older platforms, we no longer setup implicit steering here;
> > +	 * all MCR accesses are explicitly steered.
> > +	 */
> > +	if (drm_debug_enabled(DRM_UT_DRIVER)) {
> > +		struct drm_printer p = drm_debug_printer("MCR Steering:");
> > +
> > +		intel_gt_mcr_report_steering(&p, gt, false);
> > +	}
> > +}
> > +
> >  static void
> >  gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
> >  {
> >  	struct drm_i915_private *i915 = gt->i915;
> >  
> > -	if (IS_PONTEVECCHIO(i915))
> > +	if (IS_METEORLAKE(i915) && gt->type == GT_PRIMARY)
> > +		mtl_3d_gt_workarounds_init(gt, wal);
> > +	else if (IS_PONTEVECCHIO(i915))
> >  		pvc_gt_workarounds_init(gt, wal);
> >  	else if (IS_DG2(i915))
> >  		dg2_gt_workarounds_init(gt, wal);
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> > index 38460a0bd7cb..bd1d8e0205a6 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -1145,6 +1145,7 @@ static const struct intel_device_info mtl_info = {
> >  	.extra_gt_list = xelpmp_extra_gt,
> >  	.has_flat_ccs = 0,
> >  	.has_gmd_id = 1,
> > +	.has_mslice_steering = 0,
> >  	.has_snoop = 1,
> >  	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
> >  	.__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),
> > -- 
> > 2.37.3
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2022-10-14 22:46 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-01  0:45 [PATCH v2 00/14] Explicit MCR handling and MTL steering Matt Roper
2022-10-01  0:45 ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 01/14] drm/i915/gen8: Create separate reg definitions for new MCR registers Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 02/14] drm/i915/xehp: " Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 03/14] drm/i915/gt: Drop a few unused register definitions Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 04/14] drm/i915/gt: Correct prefix on a few registers Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 05/14] drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [Intel-gfx] [PATCH v2 06/14] drm/i915/xehp: Check for faults on primary GAM Matt Roper
2022-10-01  0:45   ` Matt Roper
2022-10-01  0:45 ` [PATCH v2 07/14] drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw() Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 08/14] drm/i915: Define MCR registers explicitly Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 09/14] drm/i915/gt: Always use MCR functions on multicast registers Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-12 16:18   ` Balasubramani Vivekanandan
2022-10-12 16:18     ` [Intel-gfx] " Balasubramani Vivekanandan
2022-10-01  0:45 ` [PATCH v2 10/14] drm/i915/guc: Handle save/restore of MCR registers explicitly Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 11/14] drm/i915/gt: Add MCR-specific workaround initializers Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 12/14] drm/i915: Define multicast registers as a new type Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-04 12:56   ` Jani Nikula
2022-10-04 12:56     ` [Intel-gfx] " Jani Nikula
2022-10-04 13:00     ` Jani Nikula
2022-10-04 13:00       ` [Intel-gfx] " Jani Nikula
2022-10-05  1:01       ` Matt Roper
2022-10-05  1:01         ` [Intel-gfx] " Matt Roper
2022-10-13  7:21   ` Balasubramani Vivekanandan
2022-10-13  7:21     ` [Intel-gfx] " Balasubramani Vivekanandan
2022-10-01  0:45 ` [PATCH v2 13/14] drm/i915/mtl: Add multicast steering for render GT Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-14 16:02   ` Balasubramani Vivekanandan
2022-10-14 16:02     ` Balasubramani Vivekanandan
2022-10-14 22:46     ` Matt Roper
2022-10-14 22:46       ` [Intel-gfx] " Matt Roper
2022-10-01  0:45 ` [PATCH v2 14/14] drm/i915/mtl: Add multicast steering for media GT Matt Roper
2022-10-01  0:45   ` [Intel-gfx] " Matt Roper
2022-10-03  8:56   ` Tvrtko Ursulin
2022-10-03 19:32     ` Matt Roper
2022-10-04 10:13       ` Tvrtko Ursulin
2022-10-01  1:49 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Explicit MCR handling and MTL steering (rev2) Patchwork
2022-10-01  1:49 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-10-01  2:13 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2022-10-01  5:35 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Explicit MCR handling and MTL steering (rev3) Patchwork
2022-10-01  5:35 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-10-01  5:58 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-10-01 21:41 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.