dri-devel.lists.freedesktop.org archive mirror
* [PATCH 00/15] i915: Explicit handling of multicast registers
@ 2022-03-30 23:28 Matt Roper
  2022-03-30 23:28 ` [PATCH 01/15] drm/i915/gen8: Create separate reg definitions for new MCR registers Matt Roper
                   ` (14 more replies)
  0 siblings, 15 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx
  Cc: Lucas De Marchi, Daniele Ceraolo Spurio, dri-devel, Tvrtko Ursulin

Multicast/replicated (MCR) registers on Intel hardware are a purely
GT-specific concept.  Rather than leaving MCR register handling spread
across several places throughout the driver (intel_uncore.c, intel_gt.c,
etc.) with confusing combinations of handler functions living in
different namespaces, let's consolidate it all into a single place
(intel_gt_mcr.c) and provide a more consistent and clearly-documented
interface for the rest of the driver to access such registers:

 * intel_gt_mcr_read -- unicast read from specific instance
 * intel_gt_mcr_read_any[_fw] -- unicast read from any non-terminated
   instance
 * intel_gt_mcr_unicast_write -- unicast write to specific instance
 * intel_gt_mcr_multicast_write[_fw] -- multicast write to all instances
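
As a rough illustration of the four access semantics listed above (not
code from this series), here is a small user-space model.  The names
mcr_bank, mcr_read, mcr_read_any, mcr_unicast_write and
mcr_multicast_write are hypothetical stand-ins, NUM_INSTANCES is an
arbitrary replication count, and treating a terminated instance as
reading back 0 is a modelling assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_INSTANCES 4	/* hypothetical replication count */

/* One replicated register: a separate value per hardware instance. */
struct mcr_bank {
	uint32_t val[NUM_INSTANCES];
	bool terminated[NUM_INSTANCES];	/* assumed: fused-off instances read as 0 */
};

/* Unicast read from one specific instance. */
static inline uint32_t mcr_read(const struct mcr_bank *b, int instance)
{
	assert(instance >= 0 && instance < NUM_INSTANCES);
	return b->terminated[instance] ? 0 : b->val[instance];
}

/* Unicast read from the first instance that is not terminated. */
static inline uint32_t mcr_read_any(const struct mcr_bank *b)
{
	for (int i = 0; i < NUM_INSTANCES; i++)
		if (!b->terminated[i])
			return b->val[i];
	return 0;	/* every instance is terminated */
}

/* Unicast write to one specific instance. */
static inline void mcr_unicast_write(struct mcr_bank *b, int instance,
				     uint32_t v)
{
	assert(instance >= 0 && instance < NUM_INSTANCES);
	if (!b->terminated[instance])
		b->val[instance] = v;
}

/* Multicast write: every live instance receives the same value. */
static inline void mcr_multicast_write(struct mcr_bank *b, uint32_t v)
{
	for (int i = 0; i < NUM_INSTANCES; i++)
		if (!b->terminated[i])
			b->val[i] = v;
}
```

The model only captures which instance(s) an access touches; it says
nothing about how the real hardware selects them.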

To ensure these new interfaces are used for all accesses to MCR
registers (rather than relying on the implicit and possibly incorrect
semantics of our regular mmio accessors), we'll also promote multicast
registers to a unique type within the driver (i915_mcr_reg_t rather than
the traditional i915_reg_t).  This will let the compiler help us catch
places where the code is trying to perform a non-MCR-aware MMIO
operation on an MCR register.
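
The type-safety idea can be sketched in a few lines.  This is a
simplified stand-in, not the definitions from the series: the MCR_REG
macro, the two accessor helpers and EXAMPLE_PLAIN_REG (with its made-up
offset) are assumptions for illustration; 0xe100 is the
HALF_SLICE_CHICKEN1 offset from patch 01.

```c
#include <assert.h>
#include <stdint.h>

/* Two structurally identical but distinct types. */
typedef struct { uint32_t reg; } i915_reg_t;
typedef struct { uint32_t reg; } i915_mcr_reg_t;

#define _MMIO(r)	((const i915_reg_t){ .reg = (r) })
#define MCR_REG(r)	((const i915_mcr_reg_t){ .reg = (r) })	/* hypothetical */

/* Hypothetical accessors: each accepts exactly one register type. */
static inline uint32_t reg_offset(i915_reg_t r)
{
	return r.reg;
}

static inline uint32_t mcr_reg_offset(i915_mcr_reg_t r)
{
	return r.reg;
}

#define EXAMPLE_PLAIN_REG	_MMIO(0x2000)	/* made-up offset */
#define EXAMPLE_MCR_REG		MCR_REG(0xe100)

/*
 * reg_offset(EXAMPLE_MCR_REG) would be rejected at compile time with an
 * "incompatible type" error, which is the check the cover letter wants.
 */
```

Because both types wrap a single u32, the distinction costs nothing at
runtime; it exists only for the compiler.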

Finally, we'll implement new guidance from our hardware architects that
we should steer every undirected access to MCR registers (including
registers in GSLICE/DSS MCR ranges) at the time of access on Xe_HP,
rather than relying on a default steering target programmed at driver
initialization time.
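
To make the difference between default steering and steering at the time
of access concrete, here is a toy model.  Everything in it (fake_gt,
hw_read, pick_live_instance, read_any_steered) is hypothetical, and a
real driver would also need to serialize use of the steering selector,
which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_INSTANCES 4	/* hypothetical replication count */

struct fake_gt {
	uint32_t steer;			/* models the steering selector */
	uint32_t val[NUM_INSTANCES];	/* one replicated register */
	bool terminated[NUM_INSTANCES];
};

/* What the "hardware" does with a read: follow the current steering. */
static inline uint32_t hw_read(const struct fake_gt *gt)
{
	return gt->terminated[gt->steer] ? 0 : gt->val[gt->steer];
}

/* Pick the first live instance; returns -1 if there is none. */
static inline int pick_live_instance(const struct fake_gt *gt)
{
	for (int i = 0; i < NUM_INSTANCES; i++)
		if (!gt->terminated[i])
			return i;
	return -1;
}

/* Undirected read that programs the steering for this one access. */
static inline uint32_t read_any_steered(struct fake_gt *gt)
{
	uint32_t saved = gt->steer;
	int target = pick_live_instance(gt);
	uint32_t v;

	if (target < 0)
		return 0;

	gt->steer = (uint32_t)target;	/* steer explicitly for this access */
	v = hw_read(gt);
	gt->steer = saved;		/* leave the selector as we found it */
	return v;
}
```

In this model a plain hw_read() returns whatever the selector happens to
point at, which may be a terminated instance, while read_any_steered()
never depends on the value programmed earlier.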

One aspect of this series that I'm not super happy with is the handling
of mixed lists of MCR and non-MCR registers that we have in a few places
(e.g., the various whitelists used by perf, GVT, etc.).  Since we're not
actually accessing the registers in those spots, just listing them out
so their MMIO offsets can be used for comparison, for now I've
effectively cast the MCR registers on those lists back to i915_reg_t
type so that the compiler doesn't complain about seeing incompatible
i915_mcr_reg_t elements in a list that's supposed to be i915_reg_t.  We
may want to think about better ways to handle heterogeneous lists of MCR
and non-MCR registers, or possibly just convert those to lists of u32
offsets since we're not actually using them to perform MMIO accesses.
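
The "lists of u32 offsets" alternative might look roughly like the
sketch below.  The names example_whitelist and offset_is_whitelisted are
hypothetical, and the list contents are illustrative only; the offsets
are taken from register definitions touched in patch 01.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Offsets only: once the list is just used for comparison, whether an
 * entry is an MCR register or not no longer affects its type.
 */
static const uint32_t example_whitelist[] = {
	0x9424,	/* MISCCPCTL */
	0xe100,	/* HALF_SLICE_CHICKEN1 */
	0xe184,	/* HALF_SLICE_CHICKEN3 */
};

/* Linear search is enough for a sketch; real lists may be sorted. */
static inline bool offset_is_whitelisted(uint32_t offset)
{
	size_t n = sizeof(example_whitelist) / sizeof(example_whitelist[0]);

	for (size_t i = 0; i < n; i++)
		if (example_whitelist[i] == offset)
			return true;
	return false;
}
```

The trade-off is that a bare u32 list gives up the compile-time check
that a given entry really names a register definition.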


Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>


Matt Roper (15):
  drm/i915/gen8: Create separate reg definitions for new MCR registers
  drm/i915/xehp: Create separate reg definitions for new MCR registers
  drm/i915/gt: Drop a few unused register definitions
  drm/i915/gt: Correct prefix on a few registers
  drm/i915/xehp: Check for faults on all mslices
  drm/i915: Drop duplicated definition of XEHPSDV_FLAT_CCS_BASE_ADDR
  drm/i915: Move XEHPSDV_TILE0_ADDR_RANGE to GT register header
  drm/i915: Define MCR registers explicitly
  drm/i915/gt: Move multicast register handling to a dedicated file
  drm/i915/gt: Cleanup interface for MCR operations
  drm/i915/gt: Always use MCR functions on multicast registers
  drm/i915/guc: Handle save/restore of MCR registers explicitly
  drm/i915/gt: Add MCR-specific workaround initializers
  drm/i915: Define multicast registers as a new type
  drm/i915/xehp: Eliminate shared/implicit steering

 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |   4 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  34 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   4 +-
 drivers/gpu/drm/i915/gt/intel_gt.c            | 303 ++---------
 drivers/gpu/drm/i915/gt/intel_gt.h            |  15 -
 drivers/gpu/drm/i915/gt/intel_gt_debugfs.c    |   3 +-
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 506 ++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h        |  34 ++
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 153 +++---
 drivers/gpu/drm/i915/gt/intel_gt_types.h      |   1 +
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  44 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h           |   2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   6 +-
 drivers/gpu/drm/i915/gt/intel_mocs.c          |  12 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c   |   3 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 473 ++++++++--------
 .../gpu/drm/i915/gt/selftest_workarounds.c    |   2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  61 ++-
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |   8 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c     |  12 +-
 drivers/gpu/drm/i915/gvt/cmd_parser.c         |   2 +-
 drivers/gpu/drm/i915/gvt/handlers.c           |  19 +-
 drivers/gpu/drm/i915/gvt/mmio_context.c       |  16 +-
 drivers/gpu/drm/i915/i915_perf.c              |   2 +-
 drivers/gpu/drm/i915/i915_reg.h               |   6 -
 drivers/gpu/drm/i915/i915_reg_defs.h          |   9 +
 drivers/gpu/drm/i915/intel_pm.c               |  20 +-
 drivers/gpu/drm/i915/intel_uncore.c           | 112 ----
 drivers/gpu/drm/i915/intel_uncore.h           |   8 -
 30 files changed, 1059 insertions(+), 816 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_mcr.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_mcr.h

-- 
2.34.1



* [PATCH 01/15] drm/i915/gen8: Create separate reg definitions for new MCR registers
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 02/15] drm/i915/xehp: " Matt Roper
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Gen8 was the first time our hardware had multicast registers (or at
least the first time the multicast nature was exposed and MMIO accesses
could be steered).  Some registers changed from singleton behavior to
multicast during the gen7 -> gen8 transition; let's duplicate the
definitions for those registers in preparation for upcoming patches that
will handle MCR registers in a special manner.

The registers adjusted are:
 * MISCCPCTL
 * SAMPLER_INSTDONE
 * ROW_INSTDONE
 * ROW_CHICKEN2
 * HALF_SLICE_CHICKEN1
 * HALF_SLICE_CHICKEN3

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  8 +++----
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 11 +++++++++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 22 +++++++++----------
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 ++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c     |  2 +-
 drivers/gpu/drm/i915/gvt/handlers.c           |  2 +-
 drivers/gpu/drm/i915/gvt/mmio_context.c       |  2 +-
 drivers/gpu/drm/i915/intel_pm.c               | 10 ++++-----
 8 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 98b61ff13c95..ad9e7e55ce17 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1367,19 +1367,19 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
 			for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
 				instdone->sampler[slice][subslice] =
 					read_subslice_reg(engine, slice, subslice,
-							  GEN7_SAMPLER_INSTDONE);
+							  GEN8_SAMPLER_INSTDONE);
 				instdone->row[slice][subslice] =
 					read_subslice_reg(engine, slice, subslice,
-							  GEN7_ROW_INSTDONE);
+							  GEN8_ROW_INSTDONE);
 			}
 		} else {
 			for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
 				instdone->sampler[slice][subslice] =
 					read_subslice_reg(engine, slice, subslice,
-							  GEN7_SAMPLER_INSTDONE);
+							  GEN8_SAMPLER_INSTDONE);
 				instdone->row[slice][subslice] =
 					read_subslice_reg(engine, slice, subslice,
-							  GEN7_ROW_INSTDONE);
+							  GEN8_ROW_INSTDONE);
 			}
 		}
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 17432b075d97..08309745d461 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -622,6 +622,9 @@
 
 #define GEN7_MISCCPCTL				_MMIO(0x9424)
 #define   GEN7_DOP_CLOCK_GATE_ENABLE		(1 << 0)
+
+#define GEN8_MISCCPCTL				_MMIO(0x9424)
+#define   GEN8_DOP_CLOCK_GATE_ENABLE		(1 << 0)
 #define   GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE	(1 << 2)
 #define   GEN8_DOP_CLOCK_GATE_GUC_ENABLE	(1 << 4)
 #define   GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE	(1 << 6)
@@ -1009,18 +1012,22 @@
 #define GEN12_GAM_DONE				_MMIO(0xcf68)
 
 #define GEN7_HALF_SLICE_CHICKEN1		_MMIO(0xe100) /* IVB GT1 + VLV */
+#define GEN8_HALF_SLICE_CHICKEN1		_MMIO(0xe100)
 #define   GEN7_MAX_PS_THREAD_DEP		(8 << 12)
 #define   GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE	(1 << 10)
 #define   GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE	(1 << 4)
 #define   GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE	(1 << 3)
 
 #define GEN7_SAMPLER_INSTDONE			_MMIO(0xe160)
+#define GEN8_SAMPLER_INSTDONE			_MMIO(0xe160)
 #define GEN7_ROW_INSTDONE			_MMIO(0xe164)
+#define GEN8_ROW_INSTDONE			_MMIO(0xe164)
 
 #define HALF_SLICE_CHICKEN2			_MMIO(0xe180)
 #define   GEN8_ST_PO_DISABLE			(1 << 13)
 
-#define HALF_SLICE_CHICKEN3			_MMIO(0xe184)
+#define HSW_HALF_SLICE_CHICKEN3			_MMIO(0xe184)
+#define GEN8_HALF_SLICE_CHICKEN3		_MMIO(0xe184)
 #define   HSW_SAMPLE_C_PERFORMANCE		(1 << 9)
 #define   GEN8_CENTROID_PIXEL_OPT_DIS		(1 << 8)
 #define   GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC	(1 << 5)
@@ -1068,6 +1075,8 @@
 #define   DISABLE_EARLY_EOT			REG_BIT(1)
 
 #define GEN7_ROW_CHICKEN2			_MMIO(0xe4f4)
+
+#define GEN8_ROW_CHICKEN2			_MMIO(0xe4f4)
 #define   GEN12_DISABLE_READ_SUPPRESSION	REG_BIT(15)
 #define   GEN12_DISABLE_EARLY_READ		REG_BIT(14)
 #define   GEN12_ENABLE_LARGE_GRF_MODE		REG_BIT(12)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 29c8cd0a81b6..608ed833307f 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -294,10 +294,10 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine,
 	 * Also see the related UCGTCL1 write in bdw_init_clock_gating()
 	 * to disable EUTC clock gating.
 	 */
-	wa_masked_en(wal, GEN7_ROW_CHICKEN2,
+	wa_masked_en(wal, GEN8_ROW_CHICKEN2,
 		     DOP_CLOCK_GATING_DISABLE);
 
-	wa_masked_en(wal, HALF_SLICE_CHICKEN3,
+	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
 		     GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	wa_masked_en(wal, HDC_CHICKEN0,
@@ -385,7 +385,7 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 	    IS_KABYLAKE(i915) ||
 	    IS_COFFEELAKE(i915) ||
 	    IS_COMETLAKE(i915))
-		wa_masked_en(wal, HALF_SLICE_CHICKEN3,
+		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
 			     GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
@@ -489,7 +489,7 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1,
+	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
 		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
@@ -513,7 +513,7 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1,
+	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
 		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
@@ -2046,7 +2046,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14013392000:dg2_g11 */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
 
 		/* Wa_16011620976:dg2_g11 */
 		wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8);
@@ -2088,7 +2088,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			    DISABLE_128B_EVICTION_COMMAND_UDW);
 
 		/* Wa_22012856258:dg2 */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2,
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_DISABLE_READ_SUPPRESSION);
 
 		/*
@@ -2184,7 +2184,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
 
 		/*
 		 * Wa_1407928979:tgl A*
@@ -2209,7 +2209,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	    IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
-		wa_masked_en(wal, GEN7_ROW_CHICKEN2,
+		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_PUSH_CONST_DEREF_HOLD_DIS);
 
 		/*
@@ -2376,7 +2376,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_HASWELL(i915)) {
 		/* WaSampleCChickenBitEnable:hsw */
 		wa_masked_en(wal,
-			     HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE);
+			     HSW_HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE);
 
 		wa_masked_dis(wal,
 			      CACHE_MODE_0_GEN7,
@@ -2608,7 +2608,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 		wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 
 		/* Wa_14010449647:xehpsdv */
-		wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1,
+		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
 			     GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
 
 		/* Wa_18011725039:xehpsdv */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
index c4e25966d3e9..7f77e9cdaba4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
@@ -243,8 +243,8 @@ struct __ext_steer_reg {
 };
 
 static const struct __ext_steer_reg xe_extregs[] = {
-	{"GEN7_SAMPLER_INSTDONE", GEN7_SAMPLER_INSTDONE},
-	{"GEN7_ROW_INSTDONE", GEN7_ROW_INSTDONE}
+	{"GEN8_SAMPLER_INSTDONE", GEN8_SAMPLER_INSTDONE},
+	{"GEN8_ROW_INSTDONE", GEN8_ROW_INSTDONE}
 };
 
 static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index a0372735cddb..9229243992c2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -35,7 +35,7 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
 
 	if (GRAPHICS_VER(uncore->i915) == 9) {
 		/* DOP Clock Gating Enable for GuC clocks */
-		intel_uncore_rmw(uncore, GEN7_MISCCPCTL,
+		intel_uncore_rmw(uncore, GEN8_MISCCPCTL,
 				 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE);
 
 		/* allows for 5us (in 10ns units) before GT can go to RC6 */
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index 0ee3ecc83234..bad1065a99a7 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -2279,7 +2279,7 @@ static int init_generic_mmio_info(struct intel_gvt *gvt)
 	MMIO_DFH(_MMIO(0x2438), D_ALL, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x243c), D_ALL, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x7018), D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH(HSW_HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(GEN7_HALF_SLICE_CHICKEN1, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 
 	/* display */
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index c85bafe7539e..4be07d627941 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -111,7 +111,7 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
 	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
 	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
 	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
-	{RCS0, HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
+	{RCS0, HSW_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
 	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
 	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
 	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 641616135955..43a2c95602c0 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7416,8 +7416,8 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	u32 val;
 
 	/* WaTempDisableDOPClkGating:bdw */
-	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL);
-	intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL);
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
 	val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1);
 	val &= ~L3_PRIO_CREDITS_MASK;
@@ -7431,7 +7431,7 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	 */
 	intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1);
 	udelay(1);
-	intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, misccpctl);
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl);
 }
 
 static void icl_init_clock_gating(struct drm_i915_private *dev_priv)
@@ -7579,8 +7579,8 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 	gen9_init_clock_gating(dev_priv);
 
 	/* WaDisableDopClockGating:skl */
-	intel_uncore_write(&dev_priv->uncore, GEN7_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL) &
-		   ~GEN7_DOP_CLOCK_GATE_ENABLE);
+	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &
+		   ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
 	/* WAC6entrylatency:skl */
 	intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) |
-- 
2.34.1



* [PATCH 02/15] drm/i915/xehp: Create separate reg definitions for new MCR registers
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
  2022-03-30 23:28 ` [PATCH 01/15] drm/i915/gen8: Create separate reg definitions for new MCR registers Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 03/15] drm/i915/gt: Drop a few unused register definitions Matt Roper
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Starting in Xe_HP, several registers our driver works with have been
converted from singleton registers into replicated registers with
multicast behavior.  Although the registers are still located at the
same MMIO offsets as on previous platforms, let's duplicate the register
definitions in preparation for upcoming patches that will handle
multicast registers in a special manner.

The registers that are now replicated on Xe_HP are:
 * PAT_INDEX (mslice replication)
 * FF_MODE2 (gslice replication)
 * COMMON_SLICE_CHICKEN3 (gslice replication)
 * SLICE_COMMON_ECO_CHICKEN1 (gslice replication)
 * SLICE_UNIT_LEVEL_CLKGATE (gslice replication)
 * LNCFCMOCS (lncf replication)

The *_TLB_INV_CR registers are also replicated (mslice replication), but
I'm skipping those for now because I think that code might need more
work in general for multicast behavior (e.g., do we need to wait for
the invalidation to report as completed on every mslice?).
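
The pattern used throughout this patch (same offset, name chosen by IP
version) can be sketched in isolation as follows.  This is an
illustration under assumptions: IP_VER here is a local stand-in for the
driver macro, and reg_kind, reg_choice, lncfcmocs_for and the *_OFFSET
macros are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define IP_VER(ver, rel)	((ver) << 8 | (rel))	/* local stand-in */

/* Same MMIO offset, two names: the name records the semantics. */
#define GEN9_LNCFCMOCS_OFFSET(i)	(0xb020 + (i) * 4)
#define XEHP_LNCFCMOCS_OFFSET(i)	(0xb020 + (i) * 4)

enum reg_kind { REG_SINGLETON, REG_MULTICAST };

struct reg_choice {
	uint32_t offset;
	enum reg_kind kind;
};

/* Pick the definition that matches the platform's IP version. */
static inline struct reg_choice lncfcmocs_for(unsigned int ver_full,
					      unsigned int i)
{
	if (ver_full >= IP_VER(12, 50))
		return (struct reg_choice){ XEHP_LNCFCMOCS_OFFSET(i),
					    REG_MULTICAST };

	return (struct reg_choice){ GEN9_LNCFCMOCS_OFFSET(i), REG_SINGLETON };
}
```

Both branches produce the same offset today; the duplication only pays
off once the two names can carry different types or handlers.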

Bspec: 66534
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     | 18 ++++++++-----
 drivers/gpu/drm/i915/gt/intel_gtt.c         | 29 ++++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_mocs.c        |  5 +++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 24 ++++++++---------
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  |  7 +++--
 5 files changed, 52 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 08309745d461..e98e04b4a7a8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -322,6 +322,7 @@
 #define GEN7_TLB_RD_ADDR			_MMIO(0x4700)
 
 #define GEN12_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
+#define XEHP_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
 
 #define XEHPSDV_FLAT_CCS_BASE_ADDR		_MMIO(0x4910)
 #define   XEHPSDV_CCS_BASE_SHIFT		8
@@ -371,7 +372,8 @@
 #define   DIS_OVER_FETCH_CACHE			REG_BIT(1)
 #define   DIS_MULT_MISS_RD_SQUASH		REG_BIT(0)
 
-#define FF_MODE2				_MMIO(0x6604)
+#define GEN12_FF_MODE2				_MMIO(0x6604)
+#define XEHP_FF_MODE2				_MMIO(0x6604)
 #define   FF_MODE2_GS_TIMER_MASK		REG_GENMASK(31, 24)
 #define   FF_MODE2_GS_TIMER_224			REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224)
 #define   FF_MODE2_TDS_TIMER_MASK		REG_GENMASK(23, 16)
@@ -426,6 +428,7 @@
 #define GEN8_HDC_CHICKEN1			_MMIO(0x7304)
 
 #define GEN11_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
+#define XEHP_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
 #define   DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN	REG_BIT(12)
 #define   XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE	REG_BIT(12)
 #define   GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC	REG_BIT(11)
@@ -439,10 +442,9 @@
 #define   DISABLE_PIXEL_MASK_CAMMING		(1 << 14)
 
 #define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
-#define   GEN11_STATE_CACHE_REDIRECT_TO_CS	(1 << 11)
-
-#define SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
+#define XEHP_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)
+#define   GEN11_STATE_CACHE_REDIRECT_TO_CS	(1 << 11)
 
 #define GEN9_SLICE_PGCTL_ACK(slice)		_MMIO(0x804c + (slice) * 0x4)
 #define GEN10_SLICE_PGCTL_ACK(slice)		_MMIO(0x804c + ((slice) / 3) * 0x34 + \
@@ -677,7 +679,8 @@
 #define   GAMTLBVEBOX0_CLKGATE_DIS		REG_BIT(16)
 #define   LTCDD_CLKGATE_DIS			REG_BIT(10)
 
-#define SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
+#define GEN11_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
+#define XEHP_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
 #define   SARBUNIT_CLKGATE_DIS			(1 << 5)
 #define   RCCUNIT_CLKGATE_DIS			(1 << 7)
 #define   MSCUNIT_CLKGATE_DIS			(1 << 10)
@@ -692,7 +695,7 @@
 #define   VSUNIT_CLKGATE_DIS_TGL		REG_BIT(19)
 #define   PSDUNIT_CLKGATE_DIS			REG_BIT(5)
 
-#define SUBSLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x9524)
+#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE	_MMIO(0x9524)
 #define   DSS_ROUTER_CLKGATE_DIS		REG_BIT(28)
 #define   GWUNIT_CLKGATE_DIS			REG_BIT(16)
 
@@ -892,7 +895,8 @@
 
 /* MOCS (Memory Object Control State) registers */
 #define GEN9_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
-#define GEN9_LNCFCMOCS_REG_COUNT		32
+#define XEHP_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
+#define LNCFCMOCS_REG_COUNT			32
 
 #define GEN7_L3CNTLREG3				_MMIO(0xb024)
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index b67831833c9a..601d89b4feb1 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -479,15 +479,26 @@ void gtt_write_workarounds(struct intel_gt *gt)
 
 static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 {
-	/* TGL doesn't support LLC or AGE settings */
-	intel_uncore_write(uncore, GEN12_PAT_INDEX(0), GEN8_PPAT_WB);
-	intel_uncore_write(uncore, GEN12_PAT_INDEX(1), GEN8_PPAT_WC);
-	intel_uncore_write(uncore, GEN12_PAT_INDEX(2), GEN8_PPAT_WT);
-	intel_uncore_write(uncore, GEN12_PAT_INDEX(3), GEN8_PPAT_UC);
-	intel_uncore_write(uncore, GEN12_PAT_INDEX(4), GEN8_PPAT_WB);
-	intel_uncore_write(uncore, GEN12_PAT_INDEX(5), GEN8_PPAT_WB);
-	intel_uncore_write(uncore, GEN12_PAT_INDEX(6), GEN8_PPAT_WB);
-	intel_uncore_write(uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB);
+	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50)) {
+		intel_uncore_write(uncore, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
+		intel_uncore_write(uncore, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
+		intel_uncore_write(uncore, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
+		intel_uncore_write(uncore, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
+		intel_uncore_write(uncore, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
+		intel_uncore_write(uncore, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
+		intel_uncore_write(uncore, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
+		intel_uncore_write(uncore, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
+	} else {
+		/* TGL doesn't support LLC or AGE settings */
+		intel_uncore_write(uncore, GEN12_PAT_INDEX(0), GEN8_PPAT_WB);
+		intel_uncore_write(uncore, GEN12_PAT_INDEX(1), GEN8_PPAT_WC);
+		intel_uncore_write(uncore, GEN12_PAT_INDEX(2), GEN8_PPAT_WT);
+		intel_uncore_write(uncore, GEN12_PAT_INDEX(3), GEN8_PPAT_UC);
+		intel_uncore_write(uncore, GEN12_PAT_INDEX(4), GEN8_PPAT_WB);
+		intel_uncore_write(uncore, GEN12_PAT_INDEX(5), GEN8_PPAT_WB);
+		intel_uncore_write(uncore, GEN12_PAT_INDEX(6), GEN8_PPAT_WB);
+		intel_uncore_write(uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB);
+	}
 }
 
 static void icl_setup_private_ppat(struct intel_uncore *uncore)
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index c4c37585ae8c..c14c0dab0164 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -588,7 +588,10 @@ static void init_l3cc_table(struct intel_uncore *uncore,
 	u32 l3cc;
 
 	for_each_l3cc(l3cc, table, i)
-		intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
+		if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50))
+			intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc);
+		else
+			intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
 }
 
 void intel_mocs_init_engine(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 608ed833307f..27807bc70610 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -570,7 +570,7 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 	wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
 			 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
 	wa_add(wal,
-	       FF_MODE2,
+	       XEHP_FF_MODE2,
 	       FF_MODE2_TDS_TIMER_MASK,
 	       FF_MODE2_TDS_TIMER_128,
 	       0, false);
@@ -597,7 +597,7 @@ static void gen12_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 	 * verification is ignored.
 	 */
 	wa_add(wal,
-	       FF_MODE2,
+	       GEN12_FF_MODE2,
 	       FF_MODE2_TDS_TIMER_MASK,
 	       FF_MODE2_TDS_TIMER_128,
 	       0, false);
@@ -635,7 +635,7 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine,
 	 * to Wa_1608008084.
 	 */
 	wa_add(wal,
-	       FF_MODE2,
+	       GEN12_FF_MODE2,
 	       FF_MODE2_GS_TIMER_MASK,
 	       FF_MODE2_GS_TIMER_224,
 	       0, false);
@@ -668,7 +668,7 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_14010469329:dg2_g10 */
-		wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
+		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
 			     XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
 
 		/*
@@ -676,12 +676,12 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 		 * Wa_22010613112:dg2_g10
 		 * Wa_14010698770:dg2_g10
 		 */
-		wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
+		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
 			     GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
 	}
 
 	/* Wa_16013271637:dg2 */
-	wa_masked_en(wal, SLICE_COMMON_ECO_CHICKEN1,
+	wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1,
 		     MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
 
 	/* Wa_14014947963:dg2 */
@@ -1237,14 +1237,14 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 
 	/* Wa_1406680159:icl,ehl */
 	wa_write_or(wal,
-		    SUBSLICE_UNIT_LEVEL_CLKGATE,
+		    GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
 		    GWUNIT_CLKGATE_DIS);
 
 	/* Wa_1607087056:icl,ehl,jsl */
 	if (IS_ICELAKE(i915) ||
 	    IS_JSL_EHL_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
 		wa_write_or(wal,
-			    SLICE_UNIT_LEVEL_CLKGATE,
+			    GEN11_SLICE_UNIT_LEVEL_CLKGATE,
 			    L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
 
 	/*
@@ -1304,7 +1304,7 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	/* Wa_1607087056:tgl also know as BUG:1409180338 */
 	if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
 		wa_write_or(wal,
-			    SLICE_UNIT_LEVEL_CLKGATE,
+			    GEN11_SLICE_UNIT_LEVEL_CLKGATE,
 			    L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
 
 	/* Wa_1408615072:tgl[a0] */
@@ -1323,7 +1323,7 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	/* Wa_1607087056:dg1 */
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
 		wa_write_or(wal,
-			    SLICE_UNIT_LEVEL_CLKGATE,
+			    GEN11_SLICE_UNIT_LEVEL_CLKGATE,
 			    L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
 
 	/* Wa_1409420604:dg1 */
@@ -1427,7 +1427,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 			    CG3DDISCFEG_CLKGATE_DIS);
 
 		/* Wa_14011006942:dg2 */
-		wa_write_or(wal, SUBSLICE_UNIT_LEVEL_CLKGATE,
+		wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
 			    DSS_ROUTER_CLKGATE_DIS);
 	}
 
@@ -1439,7 +1439,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 		wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS);
 
 		/* Wa_14011371254:dg2_g10 */
-		wa_write_or(wal, SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
+		wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
 
 		/* Wa_14011431319:dg2_g10 */
 		wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS |
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 17004bca4d24..e8a42d719f96 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -371,8 +371,11 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
 					false);
 
 	/* add in local MOCS registers */
-	for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++)
-		ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
+	for (i = 0; i < LNCFCMOCS_REG_COUNT; i++)
+		if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+			ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false);
+		else
+			ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
 
 	return ret ? -1 : 0;
 }
-- 
2.34.1



* [PATCH 03/15] drm/i915/gt: Drop a few unused register definitions
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
  2022-03-30 23:28 ` [PATCH 01/15] drm/i915/gen8: Create separate reg definitions for new MCR registers Matt Roper
  2022-03-30 23:28 ` [PATCH 02/15] drm/i915/xehp: " Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 04/15] drm/i915/gt: Correct prefix on a few registers Matt Roper
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Let's drop a few register definitions that are not used anywhere in the
driver today.  Since the referenced offsets are part of what is now
considered a multicast register region, the current definitions would
not be correct for use on any future platform.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index e98e04b4a7a8..9e236397397f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -434,13 +434,6 @@
 #define   GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC	REG_BIT(11)
 #define   GEN12_DISABLE_CPS_AWARE_COLOR_PIPE	REG_BIT(9)
 
-/* GEN9 chicken */
-#define SLICE_ECO_CHICKEN0			_MMIO(0x7308)
-#define   PIXEL_MASK_CAMMING_DISABLE		(1 << 14)
-
-#define GEN9_SLICE_COMMON_ECO_CHICKEN0		_MMIO(0x7308)
-#define   DISABLE_PIXEL_MASK_CAMMING		(1 << 14)
-
 #define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
 #define XEHP_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)
@@ -912,11 +905,6 @@
 #define GEN7_L3LOG(slice, i)			_MMIO(0xb070 + (slice) * 0x200 + (i) * 4)
 #define   GEN7_L3LOG_SIZE			0x80
 
-#define GEN10_SCRATCH_LNCF2			_MMIO(0xb0a0)
-#define   PMFLUSHDONE_LNICRSDROP		(1 << 20)
-#define   PMFLUSH_GAPL3UNBLOCK			(1 << 21)
-#define   PMFLUSHDONE_LNEBLK			(1 << 22)
-
 #define XEHP_L3NODEARBCFG			_MMIO(0xb0b4)
 #define   XEHP_LNESPARE				REG_BIT(19)
 
@@ -931,9 +919,6 @@
 #define   L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)
 #define   L3_PRIO_CREDITS_MASK			((0x1f << 19) | (0x1f << 14))
 
-#define GEN10_L3_CHICKEN_MODE_REGISTER		_MMIO(0xb114)
-#define   GEN11_I2M_WRITE_DISABLE		(1 << 28)
-
 #define GEN8_L3SQCREG4				_MMIO(0xb118)
 #define   GEN11_LQSC_CLEAN_EVICT_DISABLE	(1 << 6)
 #define   GEN8_LQSC_RO_PERF_DIS			(1 << 27)
@@ -1113,8 +1098,6 @@
 #define SARB_CHICKEN1				_MMIO(0xe90c)
 #define   COMP_CKN_IN				REG_GENMASK(30, 29)
 
-#define GEN7_HALF_SLICE_CHICKEN1_GT2		_MMIO(0xf100)
-
 #define GEN7_ROW_CHICKEN2_GT2			_MMIO(0xf4f4)
 #define   DOP_CLOCK_GATING_DISABLE		(1 << 0)
 #define   PUSH_CONSTANT_DEREF_DISABLE		(1 << 8)
-- 
2.34.1



* [PATCH 04/15] drm/i915/gt: Correct prefix on a few registers
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (2 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 03/15] drm/i915/gt: Drop a few unused register definitions Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 05/15] drm/i915/xehp: Check for faults on all mslices Matt Roper
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

We have a few registers that have existed for several hardware
generations, but are only used by the driver on Xe_HP and beyond.  In
cases where the Xe_HP version of the register is now replicated and uses
multicast behavior, but earlier generations were singleton, let's change
the register prefix to "XEHP_" to help clarify that we're using the
newer multicast form of the register.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  8 ++++----
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 10 +++++-----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 9e236397397f..0f05bbda773e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -466,7 +466,7 @@
 
 #define GEN8_RC6_CTX_INFO			_MMIO(0x8504)
 
-#define GEN12_SQCM				_MMIO(0x8724)
+#define XEHP_SQCM				_MMIO(0x8724)
 #define   EN_32B_ACCESS				REG_BIT(30)
 
 #define HSW_IDICR				_MMIO(0x9008)
@@ -934,7 +934,7 @@
 #define GEN11_SCRATCH2				_MMIO(0xb140)
 #define   GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE	(1 << 19)
 
-#define GEN11_L3SQCREG5				_MMIO(0xb158)
+#define XEHP_L3SQCREG5				_MMIO(0xb158)
 #define   L3_PWM_TIMER_INIT_VAL_MASK		REG_GENMASK(9, 0)
 
 #define MLTICTXCTL				_MMIO(0xb170)
@@ -982,7 +982,7 @@
 #define GEN12_VE_TLB_INV_CR			_MMIO(0xcee0)
 #define GEN12_BLT_TLB_INV_CR			_MMIO(0xcee4)
 
-#define GEN12_MERT_MOD_CTRL			_MMIO(0xcf28)
+#define XEHP_MERT_MOD_CTRL			_MMIO(0xcf28)
 #define RENDER_MOD_CTRL				_MMIO(0xcf2c)
 #define COMP_MOD_CTRL				_MMIO(0xcf30)
 #define VDBX_MOD_CTRL				_MMIO(0xcf34)
@@ -1077,7 +1077,7 @@
 #define EU_PERF_CNTL1				_MMIO(0xe558)
 #define EU_PERF_CNTL5				_MMIO(0xe55c)
 
-#define GEN12_HDC_CHICKEN0			_MMIO(0xe5f0)
+#define XEHP_HDC_CHICKEN0			_MMIO(0xe5f0)
 #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
 #define ICL_HDC_MODE				_MMIO(0xe5f4)
 
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 27807bc70610..544097c56619 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -567,7 +567,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 				   struct i915_wa_list *wal)
 {
-	wa_write_clr_set(wal, GEN11_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
+	wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
 			 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
 	wa_add(wal,
 	       XEHP_FF_MODE2,
@@ -1486,7 +1486,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	 * recommended tuning settings documented in the bspec's
 	 * performance guide section.
 	 */
-	wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS);
+	wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS);
 }
 
 static void
@@ -2095,7 +2095,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_22010960976:dg2
 		 * Wa_14013347512:dg2
 		 */
-		wa_masked_dis(wal, GEN12_HDC_CHICKEN0,
+		wa_masked_dis(wal, XEHP_HDC_CHICKEN0,
 			      LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK);
 	}
 
@@ -2157,7 +2157,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) ||
 	    IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14012362059:dg2 */
-		wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 	}
 
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
@@ -2618,7 +2618,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 		}
 
 		/* Wa_14012362059:xehpsdv */
-		wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 
 		/* Wa_14014368820:xehpsdv */
 		wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS |
-- 
2.34.1



* [PATCH 05/15] drm/i915/xehp: Check for faults on all mslices
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (3 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 04/15] drm/i915/gt: Correct prefix on a few registers Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 06/15] drm/i915: Drop duplicated definition of XEHPSDV_FLAT_CCS_BASE_ADDR Matt Roper
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

The fault registers are multicast registers, replicated per-mslice
starting on Xe_HP.  When checking for faults, we should check each
mslice's instance of the register rather than just one of the instances.
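
As a rough illustration of the per-instance check described above, the
sketch below walks a set of simulated mslice register instances and
reassembles the faulting address the same way xehp_check_faults() does in
this patch.  The fake_mslice structure, decode_fault_addr(),
count_faults() and any sample values are hypothetical stand-ins for
steered MMIO reads, not driver code.  FAULT_VA_HIGH_BITS mirrors the
register header; RING_FAULT_VALID is assumed here to be bit 0.

```c
#include <stdint.h>
#include <stdio.h>

#define RING_FAULT_VALID	(1u << 0)	/* assumption: valid flag is bit 0 */
#define FAULT_VA_HIGH_BITS	(0xfu << 0)	/* as in intel_gt_regs.h */

/* Hypothetical snapshot of one mslice's instance of the fault registers. */
struct fake_mslice {
	uint32_t fault;	/* stand-in for XEHP_RING_FAULT_REG */
	uint32_t data0;	/* stand-in for XEHP_FAULT_TLB_DATA0 */
	uint32_t data1;	/* stand-in for XEHP_FAULT_TLB_DATA1 */
};

/*
 * DATA0 supplies address bits 43:12 and the low nibble of DATA1 supplies
 * bits 47:44, matching the shifts used in the patch.
 */
static uint64_t decode_fault_addr(uint32_t data0, uint32_t data1)
{
	return ((uint64_t)(data1 & FAULT_VA_HIGH_BITS) << 44) |
	       ((uint64_t)data0 << 12);
}

/* Check every instance instead of stopping at the first one. */
static int count_faults(const struct fake_mslice *ms, int num_mslices)
{
	int mslice, found = 0;

	for (mslice = 0; mslice < num_mslices; mslice++) {
		uint64_t addr;

		if (!(ms[mslice].fault & RING_FAULT_VALID))
			continue;

		addr = decode_fault_addr(ms[mslice].data0, ms[mslice].data1);
		printf("mslice %d: fault at 0x%08x_%08x\n", mslice,
		       (unsigned int)(addr >> 32), (unsigned int)addr);
		found++;
	}

	return found;
}
```

Calling count_faults() on an array of four such snapshots reports each
instance whose valid bit is set, which is the behavior a single unsteered
read cannot provide.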

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c      | 44 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_gt_regs.h |  3 ++
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 5001a6168d56..1992325c2895 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -350,6 +350,46 @@ static void gen6_check_faults(struct intel_gt *gt)
 	}
 }
 
+static void xehp_check_faults(struct intel_gt *gt)
+{
+	struct intel_uncore *uncore = gt->uncore;
+	u32 fault;
+	int mslice;
+
+	/* Check each mslice's fault register */
+	for (mslice = 0; mslice < 4; mslice++) {
+		fault = intel_uncore_read_with_mcr_steering(uncore,
+							    XEHP_RING_FAULT_REG,
+							    mslice, 0);
+		if (fault & RING_FAULT_VALID) {
+			u32 fault_data0, fault_data1;
+			u64 fault_addr;
+
+			fault_data0 = intel_uncore_read_with_mcr_steering(uncore,
+									  XEHP_FAULT_TLB_DATA0,
+									  mslice, 0);
+			fault_data1 = intel_uncore_read_with_mcr_steering(uncore,
+									  XEHP_FAULT_TLB_DATA1,
+									  mslice, 0);
+
+			fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) |
+				     ((u64)fault_data0 << 12);
+
+			drm_dbg(&uncore->i915->drm, "Unexpected fault\n"
+				"\tAddr: 0x%08x_%08x\n"
+				"\tAddress space: %s\n"
+				"\tEngine ID: %d\n"
+				"\tSource ID: %d\n"
+				"\tType: %d\n",
+				upper_32_bits(fault_addr), lower_32_bits(fault_addr),
+				fault_data1 & FAULT_GTT_SEL ? "GGTT" : "PPGTT",
+				GEN8_RING_FAULT_ENGINE_ID(fault),
+				RING_FAULT_SRCID(fault),
+				RING_FAULT_FAULT_TYPE(fault));
+		}
+	}
+}
+
 static void gen8_check_faults(struct intel_gt *gt)
 {
 	struct intel_uncore *uncore = gt->uncore;
@@ -396,7 +436,9 @@ void intel_gt_check_and_clear_faults(struct intel_gt *gt)
 	struct drm_i915_private *i915 = gt->i915;
 
 	/* From GEN8 onwards we only have one 'All Engine Fault Register' */
-	if (GRAPHICS_VER(i915) >= 8)
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+		xehp_check_faults(gt);
+	else if (GRAPHICS_VER(i915) >= 8)
 		gen8_check_faults(gt);
 	else if (GRAPHICS_VER(i915) >= 6)
 		gen6_check_faults(gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 0f05bbda773e..a060de66126a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -966,11 +966,14 @@
 #define GEN9_BLT_MOCS(i)			_MMIO(__GEN9_BCS0_MOCS0 + (i) * 4)
 
 #define GEN12_FAULT_TLB_DATA0			_MMIO(0xceb8)
+#define XEHP_FAULT_TLB_DATA0			_MMIO(0xceb8)
 #define GEN12_FAULT_TLB_DATA1			_MMIO(0xcebc)
+#define XEHP_FAULT_TLB_DATA1			_MMIO(0xcebc)
 #define   FAULT_VA_HIGH_BITS			(0xf << 0)
 #define   FAULT_GTT_SEL				(1 << 4)
 
 #define GEN12_RING_FAULT_REG			_MMIO(0xcec4)
+#define XEHP_RING_FAULT_REG			_MMIO(0xcec4)
 #define   GEN8_RING_FAULT_ENGINE_ID(x)		(((x) >> 12) & 0x7)
 #define   RING_FAULT_GTTSEL_MASK		(1 << 11)
 #define   RING_FAULT_SRCID(x)			(((x) >> 3) & 0xff)
-- 
2.34.1



* [PATCH 06/15] drm/i915: Drop duplicated definition of XEHPSDV_FLAT_CCS_BASE_ADDR
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (4 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 05/15] drm/i915/xehp: Check for faults on all mslices Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 07/15] drm/i915: Move XEHPSDV_TILE0_ADDR_RANGE to GT register header Matt Roper
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

When this register was moved to intel_gt_regs.h it wasn't dropped from
i915_reg.h; do so now.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index efb81cb4c7c0..062e11289aa0 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8494,9 +8494,6 @@ enum skl_power_gate {
 #define XEHPSDV_TILE0_ADDR_RANGE	_MMIO(0x4900)
 #define   XEHPSDV_TILE_LMEM_RANGE_SHIFT  8
 
-#define XEHPSDV_FLAT_CCS_BASE_ADDR	_MMIO(0x4910)
-#define   XEHPSDV_CCS_BASE_SHIFT	8
-
 /* gamt regs */
 #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
 #define   GEN8_L3_LRA_1_GPGPU_DEFAULT_VALUE_BDW  0x67F1427F /* max/min for LRA1/2 */
-- 
2.34.1



* [PATCH 07/15] drm/i915: Move XEHPSDV_TILE0_ADDR_RANGE to GT register header
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (5 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 06/15] drm/i915: Drop duplicated definition of XEHPSDV_FLAT_CCS_BASE_ADDR Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 08/15] drm/i915: Define MCR registers explicitly Matt Roper
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

XEHPSDV_TILE0_ADDR_RANGE is a GT register and requires multicast
handling.  Move the definition to the proper header.

Fixes: b8ca8fef58d4 ("drm/i915/stolen: don't treat small BAR as an error")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 1 +
 drivers/gpu/drm/i915/gt/intel_gt_regs.h    | 3 +++
 drivers/gpu/drm/i915/i915_reg.h            | 3 ---
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 47b5e0e342ab..a10d857dfd9b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -13,6 +13,7 @@
 #include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_regs.h"
 #include "gt/intel_region_lmem.h"
 #include "i915_drv.h"
 #include "i915_gem_stolen.h"
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index a060de66126a..5ea4e2fb8eb4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -324,6 +324,9 @@
 #define GEN12_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
 #define XEHP_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
 
+#define XEHPSDV_TILE0_ADDR_RANGE		_MMIO(0x4900)
+#define   XEHPSDV_TILE_LMEM_RANGE_SHIFT		8
+
 #define XEHPSDV_FLAT_CCS_BASE_ADDR		_MMIO(0x4910)
 #define   XEHPSDV_CCS_BASE_SHIFT		8
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 062e11289aa0..b0742b7f4201 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8491,9 +8491,6 @@ enum skl_power_gate {
 #define   SGGI_DIS			REG_BIT(15)
 #define   SGR_DIS			REG_BIT(13)
 
-#define XEHPSDV_TILE0_ADDR_RANGE	_MMIO(0x4900)
-#define   XEHPSDV_TILE_LMEM_RANGE_SHIFT  8
-
 /* gamt regs */
 #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
 #define   GEN8_L3_LRA_1_GPGPU_DEFAULT_VALUE_BDW  0x67F1427F /* max/min for LRA1/2 */
-- 
2.34.1



* [PATCH 08/15] drm/i915: Define MCR registers explicitly
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (6 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 07/15] drm/i915: Move XEHPSDV_TILE0_ADDR_RANGE to GT register header Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 09/15] drm/i915/gt: Move multicast register handling to a dedicated file Matt Roper
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Rather than defining MCR registers with the same _MMIO() macro that we
use for singleton registers, let's use a new MCR_REG() macro to make it
clear that these registers are special and should be handled accordingly.
For now MCR_REG() will still generate an i915_reg_t with the given offset,
but we'll change that in future patches.
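
To show where this is heading, here is a minimal sketch of how a
dedicated macro could later yield a distinct type so that the compiler
rejects non-MCR accessors on MCR registers.  It is an assumption about
one possible shape of the follow-up work, not what this patch does:
i915_mcr_reg_t is the type name from the cover letter, while
reg_offset() and mcr_reg_offset() are hypothetical helpers invented for
the example.

```c
#include <stdint.h>

/* Ordinary (singleton) register, wrapped in a struct for type safety. */
typedef struct { uint32_t reg; } i915_reg_t;

/* Hypothetical distinct wrapper for multicast/replicated registers. */
typedef struct { uint32_t reg; } i915_mcr_reg_t;

#define _MMIO(r)	((const i915_reg_t){ .reg = (r) })
#define MCR_REG(offset)	((const i915_mcr_reg_t){ .reg = (offset) })

/* Hypothetical accessors: each accepts exactly one of the two types. */
static inline uint32_t reg_offset(i915_reg_t r)
{
	return r.reg;
}

static inline uint32_t mcr_reg_offset(i915_mcr_reg_t r)
{
	return r.reg;
}

/* Same offset, two definitions, two types. */
#define GEN12_FF_MODE2		_MMIO(0x6604)
#define XEHP_FF_MODE2		MCR_REG(0x6604)
#define XEHP_LNCFCMOCS(i)	MCR_REG(0xb020 + (i) * 4)
```

With definitions like these, mcr_reg_offset(XEHP_FF_MODE2) compiles, but
reg_offset(XEHP_FF_MODE2) is a type error, since the two struct types are
incompatible even though their layout is identical.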

Bspec: 66673, 66696, 66534, 67609
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h | 124 ++++++++++++------------
 1 file changed, 63 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 5ea4e2fb8eb4..3f5e01a48a17 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -8,6 +8,8 @@
 
 #include "i915_reg_defs.h"
 
+#define MCR_REG(offset)	_MMIO(offset)
+
 /* RPM unit config (Gen8+) */
 #define RPM_CONFIG0				_MMIO(0xd00)
 #define   GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT	3
@@ -322,12 +324,12 @@
 #define GEN7_TLB_RD_ADDR			_MMIO(0x4700)
 
 #define GEN12_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
-#define XEHP_PAT_INDEX(index)			_MMIO(0x4800 + (index) * 4)
+#define XEHP_PAT_INDEX(index)			MCR_REG(0x4800 + (index) * 4)
 
-#define XEHPSDV_TILE0_ADDR_RANGE		_MMIO(0x4900)
+#define XEHPSDV_TILE0_ADDR_RANGE		MCR_REG(0x4900)
 #define   XEHPSDV_TILE_LMEM_RANGE_SHIFT		8
 
-#define XEHPSDV_FLAT_CCS_BASE_ADDR		_MMIO(0x4910)
+#define XEHPSDV_FLAT_CCS_BASE_ADDR		MCR_REG(0x4910)
 #define   XEHPSDV_CCS_BASE_SHIFT		8
 
 #define GAMTARBMODE				_MMIO(0x4a08)
@@ -371,18 +373,18 @@
 #define GEN9_WM_CHICKEN3			_MMIO(0x5588)
 #define   GEN9_FACTOR_IN_CLR_VAL_HIZ		(1 << 9)
 
-#define VFLSKPD					_MMIO(0x62a8)
+#define VFLSKPD					MCR_REG(0x62a8)
 #define   DIS_OVER_FETCH_CACHE			REG_BIT(1)
 #define   DIS_MULT_MISS_RD_SQUASH		REG_BIT(0)
 
 #define GEN12_FF_MODE2				_MMIO(0x6604)
-#define XEHP_FF_MODE2				_MMIO(0x6604)
+#define XEHP_FF_MODE2				MCR_REG(0x6604)
 #define   FF_MODE2_GS_TIMER_MASK		REG_GENMASK(31, 24)
 #define   FF_MODE2_GS_TIMER_224			REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224)
 #define   FF_MODE2_TDS_TIMER_MASK		REG_GENMASK(23, 16)
 #define   FF_MODE2_TDS_TIMER_128		REG_FIELD_PREP(FF_MODE2_TDS_TIMER_MASK, 4)
 
-#define XEHPG_INSTDONE_GEOM_SVG			_MMIO(0x666c)
+#define XEHPG_INSTDONE_GEOM_SVG			MCR_REG(0x666c)
 
 #define CACHE_MODE_0_GEN7			_MMIO(0x7000) /* IVB+ */
 #define   RC_OP_FLUSH_ENABLE			(1 << 0)
@@ -431,14 +433,14 @@
 #define GEN8_HDC_CHICKEN1			_MMIO(0x7304)
 
 #define GEN11_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
-#define XEHP_COMMON_SLICE_CHICKEN3		_MMIO(0x7304)
+#define XEHP_COMMON_SLICE_CHICKEN3		MCR_REG(0x7304)
 #define   DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN	REG_BIT(12)
 #define   XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE	REG_BIT(12)
 #define   GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC	REG_BIT(11)
 #define   GEN12_DISABLE_CPS_AWARE_COLOR_PIPE	REG_BIT(9)
 
 #define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
-#define XEHP_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
+#define XEHP_SLICE_COMMON_ECO_CHICKEN1		MCR_REG(0x731c)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)
 #define   GEN11_STATE_CACHE_REDIRECT_TO_CS	(1 << 11)
 
@@ -469,7 +471,7 @@
 
 #define GEN8_RC6_CTX_INFO			_MMIO(0x8504)
 
-#define XEHP_SQCM				_MMIO(0x8724)
+#define XEHP_SQCM				MCR_REG(0x8724)
 #define   EN_32B_ACCESS				REG_BIT(30)
 
 #define HSW_IDICR				_MMIO(0x9008)
@@ -621,7 +623,7 @@
 #define GEN7_MISCCPCTL				_MMIO(0x9424)
 #define   GEN7_DOP_CLOCK_GATE_ENABLE		(1 << 0)
 
-#define GEN8_MISCCPCTL				_MMIO(0x9424)
+#define GEN8_MISCCPCTL				MCR_REG(0x9424)
 #define   GEN8_DOP_CLOCK_GATE_ENABLE		(1 << 0)
 #define   GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE	(1 << 2)
 #define   GEN8_DOP_CLOCK_GATE_GUC_ENABLE	(1 << 4)
@@ -676,7 +678,7 @@
 #define   LTCDD_CLKGATE_DIS			REG_BIT(10)
 
 #define GEN11_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
-#define XEHP_SLICE_UNIT_LEVEL_CLKGATE		_MMIO(0x94d4)
+#define XEHP_SLICE_UNIT_LEVEL_CLKGATE		MCR_REG(0x94d4)
 #define   SARBUNIT_CLKGATE_DIS			(1 << 5)
 #define   RCCUNIT_CLKGATE_DIS			(1 << 7)
 #define   MSCUNIT_CLKGATE_DIS			(1 << 10)
@@ -684,27 +686,27 @@
 #define   L3_CLKGATE_DIS			REG_BIT(16)
 #define   L3_CR2X_CLKGATE_DIS			REG_BIT(17)
 
-#define SCCGCTL94DC				_MMIO(0x94dc)
+#define SCCGCTL94DC				MCR_REG(0x94dc)
 #define   CG3DDISURB				REG_BIT(14)
 
 #define UNSLICE_UNIT_LEVEL_CLKGATE2		_MMIO(0x94e4)
 #define   VSUNIT_CLKGATE_DIS_TGL		REG_BIT(19)
 #define   PSDUNIT_CLKGATE_DIS			REG_BIT(5)
 
-#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE	_MMIO(0x9524)
+#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE	MCR_REG(0x9524)
 #define   DSS_ROUTER_CLKGATE_DIS		REG_BIT(28)
 #define   GWUNIT_CLKGATE_DIS			REG_BIT(16)
 
-#define SUBSLICE_UNIT_LEVEL_CLKGATE2		_MMIO(0x9528)
+#define SUBSLICE_UNIT_LEVEL_CLKGATE2		MCR_REG(0x9528)
 #define   CPSSUNIT_CLKGATE_DIS			REG_BIT(9)
 
-#define SSMCGCTL9530				_MMIO(0x9530)
+#define SSMCGCTL9530				MCR_REG(0x9530)
 #define   RTFUNIT_CLKGATE_DIS			REG_BIT(18)
 
-#define GEN10_DFR_RATIO_EN_AND_CHICKEN		_MMIO(0x9550)
+#define GEN10_DFR_RATIO_EN_AND_CHICKEN		MCR_REG(0x9550)
 #define   DFR_DISABLE				(1 << 9)
 
-#define INF_UNIT_LEVEL_CLKGATE			_MMIO(0x9560)
+#define INF_UNIT_LEVEL_CLKGATE			MCR_REG(0x9560)
 #define   CGPSF_CLKGATE_DIS			(1 << 3)
 
 #define MICRO_BP0_0				_MMIO(0x9800)
@@ -891,7 +893,7 @@
 
 /* MOCS (Memory Object Control State) registers */
 #define GEN9_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
-#define XEHP_LNCFCMOCS(i)			_MMIO(0xb020 + (i) * 4)	/* L3 Cache Control */
+#define XEHP_LNCFCMOCS(i)			MCR_REG(0xb020 + (i) * 4)	/* L3 Cache Control */
 #define LNCFCMOCS_REG_COUNT			32
 
 #define GEN7_L3CNTLREG3				_MMIO(0xb024)
@@ -908,10 +910,10 @@
 #define GEN7_L3LOG(slice, i)			_MMIO(0xb070 + (slice) * 0x200 + (i) * 4)
 #define   GEN7_L3LOG_SIZE			0x80
 
-#define XEHP_L3NODEARBCFG			_MMIO(0xb0b4)
+#define XEHP_L3NODEARBCFG			MCR_REG(0xb0b4)
 #define   XEHP_LNESPARE				REG_BIT(19)
 
-#define GEN8_L3SQCREG1				_MMIO(0xb100)
+#define GEN8_L3SQCREG1				MCR_REG(0xb100)
 /*
  * Note that on CHV the following has an off-by-one error wrt. to BSpec.
  * Using the formula in BSpec leads to a hang, while the formula here works
@@ -922,31 +924,31 @@
 #define   L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)
 #define   L3_PRIO_CREDITS_MASK			((0x1f << 19) | (0x1f << 14))
 
-#define GEN8_L3SQCREG4				_MMIO(0xb118)
+#define GEN8_L3SQCREG4				MCR_REG(0xb118)
 #define   GEN11_LQSC_CLEAN_EVICT_DISABLE	(1 << 6)
 #define   GEN8_LQSC_RO_PERF_DIS			(1 << 27)
 #define   GEN8_LQSC_FLUSH_COHERENT_LINES	(1 << 21)
 #define   GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE	REG_BIT(22)
 
-#define GEN9_SCRATCH1				_MMIO(0xb11c)
+#define GEN9_SCRATCH1				MCR_REG(0xb11c)
 #define   EVICTION_PERF_FIX_ENABLE		REG_BIT(8)
 
-#define BDW_SCRATCH1				_MMIO(0xb11c)
+#define BDW_SCRATCH1				MCR_REG(0xb11c)
 #define   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE	(1 << 2)
 
-#define GEN11_SCRATCH2				_MMIO(0xb140)
+#define GEN11_SCRATCH2				MCR_REG(0xb140)
 #define   GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE	(1 << 19)
 
-#define XEHP_L3SQCREG5				_MMIO(0xb158)
+#define XEHP_L3SQCREG5				MCR_REG(0xb158)
 #define   L3_PWM_TIMER_INIT_VAL_MASK		REG_GENMASK(9, 0)
 
-#define MLTICTXCTL				_MMIO(0xb170)
+#define MLTICTXCTL				MCR_REG(0xb170)
 #define   TDONRENDER				REG_BIT(2)
 
-#define XEHP_L3SCQREG7				_MMIO(0xb188)
+#define XEHP_L3SCQREG7				MCR_REG(0xb188)
 #define   BLEND_FILL_CACHING_OPT_DIS		REG_BIT(3)
 
-#define L3SQCREG1_CCS0				_MMIO(0xb200)
+#define L3SQCREG1_CCS0				MCR_REG(0xb200)
 #define   FLUSHALLNONCOH			REG_BIT(5)
 
 #define GEN11_GLBLINVL				_MMIO(0xb404)
@@ -969,14 +971,14 @@
 #define GEN9_BLT_MOCS(i)			_MMIO(__GEN9_BCS0_MOCS0 + (i) * 4)
 
 #define GEN12_FAULT_TLB_DATA0			_MMIO(0xceb8)
-#define XEHP_FAULT_TLB_DATA0			_MMIO(0xceb8)
+#define XEHP_FAULT_TLB_DATA0			MCR_REG(0xceb8)
 #define GEN12_FAULT_TLB_DATA1			_MMIO(0xcebc)
-#define XEHP_FAULT_TLB_DATA1			_MMIO(0xcebc)
+#define XEHP_FAULT_TLB_DATA1			MCR_REG(0xcebc)
 #define   FAULT_VA_HIGH_BITS			(0xf << 0)
 #define   FAULT_GTT_SEL				(1 << 4)
 
 #define GEN12_RING_FAULT_REG			_MMIO(0xcec4)
-#define XEHP_RING_FAULT_REG			_MMIO(0xcec4)
+#define XEHP_RING_FAULT_REG			MCR_REG(0xcec4)
 #define   GEN8_RING_FAULT_ENGINE_ID(x)		(((x) >> 12) & 0x7)
 #define   RING_FAULT_GTTSEL_MASK		(1 << 11)
 #define   RING_FAULT_SRCID(x)			(((x) >> 3) & 0xff)
@@ -988,11 +990,11 @@
 #define GEN12_VE_TLB_INV_CR			_MMIO(0xcee0)
 #define GEN12_BLT_TLB_INV_CR			_MMIO(0xcee4)
 
-#define XEHP_MERT_MOD_CTRL			_MMIO(0xcf28)
-#define RENDER_MOD_CTRL				_MMIO(0xcf2c)
-#define COMP_MOD_CTRL				_MMIO(0xcf30)
-#define VDBX_MOD_CTRL				_MMIO(0xcf34)
-#define VEBX_MOD_CTRL				_MMIO(0xcf38)
+#define XEHP_MERT_MOD_CTRL			MCR_REG(0xcf28)
+#define RENDER_MOD_CTRL				MCR_REG(0xcf2c)
+#define COMP_MOD_CTRL				MCR_REG(0xcf30)
+#define VDBX_MOD_CTRL				MCR_REG(0xcf34)
+#define VEBX_MOD_CTRL				MCR_REG(0xcf38)
 #define   FORCE_MISS_FTLB			REG_BIT(3)
 
 #define GEN12_GAMSTLB_CTRL			_MMIO(0xcf4c)
@@ -1007,49 +1009,49 @@
 #define GEN12_GAM_DONE				_MMIO(0xcf68)
 
 #define GEN7_HALF_SLICE_CHICKEN1		_MMIO(0xe100) /* IVB GT1 + VLV */
-#define GEN8_HALF_SLICE_CHICKEN1		_MMIO(0xe100)
+#define GEN8_HALF_SLICE_CHICKEN1		MCR_REG(0xe100)
 #define   GEN7_MAX_PS_THREAD_DEP		(8 << 12)
 #define   GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE	(1 << 10)
 #define   GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE	(1 << 4)
 #define   GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE	(1 << 3)
 
 #define GEN7_SAMPLER_INSTDONE			_MMIO(0xe160)
-#define GEN8_SAMPLER_INSTDONE			_MMIO(0xe160)
+#define GEN8_SAMPLER_INSTDONE			MCR_REG(0xe160)
 #define GEN7_ROW_INSTDONE			_MMIO(0xe164)
-#define GEN8_ROW_INSTDONE			_MMIO(0xe164)
+#define GEN8_ROW_INSTDONE			MCR_REG(0xe164)
 
-#define HALF_SLICE_CHICKEN2			_MMIO(0xe180)
+#define HALF_SLICE_CHICKEN2			MCR_REG(0xe180)
 #define   GEN8_ST_PO_DISABLE			(1 << 13)
 
 #define HSW_HALF_SLICE_CHICKEN3			_MMIO(0xe184)
-#define GEN8_HALF_SLICE_CHICKEN3		_MMIO(0xe184)
+#define GEN8_HALF_SLICE_CHICKEN3		MCR_REG(0xe184)
 #define   HSW_SAMPLE_C_PERFORMANCE		(1 << 9)
 #define   GEN8_CENTROID_PIXEL_OPT_DIS		(1 << 8)
 #define   GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC	(1 << 5)
 #define   GEN8_SAMPLER_POWER_BYPASS_DIS		(1 << 1)
 
-#define GEN9_HALF_SLICE_CHICKEN5		_MMIO(0xe188)
+#define GEN9_HALF_SLICE_CHICKEN5		MCR_REG(0xe188)
 #define   GEN9_DG_MIRROR_FIX_ENABLE		(1 << 5)
 #define   GEN9_CCS_TLB_PREFETCH_ENABLE		(1 << 3)
 
-#define GEN10_SAMPLER_MODE			_MMIO(0xe18c)
+#define GEN10_SAMPLER_MODE			MCR_REG(0xe18c)
 #define   ENABLE_SMALLPL			REG_BIT(15)
 #define   GEN11_SAMPLER_ENABLE_HEADLESS_MSG	REG_BIT(5)
 
-#define GEN9_HALF_SLICE_CHICKEN7		_MMIO(0xe194)
+#define GEN9_HALF_SLICE_CHICKEN7		MCR_REG(0xe194)
 #define   DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA	REG_BIT(15)
 #define   GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR	REG_BIT(8)
 #define   GEN9_ENABLE_YV12_BUGFIX		REG_BIT(4)
 #define   GEN9_ENABLE_GPGPU_PREEMPTION		REG_BIT(2)
 
-#define GEN10_CACHE_MODE_SS			_MMIO(0xe420)
+#define GEN10_CACHE_MODE_SS			MCR_REG(0xe420)
 #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
 #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
 
-#define EU_PERF_CNTL0				_MMIO(0xe458)
-#define EU_PERF_CNTL4				_MMIO(0xe45c)
+#define EU_PERF_CNTL0				MCR_REG(0xe458)
+#define EU_PERF_CNTL4				MCR_REG(0xe45c)
 
-#define GEN9_ROW_CHICKEN4			_MMIO(0xe48c)
+#define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
 #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
 #define   XEHP_DIS_BBL_SYSPIPE			REG_BIT(11)
 #define   GEN12_DISABLE_TDL_PUSH		REG_BIT(9)
@@ -1059,7 +1061,7 @@
 #define HSW_ROW_CHICKEN3			_MMIO(0xe49c)
 #define   HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE	(1 << 6)
 
-#define GEN8_ROW_CHICKEN			_MMIO(0xe4f0)
+#define GEN8_ROW_CHICKEN			MCR_REG(0xe4f0)
 #define   FLOW_CONTROL_ENABLE			REG_BIT(15)
 #define   UGM_BACKUP_MODE			REG_BIT(13)
 #define   MDQ_ARBITRATION_MODE			REG_BIT(12)
@@ -1071,37 +1073,37 @@
 
 #define GEN7_ROW_CHICKEN2			_MMIO(0xe4f4)
 
-#define GEN8_ROW_CHICKEN2			_MMIO(0xe4f4)
+#define GEN8_ROW_CHICKEN2			MCR_REG(0xe4f4)
 #define   GEN12_DISABLE_READ_SUPPRESSION	REG_BIT(15)
 #define   GEN12_DISABLE_EARLY_READ		REG_BIT(14)
 #define   GEN12_ENABLE_LARGE_GRF_MODE		REG_BIT(12)
 #define   GEN12_PUSH_CONST_DEREF_HOLD_DIS	REG_BIT(8)
 
-#define RT_CTRL					_MMIO(0xe530)
+#define RT_CTRL					MCR_REG(0xe530)
 #define   DIS_NULL_QUERY			REG_BIT(10)
 
-#define EU_PERF_CNTL1				_MMIO(0xe558)
-#define EU_PERF_CNTL5				_MMIO(0xe55c)
+#define EU_PERF_CNTL1				MCR_REG(0xe558)
+#define EU_PERF_CNTL5				MCR_REG(0xe55c)
 
-#define XEHP_HDC_CHICKEN0			_MMIO(0xe5f0)
+#define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
 #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
-#define ICL_HDC_MODE				_MMIO(0xe5f4)
+#define ICL_HDC_MODE				MCR_REG(0xe5f4)
 
-#define EU_PERF_CNTL2				_MMIO(0xe658)
-#define EU_PERF_CNTL6				_MMIO(0xe65c)
-#define EU_PERF_CNTL3				_MMIO(0xe758)
+#define EU_PERF_CNTL2				MCR_REG(0xe658)
+#define EU_PERF_CNTL6				MCR_REG(0xe65c)
+#define EU_PERF_CNTL3				MCR_REG(0xe758)
 
-#define LSC_CHICKEN_BIT_0			_MMIO(0xe7c8)
+#define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
 #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
 #define   FORCE_1_SUB_MESSAGE_PER_FRAGMENT	REG_BIT(15)
-#define LSC_CHICKEN_BIT_0_UDW			_MMIO(0xe7c8 + 4)
+#define LSC_CHICKEN_BIT_0_UDW			MCR_REG(0xe7c8 + 4)
 #define   DIS_CHAIN_2XSIMD8			REG_BIT(55 - 32)
 #define   FORCE_SLM_FENCE_SCOPE_TO_TILE		REG_BIT(42 - 32)
 #define   FORCE_UGM_FENCE_SCOPE_TO_TILE		REG_BIT(41 - 32)
 #define   MAXREQS_PER_BANK			REG_GENMASK(39 - 32, 37 - 32)
 #define   DISABLE_128B_EVICTION_COMMAND_UDW	REG_BIT(36 - 32)
 
-#define SARB_CHICKEN1				_MMIO(0xe90c)
+#define SARB_CHICKEN1				MCR_REG(0xe90c)
 #define   COMP_CKN_IN				REG_GENMASK(30, 29)
 
 #define GEN7_ROW_CHICKEN2_GT2			_MMIO(0xf4f4)
-- 
2.34.1



* [PATCH 09/15] drm/i915/gt: Move multicast register handling to a dedicated file
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (7 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 08/15] drm/i915: Define MCR registers explicitly Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 10/15] drm/i915/gt: Cleanup interface for MCR operations Matt Roper
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Handling of multicast/replicated registers is spread across intel_gt.c
and intel_uncore.c today.  As multicast handling and the related
steering logic get more complicated with the addition of new platforms
and new rules, it makes sense to centralize it all in one place.

For now the existing functions have been moved to the new .c/.h as-is.
Function renames and updates to operate in a more consistent manner will
be done in subsequent patches.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/Makefile               |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c  |   1 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c   |   1 +
 drivers/gpu/drm/i915/gt/intel_gt.c          | 262 +------------
 drivers/gpu/drm/i915/gt/intel_gt.h          |  15 -
 drivers/gpu/drm/i915/gt/intel_gt_debugfs.c  |   1 +
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 412 ++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h      |  37 ++
 drivers/gpu/drm/i915/gt/intel_region_lmem.c |   1 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c |   1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  |   1 +
 drivers/gpu/drm/i915/intel_uncore.c         | 112 ------
 drivers/gpu/drm/i915/intel_uncore.h         |   8 -
 13 files changed, 458 insertions(+), 395 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_mcr.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_mcr.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index c1d5540f6052..b2df1ad6729e 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -101,6 +101,7 @@ gt-y += \
 	gt/intel_gt_debugfs.o \
 	gt/intel_gt_engines_debugfs.o \
 	gt/intel_gt_irq.o \
+	gt/intel_gt_mcr.o \
 	gt/intel_gt_pm.o \
 	gt/intel_gt_pm_debugfs.o \
 	gt/intel_gt_pm_irq.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index a10d857dfd9b..81604af8b2c2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -13,6 +13,7 @@
 #include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_mcr.h"
 #include "gt/intel_gt_regs.h"
 #include "gt/intel_region_lmem.h"
 #include "i915_drv.h"
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index ad9e7e55ce17..7e6bd8465ed6 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -21,6 +21,7 @@
 #include "intel_engine_user.h"
 #include "intel_execlists_submission.h"
 #include "intel_gt.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_requests.h"
 #include "intel_gt_pm.h"
 #include "intel_lrc.h"
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 1992325c2895..59c1ab591b86 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -17,6 +17,7 @@
 #include "intel_gt_buffer_pool.h"
 #include "intel_gt_clock_utils.h"
 #include "intel_gt_debugfs.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_pm.h"
 #include "intel_gt_regs.h"
 #include "intel_gt_requests.h"
@@ -102,78 +103,13 @@ int intel_gt_assign_ggtt(struct intel_gt *gt)
 	return gt->ggtt ? 0 : -ENOMEM;
 }
 
-static const char * const intel_steering_types[] = {
-	"L3BANK",
-	"MSLICE",
-	"LNCF",
-};
-
-static const struct intel_mmio_range icl_l3bank_steering_table[] = {
-	{ 0x00B100, 0x00B3FF },
-	{},
-};
-
-static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
-	{ 0x004000, 0x004AFF },
-	{ 0x00C800, 0x00CFFF },
-	{ 0x00DD00, 0x00DDFF },
-	{ 0x00E900, 0x00FFFF }, /* 0xEA00 - OxEFFF is unused */
-	{},
-};
-
-static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
-	{ 0x00B000, 0x00B0FF },
-	{ 0x00D800, 0x00D8FF },
-	{},
-};
-
-static const struct intel_mmio_range dg2_lncf_steering_table[] = {
-	{ 0x00B000, 0x00B0FF },
-	{ 0x00D880, 0x00D8FF },
-	{},
-};
-
-static u16 slicemask(struct intel_gt *gt, int count)
-{
-	u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
-
-	return intel_slicemask_from_dssmask(dss_mask, count);
-}
-
 int intel_gt_init_mmio(struct intel_gt *gt)
 {
-	struct drm_i915_private *i915 = gt->i915;
-
 	intel_gt_init_clock_frequency(gt);
 
 	intel_uc_init_mmio(&gt->uc);
 	intel_sseu_info_init(gt);
-
-	/*
-	 * An mslice is unavailable only if both the meml3 for the slice is
-	 * disabled *and* all of the DSS in the slice (quadrant) are disabled.
-	 */
-	if (HAS_MSLICES(i915))
-		gt->info.mslice_mask =
-			slicemask(gt, GEN_DSS_PER_MSLICE) |
-			(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
-			 GEN12_MEML3_EN_MASK);
-
-	if (IS_DG2(i915)) {
-		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
-		gt->steering_table[LNCF] = dg2_lncf_steering_table;
-	} else if (IS_XEHPSDV(i915)) {
-		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
-		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
-	} else if (GRAPHICS_VER(i915) >= 11 &&
-		   GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
-		gt->steering_table[L3BANK] = icl_l3bank_steering_table;
-		gt->info.l3bank_mask =
-			~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
-			GEN10_L3BANK_MASK;
-	} else if (HAS_MSLICES(i915)) {
-		MISSING_CASE(INTEL_INFO(i915)->platform);
-	}
+	intel_gt_mcr_init(gt);
 
 	return intel_engines_init_mmio(gt);
 }
@@ -873,200 +809,6 @@ void intel_gt_driver_late_release_all(struct drm_i915_private *i915)
 	}
 }
 
-/**
- * intel_gt_reg_needs_read_steering - determine whether a register read
- *     requires explicit steering
- * @gt: GT structure
- * @reg: the register to check steering requirements for
- * @type: type of multicast steering to check
- *
- * Determines whether @reg needs explicit steering of a specific type for
- * reads.
- *
- * Returns false if @reg does not belong to a register range of the given
- * steering type, or if the default (subslice-based) steering IDs are suitable
- * for @type steering too.
- */
-static bool intel_gt_reg_needs_read_steering(struct intel_gt *gt,
-					     i915_reg_t reg,
-					     enum intel_steering_type type)
-{
-	const u32 offset = i915_mmio_reg_offset(reg);
-	const struct intel_mmio_range *entry;
-
-	if (likely(!intel_gt_needs_read_steering(gt, type)))
-		return false;
-
-	for (entry = gt->steering_table[type]; entry->end; entry++) {
-		if (offset >= entry->start && offset <= entry->end)
-			return true;
-	}
-
-	return false;
-}
-
-/**
- * intel_gt_get_valid_steering - determines valid IDs for a class of MCR steering
- * @gt: GT structure
- * @type: multicast register type
- * @sliceid: Slice ID returned
- * @subsliceid: Subslice ID returned
- *
- * Determines sliceid and subsliceid values that will steer reads
- * of a specific multicast register class to a valid value.
- */
-static void intel_gt_get_valid_steering(struct intel_gt *gt,
-					enum intel_steering_type type,
-					u8 *sliceid, u8 *subsliceid)
-{
-	switch (type) {
-	case L3BANK:
-		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
-
-		*sliceid = 0;		/* unused */
-		*subsliceid = __ffs(gt->info.l3bank_mask);
-		break;
-	case MSLICE:
-		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
-
-		*sliceid = __ffs(gt->info.mslice_mask);
-		*subsliceid = 0;	/* unused */
-		break;
-	case LNCF:
-		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
-
-		/*
-		 * An LNCF is always present if its mslice is present, so we
-		 * can safely just steer to LNCF 0 in all cases.
-		 */
-		*sliceid = __ffs(gt->info.mslice_mask) << 1;
-		*subsliceid = 0;	/* unused */
-		break;
-	default:
-		MISSING_CASE(type);
-		*sliceid = 0;
-		*subsliceid = 0;
-	}
-}
-
-/**
- * intel_gt_read_register_fw - reads a GT register with support for multicast
- * @gt: GT structure
- * @reg: register to read
- *
- * This function will read a GT register.  If the register is a multicast
- * register, the read will be steered to a valid instance (i.e., one that
- * isn't fused off or powered down by power gating).
- *
- * Returns the value from a valid instance of @reg.
- */
-u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg)
-{
-	int type;
-	u8 sliceid, subsliceid;
-
-	for (type = 0; type < NUM_STEERING_TYPES; type++) {
-		if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
-			intel_gt_get_valid_steering(gt, type, &sliceid,
-						    &subsliceid);
-			return intel_uncore_read_with_mcr_steering_fw(gt->uncore,
-								      reg,
-								      sliceid,
-								      subsliceid);
-		}
-	}
-
-	return intel_uncore_read_fw(gt->uncore, reg);
-}
-
-/**
- * intel_gt_get_valid_steering_for_reg - get a valid steering for a register
- * @gt: GT structure
- * @reg: register for which the steering is required
- * @sliceid: return variable for slice steering
- * @subsliceid: return variable for subslice steering
- *
- * This function returns a slice/subslice pair that is guaranteed to work for
- * read steering of the given register. Note that a value will be returned even
- * if the register is not replicated and therefore does not actually require
- * steering.
- */
-void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
-					 u8 *sliceid, u8 *subsliceid)
-{
-	int type;
-
-	for (type = 0; type < NUM_STEERING_TYPES; type++) {
-		if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
-			intel_gt_get_valid_steering(gt, type, sliceid,
-						    subsliceid);
-			return;
-		}
-	}
-
-	*sliceid = gt->default_steering.groupid;
-	*subsliceid = gt->default_steering.instanceid;
-}
-
-u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg)
-{
-	int type;
-	u8 sliceid, subsliceid;
-
-	for (type = 0; type < NUM_STEERING_TYPES; type++) {
-		if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
-			intel_gt_get_valid_steering(gt, type, &sliceid,
-						    &subsliceid);
-			return intel_uncore_read_with_mcr_steering(gt->uncore,
-								   reg,
-								   sliceid,
-								   subsliceid);
-		}
-	}
-
-	return intel_uncore_read(gt->uncore, reg);
-}
-
-static void report_steering_type(struct drm_printer *p,
-				 struct intel_gt *gt,
-				 enum intel_steering_type type,
-				 bool dump_table)
-{
-	const struct intel_mmio_range *entry;
-	u8 slice, subslice;
-
-	BUILD_BUG_ON(ARRAY_SIZE(intel_steering_types) != NUM_STEERING_TYPES);
-
-	if (!gt->steering_table[type]) {
-		drm_printf(p, "%s steering: uses default steering\n",
-			   intel_steering_types[type]);
-		return;
-	}
-
-	intel_gt_get_valid_steering(gt, type, &slice, &subslice);
-	drm_printf(p, "%s steering: sliceid=0x%x, subsliceid=0x%x\n",
-		   intel_steering_types[type], slice, subslice);
-
-	if (!dump_table)
-		return;
-
-	for (entry = gt->steering_table[type]; entry->end; entry++)
-		drm_printf(p, "\t0x%06x - 0x%06x\n", entry->start, entry->end);
-}
-
-void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
-			      bool dump_table)
-{
-	drm_printf(p, "Default steering: sliceid=0x%x, subsliceid=0x%x\n",
-		   gt->default_steering.groupid,
-		   gt->default_steering.instanceid);
-
-	if (HAS_MSLICES(gt->i915)) {
-		report_steering_type(p, gt, MSLICE, dump_table);
-		report_steering_type(p, gt, LNCF, dump_table);
-	}
-}
-
 static int intel_gt_tile_setup(struct intel_gt *gt, phys_addr_t phys_addr)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h b/drivers/gpu/drm/i915/gt/intel_gt.h
index e76168e10a21..0c47d85256d9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt.h
@@ -81,21 +81,6 @@ static inline bool intel_gt_is_wedged(const struct intel_gt *gt)
 	return unlikely(test_bit(I915_WEDGED, &gt->reset.flags));
 }
 
-static inline bool intel_gt_needs_read_steering(struct intel_gt *gt,
-						enum intel_steering_type type)
-{
-	return gt->steering_table[type];
-}
-
-void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
-					 u8 *sliceid, u8 *subsliceid);
-
-u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
-u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg);
-
-void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
-			      bool dump_table);
-
 int intel_gt_probe_all(struct drm_i915_private *i915);
 int intel_gt_tiles_init(struct drm_i915_private *i915);
 void intel_gt_release_all(struct drm_i915_private *i915);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
index d886fdc2c694..ea07f2bb846f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
@@ -9,6 +9,7 @@
 #include "intel_gt.h"
 #include "intel_gt_debugfs.h"
 #include "intel_gt_engines_debugfs.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_pm_debugfs.h"
 #include "intel_sseu_debugfs.h"
 #include "pxp/intel_pxp_debugfs.h"
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
new file mode 100644
index 000000000000..21edee03ce0f
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -0,0 +1,412 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include "i915_drv.h"
+
+#include "intel_gt_mcr.h"
+#include "intel_gt_regs.h"
+
+/**
+ * DOC: GT Multicast/Replicated (MCR) Register Support
+ *
+ * Some GT registers are designed as "multicast" or "replicated" registers:
+ * multiple instances of the same register share a single MMIO offset.  MCR
+ * registers are generally used when the hardware needs to potentially track
+ * independent values of a register per hardware unit (e.g., per-subslice,
+ * per-L3bank, etc.).  The specific types of replication that exist vary
+ * per-platform.
+ *
+ * MMIO accesses to MCR registers are controlled according to the settings
+ * programmed in the platform's MCR_SELECTOR register(s).  MMIO writes to MCR
+ * registers can be done in either multicast (i.e., a single write updates all
+ * instances of the register to the same value) or unicast (a write updates only
+ * one specific instance) form.  Reads of MCR registers always operate in a
+ * unicast manner regardless of how the multicast/unicast bit is set in
+ * MCR_SELECTOR.  Selection of a specific MCR instance for unicast operations is
+ * referred to as "steering."
+ *
+ * If MCR register operations are steered toward a hardware unit that is
+ * fused off or currently powered down due to power gating, the MMIO operation
+ * is "terminated" by the hardware.  Terminated read operations will return a
+ * value of zero and terminated unicast write operations will be silently
+ * ignored.
+ */
+
+static const char * const intel_steering_types[] = {
+	"L3BANK",
+	"MSLICE",
+	"LNCF",
+};
+
+static const struct intel_mmio_range icl_l3bank_steering_table[] = {
+	{ 0x00B100, 0x00B3FF },
+	{},
+};
+
+static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
+	{ 0x004000, 0x004AFF },
+	{ 0x00C800, 0x00CFFF },
+	{ 0x00DD00, 0x00DDFF },
+	{ 0x00E900, 0x00FFFF }, /* 0xEA00 - 0xEFFF is unused */
+	{},
+};
+
+static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
+	{ 0x00B000, 0x00B0FF },
+	{ 0x00D800, 0x00D8FF },
+	{},
+};
+
+static const struct intel_mmio_range dg2_lncf_steering_table[] = {
+	{ 0x00B000, 0x00B0FF },
+	{ 0x00D880, 0x00D8FF },
+	{},
+};
+
+static u16 slicemask(struct intel_gt *gt, int count)
+{
+	u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
+
+	return intel_slicemask_from_dssmask(dss_mask, count);
+}
+
+void intel_gt_mcr_init(struct intel_gt *gt)
+{
+	struct drm_i915_private *i915 = gt->i915;
+
+	/*
+	 * An mslice is unavailable only if both the meml3 for the slice is
+	 * disabled *and* all of the DSS in the slice (quadrant) are disabled.
+	 */
+	if (HAS_MSLICES(i915))
+		gt->info.mslice_mask =
+			slicemask(gt, GEN_DSS_PER_MSLICE) |
+			(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
+			 GEN12_MEML3_EN_MASK);
+
+	if (IS_DG2(i915)) {
+		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
+		gt->steering_table[LNCF] = dg2_lncf_steering_table;
+	} else if (IS_XEHPSDV(i915)) {
+		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
+		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
+	} else if (GRAPHICS_VER(i915) >= 11 &&
+		   GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
+		gt->steering_table[L3BANK] = icl_l3bank_steering_table;
+		gt->info.l3bank_mask =
+			~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
+			GEN10_L3BANK_MASK;
+	} else if (HAS_MSLICES(i915)) {
+		MISSING_CASE(INTEL_INFO(i915)->platform);
+	}
+}
+
+/**
+ * uncore_rw_with_mcr_steering_fw - Access a register after programming
+ *				    the MCR selector register.
+ * @uncore: pointer to struct intel_uncore
+ * @reg: register being accessed
+ * @rw_flag: FW_REG_READ for read access or FW_REG_WRITE for write access
+ * @slice: slice number (ignored for multi-cast write)
+ * @subslice: sub-slice number (ignored for multi-cast write)
+ * @value: register value to be written (ignored for read)
+ *
+ * Return: 0 for write access. register value for read access.
+ *
+ * Caller needs to make sure the relevant forcewake wells are up.
+ */
+static u32 uncore_rw_with_mcr_steering_fw(struct intel_uncore *uncore,
+					  i915_reg_t reg, u8 rw_flag,
+					  int slice, int subslice, u32 value)
+{
+	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
+
+	lockdep_assert_held(&uncore->lock);
+
+	if (GRAPHICS_VER(uncore->i915) >= 11) {
+		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
+		mcr_ss = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
+
+		/*
+		 * Wa_22013088509
+		 *
+		 * The setting of the multicast/unicast bit usually wouldn't
+		 * matter for read operations (which always return the value
+		 * from a single register instance regardless of how that bit
+		 * is set), but some platforms have a workaround requiring us
+		 * to remain in multicast mode for reads.  There's no real
+		 * downside to this, so we'll just go ahead and do so on all
+		 * platforms; we'll only clear the multicast bit from the mask
+		 * when explicitly doing a write operation.
+		 */
+		if (rw_flag == FW_REG_WRITE)
+			mcr_mask |= GEN11_MCR_MULTICAST;
+	} else {
+		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
+		mcr_ss = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+	}
+
+	old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
+
+	mcr &= ~mcr_mask;
+	mcr |= mcr_ss;
+	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
+
+	if (rw_flag == FW_REG_READ)
+		val = intel_uncore_read_fw(uncore, reg);
+	else
+		intel_uncore_write_fw(uncore, reg, value);
+
+	mcr &= ~mcr_mask;
+	mcr |= old_mcr & mcr_mask;
+
+	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
+
+	return val;
+}
+
+static u32 uncore_rw_with_mcr_steering(struct intel_uncore *uncore,
+				       i915_reg_t reg, u8 rw_flag,
+				       int slice, int subslice,
+				       u32 value)
+{
+	enum forcewake_domains fw_domains;
+	u32 val;
+
+	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
+						    rw_flag);
+	fw_domains |= intel_uncore_forcewake_for_reg(uncore,
+						     GEN8_MCR_SELECTOR,
+						     FW_REG_READ | FW_REG_WRITE);
+
+	spin_lock_irq(&uncore->lock);
+	intel_uncore_forcewake_get__locked(uncore, fw_domains);
+
+	val = uncore_rw_with_mcr_steering_fw(uncore, reg, rw_flag,
+					     slice, subslice, value);
+
+	intel_uncore_forcewake_put__locked(uncore, fw_domains);
+	spin_unlock_irq(&uncore->lock);
+
+	return val;
+}
+
+u32 intel_uncore_read_with_mcr_steering_fw(struct intel_uncore *uncore,
+					   i915_reg_t reg, int slice, int subslice)
+{
+	return uncore_rw_with_mcr_steering_fw(uncore, reg, FW_REG_READ,
+					      slice, subslice, 0);
+}
+
+u32 intel_uncore_read_with_mcr_steering(struct intel_uncore *uncore,
+					i915_reg_t reg, int slice, int subslice)
+{
+	return uncore_rw_with_mcr_steering(uncore, reg, FW_REG_READ,
+					   slice, subslice, 0);
+}
+
+void intel_uncore_write_with_mcr_steering(struct intel_uncore *uncore,
+					  i915_reg_t reg, u32 value,
+					  int slice, int subslice)
+{
+	uncore_rw_with_mcr_steering(uncore, reg, FW_REG_WRITE,
+				    slice, subslice, value);
+}
+
+
+/**
+ * intel_gt_reg_needs_read_steering - determine whether a register read
+ *     requires explicit steering
+ * @gt: GT structure
+ * @reg: the register to check steering requirements for
+ * @type: type of multicast steering to check
+ *
+ * Determines whether @reg needs explicit steering of a specific type for
+ * reads.
+ *
+ * Returns false if @reg does not belong to a register range of the given
+ * steering type, or if the default (subslice-based) steering IDs are suitable
+ * for @type steering too.
+ */
+static bool intel_gt_reg_needs_read_steering(struct intel_gt *gt,
+					     i915_reg_t reg,
+					     enum intel_steering_type type)
+{
+	const u32 offset = i915_mmio_reg_offset(reg);
+	const struct intel_mmio_range *entry;
+
+	if (likely(!intel_gt_needs_read_steering(gt, type)))
+		return false;
+
+	for (entry = gt->steering_table[type]; entry->end; entry++) {
+		if (offset >= entry->start && offset <= entry->end)
+			return true;
+	}
+
+	return false;
+}
+
+/**
+ * intel_gt_get_valid_steering - determines valid IDs for a class of MCR steering
+ * @gt: GT structure
+ * @type: multicast register type
+ * @sliceid: Slice ID returned
+ * @subsliceid: Subslice ID returned
+ *
+ * Determines sliceid and subsliceid values that will steer reads
+ * of a specific multicast register class to a valid value.
+ */
+static void intel_gt_get_valid_steering(struct intel_gt *gt,
+					enum intel_steering_type type,
+					u8 *sliceid, u8 *subsliceid)
+{
+	switch (type) {
+	case L3BANK:
+		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
+
+		*sliceid = 0;		/* unused */
+		*subsliceid = __ffs(gt->info.l3bank_mask);
+		break;
+	case MSLICE:
+		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
+
+		*sliceid = __ffs(gt->info.mslice_mask);
+		*subsliceid = 0;	/* unused */
+		break;
+	case LNCF:
+		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
+
+		/*
+		 * An LNCF is always present if its mslice is present, so we
+		 * can safely just steer to LNCF 0 in all cases.
+		 */
+		*sliceid = __ffs(gt->info.mslice_mask) << 1;
+		*subsliceid = 0;	/* unused */
+		break;
+	default:
+		MISSING_CASE(type);
+		*sliceid = 0;
+		*subsliceid = 0;
+	}
+}
+
+/**
+ * intel_gt_get_valid_steering_for_reg - get a valid steering for a register
+ * @gt: GT structure
+ * @reg: register for which the steering is required
+ * @sliceid: return variable for slice steering
+ * @subsliceid: return variable for subslice steering
+ *
+ * This function returns a slice/subslice pair that is guaranteed to work for
+ * read steering of the given register. Note that a value will be returned even
+ * if the register is not replicated and therefore does not actually require
+ * steering.
+ */
+void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
+					 u8 *sliceid, u8 *subsliceid)
+{
+	int type;
+
+	for (type = 0; type < NUM_STEERING_TYPES; type++) {
+		if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
+			intel_gt_get_valid_steering(gt, type, sliceid,
+						    subsliceid);
+			return;
+		}
+	}
+
+	*sliceid = gt->default_steering.groupid;
+	*subsliceid = gt->default_steering.instanceid;
+}
+
+/**
+ * intel_gt_read_register_fw - reads a GT register with support for multicast
+ * @gt: GT structure
+ * @reg: register to read
+ *
+ * This function will read a GT register.  If the register is a multicast
+ * register, the read will be steered to a valid instance (i.e., one that
+ * isn't fused off or powered down by power gating).
+ *
+ * Returns the value from a valid instance of @reg.
+ */
+u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg)
+{
+	int type;
+	u8 sliceid, subsliceid;
+
+	for (type = 0; type < NUM_STEERING_TYPES; type++) {
+		if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
+			intel_gt_get_valid_steering(gt, type, &sliceid,
+						    &subsliceid);
+			return intel_uncore_read_with_mcr_steering_fw(gt->uncore,
+								      reg,
+								      sliceid,
+								      subsliceid);
+		}
+	}
+
+	return intel_uncore_read_fw(gt->uncore, reg);
+}
+
+u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg)
+{
+	int type;
+	u8 sliceid, subsliceid;
+
+	for (type = 0; type < NUM_STEERING_TYPES; type++) {
+		if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
+			intel_gt_get_valid_steering(gt, type, &sliceid,
+						    &subsliceid);
+			return intel_uncore_read_with_mcr_steering(gt->uncore,
+								   reg,
+								   sliceid,
+								   subsliceid);
+		}
+	}
+
+	return intel_uncore_read(gt->uncore, reg);
+}
+
+static void report_steering_type(struct drm_printer *p,
+				 struct intel_gt *gt,
+				 enum intel_steering_type type,
+				 bool dump_table)
+{
+	const struct intel_mmio_range *entry;
+	u8 slice, subslice;
+
+	BUILD_BUG_ON(ARRAY_SIZE(intel_steering_types) != NUM_STEERING_TYPES);
+
+	if (!gt->steering_table[type]) {
+		drm_printf(p, "%s steering: uses default steering\n",
+			   intel_steering_types[type]);
+		return;
+	}
+
+	intel_gt_get_valid_steering(gt, type, &slice, &subslice);
+	drm_printf(p, "%s steering: sliceid=0x%x, subsliceid=0x%x\n",
+		   intel_steering_types[type], slice, subslice);
+
+	if (!dump_table)
+		return;
+
+	for (entry = gt->steering_table[type]; entry->end; entry++)
+		drm_printf(p, "\t0x%06x - 0x%06x\n", entry->start, entry->end);
+}
+
+void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
+			      bool dump_table)
+{
+	drm_printf(p, "Default steering: sliceid=0x%x, subsliceid=0x%x\n",
+		   gt->default_steering.groupid,
+		   gt->default_steering.instanceid);
+
+	if (HAS_MSLICES(gt->i915)) {
+		report_steering_type(p, gt, MSLICE, dump_table);
+		report_steering_type(p, gt, LNCF, dump_table);
+	}
+}
+
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
new file mode 100644
index 000000000000..b570c1571243
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __INTEL_GT_MCR__
+#define __INTEL_GT_MCR__
+
+#include "intel_gt_types.h"
+
+void intel_gt_mcr_init(struct intel_gt *gt);
+
+u32 intel_uncore_read_with_mcr_steering_fw(struct intel_uncore *uncore,
+					   i915_reg_t reg,
+					   int slice, int subslice);
+u32 intel_uncore_read_with_mcr_steering(struct intel_uncore *uncore,
+					i915_reg_t reg,	int slice, int subslice);
+void intel_uncore_write_with_mcr_steering(struct intel_uncore *uncore,
+					  i915_reg_t reg, u32 value,
+					  int slice, int subslice);
+
+u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
+u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg);
+
+static inline bool intel_gt_needs_read_steering(struct intel_gt *gt,
+						enum intel_steering_type type)
+{
+	return gt->steering_table[type];
+}
+
+void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
+					 u8 *sliceid, u8 *subsliceid);
+
+void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
+			      bool dump_table);
+
+#endif /* __INTEL_GT_MCR__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index f5111c0a0060..6e788a5fc85a 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -12,6 +12,7 @@
 #include "gem/i915_gem_region.h"
 #include "gem/i915_gem_ttm.h"
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_mcr.h"
 #include "gt/intel_gt_regs.h"
 
 static int
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 544097c56619..e9bf0d9f50d8 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -9,6 +9,7 @@
 #include "intel_engine_regs.h"
 #include "intel_gpu_commands.h"
 #include "intel_gt.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_regs.h"
 #include "intel_ring.h"
 #include "intel_workarounds.h"
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index e8a42d719f96..c63a6bf9e853 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -7,6 +7,7 @@
 
 #include "gt/intel_engine_regs.h"
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_mcr.h"
 #include "gt/intel_gt_regs.h"
 #include "gt/intel_lrc.h"
 #include "gt/shmem_utils.h"
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 8b9caaaacc21..8069671d73d6 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -2459,118 +2459,6 @@ intel_uncore_forcewake_for_reg(struct intel_uncore *uncore,
 	return fw_domains;
 }
 
-/**
- * uncore_rw_with_mcr_steering_fw - Access a register after programming
- *				    the MCR selector register.
- * @uncore: pointer to struct intel_uncore
- * @reg: register being accessed
- * @rw_flag: FW_REG_READ for read access or FW_REG_WRITE for write access
- * @slice: slice number (ignored for multi-cast write)
- * @subslice: sub-slice number (ignored for multi-cast write)
- * @value: register value to be written (ignored for read)
- *
- * Return: 0 for write access. register value for read access.
- *
- * Caller needs to make sure the relevant forcewake wells are up.
- */
-static u32 uncore_rw_with_mcr_steering_fw(struct intel_uncore *uncore,
-					  i915_reg_t reg, u8 rw_flag,
-					  int slice, int subslice, u32 value)
-{
-	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
-
-	lockdep_assert_held(&uncore->lock);
-
-	if (GRAPHICS_VER(uncore->i915) >= 11) {
-		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
-		mcr_ss = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
-
-		/*
-		 * Wa_22013088509
-		 *
-		 * The setting of the multicast/unicast bit usually wouldn't
-		 * matter for read operations (which always return the value
-		 * from a single register instance regardless of how that bit
-		 * is set), but some platforms have a workaround requiring us
-		 * to remain in multicast mode for reads.  There's no real
-		 * downside to this, so we'll just go ahead and do so on all
-		 * platforms; we'll only clear the multicast bit from the mask
-		 * when exlicitly doing a write operation.
-		 */
-		if (rw_flag == FW_REG_WRITE)
-			mcr_mask |= GEN11_MCR_MULTICAST;
-	} else {
-		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
-		mcr_ss = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
-	}
-
-	old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
-
-	mcr &= ~mcr_mask;
-	mcr |= mcr_ss;
-	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
-
-	if (rw_flag == FW_REG_READ)
-		val = intel_uncore_read_fw(uncore, reg);
-	else
-		intel_uncore_write_fw(uncore, reg, value);
-
-	mcr &= ~mcr_mask;
-	mcr |= old_mcr & mcr_mask;
-
-	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
-
-	return val;
-}
-
-static u32 uncore_rw_with_mcr_steering(struct intel_uncore *uncore,
-				       i915_reg_t reg, u8 rw_flag,
-				       int slice, int subslice,
-				       u32 value)
-{
-	enum forcewake_domains fw_domains;
-	u32 val;
-
-	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
-						    rw_flag);
-	fw_domains |= intel_uncore_forcewake_for_reg(uncore,
-						     GEN8_MCR_SELECTOR,
-						     FW_REG_READ | FW_REG_WRITE);
-
-	spin_lock_irq(&uncore->lock);
-	intel_uncore_forcewake_get__locked(uncore, fw_domains);
-
-	val = uncore_rw_with_mcr_steering_fw(uncore, reg, rw_flag,
-					     slice, subslice, value);
-
-	intel_uncore_forcewake_put__locked(uncore, fw_domains);
-	spin_unlock_irq(&uncore->lock);
-
-	return val;
-}
-
-u32 intel_uncore_read_with_mcr_steering_fw(struct intel_uncore *uncore,
-					   i915_reg_t reg, int slice, int subslice)
-{
-	return uncore_rw_with_mcr_steering_fw(uncore, reg, FW_REG_READ,
-					      slice, subslice, 0);
-}
-
-u32 intel_uncore_read_with_mcr_steering(struct intel_uncore *uncore,
-					i915_reg_t reg, int slice, int subslice)
-{
-	return uncore_rw_with_mcr_steering(uncore, reg, FW_REG_READ,
-					   slice, subslice, 0);
-}
-
-void intel_uncore_write_with_mcr_steering(struct intel_uncore *uncore,
-					  i915_reg_t reg, u32 value,
-					  int slice, int subslice)
-{
-	uncore_rw_with_mcr_steering(uncore, reg, FW_REG_WRITE,
-				    slice, subslice, value);
-}
-
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_uncore.c"
 #include "selftests/intel_uncore.c"
diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
index 52fe3d89dd2b..b1fa912a65e7 100644
--- a/drivers/gpu/drm/i915/intel_uncore.h
+++ b/drivers/gpu/drm/i915/intel_uncore.h
@@ -210,14 +210,6 @@ intel_uncore_has_fifo(const struct intel_uncore *uncore)
 	return uncore->flags & UNCORE_HAS_FIFO;
 }
 
-u32 intel_uncore_read_with_mcr_steering_fw(struct intel_uncore *uncore,
-					   i915_reg_t reg,
-					   int slice, int subslice);
-u32 intel_uncore_read_with_mcr_steering(struct intel_uncore *uncore,
-					i915_reg_t reg,	int slice, int subslice);
-void intel_uncore_write_with_mcr_steering(struct intel_uncore *uncore,
-					  i915_reg_t reg, u32 value,
-					  int slice, int subslice);
 void
 intel_uncore_mmio_debug_init_early(struct intel_uncore_mmio_debug *mmio_debug);
 void intel_uncore_init_early(struct intel_uncore *uncore,
-- 
2.34.1



* [PATCH 10/15] drm/i915/gt: Cleanup interface for MCR operations
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (8 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 09/15] drm/i915/gt: Move multicast register handling to a dedicated file Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 11/15] drm/i915/gt: Always use MCR functions on multicast registers Matt Roper
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Let's replace the assortment of intel_gt_* and intel_uncore_* functions
that operate on MCR registers with a cleaner set of interfaces:

  * intel_gt_mcr_read -- unicast read from specific instance
  * intel_gt_mcr_read_any[_fw] -- unicast read from any non-terminated
    instance
  * intel_gt_mcr_unicast_write -- unicast write to specific instance
  * intel_gt_mcr_multicast_write[_fw] -- multicast write to all instances

We'll also replace the historic "slice" and "subslice" terminology with
"group" and "instance" to match the documentation for more recent
platforms; these days MCR steering applies to more types of replication
than just slice/subslice.
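
As a rough illustration of the intended semantics (not part of this patch),
the four operations can be modelled on a toy register file.  Everything
named mock_* below is hypothetical, and the rule that a terminated
instance reads back as 0 and ignores writes is an assumption of the model
rather than something this series defines:

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_GROUPS	2
#define MOCK_INSTANCES	4

/* Hypothetical model: one replicated register, one copy per group/instance. */
struct mock_mcr_reg {
	uint32_t val[MOCK_GROUPS][MOCK_INSTANCES];
	uint8_t present[MOCK_GROUPS];	/* per-group mask of non-terminated instances */
};

static int mock_is_present(const struct mock_mcr_reg *r, int group, int instance)
{
	assert(group >= 0 && group < MOCK_GROUPS);
	assert(instance >= 0 && instance < MOCK_INSTANCES);
	return (r->present[group] >> instance) & 1;
}

/* Unicast read from one specific copy (assumed to read 0 if terminated). */
static uint32_t mock_mcr_read(const struct mock_mcr_reg *r,
			      int group, int instance)
{
	return mock_is_present(r, group, instance) ? r->val[group][instance] : 0;
}

/* Unicast read from the first non-terminated copy the model can find. */
static uint32_t mock_mcr_read_any(const struct mock_mcr_reg *r)
{
	for (int g = 0; g < MOCK_GROUPS; g++)
		for (int i = 0; i < MOCK_INSTANCES; i++)
			if (mock_is_present(r, g, i))
				return r->val[g][i];
	return 0;
}

/* Unicast write: only the selected copy changes. */
static void mock_mcr_unicast_write(struct mock_mcr_reg *r, uint32_t value,
				   int group, int instance)
{
	if (mock_is_present(r, group, instance))
		r->val[group][instance] = value;
}

/* Multicast write: every non-terminated copy takes the same value. */
static void mock_mcr_multicast_write(struct mock_mcr_reg *r, uint32_t value)
{
	for (int g = 0; g < MOCK_GROUPS; g++)
		for (int i = 0; i < MOCK_INSTANCES; i++)
			if (mock_is_present(r, g, i))
				r->val[g][i] = value;
}
```

In this model a caller that needs one particular copy passes group/instance
explicitly, while a caller that only needs some live copy uses the "any"
variant and never has to know which instances exist.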

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c  |   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c   |  33 ++-
 drivers/gpu/drm/i915/gt/intel_gt.c          |  19 +-
 drivers/gpu/drm/i915/gt/intel_gt_debugfs.c  |   2 +-
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 225 ++++++++++++--------
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h      |  43 ++--
 drivers/gpu/drm/i915/gt/intel_region_lmem.c |   2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c |   8 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  |   2 +-
 9 files changed, 184 insertions(+), 152 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 81604af8b2c2..e63de9c06596 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -836,7 +836,7 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type,
 	} else {
 		resource_size_t lmem_range;
 
-		lmem_range = intel_gt_read_register(&i915->gt0, XEHPSDV_TILE0_ADDR_RANGE) & 0xFFFF;
+		lmem_range = intel_gt_mcr_read_any(&i915->gt0, XEHPSDV_TILE0_ADDR_RANGE) & 0xFFFF;
 		lmem_size = lmem_range >> XEHPSDV_TILE_LMEM_RANGE_SHIFT;
 		lmem_size *= SZ_1G;
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7e6bd8465ed6..f48e87bfceac 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1326,14 +1326,6 @@ void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine)
 	ENGINE_WRITE_FW(engine, RING_MI_MODE, _MASKED_BIT_DISABLE(STOP_RING));
 }
 
-static u32
-read_subslice_reg(const struct intel_engine_cs *engine,
-		  int slice, int subslice, i915_reg_t reg)
-{
-	return intel_uncore_read_with_mcr_steering(engine->uncore, reg,
-						   slice, subslice);
-}
-
 /* NB: please notice the memset */
 void intel_engine_get_instdone(const struct intel_engine_cs *engine,
 			       struct intel_instdone *instdone)
@@ -1367,28 +1359,33 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
 		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
 			for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
 				instdone->sampler[slice][subslice] =
-					read_subslice_reg(engine, slice, subslice,
-							  GEN8_SAMPLER_INSTDONE);
+					intel_gt_mcr_read(engine->gt,
+							  GEN8_SAMPLER_INSTDONE,
+							  slice, subslice);
 				instdone->row[slice][subslice] =
-					read_subslice_reg(engine, slice, subslice,
-							  GEN8_ROW_INSTDONE);
+					intel_gt_mcr_read(engine->gt,
+							  GEN8_ROW_INSTDONE,
+							  slice, subslice);
 			}
 		} else {
 			for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
 				instdone->sampler[slice][subslice] =
-					read_subslice_reg(engine, slice, subslice,
-							  GEN8_SAMPLER_INSTDONE);
+					intel_gt_mcr_read(engine->gt,
+							  GEN8_SAMPLER_INSTDONE,
+							  slice, subslice);
 				instdone->row[slice][subslice] =
-					read_subslice_reg(engine, slice, subslice,
-							  GEN8_ROW_INSTDONE);
+					intel_gt_mcr_read(engine->gt,
+							  GEN8_ROW_INSTDONE,
+							  slice, subslice);
 			}
 		}
 
 		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
 			for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice)
 				instdone->geom_svg[slice][subslice] =
-					read_subslice_reg(engine, slice, subslice,
-							  XEHPG_INSTDONE_GEOM_SVG);
+					intel_gt_mcr_read(engine->gt,
+							  XEHPG_INSTDONE_GEOM_SVG,
+							  slice, subslice);
 		}
 	} else if (GRAPHICS_VER(i915) >= 7) {
 		instdone->instdone =
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 59c1ab591b86..b7c9cbdf3fc8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -288,30 +288,27 @@ static void gen6_check_faults(struct intel_gt *gt)
 
 static void xehp_check_faults(struct intel_gt *gt)
 {
-	struct intel_uncore *uncore = gt->uncore;
 	u32 fault;
 	int mslice;
 
 	/* Check each mslice's fault register */
 	for (mslice = 0; mslice < 4; mslice++) {
-		fault = intel_uncore_read_with_mcr_steering(uncore,
-							    XEHP_RING_FAULT_REG,
-							    mslice, 0);
+		fault = intel_gt_mcr_read(gt, XEHP_RING_FAULT_REG, mslice, 0);
 		if (fault & RING_FAULT_VALID) {
 			u32 fault_data0, fault_data1;
 			u64 fault_addr;
 
-			fault_data0 = intel_uncore_read_with_mcr_steering(uncore,
-									  XEHP_FAULT_TLB_DATA0,
-									  mslice, 0);
-			fault_data1 = intel_uncore_read_with_mcr_steering(uncore,
-									  XEHP_FAULT_TLB_DATA1,
-									  mslice, 0);
+			fault_data0 =
+				intel_gt_mcr_read(gt, XEHP_FAULT_TLB_DATA0,
+						  mslice, 0);
+			fault_data1 =
+				intel_gt_mcr_read(gt, XEHP_FAULT_TLB_DATA1,
+						  mslice, 0);
 
 			fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) |
 				     ((u64)fault_data0 << 12);
 
-			drm_dbg(&uncore->i915->drm, "Unexpected fault\n"
+			drm_dbg(&gt->i915->drm, "Unexpected fault\n"
 				"\tAddr: 0x%08x_%08x\n"
 				"\tAddress space: %s\n"
 				"\tEngine ID: %d\n"
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
index ea07f2bb846f..dd53641f3637 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c
@@ -65,7 +65,7 @@ static int steering_show(struct seq_file *m, void *data)
 	struct drm_printer p = drm_seq_file_printer(m);
 	struct intel_gt *gt = m->private;
 
-	intel_gt_report_steering(&p, gt, true);
+	intel_gt_mcr_report_steering(&p, gt, true);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index 21edee03ce0f..c8e52d625f18 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -103,23 +103,22 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 	}
 }
 
-/**
- * uncore_rw_with_mcr_steering_fw - Access a register after programming
- *				    the MCR selector register.
+/*
+ * rw_with_mcr_steering_fw - Access a register with specific MCR steering
  * @uncore: pointer to struct intel_uncore
  * @reg: register being accessed
  * @rw_flag: FW_REG_READ for read access or FW_REG_WRITE for write access
- * @slice: slice number (ignored for multi-cast write)
- * @subslice: sub-slice number (ignored for multi-cast write)
+ * @group: group number
+ * @instance: instance number
  * @value: register value to be written (ignored for read)
  *
  * Return: 0 for write access. register value for read access.
  *
  * Caller needs to make sure the relevant forcewake wells are up.
  */
-static u32 uncore_rw_with_mcr_steering_fw(struct intel_uncore *uncore,
-					  i915_reg_t reg, u8 rw_flag,
-					  int slice, int subslice, u32 value)
+static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
+				   i915_reg_t reg, u8 rw_flag,
+				   int group, int instance, u32 value)
 {
 	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
 
@@ -127,7 +126,7 @@ static u32 uncore_rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 
 	if (GRAPHICS_VER(uncore->i915) >= 11) {
 		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
-		mcr_ss = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
+		mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance);
 
 		/*
 		 * Wa_22013088509
@@ -145,7 +144,7 @@ static u32 uncore_rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 			mcr_mask |= GEN11_MCR_MULTICAST;
 	} else {
 		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
-		mcr_ss = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+		mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance);
 	}
 
 	old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
@@ -167,10 +166,10 @@ static u32 uncore_rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 	return val;
 }
 
-static u32 uncore_rw_with_mcr_steering(struct intel_uncore *uncore,
-				       i915_reg_t reg, u8 rw_flag,
-				       int slice, int subslice,
-				       u32 value)
+static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
+				i915_reg_t reg, u8 rw_flag,
+				int group, int instance,
+				u32 value)
 {
 	enum forcewake_domains fw_domains;
 	u32 val;
@@ -184,8 +183,8 @@ static u32 uncore_rw_with_mcr_steering(struct intel_uncore *uncore,
 	spin_lock_irq(&uncore->lock);
 	intel_uncore_forcewake_get__locked(uncore, fw_domains);
 
-	val = uncore_rw_with_mcr_steering_fw(uncore, reg, rw_flag,
-					     slice, subslice, value);
+	val = rw_with_mcr_steering_fw(uncore, reg, rw_flag,
+				      group, instance, value);
 
 	intel_uncore_forcewake_put__locked(uncore, fw_domains);
 	spin_unlock_irq(&uncore->lock);
@@ -193,31 +192,74 @@ static u32 uncore_rw_with_mcr_steering(struct intel_uncore *uncore,
 	return val;
 }
 
-u32 intel_uncore_read_with_mcr_steering_fw(struct intel_uncore *uncore,
-					   i915_reg_t reg, int slice, int subslice)
+/**
+ * intel_gt_mcr_read - read a specific instance of an MCR register
+ * @gt: GT structure
+ * @reg: the MCR register to read
+ * @group: the MCR group
+ * @instance: the MCR instance
+ *
+ * Returns the value read from an MCR register after steering toward a specific
+ * group/instance.
+ */
+u32 intel_gt_mcr_read(struct intel_gt *gt,
+		      i915_reg_t reg,
+		      int group, int instance)
 {
-	return uncore_rw_with_mcr_steering_fw(uncore, reg, FW_REG_READ,
-					      slice, subslice, 0);
+	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ,
+				    group, instance, 0);
 }
 
-u32 intel_uncore_read_with_mcr_steering(struct intel_uncore *uncore,
-					i915_reg_t reg, int slice, int subslice)
+/**
+ * intel_gt_mcr_unicast_write - write a specific instance of an MCR register
+ * @gt: GT structure
+ * @reg: the MCR register to write
+ * @value: value to write
+ * @group: the MCR group
+ * @instance: the MCR instance
+ *
+ * Write an MCR register in unicast mode after steering toward a specific
+ * group/instance.
+ */
+void intel_gt_mcr_unicast_write(struct intel_gt *gt,
+				i915_reg_t reg, u32 value,
+				int group, int instance)
 {
-	return uncore_rw_with_mcr_steering(uncore, reg, FW_REG_READ,
-					   slice, subslice, 0);
+	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE,
+			     group, instance, value);
 }
 
-void intel_uncore_write_with_mcr_steering(struct intel_uncore *uncore,
-					  i915_reg_t reg, u32 value,
-					  int slice, int subslice)
+/**
+ * intel_gt_mcr_multicast_write - write a value to all instances of an MCR register
+ * @gt: GT structure
+ * @reg: the MCR register to write
+ * @value: value to write
+ *
+ * Write an MCR register in multicast mode to update all instances.
+ */
+void intel_gt_mcr_multicast_write(struct intel_gt *gt,
+				  i915_reg_t reg, u32 value)
 {
-	uncore_rw_with_mcr_steering(uncore, reg, FW_REG_WRITE,
-				    slice, subslice, value);
+	intel_uncore_write(gt->uncore, reg, value);
 }
 
-
 /**
- * intel_gt_reg_needs_read_steering - determine whether a register read
+ * intel_gt_mcr_multicast_write_fw - write a value to all instances of an MCR register
+ * @gt: GT structure
+ * @reg: the MCR register to write
+ * @value: value to write
+ *
+ * Write an MCR register in multicast mode to update all instances.  The caller
+ * must already be holding any required forcewake.
+ */
+void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
+				     i915_reg_t reg, u32 value)
+{
+	intel_uncore_write_fw(gt->uncore, reg, value);
+}
+
+/*
+ * reg_needs_read_steering - determine whether a register read
  *     requires explicit steering
  * @gt: GT structure
  * @reg: the register to check steering requirements for
@@ -230,14 +272,14 @@ void intel_uncore_write_with_mcr_steering(struct intel_uncore *uncore,
  * steering type, or if the default (subslice-based) steering IDs are suitable
  * for @type steering too.
  */
-static bool intel_gt_reg_needs_read_steering(struct intel_gt *gt,
-					     i915_reg_t reg,
-					     enum intel_steering_type type)
+static bool reg_needs_read_steering(struct intel_gt *gt,
+				    i915_reg_t reg,
+				    enum intel_steering_type type)
 {
 	const u32 offset = i915_mmio_reg_offset(reg);
 	const struct intel_mmio_range *entry;
 
-	if (likely(!intel_gt_needs_read_steering(gt, type)))
+	if (likely(!gt->steering_table[type]))
 		return false;
 
 	for (entry = gt->steering_table[type]; entry->end; entry++) {
@@ -248,32 +290,32 @@ static bool intel_gt_reg_needs_read_steering(struct intel_gt *gt,
 	return false;
 }
 
-/**
- * intel_gt_get_valid_steering - determines valid IDs for a class of MCR steering
+/*
+ * get_valid_steering - determines non-terminated steering for a class of MCR
  * @gt: GT structure
  * @type: multicast register type
- * @sliceid: Slice ID returned
- * @subsliceid: Subslice ID returned
+ * @group: Group ID returned
+ * @instance: Instance ID returned
  *
- * Determines sliceid and subsliceid values that will steer reads
- * of a specific multicast register class to a valid value.
+ * Determines group and instance values that will steer reads of the specified
+ * MCR class to a non-terminated instance.
  */
-static void intel_gt_get_valid_steering(struct intel_gt *gt,
-					enum intel_steering_type type,
-					u8 *sliceid, u8 *subsliceid)
+static void get_valid_steering(struct intel_gt *gt,
+			       enum intel_steering_type type,
+			       u8 *group, u8 *instance)
 {
 	switch (type) {
 	case L3BANK:
 		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
 
-		*sliceid = 0;		/* unused */
-		*subsliceid = __ffs(gt->info.l3bank_mask);
+		*group = 0;		/* unused */
+		*instance = __ffs(gt->info.l3bank_mask);
 		break;
 	case MSLICE:
 		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
 
-		*sliceid = __ffs(gt->info.mslice_mask);
-		*subsliceid = 0;	/* unused */
+		*group = __ffs(gt->info.mslice_mask);
+		*instance = 0;	/* unused */
 		break;
 	case LNCF:
 		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
@@ -282,88 +324,87 @@ static void intel_gt_get_valid_steering(struct intel_gt *gt,
 		 * An LNCF is always present if its mslice is present, so we
 		 * can safely just steer to LNCF 0 in all cases.
 		 */
-		*sliceid = __ffs(gt->info.mslice_mask) << 1;
-		*subsliceid = 0;	/* unused */
+		*group = __ffs(gt->info.mslice_mask) << 1;
+		*instance = 0;	/* unused */
 		break;
 	default:
 		MISSING_CASE(type);
-		*sliceid = 0;
-		*subsliceid = 0;
+		*group = 0;
+		*instance = 0;
 	}
 }
 
-/**
- * intel_gt_get_valid_steering_for_reg - get a valid steering for a register
+/**
+ * intel_gt_mcr_get_nonterminated_steering - find group/instance values that
+ *    will steer a register to a non-terminated instance
  * @gt: GT structure
  * @reg: register for which the steering is required
- * @sliceid: return variable for slice steering
- * @subsliceid: return variable for subslice steering
+ * @group: return variable for group steering
+ * @instance: return variable for instance steering
  *
- * This function returns a slice/subslice pair that is guaranteed to work for
+ * This function returns a group/instance pair that is guaranteed to work for
  * read steering of the given register. Note that a value will be returned even
  * if the register is not replicated and therefore does not actually require
  * steering.
  */
-void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
-					 u8 *sliceid, u8 *subsliceid)
+void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
+					     i915_reg_t reg,
+					     u8 *group, u8 *instance)
 {
 	int type;
 
 	for (type = 0; type < NUM_STEERING_TYPES; type++) {
-		if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
-			intel_gt_get_valid_steering(gt, type, sliceid,
-						    subsliceid);
+		if (reg_needs_read_steering(gt, reg, type)) {
+			get_valid_steering(gt, type, group, instance);
 			return;
 		}
 	}
 
-	*sliceid = gt->default_steering.groupid;
-	*subsliceid = gt->default_steering.instanceid;
+	*group = gt->default_steering.groupid;
+	*instance = gt->default_steering.instanceid;
 }
 
 /**
- * intel_gt_read_register_fw - reads a GT register with support for multicast
+ * intel_gt_mcr_read_any_fw - reads a GT register with support for multicast
  * @gt: GT structure
  * @reg: register to read
  *
  * This function will read a GT register.  If the register is a multicast
- * register, the read will be steered to a valid instance (i.e., one that
- * isn't fused off or powered down by power gating).
+ * register, the read will be steered to a non-terminated instance (i.e., one
+ * that isn't fused off or powered down by power gating).
+ *
+ * The caller should ensure any necessary forcewake is held.
  *
  * Returns the value from a valid instance of @reg.
  */
-u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg)
+u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
 {
 	int type;
-	u8 sliceid, subsliceid;
+	u8 group, instance;
 
 	for (type = 0; type < NUM_STEERING_TYPES; type++) {
-		if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
-			intel_gt_get_valid_steering(gt, type, &sliceid,
-						    &subsliceid);
-			return intel_uncore_read_with_mcr_steering_fw(gt->uncore,
-								      reg,
-								      sliceid,
-								      subsliceid);
+		if (reg_needs_read_steering(gt, reg, type)) {
+			get_valid_steering(gt, type, &group, &instance);
+			return rw_with_mcr_steering_fw(gt->uncore, reg,
+						       FW_REG_READ,
+						       group, instance, 0);
 		}
 	}
 
 	return intel_uncore_read_fw(gt->uncore, reg);
 }
 
-u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg)
+u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
 {
 	int type;
-	u8 sliceid, subsliceid;
+	u8 group, instance;
 
 	for (type = 0; type < NUM_STEERING_TYPES; type++) {
-		if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
-			intel_gt_get_valid_steering(gt, type, &sliceid,
-						    &subsliceid);
-			return intel_uncore_read_with_mcr_steering(gt->uncore,
-								   reg,
-								   sliceid,
-								   subsliceid);
+		if (reg_needs_read_steering(gt, reg, type)) {
+			get_valid_steering(gt, type, &group, &instance);
+			return rw_with_mcr_steering(gt->uncore, reg,
+						    FW_REG_READ,
+						    group, instance, 0);
 		}
 	}
 
@@ -376,7 +417,7 @@ static void report_steering_type(struct drm_printer *p,
 				 bool dump_table)
 {
 	const struct intel_mmio_range *entry;
-	u8 slice, subslice;
+	u8 group, instance;
 
 	BUILD_BUG_ON(ARRAY_SIZE(intel_steering_types) != NUM_STEERING_TYPES);
 
@@ -386,9 +427,9 @@ static void report_steering_type(struct drm_printer *p,
 		return;
 	}
 
-	intel_gt_get_valid_steering(gt, type, &slice, &subslice);
-	drm_printf(p, "%s steering: sliceid=0x%x, subsliceid=0x%x\n",
-		   intel_steering_types[type], slice, subslice);
+	get_valid_steering(gt, type, &group, &instance);
+	drm_printf(p, "%s steering: group=0x%x, instance=0x%x\n",
+		   intel_steering_types[type], group, instance);
 
 	if (!dump_table)
 		return;
@@ -397,10 +438,10 @@ static void report_steering_type(struct drm_printer *p,
 		drm_printf(p, "\t0x%06x - 0x%06x\n", entry->start, entry->end);
 }
 
-void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
-			      bool dump_table)
+void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
+				  bool dump_table)
 {
-	drm_printf(p, "Default steering: sliceid=0x%x, subsliceid=0x%x\n",
+	drm_printf(p, "Default steering: group=0x%x, instance=0x%x\n",
 		   gt->default_steering.groupid,
 		   gt->default_steering.instanceid);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
index b570c1571243..506b0cbc8db3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -10,28 +10,25 @@
 
 void intel_gt_mcr_init(struct intel_gt *gt);
 
-u32 intel_uncore_read_with_mcr_steering_fw(struct intel_uncore *uncore,
-					   i915_reg_t reg,
-					   int slice, int subslice);
-u32 intel_uncore_read_with_mcr_steering(struct intel_uncore *uncore,
-					i915_reg_t reg,	int slice, int subslice);
-void intel_uncore_write_with_mcr_steering(struct intel_uncore *uncore,
-					  i915_reg_t reg, u32 value,
-					  int slice, int subslice);
-
-u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
-u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg);
-
-static inline bool intel_gt_needs_read_steering(struct intel_gt *gt,
-						enum intel_steering_type type)
-{
-	return gt->steering_table[type];
-}
-
-void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
-					 u8 *sliceid, u8 *subsliceid);
-
-void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
-			      bool dump_table);
+u32 intel_gt_mcr_read(struct intel_gt *gt,
+		      i915_reg_t reg,
+		      int group, int instance);
+u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
+u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
+
+void intel_gt_mcr_unicast_write(struct intel_gt *gt,
+				i915_reg_t reg, u32 value,
+				int group, int instance);
+void intel_gt_mcr_multicast_write(struct intel_gt *gt,
+				  i915_reg_t reg, u32 value);
+void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
+				     i915_reg_t reg, u32 value);
+
+void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
+					     i915_reg_t reg,
+					     u8 *group, u8 *instance);
+
+void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
+				  bool dump_table);
 
 #endif /* __INTEL_GT_MCR__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index 6e788a5fc85a..6952312ddbe4 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -105,7 +105,7 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
 		u64 tile_stolen, flat_ccs_base;
 
 		lmem_size = pci_resource_len(pdev, 2);
-		flat_ccs_base = intel_gt_read_register(gt, XEHPSDV_FLAT_CCS_BASE_ADDR);
+		flat_ccs_base = intel_gt_mcr_read_any(gt, XEHPSDV_FLAT_CCS_BASE_ADDR);
 		flat_ccs_base = (flat_ccs_base >> XEHPSDV_CCS_BASE_SHIFT) * SZ_64K;
 
 		if (GEM_WARN_ON(lmem_size < flat_ccs_base))
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index e9bf0d9f50d8..1864e1fe1e87 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1081,7 +1081,7 @@ static void __add_mcr_wa(struct intel_gt *gt, struct i915_wa_list *wal,
 	gt->default_steering.instanceid = subslice;
 
 	if (drm_debug_enabled(DRM_UT_DRIVER))
-		intel_gt_report_steering(&p, gt, false);
+		intel_gt_mcr_report_steering(&p, gt, false);
 }
 
 static void
@@ -1597,13 +1597,13 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
 		u32 val, old = 0;
 
 		/* open-coded rmw due to steering */
-		old = wa->clr ? intel_gt_read_register_fw(gt, wa->reg) : 0;
+		old = wa->clr ? intel_gt_mcr_read_any_fw(gt, wa->reg) : 0;
 		val = (old & ~wa->clr) | wa->set;
 		if (val != old || !wa->clr)
 			intel_uncore_write_fw(uncore, wa->reg, val);
 
 		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
-			wa_verify(wa, intel_gt_read_register_fw(gt, wa->reg),
+			wa_verify(wa, intel_gt_mcr_read_any_fw(gt, wa->reg),
 				  wal->name, "application");
 	}
 
@@ -1634,7 +1634,7 @@ static bool wa_list_verify(struct intel_gt *gt,
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
 		ok &= wa_verify(wa,
-				intel_gt_read_register_fw(gt, wa->reg),
+				intel_gt_mcr_read_any_fw(gt, wa->reg),
 				wal->name, from);
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index c63a6bf9e853..15f2ded6debf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -314,7 +314,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 	 * tracking, it is easier to just program the default steering for all
 	 * regs that don't need a non-default one.
 	 */
-	intel_gt_get_valid_steering_for_reg(gt, reg, &group, &inst);
+	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
 	entry.flags |= GUC_REGSET_STEERING(group, inst);
 
 	slot = __mmio_reg_add(regset, &entry);
-- 
2.34.1



* [PATCH 11/15] drm/i915/gt: Always use MCR functions on multicast registers
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (9 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 10/15] drm/i915/gt: Cleanup interface for MCR operations Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 12/15] drm/i915/guc: Handle save/restore of MCR registers explicitly Matt Roper
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Rather than relying on the implicit behavior of intel_uncore_*()
functions, let's always use the intel_gt_mcr_*() functions to operate on
multicast/replicated registers.
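
To make the "implicit behavior" concern concrete, here is a small
self-contained sketch (not part of this patch).  All mock_* names are
hypothetical, and the model assumes a read steered at a terminated
instance returns 0.  A plain read depends on whatever the selector
currently holds, whereas the explicit helper steers to the lowest
non-terminated instance, performs the read, and restores the previous
selector value, much like the save/steer/access/restore sequence in
rw_with_mcr_steering_fw():

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_INSTANCES	4

/* Hypothetical device: a steering selector plus one replicated register. */
struct mock_dev {
	uint32_t selector;			/* instance a plain read will hit */
	uint32_t copies[MOCK_INSTANCES];
	uint32_t present_mask;			/* non-terminated instances */
};

/* Plain accessor: the result silently depends on the current selector. */
static uint32_t mock_plain_read(const struct mock_dev *d)
{
	uint32_t inst = d->selector % MOCK_INSTANCES;

	return (d->present_mask & (1u << inst)) ? d->copies[inst] : 0;
}

/* Index of the lowest set bit; the mask must be non-zero. */
static unsigned int mock_ffs(uint32_t mask)
{
	unsigned int bit = 0;

	assert(mask != 0);
	while (!(mask & 1u)) {
		mask >>= 1;
		bit++;
	}
	return bit;
}

/* Explicit accessor: steer to a live instance, read, restore the selector. */
static uint32_t mock_steered_read_any(struct mock_dev *d)
{
	uint32_t old = d->selector;
	uint32_t val;

	d->selector = mock_ffs(d->present_mask);
	val = mock_plain_read(d);
	d->selector = old;

	return val;
}
```

With instances 0 and 1 terminated and the selector left at 0, the plain
read in this model returns 0 while the steered read returns the live
value, which is the kind of mismatch the explicit helpers are meant to
rule out.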

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c      |  4 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c       | 49 ++++++++++++-----------
 drivers/gpu/drm/i915/gt/intel_gtt.h       |  2 +-
 drivers/gpu/drm/i915/gt/intel_mocs.c      | 13 +++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 12 ++++--
 drivers/gpu/drm/i915/intel_pm.c           | 20 +++++----
 6 files changed, 55 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 86b2cd2a9f34..866c66bcc67a 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -1000,7 +1000,7 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 
 	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
 
-	setup_private_pat(ggtt->vm.gt->uncore);
+	setup_private_pat(ggtt->vm.gt);
 
 	return ggtt_probe_common(ggtt, size);
 }
@@ -1349,7 +1349,7 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt)
 		wbinvd_on_all_cpus();
 
 	if (GRAPHICS_VER(ggtt->vm.i915) >= 8)
-		setup_private_pat(ggtt->vm.gt->uncore);
+		setup_private_pat(ggtt->vm.gt);
 
 	intel_ggtt_restore_fences(ggtt);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 601d89b4feb1..6f61c8da0b61 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -15,6 +15,7 @@
 #include "i915_trace.h"
 #include "i915_utils.h"
 #include "intel_gt.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_regs.h"
 #include "intel_gtt.h"
 
@@ -477,27 +478,27 @@ void gtt_write_workarounds(struct intel_gt *gt)
 	}
 }
 
-static void tgl_setup_private_ppat(struct intel_uncore *uncore)
+static void tgl_setup_private_ppat(struct intel_gt *gt)
 {
-	if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50)) {
-		intel_uncore_write(uncore, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
-		intel_uncore_write(uncore, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
-		intel_uncore_write(uncore, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
-		intel_uncore_write(uncore, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
-		intel_uncore_write(uncore, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
-		intel_uncore_write(uncore, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
-		intel_uncore_write(uncore, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
-		intel_uncore_write(uncore, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
+	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) {
+		intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB);
+		intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC);
+		intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT);
+		intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC);
+		intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB);
+		intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB);
+		intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB);
+		intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB);
 	} else {
 		/* TGL doesn't support LLC or AGE settings */
-		intel_uncore_write(uncore, GEN12_PAT_INDEX(0), GEN8_PPAT_WB);
-		intel_uncore_write(uncore, GEN12_PAT_INDEX(1), GEN8_PPAT_WC);
-		intel_uncore_write(uncore, GEN12_PAT_INDEX(2), GEN8_PPAT_WT);
-		intel_uncore_write(uncore, GEN12_PAT_INDEX(3), GEN8_PPAT_UC);
-		intel_uncore_write(uncore, GEN12_PAT_INDEX(4), GEN8_PPAT_WB);
-		intel_uncore_write(uncore, GEN12_PAT_INDEX(5), GEN8_PPAT_WB);
-		intel_uncore_write(uncore, GEN12_PAT_INDEX(6), GEN8_PPAT_WB);
-		intel_uncore_write(uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB);
+		intel_uncore_write(gt->uncore, GEN12_PAT_INDEX(0), GEN8_PPAT_WB);
+		intel_uncore_write(gt->uncore, GEN12_PAT_INDEX(1), GEN8_PPAT_WC);
+		intel_uncore_write(gt->uncore, GEN12_PAT_INDEX(2), GEN8_PPAT_WT);
+		intel_uncore_write(gt->uncore, GEN12_PAT_INDEX(3), GEN8_PPAT_UC);
+		intel_uncore_write(gt->uncore, GEN12_PAT_INDEX(4), GEN8_PPAT_WB);
+		intel_uncore_write(gt->uncore, GEN12_PAT_INDEX(5), GEN8_PPAT_WB);
+		intel_uncore_write(gt->uncore, GEN12_PAT_INDEX(6), GEN8_PPAT_WB);
+		intel_uncore_write(gt->uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB);
 	}
 }
 
@@ -593,20 +594,20 @@ static void chv_setup_private_ppat(struct intel_uncore *uncore)
 	intel_uncore_write(uncore, GEN8_PRIVATE_PAT_HI, upper_32_bits(pat));
 }
 
-void setup_private_pat(struct intel_uncore *uncore)
+void setup_private_pat(struct intel_gt *gt)
 {
-	struct drm_i915_private *i915 = uncore->i915;
+	struct drm_i915_private *i915 = gt->i915;
 
 	GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
 
 	if (GRAPHICS_VER(i915) >= 12)
-		tgl_setup_private_ppat(uncore);
+		tgl_setup_private_ppat(gt);
 	else if (GRAPHICS_VER(i915) >= 11)
-		icl_setup_private_ppat(uncore);
+		icl_setup_private_ppat(gt->uncore);
 	else if (IS_CHERRYVIEW(i915) || IS_GEN9_LP(i915))
-		chv_setup_private_ppat(uncore);
+		chv_setup_private_ppat(gt->uncore);
 	else
-		bdw_setup_private_ppat(uncore);
+		bdw_setup_private_ppat(gt->uncore);
 }
 
 struct i915_vma *
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 5922e2cf4d8d..505ccf048aae 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -630,7 +630,7 @@ void ppgtt_unbind_vma(struct i915_address_space *vm,
 
 void gtt_write_workarounds(struct intel_gt *gt);
 
-void setup_private_pat(struct intel_uncore *uncore);
+void setup_private_pat(struct intel_gt *gt);
 
 int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index c14c0dab0164..bf94f9941c6f 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -7,6 +7,7 @@
 
 #include "intel_engine.h"
 #include "intel_gt.h"
+#include "intel_gt_mcr.h"
 #include "intel_gt_regs.h"
 #include "intel_mocs.h"
 #include "intel_ring.h"
@@ -581,17 +582,17 @@ static u32 l3cc_combine(u16 low, u16 high)
 	     0; \
 	     i++)
 
-static void init_l3cc_table(struct intel_uncore *uncore,
+static void init_l3cc_table(struct intel_gt *gt,
 			    const struct drm_i915_mocs_table *table)
 {
 	unsigned int i;
 	u32 l3cc;
 
 	for_each_l3cc(l3cc, table, i)
-		if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50))
-			intel_uncore_write_fw(uncore, XEHP_LNCFCMOCS(i), l3cc);
+		if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+			intel_gt_mcr_multicast_write_fw(gt, XEHP_LNCFCMOCS(i), l3cc);
 		else
-			intel_uncore_write_fw(uncore, GEN9_LNCFCMOCS(i), l3cc);
+			intel_uncore_write_fw(gt->uncore, GEN9_LNCFCMOCS(i), l3cc);
 }
 
 void intel_mocs_init_engine(struct intel_engine_cs *engine)
@@ -611,7 +612,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
 		init_mocs_table(engine, &table);
 
 	if (flags & HAS_RENDER_L3CC && engine->class == RENDER_CLASS)
-		init_l3cc_table(engine->uncore, &table);
+		init_l3cc_table(engine->gt, &table);
 }
 
 static u32 global_mocs_offset(void)
@@ -645,7 +646,7 @@ void intel_mocs_init(struct intel_gt *gt)
 	 * memory transactions including guc transactions
 	 */
 	if (flags & HAS_RENDER_L3CC)
-		init_l3cc_table(gt->uncore, &table);
+		init_l3cc_table(gt, &table);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index 9229243992c2..5b86b2e286e0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -10,12 +10,15 @@
  */
 
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_mcr.h"
 #include "gt/intel_gt_regs.h"
 #include "intel_guc_fw.h"
 #include "i915_drv.h"
 
-static void guc_prepare_xfer(struct intel_uncore *uncore)
+static void guc_prepare_xfer(struct intel_gt *gt)
 {
+	struct intel_uncore *uncore = gt->uncore;
+
 	u32 shim_flags = GUC_ENABLE_READ_CACHE_LOGIC |
 			 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA |
 			 GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA |
@@ -35,8 +38,9 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
 
 	if (GRAPHICS_VER(uncore->i915) == 9) {
 		/* DOP Clock Gating Enable for GuC clocks */
-		intel_uncore_rmw(uncore, GEN8_MISCCPCTL,
-				 0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE);
+		intel_gt_mcr_multicast_write(gt, GEN8_MISCCPCTL,
+					     GEN8_DOP_CLOCK_GATE_GUC_ENABLE |
+					     intel_gt_mcr_read_any(gt, GEN8_MISCCPCTL));
 
 		/* allows for 5us (in 10ns units) before GT can go to RC6 */
 		intel_uncore_write(uncore, GUC_ARAT_C6DIS, 0x1FF);
@@ -168,7 +172,7 @@ int intel_guc_fw_upload(struct intel_guc *guc)
 	struct intel_uncore *uncore = gt->uncore;
 	int ret;
 
-	guc_prepare_xfer(uncore);
+	guc_prepare_xfer(gt);
 
 	/*
 	 * Note that GuC needs the CSS header plus uKernel code to be copied
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 43a2c95602c0..39bf1d532114 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -45,6 +45,8 @@
 #include "display/skl_universal_plane.h"
 
 #include "gt/intel_engine_regs.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_mcr.h"
 #include "gt/intel_gt_regs.h"
 #include "gt/intel_llc.h"
 
@@ -7416,22 +7418,23 @@ static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
 	u32 val;
 
 	/* WaTempDisableDOPClkGating:bdw */
-	misccpctl = intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL);
-	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
+	misccpctl = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
+				     misccpctl & ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
-	val = intel_uncore_read(&dev_priv->uncore, GEN8_L3SQCREG1);
+	val = intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
 	val &= ~L3_PRIO_CREDITS_MASK;
 	val |= L3_GENERAL_PRIO_CREDITS(general_prio_credits);
 	val |= L3_HIGH_PRIO_CREDITS(high_prio_credits);
-	intel_uncore_write(&dev_priv->uncore, GEN8_L3SQCREG1, val);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_L3SQCREG1, val);
 
 	/*
 	 * Wait at least 100 clocks before re-enabling clock gating.
 	 * See the definition of L3SQCREG1 in BSpec.
 	 */
-	intel_uncore_posting_read(&dev_priv->uncore, GEN8_L3SQCREG1);
+	intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_L3SQCREG1);
 	udelay(1);
-	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, misccpctl);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL, misccpctl);
 }
 
 static void icl_init_clock_gating(struct drm_i915_private *dev_priv)
@@ -7579,8 +7582,9 @@ static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
 	gen9_init_clock_gating(dev_priv);
 
 	/* WaDisableDopClockGating:skl */
-	intel_uncore_write(&dev_priv->uncore, GEN8_MISCCPCTL, intel_uncore_read(&dev_priv->uncore, GEN8_MISCCPCTL) &
-		   ~GEN8_DOP_CLOCK_GATE_ENABLE);
+	intel_gt_mcr_multicast_write(to_gt(dev_priv), GEN8_MISCCPCTL,
+				     intel_gt_mcr_read_any(to_gt(dev_priv), GEN8_MISCCPCTL) &
+				     ~GEN8_DOP_CLOCK_GATE_ENABLE);
 
 	/* WAC6entrylatency:skl */
 	intel_uncore_write(&dev_priv->uncore, FBC_LLC_READ_CTRL, intel_uncore_read(&dev_priv->uncore, FBC_LLC_READ_CTRL) |
-- 
2.34.1


* [PATCH 12/15] drm/i915/guc: Handle save/restore of MCR registers explicitly
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (10 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 11/15] drm/i915/gt: Always use MCR functions on multicast registers Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 13/15] drm/i915/gt: Add MCR-specific workaround initializers Matt Roper
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

MCR registers can be placed on the GuC's save/restore list, but at the
moment they are always handled in a multicast manner (i.e., the GuC
reads one instance to save the value and then does a multicast write to
restore that single value to all instances).  In the future the GuC will
probably give us an alternate interface to do unicast per-instance
save/restore operations, so we should be very clear about which
registers on the list are MCR registers (and in the future which
save/restore behavior we want for them).
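
As a rough illustration of the split this patch makes, the standalone
sketch below records a steering target in an entry's flags only when the
register is an MCR register. It is a sketch under stated assumptions: the
bit positions, the struct layout and every sketch_* name are placeholders
made up for the example rather than the GuC ABI, and the caller passes the
group/instance directly instead of looking them up with
intel_gt_mcr_get_nonterminated_steering().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag layout for illustration only; not the real GuC ABI. */
#define SKETCH_REGSET_MASKED           (1u << 0)
#define SKETCH_REGSET_NEEDS_STEERING   (1u << 1)
#define SKETCH_STEERING_GROUP_SHIFT    12
#define SKETCH_STEERING_GROUP_MASK     (0xfu << SKETCH_STEERING_GROUP_SHIFT)
#define SKETCH_STEERING_INST_SHIFT     20
#define SKETCH_STEERING_INST_MASK      (0xfu << SKETCH_STEERING_INST_SHIFT)

/* One entry on the (hypothetical) save/restore list. */
struct sketch_mmio_reg {
	uint32_t offset;
	uint32_t flags;
};

/* Pack a steering target into a flags word, in the spirit of GUC_REGSET_STEERING(). */
static uint32_t sketch_steering(uint8_t group, uint8_t inst)
{
	/* Both values must fit in their 4-bit placeholder fields. */
	assert(group <= 0xf && inst <= 0xf);

	return ((uint32_t)group << SKETCH_STEERING_GROUP_SHIFT) |
	       ((uint32_t)inst << SKETCH_STEERING_INST_SHIFT) |
	       SKETCH_REGSET_NEEDS_STEERING;
}

/* Non-MCR register: only the offset and the "masked" flag are recorded. */
static struct sketch_mmio_reg sketch_mmio_reg_entry(uint32_t offset, bool masked)
{
	struct sketch_mmio_reg entry = {
		.offset = offset,
		.flags = masked ? SKETCH_REGSET_MASKED : 0,
	};

	return entry;
}

/* MCR register: the same entry, plus an explicit steering target. */
static struct sketch_mmio_reg sketch_mcr_reg_entry(uint32_t offset, bool masked,
						   uint8_t group, uint8_t inst)
{
	struct sketch_mmio_reg entry = sketch_mmio_reg_entry(offset, masked);

	entry.flags |= sketch_steering(group, inst);
	return entry;
}
```

Keeping the steering step in a separate MCR-only helper means a plain
register can never pick up steering flags by accident, and leaves a single
place to change if a per-instance save/restore interface appears later.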

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 55 +++++++++++++---------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 15f2ded6debf..389c5c0aad7a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -277,24 +277,16 @@ __mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg)
 	return slot;
 }
 
-#define GUC_REGSET_STEERING(group, instance) ( \
-	FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \
-	FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \
-	GUC_REGSET_NEEDS_STEERING \
-)
-
 static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 					  struct temp_regset *regset,
-					  i915_reg_t reg, u32 flags)
+					  u32 offset, u32 flags)
 {
 	u32 count = regset->storage_used - (regset->registers - regset->storage);
-	u32 offset = i915_mmio_reg_offset(reg);
 	struct guc_mmio_reg entry = {
 		.offset = offset,
 		.flags = flags,
 	};
 	struct guc_mmio_reg *slot;
-	u8 group, inst;
 
 	/*
 	 * The mmio list is built using separate lists within the driver.
@@ -306,17 +298,6 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 		    sizeof(entry), guc_mmio_reg_cmp))
 		return 0;
 
-	/*
-	 * The GuC doesn't have a default steering, so we need to explicitly
-	 * steer all registers that need steering. However, we do not keep track
-	 * of all the steering ranges, only of those that have a chance of using
-	 * a non-default steering from the i915 pov. Instead of adding such
-	 * tracking, it is easier to just program the default steering for all
-	 * regs that don't need a non-default one.
-	 */
-	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
-	entry.flags |= GUC_REGSET_STEERING(group, inst);
-
 	slot = __mmio_reg_add(regset, &entry);
 	if (IS_ERR(slot))
 		return PTR_ERR(slot);
@@ -334,6 +315,38 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 
 #define GUC_MMIO_REG_ADD(gt, regset, reg, masked) \
 	guc_mmio_reg_add(gt, \
+			 regset, \
+			 i915_mmio_reg_offset(reg), \
+			 (masked) ? GUC_REGSET_MASKED : 0)
+
+#define GUC_REGSET_STEERING(group, instance) ( \
+	FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \
+	FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \
+	GUC_REGSET_NEEDS_STEERING \
+)
+
+static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
+					 struct temp_regset *regset,
+					 i915_reg_t reg, u32 flags)
+{
+	u8 group, inst;
+
+	/*
+	 * The GuC doesn't have a default steering, so we need to explicitly
+	 * steer all registers that need steering. However, we do not keep track
+	 * of all the steering ranges, only of those that have a chance of using
+	 * a non-default steering from the i915 pov. Instead of adding such
+	 * tracking, it is easier to just program the default steering for all
+	 * regs that don't need a non-default one.
+	 */
+	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
+	flags |= GUC_REGSET_STEERING(group, inst);
+
+	return guc_mmio_reg_add(gt, regset, i915_mmio_reg_offset(reg), flags);
+}
+
+#define GUC_MCR_REG_ADD(gt, regset, reg, masked) \
+	guc_mcr_reg_add(gt, \
 			 regset, \
 			 (reg), \
 			 (masked) ? GUC_REGSET_MASKED : 0)
@@ -374,7 +387,7 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
 	/* add in local MOCS registers */
 	for (i = 0; i < LNCFCMOCS_REG_COUNT; i++)
 		if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
-			ret |= GUC_MMIO_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false);
+			ret |= GUC_MCR_REG_ADD(gt, regset, XEHP_LNCFCMOCS(i), false);
 		else
 			ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
 
-- 
2.34.1


* [PATCH 13/15] drm/i915/gt: Add MCR-specific workaround initializers
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (11 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 12/15] drm/i915/guc: Handle save/restore of MCR registers explicitly Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-30 23:28 ` [PATCH 14/15] drm/i915: Define multicast registers as a new type Matt Roper
  2022-03-30 23:28 ` [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering Matt Roper
  14 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

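Let's be more explicit about which of our workarounds are updating MCR
registers: add wa_mcr_* counterparts of the workaround helpers (plus
whitelist_mcr_reg()) and switch the workarounds that touch MCR
registers over to them.  The new helpers take the same arguments as the
existing ones.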
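
For readers unfamiliar with the masked-register helpers that gain MCR
variants here, the standalone sketch below models what a masked-enable
workaround entry carries and how a masked write behaves. It is only a
simplified software model: the sketch_* names, the is_mcr tag and
sketch_masked_write() are hypothetical and not part of this patch; the
entry fields mirror the arguments passed to wa_mcr_add() in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper 16 bits select which of the lower 16 bits a write may change. */
#define SKETCH_MASKED_BIT_ENABLE(bits)  (((uint32_t)(bits) << 16) | (bits))
#define SKETCH_MASKED_BIT_DISABLE(bits) ((uint32_t)(bits) << 16)

/* Simplified workaround entry. */
struct sketch_wa {
	uint32_t offset;
	uint32_t clr, set, read;
	bool masked_reg;
	bool is_mcr;	/* hypothetical tag; not a field added by this patch */
};

/* Build a masked-enable entry: nothing to clear, verify 'val' on readback. */
static struct sketch_wa sketch_wa_masked_en(uint32_t offset, uint32_t val,
					    bool is_mcr)
{
	struct sketch_wa wa = {
		.offset = offset,
		.clr = 0,
		.set = SKETCH_MASKED_BIT_ENABLE(val),
		.read = val,
		.masked_reg = true,
		.is_mcr = is_mcr,
	};

	return wa;
}

/* Software model of a masked write landing on a 16-bit payload. */
static uint32_t sketch_masked_write(uint32_t old, uint32_t write)
{
	uint32_t mask = write >> 16;

	/* Bits outside the mask keep their old value. */
	return (old & ~mask & 0xffffu) | (write & mask);
}
```

In this model, enabling bit 1 on a register holding 0x5 writes 0x00020002
and leaves 0x7, while disabling bit 0 writes 0x00010000 and leaves 0x4; no
read-modify-write cycle is needed in either case.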

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 346 +++++++++++---------
 1 file changed, 198 insertions(+), 148 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 1864e1fe1e87..d7e61c8a8c04 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -166,12 +166,32 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
 	_wa_add(wal, &wa);
 }
 
+static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
+		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
+{
+	struct i915_wa wa = {
+		.reg  = reg,
+		.clr  = clear,
+		.set  = set,
+		.read = read_mask,
+		.masked_reg = masked_reg,
+	};
+
+	_wa_add(wal, &wa);
+}
+
 static void
 wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
 {
 	wa_add(wal, reg, clear, set, clear, false);
 }
 
+static void
+wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
+{
+	wa_mcr_add(wal, reg, clear, set, clear, false);
+}
+
 static void
 wa_write(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 {
@@ -184,12 +204,24 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 	wa_write_clr_set(wal, reg, set, set);
 }
 
+static void
+wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
+{
+	wa_mcr_write_clr_set(wal, reg, set, set);
+}
+
 static void
 wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
 {
 	wa_write_clr_set(wal, reg, clr, 0);
 }
 
+static void
+wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
+{
+	wa_mcr_write_clr_set(wal, reg, clr, 0);
+}
+
 /*
  * WA operations on "masked register". A masked register has the upper 16 bits
  * documented as "masked" in b-spec. Its purpose is to allow writing to just a
@@ -207,12 +239,24 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 	wa_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
 }
 
+static void
+wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+{
+	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
+}
+
 static void
 wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 {
 	wa_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
 }
 
+static void
+wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+{
+	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
+}
+
 static void
 wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
 		    u32 mask, u32 val)
@@ -241,7 +285,7 @@ static void gen8_ctx_workarounds_init(struct intel_engine_cs *engine,
 	wa_masked_en(wal, RING_MI_MODE(RENDER_RING_BASE), ASYNC_FLIP_PERF_DISABLE);
 
 	/* WaDisablePartialInstShootdown:bdw,chv */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN,
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 		     PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
 	/* Use Force Non-Coherent whenever executing a 3D context. This is a
@@ -288,18 +332,18 @@ static void bdw_ctx_workarounds_init(struct intel_engine_cs *engine,
 	gen8_ctx_workarounds_init(engine, wal);
 
 	/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
 
 	/* WaDisableDopClockGating:bdw
 	 *
 	 * Also see the related UCGTCL1 write in bdw_init_clock_gating()
 	 * to disable EUTC clock gating.
 	 */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN2,
-		     DOP_CLOCK_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
+			 DOP_CLOCK_GATING_DISABLE);
 
-	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
-		     GEN8_SAMPLER_POWER_BYPASS_DIS);
+	wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
+			 GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	wa_masked_en(wal, HDC_CHICKEN0,
 		     /* WaForceContextSaveRestoreNonCoherent:bdw */
@@ -314,7 +358,7 @@ static void chv_ctx_workarounds_init(struct intel_engine_cs *engine,
 	gen8_ctx_workarounds_init(engine, wal);
 
 	/* WaDisableThreadStallDopClockGating:chv */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
 
 	/* Improve HiZ throughput on CHV. */
 	wa_masked_en(wal, HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
@@ -333,21 +377,21 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 		 */
 		wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
 			     GEN9_PBE_COMPRESSED_HASH_SELECTION);
-		wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
-			     GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
+		wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
+				 GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
 	}
 
 	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
 	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN,
-		     FLOW_CONTROL_ENABLE |
-		     PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
+			 FLOW_CONTROL_ENABLE |
+			 PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
 	/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
 	/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
-	wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
-		     GEN9_ENABLE_YV12_BUGFIX |
-		     GEN9_ENABLE_GPGPU_PREEMPTION);
+	wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
+			 GEN9_ENABLE_YV12_BUGFIX |
+			 GEN9_ENABLE_GPGPU_PREEMPTION);
 
 	/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
 	/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
@@ -356,8 +400,8 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE);
 
 	/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
-	wa_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5,
-		      GEN9_CCS_TLB_PREFETCH_ENABLE);
+	wa_mcr_masked_dis(wal, GEN9_HALF_SLICE_CHICKEN5,
+			  GEN9_CCS_TLB_PREFETCH_ENABLE);
 
 	/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
 	wa_masked_en(wal, HDC_CHICKEN0,
@@ -386,11 +430,11 @@ static void gen9_ctx_workarounds_init(struct intel_engine_cs *engine,
 	    IS_KABYLAKE(i915) ||
 	    IS_COFFEELAKE(i915) ||
 	    IS_COMETLAKE(i915))
-		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
-			     GEN8_SAMPLER_POWER_BYPASS_DIS);
+		wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN3,
+				 GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
-	wa_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
+	wa_mcr_masked_en(wal, HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
 
 	/*
 	 * Supporting preemption with fine-granularity requires changes in the
@@ -469,8 +513,8 @@ static void bxt_ctx_workarounds_init(struct intel_engine_cs *engine,
 	gen9_ctx_workarounds_init(engine, wal);
 
 	/* WaDisableThreadStallDopClockGating:bxt */
-	wa_masked_en(wal, GEN8_ROW_CHICKEN,
-		     STALL_DOP_GATING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
+			 STALL_DOP_GATING_DISABLE);
 
 	/* WaToEnableHwFixForPushConstHWBug:bxt */
 	wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
@@ -490,8 +534,8 @@ static void kbl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
-		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
+			 GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
 static void glk_ctx_workarounds_init(struct intel_engine_cs *engine,
@@ -514,8 +558,8 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
-		     GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+	wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
+			 GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
 static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
@@ -534,13 +578,13 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 	 * (the register is whitelisted in hardware now, so UMDs can opt in
 	 * for coherency if they have a good reason).
 	 */
-	wa_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT);
+	wa_mcr_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT);
 
 	/* WaEnableFloatBlendOptimization:icl */
-	wa_add(wal, GEN10_CACHE_MODE_SS, 0,
-	       _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE),
-	       0 /* write-only, so skip validation */,
-	       true);
+	wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0,
+		   _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE),
+		   0 /* write-only, so skip validation */,
+		   true);
 
 	/* WaDisableGPGPUMidThreadPreemption:icl */
 	wa_masked_field_set(wal, GEN8_CS_CHICKEN1,
@@ -548,8 +592,8 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			    GEN9_PREEMPT_GPGPU_THREAD_GROUP_LEVEL);
 
 	/* allow headerless messages for preemptible GPGPU context */
-	wa_masked_en(wal, GEN10_SAMPLER_MODE,
-		     GEN11_SAMPLER_ENABLE_HEADLESS_MSG);
+	wa_mcr_masked_en(wal, GEN10_SAMPLER_MODE,
+			 GEN11_SAMPLER_ENABLE_HEADLESS_MSG);
 
 	/* Wa_1604278689:icl,ehl */
 	wa_write(wal, IVB_FBC_RT_BASE, 0xFFFFFFFF & ~ILK_FBC_RT_VALID);
@@ -558,7 +602,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 			 0xFFFFFFFF);
 
 	/* Wa_1406306137:icl,ehl */
-	wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU);
+	wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN11_DIS_PICK_2ND_EU);
 }
 
 /*
@@ -568,13 +612,13 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
 				   struct i915_wa_list *wal)
 {
-	wa_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
-			 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
-	wa_add(wal,
-	       XEHP_FF_MODE2,
-	       FF_MODE2_TDS_TIMER_MASK,
-	       FF_MODE2_TDS_TIMER_128,
-	       0, false);
+	wa_mcr_write_clr_set(wal, XEHP_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK,
+			     REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f));
+	wa_mcr_add(wal,
+		   XEHP_FF_MODE2,
+		   FF_MODE2_TDS_TIMER_MASK,
+		   FF_MODE2_TDS_TIMER_128,
+		   0, false);
 }
 
 /*
@@ -663,27 +707,27 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
 
 	/* Wa_16011186671:dg2_g11 */
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
-		wa_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH);
-		wa_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE);
+		wa_mcr_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH);
+		wa_mcr_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_14010469329:dg2_g10 */
-		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
-			     XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
+		wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
+				 XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
 
 		/*
 		 * Wa_22010465075:dg2_g10
 		 * Wa_22010613112:dg2_g10
 		 * Wa_14010698770:dg2_g10
 		 */
-		wa_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
-			     GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
+		wa_mcr_masked_en(wal, XEHP_COMMON_SLICE_CHICKEN3,
+				 GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
 	}
 
 	/* Wa_16013271637:dg2 */
-	wa_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1,
-		     MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
+	wa_mcr_masked_en(wal, XEHP_SLICE_COMMON_ECO_CHICKEN1,
+			 MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
 
 	/* Wa_14014947963:dg2 */
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_B0, STEP_FOREVER) ||
@@ -1237,9 +1281,9 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 		    PSDUNIT_CLKGATE_DIS);
 
 	/* Wa_1406680159:icl,ehl */
-	wa_write_or(wal,
-		    GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
-		    GWUNIT_CLKGATE_DIS);
+	wa_mcr_write_or(wal,
+			GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
+			GWUNIT_CLKGATE_DIS);
 
 	/* Wa_1607087056:icl,ehl,jsl */
 	if (IS_ICELAKE(i915) ||
@@ -1252,7 +1296,7 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	 * This is not a documented workaround, but rather an optimization
 	 * to reduce sampler power.
 	 */
-	wa_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
+	wa_mcr_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
 }
 
 /*
@@ -1286,7 +1330,7 @@ gen12_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	wa_14011060649(gt, wal);
 
 	/* Wa_14011059788:tgl,rkl,adl-s,dg1,adl-p */
-	wa_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
+	wa_mcr_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
 }
 
 static void
@@ -1298,9 +1342,9 @@ tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 
 	/* Wa_1409420604:tgl */
 	if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
-		wa_write_or(wal,
-			    SUBSLICE_UNIT_LEVEL_CLKGATE2,
-			    CPSSUNIT_CLKGATE_DIS);
+		wa_mcr_write_or(wal,
+				SUBSLICE_UNIT_LEVEL_CLKGATE2,
+				CPSSUNIT_CLKGATE_DIS);
 
 	/* Wa_1607087056:tgl also know as BUG:1409180338 */
 	if (IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
@@ -1329,9 +1373,9 @@ dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 
 	/* Wa_1409420604:dg1 */
 	if (IS_DG1(i915))
-		wa_write_or(wal,
-			    SUBSLICE_UNIT_LEVEL_CLKGATE2,
-			    CPSSUNIT_CLKGATE_DIS);
+		wa_mcr_write_or(wal,
+				SUBSLICE_UNIT_LEVEL_CLKGATE2,
+				CPSSUNIT_CLKGATE_DIS);
 
 	/* Wa_1408615072:dg1 */
 	/* Empirical testing shows this register is unaffected by engine reset. */
@@ -1348,7 +1392,7 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 	xehp_init_mcr(gt, wal);
 
 	/* Wa_1409757795:xehpsdv */
-	wa_write_or(wal, SCCGCTL94DC, CG3DDISURB);
+	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
 
 	/* Wa_16011155590:xehpsdv */
 	if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
@@ -1428,8 +1472,8 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 			    CG3DDISCFEG_CLKGATE_DIS);
 
 		/* Wa_14011006942:dg2 */
-		wa_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
-			    DSS_ROUTER_CLKGATE_DIS);
+		wa_mcr_write_or(wal, GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE,
+				DSS_ROUTER_CLKGATE_DIS);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_B0)) {
@@ -1440,7 +1484,7 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 		wa_write_or(wal, UNSLCGCTL9444, LTCDD_CLKGATE_DIS);
 
 		/* Wa_14011371254:dg2_g10 */
-		wa_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
+		wa_mcr_write_or(wal, XEHP_SLICE_UNIT_LEVEL_CLKGATE, NODEDSS_CLKGATE_DIS);
 
 		/* Wa_14011431319:dg2_g10 */
 		wa_write_or(wal, UNSLCGCTL9440, GAMTLBOACS_CLKGATE_DIS |
@@ -1476,18 +1520,18 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 			    GAMEDIA_CLKGATE_DIS);
 
 		/* Wa_14011028019:dg2_g10 */
-		wa_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS);
+		wa_mcr_write_or(wal, SSMCGCTL9530, RTFUNIT_CLKGATE_DIS);
 	}
 
 	/* Wa_14014830051:dg2 */
-	wa_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN);
+	wa_mcr_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN);
 
 	/*
 	 * The following are not actually "workarounds" but rather
 	 * recommended tuning settings documented in the bspec's
 	 * performance guide section.
 	 */
-	wa_write_or(wal, XEHP_SQCM, EN_32B_ACCESS);
+	wa_mcr_write_or(wal, XEHP_SQCM, EN_32B_ACCESS);
 }
 
 static void
@@ -1686,6 +1730,12 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
 	whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
 }
 
+static void
+whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
+{
+	whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
+}
+
 static void gen9_whitelist_build(struct i915_wa_list *w)
 {
 	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
@@ -1711,7 +1761,7 @@ static void skl_whitelist_build(struct intel_engine_cs *engine)
 	gen9_whitelist_build(w);
 
 	/* WaDisableLSQCROPERFforOCL:skl */
-	whitelist_reg(w, GEN8_L3SQCREG4);
+	whitelist_mcr_reg(w, GEN8_L3SQCREG4);
 }
 
 static void bxt_whitelist_build(struct intel_engine_cs *engine)
@@ -1732,7 +1782,7 @@ static void kbl_whitelist_build(struct intel_engine_cs *engine)
 	gen9_whitelist_build(w);
 
 	/* WaDisableLSQCROPERFforOCL:kbl */
-	whitelist_reg(w, GEN8_L3SQCREG4);
+	whitelist_mcr_reg(w, GEN8_L3SQCREG4);
 }
 
 static void glk_whitelist_build(struct intel_engine_cs *engine)
@@ -1797,10 +1847,10 @@ static void icl_whitelist_build(struct intel_engine_cs *engine)
 	switch (engine->class) {
 	case RENDER_CLASS:
 		/* WaAllowUMDToModifyHalfSliceChicken7:icl */
-		whitelist_reg(w, GEN9_HALF_SLICE_CHICKEN7);
+		whitelist_mcr_reg(w, GEN9_HALF_SLICE_CHICKEN7);
 
 		/* WaAllowUMDToModifySamplerMode:icl */
-		whitelist_reg(w, GEN10_SAMPLER_MODE);
+		whitelist_mcr_reg(w, GEN10_SAMPLER_MODE);
 
 		/* WaEnableStateCacheRedirectToCS:icl */
 		whitelist_reg(w, GEN9_SLICE_COMMON_ECO_CHICKEN1);
@@ -2025,7 +2075,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 
 	if (IS_DG2(i915)) {
 		/* Wa_14015227452:dg2 */
-		wa_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE);
+		wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE);
 
 		/* Wa_1509235366:dg2 */
 		wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS |
@@ -2036,27 +2086,27 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * recommended tuning settings documented in the bspec's
 		 * performance guide section.
 		 */
-		wa_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS);
+		wa_mcr_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS);
 
 		/* Wa_18018781329:dg2 */
-		wa_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB);
-		wa_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB);
-		wa_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB);
-		wa_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14013392000:dg2_g11 */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_ENABLE_LARGE_GRF_MODE);
 
 		/* Wa_16011620976:dg2_g11 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW, DIS_CHAIN_2XSIMD8);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0) ||
 	    IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14012419201:dg2 */
-		wa_masked_en(wal, GEN9_ROW_CHICKEN4,
+		wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4,
 			     GEN12_DISABLE_HDR_PAST_PAYLOAD_HOLD_FIX);
 	}
 
@@ -2066,13 +2116,13 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_22012826095:dg2
 		 * Wa_22013059131:dg2
 		 */
-		wa_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW,
-				 MAXREQS_PER_BANK,
-				 REG_FIELD_PREP(MAXREQS_PER_BANK, 2));
+		wa_mcr_write_clr_set(wal, LSC_CHICKEN_BIT_0_UDW,
+				     MAXREQS_PER_BANK,
+				     REG_FIELD_PREP(MAXREQS_PER_BANK, 2));
 
 		/* Wa_22013059131:dg2 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0,
-			    FORCE_1_SUB_MESSAGE_PER_FRAGMENT);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0,
+				FORCE_1_SUB_MESSAGE_PER_FRAGMENT);
 	}
 
 	/* Wa_1308578152:dg2_g10 when first gslice is fused off */
@@ -2085,19 +2135,19 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_FOREVER) ||
 	    IS_DG2_G11(i915) || IS_DG2_G12(i915)) {
 		/* Wa_22013037850:dg2 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0_UDW,
-			    DISABLE_128B_EVICTION_COMMAND_UDW);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW,
+				DISABLE_128B_EVICTION_COMMAND_UDW);
 
 		/* Wa_22012856258:dg2 */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
 			     GEN12_DISABLE_READ_SUPPRESSION);
 
 		/*
 		 * Wa_22010960976:dg2
 		 * Wa_14013347512:dg2
 		 */
-		wa_masked_dis(wal, XEHP_HDC_CHICKEN0,
-			      LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK);
+		wa_mcr_masked_dis(wal, XEHP_HDC_CHICKEN0,
+				  LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) {
@@ -2105,7 +2155,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_1608949956:dg2_g10
 		 * Wa_14010198302:dg2_g10
 		 */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN,
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN,
 			     MDQ_ARBITRATION_MODE | UGM_BACKUP_MODE);
 
 		/*
@@ -2114,40 +2164,40 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * LSC_CHICKEN_BIT_0 always reads back as 0 is this stepping,
 		 * so ignoring verification.
 		 */
-		wa_add(wal, LSC_CHICKEN_BIT_0_UDW, 0,
-		       FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE,
-		       0, false);
+		wa_mcr_add(wal, LSC_CHICKEN_BIT_0_UDW, 0,
+			   FORCE_SLM_FENCE_SCOPE_TO_TILE | FORCE_UGM_FENCE_SCOPE_TO_TILE,
+			   0, false);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_22010430635:dg2 */
-		wa_masked_en(wal,
-			     GEN9_ROW_CHICKEN4,
-			     GEN12_DISABLE_GRF_CLEAR);
+		wa_mcr_masked_en(wal,
+				 GEN9_ROW_CHICKEN4,
+				 GEN12_DISABLE_GRF_CLEAR);
 
 		/* Wa_14010648519:dg2 */
-		wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
+		wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 	}
 
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_C0) ||
 	    IS_DG2_G11(i915)) {
 		/* Wa_22012654132:dg2 */
-		wa_add(wal, GEN10_CACHE_MODE_SS, 0,
-		       _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC),
-		       0 /* write-only, so skip validation */,
-		       true);
+		wa_mcr_add(wal, GEN10_CACHE_MODE_SS, 0,
+			   _MASKED_BIT_ENABLE(ENABLE_PREFETCH_INTO_IC),
+			   0 /* write-only, so skip validation */,
+			   true);
 	}
 
 	/* Wa_14013202645:dg2 */
 	if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_B0, STEP_C0) ||
 	    IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0))
-		wa_write_or(wal, RT_CTRL, DIS_NULL_QUERY);
+		wa_mcr_write_or(wal, RT_CTRL, DIS_NULL_QUERY);
 
 	/* Wa_22012532006:dg2 */
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_C0) ||
 	    IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0))
-		wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
-			     DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA);
+		wa_mcr_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
+				 DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA);
 
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
 		/* Wa_14010680813:dg2_g10 */
@@ -2158,7 +2208,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0) ||
 	    IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
 		/* Wa_14012362059:dg2 */
-		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 	}
 
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
@@ -2185,7 +2235,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
 
 		/*
 		 * Wa_1407928979:tgl A*
@@ -2210,14 +2260,14 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	    IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
 	    IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
 		/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
-		wa_masked_en(wal, GEN8_ROW_CHICKEN2,
-			     GEN12_PUSH_CONST_DEREF_HOLD_DIS);
+		wa_mcr_masked_en(wal, GEN8_ROW_CHICKEN2,
+				 GEN12_PUSH_CONST_DEREF_HOLD_DIS);
 
 		/*
 		 * Wa_1409085225:tgl
 		 * Wa_14010229206:tgl,rkl,dg1[a0],adl-s,adl-p
 		 */
-		wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH);
+		wa_mcr_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH);
 	}
 
 	if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
@@ -2241,9 +2291,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 	if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915) ||
 	    IS_ALDERLAKE_S(i915) || IS_ALDERLAKE_P(i915)) {
 		/* Wa_1406941453:tgl,rkl,dg1,adl-s,adl-p */
-		wa_masked_en(wal,
-			     GEN10_SAMPLER_MODE,
-			     ENABLE_SMALLPL);
+		wa_mcr_masked_en(wal,
+				 GEN10_SAMPLER_MODE,
+				 ENABLE_SMALLPL);
 	}
 
 	if (GRAPHICS_VER(i915) == 11) {
@@ -2277,9 +2327,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		 * Wa_1405733216:icl
 		 * Formerly known as WaDisableCleanEvicts
 		 */
-		wa_write_or(wal,
-			    GEN8_L3SQCREG4,
-			    GEN11_LQSC_CLEAN_EVICT_DISABLE);
+		wa_mcr_write_or(wal,
+				GEN8_L3SQCREG4,
+				GEN11_LQSC_CLEAN_EVICT_DISABLE);
 
 		/* Wa_1606682166:icl */
 		wa_write_or(wal,
@@ -2287,10 +2337,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			    GEN7_DISABLE_SAMPLER_PREFETCH);
 
 		/* Wa_1409178092:icl */
-		wa_write_clr_set(wal,
-				 GEN11_SCRATCH2,
-				 GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE,
-				 0);
+		wa_mcr_write_clr_set(wal,
+				     GEN11_SCRATCH2,
+				     GEN11_COHERENT_PARTIAL_WRITE_MERGE_ENABLE,
+				     0);
 
 		/* WaEnable32PlaneMode:icl */
 		wa_masked_en(wal, GEN9_CSFE_CHICKEN1_RCS,
@@ -2348,30 +2398,30 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 			     GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE);
 
 		/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
-		wa_write_or(wal,
-			    BDW_SCRATCH1,
-			    GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+		wa_mcr_write_or(wal,
+				BDW_SCRATCH1,
+				GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
 
 		/* WaProgramL3SqcReg1DefaultForPerf:bxt,glk */
 		if (IS_GEN9_LP(i915))
-			wa_write_clr_set(wal,
-					 GEN8_L3SQCREG1,
-					 L3_PRIO_CREDITS_MASK,
-					 L3_GENERAL_PRIO_CREDITS(62) |
-					 L3_HIGH_PRIO_CREDITS(2));
+			wa_mcr_write_clr_set(wal,
+					     GEN8_L3SQCREG1,
+					     L3_PRIO_CREDITS_MASK,
+					     L3_GENERAL_PRIO_CREDITS(62) |
+					     L3_HIGH_PRIO_CREDITS(2));
 
 		/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
-		wa_write_or(wal,
-			    GEN8_L3SQCREG4,
-			    GEN8_LQSC_FLUSH_COHERENT_LINES);
+		wa_mcr_write_or(wal,
+				GEN8_L3SQCREG4,
+				GEN8_LQSC_FLUSH_COHERENT_LINES);
 
 		/* Disable atomics in L3 to prevent unrecoverable hangs */
 		wa_write_clr_set(wal, GEN9_SCRATCH_LNCF1,
 				 GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0);
-		wa_write_clr_set(wal, GEN8_L3SQCREG4,
-				 GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
-		wa_write_clr_set(wal, GEN9_SCRATCH1,
-				 EVICTION_PERF_FIX_ENABLE, 0);
+		wa_mcr_write_clr_set(wal, GEN8_L3SQCREG4,
+				     GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
+		wa_mcr_write_clr_set(wal, GEN9_SCRATCH1,
+				     EVICTION_PERF_FIX_ENABLE, 0);
 	}
 
 	if (IS_HASWELL(i915)) {
@@ -2596,30 +2646,30 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 
 	if (IS_XEHPSDV(i915)) {
 		/* Wa_1409954639 */
-		wa_masked_en(wal,
-			     GEN8_ROW_CHICKEN,
-			     SYSTOLIC_DOP_CLOCK_GATING_DIS);
+		wa_mcr_masked_en(wal,
+				 GEN8_ROW_CHICKEN,
+				 SYSTOLIC_DOP_CLOCK_GATING_DIS);
 
 		/* Wa_1607196519 */
-		wa_masked_en(wal,
-			     GEN9_ROW_CHICKEN4,
-			     GEN12_DISABLE_GRF_CLEAR);
+		wa_mcr_masked_en(wal,
+				 GEN9_ROW_CHICKEN4,
+				 GEN12_DISABLE_GRF_CLEAR);
 
 		/* Wa_14010670810:xehpsdv */
-		wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
+		wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 
 		/* Wa_14010449647:xehpsdv */
-		wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
-			     GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
+		wa_mcr_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1,
+				 GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
 
 		/* Wa_18011725039:xehpsdv */
 		if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A1, STEP_B0)) {
-			wa_masked_dis(wal, MLTICTXCTL, TDONRENDER);
-			wa_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH);
+			wa_mcr_masked_dis(wal, MLTICTXCTL, TDONRENDER);
+			wa_mcr_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH);
 		}
 
 		/* Wa_14012362059:xehpsdv */
-		wa_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
+		wa_mcr_write_or(wal, XEHP_MERT_MOD_CTRL, FORCE_MISS_FTLB);
 
 		/* Wa_14014368820:xehpsdv */
 		wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS |
@@ -2628,7 +2678,7 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 
 	if (IS_DG2(i915)) {
 		/* Wa_22014226127:dg2 */
-		wa_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE);
+		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE);
 	}
 }
 
-- 
2.34.1


* [PATCH 14/15] drm/i915: Define multicast registers as a new type
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (12 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 13/15] drm/i915/gt: Add MCR-specific workaround initializers Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-04-01  7:55   ` [Intel-gfx] " Tvrtko Ursulin
  2022-03-30 23:28 ` [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering Matt Roper
  14 siblings, 1 reply; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Rather than treating multicast registers as 'i915_reg_t', let's define
them as a completely new type (i915_mcr_reg_t).  This will allow the
compiler to help us make sure we're using multicast-aware functions to
operate on multicast registers.

This plan does break down a bit in places where we're just maintaining
heterogeneous lists of registers (e.g., various MMIO whitelists used by
perf, GVT, etc.) rather than performing reads/writes.  We only really
care about the offset in those cases, so for now we can "cast" the
registers as non-MCR, leaving us with a list of i915_reg_t's, but we may
want to look for better ways to store mixed collections of i915_reg_t
and i915_mcr_reg_t in the future.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 49 ++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_gt_mcr.h        | 14 +++---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 27 +++++++---
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  6 +--
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 32 +++++++-----
 .../gpu/drm/i915/gt/selftest_workarounds.c    |  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  4 +-
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 +-
 drivers/gpu/drm/i915/gvt/cmd_parser.c         |  2 +-
 drivers/gpu/drm/i915/gvt/handlers.c           | 17 ++++---
 drivers/gpu/drm/i915/gvt/mmio_context.c       | 14 +++---
 drivers/gpu/drm/i915/i915_perf.c              |  2 +-
 drivers/gpu/drm/i915/i915_reg_defs.h          |  9 ++++
 13 files changed, 113 insertions(+), 69 deletions(-)
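
The type split described in the commit message can be illustrated with a
small standalone sketch.  This is not the driver code: every demo_* /
DEMO_* name below is a hypothetical stand-in invented for illustration,
and only the two offsets (0x20ec and 0xb118) are taken from the patch.
It shows two single-member structs that the compiler refuses to mix, plus
an explicit "cast" for building a heterogeneous list where only the
offset matters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for i915_reg_t and i915_mcr_reg_t. */
typedef struct { uint32_t reg; } demo_reg_t;
typedef struct { uint32_t mmio; } demo_mcr_reg_t;

#define DEMO_MMIO(r)      ((const demo_reg_t){ .reg = (r) })
#define DEMO_MCR_REG(off) ((const demo_mcr_reg_t){ .mmio = (off) })

static inline uint32_t demo_mmio_reg_offset(demo_reg_t r)
{
	return r.reg;
}

static inline uint32_t demo_mcr_reg_offset(demo_mcr_reg_t r)
{
	return r.mmio;
}

/*
 * Explicit conversion for callers that only care about the offset.
 * Passing a demo_mcr_reg_t straight to demo_mmio_reg_offset() would be
 * rejected at compile time because the struct types are incompatible.
 */
static inline demo_reg_t demo_mcr_cast(demo_mcr_reg_t mcr)
{
	return DEMO_MMIO(demo_mcr_reg_offset(mcr));
}

/* Fill a mixed list: one ordinary register, one MCR register. */
static inline void demo_build_list(demo_reg_t out[2])
{
	out[0] = DEMO_MMIO(0x20ec);                    /* non-MCR offset */
	out[1] = demo_mcr_cast(DEMO_MCR_REG(0xb118));  /* MCR offset */
}

/* Offset lookup over the mixed list; the register type no longer matters. */
static inline bool demo_list_contains(const demo_reg_t *list, size_t n,
				      uint32_t offset)
{
	for (size_t i = 0; i < n; i++)
		if (demo_mmio_reg_offset(list[i]) == offset)
			return true;
	return false;
}
```

The explicit cast keeps the conversion visible at each call site, which is
the trade-off the commit message accepts for whitelist-style tables; a
tagged union would be one alternative for storing both types, but that is
only a possibility and not something this patch proposes.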

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index c8e52d625f18..a9a9fa6881f2 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -103,6 +103,19 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 	}
 }
 
+/*
+ * Although the rest of the driver should use MCR-specific functions to
+ * read/write MCR registers, we still use the regular intel_uncore_* functions
+ * internally to implement those, so we need a way for the functions in this
+ * file to "cast" an i915_mcr_reg_t into an i915_reg_t.
+ */
+static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
+{
+	i915_reg_t r = { .reg = mcr.mmio };
+
+	return r;
+}
+
 /*
  * rw_with_mcr_steering_fw - Access a register with specific MCR steering
  * @uncore: pointer to struct intel_uncore
@@ -117,7 +130,7 @@ void intel_gt_mcr_init(struct intel_gt *gt)
  * Caller needs to make sure the relevant forcewake wells are up.
  */
 static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
-				   i915_reg_t reg, u8 rw_flag,
+				   i915_mcr_reg_t reg, u8 rw_flag,
 				   int group, int instance, u32 value)
 {
 	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
@@ -154,9 +167,9 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
 
 	if (rw_flag == FW_REG_READ)
-		val = intel_uncore_read_fw(uncore, reg);
+		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
 	else
-		intel_uncore_write_fw(uncore, reg, value);
+		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
 
 	mcr &= ~mcr_mask;
 	mcr |= old_mcr & mcr_mask;
@@ -167,14 +180,14 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
 }
 
 static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
-				i915_reg_t reg, u8 rw_flag,
+				i915_mcr_reg_t reg, u8 rw_flag,
 				int group, int instance,
 				u32 value)
 {
 	enum forcewake_domains fw_domains;
 	u32 val;
 
-	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
+	fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg),
 						    rw_flag);
 	fw_domains |= intel_uncore_forcewake_for_reg(uncore,
 						     GEN8_MCR_SELECTOR,
@@ -203,7 +216,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
  * group/instance.
  */
 u32 intel_gt_mcr_read(struct intel_gt *gt,
-		      i915_reg_t reg,
+		      i915_mcr_reg_t reg,
 		      int group, int instance)
 {
 	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ,
@@ -222,7 +235,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
  * group/instance.
  */
 void intel_gt_mcr_unicast_write(struct intel_gt *gt,
-				i915_reg_t reg, u32 value,
+				i915_mcr_reg_t reg, u32 value,
 				int group, int instance)
 {
 	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE,
@@ -238,9 +251,9 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt,
  * Write an MCR register in multicast mode to update all instances.
  */
 void intel_gt_mcr_multicast_write(struct intel_gt *gt,
-				i915_reg_t reg, u32 value)
+				i915_mcr_reg_t reg, u32 value)
 {
-	intel_uncore_write(gt->uncore, reg, value);
+	intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value);
 }
 
 /**
@@ -253,9 +266,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
  * must already be holding any required forcewake.
  */
 void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
-				i915_reg_t reg, u32 value)
+				     i915_mcr_reg_t reg, u32 value)
 {
-	intel_uncore_write_fw(gt->uncore, reg, value);
+	intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value);
 }
 
 /*
@@ -273,10 +286,10 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
  * for @type steering too.
  */
 static bool reg_needs_read_steering(struct intel_gt *gt,
-				    i915_reg_t reg,
+				    i915_mcr_reg_t reg,
 				    enum intel_steering_type type)
 {
-	const u32 offset = i915_mmio_reg_offset(reg);
+	const u32 offset = i915_mmio_reg_offset(mcr_reg_cast(reg));
 	const struct intel_mmio_range *entry;
 
 	if (likely(!gt->steering_table[type]))
@@ -348,7 +361,7 @@ static void get_valid_steering(struct intel_gt *gt,
  * steering.
  */
 void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
-					     i915_reg_t reg,
+					     i915_mcr_reg_t reg,
 					     u8 *group, u8 *instance)
 {
 	int type;
@@ -377,7 +390,7 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
  *
  * Returns the value from a valid instance of @reg.
  */
-u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
+u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
 {
 	int type;
 	u8 group, instance;
@@ -391,10 +404,10 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
 		}
 	}
 
-	return intel_uncore_read_fw(gt->uncore, reg);
+	return intel_uncore_read_fw(gt->uncore, mcr_reg_cast(reg));
 }
 
-u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
+u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
 {
 	int type;
 	u8 group, instance;
@@ -408,7 +421,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
 		}
 	}
 
-	return intel_uncore_read(gt->uncore, reg);
+	return intel_uncore_read(gt->uncore, mcr_reg_cast(reg));
 }
 
 static void report_steering_type(struct drm_printer *p,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
index 506b0cbc8db3..176501ea5926 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
@@ -11,21 +11,21 @@
 void intel_gt_mcr_init(struct intel_gt *gt);
 
 u32 intel_gt_mcr_read(struct intel_gt *gt,
-		      i915_reg_t reg,
+		      i915_mcr_reg_t reg,
 		      int group, int instance);
-u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
-u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
+u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg);
+u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg);
 
 void intel_gt_mcr_unicast_write(struct intel_gt *gt,
-				i915_reg_t reg, u32 value,
+				i915_mcr_reg_t reg, u32 value,
 				int group, int instance);
 void intel_gt_mcr_multicast_write(struct intel_gt *gt,
-				  i915_reg_t reg, u32 value);
+				  i915_mcr_reg_t reg, u32 value);
 void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
-				     i915_reg_t reg, u32 value);
+				     i915_mcr_reg_t reg, u32 value);
 
 void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
-					     i915_reg_t reg,
+					     i915_mcr_reg_t reg,
 					     u8 *group, u8 *instance);
 
 void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 3f5e01a48a17..926fb6a8558d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -8,7 +8,18 @@
 
 #include "i915_reg_defs.h"
 
-#define MCR_REG(offset)	_MMIO(offset)
+#define MCR_REG(offset)	((const i915_mcr_reg_t){ .mmio = (offset) })
+
+/*
+ * The perf control registers are technically multicast registers, but the
+ * driver never needs to read/write them directly; we only use them to build
+ * lists of registers (where they're mixed in with other non-MCR registers)
+ * and then operate on the offset directly.  For now we'll just define them
+ * as non-multicast so we can place them on the same list, but we may want
+ * to try to come up with a better way to handle heterogeneous lists of
+ * registers in the future.
+ */
+#define PERF_REG(offset)			_MMIO(offset)
 
 /* RPM unit config (Gen8+) */
 #define RPM_CONFIG0				_MMIO(0xd00)
@@ -1048,8 +1059,8 @@
 #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
 #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
 
-#define EU_PERF_CNTL0				MCR_REG(0xe458)
-#define EU_PERF_CNTL4				MCR_REG(0xe45c)
+#define EU_PERF_CNTL0				PERF_REG(0xe458)
+#define EU_PERF_CNTL4				PERF_REG(0xe45c)
 
 #define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
 #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
@@ -1082,16 +1093,16 @@
 #define RT_CTRL					MCR_REG(0xe530)
 #define   DIS_NULL_QUERY			REG_BIT(10)
 
-#define EU_PERF_CNTL1				MCR_REG(0xe558)
-#define EU_PERF_CNTL5				MCR_REG(0xe55c)
+#define EU_PERF_CNTL1				PERF_REG(0xe558)
+#define EU_PERF_CNTL5				PERF_REG(0xe55c)
 
 #define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
 #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
 #define ICL_HDC_MODE				MCR_REG(0xe5f4)
 
-#define EU_PERF_CNTL2				MCR_REG(0xe658)
-#define EU_PERF_CNTL6				MCR_REG(0xe65c)
-#define EU_PERF_CNTL3				MCR_REG(0xe758)
+#define EU_PERF_CNTL2				PERF_REG(0xe658)
+#define EU_PERF_CNTL6				PERF_REG(0xe65c)
+#define EU_PERF_CNTL3				PERF_REG(0xe758)
 
 #define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
 #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index dffef6ab4baf..c5fd17b6cf96 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1432,13 +1432,13 @@ gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
 {
 	/* NB no one else is allowed to scribble over scratch + 256! */
 	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = i915_mcr_reg_offset(GEN8_L3SQCREG4);
 	*batch++ = intel_gt_scratch_offset(engine->gt,
 					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
 	*batch++ = 0;
 
 	*batch++ = MI_LOAD_REGISTER_IMM(1);
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = i915_mcr_reg_offset(GEN8_L3SQCREG4);
 	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
 
 	batch = gen8_emit_pipe_control(batch,
@@ -1447,7 +1447,7 @@ gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
 				       0);
 
 	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = i915_mcr_reg_offset(GEN8_L3SQCREG4);
 	*batch++ = intel_gt_scratch_offset(engine->gt,
 					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
 	*batch++ = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index d7e61c8a8c04..818ba71f4909 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -166,11 +166,11 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
 	_wa_add(wal, &wa);
 }
 
-static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
+static void wa_mcr_add(struct i915_wa_list *wal, i915_mcr_reg_t reg,
 		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
 {
 	struct i915_wa wa = {
-		.reg  = reg,
+		.reg  = _MMIO(i915_mcr_reg_offset(reg)),
 		.clr  = clear,
 		.set  = set,
 		.read = read_mask,
@@ -187,7 +187,7 @@ wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
 }
 
 static void
-wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
+wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set)
 {
 	wa_mcr_add(wal, reg, clear, set, clear, false);
 }
@@ -205,7 +205,7 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 }
 
 static void
-wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
+wa_mcr_write_or(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set)
 {
 	wa_mcr_write_clr_set(wal, reg, set, set);
 }
@@ -217,7 +217,7 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
 }
 
 static void
-wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
+wa_mcr_write_clr(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clr)
 {
 	wa_mcr_write_clr_set(wal, reg, clr, 0);
 }
@@ -240,7 +240,7 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 }
 
 static void
-wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+wa_mcr_masked_en(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
 {
 	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
 }
@@ -252,7 +252,7 @@ wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
 }
 
 static void
-wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+wa_mcr_masked_dis(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
 {
 	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
 }
@@ -1638,16 +1638,18 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
 	intel_uncore_forcewake_get__locked(uncore, fw);
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
+		/* To be safe, just assume all registers are MCR */
+		i915_mcr_reg_t mcr_reg = MCR_REG(i915_mmio_reg_offset(wa->reg));
 		u32 val, old = 0;
 
 		/* open-coded rmw due to steering */
-		old = wa->clr ? intel_gt_mcr_read_any_fw(gt, wa->reg) : 0;
+		old = wa->clr ? intel_gt_mcr_read_any_fw(gt, mcr_reg) : 0;
 		val = (old & ~wa->clr) | wa->set;
 		if (val != old || !wa->clr)
 			intel_uncore_write_fw(uncore, wa->reg, val);
 
 		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
-			wa_verify(wa, intel_gt_mcr_read_any_fw(gt, wa->reg),
+			wa_verify(wa, intel_gt_mcr_read_any_fw(gt, mcr_reg),
 				  wal->name, "application");
 	}
 
@@ -1676,10 +1678,13 @@ static bool wa_list_verify(struct intel_gt *gt,
 	spin_lock_irqsave(&uncore->lock, flags);
 	intel_uncore_forcewake_get__locked(uncore, fw);
 
-	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
+	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
+		i915_mcr_reg_t mcr_reg = MCR_REG(i915_mmio_reg_offset(wa->reg));
+
 		ok &= wa_verify(wa,
-				intel_gt_mcr_read_any_fw(gt, wa->reg),
+				intel_gt_mcr_read_any_fw(gt, mcr_reg),
 				wal->name, from);
+	}
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
 	spin_unlock_irqrestore(&uncore->lock, flags);
@@ -1731,9 +1736,10 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
 }
 
 static void
-whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
+whitelist_mcr_reg(struct i915_wa_list *wal, i915_mcr_reg_t reg)
 {
-	whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
+	whitelist_reg_ext(wal, _MMIO(i915_mcr_reg_offset(reg)),
+			  RING_FORCE_TO_NONPRIV_ACCESS_RW);
 }
 
 static void gen9_whitelist_build(struct i915_wa_list *w)
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 67a9aab801dd..21b1edc052f8 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -991,7 +991,7 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg)
 	/* Alas, we must pardon some whitelists. Mistakes already made */
 	static const struct regmask pardon[] = {
 		{ GEN9_CTX_PREEMPT_REG, 9 },
-		{ GEN8_L3SQCREG4, 9 },
+		{ _MMIO(0xb118), 9 }, /* GEN8_L3SQCREG4 */
 	};
 
 	return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon));
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 389c5c0aad7a..0a2d50dbfe4b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -327,7 +327,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
 
 static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
 					 struct temp_regset *regset,
-					 i915_reg_t reg, u32 flags)
+					 i915_mcr_reg_t reg, u32 flags)
 {
 	u8 group, inst;
 
@@ -342,7 +342,7 @@ static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
 	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
 	flags |= GUC_REGSET_STEERING(group, inst);
 
-	return guc_mmio_reg_add(gt, regset, i915_mmio_reg_offset(reg), flags);
+	return guc_mmio_reg_add(gt, regset, i915_mcr_reg_offset(reg), flags);
 }
 
 #define GUC_MCR_REG_ADD(gt, regset, reg, masked) \
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
index 7f77e9cdaba4..8d1a85b06ff4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
@@ -239,7 +239,7 @@ static void guc_capture_free_extlists(struct __guc_mmio_reg_descr_group *reglist
 
 struct __ext_steer_reg {
 	const char *name;
-	i915_reg_t reg;
+	i915_mcr_reg_t reg;
 };
 
 static const struct __ext_steer_reg xe_extregs[] = {
@@ -251,7 +251,7 @@ static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
 			   const struct __ext_steer_reg *extlist,
 			   int slice_id, int subslice_id)
 {
-	ext->reg = extlist->reg;
+	ext->reg = _MMIO(i915_mcr_reg_offset(extlist->reg));
 	ext->flags = FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id);
 	ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id);
 	ext->regname = extlist->name;
diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c
index 2459213b6c87..8432f1fe25e6 100644
--- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
+++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
@@ -918,7 +918,7 @@ static int cmd_reg_handler(struct parser_exec_state *s,
 
 	if (!strncmp(cmd, "srm", 3) ||
 			!strncmp(cmd, "lrm", 3)) {
-		if (offset == i915_mmio_reg_offset(GEN8_L3SQCREG4) ||
+		if (offset == i915_mcr_reg_offset(GEN8_L3SQCREG4) ||
 		    offset == 0x21f0 ||
 		    (IS_BROADWELL(gvt->gt->i915) &&
 		     offset == i915_mmio_reg_offset(INSTPM)))
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index bad1065a99a7..84af773c5ebb 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -748,7 +748,7 @@ static i915_reg_t force_nonpriv_white_list[] = {
 	_MMIO(0x770c),
 	_MMIO(0x83a8),
 	_MMIO(0xb110),
-	GEN8_L3SQCREG4,//_MMIO(0xb118)
+	_MMIO(0xb118),
 	_MMIO(0xe100),
 	_MMIO(0xe18c),
 	_MMIO(0xe48c),
@@ -2157,6 +2157,9 @@ static int csfe_chicken1_mmio_write(struct intel_vgpu *vgpu,
 #define MMIO_DFH(reg, d, f, r, w) \
 	MMIO_F(reg, 4, f, 0, 0, d, r, w)
 
+#define MMIO_DFH_MCR(reg, d, f, r, w) \
+	MMIO_F(_MMIO(i915_mcr_reg_offset(reg)), 4, f, 0, 0, d, r, w)
+
 #define MMIO_GM(reg, d, r, w) \
 	MMIO_F(reg, 4, F_GMADR, 0xFFFFF000, 0, d, r, w)
 
@@ -3147,15 +3150,15 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
 	MMIO_D(GEN8_EU_DISABLE2, D_BDW_PLUS);
 
 	MMIO_D(_MMIO(0xfdc), D_BDW_PLUS);
-	MMIO_DFH(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
-		NULL, NULL);
+	MMIO_DFH_MCR(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
+		     NULL, NULL);
 	MMIO_DFH(GEN7_ROW_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
 		NULL, NULL);
 	MMIO_DFH(GEN8_UCGCTL6, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
 
 	MMIO_DFH(_MMIO(0xb1f0), D_BDW, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xb1c0), D_BDW, F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
 	MMIO_D(_MMIO(0xb110), D_BDW);
@@ -3181,7 +3184,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
 
 	MMIO_DFH(_MMIO(0xe194), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0xe188), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x2580), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
 
 	MMIO_DFH(_MMIO(0x2248), D_BDW, F_CMD_ACCESS, NULL, NULL);
@@ -3372,7 +3375,7 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
 	MMIO_D(DMC_HTP_SKL, D_SKL_PLUS);
 	MMIO_D(DMC_LAST_WRITE, D_SKL_PLUS);
 
-	MMIO_DFH(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
 
 	MMIO_D(SKL_DFSM, D_SKL_PLUS);
 	MMIO_D(DISPIO_CR_TX_BMU_CR0, D_SKL_PLUS);
@@ -3619,7 +3622,7 @@ static int init_bxt_mmio_info(struct intel_gvt *gvt)
 	MMIO_D(GEN8_PUSHBUS_ENABLE, D_BXT);
 	MMIO_D(GEN8_PUSHBUS_SHIFT, D_BXT);
 	MMIO_D(GEN6_GFXPAUSE, D_BXT);
-	MMIO_DFH(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
+	MMIO_DFH_MCR(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(GEN8_L3CNTLREG, D_BXT, F_CMD_ACCESS, NULL, NULL);
 	MMIO_DFH(_MMIO(0x20D8), D_BXT, F_CMD_ACCESS, NULL, NULL);
 	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40, F_CMD_ACCESS,
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index 4be07d627941..bf10c3bf6ad8 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -44,6 +44,8 @@
 
 #define GEN9_MOCS_SIZE		64
 
+#define MCR_CAST(mcr)	_MMIO(i915_mcr_reg_offset(mcr))
+
 /* Raw offset is appened to each line for convenience. */
 static struct engine_mmio gen8_engine_mmio_list[] __cacheline_aligned = {
 	{RCS0, RING_MODE_GEN7(RENDER_RING_BASE), 0xffff, false}, /* 0x229c */
@@ -106,15 +108,15 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
 	{RCS0, GEN8_CS_CHICKEN1, 0xffff, true}, /* 0x2580 */
 	{RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */
 	{RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */
-	{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
-	{RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
+	{RCS0, _MMIO(0xb118), 0, false}, /* GEN8_L3SQCREG4 */
+	{RCS0, _MMIO(0xb11c), 0, false}, /* GEN9_SCRATCH1 */
 	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
 	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
-	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
+	{RCS0, _MMIO(0xe180), 0xffff, true}, /* HALF_SLICE_CHICKEN2 */
 	{RCS0, HSW_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
-	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
-	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
-	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
+	{RCS0, _MMIO(0xe188), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN5 */
+	{RCS0, _MMIO(0xe194), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN7 */
+	{RCS0, _MMIO(0xe4f0), 0xffff, true}, /* GEN8_ROW_CHICKEN */
 	{RCS0, TRVATTL3PTRDW(0), 0, true}, /* 0x4de0 */
 	{RCS0, TRVATTL3PTRDW(1), 0, true}, /* 0x4de4 */
 	{RCS0, TRNULLDETCT, 0, true}, /* 0x4de8 */
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 0a9c3fcc09b1..22c10c4a1cbb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3986,7 +3986,7 @@ static u32 mask_reg_value(u32 reg, u32 val)
 	 * WaDisableSTUnitPowerOptimization workaround. Make sure the value
 	 * programmed by userspace doesn't change this.
 	 */
-	if (REG_EQUAL(reg, HALF_SLICE_CHICKEN2))
+	if (reg == i915_mcr_reg_offset(HALF_SLICE_CHICKEN2))
 		val = val & ~_MASKED_BIT_ENABLE(GEN8_ST_PO_DISABLE);
 
 	/* WAIT_FOR_RC6_EXIT has only one bit fullfilling the function
diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
index 8f486f77609f..34eca053fab9 100644
--- a/drivers/gpu/drm/i915/i915_reg_defs.h
+++ b/drivers/gpu/drm/i915/i915_reg_defs.h
@@ -104,6 +104,10 @@ typedef struct {
 
 #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
 
+typedef struct {
+	u32 mmio;
+} i915_mcr_reg_t;
+
 #define INVALID_MMIO_REG _MMIO(0)
 
 static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
@@ -111,6 +115,11 @@ static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
 	return reg.reg;
 }
 
+static __always_inline u32 i915_mcr_reg_offset(const i915_mcr_reg_t reg)
+{
+	return reg.mmio;
+}
+
 static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
 {
 	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);
-- 
2.34.1



* [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering
  2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
                   ` (13 preceding siblings ...)
  2022-03-30 23:28 ` [PATCH 14/15] drm/i915: Define multicast registers as a new type Matt Roper
@ 2022-03-30 23:28 ` Matt Roper
  2022-03-31 17:35   ` [Intel-gfx] " Tvrtko Ursulin
  2022-04-01  8:34   ` Tvrtko Ursulin
  14 siblings, 2 replies; 24+ messages in thread
From: Matt Roper @ 2022-03-30 23:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Historically we've selected and programmed a single MCR group/instance
ID at driver startup that will steer register accesses for GSLICE/DSS
ranges to a non-terminated instance.  Any reads of these register ranges
that don't need a specific unicast access won't bother explicitly
resteering because the control register should already be set to a
suitable value to receive a non-terminated response.  Any accesses to
other types of MCR ranges (MSLICE, LNCF, etc.) that are also satisfied
by the default steering target will also skip explicit re-steering at
runtime.

This approach has worked well for many years and many platforms, but our
hardware teams have recently advised us that they're not 100%
comfortable with our strategy going forward.  They now suggest
explicitly steering reads of any multicast register at the time the
register access happens rather than relying on previously-programmed
steering values.  In debug settings there could be external agents that
have adjusted the default steering without the driver's knowledge (e.g.,
we could do this manually with IGT's intel_reg today, although the
hardware teams also have other tools they use for debug and analysis).
In theory it would also be possible for bad firmware and/or incorrect
handling of power management events to clobber/wipe the steering
value we had previously initialized because they assume we'll be
re-programming it anyway.
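
As an illustration of choosing a steering target at access time, the
standalone sketch below mirrors the arithmetic of the DSS case added to
get_valid_steering(): take the lowest set bit of the DSS mask and split it
into a group and an instance.  It is not driver code.  DSS_PER_GSLICE is set
to 4 purely as an assumption for the example, and struct steering,
lowest_set_bit() and pick_dss_steering() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the driver has its own GEN_DSS_PER_GSLICE. */
#define DSS_PER_GSLICE 4

/* Hypothetical container for a group/instance steering pair. */
struct steering {
	uint8_t group;
	uint8_t instance;
};

/* Index of the lowest set bit; the mask must be non-zero. */
static unsigned int lowest_set_bit(uint32_t mask)
{
	unsigned int bit = 0;

	assert(mask != 0);
	while (!(mask & 1u)) {
		mask >>= 1;
		bit++;
	}
	return bit;
}

/* Steer to the first DSS that is present in the mask. */
static struct steering pick_dss_steering(uint32_t dss_mask)
{
	unsigned int first = lowest_set_bit(dss_mask);
	struct steering s = {
		.group = (uint8_t)(first / DSS_PER_GSLICE),
		.instance = (uint8_t)(first % DSS_PER_GSLICE),
	};

	return s;
}
```

With these assumptions, a mask whose lowest set bit is bit 5 yields group 1,
instance 1, so a part with its first gslice fused off is never steered to a
terminated instance.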

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 40 +++++++++
 drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 98 ++++-----------------
 3 files changed, 56 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index a9a9fa6881f2..787752367337 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -35,6 +35,7 @@
  */
 
 static const char * const intel_steering_types[] = {
+	"GSLICE/DSS",
 	"L3BANK",
 	"MSLICE",
 	"LNCF",
@@ -45,6 +46,35 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
 	{},
 };
 
+static const struct intel_mmio_range xehpsdv_dss_steering_table[] = {
+	{ 0x005200, 0x0052FF },
+	{ 0x005400, 0x007FFF },
+	{ 0x008140, 0x00815F },
+	{ 0x008D00, 0x008DFF },
+	{ 0x0094D0, 0x00955F },
+	{ 0x009680, 0x0096FF },
+	{ 0x00DC00, 0x00DCFF },
+	{ 0x00DE80, 0x00E8FF },
+	{ 0x017000, 0x017FFF },
+	{ 0x024A00, 0x024A7F },
+	{},
+};
+
+static const struct intel_mmio_range dg2_dss_steering_table[] = {
+	{ 0x005200, 0x0052FF },
+	{ 0x005400, 0x007FFF },
+	{ 0x008140, 0x00815F },
+	{ 0x008D00, 0x008DFF },
+	{ 0x0094D0, 0x00955F },
+	{ 0x009680, 0x0096FF },
+	{ 0x00D800, 0x00D87F },
+	{ 0x00DC00, 0x00DCFF },
+	{ 0x00DE80, 0x00E8FF },
+	{ 0x017000, 0x017FFF },
+	{ 0x024A00, 0x024A7F },
+	{},
+};
+
 static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
 	{ 0x004000, 0x004AFF },
 	{ 0x00C800, 0x00CFFF },
@@ -87,9 +117,11 @@ void intel_gt_mcr_init(struct intel_gt *gt)
 			 GEN12_MEML3_EN_MASK);
 
 	if (IS_DG2(i915)) {
+		gt->steering_table[DSS] = dg2_dss_steering_table;
 		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
 		gt->steering_table[LNCF] = dg2_lncf_steering_table;
 	} else if (IS_XEHPSDV(i915)) {
+		gt->steering_table[DSS] = xehpsdv_dss_steering_table;
 		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
 		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
 	} else if (GRAPHICS_VER(i915) >= 11 &&
@@ -317,7 +349,15 @@ static void get_valid_steering(struct intel_gt *gt,
 			       enum intel_steering_type type,
 			       u8 *group, u8 *instance)
 {
+	u32 dssmask = intel_sseu_get_subslices(&gt->info.sseu, 0);
+
 	switch (type) {
+	case DSS:
+		drm_WARN_ON(&gt->i915->drm, dssmask == 0);
+
+		*group = __ffs(dssmask) / GEN_DSS_PER_GSLICE;
+		*instance = __ffs(dssmask) % GEN_DSS_PER_GSLICE;
+		break;
 	case L3BANK:
 		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 937b2e1a305e..b77bbaac7622 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -54,6 +54,7 @@ struct intel_mmio_range {
  * are listed here.
  */
 enum intel_steering_type {
+	DSS,
 	L3BANK,
 	MSLICE,
 	LNCF,
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 818ba71f4909..2486c6aa9d9d 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1160,87 +1160,6 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
 	__add_mcr_wa(gt, wal, slice, subslice);
 }
 
-static void
-xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
-{
-	const struct sseu_dev_info *sseu = &gt->info.sseu;
-	unsigned long slice, subslice = 0, slice_mask = 0;
-	u64 dss_mask = 0;
-	u32 lncf_mask = 0;
-	int i;
-
-	/*
-	 * On Xe_HP the steering increases in complexity. There are now several
-	 * more units that require steering and we're not guaranteed to be able
-	 * to find a common setting for all of them. These are:
-	 * - GSLICE (fusable)
-	 * - DSS (sub-unit within gslice; fusable)
-	 * - L3 Bank (fusable)
-	 * - MSLICE (fusable)
-	 * - LNCF (sub-unit within mslice; always present if mslice is present)
-	 *
-	 * We'll do our default/implicit steering based on GSLICE (in the
-	 * sliceid field) and DSS (in the subsliceid field).  If we can
-	 * find overlap between the valid MSLICE and/or LNCF values with
-	 * a suitable GSLICE, then we can just re-use the default value and
-	 * skip and explicit steering at runtime.
-	 *
-	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
-	 * a valid sliceid value.  DSS steering is the only type of steering
-	 * that utilizes the 'subsliceid' bits.
-	 *
-	 * Also note that, even though the steering domain is called "GSlice"
-	 * and it is encoded in the register using the gslice format, the spec
-	 * says that the combined (geometry | compute) fuse should be used to
-	 * select the steering.
-	 */
-
-	/* Find the potential gslice candidates */
-	dss_mask = intel_sseu_get_subslices(sseu, 0);
-	slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
-
-	/*
-	 * Find the potential LNCF candidates.  Either LNCF within a valid
-	 * mslice is fine.
-	 */
-	for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
-		lncf_mask |= (0x3 << (i * 2));
-
-	/*
-	 * Are there any sliceid values that work for both GSLICE and LNCF
-	 * steering?
-	 */
-	if (slice_mask & lncf_mask) {
-		slice_mask &= lncf_mask;
-		gt->steering_table[LNCF] = NULL;
-	}
-
-	/* How about sliceid values that also work for MSLICE steering? */
-	if (slice_mask & gt->info.mslice_mask) {
-		slice_mask &= gt->info.mslice_mask;
-		gt->steering_table[MSLICE] = NULL;
-	}
-
-	slice = __ffs(slice_mask);
-	subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
-	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
-	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
-
-	__add_mcr_wa(gt, wal, slice, subslice);
-
-	/*
-	 * SQIDI ranges are special because they use different steering
-	 * registers than everything else we work with.  On XeHP SDV and
-	 * DG2-G10, any value in the steering registers will work fine since
-	 * all instances are present, but DG2-G11 only has SQIDI instances at
-	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
-	 * we'll just steer to a hardcoded "2" since that value will work
-	 * everywhere.
-	 */
-	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
-	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
-}
-
 static void
 icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 {
@@ -1388,8 +1307,9 @@ static void
 xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 {
 	struct drm_i915_private *i915 = gt->i915;
+	struct drm_printer p = drm_debug_printer("MCR Steering:");
 
-	xehp_init_mcr(gt, wal);
+	intel_gt_mcr_report_steering(&p, gt, false);
 
 	/* Wa_1409757795:xehpsdv */
 	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
@@ -1441,10 +1361,22 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 static void
 dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
 {
+	struct drm_printer p = drm_debug_printer("MCR Steering:");
 	struct intel_engine_cs *engine;
 	int id;
 
-	xehp_init_mcr(gt, wal);
+	intel_gt_mcr_report_steering(&p, gt, false);
+
+	/*
+	 * SQIDI ranges are special because they use different steering
+	 * registers than everything else we work with.  On DG2-G10, any value
+	 * in the steering registers will work fine since all instances are
+	 * present, but DG2-G11 only has SQIDI instances at ID's 2 and 3, so we
+	 * need to steer to one of those.  For simplicity we'll just steer to a
+	 * hardcoded "2" since that value will work everywhere.
+	 */
+	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
+	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
 
 	/* Wa_14011060649:dg2 */
 	wa_14011060649(gt, wal);
-- 
2.34.1



* Re: [Intel-gfx] [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering
  2022-03-30 23:28 ` [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering Matt Roper
@ 2022-03-31 17:35   ` Tvrtko Ursulin
  2022-04-04 21:35     ` Matt Roper
  2022-04-01  8:34   ` Tvrtko Ursulin
  1 sibling, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2022-03-31 17:35 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: dri-devel


On 31/03/2022 00:28, Matt Roper wrote:
> Historically we've selected and programmed a single MCR group/instance
> ID at driver startup that will steer register accesses for GSLICE/DSS
> ranges to a non-terminated instance.  Any reads of these register ranges
> that don't need a specific unicast access won't bother explicitly
> resteering because the control register should already be set to a
> suitable value to receive a non-terminated response.  Any accesses to
> other types of MCR ranges (MSLICE, LNCF, etc.) that are also satisfied
> by the default steering target will also skip explicit re-steering at
> runtime.
> 
> This approach has worked well for many years and many platforms, but our
> hardware teams have recently advised us that they're not 100%
> comfortable with our strategy going forward.  They now suggest
> explicitly steering reads of any multicast register at the time the
> register access happens rather than relying on previously-programmed
> steering values.  In debug settings there could be external agents that
> have adjusted the default steering without the driver's knowledge (e.g.,
> we could do this manually with IGT's intel_reg today, although the
> hardware teams also have other tools they use for debug and analysis).
> In theory it would also be possible for bad firmware and/or incorrect
> handling of power management events to clobber/wipe the steering
> value we had previously initialized because they assume we'll be
> re-programming it anyway.

That an external agent of any kind can mess with registers behind the 
driver's back is kind of a weak justification, no? Steering is just one 
small part of what can go wrong in that case.

Also, someone at some point has to know which are the affected 
registers, be it in a range table or at the actual point of submitting 
patches which add register definitions. Mistakes are possible at any of 
those points.

So I guess I am not immediately buying the need to refactor all this. 
Apart from churn, the main downside that I see is that all accesses need 
separate helpers. Question is why. Driver could choose to always steer 
before reading today, right?

Regards,

Tvrtko

> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 40 +++++++++
>   drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
>   drivers/gpu/drm/i915/gt/intel_workarounds.c | 98 ++++-----------------
>   3 files changed, 56 insertions(+), 83 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index a9a9fa6881f2..787752367337 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -35,6 +35,7 @@
>    */
>   
>   static const char * const intel_steering_types[] = {
> +	"GSLICE/DSS",
>   	"L3BANK",
>   	"MSLICE",
>   	"LNCF",
> @@ -45,6 +46,35 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
>   	{},
>   };
>   
> +static const struct intel_mmio_range xehpsdv_dss_steering_table[] = {
> +	{ 0x005200, 0x0052FF },
> +	{ 0x005400, 0x007FFF },
> +	{ 0x008140, 0x00815F },
> +	{ 0x008D00, 0x008DFF },
> +	{ 0x0094D0, 0x00955F },
> +	{ 0x009680, 0x0096FF },
> +	{ 0x00DC00, 0x00DCFF },
> +	{ 0x00DE80, 0x00E8FF },
> +	{ 0x017000, 0x017FFF },
> +	{ 0x024A00, 0x024A7F },
> +	{},
> +};
> +
> +static const struct intel_mmio_range dg2_dss_steering_table[] = {
> +	{ 0x005200, 0x0052FF },
> +	{ 0x005400, 0x007FFF },
> +	{ 0x008140, 0x00815F },
> +	{ 0x008D00, 0x008DFF },
> +	{ 0x0094D0, 0x00955F },
> +	{ 0x009680, 0x0096FF },
> +	{ 0x00D800, 0x00D87F },
> +	{ 0x00DC00, 0x00DCFF },
> +	{ 0x00DE80, 0x00E8FF },
> +	{ 0x017000, 0x017FFF },
> +	{ 0x024A00, 0x024A7F },
> +	{},
> +};
> +
>   static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
>   	{ 0x004000, 0x004AFF },
>   	{ 0x00C800, 0x00CFFF },
> @@ -87,9 +117,11 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>   			 GEN12_MEML3_EN_MASK);
>   
>   	if (IS_DG2(i915)) {
> +		gt->steering_table[DSS] = dg2_dss_steering_table;
>   		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
>   		gt->steering_table[LNCF] = dg2_lncf_steering_table;
>   	} else if (IS_XEHPSDV(i915)) {
> +		gt->steering_table[DSS] = xehpsdv_dss_steering_table;
>   		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
>   		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
>   	} else if (GRAPHICS_VER(i915) >= 11 &&
> @@ -317,7 +349,15 @@ static void get_valid_steering(struct intel_gt *gt,
>   			       enum intel_steering_type type,
>   			       u8 *group, u8 *instance)
>   {
> +	u32 dssmask = intel_sseu_get_subslices(&gt->info.sseu, 0);
> +
>   	switch (type) {
> +	case DSS:
> +		drm_WARN_ON(&gt->i915->drm, dssmask == 0);
> +
> +		*group = __ffs(dssmask) / GEN_DSS_PER_GSLICE;
> +		*instance = __ffs(dssmask) % GEN_DSS_PER_GSLICE;
> +		break;
>   	case L3BANK:
>   		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index 937b2e1a305e..b77bbaac7622 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -54,6 +54,7 @@ struct intel_mmio_range {
>    * are listed here.
>    */
>   enum intel_steering_type {
> +	DSS,
>   	L3BANK,
>   	MSLICE,
>   	LNCF,
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 818ba71f4909..2486c6aa9d9d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1160,87 +1160,6 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
>   	__add_mcr_wa(gt, wal, slice, subslice);
>   }
>   
> -static void
> -xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> -{
> -	const struct sseu_dev_info *sseu = &gt->info.sseu;
> -	unsigned long slice, subslice = 0, slice_mask = 0;
> -	u64 dss_mask = 0;
> -	u32 lncf_mask = 0;
> -	int i;
> -
> -	/*
> -	 * On Xe_HP the steering increases in complexity. There are now several
> -	 * more units that require steering and we're not guaranteed to be able
> -	 * to find a common setting for all of them. These are:
> -	 * - GSLICE (fusable)
> -	 * - DSS (sub-unit within gslice; fusable)
> -	 * - L3 Bank (fusable)
> -	 * - MSLICE (fusable)
> -	 * - LNCF (sub-unit within mslice; always present if mslice is present)
> -	 *
> -	 * We'll do our default/implicit steering based on GSLICE (in the
> -	 * sliceid field) and DSS (in the subsliceid field).  If we can
> -	 * find overlap between the valid MSLICE and/or LNCF values with
> -	 * a suitable GSLICE, then we can just re-use the default value and
> -	 * skip and explicit steering at runtime.
> -	 *
> -	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
> -	 * a valid sliceid value.  DSS steering is the only type of steering
> -	 * that utilizes the 'subsliceid' bits.
> -	 *
> -	 * Also note that, even though the steering domain is called "GSlice"
> -	 * and it is encoded in the register using the gslice format, the spec
> -	 * says that the combined (geometry | compute) fuse should be used to
> -	 * select the steering.
> -	 */
> -
> -	/* Find the potential gslice candidates */
> -	dss_mask = intel_sseu_get_subslices(sseu, 0);
> -	slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
> -
> -	/*
> -	 * Find the potential LNCF candidates.  Either LNCF within a valid
> -	 * mslice is fine.
> -	 */
> -	for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
> -		lncf_mask |= (0x3 << (i * 2));
> -
> -	/*
> -	 * Are there any sliceid values that work for both GSLICE and LNCF
> -	 * steering?
> -	 */
> -	if (slice_mask & lncf_mask) {
> -		slice_mask &= lncf_mask;
> -		gt->steering_table[LNCF] = NULL;
> -	}
> -
> -	/* How about sliceid values that also work for MSLICE steering? */
> -	if (slice_mask & gt->info.mslice_mask) {
> -		slice_mask &= gt->info.mslice_mask;
> -		gt->steering_table[MSLICE] = NULL;
> -	}
> -
> -	slice = __ffs(slice_mask);
> -	subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
> -	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
> -	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
> -
> -	__add_mcr_wa(gt, wal, slice, subslice);
> -
> -	/*
> -	 * SQIDI ranges are special because they use different steering
> -	 * registers than everything else we work with.  On XeHP SDV and
> -	 * DG2-G10, any value in the steering registers will work fine since
> -	 * all instances are present, but DG2-G11 only has SQIDI instances at
> -	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
> -	 * we'll just steer to a hardcoded "2" since that value will work
> -	 * everywhere.
> -	 */
> -	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> -	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> -}
> -
>   static void
>   icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>   {
> @@ -1388,8 +1307,9 @@ static void
>   xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>   {
>   	struct drm_i915_private *i915 = gt->i915;
> +	struct drm_printer p = drm_debug_printer("MCR Steering:");
>   
> -	xehp_init_mcr(gt, wal);
> +	intel_gt_mcr_report_steering(&p, gt, false);
>   
>   	/* Wa_1409757795:xehpsdv */
>   	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
> @@ -1441,10 +1361,22 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>   static void
>   dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>   {
> +	struct drm_printer p = drm_debug_printer("MCR Steering:");
>   	struct intel_engine_cs *engine;
>   	int id;
>   
> -	xehp_init_mcr(gt, wal);
> +	intel_gt_mcr_report_steering(&p, gt, false);
> +
> +	/*
> +	 * SQIDI ranges are special because they use different steering
> +	 * registers than everything else we work with.  On DG2-G10, any value
> +	 * in the steering registers will work fine since all instances are
> +	 * present, but DG2-G11 only has SQIDI instances at ID's 2 and 3, so we
> +	 * need to steer to one of those.  For simplicity we'll just steer to a
> +	 * hardcoded "2" since that value will work everywhere.
> +	 */
> +	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> +	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
>   
>   	/* Wa_14011060649:dg2 */
>   	wa_14011060649(gt, wal);


* Re: [Intel-gfx] [PATCH 14/15] drm/i915: Define multicast registers as a new type
  2022-03-30 23:28 ` [PATCH 14/15] drm/i915: Define multicast registers as a new type Matt Roper
@ 2022-04-01  7:55   ` Tvrtko Ursulin
  2022-04-04 21:12     ` Matt Roper
  0 siblings, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2022-04-01  7:55 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: dri-devel


On 31/03/2022 00:28, Matt Roper wrote:
> Rather than treating multicast registers as 'i915_reg_t' let's define
> them as a completely new type.  This will allow the compiler to help us
> make sure we're using multicast-aware functions to operate on multicast
> registers.
> 
> This plan does break down a bit in places where we're just maintaining
> heterogeneous lists of registers (e.g., various MMIO whitelists used by
> perf, GVT, etc.) rather than performing reads/writes.  We only really
> care about the offset in those cases, so for now we can "cast" the
> registers as non-MCR, leaving us with a list of i915_reg_t's, but we may
> want to look for better ways to store mixed collections of i915_reg_t
> and i915_mcr_reg_t in the future.
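
A minimal standalone sketch of the idea described above, assuming nothing
beyond standard C.  It is not the driver's code: the demo_* types, macros and
functions are hypothetical stand-ins modelled on the i915_reg_t and
i915_mcr_reg_t definitions in this patch.  Because the two registers are
distinct struct types, handing an MCR register to a helper that expects a
plain register is rejected at compile time, and offset-only lists need an
explicit conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a regular register type (modelled on i915_reg_t). */
typedef struct {
	uint32_t reg;
} demo_reg_t;

/* Stand-in for a multicast register type (modelled on i915_mcr_reg_t). */
typedef struct {
	uint32_t mmio;
} demo_mcr_reg_t;

#define DEMO_REG(off)     ((demo_reg_t){ .reg = (off) })
#define DEMO_MCR_REG(off) ((demo_mcr_reg_t){ .mmio = (off) })

/* Accessor for plain registers only. */
static uint32_t demo_reg_offset(demo_reg_t r)
{
	return r.reg;
}

/* Accessor for multicast registers only. */
static uint32_t demo_mcr_reg_offset(demo_mcr_reg_t r)
{
	return r.mmio;
}

/*
 * Explicit conversion for lists that only care about the offset.
 * Without it, demo_reg_offset(DEMO_MCR_REG(0xe180)) is a compile error
 * because the two struct types are incompatible.
 */
static demo_reg_t demo_mcr_to_reg(demo_mcr_reg_t r)
{
	return DEMO_REG(demo_mcr_reg_offset(r));
}
```

The conversion helper plays the role that MCR_CAST() and mcr_reg_cast() play
in the patch: every place that drops the MCR type has to say so explicitly.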
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 49 ++++++++++++-------
>   drivers/gpu/drm/i915/gt/intel_gt_mcr.h        | 14 +++---
>   drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 27 +++++++---
>   drivers/gpu/drm/i915/gt/intel_lrc.c           |  6 +--
>   drivers/gpu/drm/i915/gt/intel_workarounds.c   | 32 +++++++-----
>   .../gpu/drm/i915/gt/selftest_workarounds.c    |  2 +-
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  4 +-
>   .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 +-
>   drivers/gpu/drm/i915/gvt/cmd_parser.c         |  2 +-
>   drivers/gpu/drm/i915/gvt/handlers.c           | 17 ++++---
>   drivers/gpu/drm/i915/gvt/mmio_context.c       | 14 +++---
>   drivers/gpu/drm/i915/i915_perf.c              |  2 +-
>   drivers/gpu/drm/i915/i915_reg_defs.h          |  9 ++++
>   13 files changed, 113 insertions(+), 69 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index c8e52d625f18..a9a9fa6881f2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -103,6 +103,19 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>   	}
>   }
>   
> +/*
> + * Although the rest of the driver should use MCR-specific functions to
> + * read/write MCR registers, we still use the regular intel_uncore_* functions
> + * internally to implement those, so we need a way for the functions in this
> + * file to "cast" an i915_mcr_reg_t into an i915_reg_t.
> + */
> +static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
> +{
> +	i915_reg_t r = { .reg = mcr.mmio };
> +
> +	return r;
> +}
> +
>   /*
>    * rw_with_mcr_steering_fw - Access a register with specific MCR steering
>    * @uncore: pointer to struct intel_uncore
> @@ -117,7 +130,7 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>    * Caller needs to make sure the relevant forcewake wells are up.
>    */
>   static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> -				   i915_reg_t reg, u8 rw_flag,
> +				   i915_mcr_reg_t reg, u8 rw_flag,
>   				   int group, int instance, u32 value)
>   {
>   	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
> @@ -154,9 +167,9 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>   	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
>   
>   	if (rw_flag == FW_REG_READ)
> -		val = intel_uncore_read_fw(uncore, reg);
> +		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
>   	else
> -		intel_uncore_write_fw(uncore, reg, value);
> +		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
>   
>   	mcr &= ~mcr_mask;
>   	mcr |= old_mcr & mcr_mask;
> @@ -167,14 +180,14 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
>   }
>   
>   static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
> -				i915_reg_t reg, u8 rw_flag,
> +				i915_mcr_reg_t reg, u8 rw_flag,
>   				int group, int instance,
>   				u32 value)
>   {
>   	enum forcewake_domains fw_domains;
>   	u32 val;
>   
> -	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
> +	fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg),
>   						    rw_flag);
>   	fw_domains |= intel_uncore_forcewake_for_reg(uncore,
>   						     GEN8_MCR_SELECTOR,
> @@ -203,7 +216,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
>    * group/instance.
>    */
>   u32 intel_gt_mcr_read(struct intel_gt *gt,
> -		      i915_reg_t reg,
> +		      i915_mcr_reg_t reg,
>   		      int group, int instance)
>   {
>   	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ,
> @@ -222,7 +235,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
>    * group/instance.
>    */
>   void intel_gt_mcr_unicast_write(struct intel_gt *gt,
> -				i915_reg_t reg, u32 value,
> +				i915_mcr_reg_t reg, u32 value,
>   				int group, int instance)
>   {
>   	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE,
> @@ -238,9 +251,9 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt,
>    * Write an MCR register in multicast mode to update all instances.
>    */
>   void intel_gt_mcr_multicast_write(struct intel_gt *gt,
> -				i915_reg_t reg, u32 value)
> +				i915_mcr_reg_t reg, u32 value)
>   {
> -	intel_uncore_write(gt->uncore, reg, value);
> +	intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value);
>   }
>   
>   /**
> @@ -253,9 +266,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
>    * must already be holding any required forcewake.
>    */
>   void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
> -				i915_reg_t reg, u32 value)
> +				     i915_mcr_reg_t reg, u32 value)
>   {
> -	intel_uncore_write_fw(gt->uncore, reg, value);
> +	intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value);
>   }
>   
>   /*
> @@ -273,10 +286,10 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
>    * for @type steering too.
>    */
>   static bool reg_needs_read_steering(struct intel_gt *gt,
> -				    i915_reg_t reg,
> +				    i915_mcr_reg_t reg,
>   				    enum intel_steering_type type)
>   {
> -	const u32 offset = i915_mmio_reg_offset(reg);
> +	const u32 offset = i915_mmio_reg_offset(mcr_reg_cast(reg));
>   	const struct intel_mmio_range *entry;
>   
>   	if (likely(!gt->steering_table[type]))
> @@ -348,7 +361,7 @@ static void get_valid_steering(struct intel_gt *gt,
>    * steering.
>    */
>   void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
> -					     i915_reg_t reg,
> +					     i915_mcr_reg_t reg,
>   					     u8 *group, u8 *instance)
>   {
>   	int type;
> @@ -377,7 +390,7 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
>    *
>    * Returns the value from a valid instance of @reg.
>    */
> -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
> +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
>   {
>   	int type;
>   	u8 group, instance;
> @@ -391,10 +404,10 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
>   		}
>   	}
>   
> -	return intel_uncore_read_fw(gt->uncore, reg);
> +	return intel_uncore_read_fw(gt->uncore, mcr_reg_cast(reg));
>   }
>   
> -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
> +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
>   {
>   	int type;
>   	u8 group, instance;
> @@ -408,7 +421,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
>   		}
>   	}
>   
> -	return intel_uncore_read(gt->uncore, reg);
> +	return intel_uncore_read(gt->uncore, mcr_reg_cast(reg));
>   }
>   
>   static void report_steering_type(struct drm_printer *p,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> index 506b0cbc8db3..176501ea5926 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> @@ -11,21 +11,21 @@
>   void intel_gt_mcr_init(struct intel_gt *gt);
>   
>   u32 intel_gt_mcr_read(struct intel_gt *gt,
> -		      i915_reg_t reg,
> +		      i915_mcr_reg_t reg,
>   		      int group, int instance);
> -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
> -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
> +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg);
> +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg);
>   
>   void intel_gt_mcr_unicast_write(struct intel_gt *gt,
> -				i915_reg_t reg, u32 value,
> +				i915_mcr_reg_t reg, u32 value,
>   				int group, int instance);
>   void intel_gt_mcr_multicast_write(struct intel_gt *gt,
> -				  i915_reg_t reg, u32 value);
> +				  i915_mcr_reg_t reg, u32 value);
>   void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
> -				     i915_reg_t reg, u32 value);
> +				     i915_mcr_reg_t reg, u32 value);
>   
>   void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
> -					     i915_reg_t reg,
> +					     i915_mcr_reg_t reg,
>   					     u8 *group, u8 *instance);
>   
>   void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index 3f5e01a48a17..926fb6a8558d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -8,7 +8,18 @@
>   
>   #include "i915_reg_defs.h"
>   
> -#define MCR_REG(offset)	_MMIO(offset)
> +#define MCR_REG(offset)	((const i915_mcr_reg_t){ .mmio = (offset) })
> +
> +/*
> + * The perf control registers are technically multicast registers, but the
> + * driver never needs to read/write them directly; we only use them to build
> + * lists of registers (where they're mixed in with other non-MCR registers)
> + * and then operate on the offset directly.  For now we'll just define them
> + * as non-multicast so we can place them on the same list, but we may want
> + * to try to come up with a better way to handle heterogeneous lists of
> + * registers in the future.
> + */
> +#define PERF_REG(offset)			_MMIO(offset)
>   
>   /* RPM unit config (Gen8+) */
>   #define RPM_CONFIG0				_MMIO(0xd00)
> @@ -1048,8 +1059,8 @@
>   #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
>   #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
>   
> -#define EU_PERF_CNTL0				MCR_REG(0xe458)
> -#define EU_PERF_CNTL4				MCR_REG(0xe45c)
> +#define EU_PERF_CNTL0				PERF_REG(0xe458)
> +#define EU_PERF_CNTL4				PERF_REG(0xe45c)
>   
>   #define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
>   #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
> @@ -1082,16 +1093,16 @@
>   #define RT_CTRL					MCR_REG(0xe530)
>   #define   DIS_NULL_QUERY			REG_BIT(10)
>   
> -#define EU_PERF_CNTL1				MCR_REG(0xe558)
> -#define EU_PERF_CNTL5				MCR_REG(0xe55c)
> +#define EU_PERF_CNTL1				PERF_REG(0xe558)
> +#define EU_PERF_CNTL5				PERF_REG(0xe55c)
>   
>   #define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
>   #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
>   #define ICL_HDC_MODE				MCR_REG(0xe5f4)
>   
> -#define EU_PERF_CNTL2				MCR_REG(0xe658)
> -#define EU_PERF_CNTL6				MCR_REG(0xe65c)
> -#define EU_PERF_CNTL3				MCR_REG(0xe758)
> +#define EU_PERF_CNTL2				PERF_REG(0xe658)
> +#define EU_PERF_CNTL6				PERF_REG(0xe65c)
> +#define EU_PERF_CNTL3				PERF_REG(0xe758)
>   
>   #define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
>   #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index dffef6ab4baf..c5fd17b6cf96 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1432,13 +1432,13 @@ gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
>   {
>   	/* NB no one else is allowed to scribble over scratch + 256! */
>   	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
> -	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
> +	*batch++ = i915_mcr_reg_offset(GEN8_L3SQCREG4);
>   	*batch++ = intel_gt_scratch_offset(engine->gt,
>   					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
>   	*batch++ = 0;
>   
>   	*batch++ = MI_LOAD_REGISTER_IMM(1);
> -	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
> +	*batch++ = i915_mcr_reg_offset(GEN8_L3SQCREG4);
>   	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
>   
>   	batch = gen8_emit_pipe_control(batch,
> @@ -1447,7 +1447,7 @@ gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
>   				       0);
>   
>   	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
> -	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
> +	*batch++ = i915_mcr_reg_offset(GEN8_L3SQCREG4);
>   	*batch++ = intel_gt_scratch_offset(engine->gt,
>   					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
>   	*batch++ = 0;
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index d7e61c8a8c04..818ba71f4909 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -166,11 +166,11 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
>   	_wa_add(wal, &wa);
>   }
>   
> -static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
> +static void wa_mcr_add(struct i915_wa_list *wal, i915_mcr_reg_t reg,
>   		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
>   {
>   	struct i915_wa wa = {
> -		.reg  = reg,
> +		.reg  = _MMIO(i915_mcr_reg_offset(reg)),
>   		.clr  = clear,
>   		.set  = set,
>   		.read = read_mask,
> @@ -187,7 +187,7 @@ wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
>   }
>   
>   static void
> -wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
> +wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set)
>   {
>   	wa_mcr_add(wal, reg, clear, set, clear, false);
>   }
> @@ -205,7 +205,7 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
>   }
>   
>   static void
> -wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
> +wa_mcr_write_or(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set)
>   {
>   	wa_mcr_write_clr_set(wal, reg, set, set);
>   }
> @@ -217,7 +217,7 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
>   }
>   
>   static void
> -wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
> +wa_mcr_write_clr(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clr)
>   {
>   	wa_mcr_write_clr_set(wal, reg, clr, 0);
>   }
> @@ -240,7 +240,7 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
>   }
>   
>   static void
> -wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> +wa_mcr_masked_en(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
>   {
>   	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
>   }
> @@ -252,7 +252,7 @@ wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
>   }
>   
>   static void
> -wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> +wa_mcr_masked_dis(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
>   {
>   	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
>   }
> @@ -1638,16 +1638,18 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
>   	intel_uncore_forcewake_get__locked(uncore, fw);
>   
>   	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
> +		/* To be safe, just assume all registers are MCR */
> +		i915_mcr_reg_t mcr_reg = MCR_REG(i915_mmio_reg_offset(wa->reg));

What does "safe" mean in this context? Is it okay to use non-MCR
registers via the MCR helpers, but not vice versa? So does
intel_gt_mcr_read_any_fw still have a table to know when steering is
needed and when it is not?

Regards,

Tvrtko

>   		u32 val, old = 0;
>   
>   		/* open-coded rmw due to steering */
> -		old = wa->clr ? intel_gt_mcr_read_any_fw(gt, wa->reg) : 0;
> +		old = wa->clr ? intel_gt_mcr_read_any_fw(gt, mcr_reg) : 0;
>   		val = (old & ~wa->clr) | wa->set;
>   		if (val != old || !wa->clr)
>   			intel_uncore_write_fw(uncore, wa->reg, val);
>   
>   		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> -			wa_verify(wa, intel_gt_mcr_read_any_fw(gt, wa->reg),
> +			wa_verify(wa, intel_gt_mcr_read_any_fw(gt, mcr_reg),
>   				  wal->name, "application");
>   	}
>   
> @@ -1676,10 +1678,13 @@ static bool wa_list_verify(struct intel_gt *gt,
>   	spin_lock_irqsave(&uncore->lock, flags);
>   	intel_uncore_forcewake_get__locked(uncore, fw);
>   
> -	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
> +	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
> +		i915_mcr_reg_t mcr_reg = MCR_REG(i915_mmio_reg_offset(wa->reg));
> +
>   		ok &= wa_verify(wa,
> -				intel_gt_mcr_read_any_fw(gt, wa->reg),
> +				intel_gt_mcr_read_any_fw(gt, mcr_reg),
>   				wal->name, from);
> +	}
>   
>   	intel_uncore_forcewake_put__locked(uncore, fw);
>   	spin_unlock_irqrestore(&uncore->lock, flags);
> @@ -1731,9 +1736,10 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
>   }
>   
>   static void
> -whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
> +whitelist_mcr_reg(struct i915_wa_list *wal, i915_mcr_reg_t reg)
>   {
> -	whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
> +	whitelist_reg_ext(wal, _MMIO(i915_mcr_reg_offset(reg)),
> +			  RING_FORCE_TO_NONPRIV_ACCESS_RW);
>   }
>   
>   static void gen9_whitelist_build(struct i915_wa_list *w)
> diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> index 67a9aab801dd..21b1edc052f8 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> @@ -991,7 +991,7 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg)
>   	/* Alas, we must pardon some whitelists. Mistakes already made */
>   	static const struct regmask pardon[] = {
>   		{ GEN9_CTX_PREEMPT_REG, 9 },
> -		{ GEN8_L3SQCREG4, 9 },
> +		{ _MMIO(0xb118), 9 }, /* GEN8_L3SQCREG4 */
>   	};
>   
>   	return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon));
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> index 389c5c0aad7a..0a2d50dbfe4b 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> @@ -327,7 +327,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
>   
>   static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
>   					 struct temp_regset *regset,
> -					 i915_reg_t reg, u32 flags)
> +					 i915_mcr_reg_t reg, u32 flags)
>   {
>   	u8 group, inst;
>   
> @@ -342,7 +342,7 @@ static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
>   	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
>   	flags |= GUC_REGSET_STEERING(group, inst);
>   
> -	return guc_mmio_reg_add(gt, regset, i915_mmio_reg_offset(reg), flags);
> +	return guc_mmio_reg_add(gt, regset, i915_mcr_reg_offset(reg), flags);
>   }
>   
>   #define GUC_MCR_REG_ADD(gt, regset, reg, masked) \
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> index 7f77e9cdaba4..8d1a85b06ff4 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> @@ -239,7 +239,7 @@ static void guc_capture_free_extlists(struct __guc_mmio_reg_descr_group *reglist
>   
>   struct __ext_steer_reg {
>   	const char *name;
> -	i915_reg_t reg;
> +	i915_mcr_reg_t reg;
>   };
>   
>   static const struct __ext_steer_reg xe_extregs[] = {
> @@ -251,7 +251,7 @@ static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
>   			   const struct __ext_steer_reg *extlist,
>   			   int slice_id, int subslice_id)
>   {
> -	ext->reg = extlist->reg;
> +	ext->reg = _MMIO(i915_mcr_reg_offset(extlist->reg));
>   	ext->flags = FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id);
>   	ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id);
>   	ext->regname = extlist->name;
> diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c
> index 2459213b6c87..8432f1fe25e6 100644
> --- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
> +++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
> @@ -918,7 +918,7 @@ static int cmd_reg_handler(struct parser_exec_state *s,
>   
>   	if (!strncmp(cmd, "srm", 3) ||
>   			!strncmp(cmd, "lrm", 3)) {
> -		if (offset == i915_mmio_reg_offset(GEN8_L3SQCREG4) ||
> +		if (offset == i915_mcr_reg_offset(GEN8_L3SQCREG4) ||
>   		    offset == 0x21f0 ||
>   		    (IS_BROADWELL(gvt->gt->i915) &&
>   		     offset == i915_mmio_reg_offset(INSTPM)))
> diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
> index bad1065a99a7..84af773c5ebb 100644
> --- a/drivers/gpu/drm/i915/gvt/handlers.c
> +++ b/drivers/gpu/drm/i915/gvt/handlers.c
> @@ -748,7 +748,7 @@ static i915_reg_t force_nonpriv_white_list[] = {
>   	_MMIO(0x770c),
>   	_MMIO(0x83a8),
>   	_MMIO(0xb110),
> -	GEN8_L3SQCREG4,//_MMIO(0xb118)
> +	_MMIO(0xb118),
>   	_MMIO(0xe100),
>   	_MMIO(0xe18c),
>   	_MMIO(0xe48c),
> @@ -2157,6 +2157,9 @@ static int csfe_chicken1_mmio_write(struct intel_vgpu *vgpu,
>   #define MMIO_DFH(reg, d, f, r, w) \
>   	MMIO_F(reg, 4, f, 0, 0, d, r, w)
>   
> +#define MMIO_DFH_MCR(reg, d, f, r, w) \
> +	MMIO_F(_MMIO(i915_mcr_reg_offset(reg)), 4, f, 0, 0, d, r, w)
> +
>   #define MMIO_GM(reg, d, r, w) \
>   	MMIO_F(reg, 4, F_GMADR, 0xFFFFF000, 0, d, r, w)
>   
> @@ -3147,15 +3150,15 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
>   	MMIO_D(GEN8_EU_DISABLE2, D_BDW_PLUS);
>   
>   	MMIO_D(_MMIO(0xfdc), D_BDW_PLUS);
> -	MMIO_DFH(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
> -		NULL, NULL);
> +	MMIO_DFH_MCR(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
> +		     NULL, NULL);
>   	MMIO_DFH(GEN7_ROW_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
>   		NULL, NULL);
>   	MMIO_DFH(GEN8_UCGCTL6, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
>   
>   	MMIO_DFH(_MMIO(0xb1f0), D_BDW, F_CMD_ACCESS, NULL, NULL);
>   	MMIO_DFH(_MMIO(0xb1c0), D_BDW, F_CMD_ACCESS, NULL, NULL);
> -	MMIO_DFH(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
>   	MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
>   	MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
>   	MMIO_D(_MMIO(0xb110), D_BDW);
> @@ -3181,7 +3184,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
>   
>   	MMIO_DFH(_MMIO(0xe194), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>   	MMIO_DFH(_MMIO(0xe188), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> -	MMIO_DFH(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>   	MMIO_DFH(_MMIO(0x2580), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
>   
>   	MMIO_DFH(_MMIO(0x2248), D_BDW, F_CMD_ACCESS, NULL, NULL);
> @@ -3372,7 +3375,7 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
>   	MMIO_D(DMC_HTP_SKL, D_SKL_PLUS);
>   	MMIO_D(DMC_LAST_WRITE, D_SKL_PLUS);
>   
> -	MMIO_DFH(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
>   
>   	MMIO_D(SKL_DFSM, D_SKL_PLUS);
>   	MMIO_D(DISPIO_CR_TX_BMU_CR0, D_SKL_PLUS);
> @@ -3619,7 +3622,7 @@ static int init_bxt_mmio_info(struct intel_gvt *gvt)
>   	MMIO_D(GEN8_PUSHBUS_ENABLE, D_BXT);
>   	MMIO_D(GEN8_PUSHBUS_SHIFT, D_BXT);
>   	MMIO_D(GEN6_GFXPAUSE, D_BXT);
> -	MMIO_DFH(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
> +	MMIO_DFH_MCR(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
>   	MMIO_DFH(GEN8_L3CNTLREG, D_BXT, F_CMD_ACCESS, NULL, NULL);
>   	MMIO_DFH(_MMIO(0x20D8), D_BXT, F_CMD_ACCESS, NULL, NULL);
>   	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40, F_CMD_ACCESS,
> diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
> index 4be07d627941..bf10c3bf6ad8 100644
> --- a/drivers/gpu/drm/i915/gvt/mmio_context.c
> +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
> @@ -44,6 +44,8 @@
>   
>   #define GEN9_MOCS_SIZE		64
>   
> +#define MCR_CAST(mcr)	_MMIO(i915_mcr_reg_offset(mcr))
> +
>   /* Raw offset is appened to each line for convenience. */
>   static struct engine_mmio gen8_engine_mmio_list[] __cacheline_aligned = {
>   	{RCS0, RING_MODE_GEN7(RENDER_RING_BASE), 0xffff, false}, /* 0x229c */
> @@ -106,15 +108,15 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
>   	{RCS0, GEN8_CS_CHICKEN1, 0xffff, true}, /* 0x2580 */
>   	{RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */
>   	{RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */
> -	{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
> -	{RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
> +	{RCS0, _MMIO(0xb118), 0, false}, /* GEN8_L3SQCREG4 */
> +	{RCS0, _MMIO(0xb11c), 0, false}, /* GEN9_SCRATCH1 */
>   	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
>   	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
> -	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
> +	{RCS0, _MMIO(0xe180), 0xffff, true}, /* HALF_SLICE_CHICKEN2 */
>   	{RCS0, HSW_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
> -	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
> -	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
> -	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
> +	{RCS0, _MMIO(0xe188), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN5 */
> +	{RCS0, _MMIO(0xe194), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN7 */
> +	{RCS0, _MMIO(0xe4f0), 0xffff, true}, /* GEN8_ROW_CHICKEN */
>   	{RCS0, TRVATTL3PTRDW(0), 0, true}, /* 0x4de0 */
>   	{RCS0, TRVATTL3PTRDW(1), 0, true}, /* 0x4de4 */
>   	{RCS0, TRNULLDETCT, 0, true}, /* 0x4de8 */
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 0a9c3fcc09b1..22c10c4a1cbb 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -3986,7 +3986,7 @@ static u32 mask_reg_value(u32 reg, u32 val)
>   	 * WaDisableSTUnitPowerOptimization workaround. Make sure the value
>   	 * programmed by userspace doesn't change this.
>   	 */
> -	if (REG_EQUAL(reg, HALF_SLICE_CHICKEN2))
> +	if (reg == i915_mcr_reg_offset(HALF_SLICE_CHICKEN2))
>   		val = val & ~_MASKED_BIT_ENABLE(GEN8_ST_PO_DISABLE);
>   
>   	/* WAIT_FOR_RC6_EXIT has only one bit fullfilling the function
> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
> index 8f486f77609f..34eca053fab9 100644
> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> @@ -104,6 +104,10 @@ typedef struct {
>   
>   #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
>   
> +typedef struct {
> +	u32 mmio;
> +} i915_mcr_reg_t;
> +
>   #define INVALID_MMIO_REG _MMIO(0)
>   
>   static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
> @@ -111,6 +115,11 @@ static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
>   	return reg.reg;
>   }
>   
> +static __always_inline u32 i915_mcr_reg_offset(const i915_mcr_reg_t reg)
> +{
> +	return reg.mmio;
> +}
> +
>   static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
>   {
>   	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);

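The distinct-type idea in the quoted patch can be sketched in standalone C.
This is only an illustration under stated assumptions, not the driver code:
every sketch_* name, the SKETCH_* macros and the fake register array are
hypothetical stand-ins for i915_reg_t, i915_mcr_reg_t, _MMIO(), MCR_REG()
and the uncore accessors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for i915_reg_t and i915_mcr_reg_t. */
typedef struct { uint32_t reg; } sketch_reg_t;
typedef struct { uint32_t mmio; } sketch_mcr_reg_t;

#define SKETCH_MMIO(offset)    ((sketch_reg_t){ .reg = (offset) })
#define SKETCH_MCR_REG(offset) ((sketch_mcr_reg_t){ .mmio = (offset) })

static inline uint32_t sketch_mcr_reg_offset(sketch_mcr_reg_t reg)
{
	return reg.mmio;
}

/* Explicit, greppable conversion; mirrors the role of mcr_reg_cast(). */
static inline sketch_reg_t sketch_mcr_reg_cast(sketch_mcr_reg_t reg)
{
	return SKETCH_MMIO(sketch_mcr_reg_offset(reg));
}

/* Fake register file so the accessors have something to touch. */
#define SKETCH_MMIO_SIZE 0x10000u
static uint32_t sketch_regs[SKETCH_MMIO_SIZE / 4];

/* Plain accessors: accept only the non-MCR type. */
static void sketch_write(sketch_reg_t reg, uint32_t val)
{
	assert(reg.reg < SKETCH_MMIO_SIZE);
	sketch_regs[reg.reg / 4] = val;
}

static uint32_t sketch_read(sketch_reg_t reg)
{
	assert(reg.reg < SKETCH_MMIO_SIZE);
	return sketch_regs[reg.reg / 4];
}

/*
 * MCR accessor: accepts only the MCR type.  A real implementation would
 * program steering before the read; this sketch just forwards the access.
 */
static uint32_t sketch_mcr_read_any(sketch_mcr_reg_t reg)
{
	return sketch_read(sketch_mcr_reg_cast(reg));
}
```

Because the two structs are unrelated types, a call such as
sketch_read(SKETCH_MCR_REG(0xb118)) is rejected by the compiler, while
sketch_mcr_read_any(SKETCH_MCR_REG(0xb118)) builds. Lists that only need
the offset can still go through sketch_mcr_reg_offset().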

* Re: [Intel-gfx] [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering
  2022-03-30 23:28 ` [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering Matt Roper
  2022-03-31 17:35   ` [Intel-gfx] " Tvrtko Ursulin
@ 2022-04-01  8:34   ` Tvrtko Ursulin
  2022-04-04 21:42     ` Matt Roper
  1 sibling, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2022-04-01  8:34 UTC (permalink / raw)
  To: Matt Roper, intel-gfx; +Cc: dri-devel


On 31/03/2022 00:28, Matt Roper wrote:
> Historically we've selected and programmed a single MCR group/instance
> ID at driver startup that will steer register accesses for GSLICE/DSS
> ranges to a non-terminated instance.  Any reads of these register ranges
> that don't need a specific unicast access won't bother explicitly
> resteering because the control register should already be set to a
> suitable value to receive a non-terminated response.  Any accesses to
> other types of MCR ranges (MSLICE, LNCF, etc.) that are also satisfied
> by the default steering target will also skip explicit re-steering at
> runtime.
> 
> This approach has worked well for many years and many platforms, but our
> hardware teams have recently advised us that they're not 100%
> comfortable with our strategy going forward.  They now suggest
> explicitly steering reads of any multicast register at the time the
> register access happens rather than relying on previously-programmed
> steering values.  In debug settings there could be external agents that
> have adjusted the default steering without the driver's knowledge (e.g.,
> we could do this manually with IGT's intel_reg today, although the
> hardware teams also have other tools they use for debug and analysis).
> In theory it would also be possible for bad firmware and/or incorrect
> handling of power management events to clobber/wipe the steering
> value we had previously initialized because they assume we'll be
> re-programming it anyway.
> 
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 40 +++++++++
>   drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
>   drivers/gpu/drm/i915/gt/intel_workarounds.c | 98 ++++-----------------
>   3 files changed, 56 insertions(+), 83 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> index a9a9fa6881f2..787752367337 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> @@ -35,6 +35,7 @@
>    */
>   
>   static const char * const intel_steering_types[] = {
> +	"GSLICE/DSS",
>   	"L3BANK",
>   	"MSLICE",
>   	"LNCF",
> @@ -45,6 +46,35 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
>   	{},
>   };
>   
> +static const struct intel_mmio_range xehpsdv_dss_steering_table[] = {
> +	{ 0x005200, 0x0052FF },
> +	{ 0x005400, 0x007FFF },
> +	{ 0x008140, 0x00815F },
> +	{ 0x008D00, 0x008DFF },
> +	{ 0x0094D0, 0x00955F },
> +	{ 0x009680, 0x0096FF },
> +	{ 0x00DC00, 0x00DCFF },
> +	{ 0x00DE80, 0x00E8FF },
> +	{ 0x017000, 0x017FFF },
> +	{ 0x024A00, 0x024A7F },
> +	{},
> +};
> +
> +static const struct intel_mmio_range dg2_dss_steering_table[] = {
> +	{ 0x005200, 0x0052FF },
> +	{ 0x005400, 0x007FFF },
> +	{ 0x008140, 0x00815F },
> +	{ 0x008D00, 0x008DFF },
> +	{ 0x0094D0, 0x00955F },
> +	{ 0x009680, 0x0096FF },
> +	{ 0x00D800, 0x00D87F },
> +	{ 0x00DC00, 0x00DCFF },
> +	{ 0x00DE80, 0x00E8FF },
> +	{ 0x017000, 0x017FFF },
> +	{ 0x024A00, 0x024A7F },
> +	{},
> +};
> +
>   static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
>   	{ 0x004000, 0x004AFF },
>   	{ 0x00C800, 0x00CFFF },
> @@ -87,9 +117,11 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>   			 GEN12_MEML3_EN_MASK);
>   
>   	if (IS_DG2(i915)) {
> +		gt->steering_table[DSS] = dg2_dss_steering_table;
>   		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
>   		gt->steering_table[LNCF] = dg2_lncf_steering_table;
>   	} else if (IS_XEHPSDV(i915)) {
> +		gt->steering_table[DSS] = xehpsdv_dss_steering_table;
>   		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
>   		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
>   	} else if (GRAPHICS_VER(i915) >= 11 &&
> @@ -317,7 +349,15 @@ static void get_valid_steering(struct intel_gt *gt,
>   			       enum intel_steering_type type,
>   			       u8 *group, u8 *instance)
>   {
> +	u32 dssmask = intel_sseu_get_subslices(&gt->info.sseu, 0);
> +
>   	switch (type) {
> +	case DSS:
> +		drm_WARN_ON(&gt->i915->drm, dssmask == 0);
> +
> +		*group = __ffs(dssmask) / GEN_DSS_PER_GSLICE;
> +		*instance = __ffs(dssmask) % GEN_DSS_PER_GSLICE;
> +		break;
>   	case L3BANK:
>   		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index 937b2e1a305e..b77bbaac7622 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -54,6 +54,7 @@ struct intel_mmio_range {
>    * are listed here.
>    */
>   enum intel_steering_type {
> +	DSS,
>   	L3BANK,
>   	MSLICE,
>   	LNCF,
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 818ba71f4909..2486c6aa9d9d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1160,87 +1160,6 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
>   	__add_mcr_wa(gt, wal, slice, subslice);
>   }
>   
> -static void
> -xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> -{
> -	const struct sseu_dev_info *sseu = &gt->info.sseu;
> -	unsigned long slice, subslice = 0, slice_mask = 0;
> -	u64 dss_mask = 0;
> -	u32 lncf_mask = 0;
> -	int i;
> -
> -	/*
> -	 * On Xe_HP the steering increases in complexity. There are now several
> -	 * more units that require steering and we're not guaranteed to be able
> -	 * to find a common setting for all of them. These are:
> -	 * - GSLICE (fusable)
> -	 * - DSS (sub-unit within gslice; fusable)
> -	 * - L3 Bank (fusable)
> -	 * - MSLICE (fusable)
> -	 * - LNCF (sub-unit within mslice; always present if mslice is present)
> -	 *
> -	 * We'll do our default/implicit steering based on GSLICE (in the
> -	 * sliceid field) and DSS (in the subsliceid field).  If we can
> -	 * find overlap between the valid MSLICE and/or LNCF values with
> -	 * a suitable GSLICE, then we can just re-use the default value and
> -	 * skip and explicit steering at runtime.
> -	 *
> -	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
> -	 * a valid sliceid value.  DSS steering is the only type of steering
> -	 * that utilizes the 'subsliceid' bits.
> -	 *
> -	 * Also note that, even though the steering domain is called "GSlice"
> -	 * and it is encoded in the register using the gslice format, the spec
> -	 * says that the combined (geometry | compute) fuse should be used to
> -	 * select the steering.
> -	 */
> -
> -	/* Find the potential gslice candidates */
> -	dss_mask = intel_sseu_get_subslices(sseu, 0);
> -	slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
> -
> -	/*
> -	 * Find the potential LNCF candidates.  Either LNCF within a valid
> -	 * mslice is fine.
> -	 */
> -	for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
> -		lncf_mask |= (0x3 << (i * 2));
> -
> -	/*
> -	 * Are there any sliceid values that work for both GSLICE and LNCF
> -	 * steering?
> -	 */
> -	if (slice_mask & lncf_mask) {
> -		slice_mask &= lncf_mask;
> -		gt->steering_table[LNCF] = NULL;
> -	}
> -
> -	/* How about sliceid values that also work for MSLICE steering? */
> -	if (slice_mask & gt->info.mslice_mask) {
> -		slice_mask &= gt->info.mslice_mask;
> -		gt->steering_table[MSLICE] = NULL;
> -	}
> -
> -	slice = __ffs(slice_mask);
> -	subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
> -	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
> -	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
> -
> -	__add_mcr_wa(gt, wal, slice, subslice);
> -
> -	/*
> -	 * SQIDI ranges are special because they use different steering
> -	 * registers than everything else we work with.  On XeHP SDV and
> -	 * DG2-G10, any value in the steering registers will work fine since
> -	 * all instances are present, but DG2-G11 only has SQIDI instances at
> -	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
> -	 * we'll just steer to a hardcoded "2" since that value will work
> -	 * everywhere.
> -	 */
> -	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> -	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> -}
> -
>   static void
>   icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>   {
> @@ -1388,8 +1307,9 @@ static void
>   xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>   {
>   	struct drm_i915_private *i915 = gt->i915;
> +	struct drm_printer p = drm_debug_printer("MCR Steering:");
>   
> -	xehp_init_mcr(gt, wal);
> +	intel_gt_mcr_report_steering(&p, gt, false);
>   
>   	/* Wa_1409757795:xehpsdv */
>   	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
> @@ -1441,10 +1361,22 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>   static void
>   dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>   {
> +	struct drm_printer p = drm_debug_printer("MCR Steering:");
>   	struct intel_engine_cs *engine;
>   	int id;
>   
> -	xehp_init_mcr(gt, wal);
> +	intel_gt_mcr_report_steering(&p, gt, false);

Are these platforms immune to system hangs caused by incorrect/missing
default MCR configuration, such as the one fixed by c7d561cfcf86
("drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for
Gen9")? To be clear, that was triggerable from userspace.

Regards,

Tvrtko

> +
> +	/*
> +	 * SQIDI ranges are special because they use different steering
> +	 * registers than everything else we work with.  On DG2-G10, any value
> +	 * in the steering registers will work fine since all instances are
> +	 * present, but DG2-G11 only has SQIDI instances at ID's 2 and 3, so we
> +	 * need to steer to one of those.  For simplicity we'll just steer to a
> +	 * hardcoded "2" since that value will work everywhere.
> +	 */
> +	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> +	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
>   
>   	/* Wa_14011060649:dg2 */
>   	wa_14011060649(gt, wal);

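The DSS case added to get_valid_steering() in the quoted patch picks the
lowest enabled DSS and splits its index into a group and an instance. A
minimal standalone sketch of that arithmetic follows. The sketch_* names
are hypothetical, and the value 4 for the number of DSS per gslice is an
assumption made for illustration, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value, standing in for GEN_DSS_PER_GSLICE. */
#define SKETCH_DSS_PER_GSLICE 4u

struct sketch_steering {
	uint8_t group;    /* gslice index */
	uint8_t instance; /* DSS index within that gslice */
};

/* Index of the lowest set bit; the mask must be non-zero (like __ffs()). */
static unsigned int sketch_lowest_set_bit(uint32_t mask)
{
	unsigned int bit = 0;

	assert(mask != 0);
	while (!(mask & 1u)) {
		mask >>= 1;
		bit++;
	}

	return bit;
}

/* Choose steering that targets the first DSS present in the fuse mask. */
static struct sketch_steering sketch_dss_steering(uint32_t dssmask)
{
	unsigned int first = sketch_lowest_set_bit(dssmask);
	struct sketch_steering s = {
		.group = (uint8_t)(first / SKETCH_DSS_PER_GSLICE),
		.instance = (uint8_t)(first % SKETCH_DSS_PER_GSLICE),
	};

	return s;
}
```

With the assumed four DSS per gslice, a mask whose lowest set bit is bit 9
steers to group 2, instance 1. An all-zero mask trips the assert, which
plays the role of the drm_WARN_ON() in the patch.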

* Re: [Intel-gfx] [PATCH 14/15] drm/i915: Define multicast registers as a new type
  2022-04-01  7:55   ` [Intel-gfx] " Tvrtko Ursulin
@ 2022-04-04 21:12     ` Matt Roper
  0 siblings, 0 replies; 24+ messages in thread
From: Matt Roper @ 2022-04-04 21:12 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel

On Fri, Apr 01, 2022 at 08:55:19AM +0100, Tvrtko Ursulin wrote:
> 
> On 31/03/2022 00:28, Matt Roper wrote:
> > Rather than treating multicast registers as 'i915_reg_t' let's define
> > them as a completely new type.  This will allow the compiler to help us
> > make sure we're using multicast-aware functions to operate on multicast
> > registers.
> > 
> > This plan does break down a bit in places where we're just maintaining
> > heterogeneous lists of registers (e.g., various MMIO whitelists used by
> > perf, GVT, etc.) rather than performing reads/writes.  We only really
> > care about the offset in those cases, so for now we can "cast" the
> > registers as non-MCR, leaving us with a list of i915_reg_t's, but we may
> > want to look for better ways to store mixed collections of i915_reg_t
> > and i915_mcr_reg_t in the future.
> > 
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_gt_mcr.c        | 49 ++++++++++++-------
> >   drivers/gpu/drm/i915/gt/intel_gt_mcr.h        | 14 +++---
> >   drivers/gpu/drm/i915/gt/intel_gt_regs.h       | 27 +++++++---
> >   drivers/gpu/drm/i915/gt/intel_lrc.c           |  6 +--
> >   drivers/gpu/drm/i915/gt/intel_workarounds.c   | 32 +++++++-----
> >   .../gpu/drm/i915/gt/selftest_workarounds.c    |  2 +-
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  4 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_capture.c    |  4 +-
> >   drivers/gpu/drm/i915/gvt/cmd_parser.c         |  2 +-
> >   drivers/gpu/drm/i915/gvt/handlers.c           | 17 ++++---
> >   drivers/gpu/drm/i915/gvt/mmio_context.c       | 14 +++---
> >   drivers/gpu/drm/i915/i915_perf.c              |  2 +-
> >   drivers/gpu/drm/i915/i915_reg_defs.h          |  9 ++++
> >   13 files changed, 113 insertions(+), 69 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > index c8e52d625f18..a9a9fa6881f2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > @@ -103,6 +103,19 @@ void intel_gt_mcr_init(struct intel_gt *gt)
> >   	}
> >   }
> > +/*
> > + * Although the rest of the driver should use MCR-specific functions to
> > + * read/write MCR registers, we still use the regular intel_uncore_* functions
> > + * internally to implement those, so we need a way for the functions in this
> > + * file to "cast" an i915_mcr_reg_t into an i915_reg_t.
> > + */
> > +static i915_reg_t mcr_reg_cast(const i915_mcr_reg_t mcr)
> > +{
> > +	i915_reg_t r = { .reg = mcr.mmio };
> > +
> > +	return r;
> > +}
> > +
> >   /*
> >    * rw_with_mcr_steering_fw - Access a register with specific MCR steering
> >    * @uncore: pointer to struct intel_uncore
> > @@ -117,7 +130,7 @@ void intel_gt_mcr_init(struct intel_gt *gt)
> >    * Caller needs to make sure the relevant forcewake wells are up.
> >    */
> >   static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> > -				   i915_reg_t reg, u8 rw_flag,
> > +				   i915_mcr_reg_t reg, u8 rw_flag,
> >   				   int group, int instance, u32 value)
> >   {
> >   	u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
> > @@ -154,9 +167,9 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> >   	intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
> >   	if (rw_flag == FW_REG_READ)
> > -		val = intel_uncore_read_fw(uncore, reg);
> > +		val = intel_uncore_read_fw(uncore, mcr_reg_cast(reg));
> >   	else
> > -		intel_uncore_write_fw(uncore, reg, value);
> > +		intel_uncore_write_fw(uncore, mcr_reg_cast(reg), value);
> >   	mcr &= ~mcr_mask;
> >   	mcr |= old_mcr & mcr_mask;
> > @@ -167,14 +180,14 @@ static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
> >   }
> >   static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
> > -				i915_reg_t reg, u8 rw_flag,
> > +				i915_mcr_reg_t reg, u8 rw_flag,
> >   				int group, int instance,
> >   				u32 value)
> >   {
> >   	enum forcewake_domains fw_domains;
> >   	u32 val;
> > -	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
> > +	fw_domains = intel_uncore_forcewake_for_reg(uncore, mcr_reg_cast(reg),
> >   						    rw_flag);
> >   	fw_domains |= intel_uncore_forcewake_for_reg(uncore,
> >   						     GEN8_MCR_SELECTOR,
> > @@ -203,7 +216,7 @@ static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
> >    * group/instance.
> >    */
> >   u32 intel_gt_mcr_read(struct intel_gt *gt,
> > -		      i915_reg_t reg,
> > +		      i915_mcr_reg_t reg,
> >   		      int group, int instance)
> >   {
> >   	return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ,
> > @@ -222,7 +235,7 @@ u32 intel_gt_mcr_read(struct intel_gt *gt,
> >    * group/instance.
> >    */
> >   void intel_gt_mcr_unicast_write(struct intel_gt *gt,
> > -				i915_reg_t reg, u32 value,
> > +				i915_mcr_reg_t reg, u32 value,
> >   				int group, int instance)
> >   {
> >   	rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE,
> > @@ -238,9 +251,9 @@ void intel_gt_mcr_unicast_write(struct intel_gt *gt,
> >    * Write an MCR register in multicast mode to update all instances.
> >    */
> >   void intel_gt_mcr_multicast_write(struct intel_gt *gt,
> > -				i915_reg_t reg, u32 value)
> > +				i915_mcr_reg_t reg, u32 value)
> >   {
> > -	intel_uncore_write(gt->uncore, reg, value);
> > +	intel_uncore_write(gt->uncore, mcr_reg_cast(reg), value);
> >   }
> >   /**
> > @@ -253,9 +266,9 @@ void intel_gt_mcr_multicast_write(struct intel_gt *gt,
> >    * must already be holding any required forcewake.
> >    */
> >   void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
> > -				i915_reg_t reg, u32 value)
> > +				     i915_mcr_reg_t reg, u32 value)
> >   {
> > -	intel_uncore_write_fw(gt->uncore, reg, value);
> > +	intel_uncore_write_fw(gt->uncore, mcr_reg_cast(reg), value);
> >   }
> >   /*
> > @@ -273,10 +286,10 @@ void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
> >    * for @type steering too.
> >    */
> >   static bool reg_needs_read_steering(struct intel_gt *gt,
> > -				    i915_reg_t reg,
> > +				    i915_mcr_reg_t reg,
> >   				    enum intel_steering_type type)
> >   {
> > -	const u32 offset = i915_mmio_reg_offset(reg);
> > +	const u32 offset = i915_mmio_reg_offset(mcr_reg_cast(reg));
> >   	const struct intel_mmio_range *entry;
> >   	if (likely(!gt->steering_table[type]))
> > @@ -348,7 +361,7 @@ static void get_valid_steering(struct intel_gt *gt,
> >    * steering.
> >    */
> >   void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
> > -					     i915_reg_t reg,
> > +					     i915_mcr_reg_t reg,
> >   					     u8 *group, u8 *instance)
> >   {
> >   	int type;
> > @@ -377,7 +390,7 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
> >    *
> >    * Returns the value from a valid instance of @reg.
> >    */
> > -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
> > +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg)
> >   {
> >   	int type;
> >   	u8 group, instance;
> > @@ -391,10 +404,10 @@ u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
> >   		}
> >   	}
> > -	return intel_uncore_read_fw(gt->uncore, reg);
> > +	return intel_uncore_read_fw(gt->uncore, mcr_reg_cast(reg));
> >   }
> > -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
> > +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg)
> >   {
> >   	int type;
> >   	u8 group, instance;
> > @@ -408,7 +421,7 @@ u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
> >   		}
> >   	}
> > -	return intel_uncore_read(gt->uncore, reg);
> > +	return intel_uncore_read(gt->uncore, mcr_reg_cast(reg));
> >   }
> >   static void report_steering_type(struct drm_printer *p,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> > index 506b0cbc8db3..176501ea5926 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.h
> > @@ -11,21 +11,21 @@
> >   void intel_gt_mcr_init(struct intel_gt *gt);
> >   u32 intel_gt_mcr_read(struct intel_gt *gt,
> > -		      i915_reg_t reg,
> > +		      i915_mcr_reg_t reg,
> >   		      int group, int instance);
> > -u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
> > -u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
> > +u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_mcr_reg_t reg);
> > +u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_mcr_reg_t reg);
> >   void intel_gt_mcr_unicast_write(struct intel_gt *gt,
> > -				i915_reg_t reg, u32 value,
> > +				i915_mcr_reg_t reg, u32 value,
> >   				int group, int instance);
> >   void intel_gt_mcr_multicast_write(struct intel_gt *gt,
> > -				  i915_reg_t reg, u32 value);
> > +				  i915_mcr_reg_t reg, u32 value);
> >   void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
> > -				     i915_reg_t reg, u32 value);
> > +				     i915_mcr_reg_t reg, u32 value);
> >   void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
> > -					     i915_reg_t reg,
> > +					     i915_mcr_reg_t reg,
> >   					     u8 *group, u8 *instance);
> >   void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > index 3f5e01a48a17..926fb6a8558d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > @@ -8,7 +8,18 @@
> >   #include "i915_reg_defs.h"
> > -#define MCR_REG(offset)	_MMIO(offset)
> > +#define MCR_REG(offset)	((const i915_mcr_reg_t){ .mmio = (offset) })
> > +
> > +/*
> > + * The perf control registers are technically multicast registers, but the
> > + * driver never needs to read/write them directly; we only use them to build
> > + * lists of registers (where they're mixed in with other non-MCR registers)
> > + * and then operate on the offset directly.  For now we'll just define them
> > + * as non-multicast so we can place them on the same list, but we may want
> > + * to try to come up with a better way to handle heterogeneous lists of
> > + * registers in the future.
> > + */
> > +#define PERF_REG(offset)			_MMIO(offset)
> >   /* RPM unit config (Gen8+) */
> >   #define RPM_CONFIG0				_MMIO(0xd00)
> > @@ -1048,8 +1059,8 @@
> >   #define   ENABLE_PREFETCH_INTO_IC		REG_BIT(3)
> >   #define   FLOAT_BLEND_OPTIMIZATION_ENABLE	REG_BIT(4)
> > -#define EU_PERF_CNTL0				MCR_REG(0xe458)
> > -#define EU_PERF_CNTL4				MCR_REG(0xe45c)
> > +#define EU_PERF_CNTL0				PERF_REG(0xe458)
> > +#define EU_PERF_CNTL4				PERF_REG(0xe45c)
> >   #define GEN9_ROW_CHICKEN4			MCR_REG(0xe48c)
> >   #define   GEN12_DISABLE_GRF_CLEAR		REG_BIT(13)
> > @@ -1082,16 +1093,16 @@
> >   #define RT_CTRL					MCR_REG(0xe530)
> >   #define   DIS_NULL_QUERY			REG_BIT(10)
> > -#define EU_PERF_CNTL1				MCR_REG(0xe558)
> > -#define EU_PERF_CNTL5				MCR_REG(0xe55c)
> > +#define EU_PERF_CNTL1				PERF_REG(0xe558)
> > +#define EU_PERF_CNTL5				PERF_REG(0xe55c)
> >   #define XEHP_HDC_CHICKEN0			MCR_REG(0xe5f0)
> >   #define   LSC_L1_FLUSH_CTL_3D_DATAPORT_FLUSH_EVENTS_MASK	REG_GENMASK(13, 11)
> >   #define ICL_HDC_MODE				MCR_REG(0xe5f4)
> > -#define EU_PERF_CNTL2				MCR_REG(0xe658)
> > -#define EU_PERF_CNTL6				MCR_REG(0xe65c)
> > -#define EU_PERF_CNTL3				MCR_REG(0xe758)
> > +#define EU_PERF_CNTL2				PERF_REG(0xe658)
> > +#define EU_PERF_CNTL6				PERF_REG(0xe65c)
> > +#define EU_PERF_CNTL3				PERF_REG(0xe758)
> >   #define LSC_CHICKEN_BIT_0			MCR_REG(0xe7c8)
> >   #define   DISABLE_D8_D16_COASLESCE		REG_BIT(30)
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index dffef6ab4baf..c5fd17b6cf96 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -1432,13 +1432,13 @@ gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
> >   {
> >   	/* NB no one else is allowed to scribble over scratch + 256! */
> >   	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
> > -	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
> > +	*batch++ = i915_mcr_reg_offset(GEN8_L3SQCREG4);
> >   	*batch++ = intel_gt_scratch_offset(engine->gt,
> >   					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
> >   	*batch++ = 0;
> >   	*batch++ = MI_LOAD_REGISTER_IMM(1);
> > -	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
> > +	*batch++ = i915_mcr_reg_offset(GEN8_L3SQCREG4);
> >   	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
> >   	batch = gen8_emit_pipe_control(batch,
> > @@ -1447,7 +1447,7 @@ gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
> >   				       0);
> >   	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
> > -	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
> > +	*batch++ = i915_mcr_reg_offset(GEN8_L3SQCREG4);
> >   	*batch++ = intel_gt_scratch_offset(engine->gt,
> >   					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
> >   	*batch++ = 0;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index d7e61c8a8c04..818ba71f4909 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -166,11 +166,11 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
> >   	_wa_add(wal, &wa);
> >   }
> > -static void wa_mcr_add(struct i915_wa_list *wal, i915_reg_t reg,
> > +static void wa_mcr_add(struct i915_wa_list *wal, i915_mcr_reg_t reg,
> >   		       u32 clear, u32 set, u32 read_mask, bool masked_reg)
> >   {
> >   	struct i915_wa wa = {
> > -		.reg  = reg,
> > +		.reg  = _MMIO(i915_mcr_reg_offset(reg)),
> >   		.clr  = clear,
> >   		.set  = set,
> >   		.read = read_mask,
> > @@ -187,7 +187,7 @@ wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
> >   }
> >   static void
> > -wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
> > +wa_mcr_write_clr_set(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clear, u32 set)
> >   {
> >   	wa_mcr_add(wal, reg, clear, set, clear, false);
> >   }
> > @@ -205,7 +205,7 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
> >   }
> >   static void
> > -wa_mcr_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
> > +wa_mcr_write_or(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set)
> >   {
> >   	wa_mcr_write_clr_set(wal, reg, set, set);
> >   }
> > @@ -217,7 +217,7 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
> >   }
> >   static void
> > -wa_mcr_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
> > +wa_mcr_write_clr(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 clr)
> >   {
> >   	wa_mcr_write_clr_set(wal, reg, clr, 0);
> >   }
> > @@ -240,7 +240,7 @@ wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> >   }
> >   static void
> > -wa_mcr_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> > +wa_mcr_masked_en(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
> >   {
> >   	wa_mcr_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
> >   }
> > @@ -252,7 +252,7 @@ wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> >   }
> >   static void
> > -wa_mcr_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
> > +wa_mcr_masked_dis(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 val)
> >   {
> >   	wa_mcr_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
> >   }
> > @@ -1638,16 +1638,18 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
> >   	intel_uncore_forcewake_get__locked(uncore, fw);
> >   	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
> > +		/* To be safe, just assume all registers are MCR */
> > +		i915_mcr_reg_t mcr_reg = MCR_REG(i915_mmio_reg_offset(wa->reg));
> 
> What does safe mean in this context? It's okay to use non-mcr registers via
> mcr helpers, but not vice-versa? So intel_gt_mcr_read_any_fw still has a
> table to know when steering is needed and when not?

Yeah, intel_gt_mcr_read_any_fw() walks through the possible
steering types to see if any of them have an MMIO range that claims the
register; if not it does a regular intel_uncore_read_fw() as the final
step.  On Xe_HP and beyond that final fallback shouldn't really be used
for anything except this hack in the workaround code to keep things
simple.  On pre-Xe_HP platforms we're not trying to get rid of the
implicit steering right now, so any DSS MCR ranges (and also L3BANK
ranges in cases where they have compatible steering requirements) will
fall through to the intel_uncore_read fallback.
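
To make the lookup order described above concrete, here is a minimal
standalone sketch. It is not the driver code: struct fake_gt,
range_claims() and read_any_path() are hypothetical names used only for
illustration, and the example table just reuses two MSLICE ranges that
appear elsewhere in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real i915 definitions. */
struct mmio_range { uint32_t start, end; };	/* inclusive; end == 0 terminates */

enum steering_type { L3BANK, MSLICE, LNCF, NUM_STEERING_TYPES };

enum read_path { READ_STEERED, READ_FALLBACK };

struct fake_gt {
	/* NULL means "this platform has no table for that steering type" */
	const struct mmio_range *steering_table[NUM_STEERING_TYPES];
};

/* Example table only; the two ranges are copied from the MSLICE table. */
static const struct mmio_range example_mslice_table[] = {
	{ 0x004000, 0x004AFF },
	{ 0x00C800, 0x00CFFF },
	{ 0, 0 },
};

/* Return true if any range in @table contains @offset. */
static bool range_claims(const struct mmio_range *table, uint32_t offset)
{
	if (!table)
		return false;

	for (; table->end; table++)
		if (offset >= table->start && offset <= table->end)
			return true;

	return false;
}

/*
 * Walk every steering type in order.  The first table that claims the
 * offset decides the steering type; if no table claims it, the caller
 * would do a plain, unsteered read as the final fallback.
 */
static enum read_path read_any_path(const struct fake_gt *gt, uint32_t offset,
				    enum steering_type *type_out)
{
	assert(gt && type_out);

	for (int type = 0; type < NUM_STEERING_TYPES; type++) {
		if (range_claims(gt->steering_table[type], offset)) {
			*type_out = (enum steering_type)type;
			return READ_STEERED;
		}
	}

	return READ_FALLBACK;
}
```

With only the example MSLICE table installed, offset 0x4000 takes the
steered path while 0xb118 drops to the fallback, which is the behaviour
the workaround code relies on when it treats every register as MCR.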


Matt

> 
> Regards,
> 
> Tvrtko
> 
> >   		u32 val, old = 0;
> >   		/* open-coded rmw due to steering */
> > -		old = wa->clr ? intel_gt_mcr_read_any_fw(gt, wa->reg) : 0;
> > +		old = wa->clr ? intel_gt_mcr_read_any_fw(gt, mcr_reg) : 0;
> >   		val = (old & ~wa->clr) | wa->set;
> >   		if (val != old || !wa->clr)
> >   			intel_uncore_write_fw(uncore, wa->reg, val);
> >   		if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> > -			wa_verify(wa, intel_gt_mcr_read_any_fw(gt, wa->reg),
> > +			wa_verify(wa, intel_gt_mcr_read_any_fw(gt, mcr_reg),
> >   				  wal->name, "application");
> >   	}
> > @@ -1676,10 +1678,13 @@ static bool wa_list_verify(struct intel_gt *gt,
> >   	spin_lock_irqsave(&uncore->lock, flags);
> >   	intel_uncore_forcewake_get__locked(uncore, fw);
> > -	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
> > +	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
> > +		i915_mcr_reg_t mcr_reg = MCR_REG(i915_mmio_reg_offset(wa->reg));
> > +
> >   		ok &= wa_verify(wa,
> > -				intel_gt_mcr_read_any_fw(gt, wa->reg),
> > +				intel_gt_mcr_read_any_fw(gt, mcr_reg),
> >   				wal->name, from);
> > +	}
> >   	intel_uncore_forcewake_put__locked(uncore, fw);
> >   	spin_unlock_irqrestore(&uncore->lock, flags);
> > @@ -1731,9 +1736,10 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
> >   }
> >   static void
> > -whitelist_mcr_reg(struct i915_wa_list *wal, i915_reg_t reg)
> > +whitelist_mcr_reg(struct i915_wa_list *wal, i915_mcr_reg_t reg)
> >   {
> > -	whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
> > +	whitelist_reg_ext(wal, _MMIO(i915_mcr_reg_offset(reg)),
> > +			  RING_FORCE_TO_NONPRIV_ACCESS_RW);
> >   }
> >   static void gen9_whitelist_build(struct i915_wa_list *w)
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> > index 67a9aab801dd..21b1edc052f8 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> > @@ -991,7 +991,7 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg)
> >   	/* Alas, we must pardon some whitelists. Mistakes already made */
> >   	static const struct regmask pardon[] = {
> >   		{ GEN9_CTX_PREEMPT_REG, 9 },
> > -		{ GEN8_L3SQCREG4, 9 },
> > +		{ _MMIO(0xb118), 9 }, /* GEN8_L3SQCREG4 */
> >   	};
> >   	return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon));
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> > index 389c5c0aad7a..0a2d50dbfe4b 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> > @@ -327,7 +327,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
> >   static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
> >   					 struct temp_regset *regset,
> > -					 i915_reg_t reg, u32 flags)
> > +					 i915_mcr_reg_t reg, u32 flags)
> >   {
> >   	u8 group, inst;
> > @@ -342,7 +342,7 @@ static long __must_check guc_mcr_reg_add(struct intel_gt *gt,
> >   	intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
> >   	flags |= GUC_REGSET_STEERING(group, inst);
> > -	return guc_mmio_reg_add(gt, regset, i915_mmio_reg_offset(reg), flags);
> > +	return guc_mmio_reg_add(gt, regset, i915_mcr_reg_offset(reg), flags);
> >   }
> >   #define GUC_MCR_REG_ADD(gt, regset, reg, masked) \
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> > index 7f77e9cdaba4..8d1a85b06ff4 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> > @@ -239,7 +239,7 @@ static void guc_capture_free_extlists(struct __guc_mmio_reg_descr_group *reglist
> >   struct __ext_steer_reg {
> >   	const char *name;
> > -	i915_reg_t reg;
> > +	i915_mcr_reg_t reg;
> >   };
> >   static const struct __ext_steer_reg xe_extregs[] = {
> > @@ -251,7 +251,7 @@ static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext,
> >   			   const struct __ext_steer_reg *extlist,
> >   			   int slice_id, int subslice_id)
> >   {
> > -	ext->reg = extlist->reg;
> > +	ext->reg = _MMIO(i915_mcr_reg_offset(extlist->reg));
> >   	ext->flags = FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id);
> >   	ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id);
> >   	ext->regname = extlist->name;
> > diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c
> > index 2459213b6c87..8432f1fe25e6 100644
> > --- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
> > +++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
> > @@ -918,7 +918,7 @@ static int cmd_reg_handler(struct parser_exec_state *s,
> >   	if (!strncmp(cmd, "srm", 3) ||
> >   			!strncmp(cmd, "lrm", 3)) {
> > -		if (offset == i915_mmio_reg_offset(GEN8_L3SQCREG4) ||
> > +		if (offset == i915_mcr_reg_offset(GEN8_L3SQCREG4) ||
> >   		    offset == 0x21f0 ||
> >   		    (IS_BROADWELL(gvt->gt->i915) &&
> >   		     offset == i915_mmio_reg_offset(INSTPM)))
> > diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
> > index bad1065a99a7..84af773c5ebb 100644
> > --- a/drivers/gpu/drm/i915/gvt/handlers.c
> > +++ b/drivers/gpu/drm/i915/gvt/handlers.c
> > @@ -748,7 +748,7 @@ static i915_reg_t force_nonpriv_white_list[] = {
> >   	_MMIO(0x770c),
> >   	_MMIO(0x83a8),
> >   	_MMIO(0xb110),
> > -	GEN8_L3SQCREG4,//_MMIO(0xb118)
> > +	_MMIO(0xb118),
> >   	_MMIO(0xe100),
> >   	_MMIO(0xe18c),
> >   	_MMIO(0xe48c),
> > @@ -2157,6 +2157,9 @@ static int csfe_chicken1_mmio_write(struct intel_vgpu *vgpu,
> >   #define MMIO_DFH(reg, d, f, r, w) \
> >   	MMIO_F(reg, 4, f, 0, 0, d, r, w)
> > +#define MMIO_DFH_MCR(reg, d, f, r, w) \
> > +	MMIO_F(_MMIO(i915_mcr_reg_offset(reg)), 4, f, 0, 0, d, r, w)
> > +
> >   #define MMIO_GM(reg, d, r, w) \
> >   	MMIO_F(reg, 4, F_GMADR, 0xFFFFF000, 0, d, r, w)
> > @@ -3147,15 +3150,15 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
> >   	MMIO_D(GEN8_EU_DISABLE2, D_BDW_PLUS);
> >   	MMIO_D(_MMIO(0xfdc), D_BDW_PLUS);
> > -	MMIO_DFH(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
> > -		NULL, NULL);
> > +	MMIO_DFH_MCR(GEN8_ROW_CHICKEN, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
> > +		     NULL, NULL);
> >   	MMIO_DFH(GEN7_ROW_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS,
> >   		NULL, NULL);
> >   	MMIO_DFH(GEN8_UCGCTL6, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_DFH(_MMIO(0xb1f0), D_BDW, F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_DFH(_MMIO(0xb1c0), D_BDW, F_CMD_ACCESS, NULL, NULL);
> > -	MMIO_DFH(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
> > +	MMIO_DFH_MCR(GEN8_L3SQCREG4, D_BDW_PLUS, F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_D(_MMIO(0xb110), D_BDW);
> > @@ -3181,7 +3184,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
> >   	MMIO_DFH(_MMIO(0xe194), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_DFH(_MMIO(0xe188), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> > -	MMIO_DFH(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> > +	MMIO_DFH_MCR(HALF_SLICE_CHICKEN2, D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_DFH(_MMIO(0x2580), D_BDW_PLUS, F_MODE_MASK | F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_DFH(_MMIO(0x2248), D_BDW, F_CMD_ACCESS, NULL, NULL);
> > @@ -3372,7 +3375,7 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
> >   	MMIO_D(DMC_HTP_SKL, D_SKL_PLUS);
> >   	MMIO_D(DMC_LAST_WRITE, D_SKL_PLUS);
> > -	MMIO_DFH(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
> > +	MMIO_DFH_MCR(BDW_SCRATCH1, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_D(SKL_DFSM, D_SKL_PLUS);
> >   	MMIO_D(DISPIO_CR_TX_BMU_CR0, D_SKL_PLUS);
> > @@ -3619,7 +3622,7 @@ static int init_bxt_mmio_info(struct intel_gvt *gvt)
> >   	MMIO_D(GEN8_PUSHBUS_ENABLE, D_BXT);
> >   	MMIO_D(GEN8_PUSHBUS_SHIFT, D_BXT);
> >   	MMIO_D(GEN6_GFXPAUSE, D_BXT);
> > -	MMIO_DFH(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
> > +	MMIO_DFH_MCR(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_DFH(GEN8_L3CNTLREG, D_BXT, F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_DFH(_MMIO(0x20D8), D_BXT, F_CMD_ACCESS, NULL, NULL);
> >   	MMIO_F(GEN8_RING_CS_GPR(RENDER_RING_BASE, 0), 0x40, F_CMD_ACCESS,
> > diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
> > index 4be07d627941..bf10c3bf6ad8 100644
> > --- a/drivers/gpu/drm/i915/gvt/mmio_context.c
> > +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
> > @@ -44,6 +44,8 @@
> >   #define GEN9_MOCS_SIZE		64
> > +#define MCR_CAST(mcr)	_MMIO(i915_mcr_reg_offset(mcr))
> > +
> >   /* Raw offset is appened to each line for convenience. */
> >   static struct engine_mmio gen8_engine_mmio_list[] __cacheline_aligned = {
> >   	{RCS0, RING_MODE_GEN7(RENDER_RING_BASE), 0xffff, false}, /* 0x229c */
> > @@ -106,15 +108,15 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
> >   	{RCS0, GEN8_CS_CHICKEN1, 0xffff, true}, /* 0x2580 */
> >   	{RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */
> >   	{RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */
> > -	{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
> > -	{RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
> > +	{RCS0, _MMIO(0xb118), 0, false}, /* GEN8_L3SQCREG4 */
> > +	{RCS0, _MMIO(0xb11c), 0, false}, /* GEN9_SCRATCH1 */
> >   	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
> >   	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
> > -	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
> > +	{RCS0, _MMIO(0xe180), 0xffff, true}, /* HALF_SLICE_CHICKEN2 */
> >   	{RCS0, HSW_HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
> > -	{RCS0, GEN9_HALF_SLICE_CHICKEN5, 0xffff, true}, /* 0xe188 */
> > -	{RCS0, GEN9_HALF_SLICE_CHICKEN7, 0xffff, true}, /* 0xe194 */
> > -	{RCS0, GEN8_ROW_CHICKEN, 0xffff, true}, /* 0xe4f0 */
> > +	{RCS0, _MMIO(0xe188), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN5 */
> > +	{RCS0, _MMIO(0xe194), 0xffff, true}, /* GEN9_HALF_SLICE_CHICKEN7 */
> > +	{RCS0, _MMIO(0xe4f0), 0xffff, true}, /* GEN8_ROW_CHICKEN */
> >   	{RCS0, TRVATTL3PTRDW(0), 0, true}, /* 0x4de0 */
> >   	{RCS0, TRVATTL3PTRDW(1), 0, true}, /* 0x4de4 */
> >   	{RCS0, TRNULLDETCT, 0, true}, /* 0x4de8 */
> > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> > index 0a9c3fcc09b1..22c10c4a1cbb 100644
> > --- a/drivers/gpu/drm/i915/i915_perf.c
> > +++ b/drivers/gpu/drm/i915/i915_perf.c
> > @@ -3986,7 +3986,7 @@ static u32 mask_reg_value(u32 reg, u32 val)
> >   	 * WaDisableSTUnitPowerOptimization workaround. Make sure the value
> >   	 * programmed by userspace doesn't change this.
> >   	 */
> > -	if (REG_EQUAL(reg, HALF_SLICE_CHICKEN2))
> > +	if (reg == i915_mcr_reg_offset(HALF_SLICE_CHICKEN2))
> >   		val = val & ~_MASKED_BIT_ENABLE(GEN8_ST_PO_DISABLE);
> >   	/* WAIT_FOR_RC6_EXIT has only one bit fullfilling the function
> > diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
> > index 8f486f77609f..34eca053fab9 100644
> > --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> > +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> > @@ -104,6 +104,10 @@ typedef struct {
> >   #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
> > +typedef struct {
> > +	u32 mmio;
> > +} i915_mcr_reg_t;
> > +
> >   #define INVALID_MMIO_REG _MMIO(0)
> >   static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
> > @@ -111,6 +115,11 @@ static __always_inline u32 i915_mmio_reg_offset(i915_reg_t reg)
> >   	return reg.reg;
> >   }
> > +static __always_inline u32 i915_mcr_reg_offset(const i915_mcr_reg_t reg)
> > +{
> > +	return reg.mmio;
> > +}
> > +
> >   static inline bool i915_mmio_reg_equal(i915_reg_t a, i915_reg_t b)
> >   {
> >   	return i915_mmio_reg_offset(a) == i915_mmio_reg_offset(b);

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795

* Re: [Intel-gfx] [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering
  2022-03-31 17:35   ` [Intel-gfx] " Tvrtko Ursulin
@ 2022-04-04 21:35     ` Matt Roper
  2022-04-07 12:25       ` Tvrtko Ursulin
  0 siblings, 1 reply; 24+ messages in thread
From: Matt Roper @ 2022-04-04 21:35 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel

On Thu, Mar 31, 2022 at 06:35:52PM +0100, Tvrtko Ursulin wrote:
> 
> On 31/03/2022 00:28, Matt Roper wrote:
> > Historically we've selected and programmed a single MCR group/instance
> > ID at driver startup that will steer register accesses for GSLICE/DSS
> > ranges to a non-terminated instance.  Any reads of these register ranges
> > that don't need a specific unicast access won't bother explicitly
> > resteering because the control register should already be set to a
> > suitable value to receive a non-terminated response.  Any accesses to
> > other types of MCR ranges (MSLICE, LNCF, etc.) that are also satisfied
> > by the default steering target will also skip explicit re-steering at
> > runtime.
> > 
> > This approach has worked well for many years and many platforms, but our
> > hardware teams have recently advised us that they're not 100%
> > comfortable with our strategy going forward.  They now suggest
> > explicitly steering reads of any multicast register at the time the
> > register access happens rather than relying on previously-programmed
> > steering values.  In debug settings there could be external agents that
> > have adjusted the default steering without the driver's knowledge (e.g.,
> > we could do this manually with IGT's intel_reg today, although the
> > hardware teams also have other tools they use for debug and analysis).
> > In theory it would also be possible for bad firmware and/or incorrect
> > handling of power management events to clobber/wipe the steering
> > value we had previously initialized because they assume we'll be
> > re-programming it anyway.
> 
> That an external agent of any kind can mess with registers behind the driver's
> back is kind of a weak justification, no? Because steering is just one small
> part of what can go wrong in this case.

Apparently the assumption when the hardware was designed was that
software would explicitly steer every MCR access; they never really
considered the design we've been using where we try to set it up once
and try to minimize subsequent updates to the steering control.  In a
lot of cases different agents updating steering have their own steering
control registers (i.e., the GuC, the command streamers, and a couple
other internal hardware units have their own independent steering
control registers to try to avoid racing with whatever the KMD/CPU is
doing), but I think there may have been some cases that aren't 100%
covered there.
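
As a rough illustration of the difference between relying on a default
steering target and steering at the time of access, here is a small
simulation. Everything in it is made up for the example: struct fake_hw,
the selector encoding in make_selector(), and the group/instance counts
do not correspond to any real hardware layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one replicated register plus one steering selector. */
#define NUM_GROUPS	2
#define NUM_INSTANCES	4

struct fake_hw {
	uint32_t selector;				/* current steering target */
	uint32_t reg[NUM_GROUPS][NUM_INSTANCES];	/* per-instance copies */
};

/* Made-up encoding: group in bits 15:8, instance in bits 7:0. */
static uint32_t make_selector(unsigned int group, unsigned int instance)
{
	return (group << 8) | instance;
}

/* "Implicit" read: returns whichever instance the selector points at. */
static uint32_t hw_read(const struct fake_hw *hw)
{
	unsigned int group = hw->selector >> 8;
	unsigned int instance = hw->selector & 0xff;

	assert(group < NUM_GROUPS && instance < NUM_INSTANCES);
	return hw->reg[group][instance];
}

/*
 * Explicitly steered read: save the selector, point it at the requested
 * instance, read, then restore the previous value.
 */
static uint32_t steered_read(struct fake_hw *hw,
			     unsigned int group, unsigned int instance)
{
	uint32_t old = hw->selector;
	uint32_t val;

	hw->selector = make_selector(group, instance);
	val = hw_read(hw);
	hw->selector = old;

	return val;
}
```

If something else rewrites the selector between driver init and the
read, hw_read() silently returns a different instance, whereas
steered_read() still returns the instance that was asked for and leaves
the selector as it found it.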

This is also partially motivated by the direction the general hardware
teams want to move in the future --- they plan to reduce the number of
different steering control registers for different agents and make more
of them share a single register with the KMD/CPU.  There will be a
separate "semaphore" register used for coordinating access between
various agents (of course that will bring new challenges such as
increased latency and what to do if some hardware unit grabs the
semaphore and somehow fails to release it).

> 
> Also, someone at some point has to know which are the affected registers. Be
> it a range table, or at actual point of submitting patches which add
> register definitions. At any of those points mistakes are possible.

True.  But today there are a lot of registers used by the driver that
are multicast and I don't think the code written around them was really
thinking about the multicast semantics of the register, especially when
the code was copy/pasted from earlier platforms where they weren't
multicast (the TLB invalidation registers are an example --- should we
be waiting for an ack to come back on every mslice rather than just on a
single random mslice?).  MCR registers seem to be an area that's pretty
mysterious to a lot of people that haven't looked at them carefully, and
sometimes doing a simple intel_uncore_{read,write} doesn't accomplish
what you'd expect; forcing us to be a bit more deliberate about the type
of behavior we expect to get seems like it will help reduce mistakes in
the long run.
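
A stripped-down sketch of how distinct register types force that
deliberateness follows. The ex_* names and EX_* macros are hypothetical
stand-ins that only mirror the shape of i915_reg_t / i915_mcr_reg_t from
the earlier patch; they are not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring the two register types in the series. */
typedef struct { uint32_t reg; } ex_reg_t;
typedef struct { uint32_t mmio; } ex_mcr_reg_t;

#define EX_MMIO(r)		((const ex_reg_t){ .reg = (r) })
#define EX_MCR_REG(offset)	((const ex_mcr_reg_t){ .mmio = (offset) })

/* Accessor for ordinary registers; only accepts ex_reg_t. */
static uint32_t ex_plain_offset(ex_reg_t reg)
{
	return reg.reg;
}

/* Accessor for multicast registers; only accepts ex_mcr_reg_t. */
static uint32_t ex_mcr_offset(ex_mcr_reg_t reg)
{
	return reg.mmio;
}

/*
 * The single, explicit escape hatch.  Any code that wants to treat an MCR
 * register as a plain one has to spell out this conversion.
 */
static ex_reg_t ex_mcr_cast(ex_mcr_reg_t mcr)
{
	ex_reg_t r = { .reg = mcr.mmio };

	return r;
}

/*
 * Because the two structs are unrelated types, the following line would
 * be rejected by the compiler instead of quietly doing an unsteered
 * access:
 *
 *	ex_plain_offset(EX_MCR_REG(0xb118));
 */
```

The point is only that the mismatch becomes a build failure; which
helper a given call site should actually use still has to be decided by
whoever writes the code.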

> 
> So I guess I am not immediately buying the need to refactor all this. Apart
> from churn, the main downside that I see is that all accesses need separate
> helpers. Question is why. Driver could choose to always steer before reading
> today, right?

You mean like making intel_uncore_read always do a steering table lookup
for all registers?  We could, although I'm a bit hesitant to add even
more GT-specific logic to the uncore functions that are used for non-GT
purposes as well.  And like I said before, it still hides the fact that
there may be multiple register instances and you're just reading one
semi-random instance.


Matt

> 
> Regards,
> 
> Tvrtko
> 
> > 
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 40 +++++++++
> >   drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
> >   drivers/gpu/drm/i915/gt/intel_workarounds.c | 98 ++++-----------------
> >   3 files changed, 56 insertions(+), 83 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > index a9a9fa6881f2..787752367337 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > @@ -35,6 +35,7 @@
> >    */
> >   static const char * const intel_steering_types[] = {
> > +	"GSLICE/DSS",
> >   	"L3BANK",
> >   	"MSLICE",
> >   	"LNCF",
> > @@ -45,6 +46,35 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
> >   	{},
> >   };
> > +static const struct intel_mmio_range xehpsdv_dss_steering_table[] = {
> > +	{ 0x005200, 0x0052FF },
> > +	{ 0x005400, 0x007FFF },
> > +	{ 0x008140, 0x00815F },
> > +	{ 0x008D00, 0x008DFF },
> > +	{ 0x0094D0, 0x00955F },
> > +	{ 0x009680, 0x0096FF },
> > +	{ 0x00DC00, 0x00DCFF },
> > +	{ 0x00DE80, 0x00E8FF },
> > +	{ 0x017000, 0x017FFF },
> > +	{ 0x024A00, 0x024A7F },
> > +	{},
> > +};
> > +
> > +static const struct intel_mmio_range dg2_dss_steering_table[] = {
> > +	{ 0x005200, 0x0052FF },
> > +	{ 0x005400, 0x007FFF },
> > +	{ 0x008140, 0x00815F },
> > +	{ 0x008D00, 0x008DFF },
> > +	{ 0x0094D0, 0x00955F },
> > +	{ 0x009680, 0x0096FF },
> > +	{ 0x00D800, 0x00D87F },
> > +	{ 0x00DC00, 0x00DCFF },
> > +	{ 0x00DE80, 0x00E8FF },
> > +	{ 0x017000, 0x017FFF },
> > +	{ 0x024A00, 0x024A7F },
> > +	{},
> > +};
> > +
> >   static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
> >   	{ 0x004000, 0x004AFF },
> >   	{ 0x00C800, 0x00CFFF },
> > @@ -87,9 +117,11 @@ void intel_gt_mcr_init(struct intel_gt *gt)
> >   			 GEN12_MEML3_EN_MASK);
> >   	if (IS_DG2(i915)) {
> > +		gt->steering_table[DSS] = dg2_dss_steering_table;
> >   		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
> >   		gt->steering_table[LNCF] = dg2_lncf_steering_table;
> >   	} else if (IS_XEHPSDV(i915)) {
> > +		gt->steering_table[DSS] = xehpsdv_dss_steering_table;
> >   		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
> >   		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
> >   	} else if (GRAPHICS_VER(i915) >= 11 &&
> > @@ -317,7 +349,15 @@ static void get_valid_steering(struct intel_gt *gt,
> >   			       enum intel_steering_type type,
> >   			       u8 *group, u8 *instance)
> >   {
> > +	u32 dssmask = intel_sseu_get_subslices(&gt->info.sseu, 0);
> > +
> >   	switch (type) {
> > +	case DSS:
> > +		drm_WARN_ON(&gt->i915->drm, dssmask == 0);
> > +
> > +		*group = __ffs(dssmask) / GEN_DSS_PER_GSLICE;
> > +		*instance = __ffs(dssmask) % GEN_DSS_PER_GSLICE;
> > +		break;
> >   	case L3BANK:
> >   		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > index 937b2e1a305e..b77bbaac7622 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > @@ -54,6 +54,7 @@ struct intel_mmio_range {
> >    * are listed here.
> >    */
> >   enum intel_steering_type {
> > +	DSS,
> >   	L3BANK,
> >   	MSLICE,
> >   	LNCF,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 818ba71f4909..2486c6aa9d9d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -1160,87 +1160,6 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> >   	__add_mcr_wa(gt, wal, slice, subslice);
> >   }
> > -static void
> > -xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> > -{
> > -	const struct sseu_dev_info *sseu = &gt->info.sseu;
> > -	unsigned long slice, subslice = 0, slice_mask = 0;
> > -	u64 dss_mask = 0;
> > -	u32 lncf_mask = 0;
> > -	int i;
> > -
> > -	/*
> > -	 * On Xe_HP the steering increases in complexity. There are now several
> > -	 * more units that require steering and we're not guaranteed to be able
> > -	 * to find a common setting for all of them. These are:
> > -	 * - GSLICE (fusable)
> > -	 * - DSS (sub-unit within gslice; fusable)
> > -	 * - L3 Bank (fusable)
> > -	 * - MSLICE (fusable)
> > -	 * - LNCF (sub-unit within mslice; always present if mslice is present)
> > -	 *
> > -	 * We'll do our default/implicit steering based on GSLICE (in the
> > -	 * sliceid field) and DSS (in the subsliceid field).  If we can
> > -	 * find overlap between the valid MSLICE and/or LNCF values with
> > -	 * a suitable GSLICE, then we can just re-use the default value and
> > -	 * skip and explicit steering at runtime.
> > -	 *
> > -	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
> > -	 * a valid sliceid value.  DSS steering is the only type of steering
> > -	 * that utilizes the 'subsliceid' bits.
> > -	 *
> > -	 * Also note that, even though the steering domain is called "GSlice"
> > -	 * and it is encoded in the register using the gslice format, the spec
> > -	 * says that the combined (geometry | compute) fuse should be used to
> > -	 * select the steering.
> > -	 */
> > -
> > -	/* Find the potential gslice candidates */
> > -	dss_mask = intel_sseu_get_subslices(sseu, 0);
> > -	slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
> > -
> > -	/*
> > -	 * Find the potential LNCF candidates.  Either LNCF within a valid
> > -	 * mslice is fine.
> > -	 */
> > -	for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
> > -		lncf_mask |= (0x3 << (i * 2));
> > -
> > -	/*
> > -	 * Are there any sliceid values that work for both GSLICE and LNCF
> > -	 * steering?
> > -	 */
> > -	if (slice_mask & lncf_mask) {
> > -		slice_mask &= lncf_mask;
> > -		gt->steering_table[LNCF] = NULL;
> > -	}
> > -
> > -	/* How about sliceid values that also work for MSLICE steering? */
> > -	if (slice_mask & gt->info.mslice_mask) {
> > -		slice_mask &= gt->info.mslice_mask;
> > -		gt->steering_table[MSLICE] = NULL;
> > -	}
> > -
> > -	slice = __ffs(slice_mask);
> > -	subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
> > -	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
> > -	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
> > -
> > -	__add_mcr_wa(gt, wal, slice, subslice);
> > -
> > -	/*
> > -	 * SQIDI ranges are special because they use different steering
> > -	 * registers than everything else we work with.  On XeHP SDV and
> > -	 * DG2-G10, any value in the steering registers will work fine since
> > -	 * all instances are present, but DG2-G11 only has SQIDI instances at
> > -	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
> > -	 * we'll just steer to a hardcoded "2" since that value will work
> > -	 * everywhere.
> > -	 */
> > -	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> > -	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> > -}
> > -
> >   static void
> >   icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >   {
> > @@ -1388,8 +1307,9 @@ static void
> >   xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >   {
> >   	struct drm_i915_private *i915 = gt->i915;
> > +	struct drm_printer p = drm_debug_printer("MCR Steering:");
> > -	xehp_init_mcr(gt, wal);
> > +	intel_gt_mcr_report_steering(&p, gt, false);
> >   	/* Wa_1409757795:xehpsdv */
> >   	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
> > @@ -1441,10 +1361,22 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >   static void
> >   dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >   {
> > +	struct drm_printer p = drm_debug_printer("MCR Steering:");
> >   	struct intel_engine_cs *engine;
> >   	int id;
> > -	xehp_init_mcr(gt, wal);
> > +	intel_gt_mcr_report_steering(&p, gt, false);
> > +
> > +	/*
> > +	 * SQIDI ranges are special because they use different steering
> > +	 * registers than everything else we work with.  On DG2-G10, any value
> > +	 * in the steering registers will work fine since all instances are
> > +	 * present, but DG2-G11 only has SQIDI instances at ID's 2 and 3, so we
> > +	 * need to steer to one of those.  For simplicity we'll just steer to a
> > +	 * hardcoded "2" since that value will work everywhere.
> > +	 */
> > +	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> > +	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> >   	/* Wa_14011060649:dg2 */
> >   	wa_14011060649(gt, wal);

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


* Re: [Intel-gfx] [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering
  2022-04-01  8:34   ` Tvrtko Ursulin
@ 2022-04-04 21:42     ` Matt Roper
  2022-04-07 12:30       ` Tvrtko Ursulin
  0 siblings, 1 reply; 24+ messages in thread
From: Matt Roper @ 2022-04-04 21:42 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx, dri-devel

On Fri, Apr 01, 2022 at 09:34:04AM +0100, Tvrtko Ursulin wrote:
> 
> On 31/03/2022 00:28, Matt Roper wrote:
> > Historically we've selected and programmed a single MCR group/instance
> > ID at driver startup that will steer register accesses for GSLICE/DSS
> > ranges to a non-terminated instance.  Any reads of these register ranges
> > that don't need a specific unicast access won't bother explicitly
> > resteering because the control register should already be set to a
> > suitable value to receive a non-terminated response.  Any accesses to
> > other types of MCR ranges (MSLICE, LNCF, etc.) that are also satisfied
> > by the default steering target will also skip explicit re-steering at
> > runtime.
> > 
> > This approach has worked well for many years and many platforms, but our
> > hardware teams have recently advised us that they're not 100%
> > comfortable with our strategy going forward.  They now suggest
> > explicitly steering reads of any multicast register at the time the
> > register access happens rather than relying on previously-programmed
> > steering values.  In debug settings there could be external agents that
> > have adjusted the default steering without the driver's knowledge (e.g.,
> > we could do this manually with IGT's intel_reg today, although the
> > hardware teams also have other tools they use for debug and analysis).
> > In theory it would also be possible for bad firmware and/or incorrect
> > handling of power management events to clobber/wipe the steering
> > value we had previously initialized because they assume we'll be
> > re-programming it anyway.
> > 
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 40 +++++++++
> >   drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
> >   drivers/gpu/drm/i915/gt/intel_workarounds.c | 98 ++++-----------------
> >   3 files changed, 56 insertions(+), 83 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > index a9a9fa6881f2..787752367337 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
> > @@ -35,6 +35,7 @@
> >    */
> >   static const char * const intel_steering_types[] = {
> > +	"GSLICE/DSS",
> >   	"L3BANK",
> >   	"MSLICE",
> >   	"LNCF",
> > @@ -45,6 +46,35 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
> >   	{},
> >   };
> > +static const struct intel_mmio_range xehpsdv_dss_steering_table[] = {
> > +	{ 0x005200, 0x0052FF },
> > +	{ 0x005400, 0x007FFF },
> > +	{ 0x008140, 0x00815F },
> > +	{ 0x008D00, 0x008DFF },
> > +	{ 0x0094D0, 0x00955F },
> > +	{ 0x009680, 0x0096FF },
> > +	{ 0x00DC00, 0x00DCFF },
> > +	{ 0x00DE80, 0x00E8FF },
> > +	{ 0x017000, 0x017FFF },
> > +	{ 0x024A00, 0x024A7F },
> > +	{},
> > +};
> > +
> > +static const struct intel_mmio_range dg2_dss_steering_table[] = {
> > +	{ 0x005200, 0x0052FF },
> > +	{ 0x005400, 0x007FFF },
> > +	{ 0x008140, 0x00815F },
> > +	{ 0x008D00, 0x008DFF },
> > +	{ 0x0094D0, 0x00955F },
> > +	{ 0x009680, 0x0096FF },
> > +	{ 0x00D800, 0x00D87F },
> > +	{ 0x00DC00, 0x00DCFF },
> > +	{ 0x00DE80, 0x00E8FF },
> > +	{ 0x017000, 0x017FFF },
> > +	{ 0x024A00, 0x024A7F },
> > +	{},
> > +};
> > +
> >   static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
> >   	{ 0x004000, 0x004AFF },
> >   	{ 0x00C800, 0x00CFFF },
> > @@ -87,9 +117,11 @@ void intel_gt_mcr_init(struct intel_gt *gt)
> >   			 GEN12_MEML3_EN_MASK);
> >   	if (IS_DG2(i915)) {
> > +		gt->steering_table[DSS] = dg2_dss_steering_table;
> >   		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
> >   		gt->steering_table[LNCF] = dg2_lncf_steering_table;
> >   	} else if (IS_XEHPSDV(i915)) {
> > +		gt->steering_table[DSS] = xehpsdv_dss_steering_table;
> >   		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
> >   		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
> >   	} else if (GRAPHICS_VER(i915) >= 11 &&
> > @@ -317,7 +349,15 @@ static void get_valid_steering(struct intel_gt *gt,
> >   			       enum intel_steering_type type,
> >   			       u8 *group, u8 *instance)
> >   {
> > +	u32 dssmask = intel_sseu_get_subslices(&gt->info.sseu, 0);
> > +
> >   	switch (type) {
> > +	case DSS:
> > +		drm_WARN_ON(&gt->i915->drm, dssmask == 0);
> > +
> > +		*group = __ffs(dssmask) / GEN_DSS_PER_GSLICE;
> > +		*instance = __ffs(dssmask) % GEN_DSS_PER_GSLICE;
> > +		break;
> >   	case L3BANK:
> >   		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > index 937b2e1a305e..b77bbaac7622 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > @@ -54,6 +54,7 @@ struct intel_mmio_range {
> >    * are listed here.
> >    */
> >   enum intel_steering_type {
> > +	DSS,
> >   	L3BANK,
> >   	MSLICE,
> >   	LNCF,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 818ba71f4909..2486c6aa9d9d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -1160,87 +1160,6 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> >   	__add_mcr_wa(gt, wal, slice, subslice);
> >   }
> > -static void
> > -xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> > -{
> > -	const struct sseu_dev_info *sseu = &gt->info.sseu;
> > -	unsigned long slice, subslice = 0, slice_mask = 0;
> > -	u64 dss_mask = 0;
> > -	u32 lncf_mask = 0;
> > -	int i;
> > -
> > -	/*
> > -	 * On Xe_HP the steering increases in complexity. There are now several
> > -	 * more units that require steering and we're not guaranteed to be able
> > -	 * to find a common setting for all of them. These are:
> > -	 * - GSLICE (fusable)
> > -	 * - DSS (sub-unit within gslice; fusable)
> > -	 * - L3 Bank (fusable)
> > -	 * - MSLICE (fusable)
> > -	 * - LNCF (sub-unit within mslice; always present if mslice is present)
> > -	 *
> > -	 * We'll do our default/implicit steering based on GSLICE (in the
> > -	 * sliceid field) and DSS (in the subsliceid field).  If we can
> > -	 * find overlap between the valid MSLICE and/or LNCF values with
> > -	 * a suitable GSLICE, then we can just re-use the default value and
> > -	 * skip and explicit steering at runtime.
> > -	 *
> > -	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
> > -	 * a valid sliceid value.  DSS steering is the only type of steering
> > -	 * that utilizes the 'subsliceid' bits.
> > -	 *
> > -	 * Also note that, even though the steering domain is called "GSlice"
> > -	 * and it is encoded in the register using the gslice format, the spec
> > -	 * says that the combined (geometry | compute) fuse should be used to
> > -	 * select the steering.
> > -	 */
> > -
> > -	/* Find the potential gslice candidates */
> > -	dss_mask = intel_sseu_get_subslices(sseu, 0);
> > -	slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
> > -
> > -	/*
> > -	 * Find the potential LNCF candidates.  Either LNCF within a valid
> > -	 * mslice is fine.
> > -	 */
> > -	for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
> > -		lncf_mask |= (0x3 << (i * 2));
> > -
> > -	/*
> > -	 * Are there any sliceid values that work for both GSLICE and LNCF
> > -	 * steering?
> > -	 */
> > -	if (slice_mask & lncf_mask) {
> > -		slice_mask &= lncf_mask;
> > -		gt->steering_table[LNCF] = NULL;
> > -	}
> > -
> > -	/* How about sliceid values that also work for MSLICE steering? */
> > -	if (slice_mask & gt->info.mslice_mask) {
> > -		slice_mask &= gt->info.mslice_mask;
> > -		gt->steering_table[MSLICE] = NULL;
> > -	}
> > -
> > -	slice = __ffs(slice_mask);
> > -	subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
> > -	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
> > -	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
> > -
> > -	__add_mcr_wa(gt, wal, slice, subslice);
> > -
> > -	/*
> > -	 * SQIDI ranges are special because they use different steering
> > -	 * registers than everything else we work with.  On XeHP SDV and
> > -	 * DG2-G10, any value in the steering registers will work fine since
> > -	 * all instances are present, but DG2-G11 only has SQIDI instances at
> > -	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
> > -	 * we'll just steer to a hardcoded "2" since that value will work
> > -	 * everywhere.
> > -	 */
> > -	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> > -	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> > -}
> > -
> >   static void
> >   icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >   {
> > @@ -1388,8 +1307,9 @@ static void
> >   xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >   {
> >   	struct drm_i915_private *i915 = gt->i915;
> > +	struct drm_printer p = drm_debug_printer("MCR Steering:");
> > -	xehp_init_mcr(gt, wal);
> > +	intel_gt_mcr_report_steering(&p, gt, false);
> >   	/* Wa_1409757795:xehpsdv */
> >   	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
> > @@ -1441,10 +1361,22 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >   static void
> >   dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >   {
> > +	struct drm_printer p = drm_debug_printer("MCR Steering:");
> >   	struct intel_engine_cs *engine;
> >   	int id;
> > -	xehp_init_mcr(gt, wal);
> > +	intel_gt_mcr_report_steering(&p, gt, false);
> 
> Are these platforms immune to system hangs caused by incorrect/missing
> default MCR configuration, such as the one fixed by c7d561cfcf86 ("drm/i915:
> Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9")? To be
> clear, that one was triggerable from userspace.

They're supposed to be.  The mmio design guarantees specific termination
behavior for any disabled (fused off, powered down) register endpoint:
reads return a 0 dummy value and writes are dropped.  I can't find the
hardware description of WaProgramMgsrForCorrectSliceSpecificMmioReads
now, but it sounds like the termination points were either not part of
the design yet on old platforms like that, or were just not implemented
properly by hardware.
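
As a rough model of that behavior (a user-space sketch only, not driver
code): the snippet below picks the first enabled DSS the same way the
get_valid_steering() hunk in the patch does (group = bit / DSS per gslice,
instance = bit % DSS per gslice), and models a read steered at a disabled
instance as returning 0. The helper names and the DSS_PER_GSLICE value of 4
are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; stands in for GEN_DSS_PER_GSLICE. */
#define DSS_PER_GSLICE 4

/* Hypothetical group/instance pair that a steering register would hold. */
struct mcr_target {
	uint8_t group;
	uint8_t instance;
};

/* Index of the lowest set bit; the mask must be non-zero. */
static unsigned int first_set_bit(uint32_t mask)
{
	unsigned int n = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		n++;
	}

	return n;
}

/* Steer to the first DSS that is present according to the fuse mask. */
static struct mcr_target pick_dss_target(uint32_t dssmask)
{
	unsigned int dss = first_set_bit(dssmask);
	struct mcr_target t = {
		.group = (uint8_t)(dss / DSS_PER_GSLICE),
		.instance = (uint8_t)(dss % DSS_PER_GSLICE),
	};

	return t;
}

/*
 * Model of a steered read: values[] holds one register value per DSS.
 * A disabled (terminated) endpoint returns a dummy 0.
 */
static uint32_t model_mcr_read(uint32_t dssmask, const uint32_t *values,
			       struct mcr_target t)
{
	unsigned int dss = t.group * DSS_PER_GSLICE + t.instance;

	if (dss >= 32 || !(dssmask & (UINT32_C(1) << dss)))
		return 0;

	return values[dss];
}
```

With a mask that only has DSS 5 and 6 enabled, the target comes out as
group 1, instance 1, while a read steered at group 0, instance 0 returns
the terminated value 0 instead of hanging.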


Matt

> 
> Regards,
> 
> Tvrtko
> 
> > +
> > +	/*
> > +	 * SQIDI ranges are special because they use different steering
> > +	 * registers than everything else we work with.  On DG2-G10, any value
> > +	 * in the steering registers will work fine since all instances are
> > +	 * present, but DG2-G11 only has SQIDI instances at ID's 2 and 3, so we
> > +	 * need to steer to one of those.  For simplicity we'll just steer to a
> > +	 * hardcoded "2" since that value will work everywhere.
> > +	 */
> > +	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> > +	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> >   	/* Wa_14011060649:dg2 */
> >   	wa_14011060649(gt, wal);

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


* Re: [Intel-gfx] [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering
  2022-04-04 21:35     ` Matt Roper
@ 2022-04-07 12:25       ` Tvrtko Ursulin
  0 siblings, 0 replies; 24+ messages in thread
From: Tvrtko Ursulin @ 2022-04-07 12:25 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, dri-devel


On 04/04/2022 22:35, Matt Roper wrote:
> On Thu, Mar 31, 2022 at 06:35:52PM +0100, Tvrtko Ursulin wrote:
>>
>> On 31/03/2022 00:28, Matt Roper wrote:
>>> Historically we've selected and programmed a single MCR group/instance
>>> ID at driver startup that will steer register accesses for GSLICE/DSS
>>> ranges to a non-terminated instance.  Any reads of these register ranges
>>> that don't need a specific unicast access won't bother explicitly
>>> resteering because the control register should already be set to a
>>> suitable value to receive a non-terminated response.  Any accesses to
>>> other types of MCR ranges (MSLICE, LNCF, etc.) that are also satisfied
>>> by the default steering target will also skip explicit re-steering at
>>> runtime.
>>>
>>> This approach has worked well for many years and many platforms, but our
>>> hardware teams have recently advised us that they're not 100%
>>> comfortable with our strategy going forward.  They now suggest
>>> explicitly steering reads of any multicast register at the time the
>>> register access happens rather than relying on previously-programmed
>>> steering values.  In debug settings there could be external agents that
>>> have adjusted the default steering without the driver's knowledge (e.g.,
>>> we could do this manually with IGT's intel_reg today, although the
>>> hardware teams also have other tools they use for debug and analysis).
>>> In theory it would also be possible for bad firmware and/or incorrect
>>> handling of power management events to clobber/wipe the steering
>>> value we had previously initialized because they assume we'll be
>>> re-programming it anyway.
>>
>> That an external agent of any kind can mess with registers behind the
>> driver's back is kind of a weak justification, no? Steering is just one
>> small part of what can go wrong in that case.
> 
> Apparently the assumption when the hardware was designed was that
> software would explicitly steer every MCR access; they never really
> considered the design we've been using where we try to set it up once

Are you referring to Xe_HP here or even older generations? Pretty sure
the way we programmed the steering at driver load was the same as what
the Windows driver was doing. If we both got it wrong that would be funny.

> and try to minimize subsequent updates to the steering control.  In a
> lot of cases different agents updating steering have their own steering
> control registers (i.e., the GuC, the command streamers, and a couple
> other internal hardware units have their own independent steering
> control registers to try to avoid racing with whatever the KMD/CPU is
> doing), but I think there may have been some cases that aren't 100%
> covered there.

On the last point, it would be good to find out: do we have a current
case (on a current platform) where something in the hardware can mess up
the steering behind the driver's back, and if so, where?

> This is also partially motivated by the direction the general hardware
> teams want to move in the future --- they plan to reduce the number of
> different steering control registers for different agents and make more
> of them share a single register with the KMD/CPU.  There will be a
> separate "semaphore" register used for coordinating access between
> various agents (of course that will bring new challenges such as
> increased latency and what to do if some hardware unit grabs the
> semaphore and somehow fails to release it).

Understandable I guess, thanks for the info! As long as i915 still isn't
accessing those registers much, the problem should not be that big.

Regards,

Tvrtko

> 
>>
>> Also, someone at some point has to know which are the affected registers. Be
>> it a range table, or at actual point of submitting patches which add
>> register definitions. At any of those points mistakes are possible.
> 
> True.  But today there are a lot of registers used by the driver that
> are multicast and I don't think the code written around them was really
> thinking about the multicast semantics of the register, especially when
> the code was copy/pasted from earlier platforms where they weren't
> multicast (the TLB invalidation registers are an example --- should we
> be waiting for an ack to come back on every mslice rather than just on a
> single random mslice?).  MCR registers seem to be an area that's pretty
> mysterious to a lot of people that haven't looked at them carefully, and
> sometimes doing a simple intel_uncore_{read,write} doesn't accomplish
> what you'd expect; forcing us to be a bit more deliberate about the type
> of behavior we expect to get seems like it will help reduce mistakes in
> the long run.
> 
>>
>> So I guess I am not immediately buying the need to refactor all this. Apart
>> from churn, the main downside that I see is that all accesses need separate
>> helpers. Question is why. Driver could choose to always steer before reading
>> today, right?
> 
> You mean like making intel_uncore_read always do a steering table lookup
> for all registers?  We could, although I'm a bit hesitant to add even
> more GT-specific logic to the uncore functions that are used for non-GT
> purposes as well.  And like I said before, it still hides the fact that
> there may be multiple register instances and you're just reading one
> semi-random instance.
> 
> 
> Matt
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>>
>>> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 40 +++++++++
>>>    drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
>>>    drivers/gpu/drm/i915/gt/intel_workarounds.c | 98 ++++-----------------
>>>    3 files changed, 56 insertions(+), 83 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
>>> index a9a9fa6881f2..787752367337 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
>>> @@ -35,6 +35,7 @@
>>>     */
>>>    static const char * const intel_steering_types[] = {
>>> +	"GSLICE/DSS",
>>>    	"L3BANK",
>>>    	"MSLICE",
>>>    	"LNCF",
>>> @@ -45,6 +46,35 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
>>>    	{},
>>>    };
>>> +static const struct intel_mmio_range xehpsdv_dss_steering_table[] = {
>>> +	{ 0x005200, 0x0052FF },
>>> +	{ 0x005400, 0x007FFF },
>>> +	{ 0x008140, 0x00815F },
>>> +	{ 0x008D00, 0x008DFF },
>>> +	{ 0x0094D0, 0x00955F },
>>> +	{ 0x009680, 0x0096FF },
>>> +	{ 0x00DC00, 0x00DCFF },
>>> +	{ 0x00DE80, 0x00E8FF },
>>> +	{ 0x017000, 0x017FFF },
>>> +	{ 0x024A00, 0x024A7F },
>>> +	{},
>>> +};
>>> +
>>> +static const struct intel_mmio_range dg2_dss_steering_table[] = {
>>> +	{ 0x005200, 0x0052FF },
>>> +	{ 0x005400, 0x007FFF },
>>> +	{ 0x008140, 0x00815F },
>>> +	{ 0x008D00, 0x008DFF },
>>> +	{ 0x0094D0, 0x00955F },
>>> +	{ 0x009680, 0x0096FF },
>>> +	{ 0x00D800, 0x00D87F },
>>> +	{ 0x00DC00, 0x00DCFF },
>>> +	{ 0x00DE80, 0x00E8FF },
>>> +	{ 0x017000, 0x017FFF },
>>> +	{ 0x024A00, 0x024A7F },
>>> +	{},
>>> +};
>>> +
>>>    static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
>>>    	{ 0x004000, 0x004AFF },
>>>    	{ 0x00C800, 0x00CFFF },
>>> @@ -87,9 +117,11 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>>>    			 GEN12_MEML3_EN_MASK);
>>>    	if (IS_DG2(i915)) {
>>> +		gt->steering_table[DSS] = dg2_dss_steering_table;
>>>    		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
>>>    		gt->steering_table[LNCF] = dg2_lncf_steering_table;
>>>    	} else if (IS_XEHPSDV(i915)) {
>>> +		gt->steering_table[DSS] = xehpsdv_dss_steering_table;
>>>    		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
>>>    		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
>>>    	} else if (GRAPHICS_VER(i915) >= 11 &&
>>> @@ -317,7 +349,15 @@ static void get_valid_steering(struct intel_gt *gt,
>>>    			       enum intel_steering_type type,
>>>    			       u8 *group, u8 *instance)
>>>    {
>>> +	u32 dssmask = intel_sseu_get_subslices(&gt->info.sseu, 0);
>>> +
>>>    	switch (type) {
>>> +	case DSS:
>>> +		drm_WARN_ON(&gt->i915->drm, dssmask == 0);
>>> +
>>> +		*group = __ffs(dssmask) / GEN_DSS_PER_GSLICE;
>>> +		*instance = __ffs(dssmask) % GEN_DSS_PER_GSLICE;
>>> +		break;
>>>    	case L3BANK:
>>>    		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
>>> index 937b2e1a305e..b77bbaac7622 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
>>> @@ -54,6 +54,7 @@ struct intel_mmio_range {
>>>     * are listed here.
>>>     */
>>>    enum intel_steering_type {
>>> +	DSS,
>>>    	L3BANK,
>>>    	MSLICE,
>>>    	LNCF,
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> index 818ba71f4909..2486c6aa9d9d 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> @@ -1160,87 +1160,6 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    	__add_mcr_wa(gt, wal, slice, subslice);
>>>    }
>>> -static void
>>> -xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
>>> -{
>>> -	const struct sseu_dev_info *sseu = &gt->info.sseu;
>>> -	unsigned long slice, subslice = 0, slice_mask = 0;
>>> -	u64 dss_mask = 0;
>>> -	u32 lncf_mask = 0;
>>> -	int i;
>>> -
>>> -	/*
>>> -	 * On Xe_HP the steering increases in complexity. There are now several
>>> -	 * more units that require steering and we're not guaranteed to be able
>>> -	 * to find a common setting for all of them. These are:
>>> -	 * - GSLICE (fusable)
>>> -	 * - DSS (sub-unit within gslice; fusable)
>>> -	 * - L3 Bank (fusable)
>>> -	 * - MSLICE (fusable)
>>> -	 * - LNCF (sub-unit within mslice; always present if mslice is present)
>>> -	 *
>>> -	 * We'll do our default/implicit steering based on GSLICE (in the
>>> -	 * sliceid field) and DSS (in the subsliceid field).  If we can
>>> -	 * find overlap between the valid MSLICE and/or LNCF values with
>>> -	 * a suitable GSLICE, then we can just re-use the default value and
>>> -	 * skip and explicit steering at runtime.
>>> -	 *
>>> -	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
>>> -	 * a valid sliceid value.  DSS steering is the only type of steering
>>> -	 * that utilizes the 'subsliceid' bits.
>>> -	 *
>>> -	 * Also note that, even though the steering domain is called "GSlice"
>>> -	 * and it is encoded in the register using the gslice format, the spec
>>> -	 * says that the combined (geometry | compute) fuse should be used to
>>> -	 * select the steering.
>>> -	 */
>>> -
>>> -	/* Find the potential gslice candidates */
>>> -	dss_mask = intel_sseu_get_subslices(sseu, 0);
>>> -	slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
>>> -
>>> -	/*
>>> -	 * Find the potential LNCF candidates.  Either LNCF within a valid
>>> -	 * mslice is fine.
>>> -	 */
>>> -	for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
>>> -		lncf_mask |= (0x3 << (i * 2));
>>> -
>>> -	/*
>>> -	 * Are there any sliceid values that work for both GSLICE and LNCF
>>> -	 * steering?
>>> -	 */
>>> -	if (slice_mask & lncf_mask) {
>>> -		slice_mask &= lncf_mask;
>>> -		gt->steering_table[LNCF] = NULL;
>>> -	}
>>> -
>>> -	/* How about sliceid values that also work for MSLICE steering? */
>>> -	if (slice_mask & gt->info.mslice_mask) {
>>> -		slice_mask &= gt->info.mslice_mask;
>>> -		gt->steering_table[MSLICE] = NULL;
>>> -	}
>>> -
>>> -	slice = __ffs(slice_mask);
>>> -	subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
>>> -	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
>>> -	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
>>> -
>>> -	__add_mcr_wa(gt, wal, slice, subslice);
>>> -
>>> -	/*
>>> -	 * SQIDI ranges are special because they use different steering
>>> -	 * registers than everything else we work with.  On XeHP SDV and
>>> -	 * DG2-G10, any value in the steering registers will work fine since
>>> -	 * all instances are present, but DG2-G11 only has SQIDI instances at
>>> -	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
>>> -	 * we'll just steer to a hardcoded "2" since that value will work
>>> -	 * everywhere.
>>> -	 */
>>> -	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
>>> -	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
>>> -}
>>> -
>>>    static void
>>>    icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    {
>>> @@ -1388,8 +1307,9 @@ static void
>>>    xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    {
>>>    	struct drm_i915_private *i915 = gt->i915;
>>> +	struct drm_printer p = drm_debug_printer("MCR Steering:");
>>> -	xehp_init_mcr(gt, wal);
>>> +	intel_gt_mcr_report_steering(&p, gt, false);
>>>    	/* Wa_1409757795:xehpsdv */
>>>    	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
>>> @@ -1441,10 +1361,22 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    static void
>>>    dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    {
>>> +	struct drm_printer p = drm_debug_printer("MCR Steering:");
>>>    	struct intel_engine_cs *engine;
>>>    	int id;
>>> -	xehp_init_mcr(gt, wal);
>>> +	intel_gt_mcr_report_steering(&p, gt, false);
>>> +
>>> +	/*
>>> +	 * SQIDI ranges are special because they use different steering
>>> +	 * registers than everything else we work with.  On DG2-G10, any value
>>> +	 * in the steering registers will work fine since all instances are
>>> +	 * present, but DG2-G11 only has SQIDI instances at ID's 2 and 3, so we
>>> +	 * need to steer to one of those.  For simplicity we'll just steer to a
>>> +	 * hardcoded "2" since that value will work everywhere.
>>> +	 */
>>> +	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
>>> +	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
>>>    	/* Wa_14011060649:dg2 */
>>>    	wa_14011060649(gt, wal);
> 


* Re: [Intel-gfx] [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering
  2022-04-04 21:42     ` Matt Roper
@ 2022-04-07 12:30       ` Tvrtko Ursulin
  0 siblings, 0 replies; 24+ messages in thread
From: Tvrtko Ursulin @ 2022-04-07 12:30 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, dri-devel


On 04/04/2022 22:42, Matt Roper wrote:
> On Fri, Apr 01, 2022 at 09:34:04AM +0100, Tvrtko Ursulin wrote:
>>
>> On 31/03/2022 00:28, Matt Roper wrote:
>>> Historically we've selected and programmed a single MCR group/instance
>>> ID at driver startup that will steer register accesses for GSLICE/DSS
>>> ranges to a non-terminated instance.  Any reads of these register ranges
>>> that don't need a specific unicast access won't bother explicitly
>>> resteering because the control register should already be set to a
>>> suitable value to receive a non-terminated response.  Any accesses to
>>> other types of MCR ranges (MSLICE, LNCF, etc.) that are also satisfied
>>> by the default steering target will also skip explicit re-steering at
>>> runtime.
>>>
>>> This approach has worked well for many years and many platforms, but our
>>> hardware teams have recently advised us that they're not 100%
>>> comfortable with our strategy going forward.  They now suggest
>>> explicitly steering reads of any multicast register at the time the
>>> register access happens rather than relying on previously-programmed
>>> steering values.  In debug settings there could be external agents that
>>> have adjusted the default steering without the driver's knowledge (e.g.,
>>> we could do this manually with IGT's intel_reg today, although the
>>> hardware teams also have other tools they use for debug and analysis).
>>> In theory it would also be possible for bad firmware and/or incorrect
>>> handling of power management events to clobber/wipe the steering
>>> value we had previously initialized because they assume we'll be
>>> re-programming it anyway.
>>>
>>> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_gt_mcr.c      | 40 +++++++++
>>>    drivers/gpu/drm/i915/gt/intel_gt_types.h    |  1 +
>>>    drivers/gpu/drm/i915/gt/intel_workarounds.c | 98 ++++-----------------
>>>    3 files changed, 56 insertions(+), 83 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
>>> index a9a9fa6881f2..787752367337 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
>>> @@ -35,6 +35,7 @@
>>>     */
>>>    static const char * const intel_steering_types[] = {
>>> +	"GSLICE/DSS",
>>>    	"L3BANK",
>>>    	"MSLICE",
>>>    	"LNCF",
>>> @@ -45,6 +46,35 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
>>>    	{},
>>>    };
>>> +static const struct intel_mmio_range xehpsdv_dss_steering_table[] = {
>>> +	{ 0x005200, 0x0052FF },
>>> +	{ 0x005400, 0x007FFF },
>>> +	{ 0x008140, 0x00815F },
>>> +	{ 0x008D00, 0x008DFF },
>>> +	{ 0x0094D0, 0x00955F },
>>> +	{ 0x009680, 0x0096FF },
>>> +	{ 0x00DC00, 0x00DCFF },
>>> +	{ 0x00DE80, 0x00E8FF },
>>> +	{ 0x017000, 0x017FFF },
>>> +	{ 0x024A00, 0x024A7F },
>>> +	{},
>>> +};
>>> +
>>> +static const struct intel_mmio_range dg2_dss_steering_table[] = {
>>> +	{ 0x005200, 0x0052FF },
>>> +	{ 0x005400, 0x007FFF },
>>> +	{ 0x008140, 0x00815F },
>>> +	{ 0x008D00, 0x008DFF },
>>> +	{ 0x0094D0, 0x00955F },
>>> +	{ 0x009680, 0x0096FF },
>>> +	{ 0x00D800, 0x00D87F },
>>> +	{ 0x00DC00, 0x00DCFF },
>>> +	{ 0x00DE80, 0x00E8FF },
>>> +	{ 0x017000, 0x017FFF },
>>> +	{ 0x024A00, 0x024A7F },
>>> +	{},
>>> +};
>>> +
>>>    static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
>>>    	{ 0x004000, 0x004AFF },
>>>    	{ 0x00C800, 0x00CFFF },
>>> @@ -87,9 +117,11 @@ void intel_gt_mcr_init(struct intel_gt *gt)
>>>    			 GEN12_MEML3_EN_MASK);
>>>    	if (IS_DG2(i915)) {
>>> +		gt->steering_table[DSS] = dg2_dss_steering_table;
>>>    		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
>>>    		gt->steering_table[LNCF] = dg2_lncf_steering_table;
>>>    	} else if (IS_XEHPSDV(i915)) {
>>> +		gt->steering_table[DSS] = xehpsdv_dss_steering_table;
>>>    		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
>>>    		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
>>>    	} else if (GRAPHICS_VER(i915) >= 11 &&
>>> @@ -317,7 +349,15 @@ static void get_valid_steering(struct intel_gt *gt,
>>>    			       enum intel_steering_type type,
>>>    			       u8 *group, u8 *instance)
>>>    {
>>> +	u32 dssmask = intel_sseu_get_subslices(&gt->info.sseu, 0);
>>> +
>>>    	switch (type) {
>>> +	case DSS:
>>> +		drm_WARN_ON(&gt->i915->drm, dssmask == 0);
>>> +
>>> +		*group = __ffs(dssmask) / GEN_DSS_PER_GSLICE;
>>> +		*instance = __ffs(dssmask) % GEN_DSS_PER_GSLICE;
>>> +		break;
>>>    	case L3BANK:
>>>    		GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
>>> index 937b2e1a305e..b77bbaac7622 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
>>> @@ -54,6 +54,7 @@ struct intel_mmio_range {
>>>     * are listed here.
>>>     */
>>>    enum intel_steering_type {
>>> +	DSS,
>>>    	L3BANK,
>>>    	MSLICE,
>>>    	LNCF,
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> index 818ba71f4909..2486c6aa9d9d 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> @@ -1160,87 +1160,6 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    	__add_mcr_wa(gt, wal, slice, subslice);
>>>    }
>>> -static void
>>> -xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
>>> -{
>>> -	const struct sseu_dev_info *sseu = &gt->info.sseu;
>>> -	unsigned long slice, subslice = 0, slice_mask = 0;
>>> -	u64 dss_mask = 0;
>>> -	u32 lncf_mask = 0;
>>> -	int i;
>>> -
>>> -	/*
>>> -	 * On Xe_HP the steering increases in complexity. There are now several
>>> -	 * more units that require steering and we're not guaranteed to be able
>>> -	 * to find a common setting for all of them. These are:
>>> -	 * - GSLICE (fusable)
>>> -	 * - DSS (sub-unit within gslice; fusable)
>>> -	 * - L3 Bank (fusable)
>>> -	 * - MSLICE (fusable)
>>> -	 * - LNCF (sub-unit within mslice; always present if mslice is present)
>>> -	 *
>>> -	 * We'll do our default/implicit steering based on GSLICE (in the
>>> -	 * sliceid field) and DSS (in the subsliceid field).  If we can
>>> -	 * find overlap between the valid MSLICE and/or LNCF values with
>>> -	 * a suitable GSLICE, then we can just re-use the default value and
>>> -	 * skip and explicit steering at runtime.
>>> -	 *
>>> -	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
>>> -	 * a valid sliceid value.  DSS steering is the only type of steering
>>> -	 * that utilizes the 'subsliceid' bits.
>>> -	 *
>>> -	 * Also note that, even though the steering domain is called "GSlice"
>>> -	 * and it is encoded in the register using the gslice format, the spec
>>> -	 * says that the combined (geometry | compute) fuse should be used to
>>> -	 * select the steering.
>>> -	 */
>>> -
>>> -	/* Find the potential gslice candidates */
>>> -	dss_mask = intel_sseu_get_subslices(sseu, 0);
>>> -	slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
>>> -
>>> -	/*
>>> -	 * Find the potential LNCF candidates.  Either LNCF within a valid
>>> -	 * mslice is fine.
>>> -	 */
>>> -	for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
>>> -		lncf_mask |= (0x3 << (i * 2));
>>> -
>>> -	/*
>>> -	 * Are there any sliceid values that work for both GSLICE and LNCF
>>> -	 * steering?
>>> -	 */
>>> -	if (slice_mask & lncf_mask) {
>>> -		slice_mask &= lncf_mask;
>>> -		gt->steering_table[LNCF] = NULL;
>>> -	}
>>> -
>>> -	/* How about sliceid values that also work for MSLICE steering? */
>>> -	if (slice_mask & gt->info.mslice_mask) {
>>> -		slice_mask &= gt->info.mslice_mask;
>>> -		gt->steering_table[MSLICE] = NULL;
>>> -	}
>>> -
>>> -	slice = __ffs(slice_mask);
>>> -	subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
>>> -	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
>>> -	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
>>> -
>>> -	__add_mcr_wa(gt, wal, slice, subslice);
>>> -
>>> -	/*
>>> -	 * SQIDI ranges are special because they use different steering
>>> -	 * registers than everything else we work with.  On XeHP SDV and
>>> -	 * DG2-G10, any value in the steering registers will work fine since
>>> -	 * all instances are present, but DG2-G11 only has SQIDI instances at
>>> -	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
>>> -	 * we'll just steer to a hardcoded "2" since that value will work
>>> -	 * everywhere.
>>> -	 */
>>> -	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
>>> -	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
>>> -}
>>> -
>>>    static void
>>>    icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    {
>>> @@ -1388,8 +1307,9 @@ static void
>>>    xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    {
>>>    	struct drm_i915_private *i915 = gt->i915;
>>> +	struct drm_printer p = drm_debug_printer("MCR Steering:");
>>> -	xehp_init_mcr(gt, wal);
>>> +	intel_gt_mcr_report_steering(&p, gt, false);
>>>    	/* Wa_1409757795:xehpsdv */
>>>    	wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
>>> @@ -1441,10 +1361,22 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    static void
>>>    dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
>>>    {
>>> +	struct drm_printer p = drm_debug_printer("MCR Steering:");
>>>    	struct intel_engine_cs *engine;
>>>    	int id;
>>> -	xehp_init_mcr(gt, wal);
>>> +	intel_gt_mcr_report_steering(&p, gt, false);
>>
>> Are these platforms immune to system hangs caused by an incorrect or
>> missing default MCR configuration, such as the one fixed by c7d561cfcf86
>> ("drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for
>> Gen9")? To be clear, that one was triggerable from userspace.
> 
> They're supposed to be.  The mmio design guarantees specific termination
> behavior for any disabled (fused off, powered down) register endpoint:
> reads return a 0 dummy value and writes are dropped.  I can't find the
> hardware description of WaProgramMgsrForCorrectSliceSpecificMmioReads
> now, but it sounds like the termination points were either not part of
> the design yet on old platforms like that, or were just not implemented
> properly by hardware.
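
For illustration only (this is not i915 code), the termination behavior
described above can be modelled as a small sketch in C. The names
mcr_model, model_read and model_write, and the instance count of 4, are
hypothetical and chosen just for this example: a read steered to an
absent instance returns a dummy 0, and a write steered to it is dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model: one register replicated across 4 instances. */
#define NUM_INSTANCES 4

struct mcr_model {
	uint32_t value[NUM_INSTANCES];
	uint32_t present_mask;	/* bit n set: instance n is present and powered */
};

/* A read steered to a terminated (absent) instance returns a dummy 0. */
static uint32_t model_read(const struct mcr_model *m, unsigned int inst)
{
	assert(inst < NUM_INSTANCES);

	if (!(m->present_mask & (1u << inst)))
		return 0;

	return m->value[inst];
}

/* A write steered to a terminated (absent) instance is silently dropped. */
static void model_write(struct mcr_model *m, unsigned int inst, uint32_t v)
{
	assert(inst < NUM_INSTANCES);

	if (m->present_mask & (1u << inst))
		m->value[inst] = v;
}
```

In this model a stale or wrong steering value gives a wrong answer (0)
rather than a hang, which is the property being asked about for Xe_HP.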

It would be good to get in touch with someone from Media UMD to gain 
understanding of the failure mode on Gen9 which was fixed by the quoted 
commit and get their statement on whether Xe_HP is immune.

With that clarified I think the high level approach is good. To be clear 
I am not considering the debugging angle as valid, but I am counting on 
the justification that legitimate parts of hardware will start fiddling 
with a shared steering register in the near future.
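
As a rough sketch of what steering at access time means for the DSS case
in the patch, the group/instance pair can be derived from the DSS fuse
mask whenever it is needed instead of being cached at init. The helper
names below are hypothetical, lowest_set_bit() stands in for the
kernel's __ffs(), and DSS_PER_GSLICE = 4 is an assumption made for the
example rather than something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the driver uses GEN_DSS_PER_GSLICE. */
#define DSS_PER_GSLICE 4

/* Index of the lowest set bit; the mask must be non-zero. */
static unsigned int lowest_set_bit(uint32_t mask)
{
	unsigned int i = 0;

	while (!(mask & 1u)) {
		mask >>= 1;
		i++;
	}

	return i;
}

/*
 * Pick a non-terminated DSS target: the first enabled DSS decides both
 * the gslice (group) and the DSS within that gslice (instance).
 */
static void pick_dss_steering(uint32_t dss_mask, uint8_t *group,
			      uint8_t *instance)
{
	unsigned int first;

	assert(dss_mask != 0);	/* the patch warns on an empty mask instead */

	first = lowest_set_bit(dss_mask);
	*group = first / DSS_PER_GSLICE;
	*instance = first % DSS_PER_GSLICE;
}
```

With a mask of 0x60 (DSS 5 and 6 enabled) this yields group 1, instance
1. Since the result depends only on the fuse mask, recomputing it on
each access does not rely on whatever value was last left in the
steering register.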

Regards,

Tvrtko

> 
> 
> Matt
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>> +
>>> +	/*
>>> +	 * SQIDI ranges are special because they use different steering
>>> +	 * registers than everything else we work with.  On DG2-G10, any value
>>> +	 * in the steering registers will work fine since all instances are
>>> +	 * present, but DG2-G11 only has SQIDI instances at ID's 2 and 3, so we
>>> +	 * need to steer to one of those.  For simplicity we'll just steer to a
>>> +	 * hardcoded "2" since that value will work everywhere.
>>> +	 */
>>> +	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
>>> +	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
>>>    	/* Wa_14011060649:dg2 */
>>>    	wa_14011060649(gt, wal);
> 


Thread overview: 24+ messages
2022-03-30 23:28 [PATCH 00/15] i915: Explicit handling of multicast registers Matt Roper
2022-03-30 23:28 ` [PATCH 01/15] drm/i915/gen8: Create separate reg definitions for new MCR registers Matt Roper
2022-03-30 23:28 ` [PATCH 02/15] drm/i915/xehp: " Matt Roper
2022-03-30 23:28 ` [PATCH 03/15] drm/i915/gt: Drop a few unused register definitions Matt Roper
2022-03-30 23:28 ` [PATCH 04/15] drm/i915/gt: Correct prefix on a few registers Matt Roper
2022-03-30 23:28 ` [PATCH 05/15] drm/i915/xehp: Check for faults on all mslices Matt Roper
2022-03-30 23:28 ` [PATCH 06/15] drm/i915: Drop duplicated definition of XEHPSDV_FLAT_CCS_BASE_ADDR Matt Roper
2022-03-30 23:28 ` [PATCH 07/15] drm/i915: Move XEHPSDV_TILE0_ADDR_RANGE to GT register header Matt Roper
2022-03-30 23:28 ` [PATCH 08/15] drm/i915: Define MCR registers explicitly Matt Roper
2022-03-30 23:28 ` [PATCH 09/15] drm/i915/gt: Move multicast register handling to a dedicated file Matt Roper
2022-03-30 23:28 ` [PATCH 10/15] drm/i915/gt: Cleanup interface for MCR operations Matt Roper
2022-03-30 23:28 ` [PATCH 11/15] drm/i915/gt: Always use MCR functions on multicast registers Matt Roper
2022-03-30 23:28 ` [PATCH 12/15] drm/i915/guc: Handle save/restore of MCR registers explicitly Matt Roper
2022-03-30 23:28 ` [PATCH 13/15] drm/i915/gt: Add MCR-specific workaround initializers Matt Roper
2022-03-30 23:28 ` [PATCH 14/15] drm/i915: Define multicast registers as a new type Matt Roper
2022-04-01  7:55   ` [Intel-gfx] " Tvrtko Ursulin
2022-04-04 21:12     ` Matt Roper
2022-03-30 23:28 ` [PATCH 15/15] drm/i915/xehp: Eliminate shared/implicit steering Matt Roper
2022-03-31 17:35   ` [Intel-gfx] " Tvrtko Ursulin
2022-04-04 21:35     ` Matt Roper
2022-04-07 12:25       ` Tvrtko Ursulin
2022-04-01  8:34   ` Tvrtko Ursulin
2022-04-04 21:42     ` Matt Roper
2022-04-07 12:30       ` Tvrtko Ursulin
