All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/11] Refactor HW workaround code
@ 2017-10-11 18:15 Oscar Mateo
  2017-10-11 18:15 ` [PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space Oscar Mateo
                   ` (11 more replies)
  0 siblings, 12 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

I didn't receive any major opposition to the RFC, so I am sending the patches
again with some review comments from Chris, a typo fix and some aesthetic
improvements.

Currently, deciding how/where to apply new workarounds is challenging. Often,
workarounds end up applied incorrectly and get lost under certain circumstances
(e.g. a context switch or a GPU reset). This is a proposal to attempt to
eliminate some of this pain, by clarifying the current classification of
workarounds (context saved/restored, global registers, whitelisting, BB),
putting them together on the same file, and improving the existing validation
infrastructure (debugfs/i-g-t).

Oscar Mateo (11):
  drm/i915: No need for RING_MAX_NONPRIV_SLOTS space
  drm/i915: Move a bunch of workaround-related code to its own file
  drm/i915: Split out functions for different kinds of workarounds
  drm/i915: Move workarounds from init_clock_gating
  drm/i915: Rename saved workarounds to make it explicit that they are
    context WAs
  drm/i915: Save all MMIO WAs and apply them at a later time
  drm/i915: Save all Whitelist WAs and apply them at a later time
  drm/i915: Print all workaround types correctly in debugfs
  drm/i915: Move WA BB stuff to the workarounds file as well
  drm/i915: Document the i915_workarounds file
  drm/i915: Remove Gen9 WAs with no effect

 drivers/gpu/drm/i915/Makefile            |    3 +-
 drivers/gpu/drm/i915/i915_debugfs.c      |   52 +-
 drivers/gpu/drm/i915/i915_drv.c          |    5 +
 drivers/gpu/drm/i915/i915_drv.h          |   20 +-
 drivers/gpu/drm/i915/i915_gem.c          |    3 +
 drivers/gpu/drm/i915/i915_gem_context.c  |    5 +
 drivers/gpu/drm/i915/i915_reg.h          |    4 +-
 drivers/gpu/drm/i915/intel_engine_cs.c   |  679 ---------------
 drivers/gpu/drm/i915/intel_lrc.c         |  266 +-----
 drivers/gpu/drm/i915/intel_pm.c          |  243 +-----
 drivers/gpu/drm/i915/intel_ringbuffer.c  |    5 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |    3 -
 drivers/gpu/drm/i915/intel_workarounds.c | 1364 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_workarounds.h |   40 +
 14 files changed, 1489 insertions(+), 1203 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_workarounds.c
 create mode 100644 drivers/gpu/drm/i915/intel_workarounds.h

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 18:15 ` [PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

Now that we write RING_FORCE_TO_NONPRIV registers directly to hardware,
[commit 32ced39 ("drm/i915: Transform whitelisting WAs into a simple reg
write")] there is no need to save space for them in the list of context
workarounds.

v2: Refer to previous commit in commit message (Michel)

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6bbc4b8..a5f3999 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1960,13 +1960,7 @@ struct i915_wa_reg {
 	u32 mask;
 };
 
-/*
- * RING_MAX_NONPRIV_SLOTS is per-engine but at this point we are only
- * allowing it for RCS as we don't foresee any requirement of having
- * a whitelist for other engines. When it is really required for
- * other engines then the limit need to be increased.
- */
-#define I915_MAX_WA_REGS (16 + RING_MAX_NONPRIV_SLOTS)
+#define I915_MAX_WA_REGS 16
 
 struct i915_workarounds {
 	struct i915_wa_reg reg[I915_MAX_WA_REGS];
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
  2017-10-11 18:15 ` [PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 18:15 ` [PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds Oscar Mateo
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

This has grown to be a sizable amount of code, so move it to
its own file before we try to refactor anything. For the moment,
we are leaving behind the WA BB code and the WAs that get applied
(incorrectly) in init_clock_gating, but we will deal with it later.

v2: Use intel_ prefix for code that deals with the hardware (Chris)

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile            |   3 +-
 drivers/gpu/drm/i915/intel_engine_cs.c   | 679 -----------------------------
 drivers/gpu/drm/i915/intel_lrc.c         |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   3 -
 drivers/gpu/drm/i915/intel_workarounds.c | 705 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_workarounds.h |  31 ++
 7 files changed, 740 insertions(+), 683 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_workarounds.c
 create mode 100644 drivers/gpu/drm/i915/intel_workarounds.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 66d23b6..3162a14 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -22,7 +22,8 @@ i915-y := i915_drv.o \
 	  intel_csr.o \
 	  intel_device_info.o \
 	  intel_pm.o \
-	  intel_runtime_pm.o
+	  intel_runtime_pm.o \
+	  intel_workarounds.o
 
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index a59b2a3..0b806f2 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -817,685 +817,6 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
 	}
 }
 
-static int wa_add(struct drm_i915_private *dev_priv,
-		  i915_reg_t addr,
-		  const u32 mask, const u32 val)
-{
-	const u32 idx = dev_priv->workarounds.count;
-
-	if (WARN_ON(idx >= I915_MAX_WA_REGS))
-		return -ENOSPC;
-
-	dev_priv->workarounds.reg[idx].addr = addr;
-	dev_priv->workarounds.reg[idx].value = val;
-	dev_priv->workarounds.reg[idx].mask = mask;
-
-	dev_priv->workarounds.count++;
-
-	return 0;
-}
-
-#define WA_REG(addr, mask, val) do { \
-		const int r = wa_add(dev_priv, (addr), (mask), (val)); \
-		if (r) \
-			return r; \
-	} while (0)
-
-#define WA_SET_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
-
-#define WA_CLR_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
-
-#define WA_SET_FIELD_MASKED(addr, mask, value) \
-	WA_REG(addr, mask, _MASKED_FIELD(mask, value))
-
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
-				 i915_reg_t reg)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct i915_workarounds *wa = &dev_priv->workarounds;
-	const uint32_t index = wa->hw_whitelist_count[engine->id];
-
-	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
-		return -EINVAL;
-
-	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
-		   i915_mmio_reg_offset(reg));
-	wa->hw_whitelist_count[engine->id]++;
-
-	return 0;
-}
-
-static int gen8_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
-
-	/* WaDisableAsyncFlipPerfMode:bdw,chv */
-	WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
-
-	/* WaDisablePartialInstShootdown:bdw,chv */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
-
-	/* Use Force Non-Coherent whenever executing a 3D context. This is a
-	 * workaround for for a possible hang in the unlikely event a TLB
-	 * invalidation occurs during a PSD flush.
-	 */
-	/* WaForceEnableNonCoherent:bdw,chv */
-	/* WaHdcDisableFetchWhenMasked:bdw,chv */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_DONOT_FETCH_MEM_WHEN_MASKED |
-			  HDC_FORCE_NON_COHERENT);
-
-	/* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
-	 * "The Hierarchical Z RAW Stall Optimization allows non-overlapping
-	 *  polygons in the same 8x4 pixel/sample area to be processed without
-	 *  stalling waiting for the earlier ones to write to Hierarchical Z
-	 *  buffer."
-	 *
-	 * This optimization is off by default for BDW and CHV; turn it on.
-	 */
-	WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
-
-	/* Wa4x4STCOptimizationDisable:bdw,chv */
-	WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
-
-	/*
-	 * BSpec recommends 8x4 when MSAA is used,
-	 * however in practice 16x4 seems fastest.
-	 *
-	 * Note that PS/WM thread counts depend on the WIZ hashing
-	 * disable bit, which we don't touch here, but it's good
-	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
-	 */
-	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
-			    GEN6_WIZ_HASHING_MASK,
-			    GEN6_WIZ_HASHING_16x4);
-
-	return 0;
-}
-
-static int bdw_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen8_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
-
-	/* WaDisableDopClockGating:bdw
-	 *
-	 * Also see the related UCGTCL1 write in broadwell_init_clock_gating()
-	 * to disable EUTC clock gating.
-	 */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
-			  DOP_CLOCK_GATING_DISABLE);
-
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-			  GEN8_SAMPLER_POWER_BYPASS_DIS);
-
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  /* WaForceContextSaveRestoreNonCoherent:bdw */
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
-			  /* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
-			  (IS_BDW_GT3(dev_priv) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
-
-	return 0;
-}
-
-static int chv_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen8_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaDisableThreadStallDopClockGating:chv */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
-
-	/* Improve HiZ throughput on CHV. */
-	WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
-
-	return 0;
-}
-
-static int gen9_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	/* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
-
-	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
-		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
-
-	/* WaDisableKillLogic:bxt,skl,kbl */
-	if (!IS_COFFEELAKE(dev_priv))
-		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-			   ECOCHK_DIS_TLB);
-
-	if (HAS_LLC(dev_priv)) {
-		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
-		 *
-		 * Must match Display Engine. See
-		 * WaCompressedResourceDisplayNewHashMode.
-		 */
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN9_PBE_COMPRESSED_HASH_SELECTION);
-		WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
-				  GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
-
-		I915_WRITE(MMCD_MISC_CTRL,
-			   I915_READ(MMCD_MISC_CTRL) |
-			   MMCD_PCLA |
-			   MMCD_HOTSPOT_EN);
-	}
-
-	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
-	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  FLOW_CONTROL_ENABLE |
-			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
-
-	/* Syncing dependencies between camera and graphics:skl,bxt,kbl */
-	if (!IS_COFFEELAKE(dev_priv))
-		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-				  GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
-
-	/* WaDisableDgMirrorFixInHalfSliceChicken5:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
-				  GEN9_DG_MIRROR_FIX_ENABLE);
-
-	/* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
-				  GEN9_RHWO_OPTIMIZATION_DISABLE);
-		/*
-		 * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
-		 * but we do that in per ctx batchbuffer as there is an issue
-		 * with this register not getting restored on ctx restore
-		 */
-	}
-
-	/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
-	/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
-			  GEN9_ENABLE_YV12_BUGFIX |
-			  GEN9_ENABLE_GPGPU_PREEMPTION);
-
-	/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
-	/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
-					 GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE));
-
-	/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
-	WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
-			  GEN9_CCS_TLB_PREFETCH_ENABLE);
-
-	/* WaDisableMaskBasedCammingInRCC:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
-				  PIXEL_MASK_CAMMING_DISABLE);
-
-	/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
-			  HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
-
-	/* WaForceEnableNonCoherent and WaDisableHDCInvalidation are
-	 * both tied to WaForceContextSaveRestoreNonCoherent
-	 * in some hsds for skl. We keep the tie for all gen9. The
-	 * documentation is a bit hazy and so we want to get common behaviour,
-	 * even though there is no clear evidence we would need both on kbl/bxt.
-	 * This area has been source of system hangs so we play it safe
-	 * and mimic the skl regardless of what bspec says.
-	 *
-	 * Use Force Non-Coherent whenever executing a 3D context. This
-	 * is a workaround for a possible hang in the unlikely event
-	 * a TLB invalidation occurs during a PSD flush.
-	 */
-
-	/* WaForceEnableNonCoherent:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_FORCE_NON_COHERENT);
-
-	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-		   BDW_DISABLE_HDC_INVALIDATION);
-
-	/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
-	if (IS_SKYLAKE(dev_priv) ||
-	    IS_KABYLAKE(dev_priv) ||
-	    IS_COFFEELAKE(dev_priv) ||
-	    IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0))
-		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-				  GEN8_SAMPLER_POWER_BYPASS_DIS);
-
-	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
-
-	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
-	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
-				    GEN8_LQSC_FLUSH_COHERENT_LINES));
-
-	/*
-	 * Supporting preemption with fine-granularity requires changes in the
-	 * batch buffer programming. Since we can't break old userspace, we
-	 * need to set our default preemption level to safe value. Userspace is
-	 * still able to use more fine-grained preemption levels, since in
-	 * WaEnablePreemptionGranularityControlByUMD we're whitelisting the
-	 * per-ctx register. As such, WaDisable{3D,GPGPU}MidCmdPreemption are
-	 * not real HW workarounds, but merely a way to start using preemption
-	 * while maintaining old contract with userspace.
-	 */
-
-	/* WaDisable3DMidCmdPreemption:skl,bxt,glk,cfl,[cnl] */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
-	/* WaDisableGPGPUMidCmdPreemption:skl,bxt,blk,cfl,[cnl] */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
-	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
-	if (ret)
-		return ret;
-
-	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
-
-	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	u8 vals[3] = { 0, 0, 0 };
-	unsigned int i;
-
-	for (i = 0; i < 3; i++) {
-		u8 ss;
-
-		/*
-		 * Only consider slices where one, and only one, subslice has 7
-		 * EUs
-		 */
-		if (!is_power_of_2(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]))
-			continue;
-
-		/*
-		 * subslice_7eu[i] != 0 (because of the check above) and
-		 * ss_max == 4 (maximum number of subslices possible per slice)
-		 *
-		 * ->    0 <= ss <= 3;
-		 */
-		ss = ffs(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]) - 1;
-		vals[i] = 3 - ss;
-	}
-
-	if (vals[0] == 0 && vals[1] == 0 && vals[2] == 0)
-		return 0;
-
-	/* Tune IZ hashing. See intel_device_info_runtime_init() */
-	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
-			    GEN9_IZ_HASHING_MASK(2) |
-			    GEN9_IZ_HASHING_MASK(1) |
-			    GEN9_IZ_HASHING_MASK(0),
-			    GEN9_IZ_HASHING(2, vals[2]) |
-			    GEN9_IZ_HASHING(1, vals[1]) |
-			    GEN9_IZ_HASHING(0, vals[0]));
-
-	return 0;
-}
-
-static int skl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaEnableGapsTsvCreditFix:skl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaDisableGafsUnitClkGating:skl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
-	/* WaInPlaceDecompressionHang:skl */
-	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaDisableLSQCROPERFforOCL:skl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
-
-	return skl_tune_iz_hashing(engine);
-}
-
-static int bxt_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaStoreMultiplePTEenable:bxt */
-	/* This is a requirement according to Hardware specification */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
-
-	/* WaSetClckGatingDisableMedia:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
-					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
-	}
-
-	/* WaDisableThreadStallDopClockGating:bxt */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  STALL_DOP_GATING_DISABLE);
-
-	/* WaDisablePooledEuLoadBalancingFix:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
-		I915_WRITE(FF_SLICE_CS_CHICKEN2,
-			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
-	}
-
-	/* WaDisableSbeCacheDispatchPortSharing:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
-		WA_SET_BIT_MASKED(
-			GEN7_HALF_SLICE_CHICKEN1,
-			GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-	}
-
-	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
-	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
-	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
-	/* WaDisableLSQCROPERFforOCL:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
-		if (ret)
-			return ret;
-
-		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-		if (ret)
-			return ret;
-	}
-
-	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
-		I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
-					   L3_HIGH_PRIO_CREDITS(2));
-
-	/* WaToEnableHwFixForPushConstHWBug:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaInPlaceDecompressionHang:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	return 0;
-}
-
-static int cnl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
-
-	/* WaForceContextSaveRestoreNonCoherent:cnl */
-	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
-	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
-
-	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
-
-	/* WaInPlaceDecompressionHang:cnl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaPushConstantDereferenceHoldDisable:cnl */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
-
-	/* FtrEnableFastAnisoL1BankingFix: cnl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
-
-	/* WaDisable3DMidCmdPreemption:cnl */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
-	/* WaDisableGPGPUMidCmdPreemption:cnl */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
-	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
-	ret= wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int kbl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaEnableGapsTsvCreditFix:kbl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaDisableDynamicCreditSharing:kbl */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
-
-	/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
-	if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
-		WA_SET_BIT_MASKED(HDC_CHICKEN0,
-				  HDC_FENCE_DEST_SLM_DISABLE);
-
-	/* WaToEnableHwFixForPushConstHWBug:kbl */
-	if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableGafsUnitClkGating:kbl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
-	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	WA_SET_BIT_MASKED(
-		GEN7_HALF_SLICE_CHICKEN1,
-		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-
-	/* WaInPlaceDecompressionHang:kbl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaDisableLSQCROPERFforOCL:kbl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int glk_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaToEnableHwFixForPushConstHWBug:glk */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	return 0;
-}
-
-static int cfl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaEnableGapsTsvCreditFix:cfl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaToEnableHwFixForPushConstHWBug:cfl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableGafsUnitClkGating:cfl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
-	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	WA_SET_BIT_MASKED(
-		GEN7_HALF_SLICE_CHICKEN1,
-		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-
-	/* WaInPlaceDecompressionHang:cfl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	return 0;
-}
-
-int init_workarounds_ring(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int err;
-
-	WARN_ON(engine->id != RCS);
-
-	dev_priv->workarounds.count = 0;
-	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
-
-	if (IS_BROADWELL(dev_priv))
-		err = bdw_init_workarounds(engine);
-	else if (IS_CHERRYVIEW(dev_priv))
-		err = chv_init_workarounds(engine);
-	else if (IS_SKYLAKE(dev_priv))
-		err =  skl_init_workarounds(engine);
-	else if (IS_BROXTON(dev_priv))
-		err = bxt_init_workarounds(engine);
-	else if (IS_KABYLAKE(dev_priv))
-		err = kbl_init_workarounds(engine);
-	else if (IS_GEMINILAKE(dev_priv))
-		err =  glk_init_workarounds(engine);
-	else if (IS_COFFEELAKE(dev_priv))
-		err = cfl_init_workarounds(engine);
-	else if (IS_CANNONLAKE(dev_priv))
-		err = cnl_init_workarounds(engine);
-	else
-		err = 0;
-	if (err)
-		return err;
-
-	DRM_DEBUG_DRIVER("%s: Number of context specific w/a: %d\n",
-			 engine->name, dev_priv->workarounds.count);
-	return 0;
-}
-
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
-{
-	struct i915_workarounds *w = &req->i915->workarounds;
-	u32 *cs;
-	int ret, i;
-
-	if (w->count == 0)
-		return 0;
-
-	ret = req->engine->emit_flush(req, EMIT_BARRIER);
-	if (ret)
-		return ret;
-
-	cs = intel_ring_begin(req, (w->count * 2 + 2));
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	*cs++ = MI_LOAD_REGISTER_IMM(w->count);
-	for (i = 0; i < w->count; i++) {
-		*cs++ = i915_mmio_reg_offset(w->reg[i].addr);
-		*cs++ = w->reg[i].value;
-	}
-	*cs++ = MI_NOOP;
-
-	intel_ring_advance(req, cs);
-
-	ret = req->engine->emit_flush(req, EMIT_BARRIER);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
 static bool ring_is_idle(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index fbfcf88..495ade6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -137,6 +137,7 @@
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "intel_mocs.h"
+#include "intel_workarounds.h"
 
 #define RING_EXECLIST_QFULL		(1 << 0x2)
 #define RING_EXECLIST1_VALID		(1 << 0x3)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4285f09..7e5d7d3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -33,6 +33,7 @@
 #include <drm/i915_drm.h>
 #include "i915_trace.h"
 #include "intel_drv.h"
+#include "intel_workarounds.h"
 
 /* Rough estimate of the typical request size, performing a flush,
  * set-context and then emitting the batch.
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 17186f0..bad1c19 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -725,9 +725,6 @@ static inline u32 intel_engine_last_submit(struct intel_engine_cs *engine)
 	return READ_ONCE(engine->timeline->seqno);
 }
 
-int init_workarounds_ring(struct intel_engine_cs *engine);
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req);
-
 void intel_engine_get_instdone(struct intel_engine_cs *engine,
 			       struct intel_instdone *instdone);
 
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
new file mode 100644
index 0000000..a4ea80f
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -0,0 +1,705 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+#include "intel_workarounds.h"
+
+static int wa_add(struct drm_i915_private *dev_priv,
+		  i915_reg_t addr,
+		  const u32 mask, const u32 val)
+{
+	const u32 idx = dev_priv->workarounds.count;
+
+	if (WARN_ON(idx >= I915_MAX_WA_REGS))
+		return -ENOSPC;
+
+	dev_priv->workarounds.reg[idx].addr = addr;
+	dev_priv->workarounds.reg[idx].value = val;
+	dev_priv->workarounds.reg[idx].mask = mask;
+
+	dev_priv->workarounds.count++;
+
+	return 0;
+}
+
+#define WA_REG(addr, mask, val) do { \
+		const int r = wa_add(dev_priv, (addr), (mask), (val)); \
+		if (r) \
+			return r; \
+	} while (0)
+
+#define WA_SET_BIT_MASKED(addr, mask) \
+	WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+
+#define WA_CLR_BIT_MASKED(addr, mask) \
+	WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+
+#define WA_SET_FIELD_MASKED(addr, mask, value) \
+	WA_REG(addr, mask, _MASKED_FIELD(mask, value))
+
+static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
+				 i915_reg_t reg)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	struct i915_workarounds *wa = &dev_priv->workarounds;
+	const uint32_t index = wa->hw_whitelist_count[engine->id];
+
+	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
+		return -EINVAL;
+
+	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
+		   i915_mmio_reg_offset(reg));
+	wa->hw_whitelist_count[engine->id]++;
+
+	return 0;
+}
+
+static int gen8_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+
+	WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
+
+	/* WaDisableAsyncFlipPerfMode:bdw,chv */
+	WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
+
+	/* WaDisablePartialInstShootdown:bdw,chv */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+
+	/* Use Force Non-Coherent whenever executing a 3D context. This is a
+	 * workaround for for a possible hang in the unlikely event a TLB
+	 * invalidation occurs during a PSD flush.
+	 */
+	/* WaForceEnableNonCoherent:bdw,chv */
+	/* WaHdcDisableFetchWhenMasked:bdw,chv */
+	WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			  HDC_DONOT_FETCH_MEM_WHEN_MASKED |
+			  HDC_FORCE_NON_COHERENT);
+
+	/* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
+	 * "The Hierarchical Z RAW Stall Optimization allows non-overlapping
+	 *  polygons in the same 8x4 pixel/sample area to be processed without
+	 *  stalling waiting for the earlier ones to write to Hierarchical Z
+	 *  buffer."
+	 *
+	 * This optimization is off by default for BDW and CHV; turn it on.
+	 */
+	WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
+
+	/* Wa4x4STCOptimizationDisable:bdw,chv */
+	WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
+
+	/*
+	 * BSpec recommends 8x4 when MSAA is used,
+	 * however in practice 16x4 seems fastest.
+	 *
+	 * Note that PS/WM thread counts depend on the WIZ hashing
+	 * disable bit, which we don't touch here, but it's good
+	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
+	 */
+	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+			    GEN6_WIZ_HASHING_MASK,
+			    GEN6_WIZ_HASHING_16x4);
+
+	return 0;
+}
+
+static int bdw_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen8_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+
+	/* WaDisableDopClockGating:bdw
+	 *
+	 * Also see the related UCGTCL1 write in broadwell_init_clock_gating()
+	 * to disable EUTC clock gating.
+	 */
+	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
+			  DOP_CLOCK_GATING_DISABLE);
+
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+			  GEN8_SAMPLER_POWER_BYPASS_DIS);
+
+	WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			  /* WaForceContextSaveRestoreNonCoherent:bdw */
+			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
+			  /* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
+			  (IS_BDW_GT3(dev_priv) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
+
+	return 0;
+}
+
+static int chv_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen8_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableThreadStallDopClockGating:chv */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+
+	/* Improve HiZ throughput on CHV. */
+	WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
+
+	return 0;
+}
+
+static int gen9_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	/* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+
+	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
+		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+
+	/* WaDisableKillLogic:bxt,skl,kbl */
+	if (!IS_COFFEELAKE(dev_priv))
+		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+			   ECOCHK_DIS_TLB);
+
+	if (HAS_LLC(dev_priv)) {
+		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
+		 *
+		 * Must match Display Engine. See
+		 * WaCompressedResourceDisplayNewHashMode.
+		 */
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN9_PBE_COMPRESSED_HASH_SELECTION);
+		WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+				  GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
+
+		I915_WRITE(MMCD_MISC_CTRL,
+			   I915_READ(MMCD_MISC_CTRL) |
+			   MMCD_PCLA |
+			   MMCD_HOTSPOT_EN);
+	}
+
+	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
+	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+			  FLOW_CONTROL_ENABLE |
+			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+
+	/* Syncing dependencies between camera and graphics:skl,bxt,kbl */
+	if (!IS_COFFEELAKE(dev_priv))
+		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+				  GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
+
+	/* WaDisableDgMirrorFixInHalfSliceChicken5:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+		WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+				  GEN9_DG_MIRROR_FIX_ENABLE);
+
+	/* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
+				  GEN9_RHWO_OPTIMIZATION_DISABLE);
+		/*
+		 * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
+		 * but we do that in per ctx batchbuffer as there is an issue
+		 * with this register not getting restored on ctx restore
+		 */
+	}
+
+	/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
+	/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
+	WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+			  GEN9_ENABLE_YV12_BUGFIX |
+			  GEN9_ENABLE_GPGPU_PREEMPTION);
+
+	/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
+	/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
+	WA_SET_BIT_MASKED(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
+					 GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE));
+
+	/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
+	WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+			  GEN9_CCS_TLB_PREFETCH_ENABLE);
+
+	/* WaDisableMaskBasedCammingInRCC:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+		WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
+				  PIXEL_MASK_CAMMING_DISABLE);
+
+	/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
+	WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
+			  HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
+
+	/* WaForceEnableNonCoherent and WaDisableHDCInvalidation are
+	 * both tied to WaForceContextSaveRestoreNonCoherent
+	 * in some hsds for skl. We keep the tie for all gen9. The
+	 * documentation is a bit hazy and so we want to get common behaviour,
+	 * even though there is no clear evidence we would need both on kbl/bxt.
+	 * This area has been source of system hangs so we play it safe
+	 * and mimic the skl regardless of what bspec says.
+	 *
+	 * Use Force Non-Coherent whenever executing a 3D context. This
+	 * is a workaround for a possible hang in the unlikely event
+	 * a TLB invalidation occurs during a PSD flush.
+	 */
+
+	/* WaForceEnableNonCoherent:skl,bxt,kbl,cfl */
+	WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			  HDC_FORCE_NON_COHERENT);
+
+	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
+	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+		   BDW_DISABLE_HDC_INVALIDATION);
+
+	/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
+	if (IS_SKYLAKE(dev_priv) ||
+	    IS_KABYLAKE(dev_priv) ||
+	    IS_COFFEELAKE(dev_priv) ||
+	    IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0))
+		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+				  GEN8_SAMPLER_POWER_BYPASS_DIS);
+
+	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
+
+	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
+	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
+				    GEN8_LQSC_FLUSH_COHERENT_LINES));
+
+	/*
+	 * Supporting preemption with fine-granularity requires changes in the
+	 * batch buffer programming. Since we can't break old userspace, we
+	 * need to set our default preemption level to safe value. Userspace is
+	 * still able to use more fine-grained preemption levels, since in
+	 * WaEnablePreemptionGranularityControlByUMD we're whitelisting the
+	 * per-ctx register. As such, WaDisable{3D,GPGPU}MidCmdPreemption are
+	 * not real HW workarounds, but merely a way to start using preemption
+	 * while maintaining old contract with userspace.
+	 */
+
+	/* WaDisable3DMidCmdPreemption:skl,bxt,glk,cfl,[cnl] */
+	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+
+	/* WaDisableGPGPUMidCmdPreemption:skl,bxt,blk,cfl,[cnl] */
+	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+
+	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
+	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
+	if (ret)
+		return ret;
+
+	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
+	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+	if (ret)
+		return ret;
+
+	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	u8 vals[3] = { 0, 0, 0 };
+	unsigned int i;
+
+	for (i = 0; i < 3; i++) {
+		u8 ss;
+
+		/*
+		 * Only consider slices where one, and only one, subslice has 7
+		 * EUs
+		 */
+		if (!is_power_of_2(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]))
+			continue;
+
+		/*
+		 * subslice_7eu[i] != 0 (because of the check above) and
+		 * ss_max == 4 (maximum number of subslices possible per slice)
+		 *
+		 * ->    0 <= ss <= 3;
+		 */
+		ss = ffs(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]) - 1;
+		vals[i] = 3 - ss;
+	}
+
+	if (vals[0] == 0 && vals[1] == 0 && vals[2] == 0)
+		return 0;
+
+	/* Tune IZ hashing. See intel_device_info_runtime_init() */
+	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+			    GEN9_IZ_HASHING_MASK(2) |
+			    GEN9_IZ_HASHING_MASK(1) |
+			    GEN9_IZ_HASHING_MASK(0),
+			    GEN9_IZ_HASHING(2, vals[2]) |
+			    GEN9_IZ_HASHING(1, vals[1]) |
+			    GEN9_IZ_HASHING(0, vals[0]));
+
+	return 0;
+}
+
+static int skl_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaEnableGapsTsvCreditFix:skl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableGafsUnitClkGating:skl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaInPlaceDecompressionHang:skl */
+	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
+		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	/* WaDisableLSQCROPERFforOCL:skl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	if (ret)
+		return ret;
+
+	return skl_tune_iz_hashing(engine);
+}
+
+static int bxt_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaStoreMultiplePTEenable:bxt */
+	/* This is a requirement according to Hardware specification */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+
+	/* WaSetClckGatingDisableMedia:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
+					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+	}
+
+	/* WaDisableThreadStallDopClockGating:bxt */
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+			  STALL_DOP_GATING_DISABLE);
+
+	/* WaDisablePooledEuLoadBalancingFix:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
+		I915_WRITE(FF_SLICE_CS_CHICKEN2,
+			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
+	}
+
+	/* WaDisableSbeCacheDispatchPortSharing:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
+		WA_SET_BIT_MASKED(
+			GEN7_HALF_SLICE_CHICKEN1,
+			GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+	}
+
+	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
+	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
+	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
+	/* WaDisableLSQCROPERFforOCL:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
+		if (ret)
+			return ret;
+
+		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+		if (ret)
+			return ret;
+	}
+
+	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
+		I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
+					   L3_HIGH_PRIO_CREDITS(2));
+
+	/* WaToEnableHwFixForPushConstHWBug:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	/* WaInPlaceDecompressionHang:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
+		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	return 0;
+}
+
+static int cnl_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
+	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+		I915_WRITE(GAMT_CHKN_BIT_REG,
+			   (I915_READ(GAMT_CHKN_BIT_REG) |
+			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
+
+	/* WaForceContextSaveRestoreNonCoherent:cnl */
+	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
+			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
+
+	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
+	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
+
+	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
+	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
+	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
+
+	/* WaInPlaceDecompressionHang:cnl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	/* WaPushConstantDereferenceHoldDisable:cnl */
+	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
+
+	/* FtrEnableFastAnisoL1BankingFix: cnl */
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
+
+	/* WaDisable3DMidCmdPreemption:cnl */
+	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+
+	/* WaDisableGPGPUMidCmdPreemption:cnl */
+	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+
+	/* WaEnablePreemptionGranularityControlByUMD:cnl */
+	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+	ret= wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int kbl_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaEnableGapsTsvCreditFix:kbl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableDynamicCreditSharing:kbl */
+	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+		I915_WRITE(GAMT_CHKN_BIT_REG,
+			   (I915_READ(GAMT_CHKN_BIT_REG) |
+			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
+
+	/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
+	if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
+		WA_SET_BIT_MASKED(HDC_CHICKEN0,
+				  HDC_FENCE_DEST_SLM_DISABLE);
+
+	/* WaToEnableHwFixForPushConstHWBug:kbl */
+	if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	/* WaDisableGafsUnitClkGating:kbl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaDisableSbeCacheDispatchPortSharing:kbl */
+	WA_SET_BIT_MASKED(
+		GEN7_HALF_SLICE_CHICKEN1,
+		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+
+	/* WaInPlaceDecompressionHang:kbl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	/* WaDisableLSQCROPERFforOCL:kbl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int glk_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaToEnableHwFixForPushConstHWBug:glk */
+	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	return 0;
+}
+
+static int cfl_init_workarounds(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	ret = gen9_init_workarounds(engine);
+	if (ret)
+		return ret;
+
+	/* WaEnableGapsTsvCreditFix:cfl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaToEnableHwFixForPushConstHWBug:cfl */
+	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+	/* WaDisableGafsUnitClkGating:cfl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaDisableSbeCacheDispatchPortSharing:cfl */
+	WA_SET_BIT_MASKED(
+		GEN7_HALF_SLICE_CHICKEN1,
+		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+
+	/* WaInPlaceDecompressionHang:cfl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	return 0;
+}
+
+int init_workarounds_ring(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int err;
+
+	WARN_ON(engine->id != RCS);
+
+	dev_priv->workarounds.count = 0;
+	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+
+	if (IS_BROADWELL(dev_priv))
+		err = bdw_init_workarounds(engine);
+	else if (IS_CHERRYVIEW(dev_priv))
+		err = chv_init_workarounds(engine);
+	else if (IS_SKYLAKE(dev_priv))
+		err =  skl_init_workarounds(engine);
+	else if (IS_BROXTON(dev_priv))
+		err = bxt_init_workarounds(engine);
+	else if (IS_KABYLAKE(dev_priv))
+		err = kbl_init_workarounds(engine);
+	else if (IS_GEMINILAKE(dev_priv))
+		err =  glk_init_workarounds(engine);
+	else if (IS_COFFEELAKE(dev_priv))
+		err = cfl_init_workarounds(engine);
+	else if (IS_CANNONLAKE(dev_priv))
+		err = cnl_init_workarounds(engine);
+	else
+		err = 0;
+	if (err)
+		return err;
+
+	DRM_DEBUG_DRIVER("%s: Number of context specific w/a: %d\n",
+			 engine->name, dev_priv->workarounds.count);
+	return 0;
+}
+
+int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
+{
+	struct i915_workarounds *w = &req->i915->workarounds;
+	u32 *cs;
+	int ret, i;
+
+	if (w->count == 0)
+		return 0;
+
+	ret = req->engine->emit_flush(req, EMIT_BARRIER);
+	if (ret)
+		return ret;
+
+	cs = intel_ring_begin(req, (w->count * 2 + 2));
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_LOAD_REGISTER_IMM(w->count);
+	for (i = 0; i < w->count; i++) {
+		*cs++ = i915_mmio_reg_offset(w->reg[i].addr);
+		*cs++ = w->reg[i].value;
+	}
+	*cs++ = MI_NOOP;
+
+	intel_ring_advance(req, cs);
+
+	ret = req->engine->emit_flush(req, EMIT_BARRIER);
+	if (ret)
+		return ret;
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
new file mode 100644
index 0000000..27099fc
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _I915_WORKAROUNDS_H_
+#define _I915_WORKAROUNDS_H_
+
+int init_workarounds_ring(struct intel_engine_cs *engine);
+int intel_ring_workarounds_emit(struct drm_i915_gem_request *req);
+
+#endif
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
  2017-10-11 18:15 ` [PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space Oscar Mateo
  2017-10-11 18:15 ` [PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 18:25   ` Chris Wilson
  2017-10-11 18:15 ` [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating Oscar Mateo
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

There are different kind of workarounds (those that modify registers that
live in the context image, those that modify global registers, those that
whitelist registers, etc...) and they have different requirements in terms
of where they are applied and how. Also, by splitting them apart, it should
be easier to decide where a new workaround should go.

v2:
  - Add multiple MISSING_CASE
  - Rebased

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c          |   3 +
 drivers/gpu/drm/i915/i915_gem_context.c  |   5 +
 drivers/gpu/drm/i915/intel_lrc.c         |  10 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   4 +-
 drivers/gpu/drm/i915/intel_workarounds.c | 698 +++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_workarounds.h |   8 +-
 6 files changed, 444 insertions(+), 284 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f76890b..119c456 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -35,6 +35,7 @@
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
 #include "intel_mocs.h"
+#include "intel_workarounds.h"
 #include "i915_gemfs.h"
 #include <linux/dma-fence-array.h>
 #include <linux/kthread.h>
@@ -4804,6 +4805,8 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 		}
 	}
 
+	intel_mmio_workarounds_apply(dev_priv);
+
 	i915_gem_init_swizzling(dev_priv);
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5bf96a2..cf104cf 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -90,6 +90,7 @@
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "i915_trace.h"
+#include "intel_workarounds.h"
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
@@ -454,6 +455,10 @@ int i915_gem_contexts_init(struct drm_i915_private *dev_priv)
 
 	GEM_BUG_ON(dev_priv->kernel_context);
 
+	err = intel_ctx_workarounds_init(dev_priv);
+	if (err)
+		goto err;
+
 	INIT_LIST_HEAD(&dev_priv->contexts.list);
 	INIT_WORK(&dev_priv->contexts.free_work, contexts_free_worker);
 	init_llist_head(&dev_priv->contexts.free_list);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 495ade6..e632a29 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1488,7 +1488,7 @@ static int gen8_init_render_ring(struct intel_engine_cs *engine)
 
 	I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
 
-	return init_workarounds_ring(engine);
+	return 0;
 }
 
 static int gen9_init_render_ring(struct intel_engine_cs *engine)
@@ -1499,7 +1499,11 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
-	return init_workarounds_ring(engine);
+	ret = intel_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	return 0;
 }
 
 static void reset_common_ring(struct intel_engine_cs *engine,
@@ -1827,7 +1831,7 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
 {
 	int ret;
 
-	ret = intel_ring_workarounds_emit(req);
+	ret = intel_ctx_workarounds_emit(req);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 7e5d7d3..c3f8961 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -647,7 +647,7 @@ static int intel_rcs_ctx_init(struct drm_i915_gem_request *req)
 {
 	int ret;
 
-	ret = intel_ring_workarounds_emit(req);
+	ret = intel_ctx_workarounds_emit(req);
 	if (ret != 0)
 		return ret;
 
@@ -706,7 +706,7 @@ static int init_render_ring(struct intel_engine_cs *engine)
 	if (INTEL_INFO(dev_priv)->gen >= 6)
 		I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
 
-	return init_workarounds_ring(engine);
+	return 0;
 }
 
 static void render_ring_cleanup(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index a4ea80f..ae66084 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -58,27 +58,8 @@ static int wa_add(struct drm_i915_private *dev_priv,
 #define WA_SET_FIELD_MASKED(addr, mask, value) \
 	WA_REG(addr, mask, _MASKED_FIELD(mask, value))
 
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
-				 i915_reg_t reg)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct i915_workarounds *wa = &dev_priv->workarounds;
-	const uint32_t index = wa->hw_whitelist_count[engine->id];
-
-	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
-		return -EINVAL;
-
-	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
-		   i915_mmio_reg_offset(reg));
-	wa->hw_whitelist_count[engine->id]++;
-
-	return 0;
-}
-
-static int gen8_init_workarounds(struct intel_engine_cs *engine)
+static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-
 	WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
 
 	/* WaDisableAsyncFlipPerfMode:bdw,chv */
@@ -126,12 +107,11 @@ static int gen8_init_workarounds(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int bdw_init_workarounds(struct intel_engine_cs *engine)
+static int bdw_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen8_init_workarounds(engine);
+	ret = gen8_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
@@ -158,12 +138,11 @@ static int bdw_init_workarounds(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int chv_init_workarounds(struct intel_engine_cs *engine)
+static int chv_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen8_init_workarounds(engine);
+	ret = gen8_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
@@ -176,23 +155,8 @@ static int chv_init_workarounds(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int gen9_init_workarounds(struct intel_engine_cs *engine)
+static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	/* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
-
-	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
-		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
-
-	/* WaDisableKillLogic:bxt,skl,kbl */
-	if (!IS_COFFEELAKE(dev_priv))
-		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-			   ECOCHK_DIS_TLB);
-
 	if (HAS_LLC(dev_priv)) {
 		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
 		 *
@@ -203,11 +167,6 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 				  GEN9_PBE_COMPRESSED_HASH_SELECTION);
 		WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
 				  GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
-
-		I915_WRITE(MMCD_MISC_CTRL,
-			   I915_READ(MMCD_MISC_CTRL) |
-			   MMCD_PCLA |
-			   MMCD_HOTSPOT_EN);
 	}
 
 	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
@@ -279,10 +238,6 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 	WA_SET_BIT_MASKED(HDC_CHICKEN0,
 			  HDC_FORCE_NON_COHERENT);
 
-	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-		   BDW_DISABLE_HDC_INVALIDATION);
-
 	/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
 	if (IS_SKYLAKE(dev_priv) ||
 	    IS_KABYLAKE(dev_priv) ||
@@ -294,10 +249,6 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
 	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
 
-	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
-	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
-				    GEN8_LQSC_FLUSH_COHERENT_LINES));
-
 	/*
 	 * Supporting preemption with fine-granularity requires changes in the
 	 * batch buffer programming. Since we can't break old userspace, we
@@ -316,29 +267,11 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
 	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
 			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
 
-	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
-	if (ret)
-		return ret;
-
-	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
-
-	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
-	if (ret)
-		return ret;
-
 	return 0;
 }
 
-static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
+static int skl_tune_iz_hashing(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	u8 vals[3] = { 0, 0, 0 };
 	unsigned int i;
 
@@ -377,67 +310,29 @@ static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int skl_init_workarounds(struct intel_engine_cs *engine)
+static int skl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
-	if (ret)
-		return ret;
-
-	/* WaEnableGapsTsvCreditFix:skl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaDisableGafsUnitClkGating:skl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
-	/* WaInPlaceDecompressionHang:skl */
-	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaDisableLSQCROPERFforOCL:skl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
-	return skl_tune_iz_hashing(engine);
+	return skl_tune_iz_hashing(dev_priv);
 }
 
-static int bxt_init_workarounds(struct intel_engine_cs *engine)
+static int bxt_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
-	/* WaStoreMultiplePTEenable:bxt */
-	/* This is a requirement according to Hardware specification */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
-
-	/* WaSetClckGatingDisableMedia:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
-					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
-	}
-
 	/* WaDisableThreadStallDopClockGating:bxt */
 	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
 			  STALL_DOP_GATING_DISABLE);
 
-	/* WaDisablePooledEuLoadBalancingFix:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
-		I915_WRITE(FF_SLICE_CS_CHICKEN2,
-			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
-	}
-
 	/* WaDisableSbeCacheDispatchPortSharing:bxt */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
 		WA_SET_BIT_MASKED(
@@ -445,114 +340,22 @@ static int bxt_init_workarounds(struct intel_engine_cs *engine)
 			GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 	}
 
-	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
-	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
-	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
-	/* WaDisableLSQCROPERFforOCL:bxt */
-	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
-		if (ret)
-			return ret;
-
-		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-		if (ret)
-			return ret;
-	}
-
-	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
-		I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
-					   L3_HIGH_PRIO_CREDITS(2));
-
 	/* WaToEnableHwFixForPushConstHWBug:bxt */
 	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
 		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
 				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
-	/* WaInPlaceDecompressionHang:bxt */
-	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
 	return 0;
 }
 
-static int cnl_init_workarounds(struct intel_engine_cs *engine)
+static int kbl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
-
-	/* WaForceContextSaveRestoreNonCoherent:cnl */
-	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
-	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
-
-	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
-
-	/* WaInPlaceDecompressionHang:cnl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaPushConstantDereferenceHoldDisable:cnl */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
-
-	/* FtrEnableFastAnisoL1BankingFix: cnl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
-
-	/* WaDisable3DMidCmdPreemption:cnl */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
-	/* WaDisableGPGPUMidCmdPreemption:cnl */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
-	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
-	ret= wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int kbl_init_workarounds(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
-	/* WaEnableGapsTsvCreditFix:kbl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
-	/* WaDisableDynamicCreditSharing:kbl */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
-
 	/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
 	if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
 		WA_SET_BIT_MASKED(HDC_CHICKEN0,
@@ -563,34 +366,19 @@ static int kbl_init_workarounds(struct intel_engine_cs *engine)
 		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
 				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
-	/* WaDisableGafsUnitClkGating:kbl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
 	/* WaDisableSbeCacheDispatchPortSharing:kbl */
 	WA_SET_BIT_MASKED(
 		GEN7_HALF_SLICE_CHICKEN1,
 		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 
-	/* WaInPlaceDecompressionHang:kbl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
-	/* WaDisableLSQCROPERFforOCL:kbl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
-
 	return 0;
 }
 
-static int glk_init_workarounds(struct intel_engine_cs *engine)
+static int glk_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
@@ -601,79 +389,100 @@ static int glk_init_workarounds(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int cfl_init_workarounds(struct intel_engine_cs *engine)
+static int cfl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	ret = gen9_init_workarounds(engine);
+	ret = gen9_ctx_workarounds_init(dev_priv);
 	if (ret)
 		return ret;
 
-	/* WaEnableGapsTsvCreditFix:cfl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
-
 	/* WaToEnableHwFixForPushConstHWBug:cfl */
 	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
 			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
-	/* WaDisableGafsUnitClkGating:cfl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
 	/* WaDisableSbeCacheDispatchPortSharing:cfl */
 	WA_SET_BIT_MASKED(
 		GEN7_HALF_SLICE_CHICKEN1,
 		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 
-	/* WaInPlaceDecompressionHang:cfl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
 	return 0;
 }
 
-int init_workarounds_ring(struct intel_engine_cs *engine)
+static int cnl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	int err;
+	/* WaForceContextSaveRestoreNonCoherent:cnl */
+	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
+			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
 
-	WARN_ON(engine->id != RCS);
+	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
+	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
 
-	dev_priv->workarounds.count = 0;
-	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
+	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
-	if (IS_BROADWELL(dev_priv))
-		err = bdw_init_workarounds(engine);
-	else if (IS_CHERRYVIEW(dev_priv))
-		err = chv_init_workarounds(engine);
-	else if (IS_SKYLAKE(dev_priv))
-		err =  skl_init_workarounds(engine);
-	else if (IS_BROXTON(dev_priv))
-		err = bxt_init_workarounds(engine);
-	else if (IS_KABYLAKE(dev_priv))
-		err = kbl_init_workarounds(engine);
-	else if (IS_GEMINILAKE(dev_priv))
-		err =  glk_init_workarounds(engine);
-	else if (IS_COFFEELAKE(dev_priv))
-		err = cfl_init_workarounds(engine);
-	else if (IS_CANNONLAKE(dev_priv))
-		err = cnl_init_workarounds(engine);
-	else
-		err = 0;
-	if (err)
-		return err;
+	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
+	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
+		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
 
-	DRM_DEBUG_DRIVER("%s: Number of context specific w/a: %d\n",
-			 engine->name, dev_priv->workarounds.count);
-	return 0;
-}
+	/* WaPushConstantDereferenceHoldDisable:cnl */
+	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
 
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
-{
-	struct i915_workarounds *w = &req->i915->workarounds;
+	/* FtrEnableFastAnisoL1BankingFix:cnl */
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
+
+	/* WaDisable3DMidCmdPreemption:cnl */
+	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+
+	/* WaDisableGPGPUMidCmdPreemption:cnl */
+	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+
+	return 0;
+}
+
+int intel_ctx_workarounds_init(struct drm_i915_private *dev_priv)
+{
+	int err;
+
+	dev_priv->workarounds.count = 0;
+
+	if (INTEL_GEN(dev_priv) < 8)
+		err = 0;
+	else if (IS_BROADWELL(dev_priv))
+		err = bdw_ctx_workarounds_init(dev_priv);
+	else if (IS_CHERRYVIEW(dev_priv))
+		err = chv_ctx_workarounds_init(dev_priv);
+	else if (IS_SKYLAKE(dev_priv))
+		err = skl_ctx_workarounds_init(dev_priv);
+	else if (IS_BROXTON(dev_priv))
+		err = bxt_ctx_workarounds_init(dev_priv);
+	else if (IS_KABYLAKE(dev_priv))
+		err = kbl_ctx_workarounds_init(dev_priv);
+	else if (IS_GEMINILAKE(dev_priv))
+		err = glk_ctx_workarounds_init(dev_priv);
+	else if (IS_COFFEELAKE(dev_priv))
+		err = cfl_ctx_workarounds_init(dev_priv);
+	else if (IS_CANNONLAKE(dev_priv))
+		err = cnl_ctx_workarounds_init(dev_priv);
+	else {
+		MISSING_CASE(INTEL_GEN(dev_priv));
+		err = 0;
+	}
+	if (err)
+		return err;
+
+	DRM_DEBUG_DRIVER("Number of context specific w/a: %d\n",
+			 dev_priv->workarounds.count);
+	return 0;
+}
+
+int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req)
+{
+	struct i915_workarounds *w = &req->i915->workarounds;
 	u32 *cs;
 	int ret, i;
 
@@ -703,3 +512,338 @@ int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
 
 	return 0;
 }
+
+static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	if (HAS_LLC(dev_priv)) {
+		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
+		 *
+		 * Must match Display Engine. See
+		 * WaCompressedResourceDisplayNewHashMode.
+		 */
+		I915_WRITE(MMCD_MISC_CTRL,
+			   I915_READ(MMCD_MISC_CTRL) |
+			   MMCD_PCLA |
+			   MMCD_HOTSPOT_EN);
+	}
+
+	/* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS,
+		   _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+
+	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
+		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+
+	/* WaDisableKillLogic:bxt,skl,kbl */
+	if (!IS_COFFEELAKE(dev_priv))
+		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+			   ECOCHK_DIS_TLB);
+
+	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
+	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+		   BDW_DISABLE_HDC_INVALIDATION);
+
+	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
+	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
+				    GEN8_LQSC_FLUSH_COHERENT_LINES));
+
+	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
+	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+}
+
+static void skl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_mmio_workarounds_apply(dev_priv);
+
+	/* WaEnableGapsTsvCreditFix:skl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableGafsUnitClkGating:skl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaInPlaceDecompressionHang:skl */
+	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
+		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void bxt_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_mmio_workarounds_apply(dev_priv);
+
+	/* WaStoreMultiplePTEenable:bxt */
+	/* This is a requirement according to Hardware specification */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+
+	/* WaSetClckGatingDisableMedia:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
+					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+	}
+
+	/* WaDisablePooledEuLoadBalancingFix:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
+		I915_WRITE(FF_SLICE_CS_CHICKEN2,
+			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
+	}
+
+	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
+		I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
+					   L3_HIGH_PRIO_CREDITS(2));
+
+	/* WaInPlaceDecompressionHang:bxt */
+	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
+		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_mmio_workarounds_apply(dev_priv);
+
+	/* WaEnableGapsTsvCreditFix:kbl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableDynamicCreditSharing:kbl */
+	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+		I915_WRITE(GAMT_CHKN_BIT_REG,
+			   (I915_READ(GAMT_CHKN_BIT_REG) |
+			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
+
+	/* WaDisableGafsUnitClkGating:kbl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaInPlaceDecompressionHang:kbl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void glk_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_mmio_workarounds_apply(dev_priv);
+}
+
+static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	gen9_mmio_workarounds_apply(dev_priv);
+
+	/* WaEnableGapsTsvCreditFix:cfl */
+	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+	/* WaDisableGafsUnitClkGating:cfl */
+	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+	/* WaInPlaceDecompressionHang:cfl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
+	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+		I915_WRITE(GAMT_CHKN_BIT_REG,
+			   (I915_READ(GAMT_CHKN_BIT_REG) |
+			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
+
+	/* WaInPlaceDecompressionHang:cnl */
+	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+	/* WaEnablePreemptionGranularityControlByUMD:cnl */
+	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+}
+
+void intel_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	if (INTEL_GEN(dev_priv) < 9)
+		return;
+	else if (IS_SKYLAKE(dev_priv))
+		skl_mmio_workarounds_apply(dev_priv);
+	else if (IS_BROXTON(dev_priv))
+		bxt_mmio_workarounds_apply(dev_priv);
+	else if (IS_KABYLAKE(dev_priv))
+		kbl_mmio_workarounds_apply(dev_priv);
+	else if (IS_GEMINILAKE(dev_priv))
+		glk_mmio_workarounds_apply(dev_priv);
+	else if (IS_COFFEELAKE(dev_priv))
+		cfl_mmio_workarounds_apply(dev_priv);
+	else if (IS_CANNONLAKE(dev_priv))
+		cnl_mmio_workarounds_apply(dev_priv);
+	else
+		MISSING_CASE(INTEL_GEN(dev_priv));
+}
+
+static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
+				 i915_reg_t reg)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	struct i915_workarounds *wa = &dev_priv->workarounds;
+	const uint32_t index = wa->hw_whitelist_count[engine->id];
+
+	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
+		return -EINVAL;
+
+	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
+		   i915_mmio_reg_offset(reg));
+	wa->hw_whitelist_count[engine->id]++;
+
+	return 0;
+}
+
+static int gen9_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret;
+
+	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
+	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
+	if (ret)
+		return ret;
+
+	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
+	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+	if (ret)
+		return ret;
+
+	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int skl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableLSQCROPERFforOCL:skl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int bxt_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
+	/* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
+	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
+	/* WaDisableLSQCROPERFforOCL:bxt */
+	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
+		if (ret)
+			return ret;
+
+		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int kbl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	/* WaDisableLSQCROPERFforOCL:kbl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int glk_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int cfl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret = gen9_whitelist_workarounds_apply(engine);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int cnl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	int ret;
+
+	/* WaEnablePreemptionGranularityControlByUMD:cnl */
+	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int err;
+
+	WARN_ON(engine->id != RCS);
+
+	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+
+	if (INTEL_GEN(dev_priv) < 9) {
+		WARN(1, "No whitelisting in Gen%u\n", INTEL_GEN(dev_priv));
+		err = 0;
+	} else if (IS_SKYLAKE(dev_priv))
+		err = skl_whitelist_workarounds_apply(engine);
+	else if (IS_BROXTON(dev_priv))
+		err = bxt_whitelist_workarounds_apply(engine);
+	else if (IS_KABYLAKE(dev_priv))
+		err = kbl_whitelist_workarounds_apply(engine);
+	else if (IS_GEMINILAKE(dev_priv))
+		err = glk_whitelist_workarounds_apply(engine);
+	else if (IS_COFFEELAKE(dev_priv))
+		err = cfl_whitelist_workarounds_apply(engine);
+	else if (IS_CANNONLAKE(dev_priv))
+		err = cnl_whitelist_workarounds_apply(engine);
+	else {
+		MISSING_CASE(INTEL_GEN(dev_priv));
+		err = 0;
+	}
+	if (err)
+		return err;
+
+	DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %d\n", engine->name,
+			 dev_priv->workarounds.hw_whitelist_count[engine->id]);
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index 27099fc..53a452a 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -25,7 +25,11 @@
 #ifndef _I915_WORKAROUNDS_H_
 #define _I915_WORKAROUNDS_H_
 
-int init_workarounds_ring(struct intel_engine_cs *engine);
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req);
+int intel_ctx_workarounds_init(struct drm_i915_private *dev_priv);
+int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req);
+
+void intel_mmio_workarounds_apply(struct drm_i915_private *dev_priv);
+
+int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
 
 #endif
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
                   ` (2 preceding siblings ...)
  2017-10-11 18:15 ` [PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 18:26   ` Chris Wilson
                     ` (2 more replies)
  2017-10-11 18:15 ` [PATCH 05/11] drm/i915: Rename saved workarounds to make it explicit that they are context WAs Oscar Mateo
                   ` (7 subsequent siblings)
  11 siblings, 3 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

I'm not sure why some WAs have historically been applied in init_clock_gating
and some others in the engine setup (GT vs. display? context vs. global
registers?) but it does not look like the best place to apply workarounds:
the name is confusing, it's a display function (even though some GT WAs
also go here) and it isn't necessarily called on a GPU reset. This patch
moves these WAs to their rightful place inside i915_workarounds.c.

TODO: Do we want to keep display WAs separated from GT ones? In that case,
I would propose another category inside i915_workarounds.c (but I would need
help deciding what goes where).

v2:
  - Also move bdw and chv WAs from init_clock_gating that do not seem to be
    actually related to clock gating.
  - Rebased.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c          | 243 +------------------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 195 ++++++++++++++++++++++++-
 2 files changed, 202 insertions(+), 236 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2fcff97..024ee94 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -56,101 +56,6 @@
 #define INTEL_RC6p_ENABLE			(1<<1)
 #define INTEL_RC6pp_ENABLE			(1<<2)
 
-static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	if (HAS_LLC(dev_priv)) {
-		/*
-		 * WaCompressedResourceDisplayNewHashMode:skl,kbl
-		 * Display WA#0390: skl,kbl
-		 *
-		 * Must match Sampler, Pixel Back End, and Media. See
-		 * WaCompressedResourceSamplerPbeMediaNewHashMode.
-		 */
-		I915_WRITE(CHICKEN_PAR1_1,
-			   I915_READ(CHICKEN_PAR1_1) |
-			   SKL_DE_COMPRESSED_HASH_MODE);
-	}
-
-	/* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
-	I915_WRITE(CHICKEN_PAR1_1,
-		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
-
-	I915_WRITE(GEN8_CONFIG0,
-		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
-
-	/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(GEN8_CHICKEN_DCPR_1,
-		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
-
-	/* WaFbcTurnOffFbcWatermark:skl,bxt,kbl,cfl */
-	/* WaFbcWakeMemOn:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
-		   DISP_FBC_WM_DIS |
-		   DISP_FBC_MEMORY_WAKE);
-
-	/* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_DISABLE_DUMMY0);
-
-	if (IS_SKYLAKE(dev_priv)) {
-		/* WaDisableDopClockGating */
-		I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
-			   & ~GEN7_DOP_CLOCK_GATE_ENABLE);
-	}
-}
-
-static void bxt_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	gen9_init_clock_gating(dev_priv);
-
-	/* WaDisableSDEUnitClockGating:bxt */
-	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
-	/*
-	 * FIXME:
-	 * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
-	 */
-	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-		   GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
-
-	/*
-	 * Wa: Backlight PWM may stop in the asserted state, causing backlight
-	 * to stay fully on.
-	 */
-	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
-		   PWM1_GATING_DIS | PWM2_GATING_DIS);
-}
-
-static void glk_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	u32 val;
-	gen9_init_clock_gating(dev_priv);
-
-	/*
-	 * WaDisablePWMClockGating:glk
-	 * Backlight PWM may stop in the asserted state, causing backlight
-	 * to stay fully on.
-	 */
-	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
-		   PWM1_GATING_DIS | PWM2_GATING_DIS);
-
-	/* WaDDIIOTimeout:glk */
-	if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1)) {
-		u32 val = I915_READ(CHICKEN_MISC_2);
-		val &= ~(GLK_CL0_PWR_DOWN |
-			 GLK_CL1_PWR_DOWN |
-			 GLK_CL2_PWR_DOWN);
-		I915_WRITE(CHICKEN_MISC_2, val);
-	}
-
-	/* Display WA #1133: WaFbcSkipSegments:glk */
-	val = I915_READ(ILK_DPFC_CHICKEN);
-	val &= ~GLK_SKIP_SEG_COUNT_MASK;
-	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
-	I915_WRITE(ILK_DPFC_CHICKEN, val);
-}
-
 static void i915_pineview_get_mem_freq(struct drm_i915_private *dev_priv)
 {
 	u32 tmp;
@@ -8505,110 +8410,10 @@ static void cnp_init_clock_gating(struct drm_i915_private *dev_priv)
 		   CNP_PWM_CGE_GATING_DISABLE);
 }
 
-static void cnl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	u32 val;
-	cnp_init_clock_gating(dev_priv);
-
-	/* This is not an Wa. Enable for better image quality */
-	I915_WRITE(_3D_CHICKEN3,
-		   _MASKED_BIT_ENABLE(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE));
-
-	/* WaEnableChickenDCPR:cnl */
-	I915_WRITE(GEN8_CHICKEN_DCPR_1,
-		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
-
-	/* WaFbcWakeMemOn:cnl */
-	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
-		   DISP_FBC_MEMORY_WAKE);
-
-	/* WaSarbUnitClockGatingDisable:cnl (pre-prod) */
-	if (IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_B0))
-		I915_WRITE(SLICE_UNIT_LEVEL_CLKGATE,
-			   I915_READ(SLICE_UNIT_LEVEL_CLKGATE) |
-			   SARBUNIT_CLKGATE_DIS);
-
-	/* Display WA #1133: WaFbcSkipSegments:cnl */
-	val = I915_READ(ILK_DPFC_CHICKEN);
-	val &= ~GLK_SKIP_SEG_COUNT_MASK;
-	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
-	I915_WRITE(ILK_DPFC_CHICKEN, val);
-}
-
-static void cfl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	cnp_init_clock_gating(dev_priv);
-	gen9_init_clock_gating(dev_priv);
-
-	/* WaFbcNukeOnHostModify:cfl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
-}
-
-static void kbl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	gen9_init_clock_gating(dev_priv);
-
-	/* WaDisableSDEUnitClockGating:kbl */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-			   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
-	/* WaDisableGamClockGating:kbl */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
-			   GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
-
-	/* WaFbcNukeOnHostModify:kbl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
-}
-
-static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
-	gen9_init_clock_gating(dev_priv);
-
-	/* WAC6entrylatency:skl */
-	I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
-		   FBC_LLC_FULLY_OPEN);
-
-	/* WaFbcNukeOnHostModify:skl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
-}
-
 static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	/* The GTT cache must be disabled if the system is using 2M pages. */
-	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
-						 I915_GTT_PAGE_SIZE_2M);
-	enum pipe pipe;
-
 	ilk_init_lp_watermarks(dev_priv);
 
-	/* WaSwitchSolVfFArbitrationPriority:bdw */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
-
-	/* WaPsrDPAMaskVBlankInSRD:bdw */
-	I915_WRITE(CHICKEN_PAR1_1,
-		   I915_READ(CHICKEN_PAR1_1) | DPA_MASK_VBLANK_SRD);
-
-	/* WaPsrDPRSUnmaskVBlankInSRD:bdw */
-	for_each_pipe(dev_priv, pipe) {
-		I915_WRITE(CHICKEN_PIPESL_1(pipe),
-			   I915_READ(CHICKEN_PIPESL_1(pipe)) |
-			   BDW_DPRS_MASK_VBLANK_SRD);
-	}
-
-	/* WaVSRefCountFullforceMissDisable:bdw */
-	/* WaDSRefCountFullforceMissDisable:bdw */
-	I915_WRITE(GEN7_FF_THREAD_MODE,
-		   I915_READ(GEN7_FF_THREAD_MODE) &
-		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
-
-	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
-		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
-
 	/* WaDisableSDEUnitClockGating:bdw */
 	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
 		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
@@ -8616,19 +8421,12 @@ static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
 	/* WaProgramL3SqcReg1Default:bdw */
 	gen8_set_l3sqc_credits(dev_priv, 30, 2);
 
-	/* WaGttCachingOffByDefault:bdw */
-	I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
-
-	/* WaKVMNotificationOnConfigChange:bdw */
-	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
-		   | KVM_CONFIG_CHANGE_NOTIFICATION_SELECT);
-
 	lpt_init_clock_gating(dev_priv);
 
 	/* WaDisableDopClockGating:bdw
 	 *
-	 * Also see the CHICKEN2 write in bdw_init_workarounds() to disable DOP
-	 * clock gating.
+	 * Also see the CHICKEN2 write in bdw_ctx_workarounds_init() to disable
+	 * DOP clock gating.
 	 */
 	I915_WRITE(GEN6_UCGCTL1,
 		   I915_READ(GEN6_UCGCTL1) | GEN6_EU_TCUNIT_CLOCK_GATE_DISABLE);
@@ -8867,16 +8665,6 @@ static void vlv_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void chv_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	/* WaVSRefCountFullforceMissDisable:chv */
-	/* WaDSRefCountFullforceMissDisable:chv */
-	I915_WRITE(GEN7_FF_THREAD_MODE,
-		   I915_READ(GEN7_FF_THREAD_MODE) &
-		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
-
-	/* WaDisableSemaphoreAndSyncFlipWait:chv */
-	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
-		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
-
 	/* WaDisableCSUnitClockGating:chv */
 	I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
 		   GEN6_CSUNIT_CLOCK_GATE_DISABLE);
@@ -8891,12 +8679,6 @@ static void chv_init_clock_gating(struct drm_i915_private *dev_priv)
 	 * LSQC Setting Recommendations.
 	 */
 	gen8_set_l3sqc_credits(dev_priv, 38, 2);
-
-	/*
-	 * GTT cache may not work with big pages, so if those
-	 * are ever enabled GTT cache may need to be disabled.
-	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
 }
 
 static void g4x_init_clock_gating(struct drm_i915_private *dev_priv)
@@ -9018,24 +8800,15 @@ static void nop_init_clock_gating(struct drm_i915_private *dev_priv)
  * @dev_priv: device private
  *
  * Setup the hooks that configure which clocks of a given platform can be
- * gated and also apply various GT and display specific workarounds for these
- * platforms. Note that some GT specific workarounds are applied separately
- * when GPU contexts or batchbuffers start their execution.
+ * gated.
  */
 void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 {
-	if (IS_CANNONLAKE(dev_priv))
-		dev_priv->display.init_clock_gating = cnl_init_clock_gating;
-	else if (IS_COFFEELAKE(dev_priv))
-		dev_priv->display.init_clock_gating = cfl_init_clock_gating;
-	else if (IS_SKYLAKE(dev_priv))
-		dev_priv->display.init_clock_gating = skl_init_clock_gating;
-	else if (IS_KABYLAKE(dev_priv))
-		dev_priv->display.init_clock_gating = kbl_init_clock_gating;
-	else if (IS_BROXTON(dev_priv))
-		dev_priv->display.init_clock_gating = bxt_init_clock_gating;
-	else if (IS_GEMINILAKE(dev_priv))
-		dev_priv->display.init_clock_gating = glk_init_clock_gating;
+	if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv))
+		dev_priv->display.init_clock_gating = cnp_init_clock_gating;
+	else if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv) ||
+		 IS_BROXTON(dev_priv) || IS_GEMINILAKE(dev_priv))
+		dev_priv->display.init_clock_gating = nop_init_clock_gating;
 	else if (IS_BROADWELL(dev_priv))
 		dev_priv->display.init_clock_gating = bdw_init_clock_gating;
 	else if (IS_CHERRYVIEW(dev_priv))
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index ae66084..bc144c2 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -513,6 +513,63 @@ int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req)
 	return 0;
 }
 
+static void bdw_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	/* The GTT cache must be disabled if the system is using 2M pages. */
+	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
+						 I915_GTT_PAGE_SIZE_2M);
+	enum pipe pipe;
+
+	/* WaSwitchSolVfFArbitrationPriority:bdw */
+	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
+
+	/* WaPsrDPAMaskVBlankInSRD:bdw */
+	I915_WRITE(CHICKEN_PAR1_1,
+		   I915_READ(CHICKEN_PAR1_1) | DPA_MASK_VBLANK_SRD);
+
+	/* WaPsrDPRSUnmaskVBlankInSRD:bdw */
+	for_each_pipe(dev_priv, pipe) {
+		I915_WRITE(CHICKEN_PIPESL_1(pipe),
+			   I915_READ(CHICKEN_PIPESL_1(pipe)) |
+			   BDW_DPRS_MASK_VBLANK_SRD);
+	}
+
+	/* WaVSRefCountFullforceMissDisable:bdw */
+	/* WaDSRefCountFullforceMissDisable:bdw */
+	I915_WRITE(GEN7_FF_THREAD_MODE,
+		   I915_READ(GEN7_FF_THREAD_MODE) &
+		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
+
+	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
+		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
+
+	/* WaGttCachingOffByDefault:bdw */
+	I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
+
+	/* WaKVMNotificationOnConfigChange:bdw */
+	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
+		   | KVM_CONFIG_CHANGE_NOTIFICATION_SELECT);
+}
+
+static void chv_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	/* WaVSRefCountFullforceMissDisable:chv */
+	/* WaDSRefCountFullforceMissDisable:chv */
+	I915_WRITE(GEN7_FF_THREAD_MODE,
+		   I915_READ(GEN7_FF_THREAD_MODE) &
+		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
+
+	/* WaDisableSemaphoreAndSyncFlipWait:chv */
+	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
+		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
+
+	/*
+	 * GTT cache may not work with big pages, so if those
+	 * are ever enabled GTT cache may need to be disabled.
+	 */
+	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+}
+
 static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 {
 	if (HAS_LLC(dev_priv)) {
@@ -525,8 +582,40 @@ static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 			   I915_READ(MMCD_MISC_CTRL) |
 			   MMCD_PCLA |
 			   MMCD_HOTSPOT_EN);
+
+		/*
+		 * WaCompressedResourceDisplayNewHashMode:skl,kbl
+		 * Display WA#0390: skl,kbl
+		 *
+		 * Must match Sampler, Pixel Back End, and Media. See
+		 * WaCompressedResourceSamplerPbeMediaNewHashMode.
+		 */
+		I915_WRITE(CHICKEN_PAR1_1,
+			   I915_READ(CHICKEN_PAR1_1) |
+			   SKL_DE_COMPRESSED_HASH_MODE);
 	}
 
+	/* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
+	I915_WRITE(CHICKEN_PAR1_1,
+		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
+
+	I915_WRITE(GEN8_CONFIG0,
+		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
+
+	/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(GEN8_CHICKEN_DCPR_1,
+		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
+
+	/* WaFbcTurnOffFbcWatermark:skl,bxt,kbl,cfl */
+	/* WaFbcWakeMemOn:skl,bxt,kbl,glk,cfl */
+	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
+		   DISP_FBC_WM_DIS |
+		   DISP_FBC_MEMORY_WAKE);
+
+	/* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
+	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
+		   ILK_DPFC_DISABLE_DUMMY0);
+
 	/* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
 	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS,
 		   _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
@@ -557,6 +646,18 @@ static void skl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 {
 	gen9_mmio_workarounds_apply(dev_priv);
 
+	/* WaDisableDopClockGating */
+	I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
+		   & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+
+	/* WAC6entrylatency:skl */
+	I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
+		   FBC_LLC_FULLY_OPEN);
+
+	/* WaFbcNukeOnHostModify:skl */
+	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
+		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+
 	/* WaEnableGapsTsvCreditFix:skl */
 	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
 				   GEN9_GAPS_TSV_CREDIT_DISABLE));
@@ -576,6 +677,24 @@ static void bxt_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 {
 	gen9_mmio_workarounds_apply(dev_priv);
 
+	/* WaDisableSDEUnitClockGating:bxt */
+	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
+		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+
+	/*
+	 * FIXME:
+	 * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
+	 */
+	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
+		   GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
+
+	/*
+	 * Wa: Backlight PWM may stop in the asserted state, causing backlight
+	 * to stay fully on.
+	 */
+	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
+		   PWM1_GATING_DIS | PWM2_GATING_DIS);
+
 	/* WaStoreMultiplePTEenable:bxt */
 	/* This is a requirement according to Hardware specification */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
@@ -609,6 +728,20 @@ static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 {
 	gen9_mmio_workarounds_apply(dev_priv);
 
+	/* WaDisableSDEUnitClockGating:kbl */
+	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+		I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
+			   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+
+	/* WaDisableGamClockGating:kbl */
+	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+		I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
+			   GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
+
+	/* WaFbcNukeOnHostModify:kbl */
+	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
+		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+
 	/* WaEnableGapsTsvCreditFix:kbl */
 	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
 				   GEN9_GAPS_TSV_CREDIT_DISABLE));
@@ -631,13 +764,43 @@ static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 
 static void glk_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 {
+	u32 val;
+
 	gen9_mmio_workarounds_apply(dev_priv);
+
+	/*
+	 * WaDisablePWMClockGating:glk
+	 * Backlight PWM may stop in the asserted state, causing backlight
+	 * to stay fully on.
+	 */
+	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
+		   PWM1_GATING_DIS | PWM2_GATING_DIS);
+
+	/* WaDDIIOTimeout:glk */
+	if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1)) {
+		u32 val = I915_READ(CHICKEN_MISC_2);
+		val &= ~(GLK_CL0_PWR_DOWN |
+			 GLK_CL1_PWR_DOWN |
+			 GLK_CL2_PWR_DOWN);
+		I915_WRITE(CHICKEN_MISC_2, val);
+	}
+
+	/* Display WA #1133: WaFbcSkipSegments:glk */
+	val = I915_READ(ILK_DPFC_CHICKEN);
+	val &= ~GLK_SKIP_SEG_COUNT_MASK;
+	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
+	I915_WRITE(ILK_DPFC_CHICKEN, val);
 }
 
 static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 {
 	gen9_mmio_workarounds_apply(dev_priv);
 
+	/* WaFbcNukeOnHostModify:cfl */
+	I915_WRITE(ILK_DPFC_CHICKEN,
+		   I915_READ(ILK_DPFC_CHICKEN) |
+			     ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+
 	/* WaEnableGapsTsvCreditFix:cfl */
 	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
 				   GEN9_GAPS_TSV_CREDIT_DISABLE));
@@ -654,6 +817,32 @@ static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 
 static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 {
+	u32 val;
+
+	/* This is not an Wa. Enable for better image quality */
+	I915_WRITE(_3D_CHICKEN3,
+		   _MASKED_BIT_ENABLE(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE));
+
+	/* WaEnableChickenDCPR:cnl */
+	I915_WRITE(GEN8_CHICKEN_DCPR_1,
+		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
+
+	/* WaFbcWakeMemOn:cnl */
+	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
+		   DISP_FBC_MEMORY_WAKE);
+
+	/* WaSarbUnitClockGatingDisable:cnl (pre-prod) */
+	if (IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_B0))
+		I915_WRITE(SLICE_UNIT_LEVEL_CLKGATE,
+			   I915_READ(SLICE_UNIT_LEVEL_CLKGATE) |
+			   SARBUNIT_CLKGATE_DIS);
+
+	/* Display WA #1133: WaFbcSkipSegments:cnl */
+	val = I915_READ(ILK_DPFC_CHICKEN);
+	val &= ~GLK_SKIP_SEG_COUNT_MASK;
+	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
+	I915_WRITE(ILK_DPFC_CHICKEN, val);
+
 	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
 	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
 		I915_WRITE(GAMT_CHKN_BIT_REG,
@@ -672,8 +861,12 @@ static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 
 void intel_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 {
-	if (INTEL_GEN(dev_priv) < 9)
+	if (INTEL_GEN(dev_priv) < 8)
 		return;
+	else if (IS_BROADWELL(dev_priv))
+		bdw_mmio_workarounds_apply(dev_priv);
+	else if (IS_CHERRYVIEW(dev_priv))
+		chv_mmio_workarounds_apply(dev_priv);
 	else if (IS_SKYLAKE(dev_priv))
 		skl_mmio_workarounds_apply(dev_priv);
 	else if (IS_BROXTON(dev_priv))
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 05/11] drm/i915: Rename saved workarounds to make it explicit that they are context WAs
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
                   ` (3 preceding siblings ...)
  2017-10-11 18:15 ` [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 18:15 ` [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time Oscar Mateo
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

Some WAs touch registers that get saved/restored together with the logical context.
Make this very explicit by renaming a few things in the code.

v2:
  - Improved naming
  - Rebased

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  10 +-
 drivers/gpu/drm/i915/i915_drv.h          |   5 +-
 drivers/gpu/drm/i915/intel_workarounds.c | 230 +++++++++++++++----------------
 3 files changed, 123 insertions(+), 122 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0bb6e01..1eb6e58 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3415,18 +3415,18 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
 
 	intel_runtime_pm_get(dev_priv);
 
-	seq_printf(m, "Workarounds applied: %d\n", workarounds->count);
+	seq_printf(m, "Context workarounds applied: %d\n", workarounds->ctx_wa_count);
 	for_each_engine(engine, dev_priv, id)
 		seq_printf(m, "HW whitelist count for %s: %d\n",
 			   engine->name, workarounds->hw_whitelist_count[id]);
-	for (i = 0; i < workarounds->count; ++i) {
+	for (i = 0; i < workarounds->ctx_wa_count; ++i) {
 		i915_reg_t addr;
 		u32 mask, value, read;
 		bool ok;
 
-		addr = workarounds->reg[i].addr;
-		mask = workarounds->reg[i].mask;
-		value = workarounds->reg[i].value;
+		addr = workarounds->ctx_wa_reg[i].addr;
+		mask = workarounds->ctx_wa_reg[i].mask;
+		value = workarounds->ctx_wa_reg[i].value;
 		read = I915_READ(addr);
 		ok = (value & mask) == (read & mask);
 		seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a5f3999..a528b0b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1963,8 +1963,9 @@ struct i915_wa_reg {
 #define I915_MAX_WA_REGS 16
 
 struct i915_workarounds {
-	struct i915_wa_reg reg[I915_MAX_WA_REGS];
-	u32 count;
+	struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
+	u32 ctx_wa_count;
+
 	u32 hw_whitelist_count[I915_NUM_ENGINES];
 };
 
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index bc144c2..c7d2bf9 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -25,49 +25,49 @@
 #include "i915_drv.h"
 #include "intel_workarounds.h"
 
-static int wa_add(struct drm_i915_private *dev_priv,
-		  i915_reg_t addr,
-		  const u32 mask, const u32 val)
+static int ctx_wa_add(struct drm_i915_private *dev_priv,
+		      i915_reg_t addr,
+		      const u32 mask, const u32 val)
 {
-	const u32 idx = dev_priv->workarounds.count;
+	const u32 idx = dev_priv->workarounds.ctx_wa_count;
 
 	if (WARN_ON(idx >= I915_MAX_WA_REGS))
 		return -ENOSPC;
 
-	dev_priv->workarounds.reg[idx].addr = addr;
-	dev_priv->workarounds.reg[idx].value = val;
-	dev_priv->workarounds.reg[idx].mask = mask;
+	dev_priv->workarounds.ctx_wa_reg[idx].addr = addr;
+	dev_priv->workarounds.ctx_wa_reg[idx].value = val;
+	dev_priv->workarounds.ctx_wa_reg[idx].mask = mask;
 
-	dev_priv->workarounds.count++;
+	dev_priv->workarounds.ctx_wa_count++;
 
 	return 0;
 }
 
-#define WA_REG(addr, mask, val) do { \
-		const int r = wa_add(dev_priv, (addr), (mask), (val)); \
+#define CTX_WA_REG(addr, mask, val) do { \
+		const int r = ctx_wa_add(dev_priv, (addr), (mask), (val)); \
 		if (r) \
 			return r; \
 	} while (0)
 
-#define WA_SET_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+#define CTX_WA_SET_BIT_MASKED(addr, mask) \
+	CTX_WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
 
-#define WA_CLR_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+#define CTX_WA_CLR_BIT_MASKED(addr, mask) \
+	CTX_WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
 
-#define WA_SET_FIELD_MASKED(addr, mask, value) \
-	WA_REG(addr, mask, _MASKED_FIELD(mask, value))
+#define CTX_WA_SET_FIELD_MASKED(addr, mask, value) \
+	CTX_WA_REG(addr, mask, _MASKED_FIELD(mask, value))
 
 static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
+	CTX_WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
 
 	/* WaDisableAsyncFlipPerfMode:bdw,chv */
-	WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
+	CTX_WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
 
 	/* WaDisablePartialInstShootdown:bdw,chv */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+	CTX_WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+			      PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
 	/* Use Force Non-Coherent whenever executing a 3D context. This is a
 	 * workaround for for a possible hang in the unlikely event a TLB
@@ -75,9 +75,9 @@ static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 	 */
 	/* WaForceEnableNonCoherent:bdw,chv */
 	/* WaHdcDisableFetchWhenMasked:bdw,chv */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_DONOT_FETCH_MEM_WHEN_MASKED |
-			  HDC_FORCE_NON_COHERENT);
+	CTX_WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			      HDC_DONOT_FETCH_MEM_WHEN_MASKED |
+			      HDC_FORCE_NON_COHERENT);
 
 	/* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
 	 * "The Hierarchical Z RAW Stall Optimization allows non-overlapping
@@ -87,10 +87,10 @@ static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 	 *
 	 * This optimization is off by default for BDW and CHV; turn it on.
 	 */
-	WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
+	CTX_WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
 
 	/* Wa4x4STCOptimizationDisable:bdw,chv */
-	WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
+	CTX_WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
 
 	/*
 	 * BSpec recommends 8x4 when MSAA is used,
@@ -100,9 +100,9 @@ static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 	 * disable bit, which we don't touch here, but it's good
 	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
 	 */
-	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
-			    GEN6_WIZ_HASHING_MASK,
-			    GEN6_WIZ_HASHING_16x4);
+	CTX_WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+				GEN6_WIZ_HASHING_MASK,
+				GEN6_WIZ_HASHING_16x4);
 
 	return 0;
 }
@@ -116,24 +116,24 @@ static int bdw_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 		return ret;
 
 	/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+	CTX_WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
 
 	/* WaDisableDopClockGating:bdw
 	 *
 	 * Also see the related UCGTCL1 write in broadwell_init_clock_gating()
 	 * to disable EUTC clock gating.
 	 */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
-			  DOP_CLOCK_GATING_DISABLE);
+	CTX_WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
+			      DOP_CLOCK_GATING_DISABLE);
 
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-			  GEN8_SAMPLER_POWER_BYPASS_DIS);
+	CTX_WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+			      GEN8_SAMPLER_POWER_BYPASS_DIS);
 
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  /* WaForceContextSaveRestoreNonCoherent:bdw */
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
-			  /* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
-			  (IS_BDW_GT3(dev_priv) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
+	CTX_WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			      /* WaForceContextSaveRestoreNonCoherent:bdw */
+			      HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
+			      /* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
+			      (IS_BDW_GT3(dev_priv) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
 
 	return 0;
 }
@@ -147,10 +147,10 @@ static int chv_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 		return ret;
 
 	/* WaDisableThreadStallDopClockGating:chv */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+	CTX_WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
 
 	/* Improve HiZ throughput on CHV. */
-	WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
+	CTX_WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
 
 	return 0;
 }
@@ -163,32 +163,32 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 		 * Must match Display Engine. See
 		 * WaCompressedResourceDisplayNewHashMode.
 		 */
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN9_PBE_COMPRESSED_HASH_SELECTION);
-		WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
-				  GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
+		CTX_WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				      GEN9_PBE_COMPRESSED_HASH_SELECTION);
+		CTX_WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+				      GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
 	}
 
 	/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
 	/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  FLOW_CONTROL_ENABLE |
-			  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+	CTX_WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+			      FLOW_CONTROL_ENABLE |
+			      PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
 	/* Syncing dependencies between camera and graphics:skl,bxt,kbl */
 	if (!IS_COFFEELAKE(dev_priv))
-		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-				  GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
+		CTX_WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+				      GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
 
 	/* WaDisableDgMirrorFixInHalfSliceChicken5:bxt */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
-				  GEN9_DG_MIRROR_FIX_ENABLE);
+		CTX_WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+				      GEN9_DG_MIRROR_FIX_ENABLE);
 
 	/* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:bxt */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
-				  GEN9_RHWO_OPTIMIZATION_DISABLE);
+		CTX_WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
+				      GEN9_RHWO_OPTIMIZATION_DISABLE);
 		/*
 		 * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
 		 * but we do that in per ctx batchbuffer as there is an issue
@@ -198,28 +198,28 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 
 	/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
 	/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
-			  GEN9_ENABLE_YV12_BUGFIX |
-			  GEN9_ENABLE_GPGPU_PREEMPTION);
+	CTX_WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+			      GEN9_ENABLE_YV12_BUGFIX |
+			      GEN9_ENABLE_GPGPU_PREEMPTION);
 
 	/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
 	/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
-					 GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE));
+	CTX_WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE |
+					    GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE);
 
 	/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
-	WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
-			  GEN9_CCS_TLB_PREFETCH_ENABLE);
+	CTX_WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+			      GEN9_CCS_TLB_PREFETCH_ENABLE);
 
 	/* WaDisableMaskBasedCammingInRCC:bxt */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
-				  PIXEL_MASK_CAMMING_DISABLE);
+		CTX_WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
+				      PIXEL_MASK_CAMMING_DISABLE);
 
 	/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
-			  HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
+	CTX_WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			      HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
+			      HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
 
 	/* WaForceEnableNonCoherent and WaDisableHDCInvalidation are
 	 * both tied to WaForceContextSaveRestoreNonCoherent
@@ -235,19 +235,19 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 	 */
 
 	/* WaForceEnableNonCoherent:skl,bxt,kbl,cfl */
-	WA_SET_BIT_MASKED(HDC_CHICKEN0,
-			  HDC_FORCE_NON_COHERENT);
+	CTX_WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			      HDC_FORCE_NON_COHERENT);
 
 	/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
 	if (IS_SKYLAKE(dev_priv) ||
 	    IS_KABYLAKE(dev_priv) ||
 	    IS_COFFEELAKE(dev_priv) ||
 	    IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0))
-		WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
-				  GEN8_SAMPLER_POWER_BYPASS_DIS);
+		CTX_WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+				      GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
+	CTX_WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
 
 	/*
 	 * Supporting preemption with fine-granularity requires changes in the
@@ -261,11 +261,11 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 	 */
 
 	/* WaDisable3DMidCmdPreemption:skl,bxt,glk,cfl,[cnl] */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+	CTX_WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
 
 	/* WaDisableGPGPUMidCmdPreemption:skl,bxt,blk,cfl,[cnl] */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+	CTX_WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+				GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
 
 	return 0;
 }
@@ -299,13 +299,13 @@ static int skl_tune_iz_hashing(struct drm_i915_private *dev_priv)
 		return 0;
 
 	/* Tune IZ hashing. See intel_device_info_runtime_init() */
-	WA_SET_FIELD_MASKED(GEN7_GT_MODE,
-			    GEN9_IZ_HASHING_MASK(2) |
-			    GEN9_IZ_HASHING_MASK(1) |
-			    GEN9_IZ_HASHING_MASK(0),
-			    GEN9_IZ_HASHING(2, vals[2]) |
-			    GEN9_IZ_HASHING(1, vals[1]) |
-			    GEN9_IZ_HASHING(0, vals[0]));
+	CTX_WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+				GEN9_IZ_HASHING_MASK(2) |
+				GEN9_IZ_HASHING_MASK(1) |
+				GEN9_IZ_HASHING_MASK(0),
+				GEN9_IZ_HASHING(2, vals[2]) |
+				GEN9_IZ_HASHING(1, vals[1]) |
+				GEN9_IZ_HASHING(0, vals[0]));
 
 	return 0;
 }
@@ -330,20 +330,20 @@ static int bxt_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 		return ret;
 
 	/* WaDisableThreadStallDopClockGating:bxt */
-	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
-			  STALL_DOP_GATING_DISABLE);
+	CTX_WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+			      STALL_DOP_GATING_DISABLE);
 
 	/* WaDisableSbeCacheDispatchPortSharing:bxt */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
-		WA_SET_BIT_MASKED(
+		CTX_WA_SET_BIT_MASKED(
 			GEN7_HALF_SLICE_CHICKEN1,
 			GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 	}
 
 	/* WaToEnableHwFixForPushConstHWBug:bxt */
 	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+		CTX_WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				      GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	return 0;
 }
@@ -358,16 +358,16 @@ static int kbl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 
 	/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
 	if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
-		WA_SET_BIT_MASKED(HDC_CHICKEN0,
-				  HDC_FENCE_DEST_SLM_DISABLE);
+		CTX_WA_SET_BIT_MASKED(HDC_CHICKEN0,
+				      HDC_FENCE_DEST_SLM_DISABLE);
 
 	/* WaToEnableHwFixForPushConstHWBug:kbl */
 	if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+		CTX_WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				      GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:kbl */
-	WA_SET_BIT_MASKED(
+	CTX_WA_SET_BIT_MASKED(
 		GEN7_HALF_SLICE_CHICKEN1,
 		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 
@@ -383,8 +383,8 @@ static int glk_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 		return ret;
 
 	/* WaToEnableHwFixForPushConstHWBug:glk */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+	CTX_WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			      GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	return 0;
 }
@@ -398,11 +398,11 @@ static int cfl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 		return ret;
 
 	/* WaToEnableHwFixForPushConstHWBug:cfl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+	CTX_WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			      GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableSbeCacheDispatchPortSharing:cfl */
-	WA_SET_BIT_MASKED(
+	CTX_WA_SET_BIT_MASKED(
 		GEN7_HALF_SLICE_CHICKEN1,
 		GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 
@@ -412,34 +412,34 @@ static int cfl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 static int cnl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
 	/* WaForceContextSaveRestoreNonCoherent:cnl */
-	WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
-			  HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
+	CTX_WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
+			      HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
 
 	/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
 	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
+		CTX_WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
 
 	/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
-	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+	CTX_WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+			      GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
 
 	/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
 	if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
-		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
-				  GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
+		CTX_WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+				      GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
 
 	/* WaPushConstantDereferenceHoldDisable:cnl */
-	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
+	CTX_WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
 
 	/* FtrEnableFastAnisoL1BankingFix:cnl */
-	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
+	CTX_WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
 
 	/* WaDisable3DMidCmdPreemption:cnl */
-	WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+	CTX_WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
 
 	/* WaDisableGPGPUMidCmdPreemption:cnl */
-	WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-			    GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+	CTX_WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+				GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
 
 	return 0;
 }
@@ -448,7 +448,7 @@ int intel_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 {
 	int err;
 
-	dev_priv->workarounds.count = 0;
+	dev_priv->workarounds.ctx_wa_count = 0;
 
 	if (INTEL_GEN(dev_priv) < 8)
 		err = 0;
@@ -476,7 +476,7 @@ int intel_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 		return err;
 
 	DRM_DEBUG_DRIVER("Number of context specific w/a: %d\n",
-			 dev_priv->workarounds.count);
+			 dev_priv->workarounds.ctx_wa_count);
 	return 0;
 }
 
@@ -486,21 +486,21 @@ int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req)
 	u32 *cs;
 	int ret, i;
 
-	if (w->count == 0)
+	if (w->ctx_wa_count == 0)
 		return 0;
 
 	ret = req->engine->emit_flush(req, EMIT_BARRIER);
 	if (ret)
 		return ret;
 
-	cs = intel_ring_begin(req, (w->count * 2 + 2));
+	cs = intel_ring_begin(req, (w->ctx_wa_count * 2 + 2));
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
 
-	*cs++ = MI_LOAD_REGISTER_IMM(w->count);
-	for (i = 0; i < w->count; i++) {
-		*cs++ = i915_mmio_reg_offset(w->reg[i].addr);
-		*cs++ = w->reg[i].value;
+	*cs++ = MI_LOAD_REGISTER_IMM(w->ctx_wa_count);
+	for (i = 0; i < w->ctx_wa_count; i++) {
+		*cs++ = i915_mmio_reg_offset(w->ctx_wa_reg[i].addr);
+		*cs++ = w->ctx_wa_reg[i].value;
 	}
 	*cs++ = MI_NOOP;
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
                   ` (4 preceding siblings ...)
  2017-10-11 18:15 ` [PATCH 05/11] drm/i915: Rename saved workarounds to make it explicit that they are context WAs Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-12 10:35   ` Joonas Lahtinen
  2017-10-11 18:15 ` [PATCH 07/11] drm/i915: Save all Whitelist " Oscar Mateo
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

By doing this, we can dump these workarounds in debugfs for validation (which,
at the moment, we are only able to do for the contexts WAs).

v2:
  - Wrong macro used for MMIO set bit masked
  - Improved naming
  - Rebased

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c          |   5 +
 drivers/gpu/drm/i915/i915_drv.h          |   8 +-
 drivers/gpu/drm/i915/i915_reg.h          |   1 +
 drivers/gpu/drm/i915/intel_workarounds.c | 382 +++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_workarounds.h |   1 +
 5 files changed, 223 insertions(+), 174 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f1e6517..6e9a0da 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -49,6 +49,7 @@
 #include "i915_drv.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
+#include "intel_workarounds.h"
 #include "intel_drv.h"
 #include "intel_uc.h"
 
@@ -886,6 +887,10 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
 	BUG_ON(device_info->gen > sizeof(device_info->gen_mask) * BITS_PER_BYTE);
 	device_info->gen_mask = BIT(device_info->gen - 1);
 
+	ret = intel_mmio_workarounds_init(dev_priv);
+	if (ret < 0)
+		return ret;
+
 	spin_lock_init(&dev_priv->irq_lock);
 	spin_lock_init(&dev_priv->gpu_error.lock);
 	mutex_init(&dev_priv->backlight_lock);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a528b0b..11e8658 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1960,12 +1960,16 @@ struct i915_wa_reg {
 	u32 mask;
 };
 
-#define I915_MAX_WA_REGS 16
+#define I915_MAX_CTX_WA_REGS 16
+#define I915_MAX_MMIO_WA_REGS 32
 
 struct i915_workarounds {
-	struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
+	struct i915_wa_reg ctx_wa_reg[I915_MAX_CTX_WA_REGS];
 	u32 ctx_wa_count;
 
+	struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
+	u32 mmio_wa_count;
+
 	u32 hw_whitelist_count[I915_NUM_ENGINES];
 };
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d2d0a83..5858f5f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7041,6 +7041,7 @@ enum {
  */
 #define  L3_GENERAL_PRIO_CREDITS(x)		(((x) >> 1) << 19)
 #define  L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)
+#define  L3_PRIO_CREDITS_MASK			(0x1f << 19) | (0x1f << 14)
 
 #define GEN7_L3CNTLREG1				_MMIO(0xB01C)
 #define  GEN7_WA_FOR_GEN7_L3_CONTROL			0x3C47FF8C
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index c7d2bf9..95a9b75 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -31,7 +31,7 @@ static int ctx_wa_add(struct drm_i915_private *dev_priv,
 {
 	const u32 idx = dev_priv->workarounds.ctx_wa_count;
 
-	if (WARN_ON(idx >= I915_MAX_WA_REGS))
+	if (WARN_ON(idx >= I915_MAX_CTX_WA_REGS))
 		return -ENOSPC;
 
 	dev_priv->workarounds.ctx_wa_reg[idx].addr = addr;
@@ -513,64 +513,97 @@ int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req)
 	return 0;
 }
 
-static void bdw_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int mmio_wa_add(struct drm_i915_private *dev_priv,
+		       i915_reg_t addr,
+		       const u32 mask, const u32 val)
+{
+	const u32 idx = dev_priv->workarounds.mmio_wa_count;
+
+	if (WARN_ON(idx >= I915_MAX_MMIO_WA_REGS))
+		return -ENOSPC;
+
+	dev_priv->workarounds.mmio_wa_reg[idx].addr = addr;
+	dev_priv->workarounds.mmio_wa_reg[idx].value = val;
+	dev_priv->workarounds.mmio_wa_reg[idx].mask = mask;
+
+	dev_priv->workarounds.mmio_wa_count++;
+
+	return 0;
+}
+
+#define MMIO_WA_REG(addr, mask, val) do { \
+		const int r = mmio_wa_add(dev_priv, (addr), (mask), (val)); \
+		if (r) \
+			return r; \
+	} while (0)
+
+#define MMIO_WA_SET_BIT(addr, mask) \
+	MMIO_WA_REG(addr, (mask), (mask))
+
+#define MMIO_WA_SET_BIT_MASKED(addr, mask) \
+	MMIO_WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+
+#define MMIO_WA_CLR_BIT(addr, mask) \
+	MMIO_WA_REG(addr, (mask), 0)
+
+#define MMIO_WA_SET_FIELD(addr, mask, value) \
+	MMIO_WA_REG(addr, (mask), (value))
+
+static int bdw_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
 	/* The GTT cache must be disabled if the system is using 2M pages. */
-	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
-						 I915_GTT_PAGE_SIZE_2M);
+	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv, I915_GTT_PAGE_SIZE_2M);
 	enum pipe pipe;
 
 	/* WaSwitchSolVfFArbitrationPriority:bdw */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
+	MMIO_WA_SET_BIT(GAM_ECOCHK, HSW_ECOCHK_ARB_PRIO_SOL);
 
 	/* WaPsrDPAMaskVBlankInSRD:bdw */
-	I915_WRITE(CHICKEN_PAR1_1,
-		   I915_READ(CHICKEN_PAR1_1) | DPA_MASK_VBLANK_SRD);
+	MMIO_WA_SET_BIT(CHICKEN_PAR1_1, DPA_MASK_VBLANK_SRD);
 
 	/* WaPsrDPRSUnmaskVBlankInSRD:bdw */
-	for_each_pipe(dev_priv, pipe) {
-		I915_WRITE(CHICKEN_PIPESL_1(pipe),
-			   I915_READ(CHICKEN_PIPESL_1(pipe)) |
-			   BDW_DPRS_MASK_VBLANK_SRD);
-	}
+	for_each_pipe(dev_priv, pipe)
+		MMIO_WA_SET_BIT(CHICKEN_PIPESL_1(pipe), BDW_DPRS_MASK_VBLANK_SRD);
 
 	/* WaVSRefCountFullforceMissDisable:bdw */
 	/* WaDSRefCountFullforceMissDisable:bdw */
-	I915_WRITE(GEN7_FF_THREAD_MODE,
-		   I915_READ(GEN7_FF_THREAD_MODE) &
-		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
+	MMIO_WA_CLR_BIT(GEN7_FF_THREAD_MODE, GEN8_FF_DS_REF_CNT_FFME |
+					     GEN7_FF_VS_REF_CNT_FFME);
 
-	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
-		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
+	MMIO_WA_SET_BIT_MASKED(GEN6_RC_SLEEP_PSMI_CONTROL,
+			       GEN8_RC_SEMA_IDLE_MSG_DISABLE);
 
 	/* WaGttCachingOffByDefault:bdw */
-	I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
+	MMIO_WA_SET_FIELD(HSW_GTT_CACHE_EN, 0xFFFFFFFF,
+			  can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
 
 	/* WaKVMNotificationOnConfigChange:bdw */
-	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
-		   | KVM_CONFIG_CHANGE_NOTIFICATION_SELECT);
+	MMIO_WA_SET_BIT(CHICKEN_PAR2_1, KVM_CONFIG_CHANGE_NOTIFICATION_SELECT);
+
+	return 0;
 }
 
-static void chv_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int chv_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
 	/* WaVSRefCountFullforceMissDisable:chv */
 	/* WaDSRefCountFullforceMissDisable:chv */
-	I915_WRITE(GEN7_FF_THREAD_MODE,
-		   I915_READ(GEN7_FF_THREAD_MODE) &
-		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
+	MMIO_WA_CLR_BIT(GEN7_FF_THREAD_MODE, GEN8_FF_DS_REF_CNT_FFME |
+					     GEN7_FF_VS_REF_CNT_FFME);
 
 	/* WaDisableSemaphoreAndSyncFlipWait:chv */
-	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
-		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
+	MMIO_WA_SET_BIT_MASKED(GEN6_RC_SLEEP_PSMI_CONTROL,
+			       GEN8_RC_SEMA_IDLE_MSG_DISABLE);
 
 	/*
 	 * GTT cache may not work with big pages, so if those
 	 * are ever enabled GTT cache may need to be disabled.
 	 */
-	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
+	MMIO_WA_SET_FIELD(HSW_GTT_CACHE_EN, 0xFFFFFFFF, GTT_CACHE_EN_ALL);
+
+	return 0;
 }
 
-static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int gen9_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
 	if (HAS_LLC(dev_priv)) {
 		/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
@@ -578,10 +611,7 @@ static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 		 * Must match Display Engine. See
 		 * WaCompressedResourceDisplayNewHashMode.
 		 */
-		I915_WRITE(MMCD_MISC_CTRL,
-			   I915_READ(MMCD_MISC_CTRL) |
-			   MMCD_PCLA |
-			   MMCD_HOTSPOT_EN);
+		MMIO_WA_SET_BIT(MMCD_MISC_CTRL, MMCD_PCLA | MMCD_HOTSPOT_EN);
 
 		/*
 		 * WaCompressedResourceDisplayNewHashMode:skl,kbl
@@ -590,297 +620,305 @@ static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 		 * Must match Sampler, Pixel Back End, and Media. See
 		 * WaCompressedResourceSamplerPbeMediaNewHashMode.
 		 */
-		I915_WRITE(CHICKEN_PAR1_1,
-			   I915_READ(CHICKEN_PAR1_1) |
-			   SKL_DE_COMPRESSED_HASH_MODE);
+		MMIO_WA_SET_BIT(CHICKEN_PAR1_1, SKL_DE_COMPRESSED_HASH_MODE);
 	}
 
 	/* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
-	I915_WRITE(CHICKEN_PAR1_1,
-		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
+	MMIO_WA_SET_BIT(CHICKEN_PAR1_1, SKL_EDP_PSR_FIX_RDWRAP);
 
-	I915_WRITE(GEN8_CONFIG0,
-		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
+	MMIO_WA_SET_BIT(GEN8_CONFIG0, GEN9_DEFAULT_FIXES);
 
 	/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(GEN8_CHICKEN_DCPR_1,
-		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
+	MMIO_WA_SET_BIT(GEN8_CHICKEN_DCPR_1, MASK_WAKEMEM);
 
 	/* WaFbcTurnOffFbcWatermark:skl,bxt,kbl,cfl */
 	/* WaFbcWakeMemOn:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
-		   DISP_FBC_WM_DIS |
-		   DISP_FBC_MEMORY_WAKE);
+	MMIO_WA_SET_BIT(DISP_ARB_CTL, DISP_FBC_WM_DIS | DISP_FBC_MEMORY_WAKE);
 
 	/* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_DISABLE_DUMMY0);
+	MMIO_WA_SET_BIT(ILK_DPFC_CHICKEN, ILK_DPFC_DISABLE_DUMMY0);
 
 	/* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS,
-		   _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+	MMIO_WA_SET_BIT_MASKED(GEN9_CSFE_CHICKEN1_RCS,
+			       GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE);
 
 	/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
-	I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
-		   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+	MMIO_WA_SET_BIT(BDW_SCRATCH1, GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
 
 	/* WaDisableKillLogic:bxt,skl,kbl */
 	if (!IS_COFFEELAKE(dev_priv))
-		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-			   ECOCHK_DIS_TLB);
+		MMIO_WA_SET_BIT(GAM_ECOCHK, ECOCHK_DIS_TLB);
 
 	/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
-	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-		   BDW_DISABLE_HDC_INVALIDATION);
+	MMIO_WA_SET_BIT(GAM_ECOCHK, BDW_DISABLE_HDC_INVALIDATION);
 
 	/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
-	I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
-				    GEN8_LQSC_FLUSH_COHERENT_LINES));
+	MMIO_WA_SET_BIT(GEN8_L3SQCREG4, GEN8_LQSC_FLUSH_COHERENT_LINES);
 
 	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+	MMIO_WA_SET_BIT_MASKED(GEN7_FF_SLICE_CS_CHICKEN1,
+			       GEN9_FFSC_PERCTX_PREEMPT_CTRL);
+
+	return 0;
 }
 
-static void skl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int skl_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	gen9_mmio_workarounds_apply(dev_priv);
+	int ret;
+
+	ret = gen9_mmio_workarounds_init(dev_priv);
+	if (ret)
+		return ret;
 
 	/* WaDisableDopClockGating */
-	I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
-		   & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+	MMIO_WA_CLR_BIT(GEN7_MISCCPCTL, GEN7_DOP_CLOCK_GATE_ENABLE);
 
 	/* WAC6entrylatency:skl */
-	I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
-		   FBC_LLC_FULLY_OPEN);
+	MMIO_WA_SET_BIT(FBC_LLC_READ_CTRL, FBC_LLC_FULLY_OPEN);
 
 	/* WaFbcNukeOnHostModify:skl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+	MMIO_WA_SET_BIT(ILK_DPFC_CHICKEN, ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
 
 	/* WaEnableGapsTsvCreditFix:skl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+	MMIO_WA_SET_BIT(GEN8_GARBCNTL, GEN9_GAPS_TSV_CREDIT_DISABLE);
 
 	/* WaDisableGafsUnitClkGating:skl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+	MMIO_WA_SET_BIT(GEN7_UCGCTL4, GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
 
 	/* WaInPlaceDecompressionHang:skl */
 	if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+		MMIO_WA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+				GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+
+	return 0;
 }
 
-static void bxt_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int bxt_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	gen9_mmio_workarounds_apply(dev_priv);
+	int ret;
+
+	ret = gen9_mmio_workarounds_init(dev_priv);
+	if (ret)
+		return ret;
 
 	/* WaDisableSDEUnitClockGating:bxt */
-	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+	MMIO_WA_SET_BIT(GEN8_UCGCTL6, GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
 
 	/*
 	 * FIXME:
 	 * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
 	 */
-	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-		   GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
+	MMIO_WA_SET_BIT(GEN8_UCGCTL6, GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
 
 	/*
 	 * Wa: Backlight PWM may stop in the asserted state, causing backlight
 	 * to stay fully on.
 	 */
-	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
-		   PWM1_GATING_DIS | PWM2_GATING_DIS);
+	MMIO_WA_SET_BIT(GEN9_CLKGATE_DIS_0, PWM1_GATING_DIS | PWM2_GATING_DIS);
 
 	/* WaStoreMultiplePTEenable:bxt */
 	/* This is a requirement according to Hardware specification */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
-		I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+		MMIO_WA_SET_BIT(TILECTL, TILECTL_TLBPF);
 
 	/* WaSetClckGatingDisableMedia:bxt */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
-					    ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+		MMIO_WA_CLR_BIT(GEN7_MISCCPCTL, GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE);
 	}
 
 	/* WaDisablePooledEuLoadBalancingFix:bxt */
 	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
-		I915_WRITE(FF_SLICE_CS_CHICKEN2,
-			   _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
+		MMIO_WA_SET_BIT_MASKED(FF_SLICE_CS_CHICKEN2,
+				       GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE);
 	}
 
 	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
 	if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
-		I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
-					   L3_HIGH_PRIO_CREDITS(2));
+		MMIO_WA_SET_FIELD(GEN8_L3SQCREG1, L3_PRIO_CREDITS_MASK,
+				  L3_GENERAL_PRIO_CREDITS(62) |
+				  L3_HIGH_PRIO_CREDITS(2));
 
 	/* WaInPlaceDecompressionHang:bxt */
 	if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
-		I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-			   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+		MMIO_WA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+				GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+
+	return 0;
 }
 
-static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int kbl_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	gen9_mmio_workarounds_apply(dev_priv);
+	int ret;
+
+	ret = gen9_mmio_workarounds_init(dev_priv);
+	if (ret)
+		return ret;
 
 	/* WaDisableSDEUnitClockGating:kbl */
 	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-			   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+		MMIO_WA_SET_BIT(GEN8_UCGCTL6, GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
 
 	/* WaDisableGamClockGating:kbl */
 	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
-			   GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
+		MMIO_WA_SET_BIT(GEN6_UCGCTL1, GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
 
 	/* WaFbcNukeOnHostModify:kbl */
-	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
-		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+	MMIO_WA_SET_BIT(ILK_DPFC_CHICKEN, ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
 
 	/* WaEnableGapsTsvCreditFix:kbl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+	MMIO_WA_SET_BIT(GEN8_GARBCNTL, GEN9_GAPS_TSV_CREDIT_DISABLE);
 
 	/* WaDisableDynamicCreditSharing:kbl */
 	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
+		MMIO_WA_SET_BIT(GAMT_CHKN_BIT_REG,
+				GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING);
 
 	/* WaDisableGafsUnitClkGating:kbl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+	MMIO_WA_SET_BIT(GEN7_UCGCTL4, GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
 
 	/* WaInPlaceDecompressionHang:kbl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+	MMIO_WA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+			GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+
+	return 0;
 }
 
-static void glk_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int glk_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	u32 val;
+	int ret;
 
-	gen9_mmio_workarounds_apply(dev_priv);
+	ret = gen9_mmio_workarounds_init(dev_priv);
+	if (ret)
+		return ret;
 
 	/*
 	 * WaDisablePWMClockGating:glk
 	 * Backlight PWM may stop in the asserted state, causing backlight
 	 * to stay fully on.
 	 */
-	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
-		   PWM1_GATING_DIS | PWM2_GATING_DIS);
+	MMIO_WA_SET_BIT(GEN9_CLKGATE_DIS_0, PWM1_GATING_DIS | PWM2_GATING_DIS);
 
 	/* WaDDIIOTimeout:glk */
-	if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1)) {
-		u32 val = I915_READ(CHICKEN_MISC_2);
-		val &= ~(GLK_CL0_PWR_DOWN |
-			 GLK_CL1_PWR_DOWN |
-			 GLK_CL2_PWR_DOWN);
-		I915_WRITE(CHICKEN_MISC_2, val);
-	}
+	if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1))
+		MMIO_WA_CLR_BIT(CHICKEN_MISC_2,
+				GLK_CL0_PWR_DOWN |
+				GLK_CL1_PWR_DOWN |
+				GLK_CL2_PWR_DOWN);
 
 	/* Display WA #1133: WaFbcSkipSegments:glk */
-	val = I915_READ(ILK_DPFC_CHICKEN);
-	val &= ~GLK_SKIP_SEG_COUNT_MASK;
-	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
-	I915_WRITE(ILK_DPFC_CHICKEN, val);
+	MMIO_WA_SET_FIELD(ILK_DPFC_CHICKEN, GLK_SKIP_SEG_COUNT_MASK,
+			  GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1));
+
+	return 0;
 }
 
-static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int cfl_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	gen9_mmio_workarounds_apply(dev_priv);
+	int ret;
+
+	ret = gen9_mmio_workarounds_init(dev_priv);
+	if (ret)
+		return ret;
 
 	/* WaFbcNukeOnHostModify:cfl */
-	I915_WRITE(ILK_DPFC_CHICKEN,
-		   I915_READ(ILK_DPFC_CHICKEN) |
-			     ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+	MMIO_WA_SET_BIT(ILK_DPFC_CHICKEN, ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
 
 	/* WaEnableGapsTsvCreditFix:cfl */
-	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-				   GEN9_GAPS_TSV_CREDIT_DISABLE));
+	MMIO_WA_SET_BIT(GEN8_GARBCNTL, GEN9_GAPS_TSV_CREDIT_DISABLE);
 
 	/* WaDisableGafsUnitClkGating:cfl */
-	I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
-				  GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+	MMIO_WA_SET_BIT(GEN7_UCGCTL4, GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
 
 	/* WaInPlaceDecompressionHang:cfl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+	MMIO_WA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+			GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+
+	return 0;
 }
 
-static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int cnl_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
-	u32 val;
-
 	/* This is not an Wa. Enable for better image quality */
-	I915_WRITE(_3D_CHICKEN3,
-		   _MASKED_BIT_ENABLE(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE));
+	MMIO_WA_SET_BIT_MASKED(_3D_CHICKEN3, _3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE);
 
 	/* WaEnableChickenDCPR:cnl */
-	I915_WRITE(GEN8_CHICKEN_DCPR_1,
-		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
+	MMIO_WA_SET_BIT(GEN8_CHICKEN_DCPR_1, MASK_WAKEMEM);
 
 	/* WaFbcWakeMemOn:cnl */
-	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
-		   DISP_FBC_MEMORY_WAKE);
+	MMIO_WA_SET_BIT(DISP_ARB_CTL, DISP_FBC_MEMORY_WAKE);
 
 	/* WaSarbUnitClockGatingDisable:cnl (pre-prod) */
 	if (IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_B0))
-		I915_WRITE(SLICE_UNIT_LEVEL_CLKGATE,
-			   I915_READ(SLICE_UNIT_LEVEL_CLKGATE) |
-			   SARBUNIT_CLKGATE_DIS);
+		MMIO_WA_SET_BIT(SLICE_UNIT_LEVEL_CLKGATE, SARBUNIT_CLKGATE_DIS);
 
 	/* Display WA #1133: WaFbcSkipSegments:cnl */
-	val = I915_READ(ILK_DPFC_CHICKEN);
-	val &= ~GLK_SKIP_SEG_COUNT_MASK;
-	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
-	I915_WRITE(ILK_DPFC_CHICKEN, val);
+	MMIO_WA_SET_FIELD(ILK_DPFC_CHICKEN, GLK_SKIP_SEG_COUNT_MASK,
+			  GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1));
 
 	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
 	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
-		I915_WRITE(GAMT_CHKN_BIT_REG,
-			   (I915_READ(GAMT_CHKN_BIT_REG) |
-			    GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
+		MMIO_WA_SET_BIT(GAMT_CHKN_BIT_REG,
+				GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT);
 
 	/* WaInPlaceDecompressionHang:cnl */
-	I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
-		   (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
-		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+	MMIO_WA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+			GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 
 	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+	MMIO_WA_SET_BIT_MASKED(GEN7_FF_SLICE_CS_CHICKEN1,
+			       GEN9_FFSC_PERCTX_PREEMPT_CTRL);
+
+	return 0;
 }
 
-void intel_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+int intel_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 {
+	int err;
+
+	dev_priv->workarounds.mmio_wa_count = 0;
+
 	if (INTEL_GEN(dev_priv) < 8)
-		return;
+		err = 0;
 	else if (IS_BROADWELL(dev_priv))
-		bdw_mmio_workarounds_apply(dev_priv);
+		err = bdw_mmio_workarounds_init(dev_priv);
 	else if (IS_CHERRYVIEW(dev_priv))
-		chv_mmio_workarounds_apply(dev_priv);
+		err = chv_mmio_workarounds_init(dev_priv);
 	else if (IS_SKYLAKE(dev_priv))
-		skl_mmio_workarounds_apply(dev_priv);
+		err = skl_mmio_workarounds_init(dev_priv);
 	else if (IS_BROXTON(dev_priv))
-		bxt_mmio_workarounds_apply(dev_priv);
+		err = bxt_mmio_workarounds_init(dev_priv);
 	else if (IS_KABYLAKE(dev_priv))
-		kbl_mmio_workarounds_apply(dev_priv);
+		err = kbl_mmio_workarounds_init(dev_priv);
 	else if (IS_GEMINILAKE(dev_priv))
-		glk_mmio_workarounds_apply(dev_priv);
+		err = glk_mmio_workarounds_init(dev_priv);
 	else if (IS_COFFEELAKE(dev_priv))
-		cfl_mmio_workarounds_apply(dev_priv);
+		err = cfl_mmio_workarounds_init(dev_priv);
 	else if (IS_CANNONLAKE(dev_priv))
-		cnl_mmio_workarounds_apply(dev_priv);
-	else
+		err = cnl_mmio_workarounds_init(dev_priv);
+	else {
 		MISSING_CASE(INTEL_GEN(dev_priv));
+		err = 0;
+	}
+	if (err)
+		return err;
+
+	DRM_DEBUG_DRIVER("Number of MMIO w/a: %d\n",
+			 dev_priv->workarounds.mmio_wa_count);
+	return 0;
+}
+
+void intel_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+	struct i915_workarounds *w = &dev_priv->workarounds;
+	int i;
+
+	for (i = 0; i < w->mmio_wa_count; i++) {
+		i915_reg_t addr = w->mmio_wa_reg[i].addr;
+		u32 value = w->mmio_wa_reg[i].value;
+		u32 mask = w->mmio_wa_reg[i].mask;
+
+		I915_WRITE(addr, (I915_READ(addr) & ~mask) | value);
+	}
 }
 
 static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index 53a452a..2f36f27 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -28,6 +28,7 @@
 int intel_ctx_workarounds_init(struct drm_i915_private *dev_priv);
 int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req);
 
+int intel_mmio_workarounds_init(struct drm_i915_private *dev_priv);
 void intel_mmio_workarounds_apply(struct drm_i915_private *dev_priv);
 
 int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 07/11] drm/i915: Save all Whitelist WAs and apply them at a later time
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
                   ` (5 preceding siblings ...)
  2017-10-11 18:15 ` [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 18:15 ` [PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs Oscar Mateo
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

Same as we have been doing for other types, this allow us to dump
the whole list of workarounds to debugs, for validation purposes.

v2:
  - Improved naming
  - Rebased

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |   2 +-
 drivers/gpu/drm/i915/i915_drv.h          |   3 +-
 drivers/gpu/drm/i915/intel_lrc.c         |   8 ++-
 drivers/gpu/drm/i915/intel_workarounds.c | 113 ++++++++++++++++---------------
 drivers/gpu/drm/i915/intel_workarounds.h |   3 +-
 5 files changed, 67 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 1eb6e58..f108f53 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3418,7 +3418,7 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
 	seq_printf(m, "Context workarounds applied: %d\n", workarounds->ctx_wa_count);
 	for_each_engine(engine, dev_priv, id)
 		seq_printf(m, "HW whitelist count for %s: %d\n",
-			   engine->name, workarounds->hw_whitelist_count[id]);
+			   engine->name, workarounds->whitelist_wa_count[id]);
 	for (i = 0; i < workarounds->ctx_wa_count; ++i) {
 		i915_reg_t addr;
 		u32 mask, value, read;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 11e8658..f1349b9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1970,7 +1970,8 @@ struct i915_workarounds {
 	struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
 	u32 mmio_wa_count;
 
-	u32 hw_whitelist_count[I915_NUM_ENGINES];
+	struct i915_wa_reg whitelist_wa_reg[I915_NUM_ENGINES][RING_MAX_NONPRIV_SLOTS];
+	u32 whitelist_wa_count[I915_NUM_ENGINES];
 };
 
 struct i915_virtual_gpu {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e632a29..f3d4602 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1499,9 +1499,7 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
-	ret = intel_whitelist_workarounds_apply(engine);
-	if (ret)
-		return ret;
+	intel_whitelist_workarounds_apply(engine);
 
 	return 0;
 }
@@ -1988,6 +1986,10 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
+	ret = intel_whitelist_workarounds_init(engine);
+	if (ret)
+		return ret;
+
 	ret = intel_init_workaround_bb(engine);
 	if (ret) {
 		/*
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 95a9b75..8375c29 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -921,64 +921,64 @@ void intel_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
 	}
 }
 
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
-				 i915_reg_t reg)
+static int whitelist_wa_add(struct intel_engine_cs *engine,
+			    i915_reg_t reg)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 	struct i915_workarounds *wa = &dev_priv->workarounds;
-	const uint32_t index = wa->hw_whitelist_count[engine->id];
+	const uint32_t index = wa->whitelist_wa_count[engine->id];
 
 	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
 		return -EINVAL;
 
-	I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
-		   i915_mmio_reg_offset(reg));
-	wa->hw_whitelist_count[engine->id]++;
+	wa->whitelist_wa_reg[engine->id][index].addr =
+		RING_FORCE_TO_NONPRIV(engine->mmio_base, index);
+	wa->whitelist_wa_reg[engine->id][index].value = i915_mmio_reg_offset(reg);
+	wa->whitelist_wa_reg[engine->id][index].mask = 0xffffffff;
+
+	wa->whitelist_wa_count[engine->id]++;
 
 	return 0;
 }
 
-static int gen9_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
-	int ret;
+#define WHITELIST_WA_REG(engine, reg) do { \
+	const int r = whitelist_wa_add(engine, reg); \
+	if (r) \
+		return r; \
+} while (0)
 
+static int gen9_whitelist_workarounds_init(struct intel_engine_cs *engine)
+{
 	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
-	if (ret)
-		return ret;
+	WHITELIST_WA_REG(engine, GEN9_CTX_PREEMPT_REG);
 
 	/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
+	WHITELIST_WA_REG(engine, GEN8_CS_CHICKEN1);
 
 	/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
-	if (ret)
-		return ret;
+	WHITELIST_WA_REG(engine, GEN8_HDC_CHICKEN1);
 
 	return 0;
 }
 
-static int skl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int skl_whitelist_workarounds_init(struct intel_engine_cs *engine)
 {
-	int ret = gen9_whitelist_workarounds_apply(engine);
+	int ret = gen9_whitelist_workarounds_init(engine);
 	if (ret)
 		return ret;
 
 	/* WaDisableLSQCROPERFforOCL:skl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
+	WHITELIST_WA_REG(engine, GEN8_L3SQCREG4);
 
 	return 0;
 }
 
-static int bxt_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int bxt_whitelist_workarounds_init(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
 
-	int ret = gen9_whitelist_workarounds_apply(engine);
+	ret = gen9_whitelist_workarounds_init(engine);
 	if (ret)
 		return ret;
 
@@ -987,86 +987,75 @@ static int bxt_whitelist_workarounds_apply(struct intel_engine_cs *engine)
 	/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
 	/* WaDisableLSQCROPERFforOCL:bxt */
 	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
-		ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
-		if (ret)
-			return ret;
-
-		ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-		if (ret)
-			return ret;
+		WHITELIST_WA_REG(engine, GEN9_CS_DEBUG_MODE1);
+		WHITELIST_WA_REG(engine, GEN8_L3SQCREG4);
 	}
 
 	return 0;
 }
 
-static int kbl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int kbl_whitelist_workarounds_init(struct intel_engine_cs *engine)
 {
-	int ret = gen9_whitelist_workarounds_apply(engine);
+	int ret = gen9_whitelist_workarounds_init(engine);
 	if (ret)
 		return ret;
 
 	/* WaDisableLSQCROPERFforOCL:kbl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
-	if (ret)
-		return ret;
+	WHITELIST_WA_REG(engine, GEN8_L3SQCREG4);
 
 	return 0;
 }
 
-static int glk_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int glk_whitelist_workarounds_init(struct intel_engine_cs *engine)
 {
-	int ret = gen9_whitelist_workarounds_apply(engine);
+	int ret = gen9_whitelist_workarounds_init(engine);
 	if (ret)
 		return ret;
 
 	return 0;
 }
 
-static int cfl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int cfl_whitelist_workarounds_init(struct intel_engine_cs *engine)
 {
-	int ret = gen9_whitelist_workarounds_apply(engine);
+	int ret = gen9_whitelist_workarounds_init(engine);
 	if (ret)
 		return ret;
 
 	return 0;
 }
 
-static int cnl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int cnl_whitelist_workarounds_init(struct intel_engine_cs *engine)
 {
-	int ret;
-
 	/* WaEnablePreemptionGranularityControlByUMD:cnl */
-	ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
-	if (ret)
-		return ret;
+	WHITELIST_WA_REG(engine, GEN8_CS_CHICKEN1);
 
 	return 0;
 }
 
-int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+int intel_whitelist_workarounds_init(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 	int err;
 
 	WARN_ON(engine->id != RCS);
 
-	dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+	dev_priv->workarounds.whitelist_wa_count[engine->id] = 0;
 
 	if (INTEL_GEN(dev_priv) < 9) {
 		WARN(1, "No whitelisting in Gen%u\n", INTEL_GEN(dev_priv));
 		err = 0;
 	} else if (IS_SKYLAKE(dev_priv))
-		err = skl_whitelist_workarounds_apply(engine);
+		err = skl_whitelist_workarounds_init(engine);
 	else if (IS_BROXTON(dev_priv))
-		err = bxt_whitelist_workarounds_apply(engine);
+		err = bxt_whitelist_workarounds_init(engine);
 	else if (IS_KABYLAKE(dev_priv))
-		err = kbl_whitelist_workarounds_apply(engine);
+		err = kbl_whitelist_workarounds_init(engine);
 	else if (IS_GEMINILAKE(dev_priv))
-		err = glk_whitelist_workarounds_apply(engine);
+		err = glk_whitelist_workarounds_init(engine);
 	else if (IS_COFFEELAKE(dev_priv))
-		err = cfl_whitelist_workarounds_apply(engine);
+		err = cfl_whitelist_workarounds_init(engine);
 	else if (IS_CANNONLAKE(dev_priv))
-		err = cnl_whitelist_workarounds_apply(engine);
+		err = cnl_whitelist_workarounds_init(engine);
 	else {
 		MISSING_CASE(INTEL_GEN(dev_priv));
 		err = 0;
@@ -1075,6 +1064,18 @@ int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
 		return err;
 
 	DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %d\n", engine->name,
-			 dev_priv->workarounds.hw_whitelist_count[engine->id]);
+			 dev_priv->workarounds.whitelist_wa_count[engine->id]);
 	return 0;
 }
+
+void intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	struct i915_workarounds *w = &dev_priv->workarounds;
+	int i;
+
+	for (i = 0; i < w->whitelist_wa_count[engine->id]; i++) {
+		I915_WRITE(w->whitelist_wa_reg[engine->id][i].addr,
+			   w->whitelist_wa_reg[engine->id][i].value);
+	}
+}
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index 2f36f27..52cdcad 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -31,6 +31,7 @@
 int intel_mmio_workarounds_init(struct drm_i915_private *dev_priv);
 void intel_mmio_workarounds_apply(struct drm_i915_private *dev_priv);
 
-int intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
+int intel_whitelist_workarounds_init(struct intel_engine_cs *engine);
+void intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
 
 #endif
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
                   ` (6 preceding siblings ...)
  2017-10-11 18:15 ` [PATCH 07/11] drm/i915: Save all Whitelist " Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 18:41   ` Chris Wilson
  2017-10-11 18:15 ` [PATCH 09/11] drm/i915: Move WA BB stuff to the workarounds file as well Oscar Mateo
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

Let's try to make sure that all WAs are applied correctly and survive
resumes, resets, etc... (with some help from a companion i-g-t patch).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 48 ++++++++++++++++++++++++++-----------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f108f53..fb49eac 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3399,6 +3399,20 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused)
 	return 0;
 }
 
+static void check_wa_register(struct seq_file *m, struct i915_wa_reg *wa_reg)
+{
+	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	u32 read;
+	bool ok;
+
+	read = I915_READ(wa_reg->addr);
+	ok = (wa_reg->value & wa_reg->mask) == (read & wa_reg->mask);
+	seq_printf(m, "0x%X: 0x%08x, mask: 0x%08x, read: 0x%08x, status: %s\n",
+		   i915_mmio_reg_offset(wa_reg->addr),
+		   wa_reg->value, wa_reg->mask, read,
+		   ok ? "OK" : "FAIL");
+}
+
 static int i915_wa_registers(struct seq_file *m, void *unused)
 {
 	int i;
@@ -3408,6 +3422,7 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
 	struct drm_device *dev = &dev_priv->drm;
 	struct i915_workarounds *workarounds = &dev_priv->workarounds;
 	enum intel_engine_id id;
+	u32 whitelist_wa_count = 0;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
 	if (ret)
@@ -3416,22 +3431,27 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
 	intel_runtime_pm_get(dev_priv);
 
 	seq_printf(m, "Context workarounds applied: %d\n", workarounds->ctx_wa_count);
-	for_each_engine(engine, dev_priv, id)
-		seq_printf(m, "HW whitelist count for %s: %d\n",
-			   engine->name, workarounds->whitelist_wa_count[id]);
 	for (i = 0; i < workarounds->ctx_wa_count; ++i) {
-		i915_reg_t addr;
-		u32 mask, value, read;
-		bool ok;
-
-		addr = workarounds->ctx_wa_reg[i].addr;
-		mask = workarounds->ctx_wa_reg[i].mask;
-		value = workarounds->ctx_wa_reg[i].value;
-		read = I915_READ(addr);
-		ok = (value & mask) == (read & mask);
-		seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
-			   i915_mmio_reg_offset(addr), value, mask, read, ok ? "OK" : "FAIL");
+		struct i915_wa_reg *wa_reg = &workarounds->ctx_wa_reg[i];
+
+		seq_printf(m, "0x%X: 0x%08x, mask: 0x%08x\n",
+			   i915_mmio_reg_offset(wa_reg->addr),
+			   wa_reg->value, wa_reg->mask);
 	}
+	seq_putc(m, '\n');
+
+	seq_printf(m, "MMIO workarounds applied: %d\n", workarounds->mmio_wa_count);
+	for (i = 0; i < workarounds->mmio_wa_count; ++i)
+		check_wa_register(m, &workarounds->mmio_wa_reg[i]);
+	seq_putc(m, '\n');
+
+	for_each_engine(engine, dev_priv, id)
+		whitelist_wa_count += workarounds->whitelist_wa_count[id];
+	seq_printf(m, "Whitelist workarounds applied: %d\n", whitelist_wa_count);
+	for_each_engine(engine, dev_priv, id)
+		for (i = 0; i < workarounds->whitelist_wa_count[id]; ++i)
+			check_wa_register(m, &workarounds->whitelist_wa_reg[id][i]);
+	seq_putc(m, '\n');
 
 	intel_runtime_pm_put(dev_priv);
 	mutex_unlock(&dev->struct_mutex);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 09/11] drm/i915: Move WA BB stuff to the workarounds file as well
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
                   ` (7 preceding siblings ...)
  2017-10-11 18:15 ` [PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 18:15 ` [PATCH 10/11] drm/i915: Document the i915_workarounds file Oscar Mateo
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

Since we are trying to put all WA stuff together, do not forget about the BB WAs.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c         | 253 +-----------------------------
 drivers/gpu/drm/i915/intel_workarounds.c | 254 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_workarounds.h |   3 +
 3 files changed, 259 insertions(+), 251 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f3d4602..0ba8f1b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1166,255 +1166,6 @@ static int execlists_request_alloc(struct drm_i915_gem_request *request)
 	return 0;
 }
 
-/*
- * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after
- * PIPE_CONTROL instruction. This is required for the flush to happen correctly
- * but there is a slight complication as this is applied in WA batch where the
- * values are only initialized once so we cannot take register value at the
- * beginning and reuse it further; hence we save its value to memory, upload a
- * constant value with bit21 set and then we restore it back with the saved value.
- * To simplify the WA, a constant value is formed by using the default value
- * of this register. This shouldn't be a problem because we are only modifying
- * it for a short period and this batch in non-premptible. We can ofcourse
- * use additional instructions that read the actual value of the register
- * at that time and set our bit of interest but it makes the WA complicated.
- *
- * This WA is also required for Gen9 so extracting as a function avoids
- * code duplication.
- */
-static u32 *
-gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
-{
-	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
-	*batch++ = 0;
-
-	*batch++ = MI_LOAD_REGISTER_IMM(1);
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
-
-	batch = gen8_emit_pipe_control(batch,
-				       PIPE_CONTROL_CS_STALL |
-				       PIPE_CONTROL_DC_FLUSH_ENABLE,
-				       0);
-
-	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
-	*batch++ = 0;
-
-	return batch;
-}
-
-/*
- * Typically we only have one indirect_ctx and per_ctx batch buffer which are
- * initialized at the beginning and shared across all contexts but this field
- * helps us to have multiple batches at different offsets and select them based
- * on a criteria. At the moment this batch always start at the beginning of the page
- * and at this point we don't have multiple wa_ctx batch buffers.
- *
- * The number of WA applied are not known at the beginning; we use this field
- * to return the no of DWORDS written.
- *
- * It is to be noted that this batch does not contain MI_BATCH_BUFFER_END
- * so it adds NOOPs as padding to make it cacheline aligned.
- * MI_BATCH_BUFFER_END will be added to perctx batch and both of them together
- * makes a complete batch buffer.
- */
-static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
-	/* WaDisableCtxRestoreArbitration:bdw,chv */
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
-	/* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
-	if (IS_BROADWELL(engine->i915))
-		batch = gen8_emit_flush_coherentl3_wa(engine, batch);
-
-	/* WaClearSlmSpaceAtContextSwitch:bdw,chv */
-	/* Actual scratch location is at 128 bytes offset */
-	batch = gen8_emit_pipe_control(batch,
-				       PIPE_CONTROL_FLUSH_L3 |
-				       PIPE_CONTROL_GLOBAL_GTT_IVB |
-				       PIPE_CONTROL_CS_STALL |
-				       PIPE_CONTROL_QW_WRITE,
-				       i915_ggtt_offset(engine->scratch) +
-				       2 * CACHELINE_BYTES);
-
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
-	/* Pad to end of cacheline */
-	while ((unsigned long)batch % CACHELINE_BYTES)
-		*batch++ = MI_NOOP;
-
-	/*
-	 * MI_BATCH_BUFFER_END is not required in Indirect ctx BB because
-	 * execution depends on the length specified in terms of cache lines
-	 * in the register CTX_RCS_INDIRECT_CTX
-	 */
-
-	return batch;
-}
-
-static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
-	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
-	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
-
-	/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
-	*batch++ = MI_LOAD_REGISTER_IMM(1);
-	*batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
-	*batch++ = _MASKED_BIT_DISABLE(
-			GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
-	*batch++ = MI_NOOP;
-
-	/* WaClearSlmSpaceAtContextSwitch:kbl */
-	/* Actual scratch location is at 128 bytes offset */
-	if (IS_KBL_REVID(engine->i915, 0, KBL_REVID_A0)) {
-		batch = gen8_emit_pipe_control(batch,
-					       PIPE_CONTROL_FLUSH_L3 |
-					       PIPE_CONTROL_GLOBAL_GTT_IVB |
-					       PIPE_CONTROL_CS_STALL |
-					       PIPE_CONTROL_QW_WRITE,
-					       i915_ggtt_offset(engine->scratch)
-					       + 2 * CACHELINE_BYTES);
-	}
-
-	/* WaMediaPoolStateCmdInWABB:bxt,glk */
-	if (HAS_POOLED_EU(engine->i915)) {
-		/*
-		 * EU pool configuration is setup along with golden context
-		 * during context initialization. This value depends on
-		 * device type (2x6 or 3x6) and needs to be updated based
-		 * on which subslice is disabled especially for 2x6
-		 * devices, however it is safe to load default
-		 * configuration of 3x6 device instead of masking off
-		 * corresponding bits because HW ignores bits of a disabled
-		 * subslice and drops down to appropriate config. Please
-		 * see render_state_setup() in i915_gem_render_state.c for
-		 * possible configurations, to avoid duplication they are
-		 * not shown here again.
-		 */
-		*batch++ = GEN9_MEDIA_POOL_STATE;
-		*batch++ = GEN9_MEDIA_POOL_ENABLE;
-		*batch++ = 0x00777000;
-		*batch++ = 0;
-		*batch++ = 0;
-		*batch++ = 0;
-	}
-
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
-	/* Pad to end of cacheline */
-	while ((unsigned long)batch % CACHELINE_BYTES)
-		*batch++ = MI_NOOP;
-
-	return batch;
-}
-
-#define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
-
-static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
-{
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	int err;
-
-	obj = i915_gem_object_create(engine->i915, CTX_WA_BB_OBJ_SIZE);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
-
-	vma = i915_vma_instance(obj, &engine->i915->ggtt.base, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err;
-	}
-
-	err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH);
-	if (err)
-		goto err;
-
-	engine->wa_ctx.vma = vma;
-	return 0;
-
-err:
-	i915_gem_object_put(obj);
-	return err;
-}
-
-static void lrc_destroy_wa_ctx(struct intel_engine_cs *engine)
-{
-	i915_vma_unpin_and_release(&engine->wa_ctx.vma);
-}
-
-typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
-
-static int intel_init_workaround_bb(struct intel_engine_cs *engine)
-{
-	struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
-	struct i915_wa_ctx_bb *wa_bb[2] = { &wa_ctx->indirect_ctx,
-					    &wa_ctx->per_ctx };
-	wa_bb_func_t wa_bb_fn[2];
-	struct page *page;
-	void *batch, *batch_ptr;
-	unsigned int i;
-	int ret;
-
-	if (WARN_ON(engine->id != RCS || !engine->scratch))
-		return -EINVAL;
-
-	switch (INTEL_GEN(engine->i915)) {
-	case 10:
-		return 0;
-	case 9:
-		wa_bb_fn[0] = gen9_init_indirectctx_bb;
-		wa_bb_fn[1] = NULL;
-		break;
-	case 8:
-		wa_bb_fn[0] = gen8_init_indirectctx_bb;
-		wa_bb_fn[1] = NULL;
-		break;
-	default:
-		MISSING_CASE(INTEL_GEN(engine->i915));
-		return 0;
-	}
-
-	ret = lrc_setup_wa_ctx(engine);
-	if (ret) {
-		DRM_DEBUG_DRIVER("Failed to setup context WA page: %d\n", ret);
-		return ret;
-	}
-
-	page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0);
-	batch = batch_ptr = kmap_atomic(page);
-
-	/*
-	 * Emit the two workaround batch buffers, recording the offset from the
-	 * start of the workaround batch buffer object for each and their
-	 * respective sizes.
-	 */
-	for (i = 0; i < ARRAY_SIZE(wa_bb_fn); i++) {
-		wa_bb[i]->offset = batch_ptr - batch;
-		if (WARN_ON(!IS_ALIGNED(wa_bb[i]->offset, CACHELINE_BYTES))) {
-			ret = -EINVAL;
-			break;
-		}
-		if (wa_bb_fn[i])
-			batch_ptr = wa_bb_fn[i](engine, batch_ptr);
-		wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
-	}
-
-	BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
-
-	kunmap_atomic(batch);
-	if (ret)
-		lrc_destroy_wa_ctx(engine);
-
-	return ret;
-}
-
 static u8 gtiir[] = {
 	[RCS] = 0,
 	[BCS] = 0,
@@ -1870,7 +1621,7 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
 
 	intel_engine_cleanup_common(engine);
 
-	lrc_destroy_wa_ctx(engine);
+	intel_bb_workarounds_fini(engine);
 	engine->i915 = NULL;
 	dev_priv->engine[engine->id] = NULL;
 	kfree(engine);
@@ -1990,7 +1741,7 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
-	ret = intel_init_workaround_bb(engine);
+	ret = intel_bb_workarounds_init(engine);
 	if (ret) {
 		/*
 		 * We continue even if we fail to initialize WA batch
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 8375c29..8cdb934 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -1079,3 +1079,257 @@ void intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
 			   w->whitelist_wa_reg[engine->id][i].value);
 	}
 }
+
+/*
+ * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after
+ * PIPE_CONTROL instruction. This is required for the flush to happen correctly
+ * but there is a slight complication as this is applied in WA batch where the
+ * values are only initialized once so we cannot take register value at the
+ * beginning and reuse it further; hence we save its value to memory, upload a
+ * constant value with bit21 set and then we restore it back with the saved value.
+ * To simplify the WA, a constant value is formed by using the default value
+ * of this register. This shouldn't be a problem because we are only modifying
+ * it for a short period and this batch in non-premptible. We can ofcourse
+ * use additional instructions that read the actual value of the register
+ * at that time and set our bit of interest but it makes the WA complicated.
+ *
+ * This WA is also required for Gen9 so extracting as a function avoids
+ * code duplication.
+ */
+static u32 *
+gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
+{
+	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
+	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
+	*batch++ = 0;
+
+	*batch++ = MI_LOAD_REGISTER_IMM(1);
+	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
+
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_CS_STALL |
+				       PIPE_CONTROL_DC_FLUSH_ENABLE,
+				       0);
+
+	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
+	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
+	*batch++ = 0;
+
+	return batch;
+}
+
+/*
+ * Typically we only have one indirect_ctx and per_ctx batch buffer which are
+ * initialized at the beginning and shared across all contexts but this field
+ * helps us to have multiple batches at different offsets and select them based
+ * on a criteria. At the moment this batch always start at the beginning of the page
+ * and at this point we don't have multiple wa_ctx batch buffers.
+ *
+ * The number of WA applied are not known at the beginning; we use this field
+ * to return the no of DWORDS written.
+ *
+ * It is to be noted that this batch does not contain MI_BATCH_BUFFER_END
+ * so it adds NOOPs as padding to make it cacheline aligned.
+ * MI_BATCH_BUFFER_END will be added to perctx batch and both of them together
+ * makes a complete batch buffer.
+ */
+static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+	/* WaDisableCtxRestoreArbitration:bdw,chv */
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+	/* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
+	if (IS_BROADWELL(engine->i915))
+		batch = gen8_emit_flush_coherentl3_wa(engine, batch);
+
+	/* WaClearSlmSpaceAtContextSwitch:bdw,chv */
+	/* Actual scratch location is at 128 bytes offset */
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_FLUSH_L3 |
+				       PIPE_CONTROL_GLOBAL_GTT_IVB |
+				       PIPE_CONTROL_CS_STALL |
+				       PIPE_CONTROL_QW_WRITE,
+				       i915_ggtt_offset(engine->scratch) +
+				       2 * CACHELINE_BYTES);
+
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+	/* Pad to end of cacheline */
+	while ((unsigned long)batch % CACHELINE_BYTES)
+		*batch++ = MI_NOOP;
+
+	/*
+	 * MI_BATCH_BUFFER_END is not required in Indirect ctx BB because
+	 * execution depends on the length specified in terms of cache lines
+	 * in the register CTX_RCS_INDIRECT_CTX
+	 */
+
+	return batch;
+}
+
+static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
+	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
+
+	/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
+	*batch++ = MI_LOAD_REGISTER_IMM(1);
+	*batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
+	*batch++ = _MASKED_BIT_DISABLE(
+			GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
+	*batch++ = MI_NOOP;
+
+	/* WaClearSlmSpaceAtContextSwitch:kbl */
+	/* Actual scratch location is at 128 bytes offset */
+	if (IS_KBL_REVID(engine->i915, 0, KBL_REVID_A0)) {
+		batch = gen8_emit_pipe_control(batch,
+					       PIPE_CONTROL_FLUSH_L3 |
+					       PIPE_CONTROL_GLOBAL_GTT_IVB |
+					       PIPE_CONTROL_CS_STALL |
+					       PIPE_CONTROL_QW_WRITE,
+					       i915_ggtt_offset(engine->scratch)
+					       + 2 * CACHELINE_BYTES);
+	}
+
+	/* WaMediaPoolStateCmdInWABB:bxt,glk */
+	if (HAS_POOLED_EU(engine->i915)) {
+		/*
+		 * EU pool configuration is setup along with golden context
+		 * during context initialization. This value depends on
+		 * device type (2x6 or 3x6) and needs to be updated based
+		 * on which subslice is disabled especially for 2x6
+		 * devices, however it is safe to load default
+		 * configuration of 3x6 device instead of masking off
+		 * corresponding bits because HW ignores bits of a disabled
+		 * subslice and drops down to appropriate config. Please
+		 * see render_state_setup() in i915_gem_render_state.c for
+		 * possible configurations, to avoid duplication they are
+		 * not shown here again.
+		 */
+		*batch++ = GEN9_MEDIA_POOL_STATE;
+		*batch++ = GEN9_MEDIA_POOL_ENABLE;
+		*batch++ = 0x00777000;
+		*batch++ = 0;
+		*batch++ = 0;
+		*batch++ = 0;
+	}
+
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+	/* Pad to end of cacheline */
+	while ((unsigned long)batch % CACHELINE_BYTES)
+		*batch++ = MI_NOOP;
+
+	return batch;
+}
+
+#define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
+
+static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int err;
+
+	obj = i915_gem_object_create(engine->i915, CTX_WA_BB_OBJ_SIZE);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	vma = i915_vma_instance(obj, &engine->i915->ggtt.base, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err;
+	}
+
+	err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH);
+	if (err)
+		goto err;
+
+	engine->wa_ctx.vma = vma;
+	return 0;
+
+err:
+	i915_gem_object_put(obj);
+	return err;
+}
+
+static void lrc_destroy_wa_ctx(struct intel_engine_cs *engine)
+{
+	i915_vma_unpin_and_release(&engine->wa_ctx.vma);
+}
+
+typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
+
+int intel_bb_workarounds_init(struct intel_engine_cs *engine)
+{
+	struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
+	struct i915_wa_ctx_bb *wa_bb[2] = { &wa_ctx->indirect_ctx,
+					    &wa_ctx->per_ctx };
+	wa_bb_func_t wa_bb_fn[2];
+	struct page *page;
+	void *batch, *batch_ptr;
+	unsigned int i;
+	int ret;
+
+	if (WARN_ON(engine->id != RCS || !engine->scratch))
+		return -EINVAL;
+
+	switch (INTEL_GEN(engine->i915)) {
+	case 10:
+		return 0;
+	case 9:
+		wa_bb_fn[0] = gen9_init_indirectctx_bb;
+		wa_bb_fn[1] = NULL;
+		break;
+	case 8:
+		wa_bb_fn[0] = gen8_init_indirectctx_bb;
+		wa_bb_fn[1] = NULL;
+		break;
+	default:
+		MISSING_CASE(INTEL_GEN(engine->i915));
+		return 0;
+	}
+
+	ret = lrc_setup_wa_ctx(engine);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Failed to setup context WA page: %d\n", ret);
+		return ret;
+	}
+
+	page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0);
+	batch = batch_ptr = kmap_atomic(page);
+
+	/*
+	 * Emit the two workaround batch buffers, recording the offset from the
+	 * start of the workaround batch buffer object for each and their
+	 * respective sizes.
+	 */
+	for (i = 0; i < ARRAY_SIZE(wa_bb_fn); i++) {
+		wa_bb[i]->offset = batch_ptr - batch;
+		if (WARN_ON(!IS_ALIGNED(wa_bb[i]->offset, CACHELINE_BYTES))) {
+			ret = -EINVAL;
+			break;
+		}
+		if (wa_bb_fn[i])
+			batch_ptr = wa_bb_fn[i](engine, batch_ptr);
+		wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
+	}
+
+	BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
+
+	kunmap_atomic(batch);
+	if (ret)
+		lrc_destroy_wa_ctx(engine);
+
+	return ret;
+}
+
+void intel_bb_workarounds_fini(struct intel_engine_cs *engine)
+{
+	lrc_destroy_wa_ctx(engine);
+}
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/intel_workarounds.h
index 52cdcad..cdbd360 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/intel_workarounds.h
@@ -34,4 +34,7 @@
 int intel_whitelist_workarounds_init(struct intel_engine_cs *engine);
 void intel_whitelist_workarounds_apply(struct intel_engine_cs *engine);
 
+int intel_bb_workarounds_init(struct intel_engine_cs *engine);
+void intel_bb_workarounds_fini(struct intel_engine_cs *engine);
+
 #endif
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 10/11] drm/i915: Document the i915_workarounds file
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
                   ` (8 preceding siblings ...)
  2017-10-11 18:15 ` [PATCH 09/11] drm/i915: Move WA BB stuff to the workarounds file as well Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 18:15 ` [PATCH 11/11] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
  2017-10-11 19:55 ` ✗ Fi.CI.BAT: warning for Refactor HW workaround code (rev2) Patchwork
  11 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

Does what it says on the tin (plus a few fixes in some old comments).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_workarounds.c | 45 +++++++++++++++++++++++++++-----
 1 file changed, 38 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 8cdb934..5dcbfea 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -25,6 +25,36 @@
 #include "i915_drv.h"
 #include "intel_workarounds.h"
 
+/**
+ * DOC: Hardware workarounds
+ *
+ * This file is a central place to implement most* of the required workarounds
+ * required for HW to work as originally intended. They fall in four categories
+ * depending on how/when they are applied:
+ *
+ * - Workarounds that touch registers that are saved/restored to/from the HW
+ *   context image. The list is generated once and then emitted (via Load
+ *   Register Immediate commands) everytime a new context is created.
+ * - Workarounds that touch global MMIO registers. The list of these WAs is
+ *   generated once and then applied whenever these registers revert to default
+ *   values (on GPU reset, suspend/resume**, etc..).
+ * - Workarounds that whitelist a privileged register, so that UMDs can manage
+ *   them directly. This is just a special case of a MMMIO workaround (as we
+ *   write the list of these to/be-whitelisted registers to some special HW
+ *   registers).
+ * - Workaround batchbuffers, that get executed automatically by the hardware
+ *   on every HW context restore.
+ *
+ * * Please notice that there are other WAs that, due to their nature, cannot be
+ *   applied from a central place. Those are peppered around the rest of the
+ *   code, as needed).
+ *
+ * ** Technically, some registers are powercontext saved & restored, so they
+ *    survive a suspend/resume. In practice, writing them again is not too
+ *    costly and simplifies things. We can revisit this in the future.
+ *
+ */
+
 static int ctx_wa_add(struct drm_i915_private *dev_priv,
 		      i915_reg_t addr,
 		      const u32 mask, const u32 val)
@@ -190,9 +220,9 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
 		CTX_WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
 				      GEN9_RHWO_OPTIMIZATION_DISABLE);
 		/*
-		 * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
-		 * but we do that in per ctx batchbuffer as there is an issue
-		 * with this register not getting restored on ctx restore
+		 * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be
+		 * set but we do that in per ctx batchbuffer as there is an
+		 * issue with this register not getting restored on ctx restore.
 		 */
 	}
 
@@ -1086,10 +1116,11 @@ void intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
  * but there is a slight complication as this is applied in WA batch where the
  * values are only initialized once so we cannot take register value at the
  * beginning and reuse it further; hence we save its value to memory, upload a
- * constant value with bit21 set and then we restore it back with the saved value.
+ * constant value with bit21 set and then we restore it back with the saved
+ * value.
  * To simplify the WA, a constant value is formed by using the default value
  * of this register. This shouldn't be a problem because we are only modifying
- * it for a short period and this batch in non-premptible. We can ofcourse
+ * it for a short period and this batch in non-premptible. We can of course
  * use additional instructions that read the actual value of the register
  * at that time and set our bit of interest but it makes the WA complicated.
  *
@@ -1125,8 +1156,8 @@ void intel_whitelist_workarounds_apply(struct intel_engine_cs *engine)
  * Typically we only have one indirect_ctx and per_ctx batch buffer which are
  * initialized at the beginning and shared across all contexts but this field
  * helps us to have multiple batches at different offsets and select them based
- * on a criteria. At the moment this batch always start at the beginning of the page
- * and at this point we don't have multiple wa_ctx batch buffers.
+ * on a criteria. At the moment this batch always start at the beginning of the
+ * page and at this point we don't have multiple wa_ctx batch buffers.
  *
  * The number of WA applied are not known at the beginning; we use this field
  * to return the no of DWORDS written.
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 11/11] drm/i915: Remove Gen9 WAs with no effect
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
                   ` (9 preceding siblings ...)
  2017-10-11 18:15 ` [PATCH 10/11] drm/i915: Document the i915_workarounds file Oscar Mateo
@ 2017-10-11 18:15 ` Oscar Mateo
  2017-10-11 19:55 ` ✗ Fi.CI.BAT: warning for Refactor HW workaround code (rev2) Patchwork
  11 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-11 18:15 UTC (permalink / raw)
  To: intel-gfx

GEN8_CONFIG0 (0xD00) is a protected by a lock (bit 31) which is set by
the BIOS, so there is no way we can enable the three chicken bits
mandated by the WA (the BIOS should be doing it instead).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h          | 3 ---
 drivers/gpu/drm/i915/intel_workarounds.c | 2 --
 2 files changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5858f5f..02c66f3 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -355,9 +355,6 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   ECOCHK_PPGTT_WT_HSW		(0x2<<3)
 #define   ECOCHK_PPGTT_WB_HSW		(0x3<<3)
 
-#define GEN8_CONFIG0			_MMIO(0xD00)
-#define  GEN9_DEFAULT_FIXES		(1 << 3 | 1 << 2 | 1 << 1)
-
 #define GAC_ECO_BITS			_MMIO(0x14090)
 #define   ECOBITS_SNB_BIT		(1<<13)
 #define   ECOBITS_PPGTT_CACHE64B	(3<<8)
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 5dcbfea..df47fad 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -656,8 +656,6 @@ static int gen9_mmio_workarounds_init(struct drm_i915_private *dev_priv)
 	/* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
 	MMIO_WA_SET_BIT(CHICKEN_PAR1_1, SKL_EDP_PSR_FIX_RDWRAP);
 
-	MMIO_WA_SET_BIT(GEN8_CONFIG0, GEN9_DEFAULT_FIXES);
-
 	/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
 	MMIO_WA_SET_BIT(GEN8_CHICKEN_DCPR_1, MASK_WAKEMEM);
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds
  2017-10-11 18:15 ` [PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds Oscar Mateo
@ 2017-10-11 18:25   ` Chris Wilson
  2017-10-12 12:35     ` Mika Kuoppala
  0 siblings, 1 reply; 27+ messages in thread
From: Chris Wilson @ 2017-10-11 18:25 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx

Quoting Oscar Mateo (2017-10-11 19:15:13)
> There are different kind of workarounds (those that modify registers that
> live in the context image, those that modify global registers, those that
> whitelist registers, etc...) and they have different requirements in terms
> of where they are applied and how. Also, by splitting them apart, it should
> be easier to decide where a new workaround should go.
> 
> v2:
>   - Add multiple MISSING_CASE
>   - Rebased
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c          |   3 +
>  drivers/gpu/drm/i915/i915_gem_context.c  |   5 +
>  drivers/gpu/drm/i915/intel_lrc.c         |  10 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c  |   4 +-
>  drivers/gpu/drm/i915/intel_workarounds.c | 698 +++++++++++++++++++------------
>  drivers/gpu/drm/i915/intel_workarounds.h |   8 +-
>  6 files changed, 444 insertions(+), 284 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f76890b..119c456 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -35,6 +35,7 @@
>  #include "intel_drv.h"
>  #include "intel_frontbuffer.h"
>  #include "intel_mocs.h"
> +#include "intel_workarounds.h"
>  #include "i915_gemfs.h"
>  #include <linux/dma-fence-array.h>
>  #include <linux/kthread.h>
> @@ -4804,6 +4805,8 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
>                 }
>         }
>  
> +       intel_mmio_workarounds_apply(dev_priv);

Hmm, we still have an overlap with intel_init_clock_gating(). Perusing
those there are several that deserve to be in intel_mmio_wa_apply
instead.

Are we happy with the split between "mmio_workarounds" and "clock_gating"?

So are we looking at intel_display_workarounds, intel_gt_workarounds and
intel_ctx_workarounds?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating
  2017-10-11 18:15 ` [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating Oscar Mateo
@ 2017-10-11 18:26   ` Chris Wilson
  2017-10-11 18:29   ` Chris Wilson
  2017-10-11 18:35   ` Ville Syrjälä
  2 siblings, 0 replies; 27+ messages in thread
From: Chris Wilson @ 2017-10-11 18:26 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx; +Cc: Rodrigo Vivi

Quoting Oscar Mateo (2017-10-11 19:15:14)
> I'm not sure why some WAs have historically been applied in init_clock_gating
> and some others in the engine setup (GT vs. display? context vs. global
> registers?) but it does not look like the best place to apply workarounds:
> the name is confusing, it's a display function (even though some GT WAs
> also go here) and it isn't necessarily called on a GPU reset. This patch
> moves these WAs to their rightful place inside i915_workarounds.c.

How dare you answer my question before I ask it!

s/i915_workarounds/intel_workarounds/
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating
  2017-10-11 18:15 ` [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating Oscar Mateo
  2017-10-11 18:26   ` Chris Wilson
@ 2017-10-11 18:29   ` Chris Wilson
  2017-10-13 20:52     ` Oscar Mateo
  2017-10-11 18:35   ` Ville Syrjälä
  2 siblings, 1 reply; 27+ messages in thread
From: Chris Wilson @ 2017-10-11 18:29 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx; +Cc: Rodrigo Vivi

Quoting Oscar Mateo (2017-10-11 19:15:14)
> I'm not sure why some WAs have historically been applied in init_clock_gating
> and some others in the engine setup (GT vs. display? context vs. global
> registers?) but it does not look like the best place to apply workarounds:
> the name is confusing, it's a display function (even though some GT WAs
> also go here) and it isn't necessarily called on a GPU reset. This patch
> moves these WAs to their rightful place inside i915_workarounds.c.
> 
> TODO: Do we want to keep display WAs separated from GT ones? In that case,
> I would propose another category inside i915_workarounds.c (but I would need
> help deciding what goes where).
> 
> v2:
>   - Also move bdw and chv WAs from init_clock_gating that do not seem to be
>     actually related to clock gating.
>   - Rebased.

Too much in one patch. Start with copy'n'paste, and then we can see
which w/a you select to move from init_clock_gating.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating
  2017-10-11 18:15 ` [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating Oscar Mateo
  2017-10-11 18:26   ` Chris Wilson
  2017-10-11 18:29   ` Chris Wilson
@ 2017-10-11 18:35   ` Ville Syrjälä
  2017-10-13 20:34     ` Oscar Mateo
  2 siblings, 1 reply; 27+ messages in thread
From: Ville Syrjälä @ 2017-10-11 18:35 UTC (permalink / raw)
  To: Oscar Mateo; +Cc: intel-gfx, Rodrigo Vivi

On Wed, Oct 11, 2017 at 11:15:14AM -0700, Oscar Mateo wrote:
> I'm not sure why some WAs have historically been applied in init_clock_gating
> and some others in the engine setup (GT vs. display? context vs. global
> registers?) but it does not look like the best place to apply workarounds:
> the name is confusing, it's a display function (even though some GT WAs
> also go here) and it isn't necessarily called on a GPU reset. This patch
> moves these WAs to their rightful place inside i915_workarounds.c.
> 
> TODO: Do we want to keep display WAs separated from GT ones? In that case,
> I would propose another category inside i915_workarounds.c (but I would need
> help deciding what goes where).

The current situation isn't very good. But neither really is moving
display stuff into something called gem_init_hw(). It also gets called
during GPU reset which is at the very least wasted effort when it comes
to display w/as, and could even be actively harmful in case we end up
clobbering something the current display configuration depends on.

> 
> v2:
>   - Also move bdw and chv WAs from init_clock_gating that do not seem to be
>     actually related to clock gating.
>   - Rebased.
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c          | 243 +------------------------------
>  drivers/gpu/drm/i915/intel_workarounds.c | 195 ++++++++++++++++++++++++-
>  2 files changed, 202 insertions(+), 236 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 2fcff97..024ee94 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -56,101 +56,6 @@
>  #define INTEL_RC6p_ENABLE			(1<<1)
>  #define INTEL_RC6pp_ENABLE			(1<<2)
>  
> -static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
> -{
> -	if (HAS_LLC(dev_priv)) {
> -		/*
> -		 * WaCompressedResourceDisplayNewHashMode:skl,kbl
> -		 * Display WA#0390: skl,kbl
> -		 *
> -		 * Must match Sampler, Pixel Back End, and Media. See
> -		 * WaCompressedResourceSamplerPbeMediaNewHashMode.
> -		 */
> -		I915_WRITE(CHICKEN_PAR1_1,
> -			   I915_READ(CHICKEN_PAR1_1) |
> -			   SKL_DE_COMPRESSED_HASH_MODE);
> -	}
> -
> -	/* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
> -	I915_WRITE(CHICKEN_PAR1_1,
> -		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
> -
> -	I915_WRITE(GEN8_CONFIG0,
> -		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
> -
> -	/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
> -	I915_WRITE(GEN8_CHICKEN_DCPR_1,
> -		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
> -
> -	/* WaFbcTurnOffFbcWatermark:skl,bxt,kbl,cfl */
> -	/* WaFbcWakeMemOn:skl,bxt,kbl,glk,cfl */
> -	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
> -		   DISP_FBC_WM_DIS |
> -		   DISP_FBC_MEMORY_WAKE);
> -
> -	/* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
> -	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
> -		   ILK_DPFC_DISABLE_DUMMY0);
> -
> -	if (IS_SKYLAKE(dev_priv)) {
> -		/* WaDisableDopClockGating */
> -		I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
> -			   & ~GEN7_DOP_CLOCK_GATE_ENABLE);
> -	}
> -}
> -
> -static void bxt_init_clock_gating(struct drm_i915_private *dev_priv)
> -{
> -	gen9_init_clock_gating(dev_priv);
> -
> -	/* WaDisableSDEUnitClockGating:bxt */
> -	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
> -		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
> -
> -	/*
> -	 * FIXME:
> -	 * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
> -	 */
> -	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
> -		   GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
> -
> -	/*
> -	 * Wa: Backlight PWM may stop in the asserted state, causing backlight
> -	 * to stay fully on.
> -	 */
> -	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
> -		   PWM1_GATING_DIS | PWM2_GATING_DIS);
> -}
> -
> -static void glk_init_clock_gating(struct drm_i915_private *dev_priv)
> -{
> -	u32 val;
> -	gen9_init_clock_gating(dev_priv);
> -
> -	/*
> -	 * WaDisablePWMClockGating:glk
> -	 * Backlight PWM may stop in the asserted state, causing backlight
> -	 * to stay fully on.
> -	 */
> -	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
> -		   PWM1_GATING_DIS | PWM2_GATING_DIS);
> -
> -	/* WaDDIIOTimeout:glk */
> -	if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1)) {
> -		u32 val = I915_READ(CHICKEN_MISC_2);
> -		val &= ~(GLK_CL0_PWR_DOWN |
> -			 GLK_CL1_PWR_DOWN |
> -			 GLK_CL2_PWR_DOWN);
> -		I915_WRITE(CHICKEN_MISC_2, val);
> -	}
> -
> -	/* Display WA #1133: WaFbcSkipSegments:glk */
> -	val = I915_READ(ILK_DPFC_CHICKEN);
> -	val &= ~GLK_SKIP_SEG_COUNT_MASK;
> -	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
> -	I915_WRITE(ILK_DPFC_CHICKEN, val);
> -}
> -
>  static void i915_pineview_get_mem_freq(struct drm_i915_private *dev_priv)
>  {
>  	u32 tmp;
> @@ -8505,110 +8410,10 @@ static void cnp_init_clock_gating(struct drm_i915_private *dev_priv)
>  		   CNP_PWM_CGE_GATING_DISABLE);
>  }
>  
> -static void cnl_init_clock_gating(struct drm_i915_private *dev_priv)
> -{
> -	u32 val;
> -	cnp_init_clock_gating(dev_priv);
> -
> -	/* This is not an Wa. Enable for better image quality */
> -	I915_WRITE(_3D_CHICKEN3,
> -		   _MASKED_BIT_ENABLE(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE));
> -
> -	/* WaEnableChickenDCPR:cnl */
> -	I915_WRITE(GEN8_CHICKEN_DCPR_1,
> -		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
> -
> -	/* WaFbcWakeMemOn:cnl */
> -	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
> -		   DISP_FBC_MEMORY_WAKE);
> -
> -	/* WaSarbUnitClockGatingDisable:cnl (pre-prod) */
> -	if (IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_B0))
> -		I915_WRITE(SLICE_UNIT_LEVEL_CLKGATE,
> -			   I915_READ(SLICE_UNIT_LEVEL_CLKGATE) |
> -			   SARBUNIT_CLKGATE_DIS);
> -
> -	/* Display WA #1133: WaFbcSkipSegments:cnl */
> -	val = I915_READ(ILK_DPFC_CHICKEN);
> -	val &= ~GLK_SKIP_SEG_COUNT_MASK;
> -	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
> -	I915_WRITE(ILK_DPFC_CHICKEN, val);
> -}
> -
> -static void cfl_init_clock_gating(struct drm_i915_private *dev_priv)
> -{
> -	cnp_init_clock_gating(dev_priv);
> -	gen9_init_clock_gating(dev_priv);
> -
> -	/* WaFbcNukeOnHostModify:cfl */
> -	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
> -		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
> -}
> -
> -static void kbl_init_clock_gating(struct drm_i915_private *dev_priv)
> -{
> -	gen9_init_clock_gating(dev_priv);
> -
> -	/* WaDisableSDEUnitClockGating:kbl */
> -	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
> -		I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
> -			   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
> -
> -	/* WaDisableGamClockGating:kbl */
> -	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
> -		I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
> -			   GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
> -
> -	/* WaFbcNukeOnHostModify:kbl */
> -	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
> -		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
> -}
> -
> -static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
> -{
> -	gen9_init_clock_gating(dev_priv);
> -
> -	/* WAC6entrylatency:skl */
> -	I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
> -		   FBC_LLC_FULLY_OPEN);
> -
> -	/* WaFbcNukeOnHostModify:skl */
> -	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
> -		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
> -}
> -
>  static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
>  {
> -	/* The GTT cache must be disabled if the system is using 2M pages. */
> -	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
> -						 I915_GTT_PAGE_SIZE_2M);
> -	enum pipe pipe;
> -
>  	ilk_init_lp_watermarks(dev_priv);
>  
> -	/* WaSwitchSolVfFArbitrationPriority:bdw */
> -	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
> -
> -	/* WaPsrDPAMaskVBlankInSRD:bdw */
> -	I915_WRITE(CHICKEN_PAR1_1,
> -		   I915_READ(CHICKEN_PAR1_1) | DPA_MASK_VBLANK_SRD);
> -
> -	/* WaPsrDPRSUnmaskVBlankInSRD:bdw */
> -	for_each_pipe(dev_priv, pipe) {
> -		I915_WRITE(CHICKEN_PIPESL_1(pipe),
> -			   I915_READ(CHICKEN_PIPESL_1(pipe)) |
> -			   BDW_DPRS_MASK_VBLANK_SRD);
> -	}
> -
> -	/* WaVSRefCountFullforceMissDisable:bdw */
> -	/* WaDSRefCountFullforceMissDisable:bdw */
> -	I915_WRITE(GEN7_FF_THREAD_MODE,
> -		   I915_READ(GEN7_FF_THREAD_MODE) &
> -		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
> -
> -	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
> -		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
> -
>  	/* WaDisableSDEUnitClockGating:bdw */
>  	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
>  		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
> @@ -8616,19 +8421,12 @@ static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
>  	/* WaProgramL3SqcReg1Default:bdw */
>  	gen8_set_l3sqc_credits(dev_priv, 30, 2);
>  
> -	/* WaGttCachingOffByDefault:bdw */
> -	I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
> -
> -	/* WaKVMNotificationOnConfigChange:bdw */
> -	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
> -		   | KVM_CONFIG_CHANGE_NOTIFICATION_SELECT);
> -
>  	lpt_init_clock_gating(dev_priv);
>  
>  	/* WaDisableDopClockGating:bdw
>  	 *
> -	 * Also see the CHICKEN2 write in bdw_init_workarounds() to disable DOP
> -	 * clock gating.
> +	 * Also see the CHICKEN2 write in bdw_ctx_workarounds_init() to disable
> +	 * DOP clock gating.
>  	 */
>  	I915_WRITE(GEN6_UCGCTL1,
>  		   I915_READ(GEN6_UCGCTL1) | GEN6_EU_TCUNIT_CLOCK_GATE_DISABLE);
> @@ -8867,16 +8665,6 @@ static void vlv_init_clock_gating(struct drm_i915_private *dev_priv)
>  
>  static void chv_init_clock_gating(struct drm_i915_private *dev_priv)
>  {
> -	/* WaVSRefCountFullforceMissDisable:chv */
> -	/* WaDSRefCountFullforceMissDisable:chv */
> -	I915_WRITE(GEN7_FF_THREAD_MODE,
> -		   I915_READ(GEN7_FF_THREAD_MODE) &
> -		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
> -
> -	/* WaDisableSemaphoreAndSyncFlipWait:chv */
> -	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
> -		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
> -
>  	/* WaDisableCSUnitClockGating:chv */
>  	I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
>  		   GEN6_CSUNIT_CLOCK_GATE_DISABLE);
> @@ -8891,12 +8679,6 @@ static void chv_init_clock_gating(struct drm_i915_private *dev_priv)
>  	 * LSQC Setting Recommendations.
>  	 */
>  	gen8_set_l3sqc_credits(dev_priv, 38, 2);
> -
> -	/*
> -	 * GTT cache may not work with big pages, so if those
> -	 * are ever enabled GTT cache may need to be disabled.
> -	 */
> -	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
>  }
>  
>  static void g4x_init_clock_gating(struct drm_i915_private *dev_priv)
> @@ -9018,24 +8800,15 @@ static void nop_init_clock_gating(struct drm_i915_private *dev_priv)
>   * @dev_priv: device private
>   *
>   * Setup the hooks that configure which clocks of a given platform can be
> - * gated and also apply various GT and display specific workarounds for these
> - * platforms. Note that some GT specific workarounds are applied separately
> - * when GPU contexts or batchbuffers start their execution.
> + * gated.
>   */
>  void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
>  {
> -	if (IS_CANNONLAKE(dev_priv))
> -		dev_priv->display.init_clock_gating = cnl_init_clock_gating;
> -	else if (IS_COFFEELAKE(dev_priv))
> -		dev_priv->display.init_clock_gating = cfl_init_clock_gating;
> -	else if (IS_SKYLAKE(dev_priv))
> -		dev_priv->display.init_clock_gating = skl_init_clock_gating;
> -	else if (IS_KABYLAKE(dev_priv))
> -		dev_priv->display.init_clock_gating = kbl_init_clock_gating;
> -	else if (IS_BROXTON(dev_priv))
> -		dev_priv->display.init_clock_gating = bxt_init_clock_gating;
> -	else if (IS_GEMINILAKE(dev_priv))
> -		dev_priv->display.init_clock_gating = glk_init_clock_gating;
> +	if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv))
> +		dev_priv->display.init_clock_gating = cnp_init_clock_gating;
> +	else if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv) ||
> +		 IS_BROXTON(dev_priv) || IS_GEMINILAKE(dev_priv))
> +		dev_priv->display.init_clock_gating = nop_init_clock_gating;
>  	else if (IS_BROADWELL(dev_priv))
>  		dev_priv->display.init_clock_gating = bdw_init_clock_gating;
>  	else if (IS_CHERRYVIEW(dev_priv))
> diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
> index ae66084..bc144c2 100644
> --- a/drivers/gpu/drm/i915/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/intel_workarounds.c
> @@ -513,6 +513,63 @@ int intel_ctx_workarounds_emit(struct drm_i915_gem_request *req)
>  	return 0;
>  }
>  
> +static void bdw_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
> +{
> +	/* The GTT cache must be disabled if the system is using 2M pages. */
> +	bool can_use_gtt_cache = !HAS_PAGE_SIZES(dev_priv,
> +						 I915_GTT_PAGE_SIZE_2M);
> +	enum pipe pipe;
> +
> +	/* WaSwitchSolVfFArbitrationPriority:bdw */
> +	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
> +
> +	/* WaPsrDPAMaskVBlankInSRD:bdw */
> +	I915_WRITE(CHICKEN_PAR1_1,
> +		   I915_READ(CHICKEN_PAR1_1) | DPA_MASK_VBLANK_SRD);
> +
> +	/* WaPsrDPRSUnmaskVBlankInSRD:bdw */
> +	for_each_pipe(dev_priv, pipe) {
> +		I915_WRITE(CHICKEN_PIPESL_1(pipe),
> +			   I915_READ(CHICKEN_PIPESL_1(pipe)) |
> +			   BDW_DPRS_MASK_VBLANK_SRD);
> +	}
> +
> +	/* WaVSRefCountFullforceMissDisable:bdw */
> +	/* WaDSRefCountFullforceMissDisable:bdw */
> +	I915_WRITE(GEN7_FF_THREAD_MODE,
> +		   I915_READ(GEN7_FF_THREAD_MODE) &
> +		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
> +
> +	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
> +		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
> +
> +	/* WaGttCachingOffByDefault:bdw */
> +	I915_WRITE(HSW_GTT_CACHE_EN, can_use_gtt_cache ? GTT_CACHE_EN_ALL : 0);
> +
> +	/* WaKVMNotificationOnConfigChange:bdw */
> +	I915_WRITE(CHICKEN_PAR2_1, I915_READ(CHICKEN_PAR2_1)
> +		   | KVM_CONFIG_CHANGE_NOTIFICATION_SELECT);
> +}
> +
> +static void chv_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
> +{
> +	/* WaVSRefCountFullforceMissDisable:chv */
> +	/* WaDSRefCountFullforceMissDisable:chv */
> +	I915_WRITE(GEN7_FF_THREAD_MODE,
> +		   I915_READ(GEN7_FF_THREAD_MODE) &
> +		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
> +
> +	/* WaDisableSemaphoreAndSyncFlipWait:chv */
> +	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
> +		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
> +
> +	/*
> +	 * GTT cache may not work with big pages, so if those
> +	 * are ever enabled GTT cache may need to be disabled.
> +	 */
> +	I915_WRITE(HSW_GTT_CACHE_EN, GTT_CACHE_EN_ALL);
> +}
> +
>  static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  {
>  	if (HAS_LLC(dev_priv)) {
> @@ -525,8 +582,40 @@ static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  			   I915_READ(MMCD_MISC_CTRL) |
>  			   MMCD_PCLA |
>  			   MMCD_HOTSPOT_EN);
> +
> +		/*
> +		 * WaCompressedResourceDisplayNewHashMode:skl,kbl
> +		 * Display WA#0390: skl,kbl
> +		 *
> +		 * Must match Sampler, Pixel Back End, and Media. See
> +		 * WaCompressedResourceSamplerPbeMediaNewHashMode.
> +		 */
> +		I915_WRITE(CHICKEN_PAR1_1,
> +			   I915_READ(CHICKEN_PAR1_1) |
> +			   SKL_DE_COMPRESSED_HASH_MODE);
>  	}
>  
> +	/* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
> +	I915_WRITE(CHICKEN_PAR1_1,
> +		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
> +
> +	I915_WRITE(GEN8_CONFIG0,
> +		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
> +
> +	/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
> +	I915_WRITE(GEN8_CHICKEN_DCPR_1,
> +		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
> +
> +	/* WaFbcTurnOffFbcWatermark:skl,bxt,kbl,cfl */
> +	/* WaFbcWakeMemOn:skl,bxt,kbl,glk,cfl */
> +	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
> +		   DISP_FBC_WM_DIS |
> +		   DISP_FBC_MEMORY_WAKE);
> +
> +	/* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
> +	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
> +		   ILK_DPFC_DISABLE_DUMMY0);
> +
>  	/* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
>  	I915_WRITE(GEN9_CSFE_CHICKEN1_RCS,
>  		   _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
> @@ -557,6 +646,18 @@ static void skl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  {
>  	gen9_mmio_workarounds_apply(dev_priv);
>  
> +	/* WaDisableDopClockGating */
> +	I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
> +		   & ~GEN7_DOP_CLOCK_GATE_ENABLE);
> +
> +	/* WAC6entrylatency:skl */
> +	I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
> +		   FBC_LLC_FULLY_OPEN);
> +
> +	/* WaFbcNukeOnHostModify:skl */
> +	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
> +		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
> +
>  	/* WaEnableGapsTsvCreditFix:skl */
>  	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
>  				   GEN9_GAPS_TSV_CREDIT_DISABLE));
> @@ -576,6 +677,24 @@ static void bxt_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  {
>  	gen9_mmio_workarounds_apply(dev_priv);
>  
> +	/* WaDisableSDEUnitClockGating:bxt */
> +	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
> +		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
> +
> +	/*
> +	 * FIXME:
> +	 * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
> +	 */
> +	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
> +		   GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
> +
> +	/*
> +	 * Wa: Backlight PWM may stop in the asserted state, causing backlight
> +	 * to stay fully on.
> +	 */
> +	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
> +		   PWM1_GATING_DIS | PWM2_GATING_DIS);
> +
>  	/* WaStoreMultiplePTEenable:bxt */
>  	/* This is a requirement according to Hardware specification */
>  	if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
> @@ -609,6 +728,20 @@ static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  {
>  	gen9_mmio_workarounds_apply(dev_priv);
>  
> +	/* WaDisableSDEUnitClockGating:kbl */
> +	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
> +		I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
> +			   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
> +
> +	/* WaDisableGamClockGating:kbl */
> +	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
> +		I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
> +			   GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
> +
> +	/* WaFbcNukeOnHostModify:kbl */
> +	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
> +		   ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
> +
>  	/* WaEnableGapsTsvCreditFix:kbl */
>  	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
>  				   GEN9_GAPS_TSV_CREDIT_DISABLE));
> @@ -631,13 +764,43 @@ static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  
>  static void glk_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  {
> +	u32 val;
> +
>  	gen9_mmio_workarounds_apply(dev_priv);
> +
> +	/*
> +	 * WaDisablePWMClockGating:glk
> +	 * Backlight PWM may stop in the asserted state, causing backlight
> +	 * to stay fully on.
> +	 */
> +	I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
> +		   PWM1_GATING_DIS | PWM2_GATING_DIS);
> +
> +	/* WaDDIIOTimeout:glk */
> +	if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1)) {
> +		u32 val = I915_READ(CHICKEN_MISC_2);
> +		val &= ~(GLK_CL0_PWR_DOWN |
> +			 GLK_CL1_PWR_DOWN |
> +			 GLK_CL2_PWR_DOWN);
> +		I915_WRITE(CHICKEN_MISC_2, val);
> +	}
> +
> +	/* Display WA #1133: WaFbcSkipSegments:glk */
> +	val = I915_READ(ILK_DPFC_CHICKEN);
> +	val &= ~GLK_SKIP_SEG_COUNT_MASK;
> +	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
> +	I915_WRITE(ILK_DPFC_CHICKEN, val);
>  }
>  
>  static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  {
>  	gen9_mmio_workarounds_apply(dev_priv);
>  
> +	/* WaFbcNukeOnHostModify:cfl */
> +	I915_WRITE(ILK_DPFC_CHICKEN,
> +		   I915_READ(ILK_DPFC_CHICKEN) |
> +			     ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
> +
>  	/* WaEnableGapsTsvCreditFix:cfl */
>  	I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
>  				   GEN9_GAPS_TSV_CREDIT_DISABLE));
> @@ -654,6 +817,32 @@ static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  
>  static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  {
> +	u32 val;
> +
> +	/* This is not an Wa. Enable for better image quality */
> +	I915_WRITE(_3D_CHICKEN3,
> +		   _MASKED_BIT_ENABLE(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE));
> +
> +	/* WaEnableChickenDCPR:cnl */
> +	I915_WRITE(GEN8_CHICKEN_DCPR_1,
> +		   I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
> +
> +	/* WaFbcWakeMemOn:cnl */
> +	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
> +		   DISP_FBC_MEMORY_WAKE);
> +
> +	/* WaSarbUnitClockGatingDisable:cnl (pre-prod) */
> +	if (IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_B0))
> +		I915_WRITE(SLICE_UNIT_LEVEL_CLKGATE,
> +			   I915_READ(SLICE_UNIT_LEVEL_CLKGATE) |
> +			   SARBUNIT_CLKGATE_DIS);
> +
> +	/* Display WA #1133: WaFbcSkipSegments:cnl */
> +	val = I915_READ(ILK_DPFC_CHICKEN);
> +	val &= ~GLK_SKIP_SEG_COUNT_MASK;
> +	val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
> +	I915_WRITE(ILK_DPFC_CHICKEN, val);
> +
>  	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
>  	if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
>  		I915_WRITE(GAMT_CHKN_BIT_REG,
> @@ -672,8 +861,12 @@ static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  
>  void intel_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
>  {
> -	if (INTEL_GEN(dev_priv) < 9)
> +	if (INTEL_GEN(dev_priv) < 8)
>  		return;
> +	else if (IS_BROADWELL(dev_priv))
> +		bdw_mmio_workarounds_apply(dev_priv);
> +	else if (IS_CHERRYVIEW(dev_priv))
> +		chv_mmio_workarounds_apply(dev_priv);
>  	else if (IS_SKYLAKE(dev_priv))
>  		skl_mmio_workarounds_apply(dev_priv);
>  	else if (IS_BROXTON(dev_priv))
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs
  2017-10-11 18:15 ` [PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs Oscar Mateo
@ 2017-10-11 18:41   ` Chris Wilson
  2017-10-13 20:51     ` Oscar Mateo
  0 siblings, 1 reply; 27+ messages in thread
From: Chris Wilson @ 2017-10-11 18:41 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx

Quoting Oscar Mateo (2017-10-11 19:15:18)
> Let's try to make sure that all WAs are applied correctly and survive
> resumes, resets, etc... (with some help from a companion i-g-t patch).
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 48 ++++++++++++++++++++++++++-----------
>  1 file changed, 34 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index f108f53..fb49eac 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3399,6 +3399,20 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused)
>         return 0;
>  }
>  
> +static void check_wa_register(struct seq_file *m, struct i915_wa_reg *wa_reg)
> +{
> +       struct drm_i915_private *dev_priv = node_to_i915(m->private);
> +       u32 read;
> +       bool ok;
> +
> +       read = I915_READ(wa_reg->addr);
> +       ok = (wa_reg->value & wa_reg->mask) == (read & wa_reg->mask);
> +       seq_printf(m, "0x%X: 0x%08x, mask: 0x%08x, read: 0x%08x, status: %s\n",
> +                  i915_mmio_reg_offset(wa_reg->addr),
> +                  wa_reg->value, wa_reg->mask, read,
> +                  ok ? "OK" : "FAIL");

So one thing I've been considering is adding the Wa name for easier
cross-referencing. I am just worried about the number of strings and
whether we should put those names anywhere near user visible output.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* ✗ Fi.CI.BAT: warning for Refactor HW workaround code (rev2)
  2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
                   ` (10 preceding siblings ...)
  2017-10-11 18:15 ` [PATCH 11/11] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
@ 2017-10-11 19:55 ` Patchwork
  11 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2017-10-11 19:55 UTC (permalink / raw)
  To: Oscar Mateo; +Cc: intel-gfx

== Series Details ==

Series: Refactor HW workaround code (rev2)
URL   : https://patchwork.freedesktop.org/series/31611/
State : warning

== Summary ==

Series 31611v2 Refactor HW workaround code
https://patchwork.freedesktop.org/api/1.0/series/31611/revisions/2/mbox/

Test gem_close_race:
        Subgroup basic-threads:
                dmesg-warn -> PASS       (fi-snb-2520m)
Test gem_ringfill:
        Subgroup basic-default-hang:
                incomplete -> DMESG-WARN (fi-blb-e6850) fdo#101600
Test gem_workarounds:
        Subgroup basic-read:
                pass       -> SKIP       (fi-bdw-5557u)
                pass       -> SKIP       (fi-bdw-gvtdvm)
                pass       -> SKIP       (fi-bsw-n3050)
                pass       -> SKIP       (fi-skl-6260u)
                pass       -> SKIP       (fi-skl-6700hq)
                pass       -> SKIP       (fi-skl-6700k)
                pass       -> SKIP       (fi-skl-6770hq)
                pass       -> SKIP       (fi-skl-gvtdvm)
                pass       -> SKIP       (fi-bxt-dsi)
                pass       -> SKIP       (fi-bxt-j4205)
                pass       -> SKIP       (fi-kbl-7500u)
                pass       -> SKIP       (fi-kbl-7560u)
                pass       -> SKIP       (fi-kbl-7567u)
                pass       -> SKIP       (fi-kbl-r)
                pass       -> SKIP       (fi-glk-1)
                pass       -> SKIP       (fi-cfl-s)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                incomplete -> PASS       (fi-kbl-7567u)
Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-bdw-5557u)
                pass       -> DMESG-WARN (fi-bdw-gvtdvm)
                pass       -> DMESG-WARN (fi-bsw-n3050)
        Subgroup basic-no-display:
                pass       -> DMESG-WARN (fi-bdw-5557u)
                pass       -> DMESG-WARN (fi-bdw-gvtdvm)
                pass       -> DMESG-WARN (fi-bsw-n3050)
        Subgroup basic-reload-inject:
                pass       -> DMESG-WARN (fi-bdw-5557u)
                pass       -> DMESG-WARN (fi-bdw-gvtdvm)
                pass       -> DMESG-WARN (fi-bsw-n3050)
                dmesg-warn -> INCOMPLETE (fi-cfl-s) fdo#103206

fdo#101600 https://bugs.freedesktop.org/show_bug.cgi?id=101600
fdo#103206 https://bugs.freedesktop.org/show_bug.cgi?id=103206

fi-bdw-5557u     total:289  pass:264  dwarn:3   dfail:0   fail:0   skip:22  time:456s
fi-bdw-gvtdvm    total:289  pass:261  dwarn:3   dfail:0   fail:0   skip:25  time:465s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:392s
fi-bsw-n3050     total:289  pass:239  dwarn:3   dfail:0   fail:0   skip:47  time:563s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:283s
fi-bxt-dsi       total:289  pass:258  dwarn:0   dfail:0   fail:0   skip:31  time:519s
fi-bxt-j4205     total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:520s
fi-byt-j1900     total:289  pass:253  dwarn:1   dfail:0   fail:0   skip:35  time:531s
fi-cfl-s         total:288  pass:252  dwarn:3   dfail:0   fail:0   skip:32 
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:434s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:270s
fi-glk-1         total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:596s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:434s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:459s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:498s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:472s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:0   skip:25  time:505s
fi-kbl-7560u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:580s
fi-kbl-7567u     total:289  pass:264  dwarn:4   dfail:0   fail:0   skip:21  time:494s
fi-kbl-r         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:591s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:655s
fi-skl-6260u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:466s
fi-skl-6700hq    total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:653s
fi-skl-6700k     total:289  pass:264  dwarn:0   dfail:0   fail:0   skip:25  time:538s
fi-skl-6770hq    total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:515s
fi-skl-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:470s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:581s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:433s
fi-byt-n2820 failed to connect after reboot

6a96415ec560527f41089b246c06e7fd75991791 drm-tip: 2017y-10m-11d-19h-08m-29s UTC integration manifest
58a6a0a07fa3 drm/i915: Remove Gen9 WAs with no effect
18baa212e49a drm/i915: Document the i915_workarounds file
6265db07e99d drm/i915: Move WA BB stuff to the workarounds file as well
9549248f4cf8 drm/i915: Print all workaround types correctly in debugfs
11013f6ba81a drm/i915: Save all Whitelist WAs and apply them at a later time
602c351019dd drm/i915: Save all MMIO WAs and apply them at a later time
72090c44c54f drm/i915: Rename saved workarounds to make it explicit that they are context WAs
cc7a5ce6b96e drm/i915: Move workarounds from init_clock_gating
e18355c687c4 drm/i915: Split out functions for different kinds of workarounds
632b2fac6e0e drm/i915: Move a bunch of workaround-related code to its own file
ee2b201803e9 drm/i915: No need for RING_MAX_NONPRIV_SLOTS space

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5999/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time
  2017-10-11 18:15 ` [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time Oscar Mateo
@ 2017-10-12 10:35   ` Joonas Lahtinen
  2017-10-13 20:49     ` Oscar Mateo
  0 siblings, 1 reply; 27+ messages in thread
From: Joonas Lahtinen @ 2017-10-12 10:35 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx

On Wed, 2017-10-11 at 11:15 -0700, Oscar Mateo wrote:
> By doing this, we can dump these workarounds in debugfs for
> validation (which,
> at the moment, we are only able to do for the contexts WAs).
> 
> v2:
>   - Wrong macro used for MMIO set bit masked
>   - Improved naming
>   - Rebased
> 
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

<SNIP>

> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1960,12 +1960,16 @@ struct i915_wa_reg {
>  	u32 mask;
>  };
>  
> -#define I915_MAX_WA_REGS 16
> +#define I915_MAX_CTX_WA_REGS 16
> +#define I915_MAX_MMIO_WA_REGS 32
>  
>  struct i915_workarounds {
> -	struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
> +	struct i915_wa_reg ctx_wa_reg[I915_MAX_CTX_WA_REGS];
>  	u32 ctx_wa_count;
>  
> +	struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
> +	u32 mmio_wa_count;
> +
>  	u32 hw_whitelist_count[I915_NUM_ENGINES];
>  };

Could we instead consider a constant structure with platform bitmasks?
If that's not dynamic enough, then a bitmap which is initialized by the
platform bitmasks as a default. So instead of running code to add to
list, make it bit more declarative. Pseudo-code;


struct i915_workaround {
	u16 gen_mask;
	enum {
		I915_WA_CTX,
		I915_WA_MMIO,
		I915_WA_WHITELIST,
	} type;
	u32 reg;
}

... elsewhere in .c file

static const struct i915_workaround i915_workarounds[] = { {
	/* WaSomethingSomewhereUMDFoo:skl */
	.gen_mask = INTEL_GEN_MASK(X, Y),
	.type = I915_WA_CTX,
	.reg = ...
} };

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds
  2017-10-11 18:25   ` Chris Wilson
@ 2017-10-12 12:35     ` Mika Kuoppala
  0 siblings, 0 replies; 27+ messages in thread
From: Mika Kuoppala @ 2017-10-12 12:35 UTC (permalink / raw)
  To: Chris Wilson, Oscar Mateo, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Oscar Mateo (2017-10-11 19:15:13)
>> There are different kind of workarounds (those that modify registers that
>> live in the context image, those that modify global registers, those that
>> whitelist registers, etc...) and they have different requirements in terms
>> of where they are applied and how. Also, by splitting them apart, it should
>> be easier to decide where a new workaround should go.
>> 
>> v2:
>>   - Add multiple MISSING_CASE
>>   - Rebased
>> 
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_gem.c          |   3 +
>>  drivers/gpu/drm/i915/i915_gem_context.c  |   5 +
>>  drivers/gpu/drm/i915/intel_lrc.c         |  10 +-
>>  drivers/gpu/drm/i915/intel_ringbuffer.c  |   4 +-
>>  drivers/gpu/drm/i915/intel_workarounds.c | 698 +++++++++++++++++++------------
>>  drivers/gpu/drm/i915/intel_workarounds.h |   8 +-
>>  6 files changed, 444 insertions(+), 284 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index f76890b..119c456 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -35,6 +35,7 @@
>>  #include "intel_drv.h"
>>  #include "intel_frontbuffer.h"
>>  #include "intel_mocs.h"
>> +#include "intel_workarounds.h"
>>  #include "i915_gemfs.h"
>>  #include <linux/dma-fence-array.h>
>>  #include <linux/kthread.h>
>> @@ -4804,6 +4805,8 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
>>                 }
>>         }
>>  
>> +       intel_mmio_workarounds_apply(dev_priv);
>
> Hmm, we still have an overlap with intel_init_clock_gating(). Perusing
> those there are several that deserve to be in intel_mmio_wa_apply
> instead.
>
> Are we happy with the split between "mmio_workarounds" and "clock_gating"?
>
> So are we looking at intel_display_workarounds, intel_gt_workarounds and
> intel_ctx_workarounds?

I vote for splitting into these 3 categories as mmio is just a method of
applying. If we have 3 lists in the testharness and do the usual
reset/suspend dances for each category, we would notice clearly if we
have added something that doesnt fit the domain constraints.

-Mika
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating
  2017-10-11 18:35   ` Ville Syrjälä
@ 2017-10-13 20:34     ` Oscar Mateo
  0 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-13 20:34 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Rodrigo Vivi

<SNIP>


On 10/11/2017 11:35 AM, Ville Syrjälä wrote:
> On Wed, Oct 11, 2017 at 11:15:14AM -0700, Oscar Mateo wrote:
>> I'm not sure why some WAs have historically been applied in init_clock_gating
>> and some others in the engine setup (GT vs. display? context vs. global
>> registers?) but it does not look like the best place to apply workarounds:
>> the name is confusing, it's a display function (even though some GT WAs
>> also go here) and it isn't necessarily called on a GPU reset. This patch
>> moves these WAs to their rightful place inside i915_workarounds.c.
>>
>> TODO: Do we want to keep display WAs separated from GT ones? In that case,
>> I would propose another category inside i915_workarounds.c (but I would need
>> help deciding what goes where).
> The current situation isn't very good. But neither really is moving
> display stuff into something called gem_init_hw(). It also gets called
> during GPU reset which is at the very least wasted effort when it comes
> to display w/as, and could even be actively harmful in case we end up
> clobbering something the current display configuration depends on.


Ok, fully agree. I'm sending a new version with separate GT and Display 
workarounds.

Thanks!
Oscar
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time
  2017-10-12 10:35   ` Joonas Lahtinen
@ 2017-10-13 20:49     ` Oscar Mateo
  2017-10-16 10:34       ` Joonas Lahtinen
  0 siblings, 1 reply; 27+ messages in thread
From: Oscar Mateo @ 2017-10-13 20:49 UTC (permalink / raw)
  To: Joonas Lahtinen, intel-gfx



On 10/12/2017 03:35 AM, Joonas Lahtinen wrote:
> On Wed, 2017-10-11 at 11:15 -0700, Oscar Mateo wrote:
>> By doing this, we can dump these workarounds in debugfs for
>> validation (which,
>> at the moment, we are only able to do for the contexts WAs).
>>
>> v2:
>>    - Wrong macro used for MMIO set bit masked
>>    - Improved naming
>>    - Rebased
>>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> <SNIP>
>
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1960,12 +1960,16 @@ struct i915_wa_reg {
>>   	u32 mask;
>>   };
>>   
>> -#define I915_MAX_WA_REGS 16
>> +#define I915_MAX_CTX_WA_REGS 16
>> +#define I915_MAX_MMIO_WA_REGS 32
>>   
>>   struct i915_workarounds {
>> -	struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
>> +	struct i915_wa_reg ctx_wa_reg[I915_MAX_CTX_WA_REGS];
>>   	u32 ctx_wa_count;
>>   
>> +	struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
>> +	u32 mmio_wa_count;
>> +
>>   	u32 hw_whitelist_count[I915_NUM_ENGINES];
>>   };
> Could we instead consider a constant structure with platform bitmasks?
> If that's not dynamic enough, then a bitmap which is initialized by the
> platform bitmasks as a default. So instead of running code to add to
> list, make it bit more declarative. Pseudo-code;
>
>
> struct i915_workaround {
> 	u16 gen_mask;
> 	enum {
> 		I915_WA_CTX,
> 		I915_WA_MMIO,
> 		I915_WA_WHITELIST,
> 	} type;
> 	u32 reg;
> }
>
> ... elsewhere in .c file
>
> static const struct i915_workaround i915_workarounds[] = { {
> 	/* WaSomethingSomewhereUMDFoo:skl */
> 	.gen_mask = INTEL_GEN_MASK(X, Y),
> 	.type = I915_WA_CTX,
> 	.reg = ...
> } };
>
> Regards, Joonas

I considered it, but we have some workarounds that are even more dynamic 
than that. E.g.:

* WaCompressedResourceSamplerPbeMediaNewHashMode depends on 
HAS_LLC(dev_priv)
* skl_tune_iz_hashing depends on number of subslices (although maybe 
this is not technically a WA?)
* WaGttCachingOffByDefault needs HAS_PAGE_SIZES(dev_priv, 
I915_GTT_PAGE_SIZE_2M)
* WaPsrDPRSUnmaskVBlankInSRD is applied for_each_pipe
* Wa #1181 needs HAS_PCH_CNP(dev_priv)
* ...

We even have a WA (WaTempDisableDOPClkGating) where the new design is 
not dynamic enough :(

I guess we could add a callback pointer to the table for those WAs that 
need extra information. Maybe even a "pre" and a "post" callback 
pointers (to cover that pesky WaTempDisableDOPClkGating).

If this is where we want to go, I can write a patch, but I believe it 
would be better to land this first (code review is critical for these 
kind of changes, and it is easier to first review that all WAs make it 
to i915_workarounds.c correctly, and then review that they are all 
transformed to a static table correctly).

-- Oscar

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs
  2017-10-11 18:41   ` Chris Wilson
@ 2017-10-13 20:51     ` Oscar Mateo
  0 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-13 20:51 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 10/11/2017 11:41 AM, Chris Wilson wrote:
> Quoting Oscar Mateo (2017-10-11 19:15:18)
>> Let's try to make sure that all WAs are applied correctly and survive
>> resumes, resets, etc... (with some help from a companion i-g-t patch).
>>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c | 48 ++++++++++++++++++++++++++-----------
>>   1 file changed, 34 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index f108f53..fb49eac 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -3399,6 +3399,20 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused)
>>          return 0;
>>   }
>>   
>> +static void check_wa_register(struct seq_file *m, struct i915_wa_reg *wa_reg)
>> +{
>> +       struct drm_i915_private *dev_priv = node_to_i915(m->private);
>> +       u32 read;
>> +       bool ok;
>> +
>> +       read = I915_READ(wa_reg->addr);
>> +       ok = (wa_reg->value & wa_reg->mask) == (read & wa_reg->mask);
>> +       seq_printf(m, "0x%X: 0x%08x, mask: 0x%08x, read: 0x%08x, status: %s\n",
>> +                  i915_mmio_reg_offset(wa_reg->addr),
>> +                  wa_reg->value, wa_reg->mask, read,
>> +                  ok ? "OK" : "FAIL");
> So one thing I've been considering is adding the Wa name for easier
> cross-referencing. I am just worried about the number of strings and
> whether we should put those names anywhere near user visible output.
> -Chris

This would fit nicely with the static table proposed by Joonas, but I 
really don't know if we want the names in visible output...
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating
  2017-10-11 18:29   ` Chris Wilson
@ 2017-10-13 20:52     ` Oscar Mateo
  0 siblings, 0 replies; 27+ messages in thread
From: Oscar Mateo @ 2017-10-13 20:52 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Rodrigo Vivi



On 10/11/2017 11:29 AM, Chris Wilson wrote:
> Quoting Oscar Mateo (2017-10-11 19:15:14)
>> I'm not sure why some WAs have historically been applied in init_clock_gating
>> and some others in the engine setup (GT vs. display? context vs. global
>> registers?) but it does not look like the best place to apply workarounds:
>> the name is confusing, it's a display function (even though some GT WAs
>> also go here) and it isn't necessarily called on a GPU reset. This patch
>> moves these WAs to their rightful place inside i915_workarounds.c.
>>
>> TODO: Do we want to keep display WAs separated from GT ones? In that case,
>> I would propose another category inside i915_workarounds.c (but I would need
>> help deciding what goes where).
>>
>> v2:
>>    - Also move bdw and chv WAs from init_clock_gating that do not seem to be
>>      actually related to clock gating.
>>    - Rebased.
> Too much in one patch. Start with copy'n'paste, and then we can see
> which w/a you select to move from init_clock_gating.
> -Chris

Even better: I'll make the movement from init_clock_gating in small 
chunks, one per GEN/platform.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time
  2017-10-13 20:49     ` Oscar Mateo
@ 2017-10-16 10:34       ` Joonas Lahtinen
  2017-11-03 22:56         ` Oscar Mateo
  0 siblings, 1 reply; 27+ messages in thread
From: Joonas Lahtinen @ 2017-10-16 10:34 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx

On Fri, 2017-10-13 at 13:49 -0700, Oscar Mateo wrote:
> 
> On 10/12/2017 03:35 AM, Joonas Lahtinen wrote:
> > On Wed, 2017-10-11 at 11:15 -0700, Oscar Mateo wrote:
> > > By doing this, we can dump these workarounds in debugfs for
> > > validation (which,
> > > at the moment, we are only able to do for the contexts WAs).
> > > 
> > > v2:
> > >    - Wrong macro used for MMIO set bit masked
> > >    - Improved naming
> > >    - Rebased
> > > 
> > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > 
> > <SNIP>
> > 
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1960,12 +1960,16 @@ struct i915_wa_reg {
> > >   	u32 mask;
> > >   };
> > >   
> > > -#define I915_MAX_WA_REGS 16
> > > +#define I915_MAX_CTX_WA_REGS 16
> > > +#define I915_MAX_MMIO_WA_REGS 32
> > >   
> > >   struct i915_workarounds {
> > > -	struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
> > > +	struct i915_wa_reg ctx_wa_reg[I915_MAX_CTX_WA_REGS];
> > >   	u32 ctx_wa_count;
> > >   
> > > +	struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
> > > +	u32 mmio_wa_count;
> > > +
> > >   	u32 hw_whitelist_count[I915_NUM_ENGINES];
> > >   };
> > 
> > Could we instead consider a constant structure with platform bitmasks?
> > If that's not dynamic enough, then a bitmap which is initialized by the
> > platform bitmasks as a default. So instead of running code to add to
> > list, make it bit more declarative. Pseudo-code;
> > 
> > 
> > struct i915_workaround {
> > 	u16 gen_mask;
> > 	enum {
> > 		I915_WA_CTX,
> > 		I915_WA_MMIO,
> > 		I915_WA_WHITELIST,
> > 	} type;
> > 	u32 reg;
> > }
> > 
> > ... elsewhere in .c file
> > 
> > static const struct i915_workaround i915_workarounds[] = { {
> > 	/* WaSomethingSomewhereUMDFoo:skl */
> > 	.gen_mask = INTEL_GEN_MASK(X, Y),
> > 	.type = I915_WA_CTX,
> > 	.reg = ...
> > } };
> > 
> > Regards, Joonas
> 
> I considered it, but we have some workarounds that are even more dynamic 
> than that. E.g.:
> 
> * WaCompressedResourceSamplerPbeMediaNewHashMode depends on 
> HAS_LLC(dev_priv)
> * skl_tune_iz_hashing depends on number of subslices (although maybe 
> this is not technically a WA?)
> * WaGttCachingOffByDefault needs HAS_PAGE_SIZES(dev_priv, 
> I915_GTT_PAGE_SIZE_2M)
> * WaPsrDPRSUnmaskVBlankInSRD is applied for_each_pipe

Could be multiple Wa lines each with HAS_PIPE() condition.

> * Wa #1181 needs HAS_PCH_CNP(dev_priv)
> * ...
> 
> We even have a WA (WaTempDisableDOPClkGating) where the new design is 
> not dynamic enough :(

I was thinking the array would cover the simple register writes, I
think most of the detection problems could be covered by a simple probe
function. One could consider caching the probing result to a bitmap.

> I guess we could add a callback pointer to the table for those WAs that 
> need extra information. Maybe even a "pre" and a "post" callback 
> pointers (to cover that pesky WaTempDisableDOPClkGating).

You'd still have to keep some state between pre and post. Maybe just
have I915_WA_FUNC and then a callback to apply the ones not fitting the
"detection" + "simple register write" scheme.

> If this is where we want to go, I can write a patch, but I believe it 
> would be better to land this first (code review is critical for these 
> kind of changes, and it is easier to first review that all WAs make it 
> to i915_workarounds.c correctly, and then review that they are all 
> transformed to a static table correctly).

First collecting the WA code to same source file is a good idea if that
can be made conveniently, but I would not transform them to arrays or
some other intermediate form like this patch proposes. Instead after
the initial code motion, convert straight to the agreed format.

This is a good clarification to the WAs, so I like the general idea.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time
  2017-10-16 10:34       ` Joonas Lahtinen
@ 2017-11-03 22:56         ` Oscar Mateo
  2017-11-08 11:47           ` Joonas Lahtinen
  0 siblings, 1 reply; 27+ messages in thread
From: Oscar Mateo @ 2017-11-03 22:56 UTC (permalink / raw)
  To: Joonas Lahtinen, intel-gfx



On 10/16/2017 03:34 AM, Joonas Lahtinen wrote:
> On Fri, 2017-10-13 at 13:49 -0700, Oscar Mateo wrote:
>> On 10/12/2017 03:35 AM, Joonas Lahtinen wrote:
>>> On Wed, 2017-10-11 at 11:15 -0700, Oscar Mateo wrote:
>>>> By doing this, we can dump these workarounds in debugfs for
>>>> validation (which,
>>>> at the moment, we are only able to do for the contexts WAs).
>>>>
>>>> v2:
>>>>     - Wrong macro used for MMIO set bit masked
>>>>     - Improved naming
>>>>     - Rebased
>>>>
>>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> <SNIP>
>>>
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -1960,12 +1960,16 @@ struct i915_wa_reg {
>>>>    	u32 mask;
>>>>    };
>>>>    
>>>> -#define I915_MAX_WA_REGS 16
>>>> +#define I915_MAX_CTX_WA_REGS 16
>>>> +#define I915_MAX_MMIO_WA_REGS 32
>>>>    
>>>>    struct i915_workarounds {
>>>> -	struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
>>>> +	struct i915_wa_reg ctx_wa_reg[I915_MAX_CTX_WA_REGS];
>>>>    	u32 ctx_wa_count;
>>>>    
>>>> +	struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
>>>> +	u32 mmio_wa_count;
>>>> +
>>>>    	u32 hw_whitelist_count[I915_NUM_ENGINES];
>>>>    };
>>> Could we instead consider a constant structure with platform bitmasks?
>>> If that's not dynamic enough, then a bitmap which is initialized by the
>>> platform bitmasks as a default. So instead of running code to add to
>>> list, make it bit more declarative. Pseudo-code;
>>>
>>>
>>> struct i915_workaround {
>>> 	u16 gen_mask;
>>> 	enum {
>>> 		I915_WA_CTX,
>>> 		I915_WA_MMIO,
>>> 		I915_WA_WHITELIST,
>>> 	} type;
>>> 	u32 reg;
>>> }
>>>
>>> ... elsewhere in .c file
>>>
>>> static const struct i915_workaround i915_workarounds[] = { {
>>> 	/* WaSomethingSomewhereUMDFoo:skl */
>>> 	.gen_mask = INTEL_GEN_MASK(X, Y),
>>> 	.type = I915_WA_CTX,
>>> 	.reg = ...
>>> } };
>>>
>>> Regards, Joonas
>> I considered it, but we have some workarounds that are even more dynamic
>> than that. E.g.:
>>
>> * WaCompressedResourceSamplerPbeMediaNewHashMode depends on
>> HAS_LLC(dev_priv)
>> * skl_tune_iz_hashing depends on number of subslices (although maybe
>> this is not technically a WA?)
>> * WaGttCachingOffByDefault needs HAS_PAGE_SIZES(dev_priv,
>> I915_GTT_PAGE_SIZE_2M)
>> * WaPsrDPRSUnmaskVBlankInSRD is applied for_each_pipe
> Could be multiple Wa lines each with HAS_PIPE() condition.
>
>> * Wa #1181 needs HAS_PCH_CNP(dev_priv)
>> * ...
>>
>> We even have a WA (WaTempDisableDOPClkGating) where the new design is
>> not dynamic enough :(
> I was thinking the array would cover the simple register writes, I
> think most of the detection problems could be covered by a simple probe
> function. One could consider caching the probing result to a bitmap.
>
>> I guess we could add a callback pointer to the table for those WAs that
>> need extra information. Maybe even a "pre" and a "post" callback
>> pointers (to cover that pesky WaTempDisableDOPClkGating).
> You'd still have to keep some state between pre and post. Maybe just
> have I915_WA_FUNC and then a callback to apply the ones not fitting the
> "detection" + "simple register write" scheme.
>

It took me some time, but I found the flaw in this design: I cannot use 
a callback to apply the ones not fitting the"detection" + "simple 
register write" scheme as you mention, because then debugfs will not 
know about those (which was the point of this whole exercise).

I tried something like this:

typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv,
                   const struct i915_wa_reg *wa,
                   u32 *mask, u32 *value, u32 *data);
typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv,
                    const struct i915_wa_reg *wa,
                    u32 data);

In case mask/value need to be recalculated (e.g. skl_tune_iz_hashing or 
WaGttCachingOffByDefault) but then I need to call the pre_hook on 
debugfs, which I might not want to do in all cases (see 
WaProgramL3SqcReg1Default).
I can special-case the problematic WAs, or even left them out of the 
array, but somehow that seems worse than what we already had...

Thoughts?

>> If this is where we want to go, I can write a patch, but I believe it
>> would be better to land this first (code review is critical for these
>> kind of changes, and it is easier to first review that all WAs make it
>> to i915_workarounds.c correctly, and then review that they are all
>> transformed to a static table correctly).
> First collecting the WA code to same source file is a good idea if that
> can be made conveniently, but I would not transform them to arrays or
> some other intermediate form like this patch proposes. Instead after
> the initial code motion, convert straight to the agreed format.
>
> This is a good clarification to the WAs, so I like the general idea.
>
> Regards, Joonas

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time
  2017-11-03 22:56         ` Oscar Mateo
@ 2017-11-08 11:47           ` Joonas Lahtinen
  0 siblings, 0 replies; 27+ messages in thread
From: Joonas Lahtinen @ 2017-11-08 11:47 UTC (permalink / raw)
  To: Oscar Mateo, intel-gfx

On Fri, 2017-11-03 at 15:56 -0700, Oscar Mateo wrote:
> 
> On 10/16/2017 03:34 AM, Joonas Lahtinen wrote:
> > On Fri, 2017-10-13 at 13:49 -0700, Oscar Mateo wrote:
> > > On 10/12/2017 03:35 AM, Joonas Lahtinen wrote:
> > > > On Wed, 2017-10-11 at 11:15 -0700, Oscar Mateo wrote:
> > > > > By doing this, we can dump these workarounds in debugfs for
> > > > > validation (which,
> > > > > at the moment, we are only able to do for the contexts WAs).
> > > > > 
> > > > > v2:
> > > > >     - Wrong macro used for MMIO set bit masked
> > > > >     - Improved naming
> > > > >     - Rebased
> > > > > 
> > > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > > > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > > > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > > > 
> > > > <SNIP>
> > > > 
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > @@ -1960,12 +1960,16 @@ struct i915_wa_reg {
> > > > >    	u32 mask;
> > > > >    };
> > > > >    
> > > > > -#define I915_MAX_WA_REGS 16
> > > > > +#define I915_MAX_CTX_WA_REGS 16
> > > > > +#define I915_MAX_MMIO_WA_REGS 32
> > > > >    
> > > > >    struct i915_workarounds {
> > > > > -	struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
> > > > > +	struct i915_wa_reg ctx_wa_reg[I915_MAX_CTX_WA_REGS];
> > > > >    	u32 ctx_wa_count;
> > > > >    
> > > > > +	struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
> > > > > +	u32 mmio_wa_count;
> > > > > +
> > > > >    	u32 hw_whitelist_count[I915_NUM_ENGINES];
> > > > >    };
> > > > 
> > > > Could we instead consider a constant structure with platform bitmasks?
> > > > If that's not dynamic enough, then a bitmap which is initialized by the
> > > > platform bitmasks as a default. So instead of running code to add to
> > > > list, make it bit more declarative. Pseudo-code;
> > > > 
> > > > 
> > > > struct i915_workaround {
> > > > 	u16 gen_mask;
> > > > 	enum {
> > > > 		I915_WA_CTX,
> > > > 		I915_WA_MMIO,
> > > > 		I915_WA_WHITELIST,
> > > > 	} type;
> > > > 	u32 reg;
> > > > }
> > > > 
> > > > ... elsewhere in .c file
> > > > 
> > > > static const struct i915_workaround i915_workarounds[] = { {
> > > > 	/* WaSomethingSomewhereUMDFoo:skl */
> > > > 	.gen_mask = INTEL_GEN_MASK(X, Y),
> > > > 	.type = I915_WA_CTX,
> > > > 	.reg = ...
> > > > } };
> > > > 
> > > > Regards, Joonas
> > > 
> > > I considered it, but we have some workarounds that are even more dynamic
> > > than that. E.g.:
> > > 
> > > * WaCompressedResourceSamplerPbeMediaNewHashMode depends on
> > > HAS_LLC(dev_priv)
> > > * skl_tune_iz_hashing depends on number of subslices (although maybe
> > > this is not technically a WA?)
> > > * WaGttCachingOffByDefault needs HAS_PAGE_SIZES(dev_priv,
> > > I915_GTT_PAGE_SIZE_2M)
> > > * WaPsrDPRSUnmaskVBlankInSRD is applied for_each_pipe
> > 
> > Could be multiple Wa lines each with HAS_PIPE() condition.
> > 
> > > * Wa #1181 needs HAS_PCH_CNP(dev_priv)
> > > * ...
> > > 
> > > We even have a WA (WaTempDisableDOPClkGating) where the new design is
> > > not dynamic enough :(
> > 
> > I was thinking the array would cover the simple register writes, I
> > think most of the detection problems could be covered by a simple probe
> > function. One could consider caching the probing result to a bitmap.
> > 
> > > I guess we could add a callback pointer to the table for those WAs that
> > > need extra information. Maybe even a "pre" and a "post" callback
> > > pointers (to cover that pesky WaTempDisableDOPClkGating).
> > 
> > You'd still have to keep some state between pre and post. Maybe just
> > have I915_WA_FUNC and then a callback to apply the ones not fitting the
> > "detection" + "simple register write" scheme.
> > 
> 
> It took me some time, but I found the flaw in this design: I cannot use 
> a callback to apply the ones not fitting the"detection" + "simple 
> register write" scheme as you mention, because then debugfs will not 
> know about those (which was the point of this whole exercise).
> 
> I tried something like this:
> 
> typedef bool (* wa_pre_hook_func)(struct drm_i915_private *dev_priv,
>                    const struct i915_wa_reg *wa,
>                    u32 *mask, u32 *value, u32 *data);
> typedef void (* wa_post_hook_func)(struct drm_i915_private *dev_priv,
>                     const struct i915_wa_reg *wa,
>                     u32 data);
> 
> In case mask/value need to be recalculated (e.g. skl_tune_iz_hashing or 
> WaGttCachingOffByDefault) but then I need to call the pre_hook on 
> debugfs, which I might not want to do in all cases (see 
> WaProgramL3SqcReg1Default).
> I can special-case the problematic WAs, or even left them out of the 
> array, but somehow that seems worse than what we already had...
> 
> Thoughts?

I'm not sure I see the issue. If looking at gen8_set_l3sqc_credits

You'd have the specialized apply() callback which sets the
GEN8_L3SQCREG1 register respecting the WaTempDisableDOPClkGating.

When you have a WA for applying a WA, I don't see how we can completely
avoid the specialized functions.

For debugfs, wouldn't you dump the value of what GEN8_L3SQCREG1 is for
a specific dev_priv? If there's more than that, I guess we can't but
store that state in dev_priv. And have an optional hook function to
fill in whatever depends on runtime and we want displayed in debugfs?

I think it might be best to start with moving all the WAs to a file,
and then continue looking at how we can best present them.

Regards, Joonas

> 
> > > If this is where we want to go, I can write a patch, but I believe it
> > > would be better to land this first (code review is critical for these
> > > kind of changes, and it is easier to first review that all WAs make it
> > > to i915_workarounds.c correctly, and then review that they are all
> > > transformed to a static table correctly).
> > 
> > First collecting the WA code to same source file is a good idea if that
> > can be made conveniently, but I would not transform them to arrays or
> > some other intermediate form like this patch proposes. Instead after
> > the initial code motion, convert straight to the agreed format.
> > 
> > This is a good clarification to the WAs, so I like the general idea.
> > 
> > Regards, Joonas
> 
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-11-08 11:47 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-11 18:15 [PATCH v2 00/11] Refactor HW workaround code Oscar Mateo
2017-10-11 18:15 ` [PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space Oscar Mateo
2017-10-11 18:15 ` [PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
2017-10-11 18:15 ` [PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds Oscar Mateo
2017-10-11 18:25   ` Chris Wilson
2017-10-12 12:35     ` Mika Kuoppala
2017-10-11 18:15 ` [PATCH 04/11] drm/i915: Move workarounds from init_clock_gating Oscar Mateo
2017-10-11 18:26   ` Chris Wilson
2017-10-11 18:29   ` Chris Wilson
2017-10-13 20:52     ` Oscar Mateo
2017-10-11 18:35   ` Ville Syrjälä
2017-10-13 20:34     ` Oscar Mateo
2017-10-11 18:15 ` [PATCH 05/11] drm/i915: Rename saved workarounds to make it explicit that they are context WAs Oscar Mateo
2017-10-11 18:15 ` [PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time Oscar Mateo
2017-10-12 10:35   ` Joonas Lahtinen
2017-10-13 20:49     ` Oscar Mateo
2017-10-16 10:34       ` Joonas Lahtinen
2017-11-03 22:56         ` Oscar Mateo
2017-11-08 11:47           ` Joonas Lahtinen
2017-10-11 18:15 ` [PATCH 07/11] drm/i915: Save all Whitelist " Oscar Mateo
2017-10-11 18:15 ` [PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs Oscar Mateo
2017-10-11 18:41   ` Chris Wilson
2017-10-13 20:51     ` Oscar Mateo
2017-10-11 18:15 ` [PATCH 09/11] drm/i915: Move WA BB stuff to the workarounds file as well Oscar Mateo
2017-10-11 18:15 ` [PATCH 10/11] drm/i915: Document the i915_workarounds file Oscar Mateo
2017-10-11 18:15 ` [PATCH 11/11] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
2017-10-11 19:55 ` ✗ Fi.CI.BAT: warning for Refactor HW workaround code (rev2) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.